There are a lot of articles around the web about Python performance. The first thing you read is that strings should not be concatenated with ‘+’: avoid s1 + s2 + s3 and use str.join instead.
I tried concatenating two strings as parts of a directory path, using three approaches:
- ‘+’, which I should not use
- str.join
- os.path.join
Here is my code:

    import os, time

    s1 = '/part/one/of/dir'
    s2 = 'part/two/of/dir'
    N = 10000

    t = time.clock()
    for i in xrange(N):
        s = s1 + os.sep + s2
    print time.clock() - t

    t = time.clock()
    for i in xrange(N):
        s = os.sep.join((s1, s2))
    print time.clock() - t

    t = time.clock()
    for i in xrange(N):
        s = os.path.join(s1, s2)
    print time.clock() - t

Here are the results (Python 2.5 on Windows XP):

    0.0182201927899
    0.0262544541275
    0.120238186697

Shouldn’t it be exactly the other way around?
Answer
It is true that you should not use ‘+’, but your example is a special case because the strings are so short. Try the same code with:

    s1 = '*' * 100000
    s2 = '+' * 100000

Then the second version (str.join) is much faster.
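For anyone who wants to repeat the experiment on a current interpreter, here is a minimal sketch in Python 3 syntax. The question's code is Python 2: `xrange` and the `print` statement do not exist in Python 3, and `time.clock` was removed in Python 3.8. The function names, the `run_benchmark` helper and the repeat count are my own choices for illustration, and `timeit` stands in for the manual `time.clock()` bookkeeping. Timings depend on the machine and the interpreter, so none are shown here.

```python
import os
import timeit

# Long inputs, as suggested above; the repeat count is an arbitrary choice.
s1 = '*' * 100000
s2 = '+' * 100000
N = 1000


def concat_plus():
    # Approach 1: the '+' operator
    return s1 + os.sep + s2


def concat_str_join():
    # Approach 2: str.join with the path separator
    return os.sep.join((s1, s2))


def concat_path_join():
    # Approach 3: os.path.join
    return os.path.join(s1, s2)


def run_benchmark(number=N):
    """Return {function name: total seconds for `number` calls}."""
    candidates = (concat_plus, concat_str_join, concat_path_join)
    return {f.__name__: timeit.timeit(f, number=number) for f in candidates}


if __name__ == '__main__':
    for name, seconds in run_benchmark().items():
        print('%-18s %.6f s' % (name, seconds))
```

Setting `s1` and `s2` back to the short path fragments from the question lets you compare both cases with the same harness.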