Is there an efficient mass string concatenation method in Python (like StringBuilder in C# or StringBuffer in Java)?
I found the following methods here:
- Simple concatenation using +
- Using a string list and the join method
- Using UserString from the MutableString module
- Using a character array and the array module
- Using cStringIO from the StringIO module
What should be used and why?
Answer
If you know all the components beforehand, use literal string interpolation, also known as f-strings or formatted strings, introduced in Python 3.6.
Given the test case from mkoistinen’s answer, having strings:
domain = 'some_really_long_example.com'
lang = 'en'
path = 'some/really/long/path/'
The contenders and their execution times on my computer using Python 3.6 on Linux, as timed by IPython and the timeit module, are:
- f'http://{domain}/{lang}/{path}' – 0.151 µs
- 'http://%s/%s/%s' % (domain, lang, path) – 0.321 µs
- 'http://' + domain + '/' + lang + '/' + path – 0.356 µs
- ''.join(('http://', domain, '/', lang, '/', path)) – 0.249 µs (notice that building a constant-length tuple is slightly faster than building a constant-length list).
Thus the shortest and most beautiful code possible is also the fastest.
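To reproduce a comparison like this on your own machine, something along these lines should work. It is only a rough sketch: the labels, the `time_candidates` helper, and the `namespace` dict are names made up for illustration, and the absolute numbers will differ by Python version and hardware.

```python
import timeit

domain = 'some_really_long_example.com'
lang = 'en'
path = 'some/really/long/path/'

# Names that the timed expressions are allowed to see (illustrative helper dict).
namespace = {'domain': domain, 'lang': lang, 'path': path}

# The four contenders, written as source strings so timeit can compile them.
candidates = {
    'f-string': "f'http://{domain}/{lang}/{path}'",
    '% formatting': "'http://%s/%s/%s' % (domain, lang, path)",
    '+ concatenation': "'http://' + domain + '/' + lang + '/' + path",
    'join': "''.join(('http://', domain, '/', lang, '/', path))",
}

def time_candidates(number=1_000_000):
    """Return {label: average microseconds per evaluation}."""
    results = {}
    for label, expr in candidates.items():
        total = timeit.timeit(expr, globals=namespace, number=number)
        results[label] = total / number * 1e6
    return results

if __name__ == '__main__':
    for label, usec in time_candidates().items():
        print(f'{label:16} {usec:.3f} µs')
```

Single runs are noisy, so it is probably worth calling `time_candidates()` a few times and comparing the smallest value per method rather than trusting one measurement.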
The speed can be contrasted with the fastest method for Python 2, which on my computer is + concatenation; that takes 0.203 µs with 8-bit strings, and 0.259 µs if the strings are all Unicode.
(In alpha versions of Python 3.6 the implementation of f'' strings was the slowest possible; the generated byte code was pretty much equivalent to the ''.join() case, with unnecessary calls to str.__format__, which without arguments would just return self unchanged. These inefficiencies were addressed before 3.6 final.)
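If you are curious what byte code your own interpreter generates for the two forms, the dis module can show it. This is just a sketch: `opnames` is a helper name invented here, and the exact opcode names vary between CPython releases, so no particular output is assumed.

```python
import dis

def opnames(expr):
    """Return the opcode names CPython generates for an expression string."""
    code = compile(expr, '<expr>', 'eval')
    return [ins.opname for ins in dis.get_instructions(code)]

# Compiling does not evaluate the expressions, so the names need not be defined.
fstring_ops = opnames("f'http://{domain}/{lang}/{path}'")
join_ops = opnames("''.join(('http://', domain, '/', lang, '/', path))")

if __name__ == '__main__':
    print('f-string:', fstring_ops)
    print('join:    ', join_ops)
```

On a release interpreter I would expect the f-string listing to contain no method call at all, while the join listing builds a tuple and then calls a method, but check the output for your version rather than taking that on faith.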