Python concatenate (multiple) strings to number

Question

I have a python list ['a1', 'b1', 'a2', 'b2','a3', 'b3']. Set m=3 and I want get this list using loops, because here m=3 could be a larger number such as m=100. Since we can have then I try to get ['a1', 'b1', 'a2', 'b2','a3', 'b3'] using and have TypeError: can only concatenate list (not "str") to list Then I try

Accepted Answer

A simple combined list comprehension would work as pointed out in the @j1-lee&#8217;s answer (and later in other answers).import stringdef letter_number_loop(n, m):    letters = string.ascii_letters[:n]    numbers = range(1, m + 1)    return [f"{letter}{number}" for number in numbers for letter in letters]Similarly, one could use itertools.product(), as evidenced in Nick&#8217;s answer, to obtain substantially the same:import itertoolsdef letter_number_it(n, m):    letters = string.ascii_letters[:n]    numbers = range(1, m + 1)    return [        f"{letter}{number}"        for number, letter in itertools.product(numbers, letters)]However, it is possible to write a NumPy-vectorized approach, making use of the fact that if the dtype is object, the operations do follow the Python semantics.import numpy as npdef letter_number_np(n, m):    letters = np.array(list(string.ascii_letters[:n]), dtype=object)    numbers = np.array([f"{i}" for i in range(1, m + 1)], dtype=object)    return (letters[None, :] + numbers[:, None]).ravel().tolist()Note that the final numpy.ndarray.tolist() could be avoided if whatever will consume the output is capable of dealing with the NumPy array itself, thus saving some relatively small but definitely appreciable time.Inspecting OutputThe following do indicate that the functions are equivalent:funcs = letter_number_loop, letter_number_it, letter_number_npn, m = 2, 3for func in funcs:    print(f"{func.__name__!s:>32}  {func(n, m)}")              letter_number_loop  ['a1', 'b1', 'a2', 'b2', 'a3', 'b3']                letter_number_it  ['a1', 'b1', 'a2', 'b2', 'a3', 'b3']                letter_number_np  ['a1', 'b1', 'a2', 'b2', 'a3', 'b3']BenchmarksFor larger inputs, this is substantially faster, as evidenced by these benchmarks:timings = {}k = 16for n in (2, 20):    for k in range(1, 10):        m = 2 ** k        print(f"n = {n}, m = {m}")        timings[n, m] = []        base = funcs[0](n, m)        for func in funcs:            res = func(n, m)            is_good = base == res            timed = %timeit -r 64 -n 64 -q -o func(n, m)            timing = timed.best * 1e6            timings[n, m].append(timing if is_good else None)            print(f"{func.__name__:>24}  {is_good}  {timing:10.3f} µs")to be plotted with:import matplotlib.pyplot as pltimport pandas as pdn_s = (2, 20)fig, axs = plt.subplots(1, len(n_s), figsize=(12, 4))for i, n in enumerate(n_s):    partial_timings = {k[1]: v for k, v in timings.items() if k[0] == n}    df = pd.DataFrame(data=partial_timings, index=[func.__name__ for func in funcs]).transpose()    df.plot(marker='o', xlabel='Input size / #', ylabel='Best timing / µs', ax=axs[i], title=f"n = {n}")These show that the explicitly looped versions (letter_number_loop() and letter_number_it()) are somewhat comparable, while the NumPy-vectorized (letter_number_np()) fares much better relatively quickly for larger inputs, up to ~2x speed-up.

Advertisement

Answer

Inspecting Output

Benchmarks