Fastest way to transform a continuous string of hex into base 10 in Python

I have 100M strings of hexadecimal digits, each 3600 characters long, that I want to split into blocks of three digits and then convert into base 10. Strictly speaking, I want to transform these into signed 4-byte numbers.

As I have 100M of these strings to process, my main aim is code efficiency/speed.

For splitting the strings I am using regex:
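
A minimal sketch of what regex-based splitting could look like. The pattern and the names `BLOCK_RE` and `split_blocks` are assumptions for illustration, not the asker's actual snippet:

```python
import re

# Hypothetical pattern: match exactly three hex digits at a time.
# Compiling once avoids re-parsing the pattern for every string.
BLOCK_RE = re.compile(r"[0-9a-fA-F]{3}")


def split_blocks(hex_string):
    """Return the consecutive three-character blocks of hex_string."""
    return BLOCK_RE.findall(hex_string)
```

For example, `split_blocks("0a1fff7b2")` gives `['0a1', 'fff', '7b2']`.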

For converting from hex I am using Python's built-in int converter:
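
As a rough illustration, the built-in `int` accepts a base as its second argument. The helper name `hex_block_to_int` is hypothetical, and this sketch returns non-negative values only; how the values are mapped to signed numbers is not shown in the question:

```python
def hex_block_to_int(block):
    """Interpret a string of hex digits as a non-negative base-10 integer."""
    # int(text, 16) parses text as hexadecimal; upper and lower case both work.
    return int(block, 16)
```

For example, `hex_block_to_int("fff")` gives `4095` and `hex_block_to_int("0a1")` gives `161`.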

For the overall code I am combining things with a list comprehension:
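
One way the two steps might be combined in a single list comprehension. This is a sketch under the same assumptions as the description above (three-digit blocks, regex splitting, `int` with base 16); `convert_string` is an illustrative name:

```python
import re

# Hypothetical pattern for three-digit hex blocks.
BLOCK_RE = re.compile(r"[0-9a-fA-F]{3}")


def convert_string(hex_string):
    """Split hex_string into three-digit blocks and convert each to an int."""
    return [int(block, 16) for block in BLOCK_RE.findall(hex_string)]
```

For example, `convert_string("0a1fff7b2")` gives `[161, 4095, 1970]`. A 3600-character string yields 1200 values, which matches the 1200 conversions per string mentioned in the profiling note.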

I have run cProfile (over a smaller sample of 3000 strings), and it seems the total times for splitting and converting are approximately equal, although converting happens far more often (1200 times per string), whereas splitting occurs only once per string.

Are there any ways I can improve the speed of this code?

Answer

Are there any ways I can improve the speed of this code?

You might try the functools.lru_cache decorator, which stores the result for each input it has already seen and returns the stored value instead of computing it again.

With three hex digits per block there are 4096 (16^3) possible inputs, so a cache of that size can hold every result. Note that caching will increase memory usage.
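
The caching idea above can be sketched as follows. The function name `hex2int` is an assumption, and whether the cache actually beats a plain `int(block, 16)` call would need to be measured on real data:

```python
import functools


# maxsize=4096 lets the cache hold every possible three-digit hex block.
@functools.lru_cache(maxsize=4096)
def hex2int(block):
    """Convert a hex block to an int, reusing results for repeated blocks."""
    return int(block, 16)


# The second "0a1" is served from the cache rather than recomputed.
values = [hex2int(block) for block in ("0a1", "fff", "0a1")]
```

Cache keys are the exact strings, so "FFF" and "fff" are stored as separate entries; if the input mixes cases, the number of distinct keys can exceed 4096.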

User contributions licensed under: CC BY-SA