Optimize conversion of numpy ndarray to string

I am writing a Python program that converts an image to a hex string and back. I need two functions: one that takes an image and returns a hex string of the RGB values of each pixel, and another that takes a hex string and two ints (the dimensions) and generates a viewable image of that size from the hex string.

I currently use imageio to get an RGB matrix from the image and then convert that to hex. I’m trying to optimize the image-to-bytes part, as it takes around 2.5 seconds for a 442 KB image of 918 x 575 pixels.

How could I make it quicker?

Here’s the code:

import imageio
import numpy as np


def rgb2hex(rgb):

    """
    convert a list or tuple of RGB values
    to a string in hex
    """

    r,g,b = rgb
    return '{:02x}{:02x}{:02x}'.format(r, g, b)


def arrayToString(array):
    """
    convert an array to a string
    """

    string = ""
    for element in array:
        string += str(element)

    return string


def sliceStr(string,sliceLenght):
    """
    slice a string into chunks of sliceLenght characters
    """

    string = str(string)
    array = np.array([string[i:i+sliceLenght] for i in range(0,len(string),sliceLenght)])
    return array



def hexToRGB(hexadecimal):
    """
    convert a hex string to an array of RGB values
    """
    h = hexadecimal.lstrip('#')
    if len(h)!=6:
        return
    return [int(h[i:i+2], 16) for i in (0, 2, 4)]

def ImageToBytes(image):
    """
    read an image file and convert its RGB data to bytes
    """
    dataToEncrypt = imageio.imread(image)

    # drop the alpha channel if the image has one
    if dataToEncrypt.shape[2] == 4:
        dataToEncrypt = np.delete(dataToEncrypt, 3, 2)

    originalRows, originalColumns,_ = dataToEncrypt.shape


    #converting rgb to hex
    hexVal = np.apply_along_axis(rgb2hex, 2, dataToEncrypt)
    hexVal = np.apply_along_axis(arrayToString, 1, hexVal)
    hexVal = str(np.apply_along_axis(arrayToString, 0, hexVal))

    byteImage = bytes.fromhex(hexVal)

    return (byteImage, [originalRows,originalColumns])

Answer

One simple approach is to use tobytes on the numpy array. E.g.,

image = imageio.imread(filename)
# Drop the alpha channel.
if image.shape[2] == 4:
    image = image[..., :3]
# Convert to bytes directly.
byte_image = image.tobytes()

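On my machine, this gives a 250x speedup compared with converting to strings first. Note: tobytes dumps the raw buffer of the array, so the result only matches the original output (one byte per colour channel) if the dtype of the array is uint8. Luckily, that is what imread returns for ordinary 8-bit images.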
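The question also asks for the hex string itself and for the reverse conversion, so here is a minimal sketch of both directions built on the same idea. The function names image_to_hex and hex_to_image are made up for illustration, and the sketch assumes the input is already a uint8 array of shape (rows, cols, 3) or (rows, cols, 4), such as the one imread returns above.

```python
import numpy as np


def image_to_hex(image):
    """Return (hex_string, (rows, cols)) for a uint8 RGB or RGBA array."""
    image = np.asarray(image)
    if image.dtype != np.uint8:
        raise ValueError("expected a uint8 array")
    # Drop the alpha channel if present.
    if image.shape[2] == 4:
        image = image[..., :3]
    rows, cols, _ = image.shape
    # tobytes() copies the pixels in row-major order; hex() encodes them.
    return image.tobytes().hex(), (rows, cols)


def hex_to_image(hex_string, rows, cols):
    """Rebuild a (rows, cols, 3) uint8 array from a hex string."""
    data = bytes.fromhex(hex_string)
    if len(data) != rows * cols * 3:
        raise ValueError("hex string does not match the requested size")
    # frombuffer returns a read-only view; copy() makes the array writable.
    return np.frombuffer(data, dtype=np.uint8).reshape(rows, cols, 3).copy()
```

The array returned by hex_to_image can then be written to disk with imageio.imwrite to get a viewable file. The size check is there because a hex string of the wrong length would otherwise fail inside reshape with a less helpful message.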
