Skip to content
Advertisement

On AWS Lambda, Openpyxl doesn’t keep track of the image

when I have a model.xlsx with an image and this code is working perfectly on windows. (keeping the image in output.xlsx)

import openpyxl

wb = openpyxl.load_workbook('model.xlsx')
#doing some stuff on the wb
wb.save('output.xlsx')

Now when I do this on my AWS Lambda everything works perfectly BUT I don’t have the image on the output.xlsx. No error message raised.

import json
import openpyxl
from tempfile import NamedTemporaryFile
import boto3
import botocore


def lambda_handler(event, context):
    s3_client = boto3.client('s3', aws_access_key_id='*****', aws_secret_access_key='*****')
    wb = openpyxl.load_workbook('model.xlsx')
    #doing some stuff on the wb
    with NamedTemporaryFile() as tmp:
        wb.save(tmp.name)
        tmp.seek(0)
        s3_client.upload_file(tmp.name, "my-bucket-name", "filename-in-the-bucket")

    return {
        'statusCode': 200,
        'body': json.dumps("Hello World")
    }

Should I raise a ticket to AWS ? openpyxl ? Why is there no error message ?

Advertisement

Answer

Ok, I’m not sure 100% but I am gonna assume that import like pip3 install --target ./theholypath openpyxl can work but will fail with tricky dependencies (files that you don’t have and are store elsewhere) like it failed today with Pillow.

So some heros did the work for us and packages perfectly the libraries into something that we call layer. For each librarie that you need, you may found a corresponding layer already build by good people. (if you are lucky, not every library has a layer).

For my part by example, I added 2 layers:

  • arn:aws:lambda:eu-west-3:770693421928:layer:Klayers-python38-Pillow:8
  • arn:aws:lambda:eu-west-3:770693421928:layer:Klayers-python38-openpyxl:5

And… it works ! my output.xlsx has now images from model.xlsx, you don’t even need to do the pip install --target my/path my_lib_name

Now, you can find your own needed layers and the way to implement it here.

WARNING:

Some layer versions will have a expiry_date field. This is the date for when the layers > will be deleted.

In general, layers are scheduled for deletion 60 days after a new layer version has been > published for that package. If you use that latest version of a layer, you’re guaranteed > at least 60 days before the layer is deleted.

All functions deployed with a layer will still work indefinitely, but you won’t be able > to deploy new functions referencing a deleted layer version.

User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement