AWS s3, lambda. How do you download image from tmp that has a prefix?

I am currently learning AWS, mostly the S3 and Lambda services. The idea is to save an image in one bucket, resize it, and move it to another bucket. I went through a dozen tutorials and finally made it work. However, I have not found (or don't know how to search for) an example of how to deal with images whose keys have prefixes.

This is the code I am using:

import boto3
from PIL import Image

s3_client = boto3.client('s3')

def resize_image(image_path, resized_path):
    with Image.open(image_path) as image:
        # image.thumbnail((128, 128))
        image.save(resized_path,optimize=True,quality=20)

def lambda_handler(event, context):
    
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key'] 
        
        download_path = '/tmp/'+key
        upload_path = '/tmp/resized-{}'.format(key)

        s3_client.download_file(bucket, key, download_path)
        resize_image(download_path, upload_path)
        s3_client.upload_file(upload_path, 'bucket2', key)

It all works perfectly if my image is named just ‘test.jpg’. However, my real images are stored in multiple directories separated by year, month, and day, so a key looks something like this: ‘2020/06/10/test.jpg’. Even if I upload an image with a single prefix, for example ‘test/test.jpg’, and try to use my resize function, I get this error:

[ERROR] FileNotFoundError: [Errno 2] No such file or directory: '/tmp/test/test.jpg.fDAe2cFd'

Obviously this is not correct, because the tmp folder does not contain any folders itself. But how do I get the image then? I tried using just the image name to check if the file exists, like this:

    download_path = '/tmp/'+os.path.basename(key)
    upload_path = '/tmp/resized-{}'.format(os.path.basename(key))

Obviously the image does not exist:

[ERROR] UnidentifiedImageError: cannot identify image file '/tmp/test.jpg'

So what is the correct solution to this problem? I am fairly new to this whole AWS thing and keep getting stuck… It's starting to get really annoying and I'm losing hope.

Answer

The syntax for upload_file() is:

upload_file(Filename, Bucket, Key)

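Therefore, this line in your code:

s3_client.upload_file(upload_path, 'bucket2', key)

already matches that signature: upload_path is the local resized file in /tmp (the Filename), and key is the name the object will receive in bucket2 (the Key). The variable name upload_path is a little misleading, since it holds a local path rather than a destination, but the call itself is fine. That is why your code works when the key has no directories.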

The reason that your code is failing with multiple directories is due to these lines:

download_path = '/tmp/'+key

s3_client.download_file(bucket, key, download_path)        

If the key is foo/bar then it will attempt to download a file to /tmp/foo/bar. However, the /tmp/foo directory does not exist. Unlike Amazon S3, operating systems normally want a directory to exist before writing a file to that location.
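One way to keep the directory structure is to create the missing parent directories before downloading. The following is a minimal sketch, not code from the tutorial: local_path_for_key is a hypothetical helper name, and a temporary directory stands in for Lambda's /tmp so the snippet can run anywhere.

```python
import os
import tempfile


def local_path_for_key(key, base_dir):
    """Map an S3 key to a path under base_dir, creating parent directories.

    Hypothetical helper for illustration; not part of boto3.
    """
    # Strip a leading slash so os.path.join does not discard base_dir.
    path = os.path.join(base_dir, key.lstrip('/'))
    # exist_ok=True makes this safe when the directory is already there.
    os.makedirs(os.path.dirname(path), exist_ok=True)
    return path


# A temporary directory stands in for '/tmp' inside Lambda.
base_dir = tempfile.mkdtemp()
download_path = local_path_for_key('2020/06/10/test.jpg', base_dir)

# Inside the handler this would be followed by something like:
#   s3_client.download_file(bucket, key, download_path)
```

The same consideration would apply to the resized file: '/tmp/resized-{}'.format(key) also contains slashes, so its parent directory would need to exist before image.save() is called.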

I notice that your code is based on Tutorial: Using an Amazon S3 trigger to create thumbnail images – AWS Lambda. In that sample code, you’ll notice that it contains this line:

tmpkey = key.replace('/', '')

This removes subdirectories from the path, thus avoiding the problem.
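Applied to the code in the question, that approach might look like the sketch below. The flatten_key helper name is an assumption made for illustration; only the replace() call comes from the tutorial line quoted above.

```python
def flatten_key(key):
    """Remove every '/' so the result is a plain file name (hypothetical helper)."""
    return key.replace('/', '')


key = '2020/06/10/test.jpg'
tmpkey = flatten_key(key)

# Both local paths now sit directly in /tmp, so no directories are needed.
download_path = '/tmp/' + tmpkey
upload_path = '/tmp/resized-{}'.format(tmpkey)

# In the handler, the original key is still used for the S3 calls, so the
# year/month/day prefix is preserved in the destination bucket:
#   s3_client.download_file(bucket, key, download_path)
#   s3_client.upload_file(upload_path, 'bucket2', key)
```

Note that flattening can make two different keys collide (for example 'a/bc.jpg' and 'ab/c.jpg' both become 'abc.jpg'), so adding a unique prefix to the local file name may be worth considering if that matters.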
