I am currently learning AWS, mostly the S3 and Lambda services. The idea is to save an image in one bucket, resize it, and move it to another bucket. I went through a dozen tutorials and finally made it work. However, I have not found (or don't know how to search for) an example of how to deal with images whose keys have prefixes.
This is the code I am using:
def resize_image(image_path, resized_path):
    with Image.open(image_path) as image:
        # image.thumbnail((128, 128))
        image.save(resized_path,optimize=True,quality=20)

def lambda_handler(event, context):
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']
        download_path = '/tmp/'+key
        upload_path = '/tmp/resized-{}'.format(key)
        s3_client.download_file(bucket, key, download_path)
        resize_image(download_path, upload_path)
        s3_client.upload_file(upload_path, 'bucket2', key)
It all works perfectly if my image is named just ‘test.jpg’. However, my real images are stored in multiple directories separated by year, month, and day, something like ‘2020/06/10/test.jpg’. But even if I upload an image with one prefix, for example ‘test/test.jpg’, and try to use my resize function, I get this error:
[ERROR] FileNotFoundError: [Errno 2] No such file or directory: '/tmp/test/test.jpg.fDAe2cFd'
Obviously this is not correct, because the /tmp folder does not contain those subfolders. But how do I get the image then? I tried using just the image name to check whether the file exists, like this:
download_path = '/tmp/'+os.path.basename(key)
upload_path = '/tmp/resized-{}'.format(os.path.basename(key))
Obviously the image does not exist:
[ERROR] UnidentifiedImageError: cannot identify image file '/tmp/test.jpg'
So what is the correct solution to this problem? I am fairly new to this whole AWS thing and getting stuck constantly… It's starting to get really annoying and I'm losing hope.
Answer
The syntax for upload_file() is:
upload_file(Filename, Bucket, Key)
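Here Filename is the path of the local file to upload, and Key is the name the object will have in the destination bucket. So this line in your code:
s3_client.upload_file(upload_path, 'bucket2', key)
already passes the arguments in the right order: upload_path is the local resized file in /tmp, and key is the object name it gets in bucket2. The variable name upload_path is a little misleading, since it is a local path rather than a destination, but the call itself is fine, which is why your code works when there are no directories.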
Your code fails with multiple directories because of these lines:
download_path = '/tmp/'+key
s3_client.download_file(bucket, key, download_path)
If the key is foo/bar then it will attempt to download a file to /tmp/foo/bar. However, the /tmp/foo directory does not exist. Unlike Amazon S3, operating systems normally want a directory to exist before writing a file to that location.
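If you would rather keep the directory structure under /tmp, one option is to create the missing parent directories before downloading. The following is a minimal sketch of that idea, not code from the tutorial: the helper name prepare_download_path and its base_dir parameter are made up for illustration. As far as I can tell, the random suffix in your error (.fDAe2cFd) comes from boto3 writing to a temporary file first, which fails for the same reason.

```python
import os


def prepare_download_path(key, base_dir="/tmp"):
    """Return a local path for `key` under `base_dir`, creating parent directories.

    Hypothetical helper for illustration only.
    """
    # Drop any leading slash so os.path.join does not discard base_dir
    relative_key = key.lstrip("/")
    download_path = os.path.join(base_dir, relative_key)
    # Create e.g. /tmp/2020/06/10 if it does not exist yet
    os.makedirs(os.path.dirname(download_path), exist_ok=True)
    return download_path
```

In the handler you would then call download_path = prepare_download_path(key) before s3_client.download_file(bucket, key, download_path). The resized output path needs the same treatment if it also contains the prefix.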
I notice that your code is based on Tutorial: Using an Amazon S3 trigger to create thumbnail images – AWS Lambda. In that sample code, you’ll notice that it contains this line:
tmpkey = key.replace('/', '')
This strips the slashes from the key, so the local file is written directly into /tmp with no subdirectories, thus avoiding the problem.
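Applied to your handler, the flattening approach could look roughly like the sketch below. The function names flat_tmp_paths and process_record, and the idea of passing the S3 client and the resize function in as arguments, are my own assumptions to keep the example self-contained. Only the key.replace('/', '') step is taken from the tutorial. The sketch also decodes the key with unquote_plus, since keys in S3 event notifications arrive URL-encoded.

```python
import os
from urllib.parse import unquote_plus


def flat_tmp_paths(key, tmp_dir="/tmp"):
    """Build flat local paths for `key` by stripping the slashes (hypothetical helper)."""
    tmpkey = key.replace("/", "")
    download_path = os.path.join(tmp_dir, tmpkey)
    upload_path = os.path.join(tmp_dir, "resized-" + tmpkey)
    return download_path, upload_path


def process_record(record, s3_client, resize, dest_bucket):
    """Download one object, resize it locally, and upload it under the same key."""
    bucket = record["s3"]["bucket"]["name"]
    # Keys in S3 event notifications are URL-encoded, so decode them first
    key = unquote_plus(record["s3"]["object"]["key"])
    download_path, upload_path = flat_tmp_paths(key)
    s3_client.download_file(bucket, key, download_path)
    resize(download_path, upload_path)
    # Local resized file first, then the bucket, then the original key (prefix kept)
    s3_client.upload_file(upload_path, dest_bucket, key)
    return key
```

Inside lambda_handler you would call process_record(record, s3_client, resize_image, 'bucket2') for each record. One caveat of flattening is that different keys can map to the same local name (a/bc.jpg and ab/c.jpg both become abc.jpg), so adding a unique prefix to the local file name may be worth considering.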