AWS Lambda and S3: a quick and dirty tutorial
Figure out the basics without tearing your hair out
Why am I doing this?
I wanted to set up an example of how to use AWS Lambda with S3 for two reasons: to do a talk on using these features at the Tucson Python Meetup (TuPLE), and to help a TuPLE member get started with a prototype for his (totally awesome) radiology image-processing functions.
BTW: this is a great reason to go to user group meetings. You can learn stuff and you are sure to find people like me who will help you do things.
The problem
I have a function that takes an image file as input. The function takes the image, processes it, analyzes it, and then creates some data as output. I want to make this function available to others. I want them to be able to upload an image and get the data back.
This seems like a problem for AWS Lambda with S3.
What is AWS Lambda and S3 anyway?
Whaa? Haven't you heard? It's like totally awesome.
AWS Lambda is a service in Amazon Web Services (AWS) where you can upload a bundle of code and Amazon will run it for you. It will only run one function (which can call other functions), so you can't use it to run a website or anything, but you can use it to offload a processing step, like in our problem above, and you can do it without having to set up virtual machines or anything. You just upload your code. This can significantly reduce costs, too, because you only pay for what gets run. I know a guy who is putting through 2-3 requests per second and pays about $1 a month. I know!
AWS S3 stands for Simple Storage Service. This is basically storage in the cloud. You can use it to store files and serve static files (like a simple .html page). AWS provides a number of ways to integrate it with Lambda: you can set up Lambda functions to respond to events in your S3 bucket, and you can use Lambda functions to save files to your S3 bucket. An event would be something like a file was uploaded, a file was changed, or a file was deleted.
How it works
Lambda will react to events in your bucket. The event is a JSON object with lots of details about what happened. Your function then uses this event data to do whatever it needs to do. Part of the data is the name of the file that ended up in the bucket. Your function can use that to get the file and do the things.
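For example, a handler can pull the bucket and file name straight out of the event data. A minimal sketch (the full blueprint code we will use appears later in this tutorial):

def lambda_handler(event, context):
    record = event['Records'][0]  # S3 delivers a list of event records
    bucket = record['s3']['bucket']['name']
    key = record['s3']['object']['key']  # the name of the file
    # ...fetch the file from S3 and do the things...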
AWS provides a Python SDK called boto3 that makes it easy to integrate your functions with AWS services, including S3.
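The same SDK works outside of Lambda anywhere you have AWS credentials configured. A minimal sketch of fetching a file, assuming a bucket called my-bucket (a made-up name; use your own):

import boto3

s3 = boto3.client('s3')  # picks up credentials from your environment or AWS config
# download_file(bucket_name, key, local_path)
s3.download_file('my-bucket', 'HappyFace.jpg', 'HappyFace.jpg')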
Why not just use the AWS tutorials?
The official AWS tutorial says a lot of things. Too many things, and a lot of it you don't really need if you just want a simple example. For example, we won't be using the AWS CLI. (But if you really want to install the AWS CLI, and you are on Windows 10, I suggest enabling the Ubuntu subsystem and using that. You can use apt-get to install it.)
You can use that tutorial and that is what I did, sorta, but I was frustrated and sometimes befuddled.
The AWS introduction to Lambda is good if you are not familiar with the Lambda service.
The AWS introduction to S3 is good if you are not familiar with the S3 service.
So here is a quick and dirty getting started guide with some thoughts on where to go next.
This tutorial reflects the AWS console UI as of 7/2/2017.
Sign up with AWS
Get yourself an account with AWS if you haven't already. And a word to the wise: protect yourself and set up multi-factor authentication when you can.
Create an S3 bucket
This is where all your files are going to go.
- Log in to AWS and go to the S3 console. You might have to click Get started to get to the actual console.
- Click +Create bucket
- Enter a bucket name.
- Select a region and make a note of what you select. For this quick example it doesn't really matter which one, but you usually want a region closest to you or your users. The default is US East (N. Virginia).
- Click Next
- This is where you select some bucket properties. You can version your files, set up access logs, and add tags. We won't use any of these here.
- Click Next
- Set permissions. This is how you configure what AWS users can access the S3 bucket. Take a look at the permissions by expanding each section. You can leave the defaults.
- Click Next
- Here you can review the settings.
- Click Create bucket
- Upload a happy face image to your bucket. Make sure the file is called HappyFace.jpg. We will use it to test our Lambda function.
Now that you have an S3 bucket you can create one of the built-in (blueprint) Lambda functions and integrate it with your S3 bucket.
Setup a blueprint Lambda function
This is really nice. AWS provides a number of sample Lambda functions you can set up. AWS calls these blueprints. Here we will use the s3-get-object-python blueprint.
It is worth a moment to check out the other blueprints. You might find something you could use.
- Go to the Lambda console. On your AWS console home, you should be able to search for Lambda to find the link. Before you get to the Dashboard you might have to click Get started.
- Make sure you select a region in the top menu bar next to your username. Select the same region that you selected for your S3 bucket. If your function is not created in the same region as your S3 bucket, you will get an error about a malformed authorization header, something like
got us-west-1 but expected us-east-1
- On the AWS Lambda Dashboard click Create a Lambda function.
- You will get the Select blueprint page. Here you get a bunch of options. We are going to use the S3 example.
- Click s3-get-object-python
- On the next page you will set up your function and configure a trigger. This is where you associate S3 bucket events with this Lambda function.
- Select the bucket you created above.
- In Event Type select Object Created. This will cover any event related to creating and updating a file in the bucket.
- Prefix -- leave blank. If you had created a folder in your bucket you could put that here.
- Suffix -- leave blank. If you put a file extension then the Lambda function will only care about files with that extension.
- Check the Enable trigger box
- Click Next
- Enter a name for your function.
- See the code section. Look through it. This is the function that Lambda will run when an event is triggered. This code reads the content type of the file (e.g. image, plain text), prints it, and returns it.
This is what the code looks like:
from __future__ import print_function

import json
import urllib

import boto3

print('Loading function')

s3 = boto3.client('s3')


def lambda_handler(event, context):
    # print("Received event: " + json.dumps(event, indent=2))

    # Get the object from the event and show its content type
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.unquote_plus(event['Records'][0]['s3']['object']['key'].encode('utf8'))
    try:
        response = s3.get_object(Bucket=bucket, Key=key)
        print("CONTENT TYPE: " + response['ContentType'])
        return response['ContentType']
    except Exception as e:
        print(e)
        print('Error getting object {} from bucket {}. Make sure they exist and your bucket is in the same region as this function.'.format(key, bucket))
        raise e
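Note that this blueprint targets the Python 2.7 runtime. If you adapt it for a Python 3 runtime, I believe the key-decoding line becomes something like:

import urllib.parse

key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'])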
- Scroll down to the Lambda function handler and role section
- Notice the handler. This is the function in your code that Lambda will call, specified as module_name.function_name.
- Select a Role. The Lambda console will create the exact role you need to run your function. Select Create new role from templates
- Enter the Role name. I named mine something related to the name of my function.
- In Policy templates select S3 object read-only permissions
- There are other settings but we don't need those right now.
- Click Next
- You will see the review page. Make sure you entered all the right things. Click Create function.
- You should see your function listed in the dashboard.
Now let's check IAM to see what Lambda created for us. (IAM is where you manage access permissions for your services.)
- Go to the IAM management console
- On the left hand menu click Roles
- You should see the Role you created on the Lambda console. Click it in the list
- Lambda automatically created a policy for this role. Under the Permissions tab you should see a policy that looks like AWSLambdaBasicExecutionRole-blahblahblah
- This is not enough to successfully run the function. We have to give this role permission to access S3. We can do that with a policy. Stay on the Permissions tab and click Attach Policy. Select AmazonS3ReadOnlyAccess (a rough sketch of what it grants is shown after these steps)
- Wait about a minute to let the changes propagate through AWS and then we can go on to the next steps.
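If you are curious what that managed policy grants, AmazonS3ReadOnlyAccess looks roughly like this (paraphrased; check the policy document in the IAM console for the authoritative version):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:Get*",
        "s3:List*"
      ],
      "Resource": "*"
    }
  ]
}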
Use the built-in test in Lambda to try it out
- Back on the Lambda console, click the Functions menu and then click on your function.
- Click the Actions selector and select Configure test event
- Select a Sample event template. You will want to select S3 Put.
- Notice the sample event data that the test provides. This won't quite work for our test. We will need to make some changes to point it to our S3 bucket and use our new role.
- Select and copy the event data sample and paste it into a text editor. I like Visual Studio Code because it is awesome.
- If you are using a text editor with syntax highlighting, make sure to save your file as .json to take advantage of that.
- Where it says principalId, replace the value with your role name.
- In the bucket object, find the name field and replace the value with the name of your bucket.
- In the s3 object, replace the eTag value with the ETag property of your HappyFace.jpg file in your bucket. You can find it by going to your bucket and then clicking HappyFace.jpg. You should see the properties. Copy and paste the ETag into the JSON data.
- In the bucket object, find the arn field and replace the last part after ::: with the name of your bucket.
My own event data looks like this:
{
"Records": [
{
"eventVersion": "2.0",
"eventTime": "1970-01-01T00:00:00.000Z",
"requestParameters": {
"sourceIPAddress": "127.0.0.1"
},
"s3": {
"configurationId": "testConfigRule",
"object": {
"eTag": "c1e0946156638cad4c6ac699e90f54d2",
"sequencer": "0A1B2C3D4E5F678901",
"key": "HappyFace.jpg",
"size": 1024
},
"bucket": {
"arn": "arn:aws:s3:::tuple-source",
"name": "tuple-source",
"ownerIdentity": {
"principalId": "tuple-demo"
}
},
"s3SchemaVersion": "1.0"
},
"responseElements": {
"x-amz-id-2": "EXAMPLE123/5678abcdefghijklambdaisawesome/mnopqrstuvwxyzABCDEFGH",
"x-amz-request-id": "EXAMPLE123456789"
},
"awsRegion": "us-west-2",
"eventName": "ObjectCreated:Put",
"userIdentity": {
"principalId": "tuple-demo"
},
"eventSource": "aws:s3"
}
]
}
Save your file for posterity, then copy and paste the whole of the JSON data over the sample event template where we got it.
Click Save and test.
Wait for it to run and you should see success. Congrats, you did it!
What's next?
This example is nice and all, but you probably don't care about a file's type. Maybe you even have a function you think would work great as a Lambda function. This tutorial will take you through creating a deployment package for Lambda in Python.
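If you go down that road, a deployment package is, at its core, just a zip file with your code (and any dependencies) at the top level. A minimal sketch, assuming your handler lives in a file called lambda_function.py:

import zipfile

# bundle the handler into a zip you can upload to Lambda
with zipfile.ZipFile('function.zip', 'w', zipfile.ZIP_DEFLATED) as z:
    z.write('lambda_function.py')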
Learn more about Lambda
- Using the AWS CLI
- Tags
- Monitoring
Learn more about S3
Once you have a handle on S3 and Lambda you can build a Python application that will upload files to the S3 bucket. Here is a simple example of how to use the boto3 SDK to do it. You could incorporate this logic in a Python module in a bigger system, like a Flask app or a web API.
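The upload itself is a one-liner with boto3. A minimal sketch, assuming the tuple-source bucket from the test event above:

import boto3

s3 = boto3.client('s3')
# upload_file(local_path, bucket_name, key)
s3.upload_file('HappyFace.jpg', 'tuple-source', 'HappyFace.jpg')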
Tweet me if you have any questions or problems.