
Transcoding Audio Files with AWS Lambda and SQS without Elastic Transcoder


If you find that using AWS Elastic Transcoder is too expensive, and you don't want to use an EC2 instance for transcoding, perhaps this might help you out.

In the previous post we created a static SoX binary that can be executed inside an AWS Lambda container. In this post we are going to put it to work: when an audio file is uploaded to an S3 bucket, we'll transcode it and upload the result to another bucket. Here's how this will work:

  • When a file is uploaded to a specific S3 bucket, S3 will send a message to an SQS queue.
  • Every minute a CloudWatch event will execute our Lambda.
  • In each execution, our Lambda will read a few messages from the SQS queue and use the static SoX binary to transcode the audio.
  • After the transcoding is done, the resulting file will be uploaded to a different S3 bucket.

The code is also available in this GitHub repo: https://github.com/marcelog/aws-sqs-lambda-audio-transcoding

Setting up the S3 buckets

Go to your S3 console, and create two buckets:

  • The "source" bucket: This is where we are going to upload the files to be transcoded.
  • The "target" bucket: After the files are transcoded, our Lambda will upload the result here.

We're going to set up some events for them in a bit.
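If you prefer the command line, you can create both buckets with the AWS CLI (the bucket names here are just placeholders, use your own):

    aws s3 mb s3://my-source-bucket
    aws s3 mb s3://my-target-bucket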

Setting up the SQS queue

Let's create the SQS queue that will connect your source S3 bucket to your transcoder Lambda. The source S3 bucket will publish a message for every new object created and your Lambda will react to them.

In your SQS console, create a new queue and set the Default Visibility Timeout to 1 second; you can leave the other values at their defaults.

Select your queue, go to the Permissions tab at the bottom of the screen, and click Edit Policy Document.

You will need a policy like the one below to allow your "source" bucket to publish messages. Note that you should change some values to match your setup: the AWS region of your SQS queue, the queue name, the "source" bucket name, and your AWS account id.
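A queue policy along these lines should do it (the region, account id, queue name, and bucket name below are placeholders for the values mentioned above):

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": "*",
          "Action": "SQS:SendMessage",
          "Resource": "arn:aws:sqs:us-east-1:123456789012:my-transcoder-queue",
          "Condition": {
            "ArnLike": {
              "aws:SourceArn": "arn:aws:s3:*:*:my-source-bucket"
            }
          }
        }
      ]
    }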

Set up the event sources in the source S3 bucket

Now that we have the SQS queue in place, we can configure an event that will be triggered for every interesting new object uploaded to this bucket. Go to your S3 Console:

  • Select your "source" bucket, click "Properties", and then click on "Events".
  • Select "Send to SQS queue".
  • In the "SQS Queue" dropdown, select the SQS queue that we created in the step above.
  • As the suffix, you can enter for example ".gsm" or ".wav", or leave it blank to get an SQS message for every file regardless of its extension.
  • You can also specify a prefix for the files you're interested in, or just leave it blank.
  • In the "Events" dropdown, select "ObjectCreated" / "Put".
  • Write a name for this new event.
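With this in place, every matching upload publishes a message to the queue whose body is an S3 event notification. An abridged example of what our Lambda will find in the message body (the bucket name and key are just illustrative):

    {
      "Records": [
        {
          "eventSource": "aws:s3",
          "eventName": "ObjectCreated:Put",
          "s3": {
            "bucket": {"name": "my-source-bucket"},
            "object": {"key": "recordings/call1.gsm", "size": 31337}
          }
        }
      ]
    }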

Create a Lambda function

This is it! Everything's ready now, so we only have to upload the right code to do the job. Go to your Lambda console, create a new Lambda, and choose NodeJS 4.3 as the runtime.

You should use a role with a policy like the one below, in order to allow the new Lambda to read (and delete) messages from SQS and to read and write the source and target buckets:
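A policy along these lines should cover it (the ARNs are placeholders for your own queue and buckets; the logs statement is the standard one that lets a Lambda write to CloudWatch Logs):

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": ["sqs:ReceiveMessage", "sqs:DeleteMessage"],
          "Resource": "arn:aws:sqs:us-east-1:123456789012:my-transcoder-queue"
        },
        {
          "Effect": "Allow",
          "Action": ["s3:GetObject"],
          "Resource": "arn:aws:s3:::my-source-bucket/*"
        },
        {
          "Effect": "Allow",
          "Action": ["s3:PutObject"],
          "Resource": "arn:aws:s3:::my-target-bucket/*"
        },
        {
          "Effect": "Allow",
          "Action": [
            "logs:CreateLogGroup",
            "logs:CreateLogStream",
            "logs:PutLogEvents"
          ],
          "Resource": "arn:aws:logs:*:*:*"
        }
      ]
    }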

Download the code from GitHub at https://github.com/marcelog/aws-sqs-lambda-audio-transcoding, then install the dependencies and build the .ZIP deployment package; check the repo's README for the exact steps, but it boils down to something like:
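    # Assumes the static sox binary from the previous post is bundled at the
    # package root and keeps its execute permissions inside the zip.
    npm install
    zip -r transcoder.zip .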

In your Lambda console, select your function, select the "Code" tab, and upload the .ZIP file just created.
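If you're curious what the function does before uploading it, here is a minimal sketch of a handler along the same lines. This is not the repo's actual code: QUEUE_URL, TARGET_BUCKET, and the sox invocation are placeholder assumptions you would adapt.

    'use strict';

    // Sketch: poll SQS for S3 notifications, transcode each object with the
    // bundled static sox binary, and upload the result to the target bucket.
    const AWS = require('aws-sdk');
    const execSync = require('child_process').execSync;
    const fs = require('fs');

    const sqs = new AWS.SQS();
    const s3 = new AWS.S3();

    // Placeholders: point these at your own queue and target bucket.
    const QUEUE_URL = 'https://sqs.us-east-1.amazonaws.com/123456789012/my-transcoder-queue';
    const TARGET_BUCKET = 'my-target-bucket';

    function processMessage(message, done) {
      const body = JSON.parse(message.Body);
      // Ignore the s3:TestEvent that S3 sends when the notification is configured.
      if (!body.Records) return done();
      const record = body.Records[0];
      const srcBucket = record.s3.bucket.name;
      // S3 URL-encodes keys in event notifications ('+' stands for a space).
      const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '));
      const input = '/tmp/' + key.replace(/\//g, '_');
      const output = input + '.wav';

      s3.getObject({Bucket: srcBucket, Key: key}, (err, data) => {
        if (err) return done(err);
        fs.writeFileSync(input, data.Body);
        // Lambda's working directory is the package root, where sox is bundled.
        execSync('./sox ' + input + ' ' + output);
        s3.putObject({
          Bucket: TARGET_BUCKET,
          Key: key + '.wav',
          Body: fs.readFileSync(output)
        }, (err2) => {
          if (err2) return done(err2);
          // Delete the message only after the result is safely uploaded.
          sqs.deleteMessage({
            QueueUrl: QUEUE_URL,
            ReceiptHandle: message.ReceiptHandle
          }, done);
        });
      });
    }

    exports.handler = (event, context, callback) => {
      // Wait a few seconds for messages published by the source bucket.
      sqs.receiveMessage({
        QueueUrl: QUEUE_URL,
        MaxNumberOfMessages: 10,
        WaitTimeSeconds: 5
      }, (err, data) => {
        if (err) return callback(err);
        const messages = data.Messages || [];
        let pending = messages.length;
        if (pending === 0) return callback(null, 'no messages');
        messages.forEach((m) => processMessage(m, (e) => {
          if (e) console.log('Failed to process message', e);
          if (--pending === 0) callback(null, 'done');
        }));
      });
    };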

Set up a CloudWatch event to periodically trigger your Lambda

In your Lambda console, select your function and click the "Triggers" tab. Click "Add trigger", choose "CloudWatch Event", and select the 1 minute rate event option.
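If you'd rather script this step, the AWS CLI equivalent looks roughly like this (the rule name, function name, region, and account id are placeholders):

    # Create a rule that fires every minute.
    aws events put-rule --name transcode-every-minute \
        --schedule-expression "rate(1 minute)"
    # Allow CloudWatch Events to invoke the Lambda.
    aws lambda add-permission --function-name my-transcoder \
        --statement-id cloudwatch-events --action lambda:InvokeFunction \
        --principal events.amazonaws.com \
        --source-arn arn:aws:events:us-east-1:123456789012:rule/transcode-every-minute
    # Point the rule at the Lambda.
    aws events put-targets --rule transcode-every-minute \
        --targets Id=1,Arn=arn:aws:lambda:us-east-1:123456789012:function:my-transcoder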

Testing your new AWS Lambda audio transcoder

After all of this, your code will be executed automatically every minute. Each run will wait a few seconds for SQS messages coming in from your source S3 bucket, transcode the audio files, and save the results into the target S3 bucket. Enjoy!
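A quick way to exercise the whole pipeline from the command line (the bucket names and test file are placeholders):

    # Upload a test file to the source bucket.
    aws s3 cp test.gsm s3://my-source-bucket/
    # Wait a minute or two for the scheduled Lambda to run, then check the target.
    aws s3 ls s3://my-target-bucket/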