Creating AWS EC2 Instances is effortless these days, either through the web console or through Amazon’s APIs. Terminating them is just as easy, but it is often an afterthought when testing or working in a lab environment. Manual termination is possible, but why do the work when it can be automated? Better yet, make the automation event-driven: given an AWS region name (or list of regions!), a Lambda function will find and terminate “tagless” Instances.
All the code used in this post can be found on our GitHub repo.
Summary
This walkthrough covers:
- Definition of “tagless” and the event parameters
- Node.js implementation of the logic (yay asynchronous API calls!)
- AWS Identity and Access Management (IAM) Policy and Role definition for the Lambda function
- AWS Lambda function creation
- Examples
1) Conventions
AWS EC2 implements Tags as a list of 0 to 10 dictionaries associated with an Instance. Each dictionary is guaranteed to have two keys, Key and Value. The preferred definition of “tagless” is an Instance with an empty list [] of Tags, or a 1-item list with a Key of Name and a Value of "". The following examples are “tagless”:
[]
[{Key:"Name", Value:""}]
AWS EC2 will strip whitespace from Tags upon creation, so Value: " " will be tagged as Value: "".
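The definition above can be sketched as a small predicate. The name isTagless() matches the function mentioned later in this post, but the body below is an assumption for illustration and may differ from what index.js actually does (in particular, treating a missing Tags list as empty is a guess).

```javascript
// Hypothetical sketch of the "tagless" check; the real index.js may differ.
function isTagless(tags) {
  // Assumption: a missing Tags property is treated like an empty list.
  if (!Array.isArray(tags) || tags.length === 0) {
    return true;
  }
  // A single Name tag with an empty Value also counts as tagless.
  return tags.length === 1 && tags[0].Key === 'Name' && tags[0].Value === '';
}

console.log(isTagless([]));                                  // true
console.log(isTagless([{ Key: 'Name', Value: '' }]));        // true
console.log(isTagless([{ Key: 'Name', Value: 'web-01' }]));  // false
console.log(isTagless([{ Key: 'Owner', Value: '' }]));       // false
```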
The event that drives this function can come from any AWS Lambda source, but its format still needs to be defined because it carries the necessary parameters. The event is JSON with a single key, region, which holds the list of regions on which to operate:
{
"region": ["us-west-1", ...]
}
Warming up the Node.js interpreter takes ~50ms, and network requests can be non-blocking. Acting on multiple regions in a single invocation therefore has the potential to save time over repeated Lambda function executions that each handle a single region.
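To illustrate the idea of overlapping requests, here is a minimal sketch that starts one request per region before waiting on any of them. It is not the code from the repo: describeAllRegions and makeClient are hypothetical names, and makeClient stands in for whatever creates a per-region EC2 client whose describeInstances() returns a promise.

```javascript
// Sketch only: `makeClient` is a hypothetical factory returning an object
// with a promise-returning describeInstances(), standing in for a real
// per-region EC2 client.
async function describeAllRegions(event, makeClient) {
  // Accept either a single region name or a list of regions.
  const regions = [].concat(event.region);
  // Start every request before awaiting any of them so they overlap.
  const pending = regions.map(async (region) => {
    const data = await makeClient(region).describeInstances({});
    return { region, data };
  });
  // Resolves once all regions have answered; rejects on the first failure.
  return Promise.all(pending);
}

// Stub factory used only to make this sketch runnable without AWS access.
const fakeClient = (region) => ({
  describeInstances: async () => ({ Reservations: [] }),
});

describeAllRegions({ region: ['us-west-1', 'us-east-1'] }, fakeClient)
  .then((results) => console.log(results.map((r) => r.region)));
```

Results come back in the same order as the regions in the event, regardless of which request finishes first.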
2) Node.js Implementation
Look over the code in index.js and adjust the DryRun flags as desired.
When DryRun is set to true, the AWS API will return a DryRunOperation exception. This is why having error handlers is useful.
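As a rough sketch of what such an error handler might do, the helper below sorts the outcome of a DryRun call into a few cases. classifyDryRun is a hypothetical name, and the sketch assumes the error object carries a code property, as errors from version 2 of the AWS SDK for JavaScript do.

```javascript
// Classifies the error argument passed to a terminateInstances callback.
// Assumption: the error exposes its AWS error code as `err.code`.
function classifyDryRun(err) {
  if (!err) {
    return 'terminated'; // no error: a real (non-DryRun) call went through
  }
  if (err.code === 'DryRunOperation') {
    return 'dry-run-ok'; // the request would have succeeded without DryRun
  }
  if (err.code === 'UnauthorizedOperation') {
    return 'denied'; // the Role lacks permission for this action
  }
  return 'failed'; // anything else is treated as a genuine failure
}

console.log(classifyDryRun({ code: 'DryRunOperation' }));       // dry-run-ok
console.log(classifyDryRun({ code: 'UnauthorizedOperation' })); // denied
console.log(classifyDryRun(null));                              // terminated
```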
If a different definition of “tagless” is desired, the isTagless() function logic can be changed without affecting the rest of the code.
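One way to picture that separation is a selection step that takes the “tagless” test as a parameter. This is an illustrative sketch rather than the repo's code: selectInstanceIds is a hypothetical helper, and the sample response is hand-written to mimic the Reservations and Instances nesting of a describeInstances result.

```javascript
// Sketch: walk a describeInstances-style response and return the IDs of
// Instances whose Tags satisfy the supplied predicate.
function selectInstanceIds(data, predicate) {
  const ids = [];
  for (const reservation of data.Reservations || []) {
    for (const instance of reservation.Instances || []) {
      if (predicate(instance.Tags || [])) {
        ids.push(instance.InstanceId);
      }
    }
  }
  return ids;
}

// Hand-written sample response with made-up Instance IDs.
const sample = {
  Reservations: [
    { Instances: [{ InstanceId: 'i-aaaa1111', Tags: [] }] },
    { Instances: [{ InstanceId: 'i-bbbb2222', Tags: [{ Key: 'Owner', Value: 'ops' }] }] },
  ],
};

// A stricter definition of "tagless": only a completely empty Tags list.
console.log(selectInstanceIds(sample, (tags) => tags.length === 0)); // [ 'i-aaaa1111' ]
```

Swapping in a different predicate changes which Instances are selected without touching the traversal.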
3) Required IAM Policy and Role
AWS Lambda can invoke AWS APIs through a number of different SDKs, but only if it has proper permissions to do so. Permissions in this context are two-fold:
- Allow AWS Lambda to call AWS APIs on your behalf (Role)
- Allow specific API actions for the Role (Policy)
First create the necessary Policy. From the AWS console, navigate to Identity and Access Management (IAM) and find the Policies tab. From here, create a Policy by using the Policy Generator with Amazon EC2 as the service and the describeInstances and terminateInstances actions checked. You must select “Add Statement” in order to stage this Policy. Continue on, provide a name and description, and review the Policy in full.
The Sid key is generated dynamically and is likely to be different for other implementations.
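For orientation, a Policy with these two actions might look roughly like the document built below, written as a JavaScript object so it can be printed as JSON. The Sid value is a placeholder, and Resource: "*" is an assumption; the Policy Generator output may differ.

```javascript
// Sketch of a Policy document allowing the two EC2 actions used in this post.
const policy = {
  Version: '2012-10-17',
  Statement: [
    {
      Sid: 'ExampleSid', // placeholder; the Policy Generator creates its own
      Effect: 'Allow',
      Action: ['ec2:DescribeInstances', 'ec2:TerminateInstances'],
      Resource: '*', // assumption: not scoped to specific Instances
    },
  ],
};

// Print the JSON form, which is what the IAM console displays for review.
console.log(JSON.stringify(policy, null, 2));
```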
Next create the Role that will be associated with this new Policy. In IAM, with the Roles tab selected, create a new Role and specify AWS Lambda as the Service Role. Then attach the Policy created in the previous step and review before creating.
By this point, the proper API actions (describeInstances, terminateInstances) have been marked as allowed in a new Policy, which is now attached to a new Role. Time to move on to creating the AWS Lambda function which will consume this new Role.
4) AWS Lambda Function
As is to be expected with Amazon’s APIs, Lambda functions can be created from the command line or the SDKs. While that is not difficult, it’s still useful to see and use the AWS Lambda web console.
From the main AWS console page, locate Lambda and continue on to create a function without using a blueprint. Configure the function, pasting in the code and specifying the Role under which this function will execute. In the Advanced Settings menu, the timeout has been increased to 10 seconds; this will offset network latency in API calls from AWS Lambda. The function will exit as soon as it finishes, even if that is before 10 seconds, and you will not be billed for the remaining time.
It has been observed that API calls from AWS Lambda itself are tens to hundreds of milliseconds slower than when invoking from a local development environment, hence the large increase to the timeout. Mileage may vary…
Configure a test event with specific regions to execute; a single region or multiple regions may be supplied. Be sure to check that DryRun is set as deemed appropriate while testing.
The terminateInstances API call is idempotent. That is, multiple calls with the same set of Instance IDs will continue to return success/true for up to an hour. Feel free to invoke many times, even if the previous run has started the termination process.
5) Benchmarks
Provided are AWS Lambda web console results for different invocations of this function:
- A single region with a single Instance.
- Two AWS regions with only one region having a single Instance to terminate.
- All AWS regions with only one region having a single Instance to terminate.
- All AWS regions with nearly all having 1 to 7 Instances to be terminated.
Note that function runtime increases when multiple regions are added, as this means more API calls. However, as more regions are specified in the event, the expense is amortized because the asynchronous network calls are non-blocking.
Closing Thoughts
- While AWS Lambda functions can call APIs for other regions, Lambda itself is only available in a few specific regions. There might be some optimization gained from deploying this function to multiple regions, each handling events for nearby regions, rather than having one region call the API frontends for all regions.
- The Policy created here has two actions. There is arguably more modularity in creating two separate Policies, each having a single action.
- While there is no danger in using the semaphore demonstrated, the Promise library might be considered if more asynchronous calls need to be coordinated.
- Be sure to review AWS Lambda function logs in CloudWatch as they sometimes can be too large for the small log output buffer on the Lambda test page.
- Currently the function will call context.succeed() even if a single API call fails. A global boolean could be used to mark if any request fails and call context.fail() appropriately. (Any failure is returned as cleanly as possible, as can be seen in the benchmarks with DryRunOperation exception messages.)