Optimizing AWS Spending: A Guide to Remove Stale Resources for Cost Efficiency

🗣️ DevOps enthusiasts ♾️, you should definitely know this.

Concepts to know:

  • What is AWS Lambda?

    AWS Lambda is a serverless computing service offered by Amazon Web Services (AWS). It enables developers to run code in response to events without provisioning or managing servers. Lambda scales automatically and charges based on the compute time consumed, allowing for efficient and cost-effective execution of applications.

  • What is the difference between AWS EC2 and AWS Lambda?

    AWS EC2 (Elastic Compute Cloud) provides virtual servers that users can manage and scale as needed, offering full control over infrastructure. AWS Lambda is a serverless service where users upload code to execute in response to events, abstracting server management and scaling, and charging only for compute time.

  • Why is Cost Optimization very important?

    Cost optimization in the cloud is vital due to its pay-as-you-go model. Without careful management, cloud costs can escalate rapidly, affecting budgets and profitability. Effective optimization strategies ensure efficient resource utilization, prevent wastage, and enable businesses to leverage cloud benefits while controlling expenses for sustainable growth and competitiveness.

  • What is an EBS volume?

    Amazon Elastic Block Store (EBS) provides persistent block-level storage volumes for use with Amazon EC2 instances. EBS volumes are like virtual hard drives and can be attached to EC2 instances to store data persistently. They offer durability, scalability, and the ability to be backed up and restored easily.

  • What is a Snapshot?

    A snapshot is a point-in-time copy of data stored in Amazon Web Services (AWS) services such as Amazon Elastic Block Store (EBS) volumes or Amazon Redshift clusters. It captures the entire state of the data at the time the snapshot is created, enabling easy backups, replication, and recovery.

  • What are Stale Resources?

    Stale resources refer to unused or outdated resources within a system, typically in the context of cloud computing environments or IT infrastructure. These can include unused virtual machines, storage volumes, databases, or other resources that were provisioned but are no longer actively utilized or needed for operations.

Overview of the Project:

  • High level overview

Explanation:

A user first creates an AWS EC2 instance, and an EBS volume is created along with it. To keep a copy of important data, the user typically takes a snapshot of that volume. When the work is done, the user terminates the instance, and by default its root EBS volume is deleted with it. The snapshot, however, is not deleted automatically. If the user forgets to remove it, AWS keeps charging for its storage.

Steps to reproduce:

1. Create an EC2 instance. By default, an EBS volume is created with it.

2. Create a snapshot of the volume (in the EC2 console, choose Create snapshot and select the volume).

3. You now have the following resources: one instance, one volume, and one snapshot.

4. What if the user forgets to delete the snapshot after finishing the work? No worries, here is a script that deletes the stale resource.
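The rule for when a snapshot counts as stale in this scenario can be sketched as a small pure function. This is only an illustration: the function name `is_stale_snapshot` and its parameters are hypothetical, and it assumes "stale" means the snapshot's source volume is missing or is not attached to any running instance.

```python
def is_stale_snapshot(volume_id, volume_attachments, running_instance_ids):
    """Return True if a snapshot looks stale (hypothetical helper).

    volume_id: ID of the snapshot's source volume, or None if it has none.
    volume_attachments: instance IDs the volume is attached to,
        or None if the volume no longer exists.
    running_instance_ids: set of IDs of currently running instances.
    """
    if not volume_id:
        # The snapshot has no source volume recorded.
        return True
    if volume_attachments is None:
        # The source volume has been deleted.
        return True
    # Stale unless the volume is attached to at least one running instance.
    return not any(i in running_instance_ids for i in volume_attachments)
```

For the steps above, the snapshot is not stale while the instance is running, and becomes stale once the instance and its volume are gone.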

Creating Lambda Function:

1. Go to Services and type in "lambda".

2. If this is your first Lambda function, the console looks like this.

3. Click Create function, choose the option "Author from scratch", enter your preferred function name, and select Python 3.10 as the runtime. Leave the rest as is and create the Lambda function.

4. Now paste the code below into the Lambda function's code section.

import boto3
from botocore.exceptions import ClientError


def lambda_handler(event, context):
    ec2 = boto3.client('ec2')

    # Get all EBS snapshots owned by this account
    response = ec2.describe_snapshots(OwnerIds=['self'])

    # Collect the IDs of all running EC2 instances
    instances_response = ec2.describe_instances(
        Filters=[{'Name': 'instance-state-name', 'Values': ['running']}]
    )
    active_instance_ids = set()
    for reservation in instances_response['Reservations']:
        for instance in reservation['Instances']:
            active_instance_ids.add(instance['InstanceId'])

    # Delete a snapshot if it has no source volume, if that volume no longer
    # exists, or if that volume is not attached to any running instance
    for snapshot in response['Snapshots']:
        snapshot_id = snapshot['SnapshotId']
        volume_id = snapshot.get('VolumeId')

        if not volume_id:
            ec2.delete_snapshot(SnapshotId=snapshot_id)
            print(f"Deleted EBS snapshot {snapshot_id} as it was not attached to any volume.")
            continue

        # Check if the volume still exists
        try:
            volume_response = ec2.describe_volumes(VolumeIds=[volume_id])
        except ClientError as e:
            if e.response['Error']['Code'] == 'InvalidVolume.NotFound':
                # The source volume is gone (it might have been deleted)
                ec2.delete_snapshot(SnapshotId=snapshot_id)
                print(f"Deleted EBS snapshot {snapshot_id} as its associated volume was not found.")
                continue
            # Do not hide unrelated errors such as missing permissions
            raise

        # Use the running-instance set collected above
        attachments = volume_response['Volumes'][0]['Attachments']
        if not any(a['InstanceId'] in active_instance_ids for a in attachments):
            ec2.delete_snapshot(SnapshotId=snapshot_id)
            print(f"Deleted EBS snapshot {snapshot_id} as it was taken from a volume not attached to any running instance.")

5. Now click Deploy and then click Test. A window pops up asking for an event name. Enter an event name and save it.

6. Then go to Configuration > General configuration and increase the Lambda function's timeout to 10 seconds.

7. By default, the timeout of a Lambda function is 3 seconds.

8. Go back to the Code tab and click Test. You will get the error below.

9. The error says the function does not have permission to call "DescribeSnapshots", so we need to grant that permission.

10. To do that, first go to the role that executes the function and add the required permissions.

11. In the Lambda console, go to Configuration > Permissions and click the role name.

12. After clicking the role name, you are redirected to the IAM console. Go to the Policies section and click Create policy. When asked for a service, select EC2, then select the actions "DeleteSnapshot" and "DescribeSnapshots".

13. Give the policy a name and create it.

14. Now attach the policy to the role that executes the Lambda function.

15. Open the role in the IAM console and add the permission. You will then see the following permissions on the role.

16. Go back to the code and click Test again. You will get another error, shown below.

17. It says the function does not have permission to call "DescribeInstances".

18. Create another policy the same way as before, this time with the actions "DescribeVolumes" and "DescribeInstances".

19. Attach this policy to the role as well. You will now see the following permissions on the role.

20. Click Test again. This time there is no error, and the output looks like this.

21. Terminate the EC2 instance. By default, its EBS volume is deleted with it. Both the instance and the volume are now gone, but the snapshot still remains in the console.

22. Click Test again. The function deletes the snapshot, because it is no longer associated with any existing volume.

23. Check the snapshots list: the snapshot has been deleted as well.

24. After completing all the tasks, delete the Lambda function. It is also a good idea to delete the policies.
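The four permissions granted across the two policies in the steps above could also be collected into a single policy document. The sketch below builds such a document in Python. The function name `build_snapshot_cleanup_policy` is hypothetical, and `"Resource": "*"` is an assumption made for simplicity; in a real account you may want to scope the delete permission more tightly.

```python
import json


def build_snapshot_cleanup_policy():
    """Return an IAM policy document (as a dict) for the cleanup function.

    Hypothetical helper: it only assembles the four EC2 actions
    mentioned in the steps above into one statement.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "ec2:DescribeSnapshots",
                    "ec2:DescribeInstances",
                    "ec2:DescribeVolumes",
                    "ec2:DeleteSnapshot",
                ],
                # Assumption: all resources, to keep the example short.
                "Resource": "*",
            }
        ],
    }


# Render the policy as JSON text.
policy_json = json.dumps(build_snapshot_cleanup_policy(), indent=2)
```

The resulting `policy_json` text could be pasted into the JSON view of the IAM policy editor instead of selecting each action by hand.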

Note: Don't forget to delete all the resources you created.

Credits: I followed this tutorial https://www.youtube.com/watch?v=3ExnySHBO6k&list=PLdpzxOOAlwvKwTyYNJCUwGPvql0TrsPgv&index=14 to make this blog.

Please follow Abhishek Veeramalla for more interesting topics in DevOps; his teaching is simply superb.
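Because the cleanup deletes snapshots permanently, it may be worth listing what would be removed before running it against a real account. The sketch below is a dry-run variant under a few assumptions: `find_stale_snapshots` is a hypothetical name, `ec2` is expected to be a boto3 EC2 client, and it reads all volumes once through paginators instead of looking them up one by one. It deletes nothing.

```python
def find_stale_snapshots(ec2):
    """Return (snapshot_id, reason) pairs for snapshots that look stale.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    This function only reads data; it never calls delete_snapshot.
    """
    # IDs of all running instances
    running = set()
    instance_pages = ec2.get_paginator("describe_instances").paginate(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    )
    for page in instance_pages:
        for reservation in page["Reservations"]:
            for instance in reservation["Instances"]:
                running.add(instance["InstanceId"])

    # Map each existing volume to the instance IDs it is attached to
    attached_to = {}
    for page in ec2.get_paginator("describe_volumes").paginate():
        for volume in page["Volumes"]:
            attached_to[volume["VolumeId"]] = [
                a["InstanceId"] for a in volume["Attachments"]
            ]

    # Classify every snapshot owned by this account
    stale = []
    snapshot_pages = ec2.get_paginator("describe_snapshots").paginate(
        OwnerIds=["self"]
    )
    for page in snapshot_pages:
        for snapshot in page["Snapshots"]:
            volume_id = snapshot.get("VolumeId")
            if not volume_id:
                reason = "no source volume"
            elif volume_id not in attached_to:
                reason = "source volume not found"
            elif not any(i in running for i in attached_to[volume_id]):
                reason = "source volume not attached to a running instance"
            else:
                continue  # still in use, keep it
            stale.append((snapshot["SnapshotId"], reason))
    return stale
```

Calling `find_stale_snapshots(boto3.client("ec2"))` and printing the result gives a list to review before any snapshot is actually deleted.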

Thank you.