Monitoring EC2 Health and SSM Status with AWS Lambda and EventBridge: A Step-by-Step Guide


Ensuring your Amazon EC2 instances are healthy and reachable is vital for infrastructure stability and application uptime. In this guide, we’ll walk through how to monitor EC2 instance health and AWS Systems Manager (SSM) availability using AWS Lambda and Amazon EventBridge to automate real-time detection and alerting.


Why Monitor EC2 and SSM Status?

  • EC2 Health: Unhealthy EC2 instances can lead to service degradation or downtime.

  • SSM Status: AWS Systems Manager allows remote access and automation. If it's not working, managing your infrastructure becomes difficult.

By integrating EventBridge and Lambda, you can monitor and respond to these issues proactively.


Architecture Overview

The system comprises the following components:

  1. Amazon EventBridge rules that capture EC2 state changes and SSM availability events.

  2. An AWS Lambda function triggered by those rules.

  3. Amazon SNS (or another notification mechanism) for alerting.


Step 1: Create an EventBridge Rule for EC2 State Changes

Create a rule that triggers when an EC2 instance changes state.

Sample Event Pattern:


{
  "source": ["aws.ec2"],
  "detail-type": ["EC2 Instance State-change Notification"],
  "detail": {
    "state": ["stopped", "terminated", "shutting-down"]
  }
}


This pattern matches instances that enter the stopped, shutting-down, or terminated state.
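If you prefer to create the rule from code rather than the console, a minimal sketch could look like the following. It assumes you pass in a boto3 EventBridge client (boto3.client("events")); the helper name create_ec2_state_rule and the rule name ec2-state-change-alert are illustrative choices, not part of the original setup.

```python
import json

# Same event pattern as shown above, expressed as a Python dict.
EC2_STATE_PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["stopped", "terminated", "shutting-down"]},
}


def create_ec2_state_rule(events_client, rule_name="ec2-state-change-alert"):
    """Create (or update) the rule and return its ARN.

    events_client is assumed to be boto3.client("events"); the rule name
    is a hypothetical default.
    """
    response = events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(EC2_STATE_PATTERN),  # put_rule expects a JSON string
        State="ENABLED",
        Description="Alert when an EC2 instance stops or terminates",
    )
    return response["RuleArn"]
```

Calling create_ec2_state_rule(boto3.client("events")) with credentials for the target account and Region would create the rule on the default event bus. Taking the client as a parameter keeps the function easy to test with a stub.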


Step 2: Monitor SSM Agent Status with EventBridge

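To monitor SSM connectivity (a managed instance's PingStatus is Online, ConnectionLost, or Inactive):

  1. Enable AWS Config to record the AWS::SSM::ManagedInstanceInventory resource type.

  2. Use EventBridge to detect changes in PingStatus.

Example pattern for SSM connectivity. Treat the detail-type and detail fields as placeholders: Systems Manager does not appear to document a dedicated EventBridge event for PingStatus changes, so check the event shape your account actually receives before relying on this pattern: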


{
  "source": ["aws.ssm"],
  "detail-type": ["EC2 Instance State Change Notification"],
  "detail": {
    "status": ["ConnectionLost", "Inactive"]
  }
}


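Alternatively, track availability through Amazon CloudWatch. Verify that a metric named SSMManagedInstanceAvailability exists in your account before alarming on it; if it does not, publish PingStatus as a custom metric instead.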
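PingStatus is returned by the SSM DescribeInstanceInformation API, so another way to detect lost connectivity is to poll that API from a Lambda that runs on an EventBridge schedule. The sketch below assumes that polling approach, which is a substitute for the event pattern above rather than something the original steps describe. find_unreachable_instances is a hypothetical helper, and ssm_client is assumed to be boto3.client("ssm").

```python
def find_unreachable_instances(ssm_client):
    """Return (instance_id, ping_status) pairs for instances that are not Online.

    ssm_client is assumed to be boto3.client("ssm"). The paginator walks
    every page of DescribeInstanceInformation results.
    """
    paginator = ssm_client.get_paginator("describe_instance_information")
    unreachable = []
    for page in paginator.paginate():
        for info in page["InstanceInformationList"]:
            status = info.get("PingStatus")
            # Anything other than Online (ConnectionLost, Inactive) is reported.
            if status != "Online":
                unreachable.append((info["InstanceId"], status))
    return unreachable
```

A scheduled Lambda could call this helper and publish one SNS message per returned pair, reusing the alerting code from Step 3.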


Step 3: Create the AWS Lambda Function

The Lambda function will process the events and trigger alerts.

Sample Python Code:


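import json
import os

import boto3

# Create the client once so warm invocations reuse it.
sns = boto3.client('sns')

# Assumption: the topic ARN is supplied in a TOPIC_ARN environment variable.
# The fallback is the placeholder ARN; replace it with your own.
TOPIC_ARN = os.environ.get('TOPIC_ARN', 'arn:aws:sns:your-region:your-account-id:YourTopic')


def lambda_handler(event, context):
    print("Received event:", json.dumps(event))

    # Use .get() so events that lack these keys (for example, SSM events)
    # do not raise KeyError.
    detail = event.get('detail', {})
    instance_id = detail.get('instance-id', 'unknown')
    state = detail.get('state', detail.get('status', 'unknown'))

    message = f"Instance {instance_id} is in state: {state}"

    sns.publish(
        TopicArn=TOPIC_ARN,
        Message=message,
        Subject='EC2 Instance Alert'
    )

    return {'statusCode': 200, 'body': json.dumps('Notification sent')}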


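Make sure the function's execution role allows sns:Publish on the topic, and that EventBridge is allowed to invoke the function (a resource-based permission on the Lambda for the events.amazonaws.com principal).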
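To check the message-building logic without deploying anything, you can run it against a sample event locally. The sketch below is one way to do that: build_alert is a hypothetical pure function that mirrors the handler's parsing, and the account ID, instance ID, and timestamp in SAMPLE_EVENT are made-up placeholders in the shape of an EC2 state-change event.

```python
# Sample payload in the shape EventBridge delivers for EC2 state changes.
# The account ID, instance ID, and timestamp are made-up placeholders.
SAMPLE_EVENT = {
    "version": "0",
    "detail-type": "EC2 Instance State-change Notification",
    "source": "aws.ec2",
    "account": "123456789012",
    "time": "2024-01-01T00:00:00Z",
    "region": "us-east-1",
    "resources": [
        "arn:aws:ec2:us-east-1:123456789012:instance/i-0123456789abcdef0"
    ],
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "stopped"},
}


def build_alert(event):
    """Return (subject, message) for an event, tolerating missing keys."""
    detail = event.get("detail", {})
    instance_id = detail.get("instance-id", "unknown")
    state = detail.get("state", "unknown")
    region = event.get("region", "unknown")
    subject = "EC2 Instance Alert"
    message = f"Instance {instance_id} in {region} is in state: {state}"
    return subject, message
```

Keeping the parsing separate from the SNS call means the formatting can be unit-tested with plain dictionaries, while the publish step stays a thin wrapper.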


Step 4: Connect Lambda with EventBridge

  • Go to EventBridge → Rules.

  • Select your rule and choose Target → Lambda Function.

  • Attach the Lambda you created earlier.
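The same wiring can be scripted. The sketch below assumes boto3 clients for EventBridge and Lambda are passed in, and that you already have the rule's name and ARN and the function's ARN. connect_rule_to_lambda, the statement ID, and the target ID are hypothetical names chosen for illustration.

```python
def connect_rule_to_lambda(events_client, lambda_client, rule_name, rule_arn,
                           function_arn, statement_id="eventbridge-invoke"):
    """Allow the rule to invoke the function, then register it as a target.

    events_client and lambda_client are assumed to be boto3.client("events")
    and boto3.client("lambda"). Returns the number of targets that failed
    to attach (0 on success).
    """
    # Resource-based permission: lets EventBridge invoke this function,
    # scoped to the one rule. Raises an error if statement_id already exists.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=statement_id,
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )
    # Attach the function as the rule's target.
    response = events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "alert-lambda", "Arn": function_arn}],
    )
    return response["FailedEntryCount"]
```

When you attach the target through the console, EventBridge adds the invoke permission for you; the explicit add_permission call is only needed when scripting the setup.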


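Step 5: (Optional) Add CloudWatch Alarms for SSM Status

If you want to track SSM availability proactively:

  1. Publish each instance's PingStatus to CloudWatch as a custom metric. PingStatus is a field returned by the SSM DescribeInstanceInformation API, so do not expect it to appear as a built-in metric under CloudWatch → Metrics.

  2. Create alarms on that metric.

  3. Use SNS to send alerts.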
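One possible way to implement these alarms is to publish PingStatus as a custom metric and alarm on it. The sketch below assumes that approach. The namespace Custom/SSM, the metric name PingStatusOnline, the alarm name, and both helper functions are hypothetical; cloudwatch_client is assumed to be boto3.client("cloudwatch").

```python
NAMESPACE = "Custom/SSM"          # hypothetical custom namespace
METRIC_NAME = "PingStatusOnline"  # hypothetical metric: 1 = Online, 0 = otherwise


def publish_ping_metric(cloudwatch_client, instance_id, ping_status):
    """Publish 1.0 when the instance is Online, 0.0 otherwise; return the value."""
    value = 1.0 if ping_status == "Online" else 0.0
    cloudwatch_client.put_metric_data(
        Namespace=NAMESPACE,
        MetricData=[{
            "MetricName": METRIC_NAME,
            "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
            "Value": value,
            "Unit": "Count",
        }],
    )
    return value


def create_ping_alarm(cloudwatch_client, instance_id, topic_arn):
    """Alarm when the instance reports not-Online for two 5-minute periods."""
    cloudwatch_client.put_metric_alarm(
        AlarmName=f"ssm-ping-{instance_id}",
        Namespace=NAMESPACE,
        MetricName=METRIC_NAME,
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        Statistic="Minimum",
        Period=300,
        EvaluationPeriods=2,
        Threshold=1.0,
        ComparisonOperator="LessThanThreshold",
        TreatMissingData="breaching",  # no data at all also counts as unhealthy
        AlarmActions=[topic_arn],      # SNS topic that receives the alert
    )
```

The period and evaluation count here are arbitrary starting points; they should match how often the metric is actually published.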


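Benefits of This Setup

  • Automated Monitoring: No manual intervention required.

  • Near-Real-Time Alerts: Notifications go out shortly after EventBridge delivers the event.

  • Scalable: EventBridge rules are regional, so deploy the same rule and function in each Region you use; events from other accounts can be forwarded to a central event bus.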


Conclusion

By leveraging EventBridge and Lambda, you can build a lightweight yet powerful EC2 and SSM monitoring solution that integrates seamlessly with your AWS infrastructure. This proactive approach ensures that potential issues are detected and addressed swiftly, boosting your system’s reliability and maintainability.

