AWS SQS Lambda Integration with Resiliency Patterns
Hi there, I’m Sanjay 👋!
About me
- 🔭 Working: Lead Software Engineer - working on Spring Boot, Reactive Programming, Microservices, Kafka, Cassandra, Kubernetes, AWS.
- 🖥️ Interests: I love building cool Software & Systems, Self-Hosting, Gaming
- 🌱 Learning: Go | Rust | Scala | Design Patterns
- 💬 Ask me about: Java | Reactive Spring | Containers | AWS
- 🤝 Collaboration: Looking to collaborate on several projects; check out my GitHub
What is AWS SQS
- SQS offers a Secure, Durable, and Highly Available Hosted Queue
- Used to integrate and decouple distributed software systems and components
- Standard queues support at-least-once message delivery, and FIFO queues support exactly-once message processing and high-throughput mode
- Offers Dead Letter Queues
SQS Concepts w.r.t Lambda
- Queue Types: Standard and FIFO
- Visibility timeout: period during which a received message is hidden from other consumers; if it is not deleted within that time, it becomes visible again
- Message retention period: 1 minute to 14 days (defaults to 4 days)
- Delivery delay: 0 seconds to 15 minutes
- SQS Event Source Mapping: the Lambda service reads messages from an SQS queue and invokes the Lambda function
- Enable trigger: Enable or disable SQS-Lambda Integration
- Batch size: number of records to send to the function in each batch. Maximum 10,000 for Standard, 10 for FIFO
- Batch window: wait time (in seconds) to gather records before invoking the function, applicable for Standard
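The settings above map onto the parameters of an event source mapping. The following is a minimal sketch, assuming the mapping is created from Python with boto3. `build_event_source_mapping_params` is a hypothetical helper (not part of any AWS SDK) that only checks the batch size limits listed above and returns keyword arguments using the parameter names of boto3's `create_event_source_mapping`.

```python
# Batch size limits as listed on the slide; the helper itself is hypothetical.
MAX_BATCH_SIZE = {"standard": 10_000, "fifo": 10}


def build_event_source_mapping_params(queue_arn, function_name, batch_size=10,
                                      batch_window_seconds=0, enabled=True):
    """Return keyword arguments for boto3's create_event_source_mapping."""
    # FIFO queue names (and therefore their ARNs) always end in ".fifo".
    queue_type = "fifo" if queue_arn.endswith(".fifo") else "standard"
    limit = MAX_BATCH_SIZE[queue_type]
    if not 1 <= batch_size <= limit:
        raise ValueError(
            f"batch_size for a {queue_type} queue must be between 1 and {limit}"
        )
    if batch_window_seconds < 0:
        raise ValueError("batch_window_seconds must not be negative")

    params = {
        "EventSourceArn": queue_arn,
        "FunctionName": function_name,
        "Enabled": enabled,          # the "Enable trigger" switch
        "BatchSize": batch_size,
    }
    if batch_window_seconds:
        params["MaximumBatchingWindowInSeconds"] = batch_window_seconds
    return params
```

With boto3 installed and credentials configured, the result could be passed on as `boto3.client("lambda").create_event_source_mapping(**params)`.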
Message Lifecycle
SQS Visibility Timeout
- To avoid message loss, SQS does not delete a message on receipt; it is the consumer's responsibility to delete the message after processing
- The message remains in the queue after it is received, but SQS applies a visibility timeout to prevent other consumers from processing the same message again
- Default visibility timeout is 30 seconds. It can be set from 0 seconds to 12 hours
- It can be set at the queue level or changed dynamically per message
- Caution: with FIFO queues, choose a MessageGroupId that spreads messages across many groups, so that a failing message blocks only its own group instead of most of the queue
Note:
- For Standard queues, the visibility timeout isn't a guarantee against receiving a message twice.
- FIFO queues allow the producer or consumer to attempt multiple retries: producers can retry a send with the same MessageDeduplicationId, and consumers don't receive further messages for the same MessageGroupId until the received message is deleted or its visibility timeout expires
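The receive, hide, and delete cycle described in this section can be illustrated with a toy model. This is only a sketch: `ToyQueue` is a hypothetical in-memory class, not the SQS API, and time is passed in explicitly as `now` (in seconds) so the behaviour is easy to follow.

```python
import itertools


class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout (not real SQS)."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self._messages = {}              # message id -> [body, visible_at]
        self._ids = itertools.count(1)

    def send(self, body):
        message_id = next(self._ids)
        self._messages[message_id] = [body, 0]   # visible immediately
        return message_id

    def receive(self, now):
        """Return (id, body) of the first visible message and hide it, else None."""
        for message_id, entry in self._messages.items():
            body, visible_at = entry
            if visible_at <= now:
                # The message stays in the queue but is hidden until the timeout.
                entry[1] = now + self.visibility_timeout
                return message_id, body
        return None

    def delete(self, message_id):
        """The consumer's responsibility after successful processing."""
        self._messages.pop(message_id, None)
```

In this model, a message received at `now=0` is invisible at `now=10`, reappears at `now=30` if it was not deleted, and is gone for good once `delete` is called.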
In-Flight Messages & Scaling
- Messages that are received by the consumer but not deleted are known as In-Flight messages
- For Standard queues: there can be a maximum of approximately 120,000 in-flight messages
- For FIFO queues: there can be a maximum of 20,000 in-flight messages
- Standard Queues: Lambda uses long polling & reads up to 5 batches to invoke your function.
- Up until 06 Nov 2023, Lambda added up to 60 concurrent executions/minute, scaling up to a maximum of 1,250 concurrent executions in approximately 20 minutes.
- Now Lambda functions can scale up to 5x faster, adding up to 300 concurrent executions/minute.
- Even at peak, with a batch size of 10, the maximum number of messages processed concurrently by Lambda = 10 messages per batch * 1,250 executions = 12,500
- FIFO queues: Lambda sends messages to your function in the order that it receives them and ensures that messages in the same group are delivered to Lambda in order.
- Lambda sorts the messages into groups and sends only one batch at a time for a group.
- Your function can scale in concurrency to the number of active message groups.
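The scaling figures for Standard queues can be checked with a little arithmetic. This is a rough sketch that assumes a linear ramp from zero concurrency, which is a simplification; the function names are illustrative only.

```python
def minutes_to_reach(max_concurrency, added_per_minute):
    """Minutes needed for a linear ramp from zero to max_concurrency."""
    return max_concurrency / added_per_minute


def peak_messages_in_progress(batch_size, max_concurrency):
    """Messages handled at once when every execution holds a full batch."""
    return batch_size * max_concurrency


old_ramp = minutes_to_reach(1250, 60)    # rate before 06 Nov 2023
new_ramp = minutes_to_reach(1250, 300)   # current rate
print(f"old ramp: {old_ramp:.1f} min, new ramp: {new_ramp:.1f} min")
print(peak_messages_in_progress(10, 1250))
```

The first line printed is `old ramp: 20.8 min, new ramp: 4.2 min`, matching the "approximately 20 minutes" quoted above, and the second is `12500`.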
SQS Message Event in Lambda Function - Standard
{
  "Records": [
    {
      "messageId": "059f36b4-87a3-44ab-83d2-661975830a7d",
      "receiptHandle": "AQEBwJnKyrHigUMZj6rYigCgxlaS3SLy0a...",
      "body": "Test message.",
      "attributes": {
        "ApproximateReceiveCount": "1",
        "SentTimestamp": "1545082649183",
        "SenderId": "AIDAIENQZJOLO23YVJ4VO",
        "ApproximateFirstReceiveTimestamp": "1545082649185"
      },
      "messageAttributes": {},
      "md5OfBody": "e4e68fb7bd0e697a0ae8f1bb342846b3",
      "eventSource": "aws:sqs",
      "eventSourceARN": "arn:aws:sqs:us-east-2:123456789012:my-queue",
      "awsRegion": "us-east-2"
    },
    ...
  ]
}
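A function receiving the event above might read it as follows. This is a minimal sketch: `summarize_records` is a hypothetical helper, and only fields that appear in the sample event are used. Note that `ApproximateReceiveCount` arrives as a string.

```python
def summarize_records(event):
    """Return (messageId, body, receive_count) for each record in the event."""
    summary = []
    for record in event.get("Records", []):
        receive_count = int(record["attributes"]["ApproximateReceiveCount"])
        summary.append((record["messageId"], record["body"], receive_count))
    return summary


def handler(event, context):
    for message_id, body, receive_count in summarize_records(event):
        # A receive count above 1 means this delivery is a retry.
        print(f"{message_id} attempt={receive_count} body={body!r}")
```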
SQS Message Event in Lambda Function - FIFO
{
  "Records": [
    {
      "messageId": "11d6ee51-4cc7-4302-9e22-7cd8afdaadf5",
      "receiptHandle": "AQEBBX8nesZEXmkhsmZeyIE8iQAMig7qw...",
      "body": "Test message.",
      "attributes": {
        "ApproximateReceiveCount": "1",
        "SentTimestamp": "1573251510774",
        "SequenceNumber": "18849496460467696128",
        "MessageGroupId": "1",
        "SenderId": "AIDAIO23YVJENQZJOL4VO",
        "MessageDeduplicationId": "1",
        "ApproximateFirstReceiveTimestamp": "1573251510774"
      },
      "messageAttributes": {},
      "md5OfBody": "e4e68fb7bd0e697a0ae8f1bb342846b3",
      "eventSource": "aws:sqs",
      "eventSourceARN": "arn:aws:sqs:us-east-2:123456789012:fifo.fifo",
      "awsRegion": "us-east-2"
    }
  ]
}
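FIFO records carry a `MessageGroupId`, so a function can handle each group's messages together while keeping their delivery order. A small sketch, where `group_records` is a hypothetical helper name:

```python
from collections import defaultdict


def group_records(event):
    """Map each MessageGroupId to its records, in the order they were delivered."""
    groups = defaultdict(list)
    for record in event.get("Records", []):
        group_id = record["attributes"]["MessageGroupId"]
        groups[group_id].append(record)
    return dict(groups)
```

Appending in iteration order keeps the per-group ordering that the FIFO queue already provides; the sketch does not re-sort anything.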
SQS Resiliency Pattern with Lambda
Enable Batch Error Reporting by running:
aws lambda update-event-source-mapping \
--uuid "a1b2c3d4-5678-90ab-cdef-123401" \
--function-response-types "ReportBatchItemFailures"
Update your Lambda function code to collect the messageId of each failed message and return them in this response format:
{
  "batchItemFailures": [
    {
      "itemIdentifier": "message-id-1"
    },
    {
      "itemIdentifier": "message-id-4"
    }
  ]
}
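A handler that produces the response above could look like the sketch below. It assumes `ReportBatchItemFailures` is enabled on the event source mapping; `process` is a hypothetical placeholder for the real business logic and fails on purpose for bodies containing "fail".

```python
def process(record):
    """Hypothetical business logic; raises for a message it cannot handle."""
    if "fail" in record["body"]:
        raise ValueError("cannot process message")


def handler(event, context):
    failures = []
    for record in event.get("Records", []):
        try:
            process(record)
        except Exception:
            # Report only this message; the rest of the batch is unaffected.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

An empty `batchItemFailures` list tells Lambda that the whole batch succeeded, while the listed messages become visible again for a retry. For FIFO queues, the AWS documentation advises stopping at the first failure and reporting that message and all unprocessed ones, so that ordering is preserved; this sketch does not do that.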
Blog Post on Batch Error Reporting by srcecde