AWS SQS Lambda Integration with Resiliency Patterns

Documentation

Hi there, I’m Sanjay 👋!


About me

  • 🔭 Working: Lead Software Engineer - working on Spring Boot, Reactive Programming, Microservices, Kafka, Cassandra, Kubernetes, AWS.
  • 🖥️ Interests: I love building cool Software & Systems, Self-Hosting, Gaming
  • 🌱 Learning: Go | Rust | Scala | Design Patterns
  • 💬 Ask me about: Java | Reactive Spring | Containers | AWS
  • 🧑‍🤝‍🧑 Collaboration: Looking to collaborate on several projects over here, check out my GitHub

Languages, Frameworks and Platforms

Java, Spring, Kotlin, Project Reactor, Kafka, Cassandra, AWS, Kubernetes

What is AWS SQS

  • SQS offers a Secure, Durable, and Highly Available Hosted Queue
  • Used to integrate and decouple distributed software systems and components
  • Standard queues support at-least-once message delivery, and FIFO queues support exactly-once message processing and high-throughput mode
  • Offers Dead Letter Queues

SQS Concepts w.r.t Lambda

  • Queue Types: Standard and FIFO
  • Visibility timeout: period during which a received message is hidden from other consumers; if the message is not deleted within that period, it becomes visible again
  • Message retention period: 1 minute to 14 days (defaults to 4 days)
  • Delivery delay: 0 seconds to 15 minutes
  • SQS Event Source Mapping: the Lambda service polls the SQS queue and invokes the Lambda function with the messages it reads
  • Enable trigger: Enable or disable SQS-Lambda Integration
  • Batch size: number of records to send to the function in each batch. Standard: 10,000 (max). FIFO: 10 (max)
  • Batch window: maximum time (in seconds) to gather records before invoking the function; applies to Standard queues only
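The settings above correspond to parameters of Lambda's event source mapping. The following is a minimal sketch in Python, assuming the trigger is created with boto3's `create_event_source_mapping`; the helper name `build_esm_params`, the queue ARNs, and the function name are hypothetical. The helper only builds and validates the parameter dictionary, so it can be tried without AWS credentials. The rule that a batch size above 10 needs a batch window of at least 1 second comes from the Lambda documentation as I understand it, not from the list above.

```python
STANDARD_MAX_BATCH = 10_000  # max batch size for Standard queues
FIFO_MAX_BATCH = 10          # max batch size for FIFO queues


def build_esm_params(queue_arn, function_name, batch_size=10,
                     batch_window_seconds=0, enabled=True):
    """Build keyword arguments for boto3's create_event_source_mapping."""
    is_fifo = queue_arn.endswith(".fifo")
    max_batch = FIFO_MAX_BATCH if is_fifo else STANDARD_MAX_BATCH

    if not 1 <= batch_size <= max_batch:
        raise ValueError(f"batch size must be between 1 and {max_batch}")
    if is_fifo and batch_window_seconds:
        raise ValueError("batch window applies to Standard queues only")
    # Assumption taken from the Lambda docs: large batches need a window.
    if batch_size > 10 and batch_window_seconds < 1:
        raise ValueError("batch size > 10 requires a batch window >= 1 second")

    params = {
        "EventSourceArn": queue_arn,
        "FunctionName": function_name,
        "Enabled": enabled,          # the "Enable trigger" switch
        "BatchSize": batch_size,
    }
    if not is_fifo:
        params["MaximumBatchingWindowInSeconds"] = batch_window_seconds
    return params


# With credentials configured, the result could be passed on like this:
#   boto3.client("lambda").create_event_source_mapping(**params)
params = build_esm_params(
    "arn:aws:sqs:us-east-2:123456789012:my-queue",  # hypothetical queue
    "my-function",                                   # hypothetical function
    batch_size=100,
    batch_window_seconds=5,
)
```

Validating locally keeps configuration mistakes (for example, a FIFO batch size of 11) from surfacing only as an API error at deploy time.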

Message Lifecycle

sqs-message-lifecycle

SQS Visibility Timeout

  • SQS does not delete a message when it is received; to avoid message loss, it is the consumer's responsibility to delete the message after processing it
  • A message remains in the queue after it is received, but SQS applies a visibility timeout to prevent other consumers from processing the same message again
  • The default visibility timeout is 30 seconds; it can be set between 0 seconds and 12 hours
  • It can be set at the queue level or changed dynamically per message
  • Caution: with FIFO queues, choose MessageGroupId values that spread messages across many groups, so that a failing message blocks only its own group rather than most of the queue
    Note:
  • For Standard queues, the visibility timeout isn't a guarantee against receiving a message twice.
  • FIFO queues allow the producer or consumer to attempt multiple retries: producers can retry a send using the same MessageDeduplicationId, and consumers don't receive further messages for the same MessageGroupId until the in-flight message is deleted or its visibility timeout expires
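The receive, hide, and delete cycle described above can be illustrated with a toy in-memory queue. This is only a sketch of the behaviour, not the SQS API: the class `InMemoryQueue` and its methods are hypothetical, and time is passed in explicitly as `now` (in seconds) so the example is deterministic.

```python
import uuid


class InMemoryQueue:
    """Toy queue that mimics visibility-timeout behaviour (not the SQS API)."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self._messages = {}    # message_id -> body
        self._visible_at = {}  # message_id -> time at which it is visible
        self._receipts = {}    # receipt_handle -> message_id

    def send(self, body, now=0):
        message_id = str(uuid.uuid4())
        self._messages[message_id] = body
        self._visible_at[message_id] = now
        return message_id

    def receive(self, now):
        """Return (receipt_handle, body) for one visible message, or None."""
        for message_id, body in self._messages.items():
            if self._visible_at[message_id] <= now:
                # Hide the message from other consumers until the timeout ends.
                self._visible_at[message_id] = now + self.visibility_timeout
                receipt = str(uuid.uuid4())
                self._receipts[receipt] = message_id
                return receipt, body
        return None

    def delete(self, receipt_handle):
        """Consumer confirms successful processing; the message is removed."""
        message_id = self._receipts.pop(receipt_handle)
        self._messages.pop(message_id, None)
        self._visible_at.pop(message_id, None)


queue = InMemoryQueue(visibility_timeout=30)
queue.send("order-1", now=0)

first = queue.receive(now=0)             # consumer A receives the message
hidden = queue.receive(now=10)           # consumer B sees nothing: None
retry = queue.receive(now=30)            # never deleted, so it is visible again
queue.delete(retry[0])                   # processed and deleted this time
after_delete = queue.receive(now=100)    # None: the message is gone
```

The message is delivered a second time because the first consumer never deleted it, which is the reason consumers of Standard queues should be idempotent.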

In-Flight Messages & Scaling

  • Messages that are received by the consumer but not deleted are known as In-Flight messages
  • For Standard queues: there can be a maximum of approximately 120,000 in-flight messages
  • For FIFO queues: there can be a maximum of 20,000 in-flight messages
  • Standard Queues: Lambda uses long polling & reads up to 5 batches to invoke your function.
  • Up until 06 Nov 2023, Lambda added up to 60 concurrent executions per minute, reaching the maximum of 1,250 concurrent executions in approximately 20 minutes.
  • Now Lambda functions can scale up to 5x faster, adding up to 300 concurrent executions per minute.
  • Even at peak, assuming a batch size of 10, the maximum number of messages processed concurrently by Lambda = 10 messages per batch * 1,250 executions = 12,500
  • FIFO queues: Lambda sends messages to your function in the order that it receives them and ensures that messages in the same group are delivered to Lambda in order.
  • Lambda sorts the messages into groups and sends only one batch at a time for a group.
  • Your function can scale in concurrency to the number of active message groups.
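The scaling figures for Standard queues can be checked with a little arithmetic. The sketch below assumes a simple linear ramp from zero to the maximum concurrency and a batch size of 10; the function names are hypothetical and real scaling behaviour depends on queue depth and account limits.

```python
def minutes_to_reach(target_concurrency, added_per_minute):
    """Minutes for a linear ramp from zero to the target concurrency."""
    return target_concurrency / added_per_minute


def max_concurrent_messages(batch_size, max_concurrency):
    """Upper bound on messages being processed at the same moment."""
    return batch_size * max_concurrency


MAX_CONCURRENCY = 1250

old_ramp = minutes_to_reach(MAX_CONCURRENCY, 60)      # about 20.8 minutes
new_ramp = minutes_to_reach(MAX_CONCURRENCY, 300)     # about 4.2 minutes
peak = max_concurrent_messages(10, MAX_CONCURRENCY)   # 12,500 messages

print(f"old ramp: {old_ramp:.1f} min, new ramp: {new_ramp:.1f} min, peak: {peak}")
```

The ratio of the two ramp times is 300 / 60 = 5, which matches the "up to 5x faster" figure.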

SQS Message Event in Lambda Function - Standard

{
  "Records": [
    {
      "messageId": "059f36b4-87a3-44ab-83d2-661975830a7d",
      "receiptHandle": "AQEBwJnKyrHigUMZj6rYigCgxlaS3SLy0a...",
      "body": "Test message.",
      "attributes": {
        "ApproximateReceiveCount": "1",
        "SentTimestamp": "1545082649183",
        "SenderId": "AIDAIENQZJOLO23YVJ4VO",
        "ApproximateFirstReceiveTimestamp": "1545082649185"
      },
      "messageAttributes": {},
      "md5OfBody": "e4e68fb7bd0e697a0ae8f1bb342846b3",
      "eventSource": "aws:sqs",
      "eventSourceARN": "arn:aws:sqs:us-east-2:123456789012:my-queue",
      "awsRegion": "us-east-2"
    },
    ...
  ]
}

SQS Message Event in Lambda Function - FIFO

{
  "Records": [
    {
      "messageId": "11d6ee51-4cc7-4302-9e22-7cd8afdaadf5",
      "receiptHandle": "AQEBBX8nesZEXmkhsmZeyIE8iQAMig7qw...",
      "body": "Test message.",
      "attributes": {
        "ApproximateReceiveCount": "1",
        "SentTimestamp": "1573251510774",
        "SequenceNumber": "18849496460467696128",
        "MessageGroupId": "1",
        "SenderId": "AIDAIO23YVJENQZJOL4VO",
        "MessageDeduplicationId": "1",
        "ApproximateFirstReceiveTimestamp": "1573251510774"
      },
      "messageAttributes": {},
      "md5OfBody": "e4e68fb7bd0e697a0ae8f1bb342846b3",
      "eventSource": "aws:sqs",
      "eventSourceARN": "arn:aws:sqs:us-east-2:123456789012:fifo.fifo",
      "awsRegion": "us-east-2"
    }
  ]
}
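A function receiving either event shape might read the records as follows. This is a minimal sketch in Python; the helper name `summarize_records` and the keys of the returned dictionaries are hypothetical, while the field names it reads are the ones shown in the two sample events.

```python
def summarize_records(event):
    """Extract a few commonly used fields from each SQS record."""
    summaries = []
    for record in event.get("Records", []):
        attributes = record.get("attributes", {})
        summaries.append({
            "message_id": record["messageId"],
            "body": record["body"],
            # Values in "attributes" arrive as strings, so convert explicitly.
            "receive_count": int(attributes.get("ApproximateReceiveCount", "1")),
            # Present only in FIFO events; None for Standard queues.
            "group_id": attributes.get("MessageGroupId"),
            "is_fifo": record["eventSourceARN"].endswith(".fifo"),
        })
    return summaries


sample_event = {
    "Records": [
        {
            "messageId": "11d6ee51-4cc7-4302-9e22-7cd8afdaadf5",
            "body": "Test message.",
            "attributes": {"ApproximateReceiveCount": "1", "MessageGroupId": "1"},
            "eventSourceARN": "arn:aws:sqs:us-east-2:123456789012:fifo.fifo",
        }
    ]
}
summary = summarize_records(sample_event)
```

A receive count greater than 1 indicates a redelivery, which can be useful when logging or deciding whether to skip work already done.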

SQS Resiliency Pattern with Lambda

Enable Batch Error Reporting by running:
aws lambda update-event-source-mapping \
    --uuid "a1b2c3d4-5678-90ab-cdef-123401" \
    --function-response-types "ReportBatchItemFailures"
Update your Lambda function code to collect the messageId of each failed message and return them in a response of this shape:
{
  "batchItemFailures": [
    {
      "itemIdentifier": "message-id-1"
    },
    {
      "itemIdentifier": "message-id-4"
    }
  ]
}
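A handler that produces this response could look like the sketch below. It assumes a Python runtime and a message body containing JSON; the `process` function and the `orderId` field are hypothetical placeholders for real business logic.

```python
import json


def process(body):
    """Hypothetical business logic; raises on a message it cannot handle."""
    payload = json.loads(body)
    if "orderId" not in payload:
        raise ValueError("missing orderId")


def handler(event, context):
    failures = []
    for record in event["Records"]:
        try:
            process(record["body"])
        except Exception:
            # Report only this message so that only it is retried;
            # messages not listed are treated as successfully processed.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}


sample_event = {
    "Records": [
        {"messageId": "message-id-1", "body": "Test message."},      # not JSON
        {"messageId": "message-id-2", "body": '{"orderId": 42}'},    # valid
    ]
}
result = handler(sample_event, None)
```

An empty `batchItemFailures` list signals that the whole batch succeeded. For FIFO queues, as far as I understand the AWS guidance, the function should stop at the first failure and report that message plus all unprocessed ones, so that ordering within a message group is preserved.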

Blog Post on Batch Error Reporting by srcecde

Code

Questions?