Address Scale Items for Lambda Processor and Sink #5031

Open
srikanthjg opened this issue Oct 8, 2024 · 0 comments · May be fixed by #5032
@srikanthjg
Contributor

Is your feature request related to a problem? Please describe.
Data Prepper should be able to offload work to AWS Lambda functions asynchronously, especially when handling large volumes of data. Lambda invocations are currently synchronous, which limits concurrency and performance. An async client would let Data Prepper issue Lambda invocations concurrently, improving throughput and scalability.

Describe the solution you'd like
I propose adding support for the following:

  1. Use the AWS Lambda async client by default in Data Prepper's Lambda-related components. This enables non-blocking Lambda invocations, so high-throughput data streams are handled more efficiently. The LambdaAsyncClient from the AWS SDK would be used for all Lambda invocations, making the system more scalable.

  2. Make the SDK timeout configurable. The SDK defaults the connection timeout to 60 seconds, so if Lambda processing takes longer than 60 seconds, the requests fail and all of the records are dropped. This timeout should be exposed to the user as a tunable parameter.
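
To illustrate why non-blocking invocations help, here is a minimal sketch that uses only the JDK. It is not the Data Prepper implementation: `fakeInvoke` is a hypothetical stand-in for `LambdaAsyncClient.invoke(...)`, which also returns a `CompletableFuture` in the AWS SDK for Java 2.x, and the class and method names are made up for this example. All invocations are started before any result is awaited, so the batch finishes in roughly one call's latency rather than the sum of all of them.

```java
import java.time.Duration;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

// Sketch only: simulates Lambda calls with delayed futures instead of the AWS SDK.
class AsyncInvokeSketch {

    // Hypothetical stand-in for one Lambda call that answers after `latency`.
    static CompletableFuture<String> fakeInvoke(String payload, Duration latency) {
        return CompletableFuture.supplyAsync(
                () -> "processed:" + payload,
                CompletableFuture.delayedExecutor(latency.toMillis(), TimeUnit.MILLISECONDS));
    }

    // Starts every invocation first, then waits; results keep the input order.
    static List<String> invokeAll(List<String> batches, Duration latency) {
        List<CompletableFuture<String>> inFlight = batches.stream()
                .map(batch -> fakeInvoke(batch, latency))
                .collect(Collectors.toList());
        return inFlight.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        List<String> results = invokeAll(List.of("a", "b", "c", "d"), Duration.ofMillis(200));
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println(results + " in about " + elapsedMs + " ms");
    }
}
```

A synchronous loop over the same four batches would wait for each call in turn, so its total time grows with the batch count.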

Key changes needed:

  * Replace the synchronous Lambda client with LambdaAsyncClient in the Lambda Processor and Lambda Sink components.
  * Make the SDK timeout an option that the user can configure.
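
The second change could look roughly like the sketch below, again using only the JDK. The setting name `connection_timeout`, the ISO-8601 value format, and the class and method names are assumptions made for illustration, not the option this issue or its pull request defines. The 60-second default is the figure quoted above. In a real integration the resolved value would presumably be passed to the SDK client builder (for example through `ClientOverrideConfiguration` or the async HTTP client's timeout settings) rather than applied with `orTimeout`.

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch only: a user-tunable timeout with a 60-second fallback.
class TimeoutOptionSketch {

    // Default taken from the 60-second figure quoted in this issue.
    static final Duration DEFAULT_TIMEOUT = Duration.ofSeconds(60);

    // Reads a hypothetical "connection_timeout" setting in ISO-8601 form, e.g. "PT5M".
    static Duration resolveTimeout(Map<String, String> pluginSettings) {
        String raw = pluginSettings.get("connection_timeout");
        return raw == null ? DEFAULT_TIMEOUT : Duration.parse(raw);
    }

    // Bounds a pending invocation by the resolved timeout.
    static <T> CompletableFuture<T> withTimeout(CompletableFuture<T> call, Duration timeout) {
        return call.orTimeout(timeout.toMillis(), TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) {
        Duration timeout = resolveTimeout(Map.of("connection_timeout", "PT0.1S"));
        CompletableFuture<String> neverAnswers = new CompletableFuture<>();
        try {
            withTimeout(neverAnswers, timeout).join();
        } catch (CompletionException e) {
            // join() wraps the TimeoutException raised by orTimeout.
            System.out.println("timed out: " + (e.getCause() instanceof TimeoutException));
        }
    }
}
```

Keeping the fallback in one constant means users who never set the option see the behavior described above, while long-running functions can raise the limit instead of losing records.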

Additional context
This enhancement allows for improved scalability, better error handling, and non-blocking invocations of Lambda functions, which is crucial for high-throughput systems.

@srikanthjg srikanthjg linked a pull request Oct 8, 2024 that will close this issue
4 tasks
@srikanthjg srikanthjg changed the title from "Support Async Client for Lambda Processor and Sink" to "Address Scale Items for Lambda Processor and Sink" Oct 8, 2024
@oeyh oeyh added enhancement New feature or request and removed untriaged labels Oct 8, 2024
@oeyh oeyh added this to the v2.11 milestone Oct 8, 2024