Is your feature request related to a problem? Please describe.
It would be beneficial to have the ability to offload tasks asynchronously to AWS Lambda functions, especially when handling large volumes of data in Data Prepper. Currently, the synchronous Lambda invocation can limit concurrency and performance. Having an async client will allow Data Prepper to handle Lambda invocations concurrently, improving throughput and scalability.
Describe the solution you'd like
I propose using the AWS Lambda async client by default in Data Prepper's Lambda-related components. This will enable non-blocking Lambda invocations for more efficient handling of high-throughput data streams. The LambdaAsyncClient from the AWS SDK will be integrated for all Lambda invocations, making the system more scalable.
The SDK defaults the connection timeout to 60 seconds. This means that if Lambda processing takes longer than 60 seconds, the requests fail and all of the records are dropped. We should expose this timeout to the user as a tunable parameter.
Key changes needed:
Replace the synchronous Lambda client with LambdaAsyncClient in the Lambda Processor and Lambda Sink components.
Make the SDK timeout an option that the user can configure.
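To make the two changes above concrete, here is a minimal sketch of the pattern in plain Java (9 or later). It is not the Data Prepper implementation: `fakeInvoke` is a hypothetical stand-in for `LambdaAsyncClient.invoke(...)`, which likewise returns a `CompletableFuture`, and `invokeAll`, `workMillis`, and `timeout` are names invented for illustration. Mapping a timed-out batch to a `failed:` marker is also an assumption; the real components could retry or route to a DLQ instead.

```java
import java.time.Duration;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

class AsyncInvokeSketch {

    // Hypothetical stand-in for an async Lambda call. It completes after
    // workMillis without blocking the calling thread.
    static CompletableFuture<String> fakeInvoke(String payload, long workMillis) {
        return CompletableFuture.supplyAsync(
                () -> "processed:" + payload,
                CompletableFuture.delayedExecutor(workMillis, TimeUnit.MILLISECONDS));
    }

    // Start every invocation up front, bound each one by a user-supplied
    // timeout, and only then wait for the results.
    static List<String> invokeAll(List<String> batches, long workMillis, Duration timeout) {
        List<CompletableFuture<String>> futures = batches.stream()
                .map(batch -> fakeInvoke(batch, workMillis)
                        // Fail the future if no response arrives within the timeout.
                        .orTimeout(timeout.toMillis(), TimeUnit.MILLISECONDS)
                        // Assumption: surface the failure per batch rather than
                        // dropping records silently.
                        .exceptionally(error -> "failed:" + batch))
                .collect(Collectors.toList());

        // All calls are already in flight, so joining here waits for the
        // slowest call rather than for the sum of all calls.
        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Fast responses with a generous timeout.
        System.out.println(invokeAll(List.of("a", "b"), 10, Duration.ofSeconds(5)));
        // A slow response with a short timeout.
        System.out.println(invokeAll(List.of("c"), 1000, Duration.ofMillis(50)));
    }
}
```

Because the timeout is an ordinary parameter here, a user running long Lambda functions could raise it instead of losing records to a fixed limit.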
Additional context
This enhancement allows for improved scalability, better error handling, and non-blocking invocations of Lambda functions, which is crucial for high-throughput systems.