This task consists of implementing the Django command to perform the initial batch work. It basically consists of two parts:
A command that pulls Opinion texts from the database and creates batches of opinion texts to send to the microservice, and a Celery task that does the embedding work.
This is related to your comment:
I'd guess this is less about how many opinions to do at once and more about how long those opinions are.
I'm thinking the command could simply iterate over all the Opinion pks in the DB and send chunks of pks to the Celery task, which would then do the work:
Request opinion texts for the pks passed.
Create the request body for these opinion texts.
Send the request to the microservice
Wait for the microservice response
Store embeddings into S3
That way, the task only needs to hold opinion pks (see the sketch of this approach below). The disadvantage, however, is that some requests may be just a few KB while others can be many MB.
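To make the tradeoff concrete, here is a minimal sketch of the pk-chunking approach. The Opinion import path, the plain_text field, the EMBEDDING_SERVICE_URL and EMBEDDINGS_BUCKET settings, the chunk size, and the response shape are all assumptions for illustration, not the final design:

```python
import json

import boto3
import requests
from celery import shared_task
from django.conf import settings

from cl.search.models import Opinion  # import path assumed


@shared_task(bind=True, max_retries=3)
def embed_opinion_chunk(self, opinion_pks: list[int]) -> None:
    """Fetch texts for a chunk of Opinion pks, request embeddings from
    the microservice, and store the results in S3."""
    payload = [
        # plain_text is an assumed source field for the opinion text
        {"id": o.pk, "text": o.plain_text}
        for o in Opinion.objects.filter(pk__in=opinion_pks)
    ]
    response = requests.post(
        settings.EMBEDDING_SERVICE_URL,  # hypothetical setting
        json={"documents": payload},
        timeout=300,
    )
    response.raise_for_status()
    s3 = boto3.client("s3")
    for item in response.json()["embeddings"]:  # response shape assumed
        s3.put_object(
            Bucket=settings.EMBEDDINGS_BUCKET,  # hypothetical setting
            Key=f"embeddings/opinion_{item['id']}.json",
            Body=json.dumps(item),
        )


# In the command, the loop over pks stays cheap because only ints are held:
def dispatch_chunks(chunk_size: int = 100) -> None:
    chunk: list[int] = []
    for pk in Opinion.objects.values_list("pk", flat=True).iterator():
        chunk.append(pk)
        if len(chunk) >= chunk_size:
            embed_opinion_chunk.delay(chunk)
            chunk = []
    if chunk:
        embed_opinion_chunk.delay(chunk)
```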
The alternative is to request the opinion texts within the command and set a batch size threshold that we determine performs best for generating embeddings in the microservice, say 1MB or 10MB, whatever we settle on. We'd extract opinion texts and, when we're close to this limit, pass the batch of texts to the task (see the sketch below).
The disadvantage is that tasks would have to hold opinion texts instead of retrieving them within the task.
That's probably OK, though, since we don't expect to have many tasks in the queue: we'd set up roughly as many Celery workers as we have microservice instances, and with queue throttling the queue should stay small all the time.
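A rough sketch of this size-threshold batching, again with assumed names (plain_text, the 1MB limit, and an embed_text_batch task are placeholders until we settle on actual values):

```python
BATCH_SIZE_LIMIT = 1_000_000  # ~1MB; the best threshold is still to be determined


def batch_opinion_texts(queryset, size_limit: int = BATCH_SIZE_LIMIT):
    """Yield lists of (pk, text) pairs whose combined UTF-8 size stays
    under size_limit bytes; an oversized single text is yielded alone."""
    batch, batch_bytes = [], 0
    for pk, text in queryset.values_list("pk", "plain_text").iterator():
        text_bytes = len((text or "").encode("utf-8"))
        if batch and batch_bytes + text_bytes > size_limit:
            yield batch
            batch, batch_bytes = [], 0
        batch.append((pk, text))
        batch_bytes += text_bytes
    if batch:
        yield batch


# The command would then hand each batch of texts to the task directly:
# for batch in batch_opinion_texts(Opinion.objects.all()):
#     embed_text_batch.delay(batch)  # hypothetical task that receives texts
```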
The output of this issue would be:
A PR that includes the Django command to request opinion texts in batches.
Some questions:
@legaltextai, in terms of performance and efficient use of resources on our GPU/CPU machines used for embedding work, would it affect performance to have requests with varying amounts of text to embed? For example, sometimes we send 1MB for embedding, and other times 100KB. Is it more efficient to always request a fixed amount of text for embedding?
If so, is it possible to determine the ideal size of text to request in a single embedding request?