Overall logic:
Each time a PDF file is created or modified in the epd-raw-data-prod-eu-west-3 bucket, S3 sends an event to an SNS topic.
An SQS queue subscribed to this topic triggers a Lambda function. The Lambda calls an embedding model to create vector representations of the PDF:
One BERT-like embedding of the product name
One ColBERT embedding of the product name
One BERT-like embedding of the product description
One ColBERT embedding of the product description
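The event flow above (S3 to SNS to SQS to Lambda) means the handler has to unwrap two layers of JSON before it sees the object key. Below is a minimal sketch of that unwrapping, assuming the standard S3 notification format and an SNS subscription that may or may not use raw message delivery. The function names `extract_pdf_objects` and `handler` are hypothetical, and the embedding call is left as a placeholder.

```python
import json
from urllib.parse import unquote_plus


def extract_pdf_objects(sqs_event: dict) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs for every PDF referenced in an SQS batch."""
    pdfs = []
    for sqs_record in sqs_event.get("Records", []):
        body = json.loads(sqs_record["body"])
        # Without raw message delivery, SNS wraps the S3 event in an envelope
        # whose "Message" field is itself a JSON string.
        s3_event = json.loads(body["Message"]) if "Message" in body else body
        # S3 test events carry no "Records" key, so default to an empty list.
        for s3_record in s3_event.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            # Object keys are URL-encoded in S3 notifications.
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            if key.lower().endswith(".pdf"):
                pdfs.append((bucket, key))
    return pdfs


def handler(event, context):
    """Lambda entry point (hypothetical name)."""
    for bucket, key in extract_pdf_objects(event):
        # Placeholder: download the PDF and call the embedding model here.
        print(f"would embed s3://{bucket}/{key}")
```

Filtering on the `.pdf` suffix inside the handler is one option; the same filter could instead be set on the bucket notification itself.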
For v0, we will focus only on BERT-like embeddings of the product name and use a pre-trained model from either the Mistral or Voyage AI API.
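As a rough illustration of the v0 step, the sketch below requests one dense embedding per product name. It assumes the Mistral option, with the `https://api.mistral.ai/v1/embeddings` endpoint, the `mistral-embed` model, and an OpenAI-style response (`data[i].embedding`); these should be checked against the provider documentation, and the Voyage AI option would need its own URL and payload. All function names are hypothetical.

```python
import json
import urllib.request

# Assumptions: endpoint and model name for the Mistral option.
MISTRAL_EMBEDDINGS_URL = "https://api.mistral.ai/v1/embeddings"
MODEL = "mistral-embed"


def build_request(product_names: list[str], api_key: str) -> urllib.request.Request:
    """Build the POST request carrying a batch of product names."""
    payload = {"model": MODEL, "input": product_names}
    return urllib.request.Request(
        MISTRAL_EMBEDDINGS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_embeddings(response_body: dict) -> list[list[float]]:
    """Extract vectors, sorted by "index" so they line up with the inputs."""
    items = sorted(response_body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def embed_product_names(product_names, api_key, send=urllib.request.urlopen):
    """Send the request; `send` is injectable so tests can avoid the network."""
    with send(build_request(product_names, api_key)) as resp:
        return parse_embeddings(json.load(resp))
```

Keeping request building and response parsing separate from the network call makes it easier to swap providers later without touching the Lambda handler.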
A UUID can be used as the primary key for the DB.
For v0, we can probably load everything into the same table.
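One way to picture the single-table idea is sketched below, using SQLite purely as a stand-in because the target database is not specified here. The table name, column names, and the choice to generate the UUID at insert time are all assumptions. The `field` and `model_type` columns let all four embedding kinds share one table.

```python
import json
import sqlite3
import uuid

# Hypothetical schema: one row per (PDF, field, model type) embedding.
SCHEMA = """
CREATE TABLE IF NOT EXISTS embeddings (
    id         TEXT PRIMARY KEY,  -- UUID stored as text
    s3_key     TEXT NOT NULL,     -- source PDF in the raw-data bucket
    field      TEXT NOT NULL,     -- 'name' or 'description'
    model_type TEXT NOT NULL,     -- 'bert' or 'colbert'
    vector     TEXT NOT NULL      -- JSON-encoded embedding
)
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def insert_embedding(conn, s3_key, field, model_type, vector) -> str:
    """Insert one embedding and return the UUID used as its primary key."""
    row_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO embeddings (id, s3_key, field, model_type, vector) "
        "VALUES (?, ?, ?, ?, ?)",
        (row_id, s3_key, field, model_type, json.dumps(vector)),
    )
    conn.commit()
    return row_id
```

Storing the vector as JSON keeps the sketch portable: it holds a single vector for BERT-like embeddings and a list of per-token vectors for ColBERT. A production setup would more likely use a native vector column type.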
High-level design:
Resources to create with cdk: