V3 Roadmap #29

Open
lloydzhou opened this issue Apr 25, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@lloydzhou
Contributor

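  1. Use the inference API to support vector search in GPTs
    a. Handle vectorization automatically through a pipeline (supporting both the huggingface API and the openai API)
    b. Use query_vector_builder so that the knn query stage also goes through inference for vectorization
  2. Use huggingface/text-embeddings-inference as the embedding API and remove the previously built-in PyTorch vectorization step; this should improve the performance of vectorizing the knowledge base
  3. Use ES support for inner hits to improve the storage structure
  4. Document splitting: https://github.com/Filimoa/open-parse
    a. Use the open-source open-parse project. Its PDF support is quite good.
    b. The project's README also mentions Google Document AI, the related AWS APIs, and one company's product (all of these are paid, about $10 / 1000 pages)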
    • Typically priced at ≈ $10 / 1k pages. See here, here and here.
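
As a rough illustration of item 1, the sketch below builds the two Elasticsearch request bodies involved: an ingest pipeline with an `inference` processor (1a) and a `knn` search that uses `query_vector_builder` (1b). It only constructs JSON and sends nothing. The endpoint ID and field names are hypothetical and not taken from this project, and the sketch assumes an inference endpoint with that ID has already been created in the cluster.

```python
import json

# Hypothetical names for illustration only; the project may use different ones.
ENDPOINT_ID = "my-embedding-endpoint"   # assumed to already exist in the cluster
TEXT_FIELD = "content"
VECTOR_FIELD = "content_embedding"


def embedding_pipeline_body(endpoint_id=ENDPOINT_ID):
    """Body for `PUT _ingest/pipeline/<name>`: embed TEXT_FIELD at index time."""
    return {
        "description": "Embed documents at ingest time",
        "processors": [
            {
                "inference": {
                    "model_id": endpoint_id,
                    "input_output": [
                        {"input_field": TEXT_FIELD, "output_field": VECTOR_FIELD}
                    ],
                }
            }
        ],
    }


def knn_search_body(query_text, endpoint_id=ENDPOINT_ID, k=5, num_candidates=50):
    """Body for `POST <index>/_search`: the query text is embedded server-side."""
    return {
        "knn": {
            "field": VECTOR_FIELD,
            "k": k,
            "num_candidates": num_candidates,
            "query_vector_builder": {
                "text_embedding": {
                    "model_id": endpoint_id,
                    "model_text": query_text,
                }
            },
        },
        "_source": [TEXT_FIELD],
    }


if __name__ == "__main__":
    print(json.dumps(embedding_pipeline_body(), indent=2))
    print(json.dumps(knn_search_body("how do I reset my password?"), indent=2))
```

With both pieces in place the client never handles vectors itself: documents are embedded by the pipeline at index time and queries are embedded by the same endpoint at search time.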
@lloydzhou lloydzhou pinned this issue Apr 25, 2024
@lloydzhou lloydzhou added the enhancement New feature or request label Apr 25, 2024
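
Items 2 and 3 of the plan could look roughly like the sketch below. It prepares (but does not send) a request to the `/embed` route of a text-embeddings-inference server, and builds a mapping plus query in which each source document stores its chunks as a `nested` field so that `inner_hits` can return the matching chunks. The base URL, field names, and vector dimension are assumptions for illustration; the issue does not say what the improved storage structure will actually be.

```python
import json
import urllib.request

# Hypothetical values for illustration only.
TEI_BASE_URL = "http://localhost:8080"  # assumed address of a TEI server
EMBEDDING_DIMS = 384                    # assumed; depends on the served model


def tei_embed_request(texts, base_url=TEI_BASE_URL):
    """Prepare a POST to TEI's /embed route; the caller decides when to send it."""
    payload = json.dumps({"inputs": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def nested_chunk_mapping(dims=EMBEDDING_DIMS):
    """Index mapping: one ES document per source file, chunks stored as nested."""
    return {
        "mappings": {
            "properties": {
                "title": {"type": "text"},
                "chunks": {
                    "type": "nested",
                    "properties": {
                        "text": {"type": "text"},
                        "embedding": {"type": "dense_vector", "dims": dims},
                    },
                },
            }
        }
    }


def nested_search_body(query_text, size=3):
    """Nested query whose inner_hits lists the chunks that matched."""
    return {
        "query": {
            "nested": {
                "path": "chunks",
                "query": {"match": {"chunks.text": query_text}},
                "inner_hits": {"size": size, "_source": ["chunks.text"]},
            }
        }
    }


if __name__ == "__main__":
    req = tei_embed_request(["first chunk", "second chunk"])
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
    print(json.dumps(nested_chunk_mapping(), indent=2))
    print(json.dumps(nested_search_body("vector search"), indent=2))
```

Sending the prepared request with `urllib.request.urlopen(req)` should return one embedding per input text as a JSON list of float lists, assuming a TEI server is reachable at the configured address.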