
Search with Context Similarity #2

Draft: wants to merge 63 commits into base: development
Conversation

@sshivaditya2019 (Collaborator) commented Oct 5, 2024

Resolves #50

  • Database backfilling with the issue and comment data.
  • Builds on the existing open PR "@ubiquityos gpt command" #1.
  • New adapters for voyageai and supabase.
  • Updated prompt for the OpenAI completions.
  • Added rerankers for reranking the similar-search results.
  • Similarity search functions for the DB.
  • QA (testing).
  • QA (multiple models).
  • Improve the data quality.
  • Optimize the reranking and retrieval process.
  • Optimize the existing issue retrieval and formatting.
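The reranking and similarity-search steps above boil down to scoring candidate embeddings against a query embedding. A minimal, self-contained sketch of that scoring (the PR itself uses Voyage AI rerankers and Supabase, which this does not reproduce):

```typescript
// Sketch: cosine-similarity scoring of the kind used for similarity search
// and reranking. Pure illustration, not the PR's actual implementation.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank candidate embeddings by similarity to a query embedding,
// returning candidate indices, best match first.
function rank(query: number[], candidates: number[][]): number[] {
  return candidates
    .map((c, i) => ({ i, score: cosine(query, c) }))
    .sort((x, y) => y.score - x.score)
    .map((x) => x.i);
}

console.log(rank([1, 0], [[0, 1], [1, 0.1]])); // [1, 0]
```

In practice the database (e.g. a vector index) does this scoring server-side; a reranker then reorders only the top-k hits with a stronger model.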

Results of the database backfilling:

  • A total of 146 issues were identified.
  • A total of 1,238 comments were collected, including comments from pull requests (PRs), PR reviews, and comments on the identified issues.
  • Embeddings were generated with Voyage AI for enhanced data analysis.
  • The data was then converted into CSV format and loaded into Supabase for further use.
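The "convert to CSV and load into Supabase" step might look roughly like this sketch (the column set and escaping rules here are illustrative assumptions, not taken from the PR):

```typescript
// Sketch: turn embedded comments into CSV rows for a Supabase bulk load.
// The column set (id, body, embedding) is a hypothetical example.
type EmbeddedComment = { id: number; body: string; embedding: number[] };

function toCsv(rows: EmbeddedComment[]): string {
  // RFC 4180-style escaping: wrap in quotes, double any embedded quotes.
  const escape = (s: string) => `"${s.replace(/"/g, '""')}"`;
  const header = "id,body,embedding";
  const lines = rows.map((r) =>
    [r.id, escape(r.body), escape(JSON.stringify(r.embedding))].join(",")
  );
  return [header, ...lines].join("\n");
}

const csv = toCsv([{ id: 1, body: 'says "hi"', embedding: [0.1, 0.2] }]);
console.log(csv.split("\n").length); // 2
```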

@sshivaditya2019 (Collaborator, Author):

QA: Issue

@sshivaditya2019 (Collaborator, Author):

I tested a few models; Claude 3.5 Sonnet and OpenAI GPT-4o performed the best. The other models hallucinated even with a very low temperature and a top_p value of 0.5 wherever possible.

@sshivaditya2019 (Collaborator, Author):

@0x4007 Could you please check the model responses? And are there any questions that could judge the retrieval performance on topics that are discussed very rarely or only once?

@0x4007 (Member) commented Oct 6, 2024

> @0x4007 Could you please check the model responses? And are there any questions that could judge the retrieval performance on topics that are discussed very rarely or only once?

Gold star? Non established. Will need to work on this asap.

DM me we can collaborate on this.

"description": "Ubiquibot plugin template repository with TypeScript support.",
"author": "Ubiquity DAO",
"description": "A highly context aware organization integrated chatbot",
"author": "Ubiquity OS",
Member:

Suggested change:
- "author": "Ubiquity OS",
+ "author": "Ubiquity DAO",
  • DAO is the organization.
  • OS is the software.
  • DevPool is the community.

repo: repo || payload.repository.name,
issue_number: issueNum || payload.issue.number,
})
.then(({ data }) => data as Issue);
Member:

Pretty unusual syntax to mix async await and then
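For illustration, a minimal self-contained contrast between the flagged mixed style and plain await (the `fetchRaw` helper and `Issue` shape here are hypothetical stand-ins, not the PR's actual `fetchIssue`):

```typescript
// Hypothetical stand-ins for the PR's types and Octokit call.
type Issue = { number: number; title: string };

async function fetchRaw(): Promise<{ data: Issue }> {
  return { data: { number: 1, title: "example" } };
}

// Mixed style, as flagged in the review: awaiting a .then chain.
async function fetchIssueMixed(): Promise<Issue> {
  return await fetchRaw().then(({ data }) => data as Issue);
}

// Plain await: equivalent behavior, easier to read and debug.
async function fetchIssuePlain(): Promise<Issue> {
  const { data } = await fetchRaw();
  return data;
}

fetchIssuePlain().then((i) => console.log(i.title)); // prints "example"
```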


const issue = await fetchIssue(params);

let comments: IssueComments | ReviewComments = [];
Member:

Does it make sense to have two separate arrays for each data type?
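One way to avoid maintaining two parallel arrays is a single array of a tagged union. A sketch (the `kind` and `body` fields are illustrative assumptions, not the actual `IssueComments`/`ReviewComments` shapes):

```typescript
// Sketch: one tagged-union array instead of two separate ones.
// The element shape is hypothetical, for illustration only.
type UnifiedComment =
  | { kind: "issue"; body: string }
  | { kind: "review"; body: string };

const comments: UnifiedComment[] = [
  { kind: "issue", body: "original report" },
  { kind: "review", body: "left on the diff" },
];

// Filtering on the tag recovers either original list when needed:
const reviewOnly = comments.filter((c) => c.kind === "review");
console.log(reviewOnly.length); // 1
```

The union keeps one code path for formatting and embedding while still letting callers distinguish the two sources.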

export async function runPlugin(context: Context) {
const {
logger,
env: { UBIQUITY_OS_APP_SLUG },
Member:

Should be renamed to

Suggested change:
- env: { UBIQUITY_OS_APP_SLUG },
+ env: { UBIQUITY_OS_APP_NAME },

@sshivaditya2019 (Collaborator, Author):

Model Cost Comparison

This table shows the cost per response for various models, based on an average total of 3,000 tokens (input and output combined).

| Model | Cost per Response | Tokens per Second (TPS) |
| --- | --- | --- |
| GPT-4o | $0.018775 | 68 |
| O1 Mini | $0.0232 | 170 |
| Phi 3.5 Mini | $0.000256 | 42.3 |
| Gemini Flash 1.5 | $0.0000896 | 177 |
| Claude Sonnet 3.5 | $0.0138 | 66 |
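For reference, per-response figures like those above follow from per-million-token rates. A sketch of the arithmetic (the rates in the example are placeholders, not the actual prices behind the table):

```typescript
// Sketch: cost per response from per-million-token rates.
// The example rates ($2.50 input, $10.00 output per million tokens)
// are placeholders, not quoted prices for any model above.
function costPerResponse(
  inputTokens: number,
  outputTokens: number,
  inputPerMillion: number,
  outputPerMillion: number
): number {
  return (
    (inputTokens / 1e6) * inputPerMillion +
    (outputTokens / 1e6) * outputPerMillion
  );
}

// e.g. 2,500 input + 500 output tokens:
console.log(costPerResponse(2500, 500, 2.5, 10.0));
```

This evaluates to $0.01125 for the example rates; output tokens typically dominate the cost because they are priced several times higher than input tokens.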

github-actions bot commented Oct 7, 2024

Unused files (7)

src/handlers/comments.ts, src/helpers/format-chat-history.ts, src/helpers/issue-fetching.ts, src/helpers/issue-handling.ts, src/helpers/issue.ts, src/types/github.ts, src/types/gpt.ts

Unlisted dependencies (5)

| Filename | Unlisted dependency |
| --- | --- |
| src/plugin.ts | @supabase/supabase-js |
| src/adapters/index.ts | @supabase/supabase-js |
| src/adapters/supabase/helpers/comment.ts | @supabase/supabase-js |
| src/adapters/supabase/helpers/supabase.ts | @supabase/supabase-js |
| src/adapters/supabase/helpers/issues.ts | @supabase/supabase-js |

@0x4007 (Member) commented Oct 7, 2024

Seems like we get what we pay for :)

@sshivaditya2019 (Collaborator, Author):

QA:

Can parse linked code files in the issue spec and answer questions based on them.

Code Parse #1
Code Parse #2

@0x4007 (Member) commented Oct 9, 2024

Your QA results are quite interesting. We should tune the prompt to focus on brevity.

Can you display (add an extra comment) which shows the entire passed in context? I would like to audit this.

Once this is set up I would like to try asking a couple questions.

@sshivaditya2019 (Collaborator, Author):

> Your QA results are quite interesting. We should prompt and focus on brevity.
>
> Can you display (add an extra comment) which shows the entire passed in context? I would like to audit this.
>
> Once this is set up I would like to try asking a couple questions.

The plugin is running in the test-public repo; you can try it over there. As for the context, I think it takes in around 2,400 tokens on average.

@sshivaditya2019 (Collaborator, Author):

On average, these responses cost approximately $0.22, based on an input token count of 2,500 and an output token count of 3,300 on the o1-mini model. While these responses are quite expensive, they provide a good overview of the task.

@0x4007 (Member) commented Oct 9, 2024

That's mostly fine. Any price these models charge us is orders of magnitude cheaper than developer time, particularly for those on base pay.

@sshivaditya2019 (Collaborator, Author) commented Oct 9, 2024

I don't have access to the o1-preview model, but I think its responses should be better than o1-mini's. The next step would be parsing pull requests and their review comments. I think this PR should be broken into multiple iterations.

@0x4007 (Member) commented Oct 9, 2024

Actually, we should use mini because it has a much larger usable context length; preview spends a lot more tokens on internal reasoning. I can borrow a key, but as I understand it, both o1 models are available on the same tier of OpenAI account, meaning if you have access to one, you should have access to both.
