
Sweep: once the ChatModel generates the output, ask the ChatModel to create a patch file of changes #2

Open
2 tasks done
vivasvan1 opened this issue Oct 7, 2023 · 1 comment · May be fixed by #4
Labels
sweep Sweep your software chores

Comments

vivasvan1 (Owner) commented Oct 7, 2023

Checklist
  • utils/github_utils.py:run_query ✅ Commit 461e279
  • main.py:solve_problem ✅ Commit a25a38c
sweep-ai bot added the `sweep` label (Sweep your software chores) Oct 7, 2023
sweep-ai bot commented Oct 7, 2023

Here's the PR! #4.

⚡ Sweep Free Trial: I'm creating this ticket using GPT-4. You have 1 GPT-4 tickets left for the month and -1 for the day. For more GPT-4 tickets, visit our payment portal.


Step 1: 🔎 Searching

I found the following snippets in your repository. I will now analyze these snippets and come up with a plan.

Some code snippets I looked at are shown below. If some file is missing from here, you can mention the path in the ticket description.

fixThisChris/main.py

Lines 75 to 319 in 412ff66

```python
            print(f"Accepted repository invitation with ID {invitation_id}")
        else:
            print(f"Failed to accept repository invitation with ID {invitation_id}")
        time.sleep(1)


def fetch_unread_mentions():
    notifications_url = "https://api.github.com/notifications"
    headers = {
        "Authorization": f"Bearer {GITHUB_ACCESS_TOKEN}",
        "Accept": "application/vnd.github.v3+json",
    }
    notifications = send_github_request(notifications_url, "GET", headers)
    if not notifications:
        print("Failed to fetch notifications")
        return []
    notifications = notifications.json()
    unread_mentions = [
        notification
        for notification in notifications
        if (("mention" in notification["reason"]) and notification["unread"])
    ]
    return unread_mentions


def mark_issue_notification_as_read(id, issue_number):
    mark_as_read_url = f"https://api.github.com/notifications/threads/{id}"
    headers = {
        "Authorization": f"Bearer {GITHUB_ACCESS_TOKEN}",
        "Accept": "application/vnd.github.v3+json",
    }
    response = send_github_request(mark_as_read_url, "PATCH", headers)
    if response:
        print(f"Marked issue {issue_number} as read")
    else:
        print(f"Failed to mark issue {issue_number} as read")


def fetch_issue(repo_owner, repo_name, issue_number):
    fetch_issue_url = (
        f"https://api.github.com/repos/{repo_owner}/{repo_name}/issues/{issue_number}"
    )
    headers = {
        "Authorization": f"Bearer {GITHUB_ACCESS_TOKEN}",
        "Accept": "application/vnd.github.v3+json",
    }
    issue_data = send_github_request(fetch_issue_url, "GET", headers)
    if issue_data:
        return issue_data.json()
    else:
        print(f"Failed to fetch issue #{issue_number}")
        return None


def fetch_issue_comments(comments_url):
    headers = {
        "Authorization": f"Bearer {GITHUB_ACCESS_TOKEN}",
        "Accept": "application/vnd.github.v3+json",
    }
    comments = send_github_request(comments_url, "GET", headers)
    if not comments:
        print("Failed to fetch issue comments")
        return None
    comments = comments.json()
    if comments:
        print(f"Fetched {len(comments)} comments")
    return comments


def generate_gpt_prompt(issue_data):
    issue_title = issue_data.get("title")
    issue_body = issue_data.get("body")
    issue_comments_url = issue_data.get("comments_url")
    issue_comments_prompt = ""
    issue_comments = fetch_issue_comments(issue_comments_url)
    if issue_comments:
        for comment in issue_comments:
            if not (
                comment.get("user").get("login") == "fixThisChris"
                or comment.get("user").get("login") == "github-actions[bot]"
                or comment.get("user").get("login") == "dependabot[bot]"
            ):
                issue_comments_prompt += f"\nuser: {comment.get('user').get('login')} comment:{comment.get('body')}"
    gpt_prompt = f"Issue title: {issue_title}\nIssue Description: {issue_body}\n\n"
    if issue_comments_prompt != "":
        gpt_prompt += f"Issue Comments: {issue_comments_prompt}\n"
    return gpt_prompt


class FailedToFetchIssueException(Exception):
    pass


def solve_problem(mention, issue_number, issue_description):
    issue_data = fetch_issue(
        mention["repository"]["owner"]["login"],
        mention["repository"]["name"],
        issue_number,
    )
    if not issue_data:
        raise FailedToFetchIssueException()
    prompt = generate_gpt_prompt(issue_data)
    out = run_query(
        prompt,
        mention["repository"]["owner"]["login"],
        mention["repository"]["name"],
    )
    # response = generate_response(issue_description)
    # print(response)
    post_comment_to_issue(
        issue_number,
        out,
        mention["repository"]["owner"]["login"],
        mention["repository"]["name"],
    )
    # Mark notification as read
    mark_issue_notification_as_read(mention["id"], issue_number)


def time_remaining_to_reset():
    # Get the current time
    now = datetime.now()
    # Construct a datetime object for the next midnight
    next_midnight = datetime(now.year, now.month, now.day) + timedelta(days=1)
    # Calculate the difference between the current time and the next midnight
    remaining_time = next_midnight - now
    # Return the remaining time as a timedelta object
    return remaining_time


def respond_to_unread_issues():
    unread_mentions = fetch_unread_mentions()
    # print(unread_mentions)
    for mention in unread_mentions:
        issue_number = mention["subject"]["url"].split("/")[-1]
        issue_description = mention["subject"]["title"]
        print(f"Issue {issue_number} is unread")
        rate_limit_exceeded = is_rate_limit_reached(mention["repository"]["name"])
        if not rate_limit_exceeded:
            try:
                solve_problem(mention, issue_number, issue_description)
                increment_usage_limit(mention["repository"]["name"])
            except FailedToFetchIssueException:
                continue
        else:
            remaining_time = time_remaining_to_reset()
            hours, remainder = divmod(remaining_time.total_seconds(), 3600)
            minutes, seconds = divmod(remainder, 60)
            print(f"Rate limit exceeded for {mention['repository']['name']}.")
            post_comment_to_issue(
                issue_number=issue_number,
                comment_text=f"""#### 🛑 **Rate Limit Exceeded!**
⌛ **Limit:** {USAGE_LIMIT} requests / day / repository
🔒 **Refreshes In:** {int(hours)} hours, {int(minutes)} minutes
<!-- To continue using the service, please consider upgrading to our **Pro Plan**.
##### 🚀 **Upgrade to Pro**
Upgrade to the Pro Plan to enjoy enhanced access, faster response times, and priority support. Click [here](Upgrade_Link) to upgrade now! -->
📬 For any inquiries for support or rate limit extension, please contact <a href="https://discord.gg/T6Hz6zpK7D" target="_blank">Support</a>.""",
                OWNER=mention["repository"]["owner"]["login"],
                REPO=mention["repository"]["name"],
            )
        # Mark notification as read
        mark_issue_notification_as_read(mention["id"], issue_number)


scheduler = BackgroundScheduler()
scheduler.add_job(func=accept_github_invitations, trigger="interval", seconds=30)
scheduler.add_job(func=respond_to_unread_issues, trigger="interval", seconds=30)
# Schedule the task to reset limits
scheduler.add_job(func=reset_usage_limits, trigger="cron", hour=0, minute=0)
scheduler.start()
# Shut down the scheduler when exiting the app
atexit.register(lambda: scheduler.shutdown())


# @app.route("/webhook", methods=["POST"])
# def webhook():
#     data = request.json
#     # Check if the comment mentions the AI keyword
#     if "@solveitjim" in data["comment"]["body"]:
#         issue_number = data["issue"]["number"]
#         problem_description = data["comment"]["body"].split("@solveitjim")[1].strip()
#         # Use ChatGPT to generate a response
#         response = generate_response(problem_description)
#         # Post the response as a comment on the issue
#     return jsonify({"message": "Webhook processed successfully"})


def generate_response(prompt):
    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "user", "content": prompt},
        ],
    )
    return completion.choices[0].message.content  # type: ignore


def post_comment_to_issue(issue_number, comment_text, OWNER, REPO):
    headers = {
        "Authorization": f"token {GITHUB_ACCESS_TOKEN}",
        "Accept": "application/vnd.github.v3+json",
    }
    payload = {
        "body": comment_text,
    }
    url = f"https://api.github.com/repos/{OWNER}/{REPO}/issues/{issue_number}/comments"
    response = requests.post(url, json=payload, headers=headers)
    return response.json()


if __name__ == "__main__":
```
fixThisChris/utils/github_utils.py

```python
            f"Error fetching tree for {owner}/{repo}. Status code: {response.status_code}"
        )
        return None
    tree = response.json()
    file_paths = [item["path"] for item in tree["tree"] if item["type"] == "blob"]
    return file_paths


def get_default_branch(owner, repo):
    url = f"https://api.github.com/repos/{owner}/{repo}"
    headers = {
        "Authorization": f"Bearer {GITHUB_ACCESS_TOKEN}",
        "Accept": "application/vnd.github.v3+json",
    }
    response = requests.get(url, headers=headers)
    if response.status_code == 200:
        repo_info = response.json()
        return repo_info.get("default_branch", "")
    else:
        print(f"Failed to fetch repository info | Status Code: {response.status_code}")
        return None


def create_embedding_of_repo(repo_url: str, default_branch: str):
    owner = repo_url.split("/")[3]
    repo_name = repo_url.split("/")[4].split(".git")[0]
    repo_url_with_token = repo_url.replace('https://', f'https://{GITHUB_ACCESS_TOKEN}@')
    loader = GitLoader(
        clone_url=repo_url_with_token,
        repo_path="repo",
        branch=default_branch,
    )
    loader.load()
    # configure these to fit your needs
    exclude_dir = [".git", "node_modules", "public", "assets"]
    exclude_files = ["package-lock.json", ".DS_Store"]
    exclude_extensions = [
        ".jpg",
        ".jpeg",
        ".png",
        ".gif",
        ".bmp",
        ".tiff",
        ".ico",
        ".svg",
        ".webp",
        ".mp3",
        ".wav",
    ]
    documents = []
    for dirpath, dirnames, filenames in os.walk("repo"):
        # skip directories in exclude_dir
        dirnames[:] = [d for d in dirnames if d not in exclude_dir]
        for file in filenames:
            _, file_extension = os.path.splitext(file)
            # skip files in exclude_files
            if file not in exclude_files and file_extension not in exclude_extensions:
                file_path = os.path.join(dirpath, file)
                loader = TextLoader(file_path, encoding="ISO-8859-1")
                documents.extend(loader.load())
    text_splitter = CharacterTextSplitter(chunk_size=2000, chunk_overlap=200)
    docs = text_splitter.split_documents(documents)
    for doc in docs:
        doc.metadata["repo_url"] = repo_url
        doc.metadata["owner"] = owner
        doc.metadata["repo_name"] = repo_name
        doc.metadata["inserted_at"] = datetime.now().isoformat()
        source = doc.metadata["source"]
        cleaned_source = "/".join(source.split("/")[1:])
        doc.page_content = (
            "FILE NAME: "
            + cleaned_source
            + "\n###\n"
            + doc.page_content.replace("\u0000", "")
        )
    embeddings = OpenAIEmbeddings()
    vector_store = SupabaseVectorStore.from_documents(
        docs,
        embeddings,
        client=supabase,
        table_name="documents",
    )
    shutil.rmtree("repo")


def setup_repo(owner, repo_name):
    repo_url = f"https://github.com/{owner}/{repo_name}.git"
    default_branch = get_default_branch(owner, repo_name)
    if default_branch is None:
        # TODO: report proper error message
        default_branch = "main"
    # Query the table and filter based on the repo_url
    query = (
        supabase.table("documents")  # Replace with your table name
        .select("metadata")
        .contains("metadata", {"repo_url": repo_url})
        .limit(5)
    )
    query_res = query.execute()
    if len(query_res.data) == 0:
        create_embedding_of_repo(repo_url, default_branch)
    else:
        print("Repo already exists")


def run_query(query: str, owner: str, repo_name: str):
    setup_repo(owner, repo_name)
    matched_docs = vector_store.similarity_search(
        query, filter={"repo_name": repo_name}
    )
    code_str = ""
    MAX_TOKENS = 3500
    current_tokens = 0
    for doc in matched_docs:
        doc_content = doc.page_content + "\n\n"
        doc_tokens = num_tokens_from_string(doc_content)
        print(matched_docs.index(doc), doc_tokens)
        if current_tokens + doc_tokens < MAX_TOKENS:
            code_str += doc_content
            current_tokens += doc_tokens
        else:
            break  # stop adding more content if it exceeds the max token limit
    # print("\n\033[35m" + code_str + "\n\033[32m")
    template = """
You are Codebase AI. You are a super intelligent AI that answers questions about code bases.
You are:
- helpful & friendly
- good at answering complex questions in simple language
- an expert in all programming languages
- able to infer the intent of the user's question
The user will ask a question about their codebase, and you will answer it.
When the user asks their question, you will answer it by searching the codebase for the answer.
Here is the user's question and code file(s) you found to answer the question:
Question:
{query}
Code file(s):
{code}
[END OF CODE FILE(S)]w
Now answer the question using the code file(s) above.
"""
    chat = ChatOpenAI(
        streaming=True,
        callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
        verbose=True,
        temperature=0.5,
    )
    system_message_prompt = SystemMessagePromptTemplate.from_template(template)
    chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt])
    chain = LLMChain(llm=chat, prompt=chat_prompt)
    print("running chain...")
    ai_said = chain.run(code=code_str, query=query)
    print("chain output...")
    return ai_said


# print(
#     run_query(
#         "How can i filter metadata by using filter argument in vector_store.similarity_search",
#         "mckaywrigley",
#         "repo-chat",
#     )
# )

# setup_repo("mckaywrigley", "repo-chat")

# def get_files(owner, repo, file_paths):
#     url = f"https://api.github.com/repos/{owner}/{repo}/contents/{file_paths}"
#     headers = {
#         "Authorization": f"Bearer {GITHUB_ACCESS_TOKEN}",
#         "Accept": "application/vnd.github.v3+json",
#     }

# # Example usage
# owner = "vivasvan1"
# repo = "front_greenberg_hammed"
# file_paths = fetch_all_files_in_repo(owner, repo)
# if file_paths:
#     schema = {
#         "type": "object",
#         "properties": {"files": {"type": "array", "items": {"type": "string"}}},
#     }
#     prompt = """I will provide you a list of file names and a description of a bug i want you to guess top 5 files will be need to debug it
#     List of files:"""
#     for path in file_paths:
#         prompt = prompt + path + "\n"
#     prompt = (
#         prompt
#         + """Bug:
#     add link to blog for each stream #130
#     outputs as JSON of format {"files":["file1","file2"]}
#     """
#     )
#     response = openai.ChatCompletion.create(
#         model="gpt-3.5-turbo",
#         messages=[
#             {"role": "user", "content": prompt},
#         ],
#     )
#     try:
#         print(response.choices[0].message)  # type: ignore
#         print(json.loads(response.choices[0].message.content))  # type: ignore
#     except Exception as error:
```

fixThisChris/utils/github_utils.py

```python
def increment_usage_limit(repo: str) -> int:
    if repo is None or repo.strip() == "":
        raise ValueError("Repository name must be provided.")
    usage_limit = get_usage_limit(repo)
    print("Usage limit:", usage_limit)
    if not usage_limit:
        raise Exception("Repository not found in usage-limits table")
    usage_limit = usage_limit[0]
    updated_count = usage_limit["number_of_times_used_today"] + 1
    update_result = (
        supabase.table("usage-limits")
        .update({"number_of_times_used_today": updated_count})
        .eq("repo", repo)
        .execute()
    )
    print("Updated Limit:", update_result)
    if not update_result.data:
        raise Exception("Failed to update usage-limits table")
    return updated_count


def reset_usage_limits():
    # Reset the number_of_times_used_today for all repos at the end of the day
    result = (
        supabase.table("usage-limits")
        .update({"number_of_times_used_today": 0})
        .filter("uuid", "neq", "00000000-0000-0000-0000-000000000000")
        .execute()
    )


def fetch_all_files_in_repo(owner, repo):
    default_branch = get_default_branch(owner, repo)
    if default_branch is None:
        return None
    base_url = f"https://api.github.com/repos/{owner}/{repo}/git/trees/{default_branch}?recursive=1"
    headers = {
        "Authorization": f"Bearer {GITHUB_ACCESS_TOKEN}",
        "Accept": "application/vnd.github.v3+json",
    }
    response = requests.get(base_url, headers=headers)
    if response.status_code != 200:
        print(
            f"Error fetching tree for {owner}/{repo}. Status code: {response.status_code}"
        )
        return None
    tree = response.json()
    file_paths = [item["path"] for item in tree["tree"] if item["type"] == "blob"]
    return file_paths
```
fixThisChris/main.py

Lines 1 to 85 in 412ff66

```python
from datetime import datetime, timedelta
from flask import Flask, request, jsonify
import openai
import requests
import os
import logging
from env import OPENAI_API_KEY, GITHUB_ACCESS_TOKEN
from utils.github_utils import (
    USAGE_LIMIT,
    increment_usage_limit,
    is_rate_limit_reached,
    reset_usage_limits,
    run_query,
)

app = Flask(__name__)
openai.api_key = OPENAI_API_KEY

# Initialize logging
logging.basicConfig(level=logging.INFO)

from commons import send_github_request
import time
import atexit
from apscheduler.schedulers.background import BackgroundScheduler
import requests


def fetch_repository_invites():
    fetch_invites_url = "https://api.github.com/user/repository_invitations"
    headers = {
        "Authorization": f"Bearer {GITHUB_ACCESS_TOKEN}",
        "Accept": "application/vnd.github.v3+json",
    }
    invites = send_github_request(fetch_invites_url, "GET", headers)
    if invites:
        return invites.json()
    else:
        logging.error("Failed to fetch repository invitations")
        return None


def accept_repository_invitation(invitation_id):
    accept_url = f"https://api.github.com/user/repository_invitations/{invitation_id}"
    headers = {
        "Authorization": f"Bearer {GITHUB_ACCESS_TOKEN}",
        "Accept": "application/vnd.github.v3+json",
        "X-GitHub-Api-Version": "2022-11-28",  # Use the appropriate API version
    }
    response = send_github_request(accept_url, "PATCH", headers)
    return response


def accept_github_invitations():
    repository_invites = fetch_repository_invites()
    if not repository_invites:
        return
    for invite in repository_invites:
        invitation_id = invite.get("id")
        response = accept_repository_invitation(invitation_id)
        if not response:
            continue
        if response.status_code == 204:
            print(f"Accepted repository invitation with ID {invitation_id}")
        else:
            print(f"Failed to accept repository invitation with ID {invitation_id}")
        time.sleep(1)


def fetch_unread_mentions():
    notifications_url = "https://api.github.com/notifications"
    headers = {
        "Authorization": f"Bearer {GITHUB_ACCESS_TOKEN}",
        "Accept": "application/vnd.github.v3+json",
```

Step 2: ⌨️ Coding

  • utils/github_utils.py:run_query ✅ Commit 461e279
    • Modify the `run_query` function to return a tuple containing the AI response and the patch file. The patch file should be a string containing the changes suggested by the AI.
    • To generate the patch file, you can use the `difflib` module from Python's standard library, which provides classes and functions for comparing sequences, including HTML, context, and unified diffs.
    • The patch file should contain the differences between the original code and the code after applying the AI's suggestions; `difflib.unified_diff` can generate this.
    • `unified_diff` takes two lists of strings as input (the original code and the modified code); you can split the code into lines with the string method `splitlines`.
    • `unified_diff` returns a generator that produces the diff lines; join these lines into a single string with the `join` method of strings.
    • The AI's suggestions can be applied to the code using string replacement. For example, if the AI suggests replacing a function call with another one, you can use the `replace` method of strings to apply that suggestion.
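The `difflib` steps above can be sketched as follows. The helper name `make_patch` and the sample inputs are illustrative, not code from the repository:

```python
import difflib


def make_patch(original: str, modified: str, filename: str) -> str:
    """Build a unified-diff patch string from two versions of a file."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        modified.splitlines(keepends=True),
        fromfile=f"a/{filename}",
        tofile=f"b/{filename}",
    )
    # unified_diff yields the diff line by line; join it into one patch string.
    return "".join(diff)


# Example: apply a hypothetical AI suggestion via string replacement, then diff.
before = "def main():\n    run_query()\n"
after = before.replace("run_query()", "run_query_with_patch()")
patch = make_patch(before, after, "main.py")
print(patch)
```

The `a/`/`b/` prefixes make the output look like a `git diff`, so it can be pasted into a fenced ```diff block in an issue comment.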
  • main.py:solve_problem ✅ Commit a25a38c
    • Modify the `solve_problem` function to handle the patch file returned by `run_query`.
    • After calling `run_query`, you will have a tuple containing the AI response and the patch file; unpack it into two variables.
    • Post the patch file to the GitHub issue along with the AI response, either by appending it to the AI response or by posting it as a separate comment.
    • To post the patch file as a separate comment, use the `post_comment_to_issue` function, which takes the issue number, the comment text, and the repository owner and name as parameters.
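Under those assumptions, the updated `solve_problem` call site might look like this minimal sketch. The `_stub` functions stand in for the real `run_query` and `post_comment_to_issue`, the two-element return value is the proposed change (not confirmed repository behavior), and the "Suggested patch:" wording is invented here:

```python
posted = []  # records comments instead of calling the GitHub API


def run_query_stub(prompt, owner, repo_name):
    # Hypothetical: assumes the modified run_query returns (response, patch).
    return "Here is my analysis of the issue.", "--- a/main.py\n+++ b/main.py\n"


def post_comment_to_issue_stub(issue_number, comment_text, owner, repo):
    # Stand-in for main.py's post_comment_to_issue.
    posted.append(comment_text)


# Unpack the tuple, post the answer, then post the patch separately.
ai_response, patch_text = run_query_stub("prompt", "vivasvan1", "fixThisChris")
post_comment_to_issue_stub(2, ai_response, "vivasvan1", "fixThisChris")
if patch_text:
    post_comment_to_issue_stub(
        2, "Suggested patch:\n\n" + patch_text, "vivasvan1", "fixThisChris"
    )
```

Posting the patch as a second comment keeps the AI's prose answer readable even when the diff is long.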

Step 3: 🔁 Code Review

I have finished reviewing the code for completeness. I did not find errors for sweep/add-patch-file-generation.



🎉 Latest improvements to Sweep:

  • Sweep can now passively improve your repository! Check out Rules to learn more.

💡 To recreate the pull request edit the issue title or description. To tweak the pull request, leave a comment on the pull request.

@sweep-ai sweep-ai bot linked a pull request Oct 7, 2023 that will close this issue