First commit (#1)
kongzii authored Jun 6, 2024
1 parent 8c1a55c commit d07d61a
Showing 11 changed files with 7,921 additions and 2 deletions.
3 changes: 3 additions & 0 deletions .env.example
@@ -0,0 +1,3 @@
OPENAI_API_KEY=
TAVILY_API_KEY=
BET_FROM_PRIVATE_KEY=
15 changes: 15 additions & 0 deletions .github/actions/python_prepare/action.yaml
@@ -0,0 +1,15 @@
name: "Prepare Python environment"
description: "Set up Python and install dependencies"
runs:
  using: "composite"
  steps:
    - name: Set up Python 3.10
      uses: actions/setup-python@v2
      with:
        python-version: 3.10.14
    - name: Install Poetry
      shell: bash
      run: curl -sSL https://install.python-poetry.org | python3 -
    - name: Install dependencies
      shell: bash
      run: poetry install
18 changes: 18 additions & 0 deletions .github/workflows/python_ci.yaml
@@ -0,0 +1,18 @@
name: Python CI

on:
  pull_request:
  push:
    branches: [main]
  workflow_dispatch:

jobs:
  mypy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          submodules: true
      - uses: ./.github/actions/python_prepare
      - name: Run mypy
        run: poetry run mypy
123 changes: 121 additions & 2 deletions README.md
@@ -1,2 +1,121 @@
# gnosis-labs-zuberlin2024
Repository for the hackathon run by Gnosis Labs at ZuBerlin 2024.
# Gnosis Labs ZuBerlin 2024

Welcome to the Gnosis AI ZuBerlin 2024 Hackathon repo! Here you will find all you need to build a tool for AI Agents that can make predictions on outcomes of future events.

[Presentation available here.](https://docs.google.com/presentation/d/1gajA3m5p_X4R4oyNc80p5_uSYZz0z2R-YKxm0RQnz_4/edit?usp=sharing)

Follow the instructions below to get started.

## Support

Contact us at https://t.me/+Fb0trLKZdMw2MTQ8.

## Setup

Install the project dependencies with `poetry`, using Python 3.10 (you can use [pyenv](https://github.com/pyenv/pyenv) to manage multiple Python versions):

```bash
python3.10 -m pip install poetry
python3.10 -m poetry install
python3.10 -m poetry shell
```

Copy `.env.example` to `.env` and fill in the values:

### OpenAI API key

We will provide you with an OpenAI key that is allowed to use gpt-3.5-turbo and embedding models; contact us in the TG group above.

However, you are welcome to use any LLM you like.

### Tavily API key

Create a free account on https://tavily.com and get the key there.

Again, you are welcome to use other search engines, combine them, or even take a totally different approach!

### Private key on Gnosis Chain

Use an existing wallet on Gnosis Chain or create a new one.

By default, the script places only very small bets (0.00001 xDai per market), but you can contact us in the TG group above with your public key to get some free xDai.
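
Before running anything, it can help to check that all three values are filled in. The snippet below is a minimal sketch using only the standard library and the variable names from `.env.example`; the helper name `missing_keys` is hypothetical, and it assumes the values from `.env` have already been loaded into the process environment (for example by your shell or a dotenv loader).

```python
import os
from typing import Mapping

# Variable names taken from .env.example.
REQUIRED_KEYS = ("OPENAI_API_KEY", "TAVILY_API_KEY", "BET_FROM_PRIVATE_KEY")


def missing_keys(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or blank (hypothetical helper)."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing values in .env:", ", ".join(missing))
    else:
        print("All required keys are set.")
```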

## Task

Your task is to modify the `predict` function in `trader/prediction.py` by any means necessary.

The goal of the `predict` function is, given a `question` about the future, to answer it with either `True` (the answer is `yes`), `False` (the answer is `no`), or `None` (the prediction failed).

All the questions are guaranteed to be about the future and to be in a binary yes/no format.

You can play with the prompts, different approaches, different LLMs, search engines, or anything you can think of.

The code can be messy; the only thing we ask is that it be reproducible on our machines. To help with that, `mypy` is the only check in the CI pipeline on GitHub.

A few ideas to jump-start your experiments:

On the research side:

- Scrape multiple search engines
- Scrape different kinds of sources depending on question type
- Try different methods for extracting valuable information from each site
- Handle cases where two sources contain conflicting information

On the prediction side:

- Have an ensemble of agents make predictions, then take an average or use another aggregation method
- Currently the LLM returns a float, and this is converted to a binary Yes/No answer by thresholding at 0.5. Experiment with having the LLM return different kinds of answers (e.g. categorical)
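
The ensemble idea and the 0.5 threshold mentioned above can be sketched as follows. This is only an illustration under assumptions: the function name `aggregate_predictions` is hypothetical, each agent is assumed to return a probability in [0, 1] or `None` on failure, and the strict `>` comparison is one possible tie-breaking choice, not necessarily what `trader/prediction.py` does.

```python
from statistics import mean
from typing import Optional, Sequence


def aggregate_predictions(
    probabilities: Sequence[Optional[float]],
    threshold: float = 0.5,
) -> Optional[bool]:
    """Average the usable probabilities and threshold the result (hypothetical helper).

    Returns None when no agent produced a usable probability, mirroring the
    True / False / None contract described for `predict`.
    """
    # Drop failed predictions and values outside the valid probability range.
    valid = [p for p in probabilities if p is not None and 0.0 <= p <= 1.0]
    if not valid:
        return None
    return mean(valid) > threshold
```

Swapping `mean` for `statistics.median` is one way to make the aggregate less sensitive to a single outlier agent.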

### Testing your experiments

Run

```bash
PYTHONPATH=. streamlit run trader/app.py
```

to start a Streamlit application where you can give your prediction method either a question [from the Omen market](https://aiomen.eth.limo/) or one you write yourself.

Run

```bash
python trader/benchmark.py --n N
```

where `N` is the number of markets to predict. The benchmark script will run

1. Random agent (coin flip between yes and no answers)
2. Question-only agent (only LLM call, without any information from internet)
3. `prediction.py/predict`-based agent

on `N` open markets from https://manifold.markets.

The idea is that markets on Manifold are mostly answered by real people, so the closer your agent is to their predictions, the better. However, this isn't always the case.
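
One simple way to quantify "closeness" to the crowd is the fraction of markets where the agent's yes/no answer matches the side the market currently favors. The sketch below is an assumption about how such a comparison could look, not the metric `trader/benchmark.py` actually reports; the name `agreement_rate` and the 0.5 cut-off on market probabilities are hypothetical.

```python
from typing import Optional, Sequence


def agreement_rate(
    agent_answers: Sequence[Optional[bool]],
    market_probabilities: Sequence[float],
) -> float:
    """Fraction of markets where the agent agrees with the crowd (hypothetical helper).

    A failed prediction (None) counts as a disagreement.
    """
    if len(agent_answers) != len(market_probabilities):
        raise ValueError("Both sequences must have one entry per market.")
    if not agent_answers:
        return 0.0
    matches = sum(
        answer is not None and answer == (probability > 0.5)
        for answer, probability in zip(agent_answers, market_probabilities)
    )
    return matches / len(agent_answers)
```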

Bear in mind your LLM credits, Tavily credits, or any other paid third-party provider credits when running the benchmark: it answers many markets in a single run, which can be very costly.

Run

```bash
python trader/main.py
```

to place bets on 10 random markets from https://aiomen.eth.limo. These bets won't be used for the final evaluation, but they let you double-check that everything works as expected.

### Submission

1. Run `python trader/main.py --final` to place bets on all markets that will be used for the evaluation. You can run the script multiple times, but we will only look at the latest bet on each market from your public key. If you get a "no markets found" error, either we haven't opened the markets yet, or they are already closed and it's too late to submit.
2. Once you are happy with your agent's predictions, open a PR against this repository with your implementation and the public key you used for placing bets. This is your submission.
3. Make sure the CI pipeline is all green.

### Evaluation

1. Quantitative
   1. We will create N markets from the address `0xa7E93F5A0e718bDDC654e525ea668c64Fd572882` by the end of June, and they will be resolved roughly two weeks after creation.
   2. We will measure the accuracy of your agent's answers (by the last bet on each market).

2. Qualitative
   1. We will look into the implementation and judge the creativity of the improvements.

3. Cheating
   1. For example, exactly the same markets can sometimes be found on other prediction market platforms. If we see in the code that the prediction isn't doing anything practical, we will disqualify it. That said, it's okay to look at other markets if they are not about the same question. For example, given the evaluation question `Will GNO hit $1000 by the end of 2025?`, it's okay to use markets such as `Will GNO hit $500 by the mid of 2025?` as guidance, but it's not okay to look at the market `Will GNO hit $1000 by the end of 2025?` and copy-paste its current probabilities.
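
To make the quantitative criterion concrete, here is a rough sketch of "accuracy by the last bet on each market". It is not the organizers' evaluation code: the `Bet` record, the integer timestamps, and the choice to count markets without any bet as incorrect are all assumptions made for illustration.

```python
from typing import NamedTuple


class Bet(NamedTuple):
    """Hypothetical record of one bet placed from a participant's public key."""

    market_id: str
    timestamp: int  # assumed to increase with time, e.g. a block timestamp
    answer: bool


def latest_answers(bets: list[Bet]) -> dict[str, bool]:
    """Keep only the most recent answer per market."""
    latest: dict[str, Bet] = {}
    for bet in bets:
        current = latest.get(bet.market_id)
        if current is None or bet.timestamp >= current.timestamp:
            latest[bet.market_id] = bet
    return {market_id: bet.answer for market_id, bet in latest.items()}


def accuracy(bets: list[Bet], resolutions: dict[str, bool]) -> float:
    """Share of resolved markets where the latest answer matches the outcome.

    Assumption: a market with no bet at all counts as incorrect.
    """
    if not resolutions:
        return 0.0
    answers = latest_answers(bets)
    correct = sum(
        answers.get(market_id) == outcome
        for market_id, outcome in resolutions.items()
    )
    return correct / len(resolutions)
```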
36 changes: 36 additions & 0 deletions mypy.ini
@@ -0,0 +1,36 @@
[mypy]
python_version = 3.10
files = trader/
plugins = pydantic.mypy
warn_redundant_casts = True
warn_unused_ignores = True
disallow_any_generics = True
warn_return_any = True
check_untyped_defs = True
show_error_codes = True
strict_equality = True
explicit_package_bases = True
show_traceback = True
disallow_incomplete_defs = True
disallow_untyped_defs = True
ignore_missing_imports = True
# Exclude submodules that are themselves not type-checked
exclude = prediction_market_agent/tools/mech/mech/

# See https://github.com/python/mypy/issues/3905#issuecomment-421065323
# We don't want to ignore all missing imports as we want to catch those in our own code
# But for certain libraries they don't have a stub file, so we only enforce import checking for our own libraries.
# Another alternative would be to list out every single dependency that does not have a stub.
[mypy-prediction_market_agent.*]
ignore_missing_imports = False
[mypy-scripts.*]
ignore_missing_imports = False
[mypy-tests.*]
ignore_missing_imports = False

[pydantic-mypy]
# See https://pydantic-docs.helpmanual.io/mypy_plugin/
init_forbid_extra = True
init_typed = True
warn_required_dynamic_aliases = True
warn_untyped_fields = True