First commit (#1)

gnosis · Jun 6, 2024 · d07d61a
1 parent 8c1a55c

Showing 11 changed files with 7,921 additions and 2 deletions.
diff --git a/.env.example b/.env.example
@@ -0,0 +1,3 @@
+OPENAI_API_KEY=
+TAVILY_API_KEY=
+BET_FROM_PRIVATE_KEY=
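Note: a minimal sketch of a startup check for the three variables listed in `.env.example`. It only inspects a mapping (by default the process environment); how the project itself loads `.env` is not shown in this diff, and treating all three keys as required is an assumption, since the README allows swapping out the LLM and the search engine. The names `REQUIRED_KEYS` and `missing_keys` are hypothetical.

```python
import os
from typing import Mapping

# Variable names taken from .env.example; treating all of them as required is an assumption.
REQUIRED_KEYS = ("OPENAI_API_KEY", "TAVILY_API_KEY", "BET_FROM_PRIVATE_KEY")


def missing_keys(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required variables that are unset or blank."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing values in environment:", ", ".join(missing))
```

Calling `missing_keys()` before any paid API call or bet would surface an incomplete `.env` early instead of failing midway through a run.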
diff --git a/.github/actions/python_prepare/action.yaml b/.github/actions/python_prepare/action.yaml
@@ -0,0 +1,15 @@
+name: "Prepare Python environment"
+description: "Set up Python and install dependencies"
+runs:
+  using: "composite"
+  steps:
+    - name: Set up Python 3.10
+      uses: actions/setup-python@v2
+      with:
+        python-version: "3.10.14"
+    - name: Install Poetry
+      shell: bash
+      run: curl -sSL https://install.python-poetry.org | python3 -
+    - name: Install dependencies
+      shell: bash
+      run: poetry install
diff --git a/.github/workflows/python_ci.yaml b/.github/workflows/python_ci.yaml
@@ -0,0 +1,18 @@
+name: Python CI
+
+on:
+  pull_request:
+  push:
+    branches: [main]
+  workflow_dispatch:
+
+jobs:
+  mypy:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v2
+        with:
+          submodules: true
+      - uses: ./.github/actions/python_prepare
+      - name: Run mypy
+        run: poetry run mypy
diff --git a/README.md b/README.md
@@ -1,2 +1,121 @@
-# gnosis-labs-zuberlin2024
-Repository for the hackathon run by Gnosis Labs at ZuBerlin 2024.
+# Gnosis Labs ZuBerlin 2024
+
+Welcome to the Gnosis AI ZuBerlin 2024 Hackathon repo! Here you will find all you need to build a tool for AI Agents that can make predictions on outcomes of future events.
+
+[Presentation available here.](https://docs.google.com/presentation/d/1gajA3m5p_X4R4oyNc80p5_uSYZz0z2R-YKxm0RQnz_4/edit?usp=sharing)
+
+Follow the instructions below to get started.
+
+## Support
+
+Contact us at https://t.me/+Fb0trLKZdMw2MTQ8.
+
+## Setup
+
+Install the project dependencies with `poetry`, using Python 3.10 (you can use [pyenv](https://github.com/pyenv/pyenv) to manage multiple Python versions):
+
+```bash
+python3.10 -m pip install poetry
+python3.10 -m poetry install
+python3.10 -m poetry shell
+```
+
+Copy `.env.example` to `.env` and fill in the values:
+
+### OpenAI API key
+
+We will provide you with an OpenAI key that is allowed to use gpt-3.5-turbo and embedding models; contact us in the Telegram group above to get one.
+
+However, you are welcome to use any other LLM.
+
+### Tavily API key
+
+Create a free account on https://tavily.com and get the key there.
+
+Again, you are welcome to use other search engines, combine them, or even take a totally different approach!
+
+### Private key on Gnosis Chain
+
+Use an existing wallet on Gnosis Chain or create a new one.
+
+By default, the script places only very small bets (0.00001 xDai per market), but you can contact us in the Telegram group above with your public key to get some free xDai.
+
+## Task
+
+Your task is to modify the `predict` function in `trader/prediction.py` by any means necessary.
+
+The goal of the `predict` function is, given a `question` about the future, to answer it with `True` (the answer is `yes`), `False` (the answer is `no`), or `None` (the prediction failed).
+
+All the questions are guaranteed to be about the future and to be in a binary yes/no format.
+
+You can play with the prompts, different approaches, different LLMs, search engines, or anything you can think of.
+
+The code can be messy; the only thing we ask is that it be reproducible on our machines. To help with that, `mypy` is the only check in the CI pipeline on GitHub.
+
+A few ideas to jump-start your experiments:
+
+On the research side:
+
+- Scrape multiple search engines
+- Scrape different kinds of sources depending on question type
+- Try different methods for extracting valuable information from each site
+- Handle cases where two sources contain conflicting information
+
+On the prediction side:
+
+- Have an ensemble of agents make predictions, then combine them by averaging or another aggregation method
+- Currently the LLM returns a float, and this is converted to a binary Yes/No answer by thresholding at 0.5. Experiment with having the LLM return different kinds of answers (e.g. categorical)
+
+### Testing your experiments
+
+Run 
+
+```bash
+PYTHONPATH=. streamlit run trader/app.py
+```
+
+to start a Streamlit application where you can give your prediction method either a question [from the Omen market](https://aiomen.eth.limo/) or one you write yourself.
+
+Run 
+
+```bash
+python trader/benchmark.py --n N
+```
+
+where `N` is the number of markets to predict. The benchmark script will run
+
+1. Random agent (coin flip between yes and no answers)
+2. Question-only agent (LLM call only, without any information from the internet)
+3. `prediction.py/predict`-based agent
+
+on `N` open markets from https://manifold.markets. 
+
+The idea is that markets on Manifold are mostly answered by real people, so the closer your agent is to their predictions, the better. However, this isn't always the case.
+
+Bear in mind your LLM credits, Tavily credits, or any other paid third-party provider credits when running the benchmark: it answers many markets in a single run, which can be very costly.
+
+Run 
+
+```bash
+python trader/main.py
+```
+
+to place bets on 10 random markets from https://aiomen.eth.limo. These won't be used for the final evaluation, but you can use them to double-check that everything works as expected.
+
+### Submission
+
+1. Run `python trader/main.py --final`; it will place bets on all markets that will be used for the evaluation. You can run the script multiple times, but we will only look at the latest bet on each market from your public key. If you get a "no markets found" error, either we haven't opened them yet, or they are already closed and it's too late to submit.
+2. Once you are happy with your agent's predictions, open a PR against this repository with your implementation and the public key used for placing bets. This is your submission.
+3. Make sure the CI pipeline is all green.
+
+### Evaluation
+
+1. Quantitative 
+    1. We will create N markets from the address `0xa7E93F5A0e718bDDC654e525ea668c64Fd572882` by the end of June, and they will be resolved roughly two weeks after creation.
+    2. We will measure the accuracy of your agent's answers (by the last bet on each market).
+
+2. Qualitative
+    1. We will look into the implementation and judge the creativity of the improvements.
+
+3. Cheating
+    1. For example, exactly the same markets can sometimes be found on other prediction market platforms. If we see in the code that the prediction isn't doing anything practical, we will disqualify it. That said, it's okay to look at other markets if they are not about the same question. For example, given the evaluation question `Will GNO hit $1000 by the end of 2025?`, it's okay to use markets such as `Will GNO hit $500 by the mid of 2025?` as guidance, but it's not okay to look at the market `Will GNO hit $1000 by the end of 2025?` and copy-paste its current probabilities.
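Note: the README's ensemble idea and its `True`/`False`/`None` contract for `predict` can be sketched as below. This is a minimal illustration under stated assumptions, not the repository's implementation: the actual signature of `predict` in `trader/prediction.py` is not shown in this section, and `ProbabilityAgent` and `ensemble_predict` are hypothetical names. Averaging is just one possible aggregation, and treating a mean of exactly 0.5 as "no" is an arbitrary choice.

```python
from statistics import mean
from typing import Callable, Optional, Sequence

# Hypothetical type: an agent maps a question to a probability of "yes" in [0, 1],
# or to None when it could not produce an estimate.
ProbabilityAgent = Callable[[str], Optional[float]]


def ensemble_predict(
    question: str,
    agents: Sequence[ProbabilityAgent],
    threshold: float = 0.5,
) -> Optional[bool]:
    """Average the agents' probabilities and threshold the mean.

    Returns True for "yes", False for "no", or None if no agent produced an estimate.
    """
    probabilities: list[float] = []
    for agent in agents:
        estimate = agent(question)
        if estimate is not None:  # skip agents that failed
            probabilities.append(estimate)
    if not probabilities:
        return None
    return mean(probabilities) > threshold


if __name__ == "__main__":
    # Stub agents standing in for LLM-backed ones.
    stubs: list[ProbabilityAgent] = [lambda q: 0.8, lambda q: 0.6, lambda q: None]
    print(ensemble_predict("Will GNO hit $1000 by the end of 2025?", stubs))
```

Swapping `mean` for a median or a weighted average only changes the aggregation line, which makes it easy to compare aggregation methods in the benchmark.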
diff --git a/mypy.ini b/mypy.ini
@@ -0,0 +1,36 @@
+[mypy]
+python_version = 3.10
+files = trader/
+plugins = pydantic.mypy
+warn_redundant_casts = True
+warn_unused_ignores = True
+disallow_any_generics = True
+warn_return_any = True
+check_untyped_defs = True
+show_error_codes = True
+strict_equality = True
+explicit_package_bases = True
+show_traceback = True
+disallow_incomplete_defs = True
+disallow_untyped_defs = True
+ignore_missing_imports = True
+# Exclude submodules that are themselves not type-checked
+exclude = prediction_market_agent/tools/mech/mech/
+
+# See https://github.com/python/mypy/issues/3905#issuecomment-421065323
+# We don't want to ignore all missing imports, because we want to catch them in our own code.
+# However, some libraries don't ship a stub file, so we only enforce import checking for our own packages.
+# An alternative would be to list every single dependency that does not have a stub.
+[mypy-prediction_market_agent.*]
+ignore_missing_imports = False
+[mypy-scripts.*]
+ignore_missing_imports = False
+[mypy-tests.*]
+ignore_missing_imports = False
+
+[pydantic-mypy]
+# See https://pydantic-docs.helpmanual.io/mypy_plugin/
+init_forbid_extra = True
+init_typed = True
+warn_required_dynamic_aliases = True
+warn_untyped_fields = True
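Note: with `disallow_untyped_defs`, `disallow_incomplete_defs`, and `warn_return_any` enabled, every function under `trader/` needs full annotations and must not return a bare `Any`. The sketch below shows one function written to satisfy those settings. It is illustrative only: `parse_probability` and the `p_yes` key are hypothetical and do not come from the repository.

```python
import json
from typing import Optional


def parse_probability(raw: str) -> Optional[float]:
    """Parse a reply such as '{"p_yes": 0.7}' into a probability, or None on failure."""
    try:
        data = json.loads(raw)  # json.loads is typed as returning Any
        # Convert explicitly: returning data["p_yes"] directly would trip warn_return_any.
        value = float(data["p_yes"])
    except (ValueError, KeyError, TypeError):
        # Invalid JSON, missing key, or a non-numeric / non-dict payload.
        return None
    return value if 0.0 <= value <= 1.0 else None
```

Running `poetry run mypy` locally before opening a PR mirrors the single CI check defined in this commit.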