Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing tests with provided OPENAI_API_KEY #85

Merged
merged 3 commits into from
Sep 20, 2024

Conversation

iQuxLE
Copy link
Collaborator

@iQuxLE iQuxLE commented Sep 11, 2024

Fixing all tests that were currently failing even though an API key was given
test_mapper.py sometimes randomly cannot find the "test" collection, but this is not a continuous error.

Most relevant changes:

  • in the wrapper module we are now using actual temporary files to prevent readonly database operations
  • DatabaseAugmentedCompletionEvaluator was missing an abstract class
  • openai_extractor.py is now fully refactored to use the updated openAI API
  • duckdb_adapter.py had still some problems with fetching the right dimensions regarding a model
    • now you can use "openai:" for default text-embedding-ada-002, "openai:text-embedding-ada-002" or any other model after the colon (e.g. "openai:text-embedding-3-small"), if no model is provided default to sentence-transformer
  • clinvar_wrapper.py now correctly fetches the "clinical_significance" via "germline_classification" and does not break if "trait_set" is not in the data

@iQuxLE iQuxLE force-pushed the tests_with_apikey_fix branch 2 times, most recently from ab5763d to a245bf7 Compare September 11, 2024 23:19
…enAI API key was passed

- Implement abstractmethod in dae_evaluator and refactor to new OpenAI API
- Refactor dimension function in duckdb_adapter
- test refactoring: clinwar wrapper, dae evaluator, splitter
- Fix issues with readonly database and linting
@caufieldjh
Copy link
Member

I'm still getting some strange duckdb errors locally - any idea what is going on @iQuxLE ?

================================================================================= short test summary info ==================================================================================
FAILED tests/store/test_duckdb_adapter.py::test_store_variations[all-MiniLM-L6-v2-False] - duckdb.duckdb.Error: Failure while replaying WAL file "/home/harry/curate-gpt/tests/output/duckdbvss.db.wal": Cannot bind index 'test_collection', unknown index type 'HNSW'. You need ...
FAILED tests/store/test_duckdb_adapter.py::test_store_variations[None-False] - duckdb.duckdb.Error: Failure while replaying WAL file "/home/harry/curate-gpt/tests/output/duckdbvss.db.wal": Cannot bind index 'test_collection', unknown index type 'HNSW'. You need ...
FAILED tests/store/test_duckdb_adapter.py::test_fetch_all_memory_safe - duckdb.duckdb.Error: Failure while replaying WAL file "/home/harry/curate-gpt/tests/output/duckdbvss.db.wal": Cannot bind index 'test_collection', unknown index type 'HNSW'. You need ...
FAILED tests/store/test_duckdb_adapter.py::test_the_embedding_function_variations[None-None-False] - duckdb.duckdb.Error: Failure while replaying WAL file "/home/harry/curate-gpt/tests/output/duckdbvss.db.wal": Cannot bind index 'test_collection', unknown index type 'HNSW'. You need ...
FAILED tests/store/test_duckdb_adapter.py::test_the_embedding_function_variations[one_collection-None-False] - duckdb.duckdb.Error: Failure while replaying WAL file "/home/harry/curate-gpt/tests/output/duckdbvss.db.wal": Cannot bind index 'test_collection', unknown index type 'HNSW'. You need ...
ERROR tests/store/test_duckdb_adapter.py::test_diversified_search - duckdb.duckdb.Error: Failure while replaying WAL file "/home/harry/curate-gpt/tests/output/duckdbvss.db.wal": Cannot bind index 'test_collection', unknown index type 'HNSW'. You need ...

@caufieldjh
Copy link
Member

More of a note to myself than anyone else: reminder that running pytest through tox uses its own environment, so it doesn't access env vars like OPENAI_API_KEY

@caufieldjh
Copy link
Member

test_runner also seems to have trouble finding the test collection:

FAILED tests/evaluation/test_runner.py::test_runner[20-4-fields_to_predict0-fields_to_mask0] - ValueError: Insufficient test objects in collection terms_go_testing_4; 0 < 4
FAILED tests/evaluation/test_runner.py::test_runner[20-4-fields_to_predict1-fields_to_mask1] - ValueError: Insufficient test objects in collection terms_go_testing_4; 0 < 4
FAILED tests/evaluation/test_runner.py::test_runner[20-4-fields_to_predict2-fields_to_mask2] - ValueError: Insufficient test objects in collection terms_go_testing_4; 0 < 4

@iQuxLE
Copy link
Collaborator Author

iQuxLE commented Sep 13, 2024

@caufieldjh

I'm still getting some strange duckdb errors locally - any idea what is going on @iQuxLE ?

These WAL errors did not come up for me as of right now. However, it is some kind of a known problem with the persistent duckdb vector similarity search feature

The reasoning for locking this feature behind an experimental flag is that “WAL” recovery is not yet properly implemented for custom indexes, meaning that if a crash occurs or the database is shut down unexpectedly while there are uncommitted changes to a HNSW-indexed table, you can end up with data loss or corruption of the index.

By now I handled this by killing the old process when connecting to the db as here. I keep you updated and try to find a better solution.

@iQuxLE
Copy link
Collaborator Author

iQuxLE commented Sep 13, 2024

test_runner also seems to have trouble finding the test collection:

FAILED tests/evaluation/test_runner.py::test_runner[20-4-fields_to_predict0-fields_to_mask0] - ValueError: Insufficient test objects in collection terms_go_testing_4; 0 < 4
FAILED tests/evaluation/test_runner.py::test_runner[20-4-fields_to_predict1-fields_to_mask1] - ValueError: Insufficient test objects in collection terms_go_testing_4; 0 < 4
FAILED tests/evaluation/test_runner.py::test_runner[20-4-fields_to_predict2-fields_to_mask2] - ValueError: Insufficient test objects in collection terms_go_testing_4; 0 < 4

I also ran into those in some cases, however somehow most of the times it seemed to be not failing. I did lots of Debugging for this already and the data is stratified correctly but then suddenly It fails to find it. For this I think the reason might also be some "parallel" read problems, so different processes are looking into the same db.

@iQuxLE
Copy link
Collaborator Author

iQuxLE commented Sep 13, 2024

Some try-except-finaly blocks when creating DB instances, should help to prevent most of this. I restructure a bit, as also I wanna enable a more improved DEBUG_MODE.

@caufieldjh
Copy link
Member

OK, excellent. I'm not too worried about the duckdb issues as there are so many ways that can break and we can't account for all of them, plus we have chromadb as a fallback. But catching the errors around creating the DBs is a good idea.

…e isolated dbs for each test, introduce tmp_path and DEBUG mode, refactor to cleaner code
@iQuxLE
Copy link
Collaborator Author

iQuxLE commented Sep 20, 2024

This is a quick fix so not all dbs from tests/wrappers are outputted locally.

It enables better debugging and isolated dbs for all wrapper tests.
Also it gives a structure on how to redo the go_test_chroma_db fixture.

  • It is used in multiple tests, and I prefer a solution where each test uses a distinct db instance/collection

@caufieldjh caufieldjh merged commit e35bed8 into monarch-initiative:main Sep 20, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants