Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Config / CLI refactoring #72

Merged
merged 22 commits into from
Oct 18, 2023
Merged

Config / CLI refactoring #72

merged 22 commits into from
Oct 18, 2023

Conversation

pmeier
Copy link
Member

@pmeier pmeier commented Oct 12, 2023

Yet another major refactoring PR, but hopefully the last one. Closes #50 and helps #71. Highlights:

  • Instead of using a .py file as configuration, we now use a TOML file. To configure Python objects, i.e. adding custom source storages or assistants, we now require them to be put into the configuration as a importable path, e.g. my_module.MyAssistant. For that, the user has to move my_module on the PYTHONPATH.

  • The config internally is now handled by pydantic-settings to eliminate the need for us to implement any type checking or add functionality to pull values from env vars.

  • Instead of having ragna.demo_config and ragna.builtin_config as "constants", they are now class methods on the config itself, i.e. ragna.Config.demo() and ragna.Config.builtin(). The latter at least removes the long import overhead detailed in Unify embedding model / tokenizer for builtin source storages? #71, since we now only perform the availability check if the user actually wants to use the config anyway.

  • Update pydantic to >=2. This was previously blocked by chroma, but was resolved for >=0.4.13.

  • Fix our document handler story, which has been bugging me for a while:

    if page_extractor is None:
    try:
    # FIXME:
    page_extractor = BUILTIN_PAGE_EXTRACTORS[Path(name).suffix]()
    except KeyError:
    raise RagnaException()

    Functionality is not much different, but is properly implemented now. I pondered on the question if we should make the document handlers configurable as well, but found no good use case. Why would someone deselect the PDF handler for example? Maybe there is a use case for this, but if so we can go for this later.

  • I've added an s to builtin component namespaces, i.e ragna.source_storage -> ragna.source_storages and ragna.assistant -> ragna.assistants to better reflect their nature.

  • The CLI got a new ragna config command that is used to create and check config TOML files. I'm planning to have an interactive wizard there, but this is lower prio. Will open for this.

  • Remove logging, since we don't have a good story on that yet

@pmeier pmeier changed the base branch from database to main October 12, 2023 21:00
@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@pmeier pmeier changed the title [DO-NOT-MERGE] Config / CLI refactoring Oct 16, 2023
@pavithraes
Copy link
Member

Instead of having ragna.demo_config and ragna.builtin_config as "constants", they are now class methods on the config itself, i.e. ragna.Config.demo() and ragna.Config.builtin()

@pmeier First, thanks - I like this change. :) Quick question: What's the difference between the demo and builtin configs from the user's POV?

@pmeier
Copy link
Member Author

pmeier commented Oct 18, 2023

What's the difference between the demo and builtin configs from the user's POV?

The demo config is the bare minimum. No requirements other than the default ragna install needed. With the demo config, ragna is stateless and no worker is needed to actually perform the tasks. This is the default when using the Python API.

The builtin config includes all builtin components that ragna provides if the requirements (installed packages, env vars) are met. ragna will use a local database to store the state. Plus, we use a file system queue for the tasks, which in turn means that one needs to have a worker running for the tasks to be performed. This is the default for the REST API and will become the default for the UI.

@pmeier pmeier linked an issue Oct 18, 2023 that may be closed by this pull request
@pmeier pmeier merged commit 21ac6fb into main Oct 18, 2023
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG?] ragna ls raises an AttributeError How are we going to handle configuration?
2 participants