Skip to content

argilla-io/rubrix-feedstock

 
 

Repository files navigation

About rubrix

Home: https://recogn.ai

Package license: Apache-2.0

Feedstock license: BSD-3-Clause

Summary: Open-source tool for exploring, labeling, and monitoring data for NLP projects.

Development: https://github.com/recognai/rubrix

Documentation: https://rubrix.readthedocs.io/en/stable/

Rubrix is a production-ready Python framework for exploring, annotating, and managing data in NLP projects.

Key features:

  • Open: Rubrix is free, open-source, and 100% compatible with major NLP libraries (Hugging Face transformers, spaCy, Stanford Stanza, Flair, etc.). In fact, you can use and combine your preferred libraries without implementing any specific interface.

  • End-to-end: Most annotation tools treat data collection as a one-off activity at the beginning of each project. In real-world projects, data collection is a key activity of the iterative process of ML model development. Once a model goes into production, you want to monitor and analyze its predictions, and collect more data to improve your model over time. Rubrix is designed to close this gap, enabling you to iterate as much as you need.

  • User and Developer Experience: The key to sustainable NLP solutions is to make it easier for everyone to contribute to projects. Domain experts should feel comfortable interpreting and annotating data. Data scientists should feel free to experiment and iterate. Engineers should feel in control of data pipelines. Rubrix optimizes the experience for these core users to make your teams more productive.

  • Beyond hand-labeling: Classical hand labeling workflows are costly and inefficient, but having humans-in-the-loop is essential. Easily combine hand-labeling with active learning, bulk-labeling, zero-shot models, and weak-supervision in novel data annotation workflows.

PyPI: https://pypi.org/project/rubrix

Current build status

All platforms:

Current release info

Name Downloads Version Platforms
Conda Recipe Conda Downloads Conda Version Conda Platforms
Conda Recipe Conda Downloads Conda Version Conda Platforms

Installing rubrix

Installing rubrix from the conda-forge channel can be achieved by adding conda-forge to your channels with:

conda config --add channels conda-forge
conda config --set channel_priority strict

Once the conda-forge channel has been enabled, rubrix, rubrix-server can be installed with conda:

conda install rubrix rubrix-server

or with mamba:

mamba install rubrix rubrix-server

It is possible to list all of the versions of rubrix available on your platform with conda:

conda search rubrix --channel conda-forge

or with mamba:

mamba search rubrix --channel conda-forge

Alternatively, mamba repoquery may provide more information:

# Search all versions available on your platform:
mamba repoquery search rubrix --channel conda-forge

# List packages depending on `rubrix`:
mamba repoquery whoneeds rubrix --channel conda-forge

# List dependencies of `rubrix`:
mamba repoquery depends rubrix --channel conda-forge

About conda-forge

Powered by NumFOCUS

conda-forge is a community-led conda channel of installable packages. In order to provide high-quality builds, the process has been automated into the conda-forge GitHub organization. The conda-forge organization contains one repository for each of the installable packages. Such a repository is known as a feedstock.

A feedstock is made up of a conda recipe (the instructions on what and how to build the package) and the necessary configurations for automatic building using freely available continuous integration services. Thanks to the awesome service provided by Azure, GitHub, CircleCI, AppVeyor, Drone, and TravisCI it is possible to build and upload installable packages to the conda-forge Anaconda-Cloud channel for Linux, Windows and OSX respectively.

To manage the continuous integration and simplify feedstock maintenance conda-smithy has been developed. Using the conda-forge.yml within this repository, it is possible to re-render all of this feedstock's supporting files (e.g. the CI configuration files) with conda smithy rerender.

For more information please check the conda-forge documentation.

Terminology

feedstock - the conda recipe (raw material), supporting scripts and CI configuration.

conda-smithy - the tool which helps orchestrate the feedstock. Its primary use is in the construction of the CI .yml files and simplify the management of many feedstocks.

conda-forge - the place where the feedstock and smithy live and work to produce the finished article (built conda distributions)

Updating rubrix-feedstock

If you would like to improve the rubrix recipe or build a new package version, please fork this repository and submit a PR. Upon submission, your changes will be run on the appropriate platforms to give the reviewer an opportunity to confirm that the changes result in a successful build. Once merged, the recipe will be re-built and uploaded automatically to the conda-forge channel, whereupon the built conda packages will be available for everybody to install and use from the conda-forge channel. Note that all branches in the conda-forge/rubrix-feedstock are immediately built and any created packages are uploaded, so PRs should be based on branches in forks and branches in the main repository should only be used to build distinct package versions.

In order to produce a uniquely identifiable distribution:

  • If the version of a package is not being increased, please add or increase the build/number.
  • If the version of a package is being increased, please remember to return the build/number back to 0.

Feedstock Maintainers

About

A conda-smithy repository for rubrix.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Shell 100.0%