-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve definition of dependency table entries #420
Conversation
This is an example implementation, how we might solve #419. @ChristianGeng if you have a better idea, feel free to open a different pull request, or add your comments here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- The old
DEPEND_TYPE_NAMES
is not needed any more and can be hardcoded to
"meta", "media" etc. without a loss in generality.
-
I personally like moving
DEPENDENCIES
->DEPENDENCY
to singular. There seems to be no clear convention on that topic, neither in python pep nor in the SQL community. Some people recommend to use plural for tables, and singular for columns in order to distinguish between te two. But even if it does not help to disambiguate I like reading it better -
Representing
DEPENDENCY_TABLE
as a dict is very fortunate: like this is looks very much like a table scheme. In addition, in case that we decide to move topolars
one day, this will rather simplify things, as polars uses a dict for data frame scheme definition. Saying that, polars uses acollections.OrderedDict
. I do not see the advantage of that, as all newer python version will respect the order of k,v creation and not return random permutations of key names. So using a dict I take it is fine. Also the dict stresses the data structure nature in comparison to principally unrelated attributes. -
The additional
docstrings
I find come in handy too. They are not intended to be part of the sphinx pages, but are great in their functions as remindersTherefore I would simply approve this.
Closes #419
The following improvements are done to all settings inside
audb.core.define
that are related to the dependency table:# Dependencies
sectionDEPENDENCY
in all related variablesThe idea of the previous implementation was to have a central place, where we can change a name if needed.
E.g.
define.DependField.ARCHIVE = "archive"
could be changed todefine.DependField.ARCHIVE = "zipfile"
,without the need to change it at any other place of the code. In principle, this makes sense, but with the dependency table, we are anyway not allowed to change those names, as they are also stored inside the dependency table as column names.
In my opinion, this means, we also don't have to ensure that we have a central place to define the name,
and can write
"archive"
instead ofdefine.DependField.ARCHIVE
, which also improves readability of the code. And allows us to specify the dictionaries, introduced in this pull request.