Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check for empty user-defined subsampling scheme #1082

Merged
merged 1 commit into from
Sep 19, 2023

Conversation

huddlej
Copy link
Contributor

@huddlej huddlej commented Sep 18, 2023

Description of proposed changes

Adds a check for an empty user-defined subsampling scheme and provides a helpful error message when this happens to replace the unhelpful AttributeError message users would get before when the workflow tried to access a nonexistent method of a NoneType.

This error can happen when a user defines a custom subsampling scheme where the contents are not properly indented below the subsampling key. The resulting config is still a valid YAML file, so the emptiness of the subsampling scheme needs to be checked specifically. We could probably handle this better and more generically with a more stringent YAML schema.

Along the lines of that last point, I tried adding the following lines to the config schema to prevent an empty subsampling scheme:

  subsampling:
    type: object
    minProperties: 1

This change did not catch the issue with the user-defined configuration, however, because we call the validation function after we load the default config file and it is during this step that Snakemake throws an error (presumably when deep-merging the user-provided and default config files).

Next, I tried validating both before and after the configfile declaration in the Snakefile. This change catches the incorrect config file and prints the following error for the user:

WorkflowError in file /Users/jhuddlesfredhutch.org/projects/nextstrain/ncov/Snakefile, line 36:
Error validating config file.
ValidationError: None is not of type 'object'

Failed validating 'type' in schema['properties']['subsampling']:
    OrderedDict([('type', 'object'), ('minProperties', 1)])

On instance['subsampling']:
    None
  File "/Users/jhuddlesfredhutch.org/projects/nextstrain/ncov/Snakefile", line 36, in <module>

I like that this error gets generated from the schema without additional code added to the workflow itself. I wish the error from this validation step provided more details to the user about what specifically went wrong. The error I provide in the commit of this PR seems more helpful to the user, even though it requires a modification to the workflow

Testing

  • Tested with a broken builds YAML where the subsampling scheme was not properly indented
  • Tested with CI

@huddlej huddlej mentioned this pull request Sep 18, 2023
1 task
Adds a check for an empty user-defined subsampling scheme and provides a
helpful error message when this happens to replace the unhelpful
`AttributeError` message users would get before when the workflow tried to
access a nonexistent method of a `NoneType`.

This error can happen when a user defines a custom subsampling scheme
where the contents are not properly indented below the `subsampling`
key. The resulting config is still a valid YAML file, so the emptiness
of the subsampling scheme needs to be checked specifically. We could
probably handle this better and more generically with a more stringent
YAML schema.
@huddlej huddlej merged commit f89f2a2 into master Sep 19, 2023
13 checks passed
@huddlej huddlej deleted the check-for-empty-subsampling branch September 19, 2023 20:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

1 participant