Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update help docs for augur curate format-dates #1651

Open
anna-parker opened this issue Oct 16, 2024 · 7 comments · May be fixed by #1653
Open

Update help docs for augur curate format-dates #1651

anna-parker opened this issue Oct 16, 2024 · 7 comments · May be fixed by #1653
Assignees
Labels
documentation Improvements or additions to documentation

Comments

@anna-parker
Copy link
Collaborator

anna-parker commented Oct 16, 2024

Current Behavior

Hi! I am running augur curate format-dates, but augur is unable to format any of the dates.

WARNING: Unable to format dates for the following (record, field, date string):
(11, 'date', '1987')
(12, 'date', '1999-05')
...

Expected behavior

I would expect augur to format the dates as per documentation: 1987 should become 1987-XX-XX, '1999-05' should become 1999-05-XX.

How to reproduce

Steps to reproduce the current behavior:

  1. I tried to attach a downsampled tsv file with only the two fields I displayed above - this didn't work - so I added the small example to my branch: https://github.com/anna-parker/marburg-virus-tree/blob/curate_dates/example.tsv, you can download this and then run
  2. run
augur curate format-dates --metadata example.tsv --date-fields "date" --output-metadata example_output.tsv --failure-reporting warn
  1. See that in example_output.tsv both dates have be formatted as XXXX-XX-XX.

Your environment: if running Nextstrain locally

  • Operating system: MacOs Sonoma 14.6.1
  • Version:
$ auspice --version
2.53.0
@anna-parker anna-parker added the bug Something isn't working label Oct 16, 2024
@joverlee521
Copy link
Contributor

Hi @anna-parker, sorry if the documentation is not clear here!

You will need to provide the --expected-date-formats option for custom date formats that do not match the current defaults ['%Y-%m-%d', '%Y-%m-XX', '%Y-XX-XX', 'XXXX-XX-XX'].

If you update the command to include --expected-date-formats '%Y' '%Y-%m', it should parse the dates as expected.
Using your example data, I was able to run the following without any warnings:

augur curate format-dates \
    --metadata example.tsv \
    --date-fields "date" \
    --expected-date-formats '%Y' '%Y-%m' \
    --output-metadata example_output.tsv \
    --failure-reporting warn

Please let us know if this doesn't work for you and we can investigate further.

@corneliusroemer corneliusroemer added documentation Improvements or additions to documentation and removed bug Something isn't working labels Oct 16, 2024
@corneliusroemer corneliusroemer changed the title augur curate format-dates is unable to parse dates with only year or year-month augur curate format-dates is unable to parse dates with only year or year-month in contrast to what the docs suggest Oct 16, 2024
@corneliusroemer
Copy link
Member

corneliusroemer commented Oct 16, 2024

Thanks @joverlee521 for explaining where the issue lies. I hope you don't mind if I reopen this issue as while not a bug, it's a documentation issue that hasn't yet been resolved afaict:

The first line of augur curate docs makes users believe that 2023 -> 2023-XX-XX works by default:

Brave Browser 2024-10-16 19 13 43

https://docs.nextstrain.org/projects/augur/en/stable/usage/cli/curate/format-dates.html

Looking further down the docs, it's strange that --expected-date-formats is listed under the "REQUIRED" header if it has defaults set and hence isn't actually required. It's only required in the sense that it's a no-op if the argument isn't passed.

Brave Browser 2024-10-16 19 13 46

So given how it's documented, one should either remove the defaults so that it's actually required, or move it to optional, or change the defaults.

@corneliusroemer
Copy link
Member

There's one more oddity in the docs:
Default being stated twice, once as true and once as false

Brave Browser 2024-10-16 19 25 40

It might be nice to mention a standard usage as well, something like this which was generated by ChatGPT after reading the docs:

augur curate format-dates --metadata metadata.tsv --output curated_metadata.tsv --date-fields date_column --expected-date-formats "%Y" "%Y-%m" "%Y-%m-%d"

@joverlee521 joverlee521 changed the title augur curate format-dates is unable to parse dates with only year or year-month in contrast to what the docs suggest Update help docs for augur curate format-dates Oct 16, 2024
@joverlee521
Copy link
Contributor

The first line of augur curate docs makes users believe that 2023 -> 2023-XX-XX works by default:

Ah, I see that now. Is it clearer if updated to

Screenshot 2024-10-16 at 10 55 42 AM

So given how it's documented, one should either remove the defaults so that it's actually required, or move it to optional, or change the defaults.

I missed this when I added the default values in #1501. I'll move --expected-date-formats down to the optional section.

Default being stated twice, once as true and once as false

The --no-mask-failure oddity is noted in #1585 and is an Augur-wide issue that will be resolved separately.

@corneliusroemer
Copy link
Member

Thanks @joverlee521! Those edits are great! What I don't yet understand is what format-dates does if run with default values. Does it error if any of the dates are not one of the expected default formats?

@joverlee521 joverlee521 self-assigned this Oct 16, 2024
@joverlee521
Copy link
Contributor

What I don't yet understand is what format-dates does if run with default values. Does it error if any of the dates are not one of the expected default formats?

If only run with default values it will just be no-op and will error if the dates do not match the default formats.

Hmm, maybe it would be better to just mark the --expected-date-formats option as explicitly required even though it has default values.

@anna-parker
Copy link
Collaborator Author

@joverlee521 thanks for looking into this! I think the screenshot you provided already makes this much clearer

Screenshot 2024-10-16 at 10 55 42 AM

@joverlee521 joverlee521 linked a pull request Oct 17, 2024 that will close this issue
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants