Missing trait literals are converted to lower case in augur export stderr output #1584

Open
corneliusroemer opened this issue Aug 17, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@corneliusroemer
Member

Current Behavior

augur export converts literals to lower case in the warnings it writes to stderr, which is probably not what we want.

Expected behavior

Literals are output as they are, not changing case.

How to reproduce

Steps to reproduce the current behavior:

  1. Include some extra literals in a colors file, including ones with an upper-case letter in a non-initial position, e.g. 21L
  2. Run augur export with --colors
  3. Observe the log messages
$ augur export v2 \
    --tree builds/wuhan/tree.nwk \
    --metadata builds/wuhan/metadata_with_bloom_scores.tsv \
    --node-data builds/wuhan/branch_lengths.json builds/wuhan/muts.json builds/wuhan/clades_display.json builds/wuhan/clades.json builds/wuhan/clades_nextstrain.json builds/wuhan/clades_who.json builds/wuhan/internal_pango.json \
    --colors builds/wuhan/colors.tsv \
    --auspice-config profiles/clades/wuhan/auspice_config.json \
    --title 'SARS-CoV-2 phylogeny' \
    --description profiles/clades/description.md \
    --include-root-sequence-inline \
    --minify-json \
    --output auspice/wuhan/auspice_raw.json
        
Validating schema of 'builds/wuhan/muts.json'...
Validating config file profiles/clades/wuhan/auspice_config.json against the JSON schema
Validating schema of 'profiles/clades/wuhan/auspice_config.json'...
WARNING: Requested color-by field 'placement_priors' does not exist and will not be used as a coloring or exported.

WARNING: These values for trait clade_membership were not specified in the colors file you provided:
        21k, 21f, 20f, 20h, 20b, 21m, 19a, 20i, 20c, recombinant, 21e, 21g, 20g, 20a, 21j, 20e, 20j, 21d, 21i, 21a, 20d, 19b, 21b, 21c, 21h.
        Auspice will create colors for them.
WARNING: These values for trait clade_who were not specified in the colors file you provided:
        recombinant.
        Auspice will create colors for them.

WARNING: These values for trait clade_nextstrain were not specified in the colors file you provided:
        21k, 21f, 20f, 20h, 20b, 21m, 19a, 20i, 20c, recombinant, 21e, 21g, 20g, 20a, 21j, 20e, 20j, 21d, 21i, 21a, 20d, 19b, 21b, 21c, 21h.
        Auspice will create colors for them.

Validating produced JSON
Validating schema of 'auspice/wuhan/auspice_raw.json'...
Validating that the JSON is internally consistent...
        WARNING:  The filter "new_node" does not appear as a property on any tree nodes.
Validation of 'auspice/wuhan/auspice_raw.json' succeeded, but there were warnings you may want to resolve.

Note this line:

 21k, 21f, 20f, 20h, 20b, 21m, 19a, 20i, 20c, recombinant, 21e, 21g, 20g, 20a, 21j, 20e, 20j, 21d, 21i, 21a, 20d, 19b, 21b, 21c, 21h.

The input colors were upper case (21K, not 21k):

image
corneliusroemer added the bug label on Aug 17, 2024
@jameshadfield
Member

Here's the code behind this. The erroneous console output is a side effect of the matching being done in lower case:

augur/augur/export_v2.py

Lines 330 to 344 in 988380c

elif key.lower() in provided_colors:
    # `provided_colors` typically originates from a colors.tsv file
    scale = []
    trait_values = {str(val).lower(): val for val in get_values_across_nodes(node_attrs, key)}
    trait_values_unseen = {k for k in trait_values}
    for provided_key, provided_color in provided_colors[key.lower()]:
        if provided_key.lower() in trait_values:
            scale.append([trait_values[provided_key.lower()], provided_color])
            trait_values_unseen.discard(provided_key.lower())
    if len(scale):
        coloring["scale"] = scale
        if len(trait_values_unseen):
            warn(f"These values for trait {key} were not specified in the colors file you provided:\n\t{', '.join(trait_values_unseen)}.\n\tAuspice will create colors for them.")
        return coloring
    warn(f"You've supplied a colors file with information for {key} but none of the values found on the tree had associated colors. Auspice will generate its own color scale for this trait.")
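
One possible direction, sketched here as a standalone function rather than a patch to augur: keep the lower-cased lookup for matching, but map the unmatched keys back to the values as they appear on the tree before building the warning. The name build_scale_and_missing and its two arguments are hypothetical and only mirror the variables in the snippet above; this is a sketch under those assumptions, not the actual fix.

```python
def build_scale_and_missing(values_on_tree, provided):
    """Match colors case-insensitively; report unmatched values in original case.

    Hypothetical helper for illustration only (not part of augur).
    values_on_tree: iterable of trait values found on tree nodes.
    provided: iterable of (value, color) pairs, e.g. rows from a colors file.
    """
    # Lower-cased value -> value exactly as it appears on the tree.
    # Note: tree values differing only by case would collapse to one entry.
    trait_values = {str(val).lower(): val for val in values_on_tree}
    unseen = set(trait_values)  # lower-cased keys not yet matched
    scale = []
    for provided_key, provided_color in provided:
        lowered = provided_key.lower()
        if lowered in trait_values:
            scale.append([trait_values[lowered], provided_color])
            unseen.discard(lowered)
    # Translate back to the original case before the values reach a message.
    missing = sorted(str(trait_values[k]) for k in unseen)
    return scale, missing


scale, missing = build_scale_and_missing(
    ["21K", "21L", "recombinant"],
    [("21k", "#AAAAAA")],
)
print(scale)
print(", ".join(missing))
```

With this shape, the warning would join the values in missing instead of the lower-cased keys, so 21L would be printed as 21L. Sorting is an extra choice made here so the message is stable between runs; the original code iterates over a set.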
