io.read_metadata does not always set index dtype as expected #925

Open
joverlee521 opened this issue May 13, 2022 · 2 comments
Labels
bug Something isn't working

Comments

@joverlee521
Contributor

Current Behavior

We try to explicitly set the dtype for the index column within io.read_metadata:

augur/augur/io.py

Lines 116 to 121 in 139ba04

# If we found a valid column to index the DataFrame, specify that column and
# also tell pandas that the column should be treated like a string instead
# of having its type inferred. This latter argument allows users to provide
# numerical ids that don't get converted to numbers by pandas.
kwargs["index_col"] = index_col
kwargs["dtype"] = {index_col: "string"}

However, this does not work as expected because of a bug in pandas where the dtype is ignored for index_col.
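A minimal way to check for this on a given pandas install might look like the following. It assumes the kwargs above end up in a `pandas.read_csv` call; the sample TSV and its `strain` and `country` columns are made up for illustration.

```python
import io

import pandas as pd

# Made-up metadata with ids that look numeric but carry leading zeros.
tsv = "strain\tcountry\n001\tUSA\n002\tCanada\n"

# Same kwargs pattern as the snippet above: index on the id column and
# ask pandas to treat that column as a string.
df = pd.read_csv(
    io.StringIO(tsv),
    sep="\t",
    index_col="strain",
    dtype={"strain": "string"},
)

# If the dtype is ignored for index_col, the index is inferred as integers
# and the leading zeros are lost; otherwise the ids stay strings.
print(pd.__version__, df.index.dtype, df.index.tolist())
```

Comparing the printed index values against the raw ids shows whether the installed pandas version honors the dtype for the index column.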

Additional context

Found this issue when looking into nextstrain/ncov#948

@joverlee521 added the bug label on May 13, 2022
@victorlin
Member

Seems like this was fixed in pandas 1.4.0. Should we bump the minimum pandas requirement in setup.py?

@joverlee521
Contributor Author

Hmm, starting with pandas 1.4.0, they only support Python 3.8 and higher. Maybe we should just work around it to avoid bumping the minimum Python dependency for Augur.
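One possible shape for such a workaround, as a rough sketch only: read the id column as an ordinary string column and set it as the index afterwards, so the result does not depend on how pandas handles `dtype` for `index_col`. The helper name `read_with_string_index` and the sample data are hypothetical and not part of Augur; the sketch also assumes any `dtype` passed in is a dict.

```python
import io

import pandas as pd


def read_with_string_index(source, index_col, **kwargs):
    """Hypothetical helper: read a table and index it by a string-typed column."""
    # Assumption: a caller-supplied dtype, if any, is a dict of column -> dtype.
    dtype = dict(kwargs.pop("dtype", None) or {})
    dtype[index_col] = "string"

    # Read the id column as a regular column so the dtype is applied to it,
    # then promote it to the index in a separate step.
    df = pd.read_csv(source, dtype=dtype, **kwargs)
    return df.set_index(index_col)


# Made-up metadata with numeric-looking ids.
tsv = io.StringIO("strain\tcountry\n001\tUSA\n002\tCanada\n")
metadata = read_with_string_index(tsv, "strain", sep="\t")
print(metadata.index.tolist())  # ['001', '002']
```

This reads the whole file before setting the index, so it would need adapting if the reader is used with `chunksize`.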

Status: Backlog