I get an error "ValueError: Length mismatch: Expected axis has 0 elements, new values have 19 elements" #18

Closed
zhzhai opened this issue Jul 16, 2023 · 4 comments

zhzhai commented Jul 16, 2023

$ lanceotron callPeaks --format Web sort.my.bw
Traceback (most recent call last):
File "/apps/users/zhaizhihao/miniconda3/envs/gostripe/bin/lanceotron", line 8, in <module>
sys.exit(cli())
^^^^^
File "/apps/users/zhaizhihao/miniconda3/envs/gostripe/lib/python3.11/site-packages/lanceotron/cli.py", line 123, in cli
args.func(**vars(args))
File "/apps/users/zhaizhihao/miniconda3/envs/gostripe/lib/python3.11/site-packages/lanceotron/modules.py", line 122, in find_and_score_peaks
output.columns = [
^^^^^^^^^^^^^^
File "/apps/users/zhaizhihao/miniconda3/envs/gostripe/lib/python3.11/site-packages/pandas/core/generic.py", line 6002, in __setattr__
return object.__setattr__(self, name, value)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "pandas/_libs/properties.pyx", line 69, in pandas._libs.properties.AxisProperty.__set__
File "/apps/users/zhaizhihao/miniconda3/envs/gostripe/lib/python3.11/site-packages/pandas/core/generic.py", line 730, in _set_axis
self._mgr.set_axis(axis, labels)
File "/apps/users/zhaizhihao/miniconda3/envs/gostripe/lib/python3.11/site-packages/pandas/core/internals/managers.py", line 225, in set_axis
self._validate_set_axis(axis, new_labels)
File "/apps/users/zhaizhihao/miniconda3/envs/gostripe/lib/python3.11/site-packages/pandas/core/internals/base.py", line 70, in _validate_set_axis
raise ValueError(
ValueError: Length mismatch: Expected axis has 0 elements, new values have 19 elements

@LHentges
Owner

Hello, sorry you're having a problem! Could you tell me which version you're running? It looks like you're using conda, so in the env with LanceOtron you can just type conda list and find the version there.

Also, could you make the bigwig file you are using available?

@zhzhai
Author

zhzhai commented Jul 21, 2023

Thank you for your reply!

  1. The lanceotron version is 1.2.6.
  2. My bigwig file opens in IGV, and it was generated from a BAM file:
    $bamCoverage --bam filename.bam.sorted -o filename.bw --extendReads -bs 1 --normalizeUsing RPKM
  3. I also ran it on your test file; the results are here:
    $lanceotron callPeaks chr22.bw
    /apps/users/zhaizhihao/miniconda3/envs/lanceotron/lib/python3.11/site-packages/sklearn/base.py:347: InconsistentVersionWarning: Trying to unpickle estimator StandardScaler from version 0.23.1 when using version 1.3.0. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:
    https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations
    warnings.warn(
    /apps/users/zhaizhihao/miniconda3/envs/lanceotron/lib/python3.11/site-packages/sklearn/base.py:347: InconsistentVersionWarning: Trying to unpickle estimator StandardScaler from version 0.23.1 when using version 1.3.0. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:
    https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations
    warnings.warn(
    2023-07-21 22:57:38.224816: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
    2023-07-21 22:57:40.674857: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
    2023-07-21 22:57:40.675998: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
    To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
    2023-07-21 22:58:30.149583: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
    36/36 [==============================] - 8s 186ms/step

I don't know what the problem is. Can you help me? Thank you very much!

@LHentges
Owner

LHentges commented Aug 4, 2023

Thank you for your detailed response!

For the test file that you have run, I believe LanceOtron successfully finished running - do you have a chr22_Ltron.bed file in the directory you ran it in?

What you are seeing are warnings rather than errors. They come from two packages: TensorFlow and scikit-learn.

For TF, there are quite a few optimisations done to increase speed. On your particular machine, it looks like there were a few of these optimisations that were attempted but ultimately couldn't be applied.

For sklearn, normalisation was carried out with version 0.23.1, and the fitted weights were saved so they could be applied to new data. In this way new data is normalised in exactly the same way as the original training data. You have a different version of sklearn installed (1.3.0), so it warns you in case the developers of the package have changed the way they carry out normalisation. This has been known for some time and does not affect the behaviour of LanceOtron, but it should still be dealt with, so I am raising a separate issue for it (issue 19).
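To illustrate the idea of saving normalisation weights and reapplying them, here is a minimal sketch. It assumes a `StandardScaler` persisted with `pickle`; the training values are made up, and how LanceOtron actually stores its scaler is not shown in this thread. Because the scaler is saved and loaded under the same sklearn version here, no `InconsistentVersionWarning` is emitted.

```python
import pickle

import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical training data; the real training set is not part of this thread.
train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

# Fit once: the scaler stores the per-column mean and standard deviation.
scaler = StandardScaler().fit(train)

# Save the fitted scaler and restore it, as a later run would do.
restored = pickle.loads(pickle.dumps(scaler))

# New data is normalised with the statistics of the training data,
# not with statistics computed from the new data itself.
new_data = np.array([[4.0, 40.0]])
scaled = restored.transform(new_data)

# Manual equivalent using the training mean and (population) std.
expected = (new_data - train.mean(axis=0)) / train.std(axis=0)
print(np.allclose(scaled, expected))
```

The warning in the log appears only when the unpickling sklearn version differs from the one that saved the object.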

Just to be clear these warnings will not stop LanceOtron from running, and are likely separate from the original issue you faced.

I have some inclination it has to do with python 3.11 and the latest conda install - I'm going to do some more testing next week to see if I can get to the bottom of your problem.

@LHentges
Owner

LHentges commented Aug 9, 2023

So I can confirm LanceOtron works in Python 3.11 with the latest conda install in my testing. I was also able to reproduce the same error message.

What is happening is that no candidate peaks are found in your bigwig track. This means there was nothing to score, so when the output was being written the dataframe was inferred to have no columns, and attempting to assign the 19 column names threw an error.
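The failure mode described above can be reproduced with pandas alone. This is a minimal sketch, not LanceOtron's actual code: the column names are placeholders, and building the frame from an empty list is an assumption about how an empty result might arise.

```python
import pandas as pd

# Placeholder names; the real 19 output columns are not listed in this thread.
column_names = [f"col_{i}" for i in range(19)]

# With no rows to infer from, pandas creates a frame with zero columns.
output = pd.DataFrame([])

message = None
try:
    # Assigning 19 labels to an axis of length 0 raises ValueError.
    output.columns = column_names
except ValueError as exc:
    message = str(exc)
print(message)

# One possible way to handle the empty case: pass the column names
# at construction time, which yields an empty frame with 19 columns.
safe = pd.DataFrame([], columns=column_names)
print(safe.shape)
```

Passing `columns=` up front is only one option; checking for an empty result before writing the output would work as well.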

You can rerun LanceOtron with less stringent parameters (it looks like you were using the default values) and see whether any candidate peaks arise from that. These would then proceed as normal to be scored and output.

I think this resolves your individual issue, but I'm going to open a new one to improve the handling of this situation (issue 20).
