Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding seqkit concat #4841

Merged
merged 23 commits into from
Feb 9, 2024
Merged

Adding seqkit concat #4841

merged 23 commits into from
Feb 9, 2024

Conversation

DLBPointon
Copy link
Contributor

Adding the seqkit concat command.

PR checklist

Closes #XXX

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the module conventions in the contribution docs
  • If necessary, include test data in your PR.
  • Remove all TODO statements.
  • Emit the versions.yml file.
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • PROFILE=docker pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware
    • PROFILE=singularity pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware
    • PROFILE=conda pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware

@DLBPointon DLBPointon added new module Adding a new module Do not merge yet labels Feb 2, 2024
@DLBPointon DLBPointon self-assigned this Feb 2, 2024
@DLBPointon DLBPointon requested a review from a team as a code owner February 2, 2024 16:16
@DLBPointon
Copy link
Contributor Author

Ok, it looks like concat will produce the file of the concatenated results like it should. However, they will not be in any particular order from run to run. Need to swap the md5sum to a file contains assertion.

@heuermh
Copy link
Contributor

heuermh commented Feb 6, 2024

https://bioinf.shenwei.me/seqkit/usage/

Reproducibility

Subcommands sample and shuffle use random function, random seed could be given by flag -s (--rand-seed). This makes sure that sampling result could be reproduced in different environments with same random seed.

Wonder if this is also true for concat?

concatenate sequences with same ID from multiple files

Attentions:
...
4. Order of sequences with different IDs are random.

@DLBPointon
Copy link
Contributor Author

https://bioinf.shenwei.me/seqkit/usage/

Reproducibility
Subcommands sample and shuffle use random function, random seed could be given by flag -s (--rand-seed). This makes sure that sampling result could be reproduced in different environments with same random seed.

Wonder if this is also true for concat?

concatenate sequences with same ID from multiple files
Attentions:
...
4. Order of sequences with different IDs are random.

Ah the docs, would have made sense if I read those too. Thanks for confirming what I thought.

@heuermh
Copy link
Contributor

heuermh commented Feb 7, 2024

Sorry, a little fast on the commit suggestion button! Mistaked this open pull request for my own 😬

@DLBPointon
Copy link
Contributor Author

Sorry, a little fast on the commit suggestion button! Mistaked this open pull request for my own 😬

No worries, just means you're a maintainer now too

"""

stub:
def args = task.ext.args ?: ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove def args in stub if it is not used. Actually I don't know if there is a recommendation, but it saves 1 line of code 🤷

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, it keeps it a bit more similar to the normal test. So whichever is fine, I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be fair, in my other modules, I have only kept prefixes because they are actually used. So removing it is good for consistency at least? Like you said it is a waste line.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do the same for my modules ;) So feel free to update that and then ready to merge 🚀

@DLBPointon DLBPointon added this pull request to the merge queue Feb 9, 2024
Merged via the queue into master with commit d934254 Feb 9, 2024
11 checks passed
@sateeshperi sateeshperi deleted the dp24_seqkit_concat branch February 9, 2024 16:03
jch-13 pushed a commit to jch-13/modules that referenced this pull request Mar 19, 2024
* Adding seqkit concat

* Prettier

* Updates

* Update format

* Correct md5sum

* Updated to 2.7.0, further attempts to get it working

* Corrected tests

* Corrected tests

* Corrected tests

* Update modules/nf-core/seqkit/concat/tests/main.nf.test

Co-authored-by: Maxime U Garcia <[email protected]>

* Update modules/nf-core/seqkit/concat/tests/main.nf.test

Co-authored-by: Sateesh_Peri <[email protected]>

* Update modules/nf-core/seqkit/concat/tests/main.nf.test

Co-authored-by: Sateesh_Peri <[email protected]>

* Update modules/nf-core/seqkit/concat/tests/main.nf.test

Co-authored-by: Sateesh_Peri <[email protected]>

* Updating Snapshot

* Update main.nf

Remove unnecessary args assignment for stub

---------

Co-authored-by: Maxime U Garcia <[email protected]>
Co-authored-by: Michael L Heuer <[email protected]>
Co-authored-by: Sateesh_Peri <[email protected]>
jennylsmith pushed a commit to RSC-RP/modules that referenced this pull request Mar 20, 2024
* Adding seqkit concat

* Prettier

* Updates

* Update format

* Correct md5sum

* Updated to 2.7.0, further attempts to get it working

* Corrected tests

* Corrected tests

* Corrected tests

* Update modules/nf-core/seqkit/concat/tests/main.nf.test

Co-authored-by: Maxime U Garcia <[email protected]>

* Update modules/nf-core/seqkit/concat/tests/main.nf.test

Co-authored-by: Sateesh_Peri <[email protected]>

* Update modules/nf-core/seqkit/concat/tests/main.nf.test

Co-authored-by: Sateesh_Peri <[email protected]>

* Update modules/nf-core/seqkit/concat/tests/main.nf.test

Co-authored-by: Sateesh_Peri <[email protected]>

* Updating Snapshot

* Update main.nf

Remove unnecessary args assignment for stub

---------

Co-authored-by: Maxime U Garcia <[email protected]>
Co-authored-by: Michael L Heuer <[email protected]>
Co-authored-by: Sateesh_Peri <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants