Update UniVec database. Remove duplicates from oxford_nanopore.fasta #192

Merged (1 commit) on Aug 13, 2024
4 changes: 4 additions & 0 deletions CHANGELOG.rst
@@ -7,6 +7,10 @@ Changelog
 .. This document is user facing. Please word the changes in such a way
 .. that users understand how the changes affect the new version.
 
+version 0.12.0-dev
+------------------
++ Update internal UniVec database to version from November 21st 2023.
+
 version 0.11.1
 ------------------
 + Fix a memory leak that occurred in Python 3.12 due to a refcounting API
5 changes: 3 additions & 2 deletions src/sequali/contaminants/README
@@ -9,11 +9,12 @@ Other sources of contaminations are:
 Source : (https://www.ncbi.nlm.nih.gov/tools/vecscreen/contam/
 
 NCBI keeps a database called UniVec for vectors and updated. It was last
-updated on May 5th 2017. It can be downloaded from:
+updated on November 21st 2023. It can be downloaded from:
 https://ftp.ncbi.nlm.nih.gov/pub/UniVec
 
 The UniVec database contains many adapters, linkers, and primers but not yet
-a lot of those that are used in Oxford nanopore sequencing. The
+a lot of those that are used in Oxford nanopore sequencing. As of 2023
+some nanopore Adapters have been added. The
 oxford_nanopore.fasta file provides the sequences as represented in the
 technical documentation. All possible barcoding sequencing adapters for
 nanopore have not yet been added.
41 changes: 41 additions & 0 deletions src/sequali/contaminants/UniVec.fasta
@@ -27103,3 +27103,44 @@ GCGCTTTTGGCGAACCTTCTCGCATTAGCTGCGTTGTGCATATTGGCGATGGTG
 >gnl|uv|A08585.1:1-86 pMG36 expression vector DNA sequence
 ATGGCAATCGTTTCAGCAGAAAAATTCGTAATTCGAGCTCGCCCGGGGATCGATCCTCTAGAGTCGACCT
 GCAGGCATGCAAGCTT
+>gnl|uv|NGB01098.1:1-61 Oxford Nanopore Technologies Sequencing Adapter Y_top
+GGCGTCTGCTTGGGTGTTTAACCTTTTTTTTTTAATGTACTTCGTTCAGTTACGTATTGCT
+>gnl|uv|NGB01099.1:1-22 Oxford Nanopore Technologies Sequencing Adapter Y_bottom
+GCAATACGTAACTGAACGAAGT
+>gnl|uv|NGB01100.1:1-15 Oxford Nanopore Technologies PCR Barcoding Kit (SQK-PBK004) 5' primer top
+ATCGCCTACCGTGAC
+>gnl|uv|NGB01101.1:1-29 Oxford Nanopore Technologies PCR Barcoding Kit (SQK-PBK004) 3' primer top
+TTAACCTACTTGCCTGTCGCTCTATCTTC
+>gnl|uv|NGB01102.1:1-22 Oxford Nanopore Technologies PCR Barcoding Kit (SQK-PBK004) 3' primer bottom
+TTTCTGTTGGTGCTGATATTGC
+>gnl|uv|NGB01103.1:1-30 Oxford Nanopore Technologies Direct RNA Sequencing Kit (SQK-RNA002) RT Adapter top
+GGCTTCTTCTTGCTCTTAGGTAGTAGGTTC
+>gnl|uv|NGB01104.1:1-49 Oxford Nanopore Technologies Direct RNA Sequencing Kit (SQK-RNA002) RT Adapter bottom
+GAGGCGAGCGGTCAATTTTCCTAAGAGCAAGAAGAAGCCTTTTTTTTTT
+>gnl|uv|NGB01105.1:1-31 Oxford Nanopore Technologies Direct RNA Sequencing Kit (SQK-RNA002) RNA Adapter Mix Top
+TTTTTTTTTTTTTATGATGCAAGATACGCAC
+>gnl|uv|NGB01106.1:1-76 Oxford Nanopore Technologies Direct RNA Sequencing Kit (SQK-RNA002) RNA Adapter Mix Bottom
+GAGGCGAGCGGTCAATTTGCAATATCAGCACCAACAGAAACAACCATCGTCTATCCCTCATCATCAGAAC
+CTACTA
+>gnl|uv|NGB01107.1:1-36 Oxford Nanopore Technologies Ligation Sequencing Kit (SQK-LSK114) Ligation Adaptor Y Top
+TTTTTTTTCCTGTACTTCGTTCAGTTACGTATTGCT
+>gnl|uv|NGB01108.1:1-27 Oxford Nanopore Technologies Ligation Sequencing Kit (SQK-LSK114) Ligation Adaptor Y Bottom
+GCAATACGTAACTGAACGAAGTACAGG
+>gnl|uv|NGB01109.1:1-26 PacBio ULI gDNA amplification adapter
+AAGCAGTGGTATCAACGCAGAGTACT
+>gnl|uv|NGB01110.1:1-19 Illumina Nextera-Illumina PCR Kit Adapter Read 1-Read 2
+CTGTCTCTTATACACATCT
+>gnl|uv|NGB01111.1:1-16 Illumina DNA PCR-Free Tagmentation Adapter
+ATGTGTATAAGAGACA
+>gnl|uv|NGB01112.1:1-19 Nextera Mate Pair Adapter
+AGATGTGTATAAGAGACAG
+>gnl|uv|NGB01113.1:1-33 Illumina TruSeq UD/CD Adapter Trimming Read 1
+AGATCGGAAGAGCACACGTCTGAACTCCAGTCA
+>gnl|uv|NGB01114.1:1-33 Illumina TruSeq UD/CD Adapter Trimming Read 2
+AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT
+>gnl|uv|NGB01115.1:1-21 Illumina TruSeq Small RNA Adapter Trimming Read
+TGGAATTCTCGGGTGCCAAGG
+>gnl|uv|NGB01116.1:1-19 Illumina TruSeq RNA 5' Adapter RA5
+GCAGAGCACAGCCGACGAC
+>gnl|uv|NGB01117.1:1-21 Illumina TruSeq RNA 3' Adapter RA3
+TGGAATTCTCGGGTGCCAAGG
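The UniVec headers in this hunk carry a coordinate range (for example `:1-61`), so one way to sanity-check an updated copy of the file is to compare that range against the actual sequence length. The following is only a sketch, not part of this PR or of sequali: the header layout `gnl|uv|ACCESSION:START-END description` is assumed from the records shown here, and `read_fasta` and `length_mismatches` are hypothetical helper names.

```python
import re
from typing import Iterator, List, Tuple

# Assumed header layout, taken from the records in this diff:
#   gnl|uv|ACCESSION:START-END free-text description
HEADER_RE = re.compile(r"^gnl\|uv\|(?P<acc>[^:\s]+):(?P<start>\d+)-(?P<end>\d+)")


def read_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs, joining wrapped sequence lines."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def length_mismatches(text: str) -> List[Tuple[str, int, int]]:
    """Return (accession, declared_length, actual_length) for inconsistent records."""
    problems = []
    for header, seq in read_fasta(text):
        match = HEADER_RE.match(header)
        if match is None:
            continue  # header does not follow the assumed layout; skip it
        declared = int(match["end"]) - int(match["start"]) + 1
        if declared != len(seq):
            problems.append((match["acc"], declared, len(seq)))
    return problems


# Two records copied from the hunk above; the second one is wrapped over two lines.
example = """\
>gnl|uv|NGB01099.1:1-22 Oxford Nanopore Technologies Sequencing Adapter Y_bottom
GCAATACGTAACTGAACGAAGT
>gnl|uv|NGB01106.1:1-76 Oxford Nanopore Technologies Direct RNA Sequencing Kit (SQK-RNA002) RNA Adapter Mix Bottom
GAGGCGAGCGGTCAATTTGCAATATCAGCACCAACAGAAACAACCATCGTCTATCCCTCATCATCAGAAC
CTACTA
"""
mismatches = length_mismatches(example)
print(mismatches)
```

An empty list means every record that matches the assumed header layout has a sequence as long as its declared range.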
16 changes: 0 additions & 16 deletions src/sequali/contaminants/oxford_nanopore.fasta
@@ -1,21 +1,9 @@
->Oxford nanopore ligation kit or rapid adapter, top strand
-TTTTTTTTCCTGTACTTCGTTCAGTTACGTATTGCT
->Oxford nanopore ligation kit, bottom strand
-GCAATACGTAACTGAACGAAGTACAGG
 >Oxford nanopore Strand Switching Primer II (SSPII)
 TTTCTGTTGGTGCTGATATTGCTTT
 >Oxford nanopore RT Primer (RTP)
 CTTGCCTGTCGCTCTATCTTCAGAGGAG
 >Oxford nanopore cDNA RT Adapter (CRT)
 CTTGCGGGCGGCGGACTCTCCTCTGAAGATAGAGCGACAGGCAAG
->Oxford nanopore RT Adapter (RTA), top strand
-GGCTTCTTCTTGCTCTTAGGTAGTAGGTTC
->Oxford nanopore RT Adapter (RTA), bottom strand
-GAGGCGAGCGGTCAATTTTCCTAAGAGCAAGAAGAAGCCTTTTTTTTTT
->Oxford nanopore RNA Adapter Mix (RMX), top strand
-TTTTTTTTTTTTTATGATGCAAGATACGCAC
->Oxford nanopore RNA Adapter Mix (RMX), bottom strand
-GAGGCGAGCGGTCAATTTGCAATATCAGCACCAACAGAAACAACCATCGTCTATCCCTCATCATCAGAACCTACTA
 >Oxford nanopore CDNA primer, forward sequence
 ATCGCCTACCGTGACAAGAAAGTTGTCGGTGTCTTTGTGACTTGCCTGTCGCTCTATCTTC
 >Oxford nanopore CDNA primer, reverse sequence
@@ -24,10 +12,6 @@ ATCGCCTACCGTGACAAGAAAGTTGTCGGTGTCTTTGTGTTTCTGTTGGTGCTGATATTGC
 ACTTGCCTGTCGCTCTATCTTCTTTTTTTTTTTTTTTTTTTT
 >Oford nanopore Strand Switching Primer (SSP)
 TTTCTGTTGGTGCTGATATTGCTGGG
->Oxford nanopore Adapter Mix (AMX), top strand
-TTTTTTTTTTAATGTACTTCGTTCAGTTACGTATTGCT
->Oxford nanopore Adapter Mix (AMX), bottom strand
-GCAATACGTAACTGAACGAAGT
 >Oxford nanopore Native Adapter (NA), top strand
 TTTTTTTTCCTGTACTTCGTTCAGTACGTATTGCT
 >Oxford nanopore Native Adapter (NA), bottom strand
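The entries dropped from oxford_nanopore.fasta are ones whose sequences now also appear in UniVec.fasta. A check along these lines could be sketched as below. This is an assumption about how such duplicates might be found, not the procedure the PR author used: `parse_fasta` and `find_redundant` are hypothetical helpers, the record names are abbreviated for the example, and only exact forward-strand containment is tested (no reverse complements, no mismatches).

```python
from typing import Dict, List, Tuple


def parse_fasta(text: str) -> Dict[str, str]:
    """Map each header to its upper-cased sequence, joining wrapped lines.

    Records that share a header overwrite each other in this simple sketch.
    """
    records: Dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def find_redundant(
    candidates: Dict[str, str], reference: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Pair each candidate with the first reference record that contains its sequence."""
    hits = []
    for cand_name, cand_seq in candidates.items():
        if not cand_seq:
            continue  # an empty sequence would trivially match everything
        for ref_name, ref_seq in reference.items():
            if cand_seq in ref_seq:  # identical sequences are a special case
                hits.append((cand_name, ref_name))
                break
    return hits


# Sequences copied from this diff; headers shortened for readability.
univec = parse_fasta("""\
>NGB01099.1 Sequencing Adapter Y_bottom
GCAATACGTAACTGAACGAAGT
>NGB01102.1 PCR Barcoding Kit 3' primer bottom
TTTCTGTTGGTGCTGATATTGC
""")
nanopore = parse_fasta("""\
>Adapter Mix (AMX), bottom strand
GCAATACGTAACTGAACGAAGT
>Strand Switching Primer (SSP)
TTTCTGTTGGTGCTGATATTGCTGGG
""")
redundant = find_redundant(nanopore, univec)
for cand, ref in redundant:
    print(f"{cand!r} is covered by {ref!r}")
```

In this example the AMX bottom strand is reported because it is identical to a UniVec record, while the SSP entry is not: it extends past the UniVec primer, so it is not fully contained in it. That matches the diff, where the SSP record is kept.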