
annotate ID for multi-allelic sites #2264

Open
ikarus97 opened this issue Aug 20, 2024 · 1 comment

@ikarus97

I found that the ID column of a multi-allelic site in the source file is transferred only to the first matching allele in the target file.

My source annotation file:

chr21	5030278	rs1258851236	C	G,T	.	.	RS=1258851236

And my target file:

chr21	5030278	.	C	G	.	.	.
chr21	5030278	.	C	T	.	.	.

The command I used is as follows:
bcftools annotate -c +ID -a [source file] [target file]

And I got:

chr21	5030278	rs1258851236	C	G	.	.	.
chr21	5030278	.	C	T	.	.	.

Shouldn't the ID (rs1258851236) be annotated to both lines in the target file?

The version I used: bcftools_annotateVersion=1.19+htslib-1.19

@pd3
Member

pd3 commented Sep 9, 2024

The program has a limitation: when a VCF is used as the source of annotations, it can match a line only once. You'd have to split the multiallelic records into biallelics (bcftools norm -m -) or create a tab-delimited file. I believe that would work.
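
For anyone landing here, the first workaround could look roughly like the sketch below. It is untested against the files in this issue; the function name and file names are placeholders, and it assumes both inputs are bgzip-compressed and the target is already indexed. The idea is that after splitting, each ALT allele in the source has its own record, so each target line should be able to find its own match.

```shell
#!/bin/sh
# Sketch only: split a multi-allelic annotation VCF before transferring IDs.
# annotate_ids_split is a hypothetical helper name; file names are placeholders.
annotate_ids_split() {
    src=$1   # multi-allelic annotation source, e.g. source.vcf.gz
    tgt=$2   # target with one ALT per line, e.g. target.vcf.gz
    out=$3   # annotated output, e.g. annotated.vcf.gz

    # Name for the split copy of the source (assumes a .vcf.gz suffix).
    split="${src%.vcf.gz}.split.vcf.gz"

    # 1. Split multi-allelic records into one biallelic record per ALT allele.
    bcftools norm -m - -Oz -o "$split" "$src" || return 1

    # 2. Index the split file so it can be used with -a.
    bcftools index -f -t "$split" || return 1

    # 3. Transfer ID; +ID fills in missing IDs without overwriting existing ones.
    bcftools annotate -a "$split" -c +ID -Oz -o "$out" "$tgt" || return 1
}
```

Usage would be something like `annotate_ids_split source.vcf.gz target.vcf.gz annotated.vcf.gz`. With the example above, both the C>G and C>T lines would be expected to receive rs1258851236, but please verify on your own data.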
