Cellbase gene services wrong behavior for some genes (we've counted 3604 in GRCh37) #530

Open
jlfrueda opened this issue Feb 20, 2020 · 0 comments

Comments

jlfrueda commented Feb 20, 2020

Many of the gene-related services of cellbase return no information for certain genes even though entries for them exist in the database; most of these genes do appear in related queries such as transcript search. The problem can be replicated using both HGNC names and Ensembl IDs.

We encountered this problem for 3406 genes in GRCh37 (the assembly we mainly use).

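For example, take RBP3, which appears in OMIM and is therefore important for diagnostics. We found that feature/gene/RBP3/info gets no results.

If we specify the assembly, results are found for GRCh38 (feature/gene/RBP3/info?assembly=grch38), which returns E2F1 in addition to RBP3, but nothing for GRCh37 (feature/gene/RBP3/info?assembly=grch37).

Using its Ensembl ID does not work either: feature/gene/ENSG00000107618/info returns no results, with or without assembly=grch37.

The gene is actually in the GRCh37 database, and can be found with feature/gene/search?name=RBP3&assembly=grch37.

If one searches for RBP3 and does not specify the assembly (feature/gene/search?name=RBP3), cellbase returns only the GRCh37 result, and not the GRCh38 one. This behavior is also seen with the Ensembl IDs. While, as we saw, feature/gene/ENSG00000107618/info finds no results, feature/gene/search?id=ENSG00000107618 finds the GRCh37 gene, but feature/gene/search?id=ENSG00000265203 finds no results. Both feature/gene/search?id=ENSG00000107618&assembly=grch37 and feature/gene/search?id=ENSG00000265203&assembly=grch38 seem to work fine.

So, to sum up: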

  • feature/gene/RBP3/info: no results
  • feature/gene/RBP3/info?assembly=grch37: no results
  • feature/gene/RBP3/info?assembly=grch38: two results, RBP3 and E2F1
  • feature/gene/ENSG00000107618/info: no results
  • feature/gene/ENSG00000265203/info: no results
  • feature/gene/ENSG00000107618/info?assembly=grch37: no results
  • feature/gene/ENSG00000265203/info?assembly=grch38: success (??)
  • feature/gene/search?name=RBP3: only GRCh37 result, but not GRCh38
  • feature/gene/search?name=RBP3&assembly=grch37: success
  • feature/gene/search?name=RBP3&assembly=grch38: success
  • feature/gene/search?id=ENSG00000107618: success
  • feature/gene/search?id=ENSG00000265203: no results
  • feature/gene/search?id=ENSG00000107618&assembly=grch37: success
  • feature/gene/search?id=ENSG00000265203&assembly=grch38: success

One can make several assumptions to explain why gene/*/info and gene/search behave inconsistently when no assembly is specified, but even so feature/gene/RBP3/info?assembly=grch37, feature/gene/ENSG00000107618/info?assembly=grch37, feature/gene/RBP3/info?assembly=grch38 and feature/gene/search?id=ENSG00000265203 should return the right results.
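To make the fourteen cases above easier to re-run, here is a minimal Python sketch that builds each query URL and counts the results returned. It rests on assumptions that are not stated in this report: `BASE_URL` is a hypothetical placeholder for the REST root of the CellBase instance being tested, and `count_results` assumes the JSON payload has a top-level `response` list whose entries carry a `numResults` field. Adjust both to match the actual deployment.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: hypothetical placeholder; replace with the REST root
# (host, version, species) of the CellBase instance under test.
BASE_URL = "http://cellbase.example.org/webservices/rest/v4/hsapiens"

# The fourteen queries from the summary list, as (path, parameters).
CASES = [
    ("feature/gene/RBP3/info", {}),
    ("feature/gene/RBP3/info", {"assembly": "grch37"}),
    ("feature/gene/RBP3/info", {"assembly": "grch38"}),
    ("feature/gene/ENSG00000107618/info", {}),
    ("feature/gene/ENSG00000265203/info", {}),
    ("feature/gene/ENSG00000107618/info", {"assembly": "grch37"}),
    ("feature/gene/ENSG00000265203/info", {"assembly": "grch38"}),
    ("feature/gene/search", {"name": "RBP3"}),
    ("feature/gene/search", {"name": "RBP3", "assembly": "grch37"}),
    ("feature/gene/search", {"name": "RBP3", "assembly": "grch38"}),
    ("feature/gene/search", {"id": "ENSG00000107618"}),
    ("feature/gene/search", {"id": "ENSG00000265203"}),
    ("feature/gene/search", {"id": "ENSG00000107618", "assembly": "grch37"}),
    ("feature/gene/search", {"id": "ENSG00000265203", "assembly": "grch38"}),
]


def build_url(path, params, base_url=BASE_URL):
    """Join the REST root, the service path and an optional query string."""
    query = urlencode(params)
    return f"{base_url}/{path}" + (f"?{query}" if query else "")


def count_results(payload):
    """Sum numResults over the 'response' entries (assumed payload layout)."""
    return sum(entry.get("numResults", 0) for entry in payload.get("response", []))


def run_case(path, params):
    """Fetch one query and return how many results it reported."""
    with urlopen(build_url(path, params), timeout=30) as resp:
        return count_results(json.load(resp))


if __name__ == "__main__":
    for path, params in CASES:
        print(build_url(path, params), "->", run_case(path, params))
```

Running the script prints one line per query with its result count, which can be compared directly against the summary list.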
