Occurrences seem to have been removed in some GBIF datasets - was it intentional? #267

Open
ManonGros opened this issue Feb 1, 2024 · 10 comments

@ManonGros

Hi PLAZI,

My colleagues and I recently noticed that for at least two datasets, the occurrences seem to have disappeared.

For example, this dataset https://www.gbif.org/dataset/e542d309-1af6-465f-816b-3ef3e62a526e has occurrences indexed on GBIF but there aren't any in the archive: https://tb.plazi.org/GgServer/dwca/C56BA308C17E0264FF88FFD6FFC6C32D.zip
GBIF has a safeguard against accidental deletion: if all the occurrences disappear from a dataset, they stay in the index. This is why the occurrences are still on GBIF. We can remove them from the index manually, but I want to make sure this wasn't an accidental deletion.
The other dataset concerned is this one: https://www.gbif.org/dataset/91a51391-41b1-4184-bb62-6cd3bbbfdd74.

Both datasets contain material from specimens in institutions that aren't well represented on GBIF, so the occurrences are valuable.
Were they deleted on purpose? Let me know, thanks!
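
The mismatch described above (occurrences still in the GBIF index, none in the archive) could be checked with something like the sketch below. It is only an illustration: the function names are made up, the `occurrence.txt` member name and tab delimiter are assumptions (a real archive declares both in its `meta.xml`), and the indexed count is passed in by the caller, for example read from the `count` field of a GBIF occurrence search for the dataset key.

```python
import csv
import io
import zipfile


def count_archive_occurrences(dwca_bytes, member="occurrence.txt"):
    """Count data rows in the occurrence file of a Darwin Core Archive.

    Returns 0 when the archive has no such member. The member name and the
    tab delimiter are assumptions; a real archive declares both in meta.xml.
    """
    with zipfile.ZipFile(io.BytesIO(dwca_bytes)) as archive:
        if member not in archive.namelist():
            return 0
        with archive.open(member) as handle:
            text = io.TextIOWrapper(handle, encoding="utf-8", newline="")
            rows = csv.reader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
            next(rows, None)  # skip the header row
            return sum(1 for row in rows if row)  # ignore blank lines


def looks_emptied(indexed_count, archive_count):
    """True when the index still holds occurrences but the archive has none."""
    return indexed_count > 0 and archive_count == 0
```

A dataset where `looks_emptied` returns `True` is a candidate for the situation reported here; it does not say whether the removal was intentional.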

@flsimoes

flsimoes commented Feb 1, 2024

Hi @ManonGros, I honestly don't know. It seems to be something on the GBIF side of the operation.

On our side it seems normal
See:

I'll have a further look and get back to you

@Carol-Sokolowicz

I did the quality check on GGI.
If the problem persists, let us know.

@ManonGros
Author

Thank you Carol! The data looks good for the first dataset! Would it be possible to do the same for this other one: https://www.gbif.org/dataset/91a51391-41b1-4184-bb62-6cd3bbbfdd74? Thanks!

@flsimoes

flsimoes commented Feb 1, 2024

> Thank you Carol! The data looks good for the first dataset! Would it be possible to do the same for this other one: https://www.gbif.org/dataset/91a51391-41b1-4184-bb62-6cd3bbbfdd74? Thanks!

Should be updated in the next couple of hours

@spalp

spalp commented Feb 2, 2024

Thank you both.
I found yet another dataset, which previously had occurrences, but now returns 0 occurrences: https://www.gbif.org/occurrence/search?dataset_key=50e48d61-aaca-4af9-9ef3-d2104aba3b8b

@Carol-Sokolowicz

This has been updated. If there's something else wrong, let me know.
https://www.gbif.org/occurrence/search?dataset_key=50e48d61-aaca-4af9-9ef3-d2104aba3b8b

@myrmoteras
Contributor

@ManonGros is there a way on the GBIF side to check whether more datasets have lost their occurrences? Just to understand the issue on our end.

@muttcg

muttcg commented Feb 6, 2024

@myrmoteras More than 2600 Plazi checklist datasets were affected.
List of GBIF Registry dataset keys: plazi.csv

@myrmoteras
Contributor

@muttcg thanks. This might have to do with a reanalysis of the articles using more advanced tools, especially older articles/datasets.
We will look into this ASAP.

@gsautter

gsautter commented Feb 6, 2024

@myrmoteras your assumption seems to be pretty much spot-on ... since we retro-applied quality control to our older IMFs, occurrences might well have been filtered from several DwCAs ... as we catch up on inspecting these IMFs, the occurrences should reappear, though, at least the correct ones.


7 participants