LivingAtlas: Additional fields for SpeciesListPipeline (ARGA) #865

Open
wants to merge 23 commits into
base: dev

nickdos
Contributor

@nickdos nickdos commented Mar 8, 2023

Please assign to @djtfmartin.

This PR adds fields to be processed during the specieslist phase of the index pipeline. There are two sets of fields:

  • locatedInCountry: a single-value string field holding the (ISO) country name, populated via a species list. It records that the taxon is known to occur in that country. ARGA uses this for the large percentage of records that have no location data. The config var includePresentInCountry defaults to false.
  • Traits from AusTraits: multi-value fields that are also populated via species lists, one list per trait. Lists must have type COMMON_TRAIT and contain the additional columns traitName and traitValue. Traits defined in IndexFields.java and the (SOLR) schema.xml will be indexed as multi-value, but traits from any additional lists (added without changes to those files) will still be indexed as dynamic fields. The config var includeTraits defaults to false.
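To make the trait handling described above concrete, here is a minimal sketch of how rows from COMMON_TRAIT lists could be grouped into multi-value fields per taxon, gated by an includeTraits flag. It is not the code in this PR: the class `TraitGrouping`, the `TraitRow` type, the `groupTraits` method and the sample values are all hypothetical, and only the column names traitName and traitValue come from the description.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the pipeline code in this PR.
final class TraitGrouping {

    /** One row of a COMMON_TRAIT species list (illustrative shape only). */
    static final class TraitRow {
        final String taxonId;
        final String traitName;
        final String traitValue;

        TraitRow(String taxonId, String traitName, String traitValue) {
            this.taxonId = taxonId;
            this.traitName = traitName;
            this.traitValue = traitValue;
        }
    }

    /**
     * Groups trait rows as taxonId -> traitName -> list of values.
     * Returns an empty map when includeTraits is false, mirroring the
     * default-off config flag described in the PR.
     */
    static Map<String, Map<String, List<String>>> groupTraits(
            List<TraitRow> rows, boolean includeTraits) {
        Map<String, Map<String, List<String>>> byTaxon = new LinkedHashMap<>();
        if (!includeTraits) {
            return byTaxon;
        }
        for (TraitRow row : rows) {
            byTaxon
                .computeIfAbsent(row.taxonId, k -> new LinkedHashMap<>())
                .computeIfAbsent(row.traitName, k -> new ArrayList<>())
                .add(row.traitValue);
        }
        return byTaxon;
    }
}
```

Each inner list maps naturally onto a multi-value index field; how the real pipeline names dynamic fields for traits missing from the schema is not shown here.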

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.*;
Contributor Author

Just noticed this - the IDE switched to a wildcard import, which might break the coding rules?

@@ -73,4 +74,7 @@ public interface IndexFields {
String GGBN_TERMS_LOAN = "http://data.ggbn.org/schemas/ggbn/terms/Loan";
String LOAN_DESTINATION_TERM = "http://data.ggbn.org/schemas/ggbn/terms/loanDestination";
String LOAN_IDENTIFIER_TERM = "http://data.ggbn.org/schemas/ggbn/terms/loanIdentifier";
String AUS_TRAITS_FIRE_RESPONSE = "fire_response";
Collaborator

Convert from snake case to camel case.
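For illustration, a small sketch of the snake-case to camel-case conversion being asked for. The class `SnakeToCamel` and its method are hypothetical and not part of this PR; in practice the rename would most likely be done by hand on the constant's value.

```java
// Hypothetical utility, shown only to illustrate the naming convention.
final class SnakeToCamel {

    /** Converts e.g. "fire_response" to "fireResponse". */
    static String toCamelCase(String snake) {
        StringBuilder out = new StringBuilder(snake.length());
        boolean upperNext = false;
        for (int i = 0; i < snake.length(); i++) {
            char c = snake.charAt(i);
            if (c == '_') {
                // Drop the underscore; capitalise the next letter unless
                // nothing has been emitted yet (leading underscore).
                upperNext = out.length() > 0;
            } else if (upperNext) {
                out.append(Character.toUpperCase(c));
                upperNext = false;
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

Applied to the value in the hunk above, `toCamelCase("fire_response")` yields `fireResponse`.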

@@ -51,6 +51,7 @@ public interface IndexFields {
String POINT_0_02 = "point-0.02";
String POINT_0_1 = "point-0.1";
String POINT_1 = "point-1";
String PRESENT_IN_COUNTRY = "presentInCountry";
Collaborator

Wondering why we need this. Can't the data just provide countryCode?

Contributor Author

It's the field name, so data will look like presentInCountry:Australia or presentInCountry:Italy. We could use the country code, I suppose (presentInCountry:AU), but the data comes from the region field in the species list, which uses the full name, so that would require an additional lookup.
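As a rough idea of what that additional lookup could involve, here is a sketch that maps an English country name to its ISO 3166-1 alpha-2 code using only the JDK's `java.util.Locale` data. The class `CountryCodeLookup` and its method are hypothetical, not something in this PR, and the sketch assumes the species list's region values match the JDK's English display names, which may not hold for every country.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper; the PR itself indexes the full country name.
final class CountryCodeLookup {

    // Lower-cased English display name -> ISO 3166-1 alpha-2 code.
    private static final Map<String, String> CODE_BY_NAME = buildIndex();

    private static Map<String, String> buildIndex() {
        Map<String, String> index = new HashMap<>();
        for (String code : Locale.getISOCountries()) {
            String name = new Locale.Builder()
                .setRegion(code)
                .build()
                .getDisplayCountry(Locale.ENGLISH);
            index.put(name.toLowerCase(Locale.ROOT), code);
        }
        return index;
    }

    /** Returns the alpha-2 code for a country name, or empty if unknown. */
    static Optional<String> toIsoCode(String countryName) {
        if (countryName == null) {
            return Optional.empty();
        }
        String key = countryName.trim().toLowerCase(Locale.ROOT);
        return Optional.ofNullable(CODE_BY_NAME.get(key));
    }
}
```

With this, `toIsoCode("Australia")` returns `AU` and `toIsoCode("Italy")` returns `IT`; names the JDK spells differently would come back empty and need a manual mapping.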

Contributor Author

I've renamed the field to taxonPresentInCountry now.

3 participants