Return all rows of parent data for a given sdss_id and catalogid #43

Merged 1 commit on Aug 13, 2024

Conversation

albireox
Member

Change the /target/parents/{catalog}/{sdss_id} route to return all the parent catalog entries for all the entries associated with sdss_id in sdss_id_to_catalog. This can be limited to one catalogid by passing a ?catalogid= query parameter.

Note that in most cases this will return multiple rows for the same value, since generally the sdss_id will be associated with the same parent catalog value across different cross-matches. For example:

curl -L "http://127.0.0.1:8000/target/parents/gaia_dr2_source/129047350"

returns three copies of the same entry in Gaia DR2 for source_id=375250708536870400.

That same sdss_id will only return one row for Gaia DR3, since only one of the catalogids associated with the sdss_id has Gaia DR3 information.
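As a rough illustration of the two ways to call the route, here is a minimal Python sketch that builds the URLs. The helper name `parents_url` and the catalogid value `12345` are hypothetical placeholders; the base URL is the local dev server from the curl example above.

```python
from urllib.parse import urlencode

# Local dev server used in the curl example above.
BASE_URL = "http://127.0.0.1:8000"


def parents_url(catalog, sdss_id, catalogid=None):
    """Build the parent-catalog route URL (hypothetical helper).

    Without catalogid, the route returns rows for every catalogid
    associated with the sdss_id. With catalogid, it is limited to that one.
    """
    url = f"{BASE_URL}/target/parents/{catalog}/{sdss_id}"
    if catalogid is not None:
        url += "?" + urlencode({"catalogid": catalogid})
    return url


# All parent rows for the sdss_id.
all_rows_url = parents_url("gaia_dr2_source", 129047350)

# Limited to a single catalogid (12345 is a placeholder, not a real value).
one_catalogid_url = parents_url("gaia_dr2_source", 129047350, catalogid=12345)

print(all_rows_url)
print(one_catalogid_url)
```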

I also found that there are duplicates in the sdss_id_to_catalog view for the same catalogid and sdss_id. I'll investigate why those are being created and re-run the view, but for now I've added a distinct to get_parent_catalog_data to reject them.
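To show the effect of the distinct, here is a small self-contained sketch using an in-memory SQLite table. This is not the actual get_parent_catalog_data query: the three-column schema, the `parent_pk` column name, and the catalogid values 1 to 3 are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified, hypothetical stand-in for the sdss_id_to_catalog view.
conn.execute(
    "CREATE TABLE sdss_id_to_catalog "
    "(sdss_id INTEGER, catalogid INTEGER, parent_pk INTEGER)"
)
conn.executemany(
    "INSERT INTO sdss_id_to_catalog VALUES (?, ?, ?)",
    [
        (129047350, 1, 375250708536870400),
        (129047350, 1, 375250708536870400),  # exact duplicate row in the view
        (129047350, 2, 375250708536870400),
        (129047350, 3, 375250708536870400),
    ],
)


def parent_rows(conn, sdss_id):
    """Return the view rows for sdss_id with exact duplicates rejected."""
    query = (
        "SELECT DISTINCT sdss_id, catalogid, parent_pk "
        "FROM sdss_id_to_catalog WHERE sdss_id = ? ORDER BY catalogid"
    )
    return conn.execute(query, (sdss_id,)).fetchall()


rows = parent_rows(conn, 129047350)
print(len(rows))  # one row per catalogid; the duplicated row is collapsed
```

The distinct only drops rows that are identical in every selected column, so the three catalogids still produce three rows even though they point to the same parent entry.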

Fixes #42

Contributor

@havok2063 havok2063 left a comment


I think this looks good. Thanks! I see what you mean now about the parent catalog primary_key being the same across different catalogids. I hadn't realized that before. It's a lot of redundant info. I guess that's why you returned just the first element. I'm ok one way or the other, but if it makes more sense the way you had it before, and if we think this new way will be confusing to users, then I can update the front-end instead.

@albireox
Member Author

Yes, I'm not sure. It will almost always be the same data, but not always, and the exceptions may be informative. One option, though I don't know if it's more confusing, is to add a distinct on the sdss_id_to_catalog column that holds the PK of the parent catalogue. That way it would keep the different rows, but only when the entries in sdss_id_to_catalog point to different parent catalogue rows.
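A rough Python sketch of that option, done after the query rather than in SQL: keep one row per distinct parent PK. The function name `distinct_by_parent_pk`, the dict row layout, and the second source_id are hypothetical, not taken from the codebase.

```python
def distinct_by_parent_pk(rows, pk_field):
    """Keep the first row seen for each distinct parent-catalogue PK."""
    seen = set()
    kept = []
    for row in rows:
        pk = row[pk_field]
        if pk not in seen:
            seen.add(pk)
            kept.append(row)
    return kept


# Three catalogids pointing at the same Gaia DR2 source collapse to one row.
same_parent = [
    {"catalogid": 1, "source_id": 375250708536870400},
    {"catalogid": 2, "source_id": 375250708536870400},
    {"catalogid": 3, "source_id": 375250708536870400},
]

# If one catalogid points at a different parent row, both are kept.
# (The second source_id is made up for the example.)
mixed_parents = same_parent + [{"catalogid": 4, "source_id": 111}]

print(len(distinct_by_parent_pk(same_parent, "source_id")))
print(len(distinct_by_parent_pk(mixed_parents, "source_id")))
```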

@havok2063
Contributor

I'm ok keeping it like this for now. We can bring it up for discussion, if we want, the next time you're on the data-viz call.

@havok2063 havok2063 merged commit b3f2385 into main Aug 13, 2024
2 checks passed
@albireox albireox deleted the albireox/issue42 branch August 13, 2024 14:23
Successfully merging this pull request may close these issues.

the parent catalog route only returns the first item in the list
2 participants