Efficiency suggestion: Only search datasets #2

Open
levyj opened this issue Aug 24, 2020 · 2 comments

@levyj

levyj commented Aug 24, 2020

I may be misunderstanding what you are doing, either technically or in intent. However, you may be unnecessarily searching derived views (such as filtered views), which will never contain records that are not also in the parent dataset.

For example, if you want to find my salary record (might as well; I think everyone else has!), you do not need to search all of the views below. Anything appearing in the last five will, by definition, appear in the first one. In effect, you can search just the datasets listed at https://data.cityofchicago.org/browse?limitTo=datasets.

That said, you may realize this and intentionally be highlighting the derived views as well, so that people will know about them. If so, fair enough.

image

@anthonymoser
Owner

anthonymoser commented Sep 20, 2020

Thanks for these suggestions; I know it's taken me a while to get around to them. Do you know if there is a way to filter out derived views based on the metadata? I'm using the Sodapy library, and the method for getting datasets doesn't really offer a good way to request only primary data sets. I'm already filtering out the maps based on metadata, so I could also remove the extra sets if they're classified as such.

In full disclosure, I also had not realized that the Chicago data portal was structured that way. Is that a feature of Socrata's platform, or is it an implementation choice? I've noticed that other cities and entities using the Socrata platform seem to take a much more haphazard approach to creating data sets, so the structure is a credit to Chicago's effort.

@levyj
Author

levyj commented Sep 21, 2020

Kind of in reverse order:

Thanks!

If I am understanding the question right, this is a Socrata feature. From https://data.cityofchicago.org/browse, for example, it is basically the difference under View Types between Datasets and Filtered Views.

I actually have not used Sodapy (or even Python, much as I should learn it). Looking briefly at the documentation, I thought I might have an answer, but now I am not sure because I do not know which API it is using behind the scenes. What I think I have figured out:

  • Give it a try if you like, but I don't think the viewType element is going to help, because you will probably get tabular for both datasets and filtered views.
  • If you get a parent element, especially one pointing to an id other than that of the view itself, that should be an indicator of a derived view.
  • The right answer is to ask at https://support.socrata.com/hc/en-us/requests/new. Feel free to CC either me or the main Data Portal address. (I don't want to put the addresses here for spam reasons but if you have trouble finding either or both, let me know.)
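
For what it is worth, the parent-element check in the second bullet could be sketched roughly as below. This is only a sketch under assumptions: the key names `id` and `parent`, and the sample ids in the usage note, are hypothetical placeholders, since I have not confirmed what field names the metadata returned through Sodapy actually uses. The records are assumed to be plain dictionaries.

```python
from typing import Any, Iterable, List, Mapping


def is_derived_view(
    meta: Mapping[str, Any],
    id_key: str = "id",          # hypothetical key holding the view's own id
    parent_key: str = "parent",  # hypothetical key holding the parent id(s)
) -> bool:
    """Return True if the metadata names a parent other than the view itself."""
    own_id = meta.get(id_key)
    parent = meta.get(parent_key)
    if parent is None:
        return False
    # Accept either a single id string or a collection of ids.
    parents = [parent] if isinstance(parent, str) else list(parent)
    return any(p and p != own_id for p in parents)


def primary_datasets(
    records: Iterable[Mapping[str, Any]],
    id_key: str = "id",
    parent_key: str = "parent",
) -> List[Mapping[str, Any]]:
    """Keep only the records that do not look like derived views."""
    return [
        meta for meta in records
        if not is_derived_view(meta, id_key, parent_key)
    ]
```

With made-up records such as `{"id": "aaaa-1111"}` and `{"id": "bbbb-2222", "parent": "aaaa-1111"}`, only the first would be kept. If the real metadata stores the parent under a different name, passing that name as `parent_key` should be enough.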
