Search for Approved Foods #13

Open
jesserosato opened this issue Aug 14, 2014 · 11 comments

Comments

@jesserosato
Contributor

WIC is super specific about what it covers; its site has PDFs full of approved foods. It would be great to get that data into a usable format and provide a search utility so WIC users can check whether foods qualify before they go to the store. Ideally, they'd be able to scan UPCs in the store to see if a food qualifies, but that may be overly ambitious...

@marcfarley

Would something like this help? http://www.pdfonline.com/convert-pdf-to-html/

@jesserosato
Contributor Author

@marcfarley Thanks for the tip! It looks like it could be really helpful. I'll point it out for more research at our next Hack Night (Wednesday at 6pm at Sacramento Hacker Lab if you're interested and in the Sacramento area).

@Elizabethcase
Contributor

I used tabula.nerdpower.org to extract the tables. We still need someone to go through by hand and add in the foods and details that are listed in the main food file.

@Elizabethcase
Contributor

Hey, Joseph had a great idea: see if there's an API that connects with UPCs to grab more info about food. Also, we still need a search, woo hoo!

@jesserosato
Contributor Author

@Elizabethcase That's a great idea. It looks like there are a few UPC APIs out there, and Amazon's seems to be the best free one. I'll add food search to the issues queue once I get the back end pushed up to the repo (hopefully by Hack Night next week).

@Elizabethcase
Contributor

Great. One issue to note here: I need to fix the UPCs because Excel automatically removed the leading zeros.
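
A minimal sketch of one way the stripped zeros could be put back in a script rather than by hand. It assumes the codes are 12-digit UPC-A, which this thread does not confirm, and `restore_upc` is a hypothetical helper name, not something in the repo:

```python
def restore_upc(value, width=12):
    """Left-pad a UPC that lost its leading zeros back to `width` digits.

    width=12 assumes UPC-A; pass a different width if the WIC lists
    turn out to use another code length.
    """
    digits = str(value).strip()
    # A spreadsheet export may hand back floats such as "12345.0".
    if digits.endswith(".0"):
        digits = digits[:-2]
    if not digits.isdigit():
        raise ValueError(f"not a numeric UPC: {value!r}")
    if len(digits) > width:
        raise ValueError(f"UPC longer than {width} digits: {value!r}")
    return digits.zfill(width)
```

Padding only recovers the right code if every UPC in the list really has the same length, so it would be worth spot-checking a few results against the original PDFs.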

@jesserosato
Contributor Author

@Elizabethcase I think we may have to do the food data in batches. Looking at the CSV, it looks like the different PDFs maybe had different column orders? Or some just are missing columns? I'll bring it up at Hack Night and see if anyone has any good ideas on how to clean this data up.

@Elizabethcase
Contributor

Yep. Some have organic, some have packaging, some don't have either. I can set all the blanks to nulls, but it's definitely not super clean data.


@jesserosato
Contributor Author

Yeah, I mean, it is scraped from PDFs, so it's actually pretty great. I think I'm gonna just use the common fields for now, and we'll have to ask CDPH if they can get us clean data at some point.
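
For illustration, here is a rough sketch of the "common fields" approach, assuming each PDF's table was exported as its own CSV with a header row. The function name `load_common_fields` and the column names in the usage note are hypothetical, not taken from the actual data:

```python
import csv
import io


def load_common_fields(csv_texts):
    """Merge several CSV batches, keeping only the columns they all share.

    Expects at least one batch. Blank or missing cells become None.
    """
    parsed = []
    for text in csv_texts:
        reader = csv.DictReader(io.StringIO(text))
        rows = list(reader)
        parsed.append((reader.fieldnames or [], rows))

    # Columns present in every batch, kept in the first batch's order.
    first_fields = parsed[0][0]
    shared = set(first_fields).intersection(
        *(set(fields) for fields, _ in parsed[1:])
    )
    common = [name for name in first_fields if name in shared]

    merged = []
    for _, rows in parsed:
        for row in rows:
            # Lookup is by header name, so column order per batch is irrelevant.
            merged.append(
                {name: ((row[name] or "").strip() or None) for name in common}
            )
    return common, merged
```

Given one batch with headers `upc,name,organic` and another with `name,upc,packaging`, this keeps only `upc` and `name`. Because the `csv` module reads every cell as a string, leading zeros in UPCs survive this step as long as they are still present in the file.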


@civicissuebot

Hello! This issue looks like it still needs help!
It's been clicked on 1 times through the Civic Issue Finder on http://www.codeforamerica.org/.
Can this issue be closed or does it still need some assistance?

If you wrote this issue, you can always update the labels for specifying tasks, add more info in the description to make it easier to contribute, or re-write the title to make more contributors interested in helping out.
If you are an open source contributor, ask and see how you can help by commenting or check out more open issues in this repo at https://github.com/code4sac/wicit/issues.

Just doing a little 🌱 open source gardening 🌱 of Brigade projects!
For more info/tools for creating civic issues, check out Got Issues. Thank you!

@josephlei
Contributor

I was in touch with the WIC team at CDPH last year and then again yesterday. I have some contacts and will continue to reach out to them about getting their approved UPCs in a machine-readable format. I don't know if we'll end up using Excel, but if so, a custom number format like 000000000 will add leading zeros as needed.


5 participants