Uploading Code for Webscraping, EDA, and Viz #33
Hi Alan, in addition to the in-line feedback that I gave, please make the following adjustments:
- Work within the file organization of the repository, get rid of the folder DATA_271_Data_Clinic_I, and organize your files into the correct folders
- Designate 1 clean notebook that methodically answers each question that Trevor asked
- Look at all of the data, not just 2003 -- work to combine as much of your data as possible
- The web scraping code should be a .py script that will download all of the data into a digestible format for the notebook
- Update the notebook readme with your file name and a 1-2 sentence description of what the notebook is
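One possible shape for the "all years, not just 2003" step is sketched below. It assumes the scraping script has saved one CSV per year; the in-memory strings, the column names `filer_id` and `amount`, and the variable `pa_contributions` are hypothetical stand-ins, not the actual files or schema in this repository.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the per-year CSVs the scraping script would save.
# In the notebook these would be file paths rather than in-memory strings.
yearly_files = {
    2003: io.StringIO("filer_id,amount\nA1,100\nB2,250\n"),
    2004: io.StringIO("filer_id,amount\nA1,75\n"),
}

frames = []
for year, source in yearly_files.items():
    df = pd.read_csv(source)
    df["year"] = year  # record which yearly file each row came from
    frames.append(df)

# Stack every year into one DataFrame with a fresh 0..n-1 index.
pa_contributions = pd.concat(frames, ignore_index=True)
```

Keeping a `year` column on the combined frame means later group-bys and plots can still split by year without reloading individual files.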
averyschoen commented on 2023-10-31T14:56:35Z: Adjust to use markdown headings to organize.
averyschoen commented on 2023-10-31T14:56:36Z: Use a for-loop to clean this up.
averyschoen commented on 2023-10-31T14:56:37Z: allyears is not a descriptive name for a df; perhaps PA_contributions would be more appropriate.
averyschoen commented on 2023-10-31T14:56:38Z: Why are there _x and _y columns? If they are supposed to be the same, join on them or drop them (do exploration before deciding a path forward). If they are different, change _x and _y to be the file source (i.e. filer or contributor).
averyschoen commented on 2023-11-07T18:22:39Z: Clean up the column names to be some consistent format.
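For context, `_x` and `_y` are the default suffixes pandas adds when both sides of a `merge` share a column name that is not part of the join key. A minimal sketch of the exploration suggested above follows; the two small frames and their columns (`filer_id`, `year`, `city`) are made up for illustration and are not the real filer or contributor tables.

```python
import pandas as pd

# Hypothetical miniature versions of the two source tables.
filer = pd.DataFrame(
    {"filer_id": [1, 2], "year": [2003, 2003], "city": ["Erie", "York"]}
)
contributor = pd.DataFrame(
    {"filer_id": [1, 2], "year": [2003, 2003], "city": ["Erie", "Reading"]}
)

# Joining on filer_id alone yields year_x/year_y and city_x/city_y.
default = filer.merge(contributor, on="filer_id")

# Explore first: do the overlapping columns actually agree?
year_same = (default["year_x"] == default["year_y"]).all()
city_same = (default["city_x"] == default["city_y"]).all()

# year agrees, so it joins the key; city differs, so label it by source file.
merged = filer.merge(
    contributor,
    on=["filer_id", "year"],
    suffixes=("_filer", "_contributor"),
)
```

Here `merged` ends up with `city_filer` and `city_contributor` instead of the opaque `_x`/`_y` pair, and `year` appears only once.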
averyschoen commented on 2023-10-31T14:56:39Z: Adjust the axis label to year from year. There is no need to print out the table if the barplot shows that information. Use full names for all labels (put a space in FilerType).
averyschoen commented on 2023-10-31T14:56:40Z: Similar comment as above: use proper label names, standardize capitalisation in the title, and remove the table printout. Why are there a bunch of different unknowns? There should only be one, and it should all be grouped together.
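One way to collapse the different spellings of "unknown" into a single category is sketched here. The sample values and the `UNKNOWN_ALIASES` set are assumptions for illustration; the real list should come from inspecting the column's unique values. Codes that are not in the alias set (for example an unfamiliar abbreviation) are deliberately left untouched.

```python
import pandas as pd

# Hypothetical sample of a filer-type column with inconsistent unknowns.
filer_types = pd.Series(
    ["PAC", "unknown", "Unknown ", "N/A", None, "Candidate", "", "CMT"]
)

# Assumed spellings that all mean "no usable value"; compared in lowercase.
UNKNOWN_ALIASES = {"unknown", "n/a", ""}


def normalize_filer_type(value):
    """Map missing values and known aliases to the single label 'Unknown'."""
    if pd.isna(value):
        return "Unknown"
    text = str(value).strip()
    return "Unknown" if text.lower() in UNKNOWN_ALIASES else text


cleaned = filer_types.map(normalize_filer_type)
counts = cleaned.value_counts()
```

After this step a bar plot built from `counts` shows one "Unknown" bar rather than several near-duplicates.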
averyschoen commented on 2023-10-31T14:56:41Z: These could be deprecated abbreviations. Don't group them into unknown; use ChatGPT as inspiration for what they might have been.
Great work Alan. Let me know if you have questions about the in-line feedback I have left here.
As you're making these revisions, keep in mind that all of the functions that you are using will be helping you get into the schema we outlined in Google Docs. I sent you a Slack message regarding the expectations for function annotations. Please address the merge conflicts and finish the EDA you began in step 5.
averyschoen commented on 2023-11-07T18:23:54Z: This is incomplete, please wrap this up ASAP.
…linic-climate-cabinet into PA_EDA_and_Schema
basics of makefile and added classify fns