Issue 1803: percent of zips (WIP) #1916

Open
wants to merge 96 commits into
base: main

lucasmbrown-usds
Contributor

@lucasmbrown-usds lucasmbrown-usds commented Sep 22, 2022

For issue #1803. Work in progress.

emma-nechamkin and others added 30 commits August 10, 2022 12:07
Imputes income field with a light refactor. Needs more refactor and more tests (I spotchecked). Next ticket will check and address but a lot of "narwhal" architecture is here.
Added HOLC indicator (Historic Redlining Score) from NCRC work; included 3.25 cutoff and low income as part of the housing burden category.
* Update PR threshold count to 10

We now show 10 indicators for PR. See the discussion on the GitHub issue for more info: #1621

* Do not use linguistic iso for Puerto Rico

Closes 1350.

Co-authored-by: Shelby Switzer <[email protected]>
* Remove code that drops Guam and USVI from ETL

* Add back code for dropping rows by FIPS code

We may want this functionality, so let's keep it and just make the constant currently be an empty array.

Co-authored-by: Shelby Switzer <[email protected]>
Removing HOLC calculation from score narwhal.
Rescales linguistic isolation to drop Puerto Rico
adds leaky underground storage tanks
also includes merging / clean up of the release
* added tribalId for Supplemental dataset (#1804)

* Setting zoom levels for tribal map (#1810)

* NRI dataset and initial score YAML configuration (#1534)

* update be staging gha

* NRI dataset and initial score YAML configuration

* checkpoint

* adding data checks for release branch

* passing tests

* adding INPUT_EXTRACTED_FILE_NAME to base class

* lint

* columns to keep and tests

* update be staging gha

* checkpoint

* update be staging gha

* NRI dataset and initial score YAML configuration

* checkpoint

* adding data checks for release branch

* passing tests

* adding INPUT_EXTRACTED_FILE_NAME to base class

* lint

* columns to keep and tests

* checkpoint

* PR Review

* removing source url

* tests

* stop execution of ETL if there's a YAML schema issue

* update be staging gha

* adding source url as class var again

* clean up

* force cache bust

* gha cache bust

* dynamically set score vars from YAML

* docstrings

* removing last updated year - optional reverse percentile

* passing tests

* sort order

* column ordering

* PR review

* class level vars

* Updating DatasetsConfig

* fix pylint errors

* moving metadata hint back to code

Co-authored-by: lucasmbrown-usds <[email protected]>

* Correct copy typo (#1809)

* Add basic test suite for COI (#1518)

* Update COI to use new yaml (#1518)

* Add tests for DOE energy burden (#1518)

* Add dataset config for energy burden (#1518)

* Refactor ETL to use datasets.yml (#1518)

* Add fake GEOIDs to COI tests (#1518)

* Refactor _setup_etl_instance_and_run_extract to base (#1518)

For the three classes we've done so far, a generic
_setup_etl_instance_and_run_extract works fine. For the moment we can
reuse the same setup method until we decide future classes need more
flexibility, and they can always override it in a subclass.

* Add output-path tests (#1518)

* Update YAML to match constant (#1518)

* Don't blindly set float format (#1518)

* Add defaults for extract (#1518)

* Run YAML load on all subclasses (#1518)

* Update description fields (#1518)

* Update YAML per final format (#1518)

* Update fixture tract IDs (#1518)

* Update base class refactor (#1518)

Now that NRI is final I needed to make a small number of updates to my
refactored code.

* Remove old comment (#1518)

* Fix type signature and return (#1518)

* Update per code review (#1518)

Co-authored-by: Jorge Escobar <[email protected]>
Co-authored-by: lucasmbrown-usds <[email protected]>
Co-authored-by: Vim <[email protected]>
Yikes! Fixing merge messup!
Imputes income field with a light refactor. Needs more refactor and more tests (I spotchecked). Next ticket will check and address but a lot of "narwhal" architecture is here.
Added HOLC indicator (Historic Redlining Score) from NCRC work; included 3.25 cutoff and low income as part of the housing burden category.
* Update PR threshold count to 10

We now show 10 indicators for PR. See the discussion on the GitHub issue for more info: #1621

* Do not use linguistic iso for Puerto Rico

Closes 1350.

Co-authored-by: Shelby Switzer <[email protected]>
* Remove code that drops Guam and USVI from ETL

* Add back code for dropping rows by FIPS code

We may want this functionality, so let's keep it and just make the constant currently be an empty array.

Co-authored-by: Shelby Switzer <[email protected]>
Removing HOLC calculation from score narwhal.
Rescales linguistic isolation to drop Puerto Rico
adds leaky underground storage tanks
also includes merging / clean up of the release
mattbowen-usds and others added 9 commits September 23, 2022 13:20
* Move test to base for broader coverage (#1848)

* Remove duplicate line (#1848)

* FUDS needed an extra mock (#1848)
* Add tribal count notebook (#1917)

* test without caching

* added comment

Co-authored-by: lucasmbrown-usds <[email protected]>
* Add tribal data to downloads (#1904)

* Update test pickle with current cols (#1904)

* Remove text of tribe names from GeoJSON (#1904)

* Update test data (#1904)

* Add tribal overlap to smoketests (#1904)
* should be working, has unnecessary loggers

* removing loggers and cleaning up

* updating ejscreen tests

* adding tests and responding to PR feedback

* fixing broken smoke test

* delete smoketest docs
@emma-nechamkin
Contributor

emma-nechamkin commented Sep 29, 2022

YIKES! I read the code incorrectly

mattbowen-usds and others added 13 commits September 29, 2022 12:42
* Backfill population in island areas (#1882)

* Update smoketest to account for backfills (#1882)

As I wrote in the comment:
We backfill island areas with data from the 2010 census, so if THOSE tracts
have data beyond the data source, that's to be expected and is fine to pass.
If some other state or territory does, though, this should fail.

This ends up being a nice way of documenting that behavior, I guess!

* Fixup lint issues (#1882)

* Add in race demos to 2010 census pull (#1851)

* Add backfill data to score (#1851)

* Change column name (#1851)

* Fill demos after the score (#1851)

* Add income back, adjust test (#1882)

* Apply code-review feedback (#1851)

* Add test for island area backfill (#1851)

* Fix bad rename (#1851)
* Add back lack of plumbing fields (#1920)

* Reorder fields for excel (#1921)

* Reorder excel fields (#1921)

* Fix formatting, lint errors, pickles (#1921)

* Add missing plumbing col, fix order again (#1921)

* Update that pickle (#1921)
Base automatically changed from emma-nechamkin/release/score-narwhal to main December 2, 2022 02:50
@vim-usds
Collaborator

vim-usds commented Feb 2, 2023

Upshot: what percentage of a zip code is disadvantaged? Some places do have data by zip code.

Random example:
If $1M goes to zip code X and 75% of that zip code is disadvantaged, then $750k went to disadvantaged communities.
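
A minimal sketch of that arithmetic, not the code in this PR. It assumes some crosswalk already tells us what fraction of a zip code falls in each tract and whether the tract is flagged as disadvantaged. `TractShare`, `disadvantaged_fraction`, `dollars_to_disadvantaged`, and the tract IDs are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class TractShare:
    """One tract's slice of a zip code (hypothetical crosswalk row)."""

    tract_id: str           # made-up identifier, for illustration only
    share_of_zip: float     # fraction of the zip code attributed to this tract
    is_disadvantaged: bool  # whether the tract is flagged as disadvantaged


def disadvantaged_fraction(shares):
    """Return the fraction of the zip code that sits in disadvantaged tracts."""
    total = sum(s.share_of_zip for s in shares)
    if total <= 0:
        raise ValueError("zip code has no tract weight")
    flagged = sum(s.share_of_zip for s in shares if s.is_disadvantaged)
    return flagged / total


def dollars_to_disadvantaged(amount, shares):
    """Attribute a dollar amount in proportion to the disadvantaged fraction."""
    return amount * disadvantaged_fraction(shares)


# Zip code X from the example above: 75% of it lies in disadvantaged tracts.
zip_x = [
    TractShare("tract-a", 0.50, True),
    TractShare("tract-b", 0.25, True),
    TractShare("tract-c", 0.25, False),
]

fraction = disadvantaged_fraction(zip_x)
print(f"{fraction:.0%} disadvantaged")
print(f"${dollars_to_disadvantaged(1_000_000, zip_x):,.0f} attributed")
```

What the weight should be (population, land area, housing units) is an open choice that this sketch leaves to the caller. Dividing by the total keeps the result sensible even when a zip code's shares do not sum exactly to 1.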

6 participants