CSV for all served terms (i.e. one row per (legislator, term served)) #662

Open
nrjones8 opened this issue Feb 9, 2019 · 4 comments

@nrjones8
nrjones8 commented Feb 9, 2019

First of all, thanks so much for making and maintaining this data!

I was wondering if you all would be open to adding another csv that would be generated based on the existing legislators-current.yaml and legislators-historical.yaml files. Right now, those files contain more detailed term information than the CSV versions do - which makes sense since the number of terms served by any given legislator can vary, making it hard to put that data into the existing one-row-per-legislator CSV.

I'm wondering if you'd be open to creating a new csv that includes term start/end information for every term served by every legislator? It would look something like this:

bioguide_id,office_type,congress_number,start_date,end_date
B000226,sen,1,1789-03-04,1793-03-03
B000546,rep,1,1789-03-04,1791-03-03
B001086,rep,1,1789-03-04,1791-03-03
C000187,rep,1,1789-03-04,1791-03-03
...
V000119,rep,76,1939-01-03,1941-01-03
V000119,rep,77,1941-01-03,1943-01-03
...

so the same legislator can appear multiple times (once per term served). This would make it easier to analyze all of the members of a given Congress: grab all the bioguide_ids for a given Congress number, then join those IDs to the legislators-historical.csv file.
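
A rough sketch of that join, using only the standard library. It assumes the proposed per-term file has the `bioguide_id` and `congress_number` columns shown above and that the legislators CSV also carries a `bioguide_id` column; the function name `members_of_congress`, the `last_name` values, and the in-memory sample data are placeholders for illustration, not real records.

```python
import csv
import io


def members_of_congress(terms_csv, legislators_csv, congress_number):
    """Return the legislator rows for everyone with a term in one Congress.

    Both arguments are open text files (or any iterable of CSV lines).
    """
    # Step 1: collect the IDs of everyone who served in that Congress.
    ids = {
        row["bioguide_id"]
        for row in csv.DictReader(terms_csv)
        if int(row["congress_number"]) == congress_number
    }
    # Step 2: "join" by keeping only legislator rows whose ID was collected.
    return [
        row for row in csv.DictReader(legislators_csv)
        if row["bioguide_id"] in ids
    ]


# Placeholder data in the proposed layout (names are made up).
TERMS_CSV = """bioguide_id,office_type,congress_number,start_date,end_date
B000226,sen,1,1789-03-04,1793-03-03
C000187,rep,1,1789-03-04,1791-03-03
V000119,rep,76,1939-01-03,1941-01-03
V000119,rep,77,1941-01-03,1943-01-03
"""
LEGISLATORS_CSV = """bioguide_id,last_name
B000226,Placeholder-B
C000187,Placeholder-C
V000119,Placeholder-V
"""

rows = members_of_congress(
    io.StringIO(TERMS_CSV), io.StringIO(LEGISLATORS_CSV), 76
)
print([row["bioguide_id"] for row in rows])  # ['V000119']
```

Collecting the IDs into a set first means a legislator who served several terms still appears once in the result for a given Congress.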

I created a quick prototype (not ready for review, wanted to see if you all were open to the idea first) to give an idea of what I mean: nrjones8@0e83389

Thanks in advance!

@JoshData
Member

I think we'd love to add such a file. It may be difficult to assign congress numbers accurately to the whole dataset however: see #185. That might be better to address as a nice-to-have later. We should also attempt to include every term field in the output and use the same field names as much as possible. So with those caveats I'm 👍 .

@nrjones8
Author

ah yes, I figured there was a reason that congress numbers hadn't been added before! Thanks for that context.

Makes sense on including every term field - just to clarify though, you mean including all subfields under a term object from the source YAML, like this one?

- type: rep
  start: '2013-01-03'
  end: '2015-01-03'
  state: NV
  party: Democrat
  district: 4
  office: 1330 Longworth House Office Building
  address: 1330 Longworth HOB; Washington DC 20515-2804
  phone: 202-225-9894
  url: http://horsford.house.gov
  rss_url: http://horsford.house.gov/rss.xml
  contact_form: https://horsford.house.gov/contact/email-me

The only downside I can see is that term objects don't have a fixed set of fields. Past legislators in particular would end up with a lot of empty columns in the resulting CSV (e.g. contact_form, url, address, phone).
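
One way to handle the varying fields, as a minimal sketch: take the union of all keys seen across terms as the header and leave missing values blank. It assumes each legislator record has already been loaded from the YAML into a dict with `id.bioguide` and a `terms` list; the helper names `terms_to_rows` and `write_terms_csv`, the ID `X000000`, and the sample values are hypothetical.

```python
import csv
import io


def terms_to_rows(legislators):
    """Yield one flat dict per (legislator, term served)."""
    for legislator in legislators:
        bioguide_id = legislator["id"]["bioguide"]
        for term in legislator["terms"]:
            # Keep the source field names; only add the ID column.
            yield {"bioguide_id": bioguide_id, **term}


def write_terms_csv(legislators, out):
    """Write every term to `out` and return the header that was used."""
    rows = list(terms_to_rows(legislators))
    # Header = union of all keys, in first-seen order.
    fieldnames = ["bioguide_id"]
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    # restval="" leaves a blank cell where a term lacks a field.
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return fieldnames


# Hypothetical record: the second term has a field the first one lacks.
sample = [
    {
        "id": {"bioguide": "X000000"},
        "terms": [
            {"type": "rep", "start": "1939-01-03", "end": "1941-01-03"},
            {"type": "rep", "start": "2013-01-03", "end": "2015-01-03",
             "url": "http://example.invalid"},
        ],
    }
]

buffer = io.StringIO()
write_terms_csv(sample, buffer)
print(buffer.getvalue())
```

Nested values inside a term, if any exist in the source data, would be written as their string form here and would need their own flattening rule.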

@caleblucas

@JoshData thanks for your work on this great project! Does something like what @nrjones8 described exist now, as far as you know?

@JoshData
Member

No, I don't think anyone has stepped up to create the file yet.
