OpenDataPipeline

Extract structured datasets for the Stanford Open Data Project

Use Python 3.

pip3 install awscli boto3
aws configure
# enter access key id and secret access key

pip3 install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib
python3 gspreadsheet.py
# This queries the Google Sheets API, fetches the sheet as CSV, converts it to JSON, and uploads the result to AWS

gspreadsheet.py pulls metadata from the Google Sheet as CSV and converts it to JSON. Uploading the result to an S3 bucket requires AWS credentials that boto3 can read, such as those entered during aws configure.
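
The CSV-to-JSON and upload steps could look roughly like the sketch below. It is not the code in gspreadsheet.py: the function names `csv_to_json` and `upload_json`, the bucket name, and the object key are hypothetical. It assumes the first CSV row holds the column names.

```python
import csv
import io
import json


def csv_to_json(csv_text):
    """Convert CSV text (first row = header) into a JSON array of objects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return json.dumps(list(reader), indent=2)


def upload_json(json_text, bucket, key, client=None):
    """Upload a JSON string to the given S3 bucket under the given key.

    `client` is optional; when omitted, a boto3 S3 client is created from
    the credentials stored by `aws configure`.
    """
    if client is None:
        import boto3  # imported lazily so the conversion step runs without AWS

        client = boto3.client("s3")
    client.put_object(
        Bucket=bucket,
        Key=key,
        Body=json_text.encode("utf-8"),
        ContentType="application/json",
    )


# Hypothetical usage; the bucket and key names are placeholders:
# upload_json(csv_to_json(open("metadata.csv").read()),
#             bucket="my-open-data-bucket", key="metadata.json")
```

Note that `csv.DictReader` leaves every value as a string, so numeric columns need an explicit conversion if the JSON should carry numbers.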

TheStanfordDaily/open-data-pipeline
