Accompanying blog post

Step -1: Environment setup

python3 -m venv venv
source venv/Scripts/activate    # Windows (Git Bash); on Linux/macOS use venv/bin/activate
pip install -r requirements.txt

mkdir driver profile_html scraped_data clean_data

Download the Selenium Chrome webdriver from here and store it in the "driver" folder.

Step 0: Store LinkedIn authentication in a config file

  • Rename "sample.ini" to "config.ini"
  • Update username and password config variables with your LinkedIn profile credentials
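The scraper scripts presumably read these credentials when they start. A minimal sketch of how that could look with Python's standard `configparser`, assuming a hypothetical section named `linkedin` holding the `username` and `password` keys (the actual layout of "sample.ini" may differ, so check it before relying on this):

```python
import configparser
from pathlib import Path


def load_credentials(path="config.ini", section="linkedin"):
    """Return (username, password) read from an INI file.

    The section name "linkedin" is an assumption for illustration;
    check sample.ini for the real section and key names.
    """
    if not Path(path).is_file():
        raise FileNotFoundError(
            f"{path} not found; rename sample.ini to config.ini first"
        )
    # interpolation=None keeps '%' characters in a password from being
    # interpreted as configparser interpolation syntax.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path, encoding="utf-8")
    if not parser.has_section(section):
        raise KeyError(f"section [{section}] missing in {path}")
    return parser.get(section, "username"), parser.get(section, "password")
```

Keeping the credentials in "config.ini" rather than in the scripts means the file can be excluded from version control.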

Step 1: Download interesting profiles

cd scraper
python3 fetch_profile_urls.py
python3 download_profile_pages.py

After this step, you should have HTML profile pages downloaded in the "profile_html" directory.
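A quick way to confirm the download worked is to count the saved pages. The sketch below assumes the pages are stored with an `.html` extension and that the path is given relative to where you run it (from inside "scraper" that would likely be `../profile_html`); both are assumptions, not something the scripts guarantee.

```python
from pathlib import Path


def list_downloaded_profiles(html_dir="profile_html"):
    """Return the non-empty .html files in html_dir, sorted by path."""
    directory = Path(html_dir)
    if not directory.is_dir():
        return []
    # Skip zero-byte files, which usually indicate a failed download.
    return sorted(
        p for p in directory.glob("*.html")
        if p.is_file() and p.stat().st_size > 0
    )


if __name__ == "__main__":
    pages = list_downloaded_profiles()
    print(f"{len(pages)} profile page(s) found")
```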

Step 2: Scrape profile data with BeautifulSoup

python3 scrap_profile_data.py
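To illustrate the general idea of pulling a field out of a saved page, here is a dependency-free sketch that extracts the page title. It deliberately uses the standard library's `html.parser` instead of BeautifulSoup so it runs without extra packages, and it is not what scrap_profile_data.py does; the names `TitleParser` and `extract_title` are hypothetical.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found inside the <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so collect and join later.
        if self._in_title:
            self.title_parts.append(data)


def extract_title(html):
    """Return the stripped <title> text, or an empty string if absent."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.title_parts).strip()
```

With BeautifulSoup installed, the same lookup is typically a one-liner on the parsed document; the actual profile fields and selectors depend on LinkedIn's markup at the time the pages were saved.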

Step 3: Data preparation

For data cleaning and preparation, I used the "data-preparation.ipynb" Jupyter notebook.

Step 4: Data Visualization

You can find the Tableau workbook here