Merge pull request #3 from driftlesslabs/ci

Continuous integration testing and documentation
atlregional · Sep 28, 2024 · e9aef48
2 parents 80ff3fd + 098744d
commit e9aef48

Showing 110 changed files with 848 additions and 1 deletion.
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -0,0 +1,118 @@
+name: Implementation Testing
+
+on:
+  push:
+    branches:
+      - '*'
+  pull_request:
+    branches:
+      - '*'
+
+env:
+  CACHE_NUMBER: 0  # increase to reset cache manually
+
+jobs:
+  foundation:
+
+    strategy:
+      matrix:
+        python-version: ["3.10"]
+    defaults:
+      run:
+        shell: bash -l {0}
+    name: linux-64-py${{ matrix.python-version }}
+    runs-on: ubuntu-latest
+    steps:
+      # checkout the code in this repository
+      - uses: actions/checkout@v4
+        with:
+          path: 'arc-activitysim'
+
+      # checkout the main branch of ActivitySim itself
+      - uses: actions/checkout@v4
+        with:
+          repository: 'ActivitySim/activitysim'
+          ref: main
+          path: 'activitysim'
+          fetch-depth: 0  # get all tags, lets setuptools_scm do its thing
+
+      - name: Setup Miniforge
+        uses: conda-incubator/setup-miniconda@v3
+        with:
+          miniforge-version: latest
+          activate-environment: asim-test
+          python-version: ${{ matrix.python-version }}
+
+      - name: Set cache date for year and month
+        run: echo "DATE=$(date +'%Y%m')" >> $GITHUB_ENV
+
+      - uses: actions/cache@v4
+        with:
+          path: ~/conda_pkgs_dir
+          key: linux-64-conda-${{ hashFiles('activitysim/conda-environments/github-actions-tests.yml') }}-${{ env.DATE }}-${{ env.CACHE_NUMBER }}
+        id: cache
+
+      - name: Update environment
+        run: |
+          conda env update -n asim-test -f activitysim/conda-environments/github-actions-tests.yml
+
+      - name: Install activitysim
+        # Installing without dependencies is faster; we trust that all needed dependencies
+        # are in the conda environment defined above.  This also keeps pip from getting
+        # confused and reinstalling tables (pytables).
+        run: |
+          python -m pip install ./activitysim --no-deps
+
+      - name: Conda checkup
+        run: |
+          conda info -a
+          conda list
+
+      - name: Get the Fulton data
+        run: |
+          cd arc-activitysim
+          python scripts/fetch-fulton.py
+
+      - name: Run progressive tests
+        run: |
+          cd arc-activitysim
+          python -m pytest tests/test_activitysim.py
+
+      - name: Run without Sharrow
+        run: |
+          cd arc-activitysim
+          python scripts/run-fulton.py
+
+      - name: Upload legacy artifacts
+        uses: actions/upload-artifact@v4
+        with:
+          name: legacy-outputs
+          path: |
+            ${{ github.workspace }}/arc-activitysim/output-fulton-legacy/final_*.csv
+            ${{ github.workspace }}/arc-activitysim/output-fulton-legacy/*.log
+            ${{ github.workspace }}/arc-activitysim/output-fulton-legacy/timing_log.csv
+
+      - name: Check legacy outputs
+        run: |
+          cd arc-activitysim
+          python scripts/check-fulton.py --check-dir ${{ github.workspace }}/arc-activitysim/output-fulton-legacy
+
+      - name: Run with Sharrow
+        run: |
+          cd arc-activitysim
+          python scripts/run-fulton.py --sharrow
+
+      - name: Upload Sharrow artifacts
+        uses: actions/upload-artifact@v4
+        with:
+          name: sharrow-outputs
+          path: |
+            ${{ github.workspace }}/arc-activitysim/output-fulton-sharrow/final_*.csv
+            ${{ github.workspace }}/arc-activitysim/output-fulton-sharrow/*.log
+            ${{ github.workspace }}/arc-activitysim/output-fulton-sharrow/timing_log.csv
+
+      - name: Check sharrow outputs
+        run: |
+          cd arc-activitysim
+          python scripts/check-fulton.py --check-dir ${{ github.workspace }}/arc-activitysim/output-fulton-sharrow
+
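
Note on the cache key in ci.yml: the key combines a hash of the conda environment file, the current year and month, and `CACHE_NUMBER`, so the package cache is rebuilt when the environment file changes, when the month rolls over, or when `CACHE_NUMBER` is bumped by hand. The Python sketch below only illustrates that behavior. The `cache_key` helper is hypothetical, and a plain SHA-256 of the file contents stands in for GitHub's `hashFiles`, whose exact digest is not reproduced here.

```python
import hashlib
from datetime import date


def cache_key(env_file: bytes, today: date, cache_number: int = 0) -> str:
    """Build a key shaped like the one in ci.yml (illustration only)."""
    # Stand-in for hashFiles(...): any change to the file changes the digest.
    digest = hashlib.sha256(env_file).hexdigest()
    # Same year-and-month stamp as `date +'%Y%m'` in the workflow.
    stamp = today.strftime("%Y%m")
    return f"linux-64-conda-{digest}-{stamp}-{cache_number}"


env = b"dependencies:\n  - python=3.10\n"
september = cache_key(env, date(2024, 9, 28))

# Same file, same month, same CACHE_NUMBER: the cache is reused.
print(september == cache_key(env, date(2024, 9, 1)))  # True
# A new month, an edited file, or a bumped CACHE_NUMBER each give a new key.
print(september == cache_key(env, date(2024, 10, 1)))  # False
print(september == cache_key(env + b"  - numpy\n", date(2024, 9, 28)))  # False
print(september == cache_key(env, date(2024, 9, 28), 1))  # False
```

Because the month is part of the key, the cache is refreshed at least once a month even if the environment file never changes.
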
diff --git a/README.md b/README.md
@@ -1,2 +1,118 @@
 # arc-activitysim
 The standalone activitysim implementation for ARC travel demand model.
+
+## Installation
+
+To install the ARC ActivitySim model, simply clone this repository:
+
+```bash
+git clone https://github.com/atlregional/arc-activitysim.git
+cd arc-activitysim
+``` 
+
+Using this model requires ActivitySim 1.3 or later.  This is most easily
+accomplished using the `activitysim` conda package, which first requires the
+installation of the `conda` package manager.  For most systems, the Miniforge
+distribution is recommended, which can be downloaded from 
+[conda-forge](https://github.com/conda-forge/miniforge?tab=readme-ov-file#miniforge3).
+
+Once `conda` is installed, the `activitysim` package can be installed from the
+Miniforge Prompt (or the terminal on Linux/Mac):
+
+```bash
+conda create -n ARC-ASIM activitysim -c conda-forge --override-channels
+```
+
+This will create a new conda environment named `ARC-ASIM` with the `activitysim`
+package installed.  To activate the environment, use:
+
+```bash 
+conda activate ARC-ASIM
+```
+
+## Running the Model
+
+The ARC ActivitySim model can be run using the `activitysim` command line tool.
+The model is configured using the `configs` directory, which contains the
+configuration files for the model.  From the directory where this repository 
+has been cloned, the model can be run using the following command:
+
+```bash
+activitysim run -c configs -d data_dir -o output_dir
+```
+
+Here, `data_dir` is the directory containing the input data for the model, and
+`output_dir` is the directory where the model output will be written.  The data
+directory should contain the necessary input files (households, persons, land use,
+and skims), which can be the full-scale ARC data or a smaller test data set (see
+instructions to access the Fulton County test data below).  The output directory
+will be created if it does not exist, and the model output will be written to
+subdirectories of this directory.
+
+## Running the Model with Sharrow
+
+The ARC ActivitySim model can also be run with sharrow enabled.  This is
+done by adding the relevant sharrow configs directory to the command.  For
+example, to run the model with sharrow in compile-test mode, use the
+following command:
+
+```bash 
+activitysim run -c configs -c configs_sh_compile -d data_dir -o output_dir
+``` 
+
+This will run the model with sharrow enabled, compile the numba code, and
+run tests to ensure the results match between sharrow and legacy modes.
+Once the sharrow compiling is complete, the model can subsequently be run in
+sharrow's "production" mode, which will be much faster:
+
+```bash
+activitysim run -c configs -c configs_sh -d data_dir -o output_dir
+``` 
+
+## Testing Dataset (Fulton County)
+
+This model is built to run with data for the full-scale ARC region,
+but data at this scale can be overwhelming
+for testing the operation of the model, especially on more limited
+platforms.
+
+To facilitate testing, data for a smaller slice of the region is available.
+This test data includes just Fulton County, which has 1,296 zones; this is
+a small enough area to run the model on a laptop or within the CI testing
+infrastructure, as it will require only about 6 GB of RAM to store the
+skims in memory, and another 1 or 2 GB for the rest of the model. But this
+area is still large enough to provide a meaningful test of the model, with
+enough zones to exercise the model's capabilities and complexity. The Fulton
+County data can be downloaded with this Python script (also available
+as [fetch-fulton.py](./scripts/fetch-fulton.py)):
+
+```python
+from pathlib import Path
+from activitysim.examples.external import download_external_example
+
+example_dir = download_external_example(
+  name=".", 
+  working_dir=Path.cwd(),
+  assets={
+    "arc-fulton-data.tar.zst": {
+      "url": "https://github.com/atlregional/arc-activitysim/releases/download/v1.3.0/arc-fulton-data.tar.zst",
+      "sha256": "402c3cf1fdd96ae0342f17146453b713602ca8454b78f1e8ff8cbc403e03441e",
+      "unpack": "arc-fulton-data",
+    },
+  }
+)
+```
+
+## Continuous Integration Testing
+
+This repository is configured to run continuous integration testing
+using GitHub Actions. The tests are run on the Fulton County test data,
+and the outputs are uploaded as artifacts, available from the `Actions` tab
+of the repository.  The tests are configured in the `.github/workflows`
+directory, and use the scripts in the `scripts` directory.
+
+Note that the tests are run in a clean environment every time, so the
+first sharrow test includes the overhead of compiling all the numba code.
+This will make it appear that this sharrow test is *much* slower than the
+comparable legacy test; this is normal and not an indication that sharrow is
+slower than the legacy code for production runs.
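
Note on the README run instructions: the legacy, compile-test, and production runs differ only in the extra `-c` configs directory passed to `activitysim run`. A minimal Python sketch of that pattern follows. The `build_run_command` helper and the mode names are hypothetical, while the flags and directory names are the ones shown in the README above. The sketch only builds the argument list and does not launch the model.

```python
from typing import Dict, List, Optional

# Extra configs directory for each mode, as named in the README.
# The mode names themselves are made up for this illustration.
SHARROW_CONFIGS: Dict[str, Optional[str]] = {
    "legacy": None,
    "compile": "configs_sh_compile",
    "production": "configs_sh",
}


def build_run_command(mode: str, data_dir: str, output_dir: str) -> List[str]:
    """Return the argument list for `activitysim run` in the given mode."""
    if mode not in SHARROW_CONFIGS:
        raise ValueError(f"unknown mode: {mode!r}")
    # The base configs directory is always given first, as in the README.
    cmd = ["activitysim", "run", "-c", "configs"]
    extra = SHARROW_CONFIGS[mode]
    if extra is not None:
        # Sharrow modes add one more configs directory.
        cmd += ["-c", extra]
    cmd += ["-d", data_dir, "-o", output_dir]
    return cmd


for mode in SHARROW_CONFIGS:
    print(" ".join(build_run_command(mode, "data_dir", "output_dir")))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would start the run, assuming the `ARC-ASIM` environment is active and the command is issued from the repository root.
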