
Massimo #18

Draft
wants to merge 52 commits into base: master
52 commits
52a3b06
Add pybullet sim code
Nate711 Apr 11, 2020
4b2bf5b
update readme with diagrams
Nate711 Apr 11, 2020
9b43cb4
Merge pull request #1 from stanfordroboticsclub/master
fgolemo Aug 4, 2020
d9d765e
refactor into package
fgolemo Aug 4, 2020
cd8c489
updated gi
fgolemo Aug 4, 2020
49ba3c9
cleanup
fgolemo Aug 4, 2020
46e0add
Merge branch 'master' into sim
fgolemo Aug 4, 2020
0f3986d
small fixes
fgolemo Aug 4, 2020
6aaf855
added new pybullet sim
fgolemo Aug 9, 2020
16246ae
added halfcheetah-like walking environment
fgolemo Aug 9, 2020
63d41ed
highlights
fgolemo Aug 9, 2020
fc1b060
bugfix and improved example
fgolemo Aug 9, 2020
63eb854
bugfix: reset now returns observation
fgolemo Aug 13, 2020
b1dff94
scaled down action amount and torque
fgolemo Aug 13, 2020
3e4e3ae
cleanup - we don't need the woofer for what we're doing here and in c…
fgolemo Aug 28, 2020
551bc11
added chair generation
fgolemo Aug 29, 2020
3a79bdd
stairs are now fixed to each other and to the ground
fgolemo Aug 29, 2020
46d249d
added random colors for better visibility
fgolemo Aug 29, 2020
fe0a59b
typo
fgolemo Aug 29, 2020
e01f0ca
added black settings file
fgolemo Aug 29, 2020
f362812
added photo utility
fgolemo Aug 29, 2020
1d4a4b1
added renderer to env
fgolemo Aug 29, 2020
c69486f
added note
fgolemo Aug 29, 2020
1e9948b
added new envs and random position init
Sep 1, 2020
641c5af
added handful new environments, added action smoothing, modularized r…
fgolemo Sep 3, 2020
e1fa1da
bugfix
fgolemo Sep 3, 2020
803e802
added reward for stability
fgolemo Sep 3, 2020
bc43e8d
bugfix
fgolemo Sep 3, 2020
c23511d
bugfixes
fgolemo Sep 3, 2020
a406663
added new scaled-n-smoothed variant environment
fgolemo Sep 3, 2020
c4c3ad0
added support for reward monitoring
fgolemo Sep 9, 2020
0fd8667
added readme for the new environments
fgolemo Sep 9, 2020
b43fdf8
remove pip editable install files
Sep 10, 2020
a11b87b
switched from ssh to https in installation to prevent Docker ssh from…
Sep 10, 2020
74b04a9
added control over action_smoothing and RandomZRot
Sep 15, 2020
1717683
Merge pull request #2 from fgolemo/massimo
optimass Sep 15, 2020
9f4feef
merged & updated from @optimass PR
fgolemo Sep 17, 2020
4dd2dc2
made it so that the penalty for rotation is in range 0-3 not 0-(3*pi^2)
fgolemo Sep 17, 2020
93c7904
added stop_on_flip
fgolemo Sep 17, 2020
fe2499a
made the robot stuff optional
fgolemo Sep 23, 2020
6d19cf3
simple prototype for playing back the recording
fgolemo Sep 23, 2020
d216f8c
added glen playback stuff
fgolemo Sep 25, 2020
010ee74
both simulations side by side
fgolemo Sep 25, 2020
44ff6ca
made imitation learning env into gym env
fgolemo Sep 30, 2020
2503efe
made 48x48 the new default and added better camera position
fgolemo Sep 30, 2020
52cb841
added masking
fgolemo Sep 30, 2020
1148b0e
added incremental env
fgolemo Oct 6, 2020
0cc2c88
added policy
fgolemo Oct 7, 2020
740e244
added tools for recording hard-coded gait
fgolemo Oct 7, 2020
3cd466e
bugfix
fgolemo Oct 8, 2020
96ceb4d
added support to change the episode length
Oct 15, 2020
bc0cc1f
bug in the new episode length
Oct 15, 2020
139 changes: 137 additions & 2 deletions .gitignore
@@ -1,3 +1,138 @@
# Created by .ignore support plugin (hsz.mobi)
### Python template
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/
.idea/
wandb/
.DS_Store
*.DS_Store
*.pyc
**/__pycache__
*.hdf5
scripts/*.npz
scripts/*.pkl
21 changes: 0 additions & 21 deletions LICENSE

This file was deleted.

135 changes: 112 additions & 23 deletions README.md
@@ -1,41 +1,130 @@
# Stanford Quadruped

## Overview
This repository hosts the code for Stanford Pupper and Stanford Woofer, Raspberry Pi-based quadruped robots that can trot, walk, and jump.
# Stanford Quadruped Gym

This repo is adapted from the original Stanford Quadruped repo at https://github.com/stanfordroboticsclub/StanfordQuadruped; we've added an additional simulator and gym-compatible environments.

![Pupper CC Max Morse](https://live.staticflickr.com/65535/49614690753_78edca83bc_4k.jpg)

Video of pupper in action: https://youtu.be/NIjodHA78UE

Project page: https://stanfordstudentrobotics.org/pupper
## Installation

We assume you have an environment in which `pip` and `python` point to the Python 3 binaries. This repo was tested on Python 3.6.

```bash
git clone https://github.com/fgolemo/StanfordQuadruped.git
cd StanfordQuadruped
pip install -e .[sim]
```

#### Robot installation

```bash
sudo pip install -e .[robot]
```

## Getting Started

The new simulator lives in `stanford_quad/sim/simulator2.py` in the `PupperSim2` class. You can run the simulator in dev mode with:

python scripts/07-debug-new-simulator.py

## Gym Environments

There are currently 11 environments. All come in two variants: `Headless` and `Graphical`. `Headless` is meant for training and launches the PyBullet sim without a GUI; `Graphical` is meant for local debugging, offers a GUI for inspecting the robot, and is significantly slower.

You can try out one of the walking environments by running:

python scripts/08-run-walker-gym-env.py

### Walking

In all environments, the observations are the same:

**Observation space**:
- 12 leg joint angles, in this order:
    - front right hip
    - front right upper leg
    - front right lower leg
    - front left hip/upper/lower leg
    - back right hip/upper/lower leg
    - back left hip/upper/lower leg
- 3 body orientation angles (Euler angles)
- 2 linear velocity components (only along the plane; we don't care about the z velocity)

The joint angles are normalized to [-1,1], but the orientation and velocity can be arbitrarily large.
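
For orientation, here is a minimal sketch of how the observation vector can be split, assuming exactly the 12 + 3 + 2 ordering listed above (the helper name is ours, not the repo's):

```python
import numpy as np


def split_observation(obs: np.ndarray):
    # obs has 12 + 3 + 2 = 17 entries, in the order documented above
    joints = obs[:12]         # normalized joint angles in [-1, 1]
    orientation = obs[12:15]  # body orientation as Euler angles
    velocity = obs[15:17]     # linear velocity in the ground plane
    return joints, orientation, velocity
```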

The **action space** in both environments is also 12-dimensional (corresponding to the same 12 joints, in the same order as above) and also normalized to [-1,1], but the effect of an action differs between environments (see below).

In both environments, the goal is to walk/run straight forward as fast as possible. The **reward** is therefore calculated as the increase in x position relative to the previous timestep (minus a small penalty term for large action values, the same as in HalfCheetah-v2).
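
As a rough sketch of that reward (the penalty coefficient below is a placeholder; the actual coefficients live in the environment code):

```python
import numpy as np


def walking_reward(x_now, x_prev, action, action_cost=0.01):
    # Forward progress since the previous step...
    forward_progress = x_now - x_prev
    # ...minus a small HalfCheetah-style penalty on large actions.
    control_penalty = action_cost * float(np.square(action).sum())
    return forward_progress - control_penalty
```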

#### Parameters

There are several settings for the walking environment and a handful of combinations of these parameters have been given dedicated names. Here's the list of parameters and their meaning:

- `debug` (bool): If True, this turns ON the GUI. The `[Headless|Graphical]` part of the environment name usually indicates that this is False/True respectively.
- `steps` (int): Length of an episode; default 120, which corresponds to 2 seconds at 60 Hz.
- `relative_action` (bool): If False, action commands correspond directly to the joint positions. If True, the actions are added to the stable resting position at each step, i.e. instead of `robot.move(action)` it's `robot.move(REST_POSE + action)`.
- `action_scaling` (float): By default, the robot has a large movement range and very responsive joints, so the policy can pick the maximum negative joint position in one step and the maximum positive joint position in the next. This causes a lot of jitter. To reduce it, this setting lets you restrict the movement range. Best used in combination with `relative_action`.
- `action_smoothing` (int): Another way to reduce jitter. If this is larger than 1, actions aren't applied to the robot directly; instead they go into a queue of this length, and at each step the mean of the queue is applied to the robot (see the sketch after this list).
- `random_rot` (3-tuple of floats): Specifies the initial random rotation. The three elements correspond to rotations around the x/y/z axes respectively. The rotations are drawn from a normal distribution centered at 0, and this value sets the variance on each axis. Values are in degrees.
- `reward_stability` (float): Specifies the coefficient of the IMU reward term that encourages stability. By default it's 0.
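
Here is a minimal sketch of the `action_smoothing` idea (class and variable names are illustrative, not the repo's internals):

```python
from collections import deque

import numpy as np


class ActionSmoother:
    """Apply the mean of the last `horizon` actions instead of the raw action."""

    def __init__(self, horizon: int, action_dim: int = 12):
        # Pre-fill with zeros so the very first steps are well defined.
        self.queue = deque([np.zeros(action_dim)] * horizon, maxlen=horizon)

    def __call__(self, action: np.ndarray) -> np.ndarray:
        self.queue.append(np.asarray(action))
        return np.mean(self.queue, axis=0)
```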

Based on our previous experiments, the following set of parameters seems to perform best (corresponding to the environment **`Pupper-Walk-Relative-ScaledNSmoothed3-RandomZRot-Headless-v0`**):

```python
params = {
    "debug": False,
    "steps": 120,
    "relative_action": True,
    "action_scaling": 0.3,
    "action_smoothing": 3,
    "random_rot": (0, 0, 15),
    "reward_stability": 0,
}
```
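
For reference, a minimal rollout loop might look like this. It assumes the environments are registered with gym when the `stanford_quad` package is imported; if the registration hook lives elsewhere, adjust the import accordingly:

```python
import gym
import stanford_quad  # noqa: F401 -- assumed to register the Pupper-Walk-* envs

env = gym.make("Pupper-Walk-Relative-ScaledNSmoothed3-RandomZRot-Headless-v0")

obs = env.reset()
done = False
episode_return = 0.0
while not done:
    action = env.action_space.sample()  # replace with your policy
    obs, reward, done, info = env.step(action)
    episode_return += reward
print("episode return:", episode_return)
```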

#### Pupper-Walk-Absolute-[Headless|Graphical]-v0

In this environment, you have full absolute control over all joints, and their resting position is set to 0 (which looks unnatural: the robot stands fully straight with its legs extended). Any action command is sent directly to the joints.

#### Pupper-Walk-Relative-[Headless|Graphical]-v0

In this env, your actions are relative to the resting position (roughly the pose in the image at the top). That means an action of `[0]*12` puts the Pupper into a stable rest. Action clipping is done after summing the current action and the action corresponding to the resting position, which makes the action space asymmetric: e.g. if a given joint's resting position is at `0.7`, then the action range for that joint is `[-1.7, 0.3]`. This is intentional because it allows the Pupper to start from a stable position.
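
A small sketch of that clipping behaviour (the resting pose values here are placeholders, not the real `REST_POSE`):

```python
import numpy as np

REST_POSE = np.full(12, 0.7)  # placeholder resting pose, not the repo's actual values


def relative_to_joint_targets(action: np.ndarray) -> np.ndarray:
    # Sum first, then clip to the joint range, so an all-zero action
    # lands exactly on the resting pose.
    return np.clip(REST_POSE + action, -1.0, 1.0)
```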

#### Pupper-Walk-Relative-ScaledDown_[0.05|0.1|0.15|...|0.5]-[Headless|Graphical]-v0

Similar to `Pupper-Walk-Relative`, but here the actions are multiplied by a factor (given in the environment name) to reduce the range of motion.

#### Pupper-Walk-Relative-ScaledDown-RandomZRot-[Headless|Graphical]-v0

Like `Pupper-Walk-Relative-ScaledDown_0.3` but with a random initial z rotation (drawn from a normal distribution centered at 0 with a variance of 15 degrees).

#### Pupper-Walk-Relative-ScaledNSmoothed3-RandomZRot-[Headless|Graphical]-v0

Like `Pupper-Walk-Relative-ScaledDown-RandomZRot` but with an additional action smoothing of **3**.

#### Pupper-Walk-Relative-ScaledNSmoothed5-RandomZRot-[Headless|Graphical]-v0

Like `Pupper-Walk-Relative-ScaledDown-RandomZRot` but with an additional action smoothing of **5**.

#### Pupper-Walk-Relative-Smoothed5-RandomZRot-[Headless|Graphical]-v0

Like `Pupper-Walk-Relative-ScaledNSmoothed5-RandomZRot` but with the actions only averaged over a queue of length 5, not scaled.

#### Pupper-Walk-Relative-RewardStable0.5-[Headless|Graphical]-v0

Like `Pupper-Walk-Relative` but with the additional reward term that encourages a body orientation close to zero. The coefficient for the reward term is 0.5.

#### Pupper-Walk-Relative-RewardStable0.5-ScaledDown3-[Headless|Graphical]-v0

Like `Pupper-Walk-Relative-RewardStable0.5` but additionally with the actions scaled by 0.3.

#### Pupper-Walk-Relative-RewardStable0.5-ScaledDown-RandomZRot-[Headless|Graphical]-v0

Like `Pupper-Walk-Relative-RewardStable0.5-ScaledDown3` but with an additional random initial rotation around the z axis.

#### Pupper-Walk-Relative-RewardStable0.5-ScaledNSmoothed-RandomZRot-[Headless|Graphical]-v0

Like `Pupper-Walk-Relative-RewardStable0.5-ScaledDown-RandomZRot` but additionally with the actions smoothed over a queue of length 3.

Documentation & build guide: https://pupper.readthedocs.io/en/latest/

## How it works
![Overview diagram](imgs/diagram1.jpg)
The main program is ```run_robot.py``` which is located in this directory. The robot code is run as a loop, with a joystick interface, a controller, and a hardware interface orchestrating the behavior.

The joystick interface is responsible for reading joystick inputs from a UDP socket and converting them into a generic robot ```command``` type. A separate program, ```joystick.py```, publishes these UDP messages and is responsible for reading inputs from the PS4 controller over bluetooth. The controller does the bulk of the work, switching between states (trot, walk, rest, etc.) and generating servo position targets. A detailed model of the controller is shown below. The third component of the code, the hardware interface, converts the position targets from the controller into PWM duty cycles, which it then passes to a Python binding to ```pigpiod```, which then generates PWM signals in software and sends these signals to the motors attached to the Raspberry Pi.
![Controller diagram](imgs/diagram2.jpg)
This diagram shows a breakdown of the robot controller. Inside, you can see four primary components: a gait scheduler (also called gait controller), a stance controller, a swing controller, and an inverse kinematics model.

The gait scheduler is responsible for planning which feet should be on the ground (stance) and which should be moving forward to the next step (swing) at any given time. In a trot for example, the diagonal pairs of legs move in sync and take turns between stance and swing. As shown in the diagram, the gait scheduler can be thought of as a conductor for each leg, switching it between stance and swing as time progresses.

The stance controller controls the feet on the ground, and is actually quite simple. It looks at the desired robot velocity, and then generates a body-relative target velocity for these stance feet that is in the opposite direction as the desired velocity. It also incorporates turning, in which case it rotates the feet relative to the body in the opposite direction as the desired body rotation.

The swing controller picks up the feet that just finished their stance phase, and brings them to their next touchdown location. The touchdown locations are selected so that the foot moves the same distance forward in swing as it does backwards in stance. For example, if in stance phase the feet move backwards at -0.4m/s (to achieve a body velocity of +0.4m/s) and the stance phase is 0.5 seconds long, then we know the feet will have moved backwards -0.20m. The swing controller will then move the feet forwards 0.20m to put the foot back in its starting place. You can imagine that if the swing controller only put the leg forward 0.15m, then every step the foot would lag more and more behind the body by -0.05m.

Both the stance and swing controllers generate target positions for the feet in cartesian coordinates relative to the body center of mass. It's convenient to work in cartesian coordinates for the stance and swing planning, but we now need to convert them to motor angles. This is done by using an inverse kinematics model, which maps between cartesian body coordinates and motor angles. These motor angles, also called joint angles, are then populated into the ```state``` variable and returned by the model.

## How to Build Pupper
Main documentation: https://pupper.readthedocs.io/en/latest/

You can find the bill of materials, pre-made kit purchasing options, assembly instructions, software installation, etc. at this website.

## Help
- Feel free to raise an issue (https://github.com/stanfordroboticsclub/StanfordQuadruped/issues/new/choose) or email me at nathankau [at] stanford [dot] edu
- We also have a Google group set up here: https://groups.google.com/forum/#!forum/stanford-quadrupeds


13 changes: 0 additions & 13 deletions pupper/HardwareConfig.py

This file was deleted.

3 changes: 3 additions & 0 deletions pyproject.toml
@@ -0,0 +1,3 @@
[tool.black]
line-length = 120
target-version = ['py37']
2 changes: 1 addition & 1 deletion robot.service
@@ -5,7 +5,7 @@ After=joystick.service

[Service]
ExecStartPre=-sudo pigpiod
ExecStart=/usr/bin/python3 /home/pi/StanfordQuadruped/run_robot.py
ExecStart=/usr/bin/python3 /home/pi/StanfordQuadruped/scripts/run_robot.py
KillSignal=2
TimeoutStopSec=10
