
Commit

Merge branch 'refs/heads/master' into lis-spot/implement-vlm-predicate-eval

# Conflicts:
#	predicators/pretrained_model_interface.py
#	setup.py
lf-zhao committed Apr 30, 2024
2 parents f884df7 + a8151a2 commit 8c17c42
Showing 50 changed files with 3,041 additions and 303 deletions.
15 changes: 7 additions & 8 deletions .github/workflows/predicators.yml
@@ -7,7 +7,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: [3.8]
python-version: ["3.10.14"]
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
@@ -28,7 +28,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: [3.8]
python-version: ["3.10.14"]
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
@@ -40,16 +40,15 @@ jobs:
- name: Install dependencies
run: |
pip install -e .
pip install -U git+https://github.com/python/mypy.git@9a10967fdaa2ac077383b9eccded42829479ef31
# Note: if mypy issue #5485 gets resolved, we can install from head again.
pip install mypy==1.8.0
- name: Mypy
run: |
mypy . --config-file mypy.ini
lint:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: [3.8]
python-version: ["3.10.14"]
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
@@ -71,7 +70,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: [3.8]
python-version: ["3.10.14"]
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
@@ -95,7 +94,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: [3.8]
python-version: ["3.10.14"]
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
@@ -116,7 +115,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: [3.8]
python-version: ["3.10.14"]
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
2 changes: 1 addition & 1 deletion .gitignore
@@ -19,7 +19,7 @@ logs
saved_approaches
saved_datasets
scripts/results
llm_cache
pretrained_model_cache
machines.txt
*_vision_data
tests/_fake_trajs
2 changes: 1 addition & 1 deletion README.md
@@ -28,7 +28,7 @@ Methods for predicate learning are implemented as Approaches (e.g., `predicators
A simple implementation of search-then-sample bilevel planning is provided in `predicators/planning.py`. This implementation uses the "SeSamE" strategy: SEarch-and-SAMple planning, then Execution.

## Installation
* This repository uses Python versions 3.8+.
* This repository uses Python versions 3.10-3.11. We recommend 3.10.14.
* Run `pip install -e .` to install dependencies.

## Instructions For Running Code
12 changes: 11 additions & 1 deletion mypy.ini
@@ -2,7 +2,7 @@
strict_equality = True
disallow_untyped_calls = True
warn_unreachable = True
exclude = (predicators/envs/assets|predicators/third_party|venv)
exclude = (predicators/envs/assets|venv)

[mypy-predicators.*]
disallow_untyped_defs = True
@@ -15,6 +15,7 @@ ignore_missing_imports = True

[mypy-predicators.third_party.*]
ignore_missing_imports = True
ignore_errors = True

[mypy-setuptools.*]
ignore_missing_imports = True
@@ -127,3 +128,12 @@ ignore_missing_imports = True

[mypy-pbrspot.*]
ignore_missing_imports = True

[mypy-ImageHash.*]
ignore_missing_imports = True

[mypy-google.*]
ignore_missing_imports = True

[mypy-google.generativeai.*]
ignore_missing_imports = True
8 changes: 4 additions & 4 deletions predicators/approaches/active_sampler_learning_approach.py
@@ -737,13 +737,13 @@ def _wrap_object_specific_samplers(
base_sampler: NSRTSampler,
) -> NSRTSampler:

def _wrapped_sampler(state: State, goal: Set[GroundAtom],
rng: np.random.Generator,
objects: Sequence[Object]) -> Array:
def _wrapped_sampler(
state: State, goal: Set[GroundAtom], rng: np.random.Generator,
objects: Sequence[Object]) -> Array: # pragma: no cover
objects_tuple = tuple(objects)
# If we haven't yet learned a object-specific sampler for these objects
# then use the base sampler.
if objects_tuple not in object_specific_samplers: # pragma: no cover
if objects_tuple not in object_specific_samplers:
return base_sampler(state, goal, rng, objects)
sampler = object_specific_samplers[objects_tuple]
return sampler(state, goal, rng, objects)
@@ -0,0 +1,122 @@
# Grammar Search Invention Approach
This approach is primarily useful for inventing predicates via program synthesis from demonstrations, as described originally in:
[Predicate Invention for Bilevel Planning](https://arxiv.org/abs/2203.09634). Silver*, Chitnis*, Kumar, McClinton, Lozano-Perez, Kaelbling, Tenenbaum. AAAI 2023.

An example command for running the approach from that paper is:
```
python predicators/main.py --env cover --approach grammar_search_invention --excluded_predicates all --num_train_tasks 50
```

Last updated: 04/28/2024

## Inventing predicates by leveraging a VLM
We can leverage a VLM to propose concepts that form the basis of the grammar used for predicate invention. This has two advantages: (1) invented predicates operate directly on images, and (2) the names of predicates correspond to common-sense concepts.

To do this, we need to supply demonstrations in the form of a sequence of images together with labelled options: the `_Option` the robot executed to transition between consecutive states (i.e., between consecutive images).

### Creating datasets for VLM predicate invention
Demonstrations should be saved as a subfolder in the `saved_datasets` folder. The folder should be named `<env_name>__vlm_demos__<seed>__<num_demos>`, for instance `apple_coring__vlm_demos__456__1`.
Within this folder, there should be one subfolder per demonstration trajectory (so in the above example, there should be exactly one subfolder). Name each of these subfolders `traj_<demonstration_number>` with 0-indexing (e.g., `traj_0` for the first demo).
Within each traj subfolder, there should be two things:
1. a subfolder for each timestep of the demonstration, containing all the images (potentially from multiple camera views) that make up the observation at that timestep.
2. an `options_traj.txt` file that lists the sequence of options executed between consecutive states.

The `options_traj.txt` file should contain strings corresponding to the options executed as part of the trajectory. The format for each option should be `<option_name>(<objects>, [<continuous_params>])`.
An example file might look like:
```
pick(apple, [])
place_on(apple, plate, [])
pick(slicing_tool, [])
slice(slicing_tool, apple, hand, [])
```

Given this, a sample folder structure for a demonstration might look like:
```
apple_coring__vlm_demos__456__2
| traj_0
    | 0
        | 0.jpg
    | 1
        | 1.jpg
    | 2
        | 2.jpg
    | 3
        | 3.jpg
    | 4
        | 4.jpg
    | 5
        | 5.jpg
    | options_traj.txt
| traj_1
    | 0
        | 0.jpg
    | 1
        | 1.jpg
    | 2
        | 2.jpg
    | 3
        | 3.jpg
    | 4
        | 4.jpg
    | 5
        | 5.jpg
    | options_traj.txt
```
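
As a rough illustration only (this is a sketch, not code from this repository), a demonstration could be written out in this layout with a small helper like the one below; the `save_vlm_demo` function and its arguments are hypothetical stand-ins for your own data-collection pipeline:

```
from pathlib import Path
import shutil

def save_vlm_demo(dataset_root, traj_idx, image_paths_per_step, option_strs):
    """image_paths_per_step: one list of image files per timestep (multiple
    camera views allowed); option_strs: one string per executed option,
    e.g. 'pick(apple, [])'."""
    traj_dir = Path(dataset_root) / f"traj_{traj_idx}"
    for t, step_images in enumerate(image_paths_per_step):
        step_dir = traj_dir / str(t)
        step_dir.mkdir(parents=True, exist_ok=True)
        for img_path in step_images:
            # Copy every camera view into this timestep's folder.
            shutil.copy(img_path, step_dir)
    # One executed option per line, in order.
    (traj_dir / "options_traj.txt").write_text("\n".join(option_strs) + "\n")
```

For example, calling this with six lists of per-timestep image paths and the four option strings above would produce a `traj_0` folder like the one shown.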

### Running predicate invention using these image demos
To use the Gemini VLM, you need to set the `GOOGLE_API_KEY` environment variable in your terminal. You can make/get an API key [here](https://aistudio.google.com/app/apikey).
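
For example, in a bash or zsh shell (substituting your own key):

```
export GOOGLE_API_KEY=<your-api-key>
```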

Example command: `python predicators/main.py --env apple_coring --seed 456 --approach grammar_search_invention --excluded_predicates all --num_train_tasks 1 --num_test_tasks 0 --offline_data_method img_demos --vlm_trajs_folder_name apple_coring__vlm_demos__456__1`

The important flags here are the `--offline_data_method img_demos` and the `--vlm_trajs_folder_name apple_coring__vlm_demos__456__1`. The latter should point to the folder housing the demonstration set of interest!

Note that VLM responses are always cached, so rerunning the command on the same demonstration set should be much faster, since cached responses are reused.

Also, the code saves a human-readable txt file to the `saved_datasets` folder that contains a text representation of the GroundAtomTrajectories. You can manually inspect and even edit this file, and then rerun the rest of the predicate invention pipeline starting from this file alone (and not the original demos) as input. Here's an example command that does that:
`python predicators/main.py --env apple_coring --seed 456 --approach grammar_search_invention --excluded_predicates all --num_train_tasks 1 --offline_data_method demo+labelled_atoms --handmade_demo_filename apple_coring__demo+labelled_atoms__manual__1.txt`

where `apple_coring__demo+labelled_atoms__manual__1.txt` is the human-readable txt file.

### Structure of human-readable txt files
We assume the txt files have a particular structure that we leverage for parsing. To explain the components of this structure, consider the example below:

```
===
{*Holding(spoon): True.
*Submerged(teabag): False.
*Submerged(spoon): False.} ->
pick(teabag, hand)[] ->
{*Holding(spoon): True.
*Submerged(teabag): False.
*Submerged(spoon): False.} ->
place_in(teabag, cup)[] ->
{*Holding(spoon): True.
*Submerged(teabag): False.
*Submerged(spoon): False.} ->
pick(spoon, hand)[] ->
{*Holding(spoon): True.
*Submerged(teabag): False.
*Submerged(spoon): False.} ->
place_in(spoon, cup)[] ->
{*Holding(spoon): True.
*Submerged(teabag): False.
*Submerged(spoon): False.}
===
```

**Components**
- Separator: '===' is used to separate one trajectory from another (so a trajectory is sandwiched between two lines that have only '===' on them). In the above example, there is exactly one demonstration trajectory.
- State: Each state is a bulleted list of atoms enclosed in set brackets {}. In the above example, there are 5 states. Note importantly that the format of every atom should be `*<predicate_name>(<obj1_name>, <obj2_name>, ...): <True or False>.`. The `*` at the start and the period `.` at the end are very important.
- Skill: Each skill is sandwiched between two states and takes the format `<skill_name>(<obj1_name>, <obj2_name>, ...)[<continuous_param_vector>]`. In the above example, there are 4 skills. Note that every state except the last is followed by a `->` symbol and a newline, then a skill, then another `->` and a newline; this is also critical for parsing. Note also that the above example doesn't feature any continuous parameters.
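
As an illustration only (a minimal sketch, not the repository's actual parser), the separator, state, and skill conventions above are enough to recover states and skills with a few lines of Python:

```
import re

def parse_handmade_trajs(text):
    """Yield (states, skills) per trajectory: each state is a list of
    (predicate, args, truth_value) tuples, each skill a raw string."""
    atom_pattern = re.compile(r"\*(\w+)\(([^)]*)\): (True|False)\.")
    # Trajectories are sandwiched between lines containing only '==='.
    for block in re.split(r"^===\s*$", text, flags=re.MULTILINE):
        block = block.strip()
        if not block:
            continue
        # States (in braces) and skills alternate, separated by '->'.
        parts = [p.strip() for p in block.split("->")]
        states = [atom_pattern.findall(p) for p in parts if p.startswith("{")]
        skills = [p for p in parts if p and not p.startswith("{")]
        yield states, skills
```

On the example trajectory above, this sketch would recover 5 states and 4 skills.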


### Future features to be added
* Enable the pipeline to consider demonstrations that have low-level object-oriented state as well as image observations.
* Enable invented VLM predicates to actually be used and run at test-time.
* Consider different VLMs.