
4b Human perception

Hou Yujun edited this page Jul 25, 2024 · 2 revisions

Getting human perception scores from street-level imagery. The perception categories are safety, lively, beautiful, wealthy, boring, and depressing.

The scores are on a scale of 0–10.

For safety, lively, beautiful, and wealthy, a high score indicates a strong positive feeling.

For boring and depressing, a high score indicates a strong negative feeling.

Model

The models are pretrained on the MIT Place Pulse 2.0 dataset. The backbone of the model is a Vision Transformer (ViT) pretrained on ImageNet (ViT_B_16_Weights.IMAGENET1K_SWAG_E2E_V1). We added three Linear layers with ReLU activations to the ViT heads for classification.

Code snippet:

# Three-layer MLP replacing the ViT classification head;
# num_fc is the backbone's feature dimension, num_class the number of outputs.
nn.Linear(num_fc, 512, bias=True),
nn.ReLU(True),
nn.Linear(512, 256, bias=True),
nn.ReLU(True),
nn.Linear(256, num_class, bias=True)

The model structure can be found in code/model_training/perception/Model_01.py. The pretrained models will be downloaded automatically when inference.py is run (recommended method). You can also manually download the models here.

How to run the model

Set up the environment with requirements-cv-linux.txt.

Input

The input CSV should:

  • have each row representing an image to process, and
  • contain minimally two columns, named uuid and path, to specify image UUID and the local image file path, respectively
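An input CSV satisfying the two requirements above can be generated like this. The file names and paths are hypothetical examples; only the uuid and path column names come from the spec.

```python
import pandas as pd

# Hypothetical rows; uuid and path are the two required columns.
df = pd.DataFrame({
    "uuid": ["img_0001", "img_0002"],
    "path": ["/data/images/img_0001.jpg", "/data/images/img_0002.jpg"],
})
df.to_csv("input.csv", index=False)
print(df.to_csv(index=False))
```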

Output

One CSV for each perception dimension.

Each CSV contains two columns: the image uuid and the inferred perception score.
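Since each dimension is written to its own CSV, it can be convenient to merge the six files into one wide table keyed on uuid. A minimal sketch with pandas, assuming a score column name (the actual column name in the output CSVs may differ) and using toy in-memory frames in place of the real files:

```python
import pandas as pd
from functools import reduce

def merge_scores(per_dim: dict) -> pd.DataFrame:
    """Merge one score DataFrame per perception dimension on uuid."""
    frames = [df.rename(columns={"score": dim}) for dim, df in per_dim.items()]
    return reduce(lambda a, b: a.merge(b, on="uuid"), frames)

# Toy data standing in for pd.read_csv on each per-dimension output CSV.
per_dim = {
    "safety": pd.DataFrame({"uuid": ["a", "b"], "score": [6.1, 4.3]}),
    "lively": pd.DataFrame({"uuid": ["a", "b"], "score": [7.0, 5.2]}),
}
wide = merge_scores(per_dim)
print(wide.columns.tolist())  # ['uuid', 'safety', 'lively']
```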

To reproduce sample_output

Modify out_Path in inference.py to the directory where you wish to store the output CSVs, then run:

python3 inference.py

To run inference for your own image/images

Modify inference.py:

  1. Modify out_Path to the directory where you wish to store the output CSVs
  2. Modify in_Path to the path of your input CSV

Run:

python3 inference.py

Acknowledgements

Our work in human perception builds on and uses code from human-perception-place-pulse developed by Ouyang (2023).