Supermarket Scene Description using FasterRCNN + KMeans + CLIP + GPT and Product Retrieval using DenseNet.

growupboron/ObjectDescriptionSupermarket-CVCS-UniMoRe


Object Detection and Scene Description in a Supermarket

This is a course project for the postgraduate-level course Computer Vision and Cognitive Systems, taught at DIEF, UniMoRe.

Datasets

  • For the Object Detection task, we use the SKU110K dataset.
  • For product classification, and for the embeddings used in the product retrieval task, we use the GroceryStoreDataset.

Training and Experimentation

To train the Faster RCNN model for object detection:

sbatch frcnn.slurm

To train the DenseNet 121 model for product classification and for the embeddings used in product retrieval:

sbatch clf.slurm

Implementation and Inference

Object Detection and Scene Description

  • The complete pipeline consists of the following steps:
    • Classical Scene Image Preprocessing (Histogram Equalization)
    • Inference of both models: Faster RCNN and DenseNet 121 (commented out)
    • Shelf numbering: K Means with Silhouette Analysis
    • Dominant colour recognition (commented out)
    • Zero-Shot Product Detection using CLIP (Contrastive Language-Image Pre-training) model
    • Spatial Description through geometrical templating
    • Concise Scene Description using ChatGPT 3.5 Turbo through the OpenAI API

Set the OpenAI API key, then submit the inference job:

export OPENAI_API_KEY=entergeneratedAPIKey

sbatch inference.slurm
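
The histogram equalization step listed above can be illustrated with a minimal, standard-library-only sketch. This is not the repository's code: the function name `equalize_histogram`, the list-of-rows image format, and the 256-level assumption are all hypothetical, and the actual script may well call a library routine instead.

```python
def equalize_histogram(image, levels=256):
    """Equalize a grayscale image given as a list of rows of ints in [0, levels).

    Hypothetical helper for illustration only.
    """
    pixels = [p for row in image for p in row]

    # Count how often each intensity occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of intensities.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: there is nothing to spread out.
        return [row[:] for row in image]

    # Map each intensity so that the output CDF is roughly linear.
    lut = [
        max(0, round((c - cdf_min) / (total - cdf_min) * (levels - 1)))
        for c in cdf
    ]
    return [[lut[p] for p in row] for row in image]


if __name__ == "__main__":
    low_contrast = [[50, 50, 60], [60, 70, 80]]
    print(equalize_histogram(low_contrast))
```

The darkest input intensity is mapped to 0 and the brightest to `levels - 1`, which stretches a low-contrast shelf image across the full range before detection.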

Figure: pipeline
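
The shelf numbering step (K Means with Silhouette Analysis) can be sketched as follows. This is a simplified illustration under several assumptions that the README does not confirm: boxes are clustered only by their vertical centre, the number of shelves is the `k` with the highest mean silhouette score, and the helper names (`kmeans_1d`, `silhouette`, `number_shelves`) and the `max_shelves` limit are hypothetical. The repository's script may use a library implementation and different features.

```python
def kmeans_1d(values, k, iterations=50):
    """Lloyd's algorithm on scalars with a deterministic quantile initialisation."""
    distinct = sorted(set(values))
    n = len(distinct)
    centers = [distinct[int((i + 0.5) * n / k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign every value to its nearest centre.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Move every centre to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def silhouette(values, labels):
    """Mean silhouette coefficient for a 1-D clustering."""
    clusters = {}
    for v, lab in zip(values, labels):
        clusters.setdefault(lab, []).append(v)
    if len(clusters) < 2:
        return -1.0
    scores = []
    for v, lab in zip(values, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        a = sum(abs(v - o) for o in own) / (len(own) - 1)
        b = min(
            sum(abs(v - o) for o in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def number_shelves(boxes, max_shelves=6):
    """boxes: (x_min, y_min, x_max, y_max). Returns a shelf index per box, 1 = top."""
    centers = [(y0 + y1) / 2 for _, y0, _, y1 in boxes]
    upper = min(max_shelves, len(set(centers)) - 1)
    best_labels, best_score = [0] * len(boxes), -1.0
    for k in range(2, upper + 1):
        labels = kmeans_1d(centers, k)
        score = silhouette(centers, labels)
        if score > best_score:
            best_labels, best_score = labels, score
    # Renumber clusters from top to bottom by their mean vertical position.
    means = {}
    for lab in set(best_labels):
        members = [c for c, bl in zip(centers, best_labels) if bl == lab]
        means[lab] = sum(members) / len(members)
    rank = {lab: i + 1 for i, lab in enumerate(sorted(means, key=means.get))}
    return [rank[lab] for lab in best_labels]
```

Calling `number_shelves` on the detector's bounding boxes yields one shelf index per box, which later steps can use when describing where a product sits.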

Retrieval Mechanism

Retrieval was initially prototyped in Google Colab: https://colab.research.google.com/drive/1HXn3XRod3_6CHOes7aB0bJltz-IJagRP?usp=sharing

sbatch retrival.slurm

Figure: retrieval
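
As a rough illustration of embedding-based retrieval, the sketch below ranks gallery products by their similarity to a query embedding. The README does not state which similarity measure the retrieval script uses, so cosine similarity is an assumption here, and the function names, the dictionary-based gallery, and the toy 3-dimensional vectors are hypothetical stand-ins for the DenseNet 121 embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query, gallery, top_k=3):
    """gallery maps a product name to its embedding.

    Returns up to top_k (name, score) pairs, most similar first.
    """
    scored = [(name, cosine_similarity(query, emb)) for name, emb in gallery.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Toy embeddings; real ones would come from the trained network.
    gallery = {
        "apple": [1.0, 0.0, 0.0],
        "pear": [0.6, 0.8, 0.0],
        "milk": [0.0, 0.0, 1.0],
    }
    for name, score in retrieve([1.0, 0.05, 0.0], gallery, top_k=2):
        print(name, round(score, 3))
```

A linear scan like this is adequate for a small gallery; a larger catalogue would usually call for an approximate nearest-neighbour index.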

(Additional modifications can be made by editing the Python scripts mentioned in the corresponding slurm files.)
