Image Captioning Project

This project is a Flask web application that lets users upload images and generates captions for them with a custom AI model. The model combines EfficientNet as the Convolutional Neural Network (CNN) component for image feature extraction, a custom Long Short-Term Memory (LSTM) network for sequence generation, and a multihead attention layer. The model has an accuracy of 42%.

Screenshots

HomePage

Upload Image

Image Uploaded

Image captioned

Directory Structure

Image Captioning Project
│
├── app.py
├── prediction.py
├── templates
│   └── index.html
├── static
│   ├── css
│   ├── images
│   └── media
├── Models
│   ├── model.h5
│   └── tokenizer.json
├── test images
└── model_training.ipynb

app.py: Main Flask application file.
prediction.py: Contains the logic for image captioning using the AI model.
templates/index.html: HTML template for the main page.
static/css: Directory for CSS files.
static/images: Directory for image files.
static/media: Directory for media files.
Models: Directory containing the pre-trained model and tokenizer.
test images: Directory containing test images.
model_training.ipynb: Jupyter notebook containing the code for training the AI model.

Setup and Installation

Prerequisites

Python 3.8 or higher
Pip (Python package installer)
Jupyter Notebook (for running model_training.ipynb)

Installation

Clone the repository:

git clone https://github.com/harshit433/Image-Captioning-Cantilever-.git
cd Image-Captioning-Cantilever-

Create and activate a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`

Install the required packages:
```
pip install -r requirements.txt
```

Create `requirements.txt`

If requirements.txt is not already present, you can generate it with the following command after installing the necessary packages:

pip freeze > requirements.txt

Running the Application

Start the Flask application:
```
python app.py
```
Open your browser and go to:
```
http://127.0.0.1:5000/
```

Training the Model

To train the model, open model_training.ipynb in Jupyter Notebook and run the cells in order. The notebook trains the AI model, which uses EfficientNet for the CNN component, a custom LSTM network, and a multihead attention layer.

Usage

Upload an Image: Click on the "Choose File" button to select an image from your computer.
Generate Caption: After selecting the image, click on the "Upload" button to generate a caption for the image.
View Result: The generated caption and the uploaded image will be displayed on the same page.

Project Details

app.py

This is the main Flask application file. It runs the web server, defines the routes, and handles image uploads and caption generation.
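
The upload handling described above usually needs two small checks before a file is written to disk: is it an image, and where should it go. The following is a minimal sketch of such helpers using only the standard library. It is not taken from app.py; the names `is_allowed_image`, `build_upload_path`, `UPLOAD_DIR`, and the list of accepted extensions are assumptions made for illustration. Only the static/media destination comes from this README.

```python
import uuid
from pathlib import Path

# Assumed settings: this README only states that uploads live in static/media.
UPLOAD_DIR = Path("static") / "media"
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # hypothetical whitelist


def is_allowed_image(filename: str) -> bool:
    """Return True if the filename has one of the accepted image extensions."""
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS


def build_upload_path(filename: str) -> Path:
    """Build a collision-free destination path inside UPLOAD_DIR.

    Only the extension of the client-supplied name is kept, so path
    components such as '../' never reach the filesystem.
    """
    if not is_allowed_image(filename):
        raise ValueError(f"unsupported file type: {filename!r}")
    suffix = Path(filename).suffix.lower()
    return UPLOAD_DIR / f"{uuid.uuid4().hex}{suffix}"
```

A Flask route could call `build_upload_path(file.filename)` on the uploaded file and save it to the returned path before passing that path to the captioning code.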

prediction.py

This file contains the core logic for generating captions using a custom AI model. The model utilizes EfficientNet for the CNN component, a custom LSTM network, and a multihead attention layer. The model and tokenizer are loaded from the Models directory.
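
Caption generators of this kind typically produce text one word at a time. The sketch below shows greedy decoding, one common strategy, with a stand-in scoring function so that it runs without the trained model. It is not the code from prediction.py: the `startseq`/`endseq` marker tokens, the function names, and the `max_length` default are assumptions, and the real decoding strategy may differ.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical marker tokens; the real tokenizer may use different ones.
START_TOKEN = "startseq"
END_TOKEN = "endseq"


def greedy_caption(
    predict_next: Callable[[List[int]], Sequence[float]],
    word_index: Dict[str, int],
    max_length: int = 20,
) -> str:
    """Generate a caption by repeatedly picking the highest-scoring next word.

    predict_next takes the token ids produced so far and returns one score
    per vocabulary id. In the real application it would wrap the trained
    model together with the encoded image features.
    """
    index_word = {idx: word for word, idx in word_index.items()}
    tokens = [word_index[START_TOKEN]]
    words: List[str] = []
    for _ in range(max_length):
        scores = predict_next(tokens)
        next_id = max(range(len(scores)), key=lambda i: scores[i])
        word = index_word.get(next_id)
        if word is None or word == END_TOKEN:
            break  # stop on the end marker or an unknown id
        words.append(word)
        tokens.append(next_id)
    return " ".join(words)


# Stand-in for the model: always follows a fixed script of token ids.
demo_vocab = {"startseq": 1, "a": 2, "dog": 3, "runs": 4, "endseq": 5}
demo_script = [2, 3, 4, 5]


def fake_model(tokens: List[int]) -> List[float]:
    scores = [0.0] * 6
    scores[demo_script[len(tokens) - 1]] = 1.0
    return scores


print(greedy_caption(fake_model, demo_vocab))  # a dog runs
```

Greedy decoding is the simplest choice; beam search keeps several candidate captions at each step and often produces better results at higher cost.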

model_training.ipynb

This Jupyter notebook contains the code for training the AI model. It includes data preprocessing, model architecture, training loop, and evaluation metrics.
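
As an illustration of the data preprocessing step, the sketch below cleans raw captions, builds a vocabulary, and encodes a caption as a fixed-length list of ids. It is a generic example, not the notebook's code: the `startseq`/`endseq` markers, the function names, and the choice to pad at the end with zeros are all assumptions.

```python
import re
from collections import Counter
from typing import Dict, List


def clean_caption(caption: str) -> str:
    """Lowercase, drop non-letter characters, and add start/end markers."""
    words = re.sub(r"[^a-z ]+", " ", caption.lower()).split()
    return " ".join(["startseq", *words, "endseq"])


def build_vocab(captions: List[str]) -> Dict[str, int]:
    """Map each word to an id, most frequent first; id 0 is kept for padding."""
    counts = Counter(word for caption in captions for word in caption.split())
    ordered = sorted(counts, key=lambda word: (-counts[word], word))
    return {word: idx for idx, word in enumerate(ordered, start=1)}


def encode(caption: str, word_index: Dict[str, int], max_length: int) -> List[int]:
    """Convert a cleaned caption to ids, truncated or zero-padded to max_length."""
    ids = [word_index[w] for w in caption.split() if w in word_index]
    ids = ids[:max_length]
    return ids + [0] * (max_length - len(ids))
```

For example, `clean_caption("A Dog, running!")` returns `"startseq a dog running endseq"`, which can then be passed to `encode` together with the vocabulary built from all training captions.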

templates/index.html

This HTML file serves as the front-end for the application where users can upload images and view the generated captions.

static

css: This directory is intended for any CSS files needed for styling the web pages.
images: This directory can be used to store images used in the project.
media: This directory stores the user-uploaded images.

Models

This directory contains the pre-trained model (model.h5) and the tokenizer (tokenizer.json) used for generating captions.
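
This README does not describe the layout of tokenizer.json. The sketch below shows one way to read a word-to-id mapping from it, under the assumption that the file is either a plain `{"word": id}` object or the output of the Keras `Tokenizer.to_json()` method, which stores `word_index` as a JSON-encoded string inside `config`. The function name `load_word_index` is hypothetical.

```python
import json
from typing import Dict


def load_word_index(text: str) -> Dict[str, int]:
    """Extract a word -> id mapping from tokenizer JSON text.

    Handles two assumed layouts: a plain {"word": id} object, or the
    Keras Tokenizer.to_json() layout, where config["word_index"] is
    itself a JSON-encoded string.
    """
    data = json.loads(text)
    if isinstance(data.get("config"), dict):
        word_index = data["config"]["word_index"]
        if isinstance(word_index, str):
            word_index = json.loads(word_index)
    else:
        word_index = data
    return {str(word): int(idx) for word, idx in word_index.items()}
```

A caller would read the file first, for example `load_word_index(Path("Models/tokenizer.json").read_text(encoding="utf-8"))` with `Path` imported from `pathlib`.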

test images

This directory can be used to store images for testing the application.

Model Details

CNN Component: Utilizes EfficientNet for feature extraction from images.
LSTM Network: A custom LSTM network is used for sequence generation.
Multihead Attention Layer: Enhances the model's ability to focus on different parts of the image when generating captions.
Accuracy: The model has an accuracy of 42%.
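
To make the role of the attention layer concrete, the sketch below computes scaled dot-product attention for a single query over a few image-region vectors in plain Python. It is a simplified, single-head illustration with made-up numbers, not the project's layer: a multihead layer runs several such attentions in parallel on learned projections of the inputs and concatenates the results.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query: Vector, keys: List[Vector], values: List[Vector]) -> Tuple[Vector, Vector]:
    """Scaled dot-product attention for one query over a set of regions.

    Returns (context, weights): the weighted average of the values and the
    attention weight given to each region.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    context = [
        sum(w * value[i] for w, value in zip(weights, values))
        for i in range(len(values[0]))
    ]
    return context, weights


# Toy example: the decoder state (query) matches the first region best.
demo_query = [1.0, 0.0]
demo_regions = [[4.0, 0.0], [0.0, 4.0], [-4.0, 0.0]]
demo_context, demo_weights = attend(demo_query, demo_regions, demo_regions)
```

In this toy example the first region receives the largest weight because its key points in the same direction as the query, so the context vector is dominated by that region's features.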

Contributing

Feel free to fork this repository and make your changes. Pull requests are welcome.
