
Model Micro Batching

This repo micro-batches concurrent API requests and runs inference on them in a single shot.

Technologies used

  • FastAPI
  • Redis

Architecture

  • The API collects the request, creates a UUID for it, and pushes it to Redis.
  • The batcher reads the Redis queue, stacks the requests into micro-batches, and sends them for model inference.
  • The batched results are split back out per request and pushed to Redis.
  • The API function polls Redis every X seconds to check whether results have been published.
  • Once the results are available, they are sent back to the client.
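For illustration, the API side of this flow might look like the sketch below. The endpoint path, queue name, result-key prefix, and polling interval are assumptions, not the repository's actual api.py.

import asyncio
import json
import uuid

import redis
from fastapi import FastAPI

app = FastAPI()
r = redis.Redis(host="localhost", port=6379)

REQUEST_QUEUE = "request_queue"   # assumed queue name
RESULT_PREFIX = "result:"         # assumed result-key prefix
POLL_INTERVAL = 0.5               # the "X seconds" polling period

@app.post("/predict")
async def predict(payload: dict):
    # Create a UUID for the request and push it onto the Redis queue.
    request_id = str(uuid.uuid4())
    r.rpush(REQUEST_QUEUE, json.dumps({"id": request_id, "data": payload}))

    # Poll Redis until the batcher publishes a result for this UUID,
    # then return it to the client.
    while True:
        result = r.get(RESULT_PREFIX + request_id)
        if result is not None:
            r.delete(RESULT_PREFIX + request_id)
            return json.loads(result)
        await asyncio.sleep(POLL_INTERVAL)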

Diagram

(architecture diagram)

Usage

Start your model service (TF-Serving or another model API), if any

Start redis server

docker run -d -p 6379:6379 redis

Start batcher service

python batcher.py
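
batcher.py itself is not reproduced here; roughly, its loop has to do something like the following sketch (queue names, batch size, and the model call are assumptions, not the repository's actual code):

import json
import time

import redis

r = redis.Redis(host="localhost", port=6379)

REQUEST_QUEUE = "request_queue"   # assumed queue name
RESULT_PREFIX = "result:"         # assumed result-key prefix
BATCH_SIZE = 5                    # model batch size


def run_model(batch):
    # Placeholder for the real model call (e.g. a TF-Serving request).
    return [{"id": item["id"], "prediction": None} for item in batch]


while True:
    # Drain up to BATCH_SIZE requests from the queue into a micro-batch.
    batch = []
    while len(batch) < BATCH_SIZE:
        raw = r.lpop(REQUEST_QUEUE)
        if raw is None:
            break
        batch.append(json.loads(raw))

    if not batch:
        time.sleep(0.05)
        continue

    # One inference call for the whole micro-batch, then split the results
    # back out per request and publish them to Redis for the API to pick up.
    for result in run_model(batch):
        r.set(RESULT_PREFIX + result["id"], json.dumps(result))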

Start API

uvicorn api:app
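
With Redis, the batcher, and the API running, concurrent requests can be fired at the service. The route and payload below are assumptions for illustration and should be adjusted to match api.py:

import concurrent.futures

import requests

def call_api(i):
    # Assumed endpoint and payload shape.
    resp = requests.post("http://localhost:8000/predict", json={"image": f"img_{i}.jpg"})
    return resp.json()

# Fire 10 requests concurrently; the batcher groups them into micro-batches.
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(call_api, range(10)))
print(results)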

Benchmark

The conditions below were set to mimic real-world timings:

  • Image decode time = 0.5 seconds
  • 1 inference = 0.5 seconds
  • Batch size of model = 5
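
For illustration, these conditions can be mimicked with a stub like the one below; this is an assumption about the benchmark setup, not the repository's actual code.

import time

BATCH_SIZE = 5  # maximum images the stub model accepts per call

def decode_image(raw_bytes):
    # Simulate a 0.5 s image decode.
    time.sleep(0.5)
    return raw_bytes

def model_predict(batch):
    # Simulate a 0.5 s inference call on a micro-batch of up to 5 images.
    assert len(batch) <= BATCH_SIZE
    time.sleep(0.5)
    return [{"label": "stub"} for _ in batch]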

1 API worker, 1 Batch processor - model combo

Average times:

  • 1 Image - 1.07 seconds
  • 10 Images - 2.26 seconds
  • 50 Images - 7.49 seconds
  • 100 Images - 12.72 seconds
  • 200 Images - 25.1 seconds

1 API worker, 2 Batch processor - model combo

Average times:

  • 1 Image - 1.07 seconds
  • 10 Images - 1.67 seconds
  • 50 Images - 4.05 seconds
  • 100 Images - 7.18 seconds
  • 200 Images - 13.00 seconds
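
These numbers are roughly what the configured costs predict: 200 images at a batch size of 5 is 40 batches, i.e. about 20 seconds of pure inference time with one batch processor and about 10 seconds when the batches are shared across two, with the remainder presumably going to decoding, queueing, and polling overhead.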

Concerns / Test points

  • Error Handling
  • Redis db sanity when multiple batcher processes run
