This machine learning project aims to build a traffic LLM that provides comprehensive traffic forecasts and live traffic summaries to users.
- Singapore Traffic Images with Vehicle Count (2023): We contributed this new dataset specifically for this project. It contains ~700k road camera traffic images with vehicle counts, collated across the whole of 2023. The images are pulled from the data.gov.sg API, and the vehicle counts are annotated automatically using YOLOv5 object detection. The dataset is publicly available on Kaggle.
- Llama-2-7b-chat-hf + Mistral-7B-Instruct-v0.1: This dataset was created by prompting the two LLMs with randomly generated live traffic data for Singapore to produce summaries, which are then used to finetune a smaller language model.
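
The automatic vehicle-count annotation described for the first dataset can be illustrated with a minimal sketch. It assumes the detector output has already been reduced to `(class_name, confidence)` pairs per image, that vehicles are the COCO classes `car`, `motorcycle`, `bus` and `truck`, and that a confidence threshold of 0.25 is applied. The function name `count_vehicles`, the class set and the threshold are illustrative assumptions, not details taken from the project notebooks.

```python
from collections import Counter

# Assumption: class names follow the COCO labels used by pretrained
# YOLOv5 checkpoints; the project's actual class list is not stated.
VEHICLE_CLASSES = {"car", "motorcycle", "bus", "truck"}


def count_vehicles(detections, min_confidence=0.25):
    """Count detections that are vehicles and meet the confidence threshold.

    `detections` is an iterable of (class_name, confidence) pairs, a
    simplified stand-in for a detector's per-image output.
    Returns (total_count, per_class_counts).
    """
    per_class = Counter(
        name
        for name, confidence in detections
        if name in VEHICLE_CLASSES and confidence >= min_confidence
    )
    return sum(per_class.values()), dict(per_class)


# Hypothetical detections for a single camera image.
sample = [("car", 0.91), ("car", 0.18), ("bus", 0.77), ("person", 0.88), ("truck", 0.40)]
total, breakdown = count_vehicles(sample)
print(total, breakdown)
```

Keeping the per-class breakdown alongside the total makes it easy to check later whether low-confidence or non-vehicle detections are skewing the counts.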
- traffic-route-visualisation.ipynb: exploratory visualization of traffic around Singapore and route planning using the Google Maps API
- traffic_data_collection.ipynb: data collection and annotation of vehicle counts across Singapore in 2023 using traffic camera images
- traffic_forecast_model_training.ipynb: trains one MLP model per traffic camera on the collected data to forecast traffic along each road in Singapore
- chatbot_distillation.ipynb: uses knowledge distillation to transfer the abstractive summarization capabilities of LLMs (specifically Llama-2-7b-chat-hf and Mistral-7B-Instruct-v0.1) into a smaller, 8-bit quantized language model (Microsoft Phi-2) finetuned with LoRA adapters
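
One common way to frame per-camera forecasting as a supervised problem for an MLP is a sliding window over the vehicle-count series. The sketch below shows that framing only. The window length, the function name `make_windows` and the sample counts are assumptions for illustration, and the notebook may use different features.

```python
def make_windows(counts, window=4):
    """Turn a chronological list of vehicle counts into (inputs, targets).

    Each input is `window` consecutive counts; the target is the count
    that immediately follows them.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    inputs, targets = [], []
    for start in range(len(counts) - window):
        inputs.append(counts[start:start + window])
        targets.append(counts[start + window])
    return inputs, targets


# Hypothetical counts from one camera at consecutive timestamps.
counts = [12, 15, 14, 20, 23, 19]
X, y = make_windows(counts, window=4)
print(X)
print(y)
```

Each row of `X` can then be fed to that camera's model as a fixed-length input vector, with the matching entry of `y` as the regression target.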
- camera_{CAMERA_ID}_model.h5: MLP models created by traffic_forecast_model_training.ipynb, each forecasting traffic on the road where its traffic camera is installed. The forecasting performance of these specialized models is quite decent.
- TrafficLLM PeftModel: To be released soon. We will release the finetuned LoRA adapters for producing traffic forecasts and recommendations with small language models.
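
As a rough sketch of how the per-camera files might be located and loaded, the helpers below follow the `camera_{CAMERA_ID}_model.h5` naming pattern. The assumptions are that the `.h5` files are Keras models loadable with `tensorflow.keras.models.load_model`, and that they sit in a single directory. The helper names, the `models` directory and the camera ID `1001` are hypothetical.

```python
from pathlib import Path


def camera_model_path(camera_id, model_dir="."):
    """Build the path to one camera's model file, following the
    camera_{CAMERA_ID}_model.h5 naming pattern."""
    return Path(model_dir) / f"camera_{camera_id}_model.h5"


def load_camera_model(camera_id, model_dir="."):
    """Load the forecasting model for one camera.

    Assumption: the .h5 files are Keras models. TensorFlow is imported
    lazily so the path helper works without it installed.
    """
    path = camera_model_path(camera_id, model_dir)
    if not path.exists():
        raise FileNotFoundError(f"No model file for camera {camera_id}: {path}")

    from tensorflow.keras.models import load_model

    return load_model(str(path))


# Hypothetical camera ID and directory, shown only to illustrate the naming.
print(camera_model_path(1001, "models"))
```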