Projects

What I'm building

Open-source projects spanning multimodal LLMs, model serving, computer vision, and AI tooling — mostly built with LitServe and PyTorch.

LitServe
TTS
Voice Cloning
AI

Chatterbox TTS API

Production-ready Text-to-Speech API built on Resemble AI's Chatterbox model with LitServe. Supports zero-shot voice cloning, emotion intensity control, and base64 audio I/O.
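As a rough illustration of the base64 audio I/O mentioned above, the sketch below packs a short reference clip into a JSON-friendly string and unpacks it again. The payload field names (`text`, `audio_prompt`) and the helper functions are hypothetical and are not taken from the project; a generated sine tone stands in for a real voice sample.

```python
import base64
import io
import math
import struct
import wave


def make_sine_wav(freq_hz: float = 440.0, seconds: float = 0.1, rate: int = 16000) -> bytes:
    """Build a short mono 16-bit WAV in memory as a stand-in reference clip."""
    n_frames = int(round(seconds * rate))
    frames = b"".join(
        struct.pack("<h", int(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * i / rate)))
        for i in range(n_frames)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(frames)
    return buf.getvalue()


def encode_audio(wav_bytes: bytes) -> str:
    """Encode raw audio bytes as an ASCII base64 string for a JSON body."""
    return base64.b64encode(wav_bytes).decode("ascii")


def decode_audio(audio_b64: str) -> bytes:
    """Decode a base64 string back into raw audio bytes."""
    return base64.b64decode(audio_b64)


# Hypothetical request body: the field names are assumptions for illustration.
payload = {
    "text": "Hello from a cloned voice.",
    "audio_prompt": encode_audio(make_sine_wav()),
}

# Round-trip check: decode the string and read the WAV header back.
restored = decode_audio(payload["audio_prompt"])
with wave.open(io.BytesIO(restored), "rb") as wav:
    print(wav.getframerate(), wav.getnframes())
```

Base64 inflates the payload by roughly a third, which is usually acceptable for short reference clips and keeps the whole request in a single JSON document.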

LitServe
Object Detection
Transformers
Computer Vision

RF-DETR Object Detection API

Real-time object detection API using RF-DETR, a state-of-the-art transformer-based detector, deployed with LitServe. Inference is end-to-end, with no region proposals or anchor boxes required.
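Because there are no anchors or region proposals, post-processing for DETR-family detectors is typically just a score threshold plus a box-format conversion. The sketch below assumes the model emits normalized center-format boxes `(cx, cy, w, h)`, as DETR-style models commonly do. The function names and the threshold are illustrative, and the project's actual response format may differ.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]


def cxcywh_to_xyxy(box: Sequence[float], width: int, height: int) -> Box:
    """Convert a normalized (cx, cy, w, h) box to pixel (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return (
        (cx - w / 2) * width,
        (cy - h / 2) * height,
        (cx + w / 2) * width,
        (cy + h / 2) * height,
    )


def filter_detections(
    boxes: Sequence[Sequence[float]],
    scores: Sequence[float],
    width: int,
    height: int,
    threshold: float = 0.5,  # assumed cutoff, tune per use case
) -> List[Tuple[Box, float]]:
    """Keep predictions at or above the threshold and return pixel-space boxes."""
    return [
        (cxcywh_to_xyxy(box, width, height), score)
        for box, score in zip(boxes, scores)
        if score >= threshold
    ]


# Two made-up predictions on a 200x100 image; only the first passes the cutoff.
detections = filter_detections(
    boxes=[(0.5, 0.5, 0.5, 0.5), (0.1, 0.1, 0.1, 0.1)],
    scores=[0.9, 0.2],
    width=200,
    height=100,
)
print(detections)
```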

LitServe
AI
APIs
Production

LitServe Examples

A curated collection of production-grade AI serving examples built on LitServe — Lightning AI's high-performance inference engine. Covers speech, vision, LLMs, embeddings, and object detection.

LitServe
Llama
Multimodal
LLM

Chat with Llama 3.2 Vision

Deploy Meta's Llama 3.2 Vision multimodal LLM with LitServe for fast inference. Supports image understanding and visual question answering via a REST API.

LitServe
Multimodal
LLM
Qwen

Chat with Qwen2-VL

Deploy and chat with Alibaba's Qwen2-VL multimodal large language model using LitServe. Supports image understanding, document parsing, and visual reasoning tasks.

LitServe
Multimodal
LLM
MiniCPM

Chat with MiniCPM-V 2.6

Deploy MiniCPM-V 2.6, a GPT-4V-level multimodal LLM designed for edge devices, using LitServe. Handles single-image, multi-image, and video inputs.

LitServe
Multimodal
LLM
Streamlit

Chat with Phi 3.5 Vision

Deploy and chat with Microsoft's Phi-3.5-vision multimodal LLM. LitServe handles high-performance inference while a Streamlit frontend gives you multi-image chat, image comparison, and video summarization.

Python
OCR
LLM
FastAPI

Receipt OCR Engine

An efficient open-source OCR engine for processing receipt images. It combines Tesseract OCR for raw text extraction with LLM-powered parsing into structured data, and is available as both a CLI tool and a FastAPI service.
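The two-stage flow described above (OCR first, then LLM parsing) can be sketched as below. This is a minimal illustration rather than the project's code: `parse_receipt`, the prompt text, and the `merchant`/`date`/`total` schema are all assumptions, and the OCR and LLM steps are injected as plain callables so that stubs can stand in for Tesseract and a real model.

```python
import json
from typing import Any, Callable, Dict

# Assumed prompt and schema, chosen only for illustration.
PROMPT_TEMPLATE = (
    "Extract the merchant, date, and total from this receipt text. "
    "Reply with a single JSON object with keys merchant, date, total.\n\n"
    "{raw_text}"
)
REQUIRED_KEYS = ("merchant", "date", "total")


def parse_receipt(
    image_path: str,
    ocr_fn: Callable[[str], str],
    llm_fn: Callable[[str], str],
) -> Dict[str, Any]:
    """Run OCR on an image, ask an LLM for JSON, and validate the result."""
    raw_text = ocr_fn(image_path)                      # stage 1: raw text
    reply = llm_fn(PROMPT_TEMPLATE.format(raw_text=raw_text))  # stage 2: parsing
    data = json.loads(reply)                           # raises on malformed JSON
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"LLM reply is missing keys: {missing}")
    return {key: data[key] for key in REQUIRED_KEYS}


# Stubs standing in for Tesseract and an LLM call.
def fake_ocr(image_path: str) -> str:
    return "CORNER CAFE\n2024-03-01\nTOTAL 12.50"


def fake_llm(prompt: str) -> str:
    return '{"merchant": "Corner Cafe", "date": "2024-03-01", "total": 12.5}'


receipt = parse_receipt("receipt.jpg", fake_ocr, fake_llm)
print(receipt)
```

Validating the reply before returning it matters because LLM output is not guaranteed to be well-formed JSON or to contain every field. In a real setup, `ocr_fn` could wrap a Tesseract binding and `llm_fn` a model client.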

PyTorch
MONAI
Medical AI
Segmentation

3D Lung Tumour Segmentation

3D semantic segmentation of lung tumours from CT scans using PyTorch Lightning and MONAI. Trained on the Medical Segmentation Decathlon lung dataset with a U-Net-based architecture.