AI Projects


Vehicle Detection & Counting: A computer vision project that detects, tracks, and counts vehicles in real time from CCTV/video footage. It uses YOLOv8 for object detection and SORT for tracking, giving each vehicle a unique ID as it moves through the scene.
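
The counting step can be sketched on its own once a tracker such as SORT has assigned IDs. The example below is a minimal illustration, not the project's actual code: it assumes the tracker yields `(track_id, centroid_x, centroid_y)` tuples per frame and that a vehicle is counted once when its centroid crosses a horizontal line. `LineCounter` and `line_y` are hypothetical names.

```python
from typing import Dict, Iterable, Set, Tuple

Track = Tuple[int, float, float]  # (track_id, centroid_x, centroid_y), assumed tracker output


class LineCounter:
    """Counts each track ID once when its centroid crosses a horizontal line."""

    def __init__(self, line_y: float) -> None:
        self.line_y = line_y
        self.last_y: Dict[int, float] = {}  # last seen centroid y per track ID
        self.counted: Set[int] = set()      # IDs that have already been counted

    def update(self, tracks: Iterable[Track]) -> int:
        """Process one frame of tracker output and return the running total."""
        for track_id, _cx, cy in tracks:
            prev = self.last_y.get(track_id)
            if prev is not None and track_id not in self.counted:
                # The centroid crossed the line if it changed sides between frames.
                if (prev < self.line_y) != (cy < self.line_y):
                    self.counted.add(track_id)
            self.last_y[track_id] = cy
        return len(self.counted)


# Three frames: vehicle 1 moves down across y=100, vehicle 2 moves up across it.
frames = [
    [(1, 50.0, 90.0), (2, 80.0, 120.0)],
    [(1, 52.0, 105.0), (2, 80.0, 130.0)],
    [(1, 55.0, 120.0), (2, 81.0, 95.0)],
]
counter = LineCounter(line_y=100.0)
totals = [counter.update(frame) for frame in frames]
```

Keying the count on the track ID rather than on raw detections is what prevents the same vehicle from being counted in every frame it appears in.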

AI Fitness & Posture Tracking App: A computer vision project that tracks body posture and movement in real time. It uses pose estimation to detect key body joints, count exercise repetitions, and assess the quality of movements such as squats, gym exercises, stretching, and yoga poses. It is built with MediaPipe and OpenCV.
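
One common way to turn detected joints into repetition counts is to compute a joint angle and run it through a small two-state machine. The sketch below assumes 2D keypoints (for a squat: hip, knee, ankle) have already been extracted by the pose estimator; the thresholds of 90 and 160 degrees and the names `joint_angle` and `RepCounter` are illustrative assumptions, not values taken from the project.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def joint_angle(a: Point, b: Point, c: Point) -> float:
    """Angle in degrees at vertex b formed by the segments b-a and b-c."""
    ang = math.degrees(
        math.atan2(c[1] - b[1], c[0] - b[0]) - math.atan2(a[1] - b[1], a[0] - b[0])
    )
    ang = abs(ang)
    return 360.0 - ang if ang > 180.0 else ang


class RepCounter:
    """Counts one repetition per down -> up transition of a joint angle."""

    def __init__(self, down_below: float = 90.0, up_above: float = 160.0) -> None:
        self.down_below = down_below  # assumed threshold for the bottom of the movement
        self.up_above = up_above      # assumed threshold for the top of the movement
        self.stage = "up"
        self.reps = 0

    def update(self, angle: float) -> int:
        """Feed one frame's angle and return the running repetition count."""
        if angle < self.down_below:
            self.stage = "down"
        elif angle > self.up_above and self.stage == "down":
            self.stage = "up"
            self.reps += 1
        return self.reps


# Knee angles over time for two squats.
rep_counter = RepCounter()
for knee_angle in [170, 120, 80, 130, 165, 85, 170]:
    rep_counter.update(knee_angle)
```

Using two separate thresholds gives hysteresis, so jitter around a single cut-off angle does not register as extra repetitions.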

Pigeon Detection & Deterrent: A computer vision project that detects pigeons in live video/CCTV and automatically triggers a deterrent (a water sprinkler). It uses a custom YOLOv5 model trained on annotated pigeon footage and runs on a Raspberry Pi with a connected camera. When birds are detected, the Pi switches a Shelly relay to activate the water spray.
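
A detector that fires on a single frame tends to cause false alarms, so the trigger logic is often debounced. The sketch below is one possible arrangement under stated assumptions: the deterrent fires only after several consecutive confident detections and then waits out a cooldown. The confidence threshold, frame count, cooldown, and the `activate` callback (a placeholder for whatever call switches the Shelly relay) are all hypothetical.

```python
from typing import Callable, Optional


class DeterrentTrigger:
    """Fires a deterrent after N consecutive detections, then enforces a cooldown."""

    def __init__(
        self,
        activate: Callable[[], None],  # placeholder for the relay-switching call
        min_confidence: float = 0.5,   # assumed detection threshold
        required_frames: int = 3,      # assumed number of consecutive hits
        cooldown_s: float = 30.0,      # assumed pause between activations
    ) -> None:
        self.activate = activate
        self.min_confidence = min_confidence
        self.required_frames = required_frames
        self.cooldown_s = cooldown_s
        self.streak = 0
        self.last_fired: Optional[float] = None

    def update(self, confidence: float, now: float) -> bool:
        """Feed the best pigeon confidence for one frame; True if the deterrent fired."""
        if confidence >= self.min_confidence:
            self.streak += 1
        else:
            self.streak = 0
        in_cooldown = (
            self.last_fired is not None and now - self.last_fired < self.cooldown_s
        )
        if self.streak >= self.required_frames and not in_cooldown:
            self.activate()
            self.last_fired = now
            self.streak = 0
            return True
        return False


activations = []
trigger = DeterrentTrigger(activate=lambda: activations.append("spray"))
# (confidence, timestamp in seconds) for successive frames
frames = [(0.9, 0), (0.8, 1), (0.7, 2), (0.9, 3), (0.9, 4), (0.9, 5), (0.9, 40)]
results = [trigger.update(conf, t) for conf, t in frames]
```

Passing the timestamp in explicitly, rather than reading the clock inside `update`, keeps the logic easy to test without real hardware.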

Neural Style Transfer: A deep learning project that transforms ordinary photos into stylised artwork. It uses a VGG19-based neural style transfer pipeline, implemented in PyTorch, to combine the content of a photograph with the visual style of famous paintings, producing artistic images with distinctive textures, colours, and patterns.
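
In the widely used formulation of neural style transfer (Gatys et al.), style is compared through Gram matrices of feature maps. The dependency-free sketch below assumes that formulation and uses plain lists in place of VGG19 activations; the normalisation by channels times positions is one common choice, and the project's actual loss may differ.

```python
from typing import List

Matrix = List[List[float]]


def gram_matrix(features: Matrix) -> Matrix:
    """Gram matrix of a feature map.

    `features` has one row per channel, each row being that channel's
    activations flattened over all spatial positions. Entry (i, j) is the
    inner product of channels i and j, divided by channels * positions.
    """
    n_channels = len(features)
    n_positions = len(features[0])
    norm = n_channels * n_positions
    return [
        [sum(a * b for a, b in zip(fi, fj)) / norm for fj in features]
        for fi in features
    ]


def style_loss(generated: Matrix, style: Matrix) -> float:
    """Mean squared difference between the Gram matrices of two feature maps."""
    g = gram_matrix(generated)
    s = gram_matrix(style)
    n = len(g)
    return sum((g[i][j] - s[i][j]) ** 2 for i in range(n) for j in range(n)) / (n * n)


# Toy 2-channel feature maps with 2 spatial positions each.
generated_features = [[1.0, 0.0], [0.0, 1.0]]
style_features = [[1.0, 1.0], [1.0, 1.0]]
loss = style_loss(generated_features, style_features)  # 0.15625
```

Because the Gram matrix sums over spatial positions, it records which channels activate together but not where, which is why it captures texture rather than layout.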

Kitesurfer Detection & Tracking System: A computer vision project that detects and tracks kitesurfers in beach video footage. It uses a lightweight YOLO model for object detection and Meta’s SAM 2 to help annotate training data, making it easier to build a custom detector for riders and kites in real-world coastal scenes. The stack is Python, YOLOv11 nano, Meta SAM 2, OpenCV, PyTorch, and Ultralytics tracking tools.
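
Segmentation masks can be turned into detector training labels by taking each mask's bounding box. The sketch below assumes a binary mask is available as nested lists of 0/1 (in practice it would come from SAM 2 as an array) and writes the YOLO detection label format: class ID followed by box centre, width, and height, all normalised to the image size. `mask_to_yolo_label` is a hypothetical helper, not part of any library.

```python
from typing import List, Optional

Mask = List[List[int]]  # rows of 0/1 pixels, assumed mask representation


def mask_to_yolo_label(mask: Mask, class_id: int) -> Optional[str]:
    """Convert a binary mask to a 'class cx cy w h' line, or None if the mask is empty."""
    height = len(mask)
    width = len(mask[0])
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for x in range(width) if any(row[x] for row in mask)]
    if not ys:
        return None
    # Pixel-edge coordinates: the right and bottom edges are exclusive.
    x_min, x_max = xs[0], xs[-1] + 1
    y_min, y_max = ys[0], ys[-1] + 1
    cx = (x_min + x_max) / 2 / width
    cy = (y_min + y_max) / 2 / height
    w = (x_max - x_min) / width
    h = (y_max - y_min) / height
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


# A 4x4 image with a 2x2 object in the middle.
example_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
label = mask_to_yolo_label(example_mask, class_id=0)
# label == "0 0.500000 0.500000 0.500000 0.500000"
```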

🔹 Audio Transcription & Summarisation: Whisper, GPT-3.5, Flask REST API – transcribes audio files and generates AI summaries.

🔹 AI Personal Assistant with Tools: LangChain, function calling, tool integration – conversational assistant that can use external tools.

🔹 Translation System: Multi-language translation models – translates text between multiple languages.

🔹 LangSmith WebSearch: AI agent with web search integration and LangSmith – provides structured answers from the web.

🔹 AgentCore: Modular agent framework – reusable framework for building AI agents.

🔹 RAG Document Q&A System: LangChain, ChromaDB, vector embeddings – question answering over custom documents using RAG.

🔹 Movie Recommender – Collaborative Filtering: Scikit-Surprise, SVD, KNN, Streamlit, Docker – recommends movies based on user rating patterns; RMSE 0.934, fully Dockerized with interactive web app.

🔹 Content-Based Movie Recommender: TF-IDF, Cosine Similarity, Streamlit – recommends movies based on content features (genres); explainable and helps with the cold-start problem.

🔹 Hybrid Recommender System: Ensemble of collaborative and content-based methods, Amazon Personalize – combines both for diversity and explainability.

🔹 Portfolio Website v1: React, Next.js, Tailwind CSS – showcases first 12 projects with live demos and GitHub links.

🔹 Facial Recognition System: face_recognition, dlib, OpenCV – identifies and verifies faces in images and video for security and attendance.

🔹 OCR Invoice Parser: Tesseract, AWS Textract, spaCy – extracts structured data from invoices and receipts into JSON.

🔹 Image Segmentation Tool: Segment Anything Model (SAM), YOLO – segments and labels objects in images for applications like medical imaging and autonomous systems.

🔹 AI Image Generator: Stable Diffusion, DALL‑E API – generates images from text with multiple styles and fine-tuning options.

🔹 AI Writing Assistant: GPT‑4, fine-tuning, prompt engineering – assists with writing articles, emails, and code with context awareness.

🔹 Voice Cloning System: Coqui TTS, Tacotron 2 – clones and synthesises voices for audiobooks and accessibility.

🔹 Stock Price Analyzer: LSTM networks, PyTorch – predicts stock price movements using technical indicators and sentiment.

🔹 Sales Forecasting System: Prophet, ARIMA, seasonal decomposition – forecasts future sales trends with confidence intervals.

🔹 Anomaly Detection for Time Series: Autoencoders, Isolation Forest – detects unusual patterns for fraud detection and system monitoring.

🔹 Energy Consumption Forecasting: XGBoost, feature engineering – predicts energy usage patterns with weather and seasonality features.

🔹 Customer Support Chatbot: RASA, intent recognition, entity extraction – automated multi‑turn customer service with slot filling.

🔹 Voice Assistant (Alexa‑style): Speech‑to‑text, NLU, wake word detection – voice‑controlled personal assistant with custom skills and smart home integration.

🔹 Conversational Form Filler: DialogFlow, slot filling, NER – collects structured data conversationally for surveys and registrations.

🔹 AI Meeting Assistant: Whisper, GPT‑4, speaker diarisation – transcribes meetings and extracts summaries, action items, and key decisions.

🔹 Image Captioning System: BLIP, Vision Transformers – generates descriptions of images for accessibility and SEO.

🔹 Visual Question Answering: CLIP, multimodal transformers – answers questions about images (e.g., “What colour is the car?”).

🔹 Video Summariser: Keyframe extraction, scene detection, GPT – summarises long videos into key moments and transcript highlights.

🔹 Document Understanding AI: LayoutLM, AWS Textract – extracts information from complex documents such as forms, contracts, and papers.

🔹 Game‑Playing AI: Q‑learning, DQN – learns to play games like Atari and custom environments.

🔹 Chess Engine with ML: Policy gradients, AlphaZero‑style – trains a chess‑playing AI via self‑play and evaluation.

🔹 Robotic Arm Simulator: OpenAI Gym, robot simulation – trains agents to control a robotic arm for pick‑and‑place and path planning.

🔹 Optimization Agent: RL for optimisation – route planning and resource scheduling for delivery and scheduling tasks.

🔹 ML Model API Service: FastAPI, Docker, REST – production‑ready model serving with async inference and rate limiting.

🔹 Real‑Time Prediction Pipeline: AWS Kinesis, streaming – real‑time prediction pipeline for fraud detection and analytics.

🔹 A/B Testing Framework: Statistical testing, experiment tracking – runs controlled experiments on ML models with significance metrics.

🔹 ML Monitoring Dashboard: Prometheus, Grafana, model drift detection – monitors production models for accuracy drift, latency, and errors.

🔹 Medical Image Classifier: Transfer learning, ResNet – classifies X‑ray/MRI images for tasks like pneumonia and tumour detection.

🔹 Music Genre Classifier: Librosa, CNNs – classifies music tracks by genre using spectrogram and MFCC features.

🔹 Resume Parser & Job Matcher: spaCy, NER, matching algorithms – extracts skills from CVs and matches candidates to jobs.

🔹 Fake News Detector (Week 44, Aug 2026): BERT, text classification – flags potentially false information using content and source signals.

🔹 Social Media Content Generator: GPT‑4, DALL‑E, automation – generates posts, images, and captions for Twitter, Instagram, and LinkedIn.

🔹 AI Code Reviewer: AST parsing, GPT‑4, static analysis – automated code review with bug detection and best‑practice suggestions.

🔹 Personal Finance AI Assistant: NLP, transaction categorisation, forecasting – tracks spending, budgeting, and savings goals with reminders and advice.

🔹 End‑to‑End ML Pipeline: MLflow, Airflow – full MLOps pipeline from data to deployment with training, versioning, and monitoring.

🔹 Multi‑Agent System: Collaborative AI agents – multiple agents coordinating on complex task automation.

🔹 Portfolio Website v2: Advanced React, animations, 3D – polished portfolio with live demos, case studies, and metrics.

🔹 Blog/Tutorial Series: Technical writing and documentation – complete 52‑week journey documented across 50+ posts and tutorials.

🔹 Custom Capstone Project: combines skills from all projects into a unique flagship system for your portfolio and interviews.