Featured Projects
AI-Powered Question Generation with Graph RAG
Created an automated question-generation system for US/UK school papers that achieved >90% accuracy while cutting costs from ₹50-₹100 per question to under ₹1, leveraging Graph RAG for accurate knowledge retrieval from textbooks.
Challenges
- ChatGPT reduced cost from ₹50-₹100 to ₹5-₹10 per question, but accuracy dropped from 95%+ to ~70%
- Generating hundreds of questions in a single prompt caused contextual bleed, where earlier outputs influenced later ones
- Standard vector search retrieved many redundant or overlapping chunks from textbooks
- This resulted in inconsistent question quality, topic drift, and reduced factual accuracy
- The drop in correctness made automated generation commercially unviable at scale
Solutions
- Generated one question at a time to preserve context integrity and improve factual correctness
- Used three LLMs in sequence: LLM-1 generates the question text, LLM-2 generates the MCQ options and correct answer, and LLM-3 validates the result for quality
- Integrated Graph RAG with the first and third LLMs for reference-based grounding in specific textbooks, enabling both structured and semantic search
- Built document-derived knowledge graphs from textbooks where each node represents a concept with vectorized embeddings
- Applied context engineering, a nano model for LLM-1, and prompt caching to minimize token usage
- Implemented with the LangGraph framework, using OpenAI GPT-4o as the LLM
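A minimal sketch of the one-question-at-a-time chain, with each LLM stage stubbed as a plain Python function (the stub bodies, `Question` fields, and topic/context strings are illustrative placeholders, not the production prompts or LangGraph nodes):

```python
from dataclasses import dataclass

# Hypothetical sketch of the three-LLM chain. Each stage is a stub standing
# in for a real model call; questions are generated one at a time so earlier
# outputs cannot bleed into later ones.

@dataclass
class Question:
    text: str
    options: list
    answer: str
    valid: bool = False

def generate_question_text(topic: str, context: str) -> str:
    # Stage 1 (LLM-1, nano model): draft the question from retrieved context
    return f"Which statement about {topic} is correct?"

def generate_options(question: str, context: str) -> tuple:
    # Stage 2 (LLM-2): produce MCQ options and the correct answer
    options = ["A", "B", "C", "D"]
    return options, "A"

def validate(q: Question, context: str) -> bool:
    # Stage 3 (LLM-3): check the drafted question against the grounding context
    return bool(q.text and q.answer in q.options)

def generate_one(topic: str, context: str) -> Question:
    text = generate_question_text(topic, context)
    options, answer = generate_options(text, context)
    q = Question(text, options, answer)
    q.valid = validate(q, context)
    return q

def generate_paper(topics, retrieve):
    # One question per call: a fresh context for every topic, no shared history
    return [generate_one(t, retrieve(t)) for t in topics]
```

Generating per-question rather than per-batch is what prevents the contextual bleed described above: no earlier question ever sits in the prompt for a later one.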
Outcomes
- Reduced cost to <₹1 per question, including formatting and direct PDF generation
- Achieved >90% correctness, validated through one month of human review
- Improved retrieval accuracy through contextual graph relationships from textbooks
- Fully automated the pipeline, eliminating human intervention
- Saved approximately ₹60,000/month by replacing two manual operators
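The document-derived knowledge graph behind the retrieval step could be sketched as below; concept names, embedding dimensions, and cosine-similarity scoring are illustrative assumptions, not the production schema:

```python
import math

# Hypothetical sketch of the document-derived knowledge graph: each node is
# a textbook concept carrying an embedding vector; retrieval combines
# semantic search (cosine similarity over node embeddings) with a structured
# step (expanding along graph edges to related concepts).

class ConceptGraph:
    def __init__(self):
        self.nodes = {}    # concept name -> embedding vector
        self.edges = {}    # concept name -> set of related concept names

    def add_concept(self, name, embedding):
        self.nodes[name] = embedding
        self.edges.setdefault(name, set())

    def relate(self, a, b):
        # Undirected edge between two concepts from the same textbook passage
        self.edges[a].add(b)
        self.edges[b].add(a)

    @staticmethod
    def _cosine(u, v):
        dot = sum(x * y for x, y in zip(u, v))
        nu = math.sqrt(sum(x * x for x in u))
        nv = math.sqrt(sum(x * x for x in v))
        return dot / (nu * nv) if nu and nv else 0.0

    def semantic_search(self, query_vec, k=3):
        # Rank concept nodes by embedding similarity to the query
        ranked = sorted(self.nodes,
                        key=lambda n: self._cosine(self.nodes[n], query_vec),
                        reverse=True)
        return ranked[:k]

    def expand(self, concepts):
        # Structured step: pull in directly related concepts for grounding
        out = set(concepts)
        for c in concepts:
            out |= self.edges.get(c, set())
        return out
```

Expanding the top semantic hits along graph edges is what distinguishes this from plain vector search: related concepts are retrieved even when their chunks are not the nearest neighbors, which reduces the redundant-chunk problem noted above.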
Satellite Anomaly Detection & Classification
Built a two-stage AI pipeline for detecting and classifying anomalies across satellite subsystem telemetry (reaction wheels and power systems), using an LSTM Autoencoder for unsupervised detection and a 1D CNN for supervised fault classification, all within a single unified pipeline rather than separate per-subsystem models.
Challenges
- Needed to detect multiple fault types (friction, viscous drag, torque limits, bias torque) across multiple subsystems (reaction wheels, battery) using a single model
- Telemetry anomalies caused digital twin models to drift, impacting mission-critical operations like maneuver planning and fuel budgeting
- Faults exhibited varying severity levels and could be persistent, temporary, or progressive — requiring a model that captures complex temporal dynamics
- Cross-subsystem coupling (e.g., reaction wheel friction affecting power consumption) made naive per-channel thresholding unreliable
Solutions
- Designed a two-stage pipeline: Stage 1 trains an LSTM Autoencoder on healthy telemetry to learn nominal state-transition dynamics; Stage 2 trains a 1D CNN classifier on reconstruction residuals to identify fault type and affected component
- Built the LSTM Autoencoder with a bottleneck layer and weighted MSE loss that masks command channels, focusing learning on actual sensor response behavior
- Implemented a residual-connection 1D CNN classifier with adaptive pooling for multi-label fault classification across 12+ fault classes (4 fault types × 3+ components)
- Developed a sliding-window data pipeline with overlap-based labeling from fault interval annotations, supporting both binary detection and multi-class classification
- Used PyTorch with early stopping, gradient clipping, and checkpoint management for robust training on 26-channel spacecraft simulation data
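The overlap-based window labeling can be sketched as follows; the window length, stride, and 0.5 overlap threshold are illustrative defaults, not the values used on the 26-channel simulation data:

```python
# Hypothetical sketch of the sliding-window labeling step: windows over the
# telemetry are labeled by how much they overlap the annotated fault
# intervals, supporting binary detection (shown here) and, with per-fault
# labels, multi-class classification.

def make_windows(n_samples, window, stride):
    """Yield (start, end) index pairs of sliding windows over the telemetry."""
    for start in range(0, n_samples - window + 1, stride):
        yield start, start + window

def overlap_fraction(win, interval):
    """Fraction of the window covered by a fault interval."""
    (ws, we), (fs, fe) = win, interval
    inter = max(0, min(we, fe) - max(ws, fs))
    return inter / (we - ws)

def label_windows(n_samples, fault_intervals, window=64, stride=16, threshold=0.5):
    """Binary label per window: 1 if overlap with any annotated fault
    interval reaches the threshold, else 0 (nominal)."""
    labels = []
    for win in make_windows(n_samples, window, stride):
        frac = max((overlap_fraction(win, iv) for iv in fault_intervals),
                   default=0.0)
        labels.append(1 if frac >= threshold else 0)
    return labels
```

The overlap threshold trades label noise against coverage: a low threshold marks windows that barely clip a fault as faulty, while a high one keeps only windows dominated by the fault.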
Outcomes
- Achieved >99% precision and recall on synthetic fault datasets across all subsystems
- Single unified model successfully isolates anomalies to individual components (reaction wheels, battery) without separate per-subsystem models
- Reconstruction-error-based detection enables early fault identification before catastrophic failure, with interpretable anomaly scores similar to Kalman filter residuals
- Modular architecture allows independent optimization of detection and classification stages, with a clear path to physics-informed loss functions and remaining useful life (RUL) estimation
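The residual-based detection above might look like this in outline; the command-channel mask and the threshold (calibrated on healthy data) are assumptions, and in the real pipeline the reconstructions come from the LSTM Autoencoder:

```python
# Hypothetical sketch of residual-based anomaly scoring: per-channel squared
# reconstruction errors are averaged with command channels given zero weight,
# so the score reflects sensor response behavior only. The threshold would be
# calibrated on healthy telemetry (e.g. a high percentile of nominal scores).

def anomaly_score(actual, reconstructed, command_mask):
    """Weighted mean squared residual over one timestep.
    command_mask[i] is True for command channels, which are excluded."""
    weights = [0.0 if is_cmd else 1.0 for is_cmd in command_mask]
    total_w = sum(weights)
    sq = sum(w * (a - r) ** 2
             for w, a, r in zip(weights, actual, reconstructed))
    return sq / total_w

def detect(scores, threshold):
    """Flag timesteps whose anomaly score exceeds the calibrated threshold."""
    return [s > threshold for s in scores]
```

Like a Kalman filter residual, the score stays near zero while the model tracks nominal dynamics and grows as behavior drifts, which is what enables early fault identification before failure.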