Building Scalable & Intelligent Systems
$ > const OmprakashSahani = {
$ > role: 'Software Engineer | Data Scientist',
$ > focus: 'Scalable Systems & AI/ML',
$ > phd_pursuit: 'Electrical Engineering and Computer Science Research (MIT)',
$ > impact: [
$ > 'Productivity_Up_75%',
$ > 'WaitTime_Down_100%'
$ > ],
$ > tools: [
$ > 'Python', 'Java', 'C++', 'JS/TS',
$ > 'React/Next.js', 'PyTorch', 'Google Cloud Platform', 'Docker'
$ > ]
$ > };
$ >
$ > // Explore my work, contributions, and aspirations below!
I am an innovative **Software Engineer and Data Scientist** with a passion for translating complex challenges into elegant, high-impact technical solutions. My drive lies at the intersection of scalable system design and intelligent data applications.
My expertise is proven through engineering projects like the AI-powered Gemini Dev Assistant, which significantly boosted productivity, and a Distributed Online Judge, which optimized execution times. I thrive on architecting robust platforms on Google Cloud, applying core CS fundamentals and advanced ML techniques.
Beyond development, my intellectual curiosity compels me towards deep academic research. I explore applying cutting-edge machine learning to real-world domains like healthcare risk management, always seeking to bridge theoretical insights with practical, impactful implementations.
Proficient in Python, Java, C++, and JavaScript, with a strong foundation in DS&A and distributed systems, I am poised to contribute meaningfully to both industry-leading innovation and frontier academic advancement.
function getEducationInfo() {
  return {
    university: 'Sanjay Ghodawat University',
    degree: 'Bachelor of Technology (B.Tech) in Computer Science and Engineering',
    graduated: 'May 2023',
    gpa: '8.40/10.0 (3.62/4.0 U.S.)',
    distinction: 'First Class with Distinction',
    relevant_courses: [
      'Data Structures & Algorithms',
      'Operating Systems (Linux)',
      'Computer Networks (TCP/IP)',
      'Software Engineering',
      'Database Management System (DBMS)',
      'Distributed Systems',
      'Machine Learning',
      'Artificial Intelligence'
    ]
  };
}
function getCommunityInvolvement() {
  return [
    'Google Developer Program (Participant)',
    'GitHub Developer Program (Member)',
    'OpenAI Developer Program (Participant)',
    'LeetCode Profile: Active Participant',
    'HackerRank Profile: Certified Expert'
  ];
}
// Community involvement fuels growth.
const myFuture = {
  role: 'Software Engineer | Data Scientist',
  target_company: 'Google',
  phd_goal: 'EECS @ MIT',
  impact_focus: 'Scalable AI & Distributed Systems'
};
Developed an AI-powered Code Assistant using the Google Gemini API, reducing code generation time by 75% and boosting developer productivity. Engineered context management algorithms (90%+ accuracy). Deployed on GCP with API response times under 500 ms.
ShellAI is a full-stack, web-based terminal emulator that uses a large language model to act as a smart, efficient command-line assistant. The project's core functionality is to translate natural language into executable shell commands, providing a seamless and intuitive user experience. It's designed to be a significant productivity booster for developers and system administrators by reducing the need to remember complex command-line syntax.
Implemented a Python-based search engine indexing 50+ documents with an inverted index and TF-IDF ranking, achieving sub-millisecond query latency (~0.5 ms). Architected a robust, NLTK-powered, Dockerized pipeline with comprehensive testing.
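As a rough illustration of the inverted index and TF-IDF ranking named above, here is a minimal sketch in JavaScript (the language used for the snippets on this page, not the project's Python/NLTK code). The function names `tokenize`, `buildIndex`, and `search`, the regex tokenizer, and the sample documents are all assumptions made for the example.

```javascript
// Hypothetical helper: lowercase the text and split it into alphanumeric tokens.
// The real project uses NLTK; this regex tokenizer is only a stand-in.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Build an inverted index: term -> Map(docId -> term frequency).
function buildIndex(docs) {
  const index = new Map();
  docs.forEach((text, docId) => {
    for (const term of tokenize(text)) {
      if (!index.has(term)) index.set(term, new Map());
      const postings = index.get(term);
      postings.set(docId, (postings.get(docId) || 0) + 1);
    }
  });
  return index;
}

// Rank documents by the sum of tf * idf over the distinct query terms,
// using the plain idf = ln(N / df) variant (an assumption for this sketch).
function search(index, numDocs, query) {
  const scores = new Map();
  for (const term of new Set(tokenize(query))) {
    const postings = index.get(term);
    if (!postings) continue; // term appears in no document
    const idf = Math.log(numDocs / postings.size);
    for (const [docId, tf] of postings) {
      scores.set(docId, (scores.get(docId) || 0) + tf * idf);
    }
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([docId, score]) => ({ docId, score }));
}

// Made-up sample corpus.
const docs = [
  'docker makes deployment simple',
  'search engines rank documents',
  'an inverted index maps terms to documents',
];
const index = buildIndex(docs);
const results = search(index, docs.length, 'inverted index documents');
console.log(results.map((r) => r.docId));
```

Because a query only touches the posting lists of its own terms, lookup cost grows with the number of matching postings rather than with the size of the corpus.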
Constructed a full-stack web app integrating 5+ GitHub REST APIs. Formulated a unique "GitHub Engagement Score" (0-100) using weighted heuristic models and efficient data aggregation. Deployed on GCP App Engine.
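A weighted heuristic score of this kind might look like the sketch below. The source does not state which metrics or weights the "GitHub Engagement Score" uses, so the metric names, `WEIGHTS`, and `CAPS` here are purely hypothetical placeholders.

```javascript
// Hypothetical weights (summing to 1) and saturation caps per metric.
// None of these values come from the actual project.
const WEIGHTS = { commits: 0.4, pullRequests: 0.3, issues: 0.2, stars: 0.1 };
const CAPS = { commits: 500, pullRequests: 100, issues: 100, stars: 200 };

// Normalize each metric to [0, 1] against its cap, combine with the weights,
// and scale the result to an integer in the 0-100 range.
function engagementScore(metrics) {
  let score = 0;
  for (const key of Object.keys(WEIGHTS)) {
    const value = Math.max(0, metrics[key] || 0); // missing or negative -> 0
    const normalized = Math.min(value / CAPS[key], 1); // saturate at the cap
    score += WEIGHTS[key] * normalized;
  }
  return Math.round(score * 100);
}

console.log(engagementScore({ commits: 250, pullRequests: 50, issues: 10, stars: 20 }));
```

Capping each metric keeps one outlier, such as a very high star count, from dominating the score, and guarantees the result stays within 0-100.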
Built a scalable Online Judge with sub-second Python/Java code execution. Orchestrated a Docker-isolated judging pipeline across 3 services using Redis/RQ for IPC, enabling seamless deployment on 3 core GCP services. Reduced user wait times by 100% via WebSockets.
Applied 5+ machine learning models (Random Forest, CNN, SVM) to 4 healthcare datasets (3,400+ records) in Python. Reviewed 20+ studies to identify gaps in Clinical Decision Support Systems and evaluated diagnostic accuracy and risk prediction.
Spearheaded a team of 5 to build a Python-based weather application, including an enhanced GUI with 10% more features. Managed project lifecycle and ensured successful delivery.
Architected and developed a full-stack application that transforms complex environmental data into actionable insights and engaging narratives. Integrated real-time and historical air quality data, enabling AI-driven analysis, creative content generation (social posts, detailed reports), and interactive data visualization. Implemented data persistence with PostgreSQL and ensured seamless frontend-backend communication.
Designed and implemented an intelligent robotic arm simulator capable of executing natural language commands in a 3D physics environment. Leveraged AI to translate high-level human instructions into precise robotic movements for pick-and-place and complex stacking tasks. Overcame significant challenges in inverse kinematics, grasping stability, and collision avoidance within the simulation, demonstrating robust control logic.
Architected and developed a full-stack application that leverages machine learning to predict the biological activity of new drug molecules. The platform features a user-friendly interface for molecular data input and visualization, a trained model for accurate predictions, and a robust backend that dynamically generates AI-powered explanations for a non-technical audience. It showcases skills in data science, AI integration, and full-stack web development.
Developed a predictive heart disease model utilizing Random Forest on a dataset of 1,026 records, achieving 99% accuracy and visualizing key feature importance, leading to earlier interventions for patients.
Engineered a convolutional neural network (CNN) to categorize 253 brain MRI images, obtaining 90% accuracy with 30 images and 86% with 51, enabling quicker diagnosis and insight and showcasing strong self-motivation.
Devised a diabetes prediction model using the Pima Indians dataset (765 records, 9 features), showing SVM (80%) and Logistic Regression (79.13%) as most effective for early detection, highlighting analytical rigor.
Built 7+ interactive MATLAB visualizations, including 3D plots, vector fields, and fractals, to communicate complex mathematical ideas, with 7+ downloads on MATLAB Central, reflecting curiosity and knowledge sharing.
Implemented a hybrid XGBoost-LSTM model for energy forecasting using the UCI dataset (370 time-series, 15-minute intervals), integrating weather, temporal, and holiday data to improve accuracy, reflecting initiative and real-world impact.
Leveraged an LSTM-based regression model to forecast PM2.5 concentrations, generating probabilistic intervals that captured predictive uncertainty through 10th, 50th, and 90th percentile estimates, informing air quality insights for decision-making.
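The source does not say how the 10th, 50th, and 90th percentile estimates were trained or evaluated. One common approach for percentile forecasts is the pinball (quantile) loss, sketched below together with an empirical coverage check. The function names and sample numbers are illustrative assumptions, not project code or results.

```javascript
// Pinball (quantile) loss for a single prediction at quantile level q in (0, 1).
// Under-prediction is penalized by q, over-prediction by (1 - q).
function pinballLoss(yTrue, yPred, q) {
  const diff = yTrue - yPred;
  return diff >= 0 ? q * diff : (q - 1) * diff;
}

// Mean pinball loss over paired arrays of observations and predictions.
function meanPinballLoss(yTrue, yPred, q) {
  const total = yTrue.reduce((sum, y, i) => sum + pinballLoss(y, yPred[i], q), 0);
  return total / yTrue.length;
}

// Fraction of observations that fall inside [lower, upper]; for a 10th-90th
// percentile band, a well-calibrated model would be expected to cover about 80%.
function intervalCoverage(yTrue, lower, upper) {
  const inside = yTrue.filter((y, i) => y >= lower[i] && y <= upper[i]).length;
  return inside / yTrue.length;
}

// Made-up PM2.5 values, for illustration only.
const observed = [10, 20, 30, 40];
const p10 = [8, 22, 25, 35];
const p90 = [12, 28, 35, 45];
console.log(meanPinballLoss(observed, p90, 0.9), intervalCoverage(observed, p10, p90));
```

The asymmetric penalty is what pushes each output toward its target percentile: at q = 0.9, predicting too low costs nine times as much as predicting too high by the same amount.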
Authors: Omprakash Sahani
Independent Research Project (Jan 2025 – Present)
Applied 5+ machine learning models (Random Forest, CNN, SVM) to 4 healthcare datasets (3,400+ records) in Python to evaluate diagnostic accuracy and risk prediction. Reviewed 20+ studies to identify gaps in Clinical Decision Support Systems and inform medical decision-making.
My primary research interests lie at the intersection of **Scalable Machine Learning & Artificial Intelligence** and **Robust Distributed Systems**.
Specifically, I am keen on exploring topics such as:
I am particularly excited about the work being done by faculty like **[Specific MIT Professor Name 1]** and **[Specific MIT Professor Name 2]** in the EECS department at MIT, whose research aligns closely with my aspirations in [mention their specific research area if you know it]. I look forward to contributing to and advancing these frontiers.
I'm always open to discussing new opportunities, collaborations, or research ideas. Feel free to reach out!