Omprakash Sahani

Building Scalable & Intelligent Systems

Code Terminal

$ > const OmprakashSahani = {

$ > role: 'Software Engineer | Data Scientist',

$ > focus: 'Scalable Systems & AI/ML',

$ > phd_pursuit: 'Electrical Engineering and Computer Science Research (MIT)',

$ > impact: [

$ > 'Productivity_Up_75%',

$ > 'WaitTime_Down_100%'

$ > ],

$ > tools: [

$ > 'Python', 'Java', 'C++', 'JS/TS',

$ > 'React/Next.js', 'PyTorch', 'Google Cloud Platform', 'Docker'

$ > ]

$ > };

$ >

$ > // Explore my work, contributions, and aspirations below!

A Glimpse into My Journey

I am a **Software Engineer and Data Scientist** with a passion for translating complex challenges into elegant, high-impact technical solutions. My drive lies at the intersection of scalable system design and intelligent data applications.

My expertise is demonstrated through engineering projects like the AI-powered Gemini Dev Assistant, which cut code generation time by 75%, and a Distributed Online Judge, which delivers sub-second Python/Java code execution. I thrive on architecting robust platforms on Google Cloud, applying core CS fundamentals and advanced ML techniques.

Beyond development, my intellectual curiosity compels me towards deep academic research. I explore applying cutting-edge machine learning to real-world domains like healthcare risk management, always seeking to bridge theoretical insights with practical, impactful implementations.

Proficient in Python, Java, C++, and JavaScript, with a strong foundation in DS&A and distributed systems, I am poised to contribute meaningfully to both industry-leading innovation and frontier academic advancement.

Education & Community

function getEducationInfo() {
  return {
    university: 'Sanjay Ghodawat University',
    degree: 'Bachelor of Technology (B.Tech) in Computer Science and Engineering',
    graduated: 'May 2023',
    gpa: '8.40/10.0 (3.62/4.0 U.S.)',
    distinction: 'First Class with Distinction',
    relevant_courses: [
      'Data Structures & Algorithms',
      'Operating Systems (Linux)',
      'Computer Networks (TCP/IP)',
      'Software Engineering',
      'Database Management System (DBMS)',
      'Distributed Systems',
      'Machine Learning',
      'Artificial Intelligence'
    ]
  };
}

function getCommunityInvolvement() {
  return [
    'Google Developer Program (Participant)',
    'GitHub Developer Program (Member)',
    'OpenAI Developer Program (Participant)',
    'LeetCode Profile: Active Participant',
    'HackerRank Profile: Certified Expert'
  ];
}

// Community involvement fuels growth.

My Aspirations // A Running Process

const myFuture = {
  role: 'Software Engineer | Data Scientist',
  target_company: 'Google',
  phd_goal: 'EECS @ MIT',
  impact_focus: 'Scalable AI & Distributed Systems'
};

My Technical Stack

Programming Languages

  • Python(Expert) ML, Data Analysis, Backend, Scripting
  • Java(Advanced) Backend Development, Distributed Systems
  • JavaScript/TypeScript(Advanced) Frontend (React, Next.js), Web APIs
  • C++(Intermediate) Algorithms, Data Structures, Competitive Programming
  • SQL(Expert) Database Management, Complex Queries
  • R Programming(Intermediate) Statistical Analysis, Data Visualization
  • MATLAB(Intermediate) Numerical Computing, Algorithm Prototyping
  • Shell Scripting(Advanced) Linux Automation, DevOps workflows

Web & Mobile Frameworks

  • React.js(Advanced) Modern UI, Component-based Architecture
  • Next.js(Advanced) SSR/SSG, App Router, Full-stack
  • HTML(Expert) Semantic Structure, Web Content Markup
  • CSS(Expert) Styling, Responsive Design, Animations
  • Flask(Advanced) RESTful APIs, Backend Development
  • WebSockets(Intermediate) Real-time Communication (Online Judge)
  • Flutter(Intermediate) Cross-platform Mobile Development

Machine Learning & AI

  • PyTorch(Advanced) Deep Learning, Neural Network Design
  • TensorFlow(Intermediate) ML Model Deployment, Production ML
  • XGBoost(Advanced) Gradient Boosting, Tree-based Models
  • Scikit-learn(Expert) Classical ML Algorithms, Data Preprocessing
  • Pandas/NumPy(Expert) Data Manipulation, Scientific Computing
  • Matplotlib(Advanced) Data Visualization, Plotting
  • NLTK(Advanced) Natural Language Processing (NLP)

Cloud Platforms & DevOps

  • Google Cloud Platform (GCP)(Advanced) Compute Engine, App Engine, API Gateway, Deployment
  • Firebase(Intermediate) Backend-as-a-Service, Realtime Database, Auth
  • AWS(Intermediate) EC2, S3, Lambda, DynamoDB, RDS, VPC
  • Docker(Expert) Containerization, Microservices Deployment
  • Git & GitHub(Expert) Version Control, Collaborative Development
  • Linux(Advanced) Operating System, Development Environment

Databases & Data Management

  • MySQL(Advanced) Relational Database Management
  • PostgreSQL(Advanced) Relational Database Design, Query Optimization
  • MongoDB(Intermediate) NoSQL Document Database
  • Redis(Intermediate) Caching, Task Orchestration (Online Judge)
  • SQL (Language)(Expert) Complex Queries, Data Manipulation

Tools & Methodologies

  • VS Code(Expert) Primary Integrated Development Environment
  • Jupyter Notebook(Advanced) Data Science Workflow, Prototyping
  • Google Colab(Advanced) Collaborative ML Development
  • RESTful APIs(Expert) Designing and Consuming APIs
  • Agile/Scrum(Advanced) Project Management, Iterative Development
  • Unit/Integration Testing(Expert) Robust Software Development
  • Debugging(Expert) Problem Isolation & Resolution

My Projects & Research

GEMINI DEV ASSISTANT

Developed an AI-powered Code Assistant using Google Gemini API, significantly reducing code generation time by 75% and boosting developer productivity. Engineered context management algorithms (90%+ accuracy). Deployed on GCP, ensuring API response times <500ms.

React Native, Flask, Google Gemini API, GCP, AI/ML, Python

SHELLAI

ShellAI is a full-stack, web-based terminal emulator that uses a large language model to act as a smart, efficient command-line assistant. The project's core functionality is translating natural language into executable shell commands, providing a seamless and intuitive user experience. It's designed to boost productivity for developers and system administrators by reducing the need to remember complex command-line syntax.

Python, OpenAI API, Flask, React, Dev Tools (Vite), GCP, CSS, Tailwind CSS

CUSTOM SEARCH ENGINE (MINI GOOGLE SEARCH)

Implemented a Python-based search engine indexing 50+ documents with an inverted index and TF-IDF ranking, achieving sub-millisecond query latency (~0.5 ms). Architected a robust, NLTK-powered, Dockerized pipeline with comprehensive testing.
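
As a rough illustration of how an inverted index and TF-IDF ranking fit together, here is a minimal sketch. It is not the project's actual code: the whitespace `tokenize` helper stands in for NLTK tokenization, and the sample documents and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (an assumed stand-in for NLTK)."""
    return text.lower().split()


def build_index(docs):
    """Map each term to {doc_id: term_count}; this is the inverted index."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def search(query, index, num_docs):
    """Rank documents by summed TF-IDF over the query terms, best first."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term appears in no document
        idf = math.log(num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical sample corpus.
docs = {
    "d1": "docker makes deployment repeatable",
    "d2": "tf-idf ranking weighs rare terms",
    "d3": "docker and ranking pipelines",
}
index = build_index(docs)
results = search("docker deployment", index, len(docs))
```

Only documents that share a term with the query are touched, which is why lookups stay fast as the corpus grows; here `d1` ranks above `d3` because it also matches the rarer term "deployment".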

Python, NLTK, Docker, Information Retrieval, Testing

GITHUB PROFILE ANALYZER WEB APP

Constructed a full-stack web app integrating 5+ GitHub REST APIs. Formulated a unique "GitHub Engagement Score" (0-100) using weighted heuristic models and efficient data aggregation. Deployed on GCP App Engine.
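
A weighted heuristic score of this kind can be sketched as below. The metric names, weights, and saturation caps are all hypothetical placeholders chosen for illustration; the actual inputs and weighting of the GitHub Engagement Score are not described here.

```python
# Hypothetical weights (summing to 1) and saturation caps per metric.
WEIGHTS = {"stars": 0.4, "followers": 0.3, "public_repos": 0.2, "forks": 0.1}
CAPS = {"stars": 500, "followers": 200, "public_repos": 50, "forks": 100}


def engagement_score(stats):
    """Return a 0-100 score from capped, weighted profile metrics."""
    total = 0.0
    for metric, weight in WEIGHTS.items():
        value = max(0, stats.get(metric, 0))
        normalized = min(value / CAPS[metric], 1.0)  # clamp to [0, 1]
        total += weight * normalized
    return round(100 * total, 1)


# Hypothetical aggregated profile data.
profile = {"stars": 250, "followers": 50, "public_repos": 25, "forks": 10}
score = engagement_score(profile)
```

Capping each metric before weighting keeps a single outlier (for example, one viral repository) from dominating the score, and guarantees the result stays within 0-100.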

Python, Flask, GitHub API, GCP App Engine, Web Development

DISTRIBUTED ONLINE JUDGE

Built a scalable Online Judge with sub-second Python/Java code execution. Orchestrated a Docker-isolated judging pipeline across 3 services using Redis/RQ for IPC, enabling seamless deployment on 3 core GCP services. Reduced user wait times by 100% via WebSockets.

Python, Java, Docker, Redis, WebSockets, GCP, Distributed Systems

HEALTHCARE ML INNOVATION (Research)

Applied 5+ machine learning models (Random Forest, CNN, SVM) to 4 healthcare datasets (3,400+ records) in Python. Reviewed 20+ studies to identify gaps in Clinical Decision Support Systems and evaluated diagnostic accuracy and risk prediction.

Python, Machine Learning, Random Forest, CNN, SVM, Healthcare Analytics

REAL-TIME WEATHER FORECASTING (Lead)

Spearheaded a team of 5 to build a Python-based weather application, including an enhanced GUI with 10% more features. Managed project lifecycle and ensured successful delivery.

Python, GUI Development, Team Leadership

AI-POWERED ENVIRONMENTAL INTELLIGENCE PLATFORM

Architected and developed a full-stack application that transforms complex environmental data into actionable insights and engaging narratives. Integrated real-time and historical air quality data, enabling AI-driven analysis, creative content generation (social posts, detailed reports), and interactive data visualization. Implemented data persistence with PostgreSQL and ensured seamless frontend-backend communication.

Python, Flask, PostgreSQL, Docker, React, Vite, Leaflet.js, Chart.js, OpenWeatherMap API, OpenAI API, RESTful APIs

AI-POWERED ROBOTIC ARM SIMULATOR

Designed and implemented an intelligent robotic arm simulator capable of executing natural language commands in a 3D physics environment. Leveraged AI to translate high-level human instructions into precise robotic movements for pick-and-place and complex stacking tasks. Overcame significant challenges in inverse kinematics, grasping stability, and collision avoidance within the simulation, demonstrating robust control logic.

Python, PyBullet, OpenAI API, Flask, React, Vite, Natural Language Processing (NLP), Robotics Simulation, Inverse Kinematics

AI-POWERED DRUG DISCOVERY PLATFORM (FULL-STACK)

Architected and developed a full-stack application that leverages machine learning to predict the biological activity of new drug molecules. The platform features a user-friendly interface for molecular data input and visualization, a trained model for accurate predictions, and a robust backend that dynamically generates AI-powered explanations for a non-technical audience. It showcases skills in data science, AI integration, and full-stack web development.

Python, OpenAI API, Flask, React, Vite, scikit-learn, RDKit, ChEMBL Database

PREDICTING HEART DISEASE RISK USING MACHINE LEARNING MODEL

Developed a predictive heart disease model using Random Forest on a dataset of 1,026 records, achieving 99% accuracy and visualizing key feature importance to support earlier interventions for patients.

Python, TensorFlow, Matplotlib, Jupyter Notebook, Pandas, scikit-learn, Kaggle Dataset

MEDICAL IMAGE ANALYSIS FOR BRAIN TUMOR DETECTION USING CONVOLUTIONAL NEURAL NETWORK (CNN)

Engineered a convolutional neural network (CNN) to classify brain MRI images (253 images), achieving 90% accuracy with 30 images and 86% with 51, to support quicker diagnosis and insight. A self-motivated project.

Python, TensorFlow, Matplotlib, Jupyter Notebook, Pandas, scikit-learn, Kaggle Dataset, Keras

DIABETES PREDICTION ANALYSIS USING MACHINE LEARNING APPROACH

Devised a diabetes prediction model using the Pima Indians dataset (765 records, 9 features), showing SVM (80%) and Logistic Regression (79.13%) as most effective for early detection, highlighting analytical rigor.

R, Shiny, ggplot2, caret, XGBoost, Kaggle Dataset

MATHEMATICAL VISUALIZATIONS IN MATLAB

Built 7+ interactive MATLAB visualizations (3D plots, vector fields, and fractals) to communicate complex mathematical ideas, with 7+ downloads on MATLAB Central, reflecting curiosity and knowledge sharing.

MATLAB, MATLAB Graphics, Statistics and Machine Learning Toolbox

HYBRID TIME-SERIES FORECASTING OF ENERGY CONSUMPTION USING XGBOOST AND LSTM WITH WEATHER, TEMPORAL, AND HOLIDAY FEATURES

Implemented a hybrid XGBoost-LSTM model for energy forecasting using the UCI dataset (370 time-series, 15-minute intervals), integrating weather, temporal, and holiday data to improve accuracy, reflecting initiative and real-world impact.
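
To give a sense of what temporal and holiday features can look like for 15-minute interval data, here is a small sketch. The cyclical sine/cosine encoding, the feature names, and the one-entry holiday calendar are assumptions made for illustration, not a description of the project's actual feature set.

```python
import math
from datetime import date, datetime

# Hypothetical holiday calendar; a real one would cover the full date range.
HOLIDAYS = {date(2014, 1, 1)}

SLOTS_PER_DAY = 96  # 24 hours * 4 fifteen-minute intervals


def temporal_features(ts):
    """Derive time-of-day, day-of-week, and holiday features from a timestamp."""
    slot = ts.hour * 4 + ts.minute // 15  # 0..95
    angle = 2 * math.pi * slot / SLOTS_PER_DAY
    return {
        # Sine/cosine pair so 23:45 and 00:00 end up close together.
        "slot_sin": math.sin(angle),
        "slot_cos": math.cos(angle),
        "day_of_week": ts.weekday(),  # Monday = 0
        "is_weekend": int(ts.weekday() >= 5),
        "is_holiday": int(ts.date() in HOLIDAYS),
    }


features = temporal_features(datetime(2014, 1, 1, 6, 0))
```

Features like these can be joined with weather readings on the timestamp and passed to either the tree-based or the sequence model.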

Python, Jupyter Notebook, TensorFlow, XGBoost, ElectricityLoadDiagrams20112014 Dataset

PROBABILISTIC TIME SERIES FORECASTING USING QUANTILE REGRESSION AND DEEP LEARNING

Leveraged an LSTM-based quantile regression model to forecast PM2.5 concentrations, generating probabilistic intervals that capture predictive uncertainty through 10th, 50th, and 90th percentile estimates and inform air quality decision-making.
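
Quantile regression is commonly trained with the pinball (quantile) loss, which penalizes under- and over-prediction asymmetrically. The sketch below shows that loss in plain Python under the assumption that it is the objective in use; the text above does not state the exact loss, and the sample values are made up.

```python
def pinball_loss(y_true, y_pred, quantile):
    """Mean pinball (quantile) loss for one quantile level in (0, 1)."""
    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        error = actual - predicted
        # Under-prediction costs q * error; over-prediction costs (1 - q) * |error|.
        total += max(quantile * error, (quantile - 1) * error)
    return total / len(y_true)


# Hypothetical observations and a forecast that runs slightly high.
observed = [10.0, 20.0, 30.0]
forecast = [12.0, 22.0, 30.0]
losses = {q: pinball_loss(observed, forecast, q) for q in (0.1, 0.5, 0.9)}
```

Because this forecast over-predicts, the loss is largest at the 10th percentile and smallest at the 90th. That asymmetry is what pushes each output toward its own quantile and yields a prediction interval rather than a single point.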

Python, Jupyter Notebook, TensorFlow, XGBoost, Beijing PM2.5 Dataset

Research & Publications

TRANSFORMING HEALTHCARE WITH MACHINE LEARNING

Authors: Omprakash Sahani

Independent Research Project (Jan 2025 – Present)

Applied 5+ machine learning models (Random Forest, CNN, SVM) to 4 healthcare datasets (3,400+ records) in Python to evaluate diagnostic accuracy and risk prediction. Reviewed 20+ studies to identify gaps in Clinical Decision Support Systems and inform medical decision-making.

Machine Learning, Healthcare AI, Random Forest, CNN, SVM, Medical Informatics

My Research Interests

My primary research interests lie at the intersection of **Scalable Machine Learning & Artificial Intelligence** and **Robust Distributed Systems**.

Specifically, I am keen on exploring topics such as:

  • **Optimizing Distributed ML Systems:** Researching efficient training and deployment of AI models across large-scale distributed environments (e.g., Federated Learning architectures, distributed inference).
  • **AI for System Resilience & Optimization:** Applying AI/ML techniques to enhance the fault tolerance, resource management, and performance of complex distributed systems.
  • **Algorithmic Innovation in NLP & Information Retrieval:** Developing novel algorithms for contextual understanding and efficient data access in large unstructured datasets.
  • **Machine Learning for Decision Support:** Leveraging ML models for critical decision-making and risk prediction in complex domains like healthcare.

I am particularly excited about the work being done by faculty in the EECS department at MIT whose research aligns closely with these interests. I look forward to contributing to and advancing these frontiers.

Let's Connect

I'm always open to discussing new opportunities, collaborations, or research ideas. Feel free to reach out!