Projects

PixeLearner Project

PixeLearner: A Personalized ML-Powered App

Advanced Stellar Modeling

Machine Learning Project

Sentiment Analysis on Movie Reviews using NLP and Machine Learning

SVR

Forecasting Stock prices using RNN

EnergyDash

PixeLearner: Merging Vision with Voice Recognition

PixeLearner was conceptualized with the primary intention of seamlessly integrating vision-based machine learning with the intricacies of natural language processing. The objective was clear - create a tool that offers a natural way to recognize and label individuals, thereby enhancing personal interactions.

Architecture showing how the app works.
Objective

The primary aim of PixeLearner is to seamlessly emulate human interactions - to spot familiar faces and instantly recall associated names, just as one would during a friendly meetup.

Prime Use Cases
  1. Identification of close acquaintances, colleagues, or family members from an ongoing camera feed.
  2. Efficiently linking names to faces in an almost organic manner, fostering an environment of familiarity.
  3. Real-time model enhancement with each new introduction, making it an evolving tool.
  4. Associating faces with previously remembered data and contexts.
The Edge PixeLearner Offers
  • Protects user privacy through on-device processing.
  • Sets the stage for extensive adaptations in diverse application domains.
  • Immerses budding developers into a world of ML and NLP integration, offering a holistic learning curve.
Architectural Workflow of PixeLearner
  • Camera Integration: Using AVFoundation, the app continuously captures and processes live video frames, with attention to resource usage so that memory problems are avoided.
  • Model Analysis: Each frame is processed by a custom CNN model based on the MobileNetV2 architecture, which produces the facial feature embeddings used for recognition.
  • Audio-Text Transformation: A sophisticated functionality permits users to provide vocal labels. These audio inputs are subsequently transcribed into textual data through our advanced speech-to-text subsystem.
  • BERT’s NLP Framework: The transcribed text is passed to BERT, a widely used pretrained language model, whose tokenizer handles tokenization and normalization of the input. For tokens outside BERT's vocabulary, Apple's NLTagger provides additional segmentation and classification.
  • Facial & Linguistic Synchronization: The interplay between facial embeddings, derived from MobileNetV2, and labels processed via BERT ensures real-time associations between recognized faces and contextual labels.
  • Continuous Model Refinement: PixeLearner's hallmark is its adaptability. The model undergoes perpetual enhancement by assimilating new labels and recognitions, ensuring heightened accuracy over time.
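
The association step in the workflow above can be sketched in a few lines. This is a minimal illustration in Python, not the app's on-device code: the `FaceLabelStore` class, the cosine-similarity matching rule, the 0.8 threshold, and the three-dimensional toy embeddings are all assumptions made for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

class FaceLabelStore:
    """Hypothetical in-memory store that maps spoken labels to face embeddings."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed cutoff, not taken from the project
        self.entries = []           # list of (label, embedding) pairs

    def add(self, label, embedding):
        """Remember a new face; this is the 'continuous refinement' step."""
        self.entries.append((label, list(embedding)))

    def identify(self, embedding):
        """Return the best-matching label, or None if no match clears the threshold."""
        best_label, best_score = None, self.threshold
        for label, known in self.entries:
            score = cosine_similarity(embedding, known)
            if score >= best_score:
                best_label, best_score = label, score
        return best_label

store = FaceLabelStore()
store.add("Alice", [0.9, 0.1, 0.0])
store.add("Bob", [0.0, 0.2, 0.9])
match = store.identify([0.8, 0.2, 0.1])    # close to Alice's embedding
unknown = store.identify([0.5, 0.5, 0.5])  # not close enough to anyone
```

Returning `None` below the threshold is what would let an app ask for a new vocal label instead of guessing.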
More than just an app!

PixeLearner is more than just a project; it represents a step forward in how we interact with our environment. It's a testament to what can be achieved when vision and voice come together, and I'm excited about the path ahead. I am planning to submit the app for review, but a few minor things need to be polished first. Thanks for reading! You can check out the code on my GitHub.

View this project on GitHub

Advanced Sentiment Analysis on Movie Reviews: A Sophisticated Approach Using NLP and TF-IDF Vectorization

Project Overview

This project combines NLP preprocessing, TF-IDF features, and classical machine learning classifiers to classify movie reviews by sentiment. The models were designed, tuned, and compared with the goal of accurate sentiment classification.

Data Collection and Ingestion

Data was sourced from the NLTK movie_reviews corpus and ingested through a comprehensive ETL pipeline using AWS Glue and Lambda functions, reflecting an efficient and scalable architecture.

Preprocessing and Feature Engineering

Data preprocessing involved several complex steps, optimized for performance and accuracy:

  • Tokenization: Used advanced algorithms to tokenize text, preserving semantics.
  • Stopword Removal: Applied entropy-based measures to selectively remove non-contributive words.
  • Stemming: Employed Lancaster stemming algorithm, optimized for the specific corpus.
  • Vectorization: Transformed text into numerical vectors using TF-IDF, represented by:
    \[ \text{TF-IDF} = \text{TF} \times \left(1 + \log\left(\frac{N}{1 + \text{DF}}\right)\right) \]
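
As a rough illustration of the TF-IDF formula above, here is a minimal pure-Python sketch. It assumes TF is the term count divided by document length (the formula does not define TF), and the three toy documents are made up. Note that scikit-learn's `TfidfVectorizer` uses a slightly different smoothing and normalization by default, so its numbers will not match this sketch exactly.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    # DF: number of documents that contain each term at least once
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        doc_weights = {}
        for term, count in counts.items():
            tf = count / total  # assumed definition: length-normalized count
            idf = 1.0 + math.log(n_docs / (1 + doc_freq[term]))
            doc_weights[term] = tf * idf
        weights.append(doc_weights)
    return weights

# Toy, already-tokenized reviews (illustrative only)
docs = [["great", "movie", "great", "cast"], ["boring", "movie"], ["great", "plot"]]
scores = tf_idf(docs)
```

In the second document, "boring" appears in only one review and so receives a higher weight than "movie", which appears in two.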
Model Architecture and Development

Engineered a composite model combining Logistic Regression, Random Forest, Naive Bayes, and SVM with ensemble learning techniques. Utilized stochastic gradient descent for optimization, with a custom loss function defined by:

\[ \mathcal{L}(\theta) = -\sum_{i} \left[ y_i \log(\hat{y}_i) + (1-y_i)\log(1-\hat{y}_i) \right] + \lambda\|\theta\|_2^2 \]
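
A minimal sketch of this loss in plain Python, reading the cross-entropy sum as covering both the \( y \log(\hat{y}) \) and \( (1-y)\log(1-\hat{y}) \) terms. The function name, the `eps` clipping, and the example values are assumptions for illustration.

```python
import math

def regularized_log_loss(y_true, y_prob, theta, lam=0.01, eps=1e-12):
    """Binary cross-entropy summed over samples, plus an L2 penalty on theta."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    penalty = lam * sum(w * w for w in theta)  # lambda * ||theta||_2^2
    return -total + penalty

# Two samples: a positive predicted at 0.9 and a negative predicted at 0.2
loss = regularized_log_loss([1, 0], [0.9, 0.2], theta=[0.5, -0.5], lam=0.1)
```

Larger values of `lam` trade training fit for smaller weights, which is the usual motivation for the penalty term.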
Training, Hyperparameter Tuning, and Evaluation

Implemented a training regimen with cross-validation and GridSearch for hyperparameter tuning. The models were evaluated using precision, recall, F1-score, and ROC-AUC metrics to check for well-balanced classification performance.
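
For reference, the evaluation metrics named above can be computed by hand as follows. This is a sketch with made-up labels and scores, not the project's evaluation code, and the pairwise ROC-AUC here is the simple O(n²) definition rather than an optimized implementation.

```python
def classification_metrics(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

def roc_auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative labels, hard predictions, and scores
labels = [1, 1, 0, 0, 1]
precision, recall, f1 = classification_metrics(labels, [1, 0, 0, 1, 1])
auc = roc_auc(labels, [0.9, 0.4, 0.3, 0.6, 0.8])
```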

Results, Insights, and Future Direction

The SVM model achieved the highest ROC-AUC score of the models tested. Insights from this project are relevant to areas like targeted marketing and user experience enhancement. Future directions involve integrating deep learning algorithms and experimenting with alternative vectorization strategies.

ROC Curve
ROC curve for the models, illustrating an adept understanding of classification performance.
Confusion matrix
Confusion matrix for different models, emphasizing precision, recall, and interpretability.

View this project on GitHub

Forex Price Forecasting: An Advanced Approach Using Recurrent Neural Networks (RNNs)

Forex Price Forecast
Time-Series Prediction of Forex Prices using Recurrent Neural Networks, Showcasing Predictive Accuracy and Trend Insights.
Introduction and Objective

The goal of this project is to predict future foreign exchange rates using Recurrent Neural Networks (RNNs), a type of deep learning model well-suited for sequence prediction tasks.

Modeling Strategy and Methodology

The RNN architecture is designed to capture temporal dependencies in the data. The model is formulated as:

\[ h_t = \sigma(W_{hh} h_{t-1} + W_{xh} x_t) \] \[ y_t = W_{hy} h_t \]

Here, \( h_t \) is the hidden state at time \( t \), \( x_t \) is the input at time \( t \), \( W \) represents weight matrices, and \( \sigma \) is the activation function.
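
To make the recurrence concrete, here is a minimal forward pass in plain Python. It assumes \( \sigma = \tanh \) and a zero initial hidden state, neither of which is stated above, and the weights and inputs are toy values rather than anything from the trained model.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def rnn_forward(inputs, w_xh, w_hh, w_hy, activation=math.tanh):
    """Run h_t = sigma(W_hh h_{t-1} + W_xh x_t), y_t = W_hy h_t over a sequence."""
    hidden = [0.0] * len(w_hh)  # assumed zero initial state
    outputs = []
    for x_t in inputs:
        pre = [a + b for a, b in zip(matvec(w_hh, hidden), matvec(w_xh, x_t))]
        hidden = [activation(v) for v in pre]
        outputs.append(matvec(w_hy, hidden))
    return outputs, hidden

# Toy sizes: 1 input feature, 2 hidden units, 1 output
w_xh = [[0.5], [-0.3]]
w_hh = [[0.1, 0.0], [0.0, 0.1]]
w_hy = [[1.0, 1.0]]
outputs, final_hidden = rnn_forward([[1.0], [0.5]], w_xh, w_hh, w_hy)
```

The hidden state carried from one step to the next is what lets the model use earlier prices when predicting later ones.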

Data preprocessing includes Min-Max scaling, expressed mathematically as:

\[ x_{\text{scaled}} = \frac{x - x_{\text{min}}}{x_{\text{max}} - x_{\text{min}}} \]
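
A short sketch of this scaling and its inverse, which is needed to turn scaled predictions back into prices. The function names and the sample rates are illustrative; the constant-series fallback is an assumption added to avoid dividing by zero.

```python
def min_max_scale(values):
    """Scale values to [0, 1]; also return the min and max for later inversion."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant series: map everything to 0.0 (assumed convention)
        return [0.0 for _ in values], lo, hi
    return [(v - lo) / (hi - lo) for v in values], lo, hi

def inverse_scale(scaled, lo, hi):
    """Map scaled values back to the original range."""
    return [s * (hi - lo) + lo for s in scaled]

rates = [1.08, 1.10, 1.12, 1.09]  # made-up exchange rates
scaled, lo, hi = min_max_scale(rates)
restored = inverse_scale(scaled, lo, hi)
```

In a forecasting setup, the minimum and maximum would normally be taken from the training split only, so that information from the test period does not leak into the inputs.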
Results and Insights

The model demonstrated robust performance in predicting forex price movements. The predicted and actual prices closely followed each other, validating the model's accuracy. The use of an interactive Plotly graph allowed for an in-depth analysis of model predictions.

View on GitHub

EnergyDash: Intelligent Energy Consumption Analytics Platform

December Power Consumption
July Power Consumption
Utilizing Machine Learning to Analyze Seasonal Variation in Power Consumption for Suburban Houses in the Sunbelt Region.
Overview

EnergyDash is an analytics platform designed to empower individuals to understand and optimize their household electricity consumption. Integrating real-time data via an API from power providers, it translates complex usage metrics into actionable insights.

Technical Implementation

The back-end of the application leverages AWS Lambda for serverless computation and Kinesis for real-time data streaming. A regression model built with Scikit-Learn analyzes consumption patterns, while the front-end visualization is constructed with Plotly.
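
As a rough idea of what a consumption regression does, the sketch below fits a straight line by ordinary least squares in plain Python. It is a stand-in, not the Scikit-Learn model used in the project: the choice of outdoor temperature as the feature and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical daily readings: temperature (deg C) vs. usage (kWh)
temps = [20.0, 25.0, 30.0, 35.0]
usage = [10.0, 14.0, 18.0, 22.0]
slope, intercept = fit_line(temps, usage)
predicted = slope * 32.0 + intercept  # estimated usage on a 32 deg C day
```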

Interactive Dashboard: Facilitating Intelligent Energy Management.
Impact and Value

EnergyDash provides a tailored user experience, enabling households to identify inefficiencies and take control of their energy spending. Through sophisticated analysis and user-friendly visualization, the platform contributes to sustainable living and cost savings.

View on GitHub

Advanced Stellar Modeling: Magnetohydrodynamic Simulations of White Dwarf Mergers

Evolution of a white dwarf merger. The simulation was run with the unsplit staggered mesh MHD solver in FLASH. The movie shows the plasma beta (the ratio of gas pressure to magnetic pressure) varying with time. The image is a density slice plot at a resolution of 256 × 256 cells, 1050 s after the merger.
Project Overview

This project applied computational techniques to simulate and analyze white dwarf merger events. By solving the magnetohydrodynamic (MHD) equations, the study extended the understanding of magnetized stellar interactions.

MHD Formulation and Conservation Laws

The simulation relies on the MHD equations, encompassing the conservation of mass, momentum, energy, and magnetic induction. These equations are expressed as:

\(\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{v}) = 0\) (Conservation of Mass)

\(\frac{\partial \rho \mathbf{v}}{\partial t} + \nabla \cdot (\rho \mathbf{v} \mathbf{v} - \mathbf{B} \mathbf{B}) + \nabla p^* = 0\) (Momentum Conservation)

\(\frac{\partial E}{\partial t} + \nabla \cdot [(E + p^*) \mathbf{v} - \mathbf{B}(\mathbf{v} \cdot \mathbf{B})] = 0\) (Energy Conservation)

\(\frac{\partial \mathbf{B}}{\partial t} = \nabla \times (\mathbf{v} \times \mathbf{B})\) (Magnetic Induction)

Here, \(\rho\) is the density, \(\mathbf{v}\) the velocity, \(\mathbf{B}\) the magnetic field, \(E\) the total energy density, and \(p^*\) the total pressure including magnetic contributions.
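
The total pressure \(p^*\) and the plasma beta (gas pressure over magnetic pressure) can be illustrated with a short Python sketch. It assumes units in which the magnetic pressure is \(|\mathbf{B}|^2 / 2\), which is consistent with the equations above carrying no \(4\pi\) or \(\mu_0\) factors; in other unit systems the prefactor differs. The function names and sample values are made up.

```python
def magnetic_pressure(b):
    """Magnetic pressure |B|^2 / 2, assuming the normalized units described above."""
    return 0.5 * sum(component * component for component in b)

def total_pressure(p_gas, b):
    """p* = gas pressure + magnetic pressure."""
    return p_gas + magnetic_pressure(b)

def plasma_beta(p_gas, b):
    """Ratio of gas pressure to magnetic pressure; infinite where B vanishes."""
    p_mag = magnetic_pressure(b)
    if p_mag == 0.0:
        return float("inf")
    return p_gas / p_mag

# Illustrative cell values: gas pressure 2.0, field (1, 0, 1)
p_star = total_pressure(2.0, (1.0, 0.0, 1.0))
beta = plasma_beta(2.0, (1.0, 0.0, 1.0))
```

Regions with beta well above 1 are dominated by gas pressure, while beta below 1 marks regions where the magnetic field dominates the dynamics.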

Simulation Methodology

The simulations were conducted using the unsplit staggered mesh MHD solver in FLASH on the Stampede2 supercomputer. The project leveraged High-Performance Computing (HPC) for detailed analysis, facilitating comparison with traditional hydrodynamic models. This exploration adds to current astronomical understanding of the dynamics of stellar mergers.

White Paper

View on GitHub