Tajamul Khan

I'm a Data Scientist, Analytics & Business Consultant, Career Mentor, Author, Content Creator, and a YouTube Educator. With over 6 years of experience at industry giants like Google, Amazon, and Eastman, and a prestigious degree in Data Science and Artificial Intelligence from MIT, I can confidently say, "I am a data veteran."

Skills

Python

NumPy

Pandas

Plotly

TensorFlow

Power BI

Tableau

MySQL

PostgreSQL

Microsoft SQL Server

SQLite

SciPy

Scikit-learn

Git

Jupyter

Microsoft Excel

PySpark

Amazon Web Services (AWS)

NoSQL

R

Keras

LaTeX

Cassandra

Streamlit

PyTorch

FastAPI

JSON

Looker

HTML

Projects

Airbnb Stock Price Prediction

Machine Learning – Regression

Developed a regression model to forecast Airbnb stock prices using historical financial data. Focused on feature engineering, trend analysis, and model tuning using algorithms like Linear Regression and Random Forest. Useful for understanding time-series behavior and investment patterns.

Airline Satisfaction Prediction

Machine Learning – Classification

Built a classification model to predict passenger satisfaction based on in-flight experience, service ratings, and demographics. Compared models like Logistic Regression and Decision Trees to enhance customer segmentation and loyalty strategies.

Airlines Delay Prediction

Machine Learning – Classification

Designed a predictive system to forecast flight delays using features like route, schedule, and weather. Improved airline operations and passenger experience through proactive planning.

Bank Note Authentication

Machine Learning – Classification

Built a model to detect counterfeit banknotes using statistical attributes such as variance, skewness, and kurtosis. Implemented SVM and k-NN for fast, accurate financial fraud detection.

Chronic Kidney Disease Prediction

Machine Learning – Classification

Predicted CKD using patient lab results and clinical indicators to help doctors with early diagnosis and preventive treatment planning. Enhanced accuracy by integrating diverse data and advanced machine learning.

Credit Score Prediction

Machine Learning – Classification

Classified individuals into creditworthiness bands using features like payment history, debt ratio, and income. Useful for automating lending decisions.

Customer Churn Prediction

Machine Learning – Classification

Predicted churn probability and timing using customer lifecycle metrics. Helped businesses build effective retention strategies.

Customer Satisfaction Prediction

Machine Learning – Classification

Analyzed customer sentiment and predicted satisfaction scores using service interaction data. Applied Decision Trees and Logistic Regression for feedback analysis.

Diabetes Prediction

Machine Learning – Classification (Ensemble)

Used XGBoost and Gradient Boosting to predict diabetes based on health indicators. Focused on early detection with high precision and recall.

Drug Classification

Machine Learning – Classification

Predicted drug categories for patients based on their vitals and clinical profile. Enhanced decision-making in treatment plans.

E-Commerce Shipping Prediction

Machine Learning – Classification

Predicted on-time or delayed delivery based on order characteristics and shipping data. Helped streamline logistics and improve customer satisfaction.

Heart Disease Prediction

Machine Learning – Classification

Predicted heart disease risk using medical parameters like cholesterol and ECG results. Focused on reducing false negatives through ensemble learning.

Hotel Booking Status Prediction

Machine Learning – Classification

Forecasted booking cancellations using reservation details. Supported revenue management and overbooking strategies.

Housing Cost Prediction

Machine Learning – Regression

Predicted housing prices using features like area, location, and number of bedrooms. Used Linear, Ridge, and Lasso regression to compare model efficiency.
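As a rough illustration of how a linear and a ridge fit can be compared, the sketch below uses the closed-form solution for a single feature. The `fit_line` helper and the toy area/price figures are hypothetical and are not taken from the project; Lasso is left out because it has no equally simple closed form.

```python
def fit_line(xs, ys, alpha=0.0):
    """Fit y = intercept + slope * x with an L2 penalty `alpha` on the slope.

    alpha = 0 gives ordinary least squares; alpha > 0 gives ridge regression
    for one feature (the intercept is not penalized).
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Centered sums of products and squares.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)  # the penalty shrinks the slope toward zero
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Hypothetical toy data: area (in hundreds of sq ft) and price (in thousands).
areas = [10, 15, 20, 25, 30]
prices = [200, 290, 410, 500, 600]

ols_slope, ols_intercept = fit_line(areas, prices)
ridge_slope, ridge_intercept = fit_line(areas, prices, alpha=50.0)

print("OLS:  ", ols_slope, ols_intercept)
print("Ridge:", ridge_slope, ridge_intercept)
```

The ridge slope comes out smaller than the least-squares slope, which is the shrinkage effect that a model comparison of this kind would look at.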

Insurance Premium Prediction

Machine Learning – Regression

Estimated premium amounts using demographics, vehicle specs, and claim history. Helped automate pricing strategies in the insurance sector.

IPL Winner Prediction

Machine Learning – Classification

Predicted match winners using past player stats, team strength, and game conditions. Showcased the use of machine learning in sports analytics.

Breast Cancer Prediction

Machine Learning – Classification

Created a diagnostic model for breast cancer prediction using biopsy data. Prioritized high recall and model interpretability with Logistic Regression and Random Forest to support healthcare decisions.

Loan Approval Prediction

Machine Learning – Classification

Predicted loan approvals based on applicant income, employment type, and credit score. Enabled banks to automate and streamline the lending process.

Customer Segmentation

Machine Learning – Clustering

Segmented customers based on spending habits and demographics using RFM analysis and K-Means clustering. Aimed to enhance targeted marketing strategies.
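The RFM step can be sketched in a few lines of plain Python. The transaction records, the `rfm_table` helper, and the reference date below are hypothetical; in a pipeline like the one described, the resulting recency, frequency, and monetary values would typically be scaled and then passed to a clustering algorithm such as K-Means.

```python
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount).
TRANSACTIONS = [
    ("C1", date(2024, 1, 5), 120.0),
    ("C1", date(2024, 3, 1), 80.0),
    ("C2", date(2023, 11, 20), 40.0),
    ("C3", date(2024, 2, 14), 300.0),
    ("C3", date(2024, 2, 28), 150.0),
    ("C3", date(2024, 3, 10), 50.0),
]


def rfm_table(transactions, as_of):
    """Compute recency (days), frequency (count), and monetary (sum) per customer."""
    table = {}
    for customer, day, amount in transactions:
        row = table.setdefault(
            customer, {"last": day, "frequency": 0, "monetary": 0.0}
        )
        row["last"] = max(row["last"], day)  # most recent purchase date
        row["frequency"] += 1
        row["monetary"] += amount
    return {
        customer: {
            "recency": (as_of - row["last"]).days,
            "frequency": row["frequency"],
            "monetary": row["monetary"],
        }
        for customer, row in table.items()
    }


rfm = rfm_table(TRANSACTIONS, as_of=date(2024, 3, 31))
for customer, metrics in sorted(rfm.items()):
    print(customer, metrics)
```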

Book Recommendation Engine

Machine Learning – Recommendation System

Developed a book recommendation engine leveraging collaborative filtering and content-based filtering to suggest personalized reads. Enhanced user experience by analyzing reading patterns and preferences.

Movie Recommender Application

Machine Learning – Recommendation System

Built a content-based movie recommendation system using cosine similarity and user preferences. Helped users discover similar films based on genres, cast, and keywords.
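A minimal sketch of content-based recommendation with cosine similarity, assuming each film is described by a short string of tags (genres, cast, keywords). The catalog, the tags, and the helper names are hypothetical and only illustrate the idea; the actual application may build its vectors differently.

```python
import math
from collections import Counter

# Hypothetical toy catalog: title -> descriptive tags.
MOVIES = {
    "Space Odyssey": "sci-fi space adventure astronaut",
    "Galaxy Quest": "sci-fi space comedy adventure",
    "Love Actually": "romance comedy london christmas",
    "Star Voyage": "sci-fi space astronaut adventure drama",
}


def vectorize(text):
    """Turn a tag string into a bag-of-words count vector."""
    return Counter(text.split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(title, catalog, top_n=2):
    """Return the top_n titles most similar to `title`, best match first."""
    target = vectorize(catalog[title])
    scores = [
        (other, cosine_similarity(target, vectorize(tags)))
        for other, tags in catalog.items()
        if other != title
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scores[:top_n]]


print(recommend("Space Odyssey", MOVIES))
```

Films that share more tags with the selected title score closer to 1, while films with no tags in common score 0.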

Walmart Sales Forecasting

Time Series – Forecasting

Analyzed Walmart sales data to extract useful insights and built prediction models that forecast sales for a chosen number of months or years, helping balance supply and demand in a way that meets the enterprise's financial and service objectives.

Air Passengers Traffic Forecasting

Time Series – Forecasting

Developed a time series forecasting model to predict monthly air passenger traffic. Employed techniques such as ARIMA and SARIMA to capture trends and seasonality in historical data. This model aids in strategic planning for airlines and airport authorities by anticipating passenger volumes.

Marketing Campaign Analysis

Machine Learning – Classification

The goal of this project is to improve the bank’s telemarketing campaign effectiveness while uncovering key pain points in customer engagement.

Gold Price Forecasting

Time Series – Forecasting

Developed a time series model to predict gold prices by analyzing historical data. Utilized ARIMA and SARIMA techniques to capture trends and seasonality, aiding investors and financial analysts in anticipating market movements.

Marketing Campaign Analysis

Statistical Analysis - A/B Testing

Conducted comprehensive market research using statistical techniques to analyze consumer behavior and preferences. Employed methods such as descriptive statistics and inferential analysis to uncover key insights, aiding in strategic decision-making and targeted marketing efforts.

HR Analytics

Statistical Analysis – Hypothesis Testing

Conducted exploratory data analysis and applied statistical methods to HR datasets to uncover patterns related to employee attrition and satisfaction. Utilized techniques such as correlation analysis, hypothesis testing, and data visualization to identify key factors influencing employee turnover, aiding in the development of data-driven HR strategies.

Titanic Survivor Analysis

Exploratory Data Analysis - EDA

Performed in-depth exploratory data analysis and applied statistical techniques to the Titanic dataset to uncover survival trends. Visualized relationships between features like age, gender, class, and survival rate using histograms, boxplots, and hypothesis testing to derive actionable insights from historical passenger data.

911 Calls Analysis

Exploratory Data Analysis - EDA

Performed analysis on 911 call records to uncover patterns in emergency response types, call volumes, and temporal trends. Utilized data visualization techniques to identify peak call times, common incident categories, and geographical distributions, providing insights to enhance emergency response strategies.

Flight Price Analysis

Exploratory Data Analysis - EDA

Conducted comprehensive exploratory data analysis on flight pricing data to identify key factors influencing airfare variations. Analyzed attributes such as airline, source, destination, departure time, and duration to uncover patterns and trends. Utilized data visualization techniques to present insights, aiding in understanding pricing dynamics in the aviation industry.

Zomato Business Analytics

Exploratory Data Analysis - EDA

Conducted an in-depth exploratory data analysis on Zomato's restaurant dataset to uncover key business insights. Analyzed factors such as restaurant ratings, pricing, and location to identify trends and patterns. Utilized data visualization techniques to present findings, aiding in strategic decision-making for market positioning and customer targeting.

Average Salary Dashboard

Visualization - Tableau

This dashboard explores average salaries across the US by state, industry, and job role. It highlights regional and sector-wise pay differences, enabling users to understand salary trends and support informed decision making.

Sales Analysis Dashboard

Visualization - Tableau

This interactive dashboard examines sales performance across European countries and regions. It highlights trends, regional differences, and key product categories, helping users identify growth opportunities and make data-driven sales decisions.

Customer Clustering Dashboard

Visualization - Tableau

This dashboard segments customers into distinct clusters based on purchasing behavior and demographics. It helps users understand customer groups, tailor marketing strategies, and improve targeting for better business outcomes.

Unemployment Analysis Dashboard

Visualization - Tableau

This interactive dashboard visualizes unemployment trends over time, highlighting differences by state, gender, and year. It’s designed for clear storytelling, enabling users to explore patterns and make informed decisions.

Electric Vehicle Sales Report

Visualization - Power BI

This report analyzes electric vehicle (EV) performance and market trends, covering battery efficiency, charging infrastructure, sales, and adoption rates. It highlights sustainability impacts like CO2 reduction and fuel savings, helping businesses and policymakers make informed decisions in the growing EV sector.

Amazon Prime Sales Report

Visualization - Power BI

This report explores 9,600+ titles across 87 countries, highlighting top genres like drama and comedy, common ratings, and trends in content growth. It offers a clear view of Prime Video’s global reach and evolving content mix.

Automotive Sales Report

Visualization - Power BI

This interactive report offers an in-depth analysis of car sales trends, highlighting performance across regions, models, and time periods. It helps users identify key sales drivers, monitor growth, and make data-driven business decisions.

Employee Attrition Rate Report

Visualization - Power BI

This report analyzes data from over 7,000 customers to uncover churn patterns. Key insights include a higher churn rate among monthly contract users and those using electronic checks. Demographics like senior citizens and non-partnered customers also show higher churn tendencies.

Demand Forecasting Report

Visualization - Power BI

This report analyzes ₹96M in sales across 692K products, revealing a 44% average discount and 60K discounted items. Covering 164 cities and 3 tiers, it leverages data on SKU, category, quantity, and pricing to support accurate demand planning.

NASA Space Missions Report

Visualization - Power BI

Spanning from 1957 to 2022, this dashboard showcases 89.89% mission success across 8 nations, with Russia and the USA leading in volume, France achieving the highest success rate (94.03%), and RVSN USSR dominating with 1,614 successful launches.

Swiggy Sales Report

Visualization - Power BI

Analyzing $2.67B in sales, this dashboard highlights top cuisines, best-performing cities like Hyderabad and Bangalore, leading shops like McDonald's and KFC, and rating-driven insights to guide strategic decisions.

Call Center Performance Report

Visualization - Power BI

This report analyzes PwC call center performance, focusing on call volumes, handling times, customer satisfaction, and agent efficiency. It helps identify peak hours, common issues, and process bottlenecks, enabling data-driven decisions to improve service quality and operational efficiency.

Credit Card Performance Report

Visualization - Power BI

Tracking over $55M in revenue, this dashboard highlights the dominance of Blue cards ($46M revenue), with Bills and Entertainment as top expense categories, and graduates and businessmen driving the highest spending segments.

Financial Performance Report

Visualization - Power BI

This project focused on visualizing key metrics such as revenue, expenses, and cash flow. Using DAX, drill-downs, and dynamic visuals, the report offers real-time insights and simplifies financial decision-making. It demonstrates my ability to turn complex data into clear, actionable visuals for better financial management.

Data Science Survey Report

Visualization - Power BI

Built an interactive Power BI dashboard to visualize global data career trends—covering salary, job roles, skills, work-life balance, and entry challenges—offering key insights into the data science landscape.

Product Sales Analysis

Visualization - Power BI

Our product is outperforming its targets with 4,233 transactions and $16.15K profit this month, while returns are slightly above goal at 122. The USA leads in revenue, and top brands like Tell Tale and Tri-State drive strong sales with healthy profit margins and low return rates.

Toy Sales Report

Visualization - Power BI

This report shows 68K total orders, $1.3M in revenue, and $378K profit, with Toys and Art & Crafts as the top-selling categories. Revenue trends upward over time, peaking at $90K, and store locations are tracked across Airport, Commercial, Downtown, and Residential areas.

Road Safety Report

Visualization - Power BI

Analyzing over 307K accidents with an average of 1.36 casualties per incident, this dashboard reveals that urban areas account for 61.23% of casualties, cars contribute the most (330K+), and Birmingham tops the list of high-risk districts, while slight injuries dominate at 84.1% of all cases.

HR Management Report

Visualization - Power BI

This report shows a workforce of 23K employees, of whom 16K are male and 7K are female. Most employees are full-time, and Sales & Marketing and Operations are the largest departments. Awards and ratings favor Analytics and Finance, while employee distribution and service periods highlight strong full-time presence across key departments.

Netflix Premier Sales Report

Visualization - Power BI

This report covers Netflix’s catalog of 7,967 premieres across 749 countries and 498 categories, showing that movies make up 71% of content. TV-MA and TV-14 are the most common ratings, top categories include Documentaries and International Dramas, and the United States leads in total titles, with a sharp rise in releases after 2010.

Nike Inventory Management Report

Visualization - Power BI

This report visually tracks Nike sneaker sales, profit, and inventory across regions and product lines, letting users filter by time, city, retailer, or sales method. It also highlights stock levels, helping identify overstock or shortages for better inventory management.

IBM Work Life Balance Report

Visualization - Power BI

This report shows a workforce that is 60% male, mostly aged 30–40, with Research & Development as the largest department. Sales Executives and Research Scientists are the top roles, most employees have backgrounds in Life Sciences or Medical fields, and nearly half are married. Most employees travel rarely for work.

Product Sales Analysis

Data Engineering - Fabric, Azure, Power BI, PySpark

Developed a real-time sales analytics solution using Microsoft Fabric, integrating data from Azure SQL DB, Excel, and CSV via Data Pipelines and Dataflows Gen2. Performed advanced transformations in PySpark notebooks, modeled data in Lakehouse and Warehouse layers, and delivered interactive Power BI dashboards with Direct Lake connectivity for revenue trends, market segments, and product performance—all within a scalable Azure environment.

Let's connect