Retainly: Enterprise Churn Analytics Platform

A comprehensive web application for customer churn prediction combining classification models and survival analysis with personalized retention strategies

Timeline: 2 Months

Status: Completed


Python, Streamlit, Machine Learning, Survival Analysis, Pandas, Scikit-learn

Project Overview

Retainly is a full-stack web application that lets businesses upload customer data as CSV files and receive churn analysis from two complementary approaches: a classification model and a survival model. The platform preprocesses each upload, applies pre-trained models, and presents the results in four interactive dashboard tabs.

Innovation: A unified analysis that combines traditional classification with survival analysis, giving each customer a churn probability, a time-to-churn estimate, and a tailored retention strategy.

Key Features

Automated data preprocessing and feature engineering
Parallel processing with Logistic Regression and Survival models
Four interactive dashboard tabs: Overview, Survival Analysis, Classification, and Customer Insights
Personalized Retention Strategies for each customer
Real-time prediction and visualization
Exportable reports and insights
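
To make the preprocessing feature above more concrete, here is a minimal sketch of what an automated step could look like, using only the standard library. The column names (`tenure`, `monthly_charges`, `contract`), the category levels, and the helper `preprocess_rows` are hypothetical; the page does not describe Retainly's actual feature set or imputation rules.

```python
import csv
import io

# Hypothetical schema, for illustration only
NUMERIC_COLUMNS = ["tenure", "monthly_charges"]
CATEGORICAL_COLUMNS = {"contract": ["monthly", "yearly"]}


def preprocess_rows(csv_text):
    """Parse CSV text and return one dict of numeric features per customer."""
    reader = csv.DictReader(io.StringIO(csv_text))
    required = NUMERIC_COLUMNS + list(CATEGORICAL_COLUMNS)
    missing = [c for c in required if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")

    rows = []
    for raw in reader:
        features = {}
        # Coerce numeric columns; blank cells fall back to 0.0 (a simplification)
        for col in NUMERIC_COLUMNS:
            value = raw[col].strip()
            features[col] = float(value) if value else 0.0
        # One-hot encode each categorical column against its known levels
        for col, levels in CATEGORICAL_COLUMNS.items():
            observed = raw[col].strip().lower()
            for level in levels:
                features[f"{col}_{level}"] = 1.0 if observed == level else 0.0
        rows.append(features)
    return rows
```

Validating the schema before scoring means a malformed upload fails with a readable message instead of an error deep inside a model call.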

Technical Architecture

Streamlit-based web application with separate model pipelines for classification and survival analysis:

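import numpy as np
import pandas as pd

# preprocess_features, logistic_model, and survival_model are defined and
# loaded elsewhere in the application.

# Data Processing Pipeline
def process_uploaded_data(uploaded_file):
  # Read the uploaded CSV into a DataFrame
  df = pd.read_csv(uploaded_file)
  # Automated preprocessing
  df_processed = preprocess_features(df)
  return df_processed

# Multi-Model Prediction
def generate_predictions(processed_data):
  # Classification predictions: predict_proba returns one column per class,
  # so keep column 1 (assumed here to be the churn class)
  lr_proba = logistic_model.predict_proba(processed_data)[:, 1]

  # Survival analysis predictions
  survival_curves = survival_model.predict_survival_function(processed_data)
  time_to_churn = survival_model.predict_median(processed_data)

  return {
    'classification_proba': lr_proba,
    'survival_curves': survival_curves,
    'time_to_churn': time_to_churn
  }

# Retention Strategy Engine
def generate_retention_strategies(customer_data, predictions):
  # Flatten to NumPy arrays so customers can be indexed by position
  churn_proba = np.asarray(predictions['classification_proba']).ravel()
  time_to_churn = np.asarray(predictions['time_to_churn']).ravel()

  strategies = []
  # Iterate over row positions; iterating a DataFrame directly yields column names
  for i in range(len(customer_data)):
    if churn_proba[i] > 0.7:
      # High risk: urgency depends on the predicted time-to-churn
      # (expressed in the survival model's time unit)
      if time_to_churn[i] < 30:
        strategies.append("Immediate intervention: Personalized offer")
      else:
        strategies.append("Proactive engagement: Loyalty program")
    else:
      strategies.append("Maintenance: Regular communication")
  return strategies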

Results & Impact

Accuracy: 87%

AUC Score: 94%

Key Results

Achieved 87% prediction accuracy on test data
AUC of 0.94 (94%), indicating strong separation between churners and non-churners
Identified top 5 features driving customer churn
Enabled proactive customer retention strategies
Reduced customer churn by 25% in pilot implementation
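
For readers who want to see how accuracy and AUC figures like those above are typically computed, the sketch below shows both metrics in plain Python. It is an illustration under stated assumptions, not the project's evaluation code: the 0.5 decision threshold and the function names are hypothetical, and the sample labels in the usage note are made up.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of customers whose thresholded prediction matches the label."""
    correct = sum((p >= threshold) == bool(y) for y, p in zip(y_true, y_prob))
    return correct / len(y_true)


def roc_auc(y_true, y_prob):
    """AUC as the chance that a random churner outscores a random non-churner.

    Tied scores count as half a win.
    """
    churners = [p for y, p in zip(y_true, y_prob) if y == 1]
    retained = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not churners or not retained:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(
        1.0 if c > r else 0.5 if c == r else 0.0
        for c in churners
        for r in retained
    )
    return wins / (len(churners) * len(retained))
```

With labels `[1, 0, 1, 0]` and scores `[0.9, 0.2, 0.4, 0.6]`, `accuracy` returns 0.5 and `roc_auc` returns 0.75. Accuracy depends on the chosen threshold, while AUC only depends on how the scores rank customers, which is why the two numbers can differ noticeably.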

Dashboard Analytics

Overview Tab

Executive summary displaying overall churn risk, high-risk customer count, average time-to-churn, and key performance indicators across the entire dataset.
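
The headline figures on this tab could be derived from the per-customer predictions roughly as follows. This is a sketch only: `summarize_portfolio` and its output keys are hypothetical names, the 0.7 cutoff mirrors the threshold used in the retention strategy code, and infinite time-to-churn values are excluded from the average on the assumption that the survival model reports infinity when a customer's curve never drops to 50%.

```python
from statistics import mean


def summarize_portfolio(churn_proba, time_to_churn, high_risk_threshold=0.7):
    """Aggregate per-customer predictions into overview metrics."""
    # Keep only customers with a finite predicted time-to-churn
    finite_times = [t for t in time_to_churn if t != float("inf")]
    return {
        "customers": len(churn_proba),
        "mean_churn_probability": mean(churn_proba),
        "high_risk_count": sum(p > high_risk_threshold for p in churn_proba),
        "mean_time_to_churn": mean(finite_times) if finite_times else None,
    }
```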

Detailed Survival Analysis Tab

Interactive survival curves, hazard functions, median survival times, and cohort analysis showing how different customer segments behave over time.
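
As background for the survival curves and median survival times shown on this tab, here is a small Kaplan-Meier estimator in plain Python. The page does not say which estimator or library Retainly uses, so treat this as a generic illustration; `kaplan_meier` and `median_survival_time` are hypothetical names.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each time with a churn event.

    durations: observed time per customer
    events: 1 if the customer churned at that time, 0 if censored
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        churned = 0
        removed = 0
        # Group all customers who share the same observed time
        while i < len(data) and data[i][0] == t:
            churned += data[i][1]
            removed += 1
            i += 1
        if churned:
            survival *= 1 - churned / at_risk
            curve.append((t, survival))
        # Churned and censored customers both leave the risk set
        at_risk -= removed
    return curve


def median_survival_time(curve):
    """First time at which survival drops to 0.5 or below, else infinity."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return float("inf")
```

Running the estimator separately on each customer segment gives the per-cohort curves that a cohort comparison would plot.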

Detailed Classification Tab

Model performance metrics, feature importance plots, confusion matrices, and probability distribution analysis for the classification models.
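
The confusion matrix and the metrics derived from it can be computed as in the sketch below. The 0.5 threshold and the function names are assumptions made for illustration; the page does not state the decision threshold used for the reported metrics.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count true/false positives and negatives at a given threshold."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for y, p in zip(y_true, y_prob):
        predicted_churn = p >= threshold
        if predicted_churn and y == 1:
            counts["tp"] += 1
        elif predicted_churn and y == 0:
            counts["fp"] += 1
        elif not predicted_churn and y == 1:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def precision_recall(counts):
    """Precision and recall for the churn class; 0.0 when undefined."""
    flagged = counts["tp"] + counts["fp"]
    actual = counts["tp"] + counts["fn"]
    precision = counts["tp"] / flagged if flagged else 0.0
    recall = counts["tp"] / actual if actual else 0.0
    return precision, recall
```

For churn work, recall on the churn class is often watched more closely than overall accuracy, since a missed churner usually costs more than an unnecessary retention offer.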

Customer Insights Tab

Individual customer profiles with churn probability scores, expected time-to-churn, key risk factors, and personalized retention strategies.
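
One common way to surface key risk factors for a single customer, assuming a logistic regression over standardized features, is to rank each feature by its contribution to the log-odds (coefficient times feature value). The page does not say how Retainly derives risk factors, so the sketch below is an assumption, and the feature names in the usage note are hypothetical.

```python
import math


def churn_profile(features, coefficients, intercept, top_n=3):
    """Build a per-customer profile from a linear (logistic) model.

    features: standardized feature values for one customer
    coefficients: fitted weight per feature name
    """
    # Each feature's additive contribution to the log-odds of churn
    contributions = {
        name: coefficients[name] * value for name, value in features.items()
    }
    log_odds = intercept + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    # Risk factors are the features pushing the log-odds upward the most
    increasing = [item for item in contributions.items() if item[1] > 0]
    increasing.sort(key=lambda item: item[1], reverse=True)
    return {
        "churn_probability": probability,
        "risk_factors": [name for name, _ in increasing[:top_n]],
    }
```

For example, a customer with features `{"support_tickets": 2.0, "tenure": -1.0}`, coefficients `{"support_tickets": 0.8, "tenure": -0.6}`, and intercept `-1.0` would list `support_tickets` first, followed by `tenure`, because short tenure combined with a negative coefficient also raises the log-odds.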
