This phase of the project focused on executing the actual analysis. While I had already done significant prep work in earlier stages, one key lesson became clear: you’re never really done transforming data. As I dug deeper, I found myself creating new columns, recalculating measures, and rethinking earlier steps.
The process wasn’t without its challenges – frequent laptop crashes meant I had to redo parts of the work multiple times. At some point, I had to restart the entire process, trimming the dataset from five years to just sixteen months and consolidating the tables to reduce strain on my machine.
I also learned that even the most structured plans will shift as you become more familiar with the data. Each new insight led to new questions, and that ongoing curiosity ultimately shaped the final approach.
Data Cleaning & Preparation (Project Restart)
When restarting the project, I consolidated and prepped the data with the following steps:
1. Initial Merges & Column Selection
- Merged fact_subscription with
fact_sales; retainedSubscriptionPlanIDandBillingCycleId - Joined
dim_customerfor CustomerNameandSignUpDate - Added
BillingCycleinfo from the billing cycle table (kept Label asBillingCycle,DurationMonths) - Merged with
SubscriptionPlanto getPlanNameandMonthlyPrice - Copied the final table as values into a new worksheet, cleaned the columns, and renamed them
Final columns:
SubscriptionID, Subscription Plan, Billing Cycle, DurationMonths, CustomerName, SignUpDate, TransactionDate, Payment Status, PaymentMethod, Amount Charged, Monthly Price.
Saved as a new file: Sales.xlsx
2. Data Import & Type Checks
- Imported the file into Microsoft Excel as
fact_sales - Verified and corrected data types
3. Fixing SignUpDate
I noticed that some SignUpDate values were later than the TransactionDate, which isn’t valid.
Fix: I recalculated SignUpDate per customer as the earliest transaction date.
- Removed the original
SignUpDate - Duplicated
fact_sales-> grouped by CustomerName->minofTransactionDateas the newSignUpDate - Merged this back into
fact_sales
4. Created New Columns
| Column Name | Formula |
| ExpectedRevenue | =[DurationMonths]*[MonthlyPrice] |
| Actual Revenue | =[DurationMonths]*[AmountCharged] |
| CoverageEndDate | =Date.AddMonths([TransactionDate], [DurationMonths]) |
| Customer Status | =let |
5. Date Table
- Imported a
Datetable and created:Year,Month,Quarter - Related it to
fact_sales[TransactionDate]
6. Created Some Measures
Created a blank query called “Measures” for storing measures, then created the following;
| Measure | Description | Formula |
| Failed Payments | Number of failed payments | =CALCULATE(COUNT(fact_sales[PaymentStatus]), fact_sales[PaymentStatus]="Failed") |
| Failed Payments Loss | Revenue lost due to failed payments | =CALCULATE(SUM(fact_sales[ExpectedRevenue]), fact_sales[PaymentStatus]="Failed") |
| Refunds | Number of refunds | =CALCULATE(COUNT(fact_sales[PaymentStatus]), fact_sales[PaymentStatus]="Refunded") |
| Refunds Loss | Revenue lost due to refunds | =CALCULATE(SUM(fact_sales[ExpectedRevenue]), fact_sales[PaymentStatus]="Refunded") |
| Free Trials | Number of free trials | =CALCULATE(COUNT(fact_sales[ExpectedRevenue]), fact_sales[PaymentStatus]="Success", fact_sales[ActualRevenue]=0) |
| Free Trial Loss | Revenue lost due to free trials ($0 value transactions) | =CALCULATE(SUM(fact_sales[ExpectedRevenue]), fact_sales[PaymentStatus]="Success", fact_sales[ActualRevenue]=0) |
| Active Customers | Distinct count of active customers at any point in time | =CALCULATE(DISTINCTCOUNT(fact_sales[CustomerName]), fact_sales[Customer Status]="Active") |
| Trial Customers | Numbers of customers on free trial | =CALCULATE(DISTINCTCOUNT(fact_sales[CustomerName]), fact_sales[Customer Status]="Trial") |
| Churned Customers | Number of churned customers at any point in time | =CALCULATE(DISTINCTCOUNT(fact_sales[CustomerName]), fact_sales[Customer Status]="Churned") |
| Total Customers | Number of customers | =DISTINCTCOUNT(fact_sales[CustomerName]) |
| Churn Rate | Churn rate | =DIVIDE([Churned Customers], [Total Customers], 0) |
| Average Revenue per User (ARPU) | Average revenue each customer has paid/contributed | =DIVIDE([Sum of ActualRevenue], [Active Customers], 0) |
| Average Tenure | Average no. of months a customer stayed subscribed before churning (or up to current date if still active) | =AVERAGEX(VALUES(fact_sales[CustomerName]),DATEDIFF(CALCULATE(MIN(fact_sales[SignUpDate])),CALCULATE(MAX(fact_sales[CoverageEndDate])), MONTH)) |
| ChurnRevenueLoss | Estimated revenue lost when a customer churns | =CALCULATE([Sum of ExpectedRevenue]-[Sum of ActualRevenue], fact_sales[Customer Status]="Churned") |
Task 1 Process: Patterns and Plan-Level Differences in Customer Churn
This task focused on exploring churn behavior across different plans, billing cycles, and time periods. The goal was to create dynamic visuals that allow for flexible slicing while identifying top-level churn patterns.
Churn by Subscription Plan
I began by creating a pivot table that displayed churn rates across different subscription plans. To make it easier to identify the most affected plans, I created a second helper table (a copy of the first, linked via =) and added a Highlight column.
This column dynamically pulls in the top three churn rates using a combination of formulas.
I first used the LARGE function to identify the top 3 churn rates from the original pivot table. Then, I applied a nested IFERROR/IF logic to populate the Highlight column only if the churn rate matched one of the top three values. This made it easier to call attention to the highest churn values in a visual format.
I also created a third helper table to extract just the top churned plan and its rate. This made it possible to label key visuals dynamically, depending on the selected time frame (using slicers like Year, Month, and Quarter). These slicers allowed me to explore how churn patterns shift over time.
A clustered column chart was used to visualize churn rate by subscription plan. In addition to churn rate, I also analyzed churn volume—the actual number of customers lost per plan—using similar tables and visuals.
Churn by Billing Cycle
I repeated a similar process to explore churn patterns across different billing cycles (e.g., monthly vs. annual). Again, both churn rate and churn volume were considered, using helper tables to isolate and highlight the top values.
Churn by Month
For monthly churn, I created a separate pivot table and disconnected the Month and Quarter slicers to ensure that selections didn’t conflict.
I used a combo chart for the final visual, where the bar chart displayed churn rate and a line graph showed churned customers over time.
A highlight column identified the top 3 churn months, drawing attention to seasonal or time-based churn spikes.
Churn by Quarter
I used another pivot table to analyze quarterly churn trends. This one was connected to the Year slicer and helped reveal larger, more stable trends compared to monthly data.
Churn Trends by Subscription Plan + Billing Cycle
To see how plan and billing cycle interacted, I built a pivot table that broke down churn rates and volumes across both variables. I then flattened this data into a structured table, where each row contained a subscription plan, billing cycle, churn rate, and churned customer count.
From this table, I created two focused helper tables:
- One to rank the top 3 highest churn rates by plan and billing cycle combo
- Another to rank the top 3 highest churn volumes using the same combination
These were used to drive visual highlights and callouts in the report.
Task 2 Process: When Do Customers Churn, and How Much Revenue Is Lost?
The goal of this task is to determine how long customers typically stay before they churn and quantify the revenue lost from churned customers.
Step 1: Calculate Average Tenure for Churned Customers
Created a measure to calculate average tenure for churned customers only by filtering on [Customer Status] = "Churned".
This helped isolate how long, on average, a customer stays before canceling.
Step 2: Create Tenure Buckets (via Power Query)
Using DAX for buckets proved inefficient, so I switched to Power Query:
- Duplicated the sales table
Grouped by CustomerName and computed:
- Min Date = first transaction date
- Max Date = most recent transaction date
2. Created TenureDuration column
= (Date.Year([MaxDate]) - Date.Year([MinDate])) * 12 + (Date.Month([MaxDate]) - Date.Month([MinDate]))
Note: The above formula calculates the total number of months between two dates by first computing the difference in years and converting it to months, then adding the difference in months. It ensures accurate month counts without assuming each month has a fixed number of days.
3. Merged the result back with the original sales table using CustomerName
4. Added Conditional Column to bucket the tenure into 3-month intervals:
0-3 Months, 4-6 Months, 7-9 Months,…up to 25+ Months. This resulted in 9 tenure buckets.
Step 3: Calculate Revenue Loss from Churn
Created a DAX measure to quantify churn-related revenue loss. The formula for this available in the Measures table.
Step 4: Analyze Churn by Tenure Bucket
- Built a pivot table with Tenure (bucket) as rows and Churned Customers (measure) as values.
- Created 2 helper tables to identify the top 3 tenure buckets with the highest churn volume.
- Used these tables to drive dynamic visuals (top 3 highlights).
Step 5: Analyze Revenue Lost by Tenure Bucket
Applied a similar approach:
- Pivoted on Tenure
- Values = ChurnRevenueLoss
- Created helper tables to isolate and highlight top tenure buckets by revenue loss
Task 3: How Much Revenue Is Lost Due to Failed or Refunded Payments?
To quantify the impact of failed payments, refunds, and free trials on revenue:
Created a Pivot Table for KPI calculation
Included the following measures from the data model:
ExpectedRevenueFailedPaymentsLossRefundsLossFreeTrialLossActualRevenueRevenueGap
Created a KPI Summary Table from this pivot
For each loss type, I calculated:
- Total amount lost
- % of
RevenueGap - % of
ExpectedRevenue
Built a second Pivot Table summarizing RevenueGap by month (January to December).
From this second pivot, I;
- Extracted the top 3 months with the highest revenue gap using a helper table +
LARGE()andIFformulas - Used this to support a dynamic clustered column chart that automatically highlights those top 3 months
Report and Dashboard Build
As I worked through the analysis, I also built the accompanying Excel dashboard and report, shaping them in real time as new insights and challenges emerged. The final interactive version will be available on the site soon.

