From PHP Bug Fix to Customer Segmentation Data Project — My Freelance Story

Two years ago, I had the opportunity to work on a few side data projects, some paid and others not. I took them on to build my project portfolio, as I envisioned a career for myself in data broadly, not only in data science or machine learning. I guess my quest paid off, although not all of these projects were successful. In the rest of this post, I will share how I went from fixing a PHP bug on a production website for an e-commerce client to identifying a classic business analytics opportunity and finally formulating a business problem to be solved with data science.
On a moderately warm afternoon, I got a call from someone, Human A, who knew someone, Human B, who knew about my web development skills in PHP. Human A told me that Human C, founder of the e-commerce platform, had security issues with her platform: the website appeared to have been hacked. Human A asked if I would be able to help. I agreed and was immediately given access to the code base. I discussed the issue with Human C and estimated a recovery time of 5 hours. This was a standard PHP monolith built on one of the popular MVC frameworks, so it took some time to understand. Luckily, the bug was fixed and the site was back up in 30 minutes.
Prior to this gig, I hadn’t touched any PHP code for work purposes. I spent my work days with either Python and R for data analyses or Scala for Apache Spark data pipelines. And since the bug was fixed, one could assume my work was done, right? Certainly not. This forced me to ask myself: how else could I provide value to Human C’s business, considering my current skillset?
I mentioned to Human C that my forte was data analysis and that I could help unlock value for her business if I were given access to data. As proof of my expertise, I also shared with her some of the open data projects I had worked on. We agreed on a proof of concept (PoC) with customer transaction data to see what patterns we could find, and possibly some unknown unknowns. For me, it was important to define a hypothesis from the outset. However, Human C and I had no idea where to start, so we decided to begin with standard business analytics: Customer Segmentation, Cohort Analysis and Customer Lifetime Value. For the rest of this post, I will focus on the customer segmentation phase.
Customer Segmentation Analysis
According to Wikipedia:
Market segmentation is the activity of dividing a broad consumer or business market, normally consisting of existing and potential customers, into sub-groups of consumers (known as segments) based on some type of shared characteristics.
We were aware of shared characteristics amongst the customers, such as demographics (gender, age, location, etc.), but we wanted to uncover latent patterns, mainly behavioral ones. Clickstream data was not available, so we had to make the most of the available transaction data, which went as far back as 2015. The first step was to define customer segments for the financial year 2016 (FY2016) with the help of Human C’s business staff. We defined the customer segments based on spend patterns: number of purchases (frequency), last purchase (recency) and average amount spent per purchase.

Method: Segment customers into cohorts based on the recency of the customer’s last purchase, the frequency of the customer’s transactions and the average amount per purchase, to uncover hidden truths about spending patterns.
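To make the method concrete, here is a minimal sketch in plain Python of how recency, frequency and average amount could be computed per customer and turned into a segment label. It is not the code used on the project: the sample transactions, the function names `rfm_table` and `assign_segment`, and every cut-off (180 days, ₦10,000, 2 purchases) are assumptions made up for illustration. The real segment definitions were agreed with Human C’s business staff.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount in naira).
transactions = [
    ("c1", date(2016, 11, 2), 20000.0),
    ("c1", date(2016, 12, 15), 12000.0),
    ("c2", date(2016, 1, 10), 800.0),
    ("c3", date(2016, 12, 20), 1500.0),
]


def rfm_table(transactions, as_of):
    """Return {customer_id: (recency_days, frequency, avg_amount)}."""
    by_customer = defaultdict(list)
    for customer_id, purchased_on, amount in transactions:
        by_customer[customer_id].append((purchased_on, amount))

    table = {}
    for customer_id, purchases in by_customer.items():
        last_purchase = max(day for day, _ in purchases)
        recency_days = (as_of - last_purchase).days  # days since last purchase
        frequency = len(purchases)                   # number of purchases
        avg_amount = sum(amount for _, amount in purchases) / frequency
        table[customer_id] = (recency_days, frequency, avg_amount)
    return table


def assign_segment(recency_days, frequency, avg_amount):
    """Map RFM values to a label. All cut-offs are illustrative assumptions."""
    if recency_days > 180:
        return "cold"
    if frequency >= 2:
        return "warm high value" if avg_amount >= 10000 else "warm low value"
    return "new active high" if avg_amount >= 10000 else "new active low"


as_of = date(2016, 12, 31)  # end of the period being analysed
rfm = rfm_table(transactions, as_of)
segments = {cid: assign_segment(*values) for cid, values in rfm.items()}
print(segments)
```

With real data, the same grouping is usually done with a dataframe library; the plain-Python version is used here only to keep the sketch self-contained.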
Implementing the method described above, I also ran a year-on-year comparison of 2015 and 2016 to understand how customers transitioned between segments. What we found was amazing! See image below:

We observed that 99.96% of customers who were in the “New Warm” segment in 2015 (bought an item in 2015, spent an average of ₦16,000 and made at least 13 purchases) became “Cold” (spent an average of ₦800) in 2016. This is a strong indicator of low retention and high churn. Summary of findings below:
- We had 2367 customers in the “cold” cohort in 2016.
- The “new active high” cohort grew by 8.1% from 2015 to 2016. However, the “cold” cohort grew by a whopping 952%.
- Interestingly, the “new active low” cohort was the least profitable, with about ₦31,316.00 average revenue per customer, as against ₦350,000.00 for the “warm high value” cohort.
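A year-on-year comparison like the one behind these findings can be sketched as follows, assuming each customer already has a segment label for each year. The customer IDs, the labels in the two dictionaries and the function names are hypothetical, and treating a customer who is missing from the later year as “cold” is my assumption for this sketch, not a documented rule from the project.

```python
from collections import Counter

# Hypothetical segment labels per customer for two consecutive years.
segments_2015 = {"c1": "new warm", "c2": "new warm", "c3": "cold", "c4": "new active high"}
segments_2016 = {"c1": "cold", "c2": "cold", "c3": "cold", "c4": "new active high"}


def transition_shares(before, after, missing="cold"):
    """Return {(from_segment, to_segment): share of from_segment's customers}."""
    moves = Counter()
    totals = Counter()
    for customer_id, from_segment in before.items():
        # Assumption: a customer absent from the later year counts as `missing`.
        to_segment = after.get(customer_id, missing)
        moves[(from_segment, to_segment)] += 1
        totals[from_segment] += 1
    return {pair: count / totals[pair[0]] for pair, count in moves.items()}


def growth_pct(before, after, segment):
    """Percentage change in the size of one segment between two years.

    Raises ZeroDivisionError if the segment was empty in the earlier year.
    """
    n_before = sum(1 for label in before.values() if label == segment)
    n_after = sum(1 for label in after.values() if label == segment)
    return (n_after - n_before) / n_before * 100


shares = transition_shares(segments_2015, segments_2016)
cold_growth = growth_pct(segments_2015, segments_2016, "cold")
print(shares[("new warm", "cold")], cold_growth)
```

Each share is read as "of the customers who started in the first segment, this fraction ended up in the second", which is the form of the 99.96% figure above.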
To wrap up my analysis, I recommended designing targeted campaigns based on customer cohorts:
- Marketing campaigns featuring higher priced items may be sent to the high value cohorts (warm and new).
- The cold cohort is sent campaigns geared towards reviving its profitability.
What about the Data Science problem?
As I had other commitments, I decided not to go further on the project. However, there is a hypothesis to formulate:
Hypothesis: If we know the spend patterns of a customer based on her recency, frequency of purchase, average revenue and other data points, we should be able to predict when she is about to stop buying from us, and then offer discounts or retarget her via her favorite social media platforms so as to maintain her profitability.
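Since I did not take the project this far, what follows is only a sketch of how the hypothesis could be framed as a supervised learning problem: features come from one year, and the label records whether the customer bought anything in the following year. The sample purchases, the function name `churn_dataset` and the churn definition itself (no purchase in the next calendar year) are all assumptions. A classifier of your choice would then be trained on rows like these.

```python
from datetime import date

# Hypothetical purchases: (customer_id, purchase_date, amount in naira).
purchases = [
    ("c1", date(2015, 3, 1), 16000.0),
    ("c1", date(2015, 9, 12), 14000.0),
    ("c2", date(2015, 6, 5), 900.0),
    ("c2", date(2016, 2, 20), 1200.0),
]


def churn_dataset(purchases, feature_year):
    """Build one labelled row per customer who bought in feature_year.

    Assumed churn definition: churned = 1 when the customer made no
    purchase in the year after feature_year.
    """
    year_end = date(feature_year, 12, 31)
    history = {}      # purchases made during feature_year, per customer
    returned = set()  # customers who bought again the following year
    for customer_id, purchased_on, amount in purchases:
        if purchased_on.year == feature_year:
            history.setdefault(customer_id, []).append((purchased_on, amount))
        elif purchased_on.year == feature_year + 1:
            returned.add(customer_id)

    rows = []
    for customer_id, items in sorted(history.items()):
        rows.append({
            "customer_id": customer_id,
            "recency_days": (year_end - max(day for day, _ in items)).days,
            "frequency": len(items),
            "avg_amount": sum(amount for _, amount in items) / len(items),
            "churned": int(customer_id not in returned),
        })
    return rows


rows = churn_dataset(purchases, feature_year=2015)
for row in rows:
    print(row)
```

Keeping the feature window and the label window in separate years avoids leaking next year's behavior into the features.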
Conclusion
I hope you found this post interesting and that it aroused your curiosity in data. Data projects don’t necessarily have to start with fancy neural networks and deep learning frameworks. In my opinion, they should start with addressing specific business problems (unless you are into research or you have the financial muscle of the Googles, Microsofts and Facebooks).
Kindly share your thoughts and comments — looking forward to your feedback. You can reach me via email, follow me on Twitter or connect with me on LinkedIn. Can’t wait to hear from you!!