Introduction
CONTEXT
Pharma companies send various campaigns to their customers (e.g. prescribers) to increase customer engagement. They often struggle to determine which campaign to send to which customer, and which channel that customer prefers, given the time of day and other information.
VISION
Pitch the right campaign (content) to the right customer through the right channel.
ISSUE STATEMENT
Right now, pharma companies tend to send the wrong campaigns to customers through the wrong channels. This results in heavy brand/campaign onboarding and operational costs.
REQUIRED SOLUTION
Propose a machine-learning-enabled marketing solution that generates recommendations for customer targeting.
Proposed Solution
I developed a personalized recommendation system that recommends to a pharma company the most suitable content to pitch to a customer and the most suitable channel for pitching it, based on the customer's past behavioral attributes and taking various KPIs into consideration. I addressed the problem with a Contextual Bandit algorithm.
Requirement Analysis
Functional requirements
- Given the context of a customer (customer type, total sales, etc.), the system must recommend the most suitable content to pitch to that customer and the most suitable channel for pitching it.
- The system must not raise an error if any context attribute is missing. It must make a recommendation using the available context attributes.
Non-Functional requirements
For user (pharma company)
- The system must be easy to use.
- The system must take at most 1–2 minutes to execute.
For developer (us)
- The system must be compatible with any environment (e.g. Linux and Windows).
- The system must pass various test cases before deployment.
- The system must be portable, i.e. the same software must be usable in different environments.
Architecture
Data Collection
We used 4 datasets in the project.
- Activity Data: analysis and activities of medical representatives (MRs) after interacting with customers, such as feedback, customer involvement, and explanations of drugs.
- Demographics Data: customer attributes such as specialty and gender, plus location details like city, state, and region.
- Multi-Channel Marketing (MCM) Data: the channel and the engagement level through that channel, along with other marketing information such as offer codes.
- Sales Data: customer-level data showing how many units were sold, the number of sales, etc.
The data was collected from various stakeholders in a supply chain. The sales data comes from retailers and non-retailers, the marketing data comes from a vendor that compiles all marketing data, and the activity data is recorded by the medical representatives when they visit or call a healthcare professional. The dataset had over 4 lakh (400,000) rows of activity and multi-channel marketing data.
Preprocessing
1. Drop columns with more than 50% missing values.
For example, the demographics data had many NaN values in the gender column, as well as inconsistent customer IDs, which had to be removed.
Similarly, many columns in the MCM data had more than 50% missing values. For instance, the column non_target_flag had around 72% missing values, so it was dropped along with the other columns above the 50% threshold.
2. Handle outliers by taking the log of the values (rescaling).
3. Drop all duplicate rows.
4. Fill missing values in the campaign description with the help of topic modeling. Topic modeling is a statistical method for discovering the high-level abstract topics that occur in a collection of documents. We used Latent Dirichlet Allocation (LDA) for this purpose.
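Steps 1 to 3 above can be sketched in plain Python as follows. This is a minimal illustration, not the project's actual code: the rows are represented as a list of dictionaries, the column names and sample values are made up, and `log1p` (log of 1 + x) is an assumption used here so that zero values stay defined.

```python
import math

def drop_sparse_columns(rows, max_missing=0.5):
    """Drop columns whose share of missing (None) values exceeds max_missing."""
    columns = {key for row in rows for key in row}
    keep = [
        col for col in sorted(columns)
        if sum(row.get(col) is None for row in rows) / len(rows) <= max_missing
    ]
    return [{col: row.get(col) for col in keep} for row in rows]

def log_rescale(rows, column):
    """Replace a non-negative numeric column with log(1 + x) to dampen outliers."""
    return [
        {**row, column: math.log1p(row[column]) if row[column] is not None else None}
        for row in rows
    ]

def drop_duplicates(rows):
    """Keep the first occurrence of each identical row, preserving order."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

# Hypothetical MCM-like sample: non_target_flag is missing in 3 of 4 rows.
mcm_rows = [
    {"customer_id": "C1", "units": 10, "non_target_flag": None},
    {"customer_id": "C2", "units": 1000, "non_target_flag": None},
    {"customer_id": "C2", "units": 1000, "non_target_flag": None},
    {"customer_id": "C3", "units": 0, "non_target_flag": "Y"},
]
cleaned = drop_duplicates(log_rescale(drop_sparse_columns(mcm_rows), "units"))
```

In this sample, `non_target_flag` is dropped because 75% of its values are missing, and the repeated `C2` row is removed.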
Feature Selection & Engineering
We selected only those features from the datasets that could be useful for building the recommendation system, i.e. features that define the state/context, action, and reward.
The feature selected for the reward is the customer engagement level. The agent had 15 choices of action, covering all possible combinations of channel and campaign code. Other columns, such as the timestamp, customer attributes, type of drug, branding of the drug, number of cumulative sales at every timestamp, and number of cumulative call activities by medical representatives, were used as context. The date timestamp column was not useful in its raw YYYY-MM-DD form.
We therefore engineered 3 features from it: Date, Month and Year.
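As a rough sketch of this step, a YYYY-MM-DD string can be split into the three features with the standard library. The column name `timestamp` and the helper `add_date_parts` are illustrative assumptions, not names from the project.

```python
from datetime import datetime

def add_date_parts(row, column="timestamp"):
    """Return a copy of row with Date, Month and Year derived from a YYYY-MM-DD string."""
    parsed = datetime.strptime(row[column], "%Y-%m-%d")
    return {**row, "Date": parsed.day, "Month": parsed.month, "Year": parsed.year}

# Hypothetical example row.
example = add_date_parts({"customer_id": "C1", "timestamp": "2020-04-17"})
```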
Merging
The 4 datasets could not be fed individually into the model, so we prepared a base dataset by joining them. From the sales and activity data, cumulative sales and cumulative calls were calculated and added to the respective tables. The MCM data was joined with the sales and activity data on [customer_id, timestamp], while the demographics data was joined on [customer_id] only. Once all 4 tables were joined, the complete data was sorted by year and month, taken from the MCM data timestamp.
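The merging logic can be sketched as below, assuming each table is a list of dictionaries. The join type is not stated above, so a left join from the MCM table is an assumption, and the sample columns (`units`, `channel`, `specialty`) are hypothetical.

```python
from collections import defaultdict
from itertools import accumulate

def add_cumulative(rows, value_col, out_col):
    """Add a per-customer running total of value_col, ordered by timestamp."""
    by_customer = defaultdict(list)
    # ISO YYYY-MM-DD strings sort chronologically, so a plain sort is enough.
    for row in sorted(rows, key=lambda r: (r["customer_id"], r["timestamp"])):
        by_customer[row["customer_id"]].append(row)
    result = []
    for customer_rows in by_customer.values():
        totals = accumulate(r[value_col] for r in customer_rows)
        result.extend({**r, out_col: t} for r, t in zip(customer_rows, totals))
    return result

def left_join(left, right, keys):
    """Left-join two lists of dicts on the given key columns."""
    index = {tuple(r[k] for k in keys): r for r in right}
    return [{**row, **index.get(tuple(row[k] for k in keys), {})} for row in left]

# Hypothetical sample tables.
sales = [
    {"customer_id": "C1", "timestamp": "2020-01-05", "units": 3},
    {"customer_id": "C1", "timestamp": "2020-02-10", "units": 4},
]
mcm = [{"customer_id": "C1", "timestamp": "2020-02-10", "channel": "email"}]
demographics = [{"customer_id": "C1", "specialty": "cardiology"}]

sales = add_cumulative(sales, "units", "cumulative_sales")
base = left_join(mcm, sales, ["customer_id", "timestamp"])
base = left_join(base, demographics, ["customer_id"])
```

The same `add_cumulative` helper would apply to call activities; the final sort by year and month is omitted here for brevity.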
Splitting
A bucket test, in which the algorithm serves a subset of live user traffic in the real recommendation system, is the ideal way to evaluate a contextual bandit algorithm. However, this technique is expensive, requiring considerable engineering effort to deploy in the real world, and it can also hurt user experience. Offline evaluation of contextual bandit algorithms is therefore useful when refining an online recommendation system.
For offline evaluation, we divided the dataset into training and testing sets using the steps below:
1. Sort dataset on Year, Month, Date (in the same order).
2. Split the resultant dataset into:
80% training: from December 2019 to March 2020
20% testing: April 2020 to June 2020
3. Keep only those rows of the 20% testing set that have high customer engagement. We assume the campaign code and channel code in these rows are the most suitable values, and therefore treat them as the relevant choices for evaluation.
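The three steps above could look roughly like this. The `engagement` column name and the 0.5 threshold for "high" engagement are assumptions made for illustration; the date boundary follows the ranges listed in step 2.

```python
def split_by_month(rows, first_test_month=(2020, 4)):
    """Sort chronologically, then split at the first (year, month) of the test period."""
    ordered = sorted(rows, key=lambda r: (r["Year"], r["Month"], r["Date"]))
    train = [r for r in ordered if (r["Year"], r["Month"]) < first_test_month]
    test = [r for r in ordered if (r["Year"], r["Month"]) >= first_test_month]
    return train, test

def keep_high_engagement(rows, threshold):
    """Keep only rows whose engagement level reaches the (assumed) threshold."""
    return [r for r in rows if r["engagement"] >= threshold]

# Hypothetical sample rows.
rows = [
    {"Year": 2020, "Month": 5, "Date": 2, "engagement": 0.9},
    {"Year": 2019, "Month": 12, "Date": 15, "engagement": 0.4},
    {"Year": 2020, "Month": 4, "Date": 20, "engagement": 0.2},
    {"Year": 2020, "Month": 3, "Date": 1, "engagement": 0.8},
]
train, test = split_by_month(rows)
eval_rows = keep_high_engagement(test, threshold=0.5)
```

Splitting on time rather than at random keeps future interactions out of the training set, which matters when the data is logged sequentially.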
Model Development
We simulated a content personalization scenario using contextual bandits to choose between actions in a given context. The goal of the simulator was to maximize the reward or, equivalently, minimize the loss (the negative of the reward).
- Action: campaign code + channel code
- Context (15 features): customer-level demographics (age, gender, etc.), date
- Reward: customer engagement level + disposition code
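To make the action/context/reward loop concrete, here is a minimal epsilon-greedy contextual bandit. It is a sketch under stated assumptions, not the model used in the project: it keeps a simple table of average rewards per (context, action) pair, the split of the 15 actions into 3 channels and 5 campaigns is hypothetical, and the reward comes from a made-up `preferred` lookup rather than real engagement data.

```python
import random
from collections import defaultdict
from itertools import product

class EpsilonGreedyBandit:
    """Tabular epsilon-greedy bandit: one running mean reward per (context, action)."""

    def __init__(self, actions, epsilon=0.2, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)
        self.values = defaultdict(float)

    def best_action(self, context):
        """Greedy choice: the action with the highest estimated reward."""
        return max(self.actions, key=lambda a: self.values[(context, a)])

    def choose(self, context):
        """Explore with probability epsilon, otherwise exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.best_action(context)

    def update(self, context, action, reward):
        """Incrementally update the mean reward of the chosen action."""
        key = (context, action)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]

# Hypothetical action space: 3 channels x 5 campaigns = 15 actions.
channels = ["email", "call", "web"]
campaigns = ["CMP1", "CMP2", "CMP3", "CMP4", "CMP5"]
actions = list(product(channels, campaigns))

# Made-up ground truth used only to simulate rewards.
preferred = {
    ("cardiologist",): ("email", "CMP3"),
    ("oncologist",): ("call", "CMP1"),
}
contexts = list(preferred)

bandit = EpsilonGreedyBandit(actions, epsilon=0.2, seed=0)
for step in range(2000):
    context = contexts[step % len(contexts)]
    action = bandit.choose(context)
    reward = 1.0 if action == preferred[context] else 0.0
    bandit.update(context, action, reward)
```

After training, `bandit.best_action(context)` gives the recommended (channel, campaign) pair for a context. A real system would use a richer context and a learned reward model instead of a lookup table.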
Model Evaluation
Training
We trained the model on different combinations of epsilon and number of iterations.
In its best setting (Epsilon = 0.2, Iterations = 2000), the agent achieves a customer engagement of 0.72.
Testing
Once the model was trained on the 80% training portion of the base dataset, we tested it on the selected rows of the remaining 20% and achieved a mean accuracy of 0.81.
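How the accuracy was computed is not spelled out here. One plausible reading, sketched below as an assumption, is the fraction of retained test rows where the recommended action matches the logged campaign and channel. The `recommend` callable and the sample rows are hypothetical.

```python
def mean_accuracy(recommend, test_rows):
    """Share of test rows where the recommended action equals the logged action."""
    hits = sum(recommend(row["context"]) == row["action"] for row in test_rows)
    return hits / len(test_rows)

# Hypothetical trained policy, represented as a simple lookup.
policy = {
    ("cardiologist",): ("email", "CMP3"),
    ("oncologist",): ("call", "CMP1"),
}

# Hypothetical high-engagement test rows with their logged actions.
test_rows = [
    {"context": ("cardiologist",), "action": ("email", "CMP3")},
    {"context": ("oncologist",), "action": ("call", "CMP1")},
    {"context": ("oncologist",), "action": ("web", "CMP2")},
]
accuracy = mean_accuracy(policy.get, test_rows)
```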
I would like to thank D Cube Analytics for this wonderful opportunity!!