Abstract
VMware (VMW) is a listed software company headquartered in Palo Alto, California. It sells Software-Defined Data Center products that support multiple devices, apps, and clouds to create an enterprise-ready cloud infrastructure. The company caters exclusively to business customers, that is, it operates in a B2B environment. It has a 100% digital supply chain, which means that all products can be downloaded from its website (www.vmware.com). The company also promotes these products online. Individuals from companies worldwide visit the site to familiarize themselves with the products and their features before making a purchase decision. Along with the product overview and usage information, the site shows the VMW audience various customer-interaction triggers, or “digital assets”, such as hands-on learning, seminar/webinar registration, and downloads. Kiran R, the Director of the Data Science & Analytics team at VMW, wants to know the optimal order in which to push digital actions to customers to engage them effectively. Kiran’s team has a rich source of online and offline data available to model users’ responses to each of these digital assets. Kiran realized that the data is highly imbalanced and hence should be handled carefully. The team wishes to build a multinomial classification model for this purpose. Kiran decided that the model should fulfil the following objectives:
Learning Objective
This case may be used at advanced levels in MBA and executive MBA programs to demonstrate the use of machine learning techniques and predictive modeling on stream data and in digital marketing. It can be used to demonstrate the application of traditional classification techniques such as multinomial logistic regression and classification trees, as well as more contemporary and sophisticated techniques such as L2-regularized logistic regression, extreme gradient boosting, and ensemble methods. The case presents various ensemble methods, such as random forest (with undersampling), XGBoost, and a novel approach of stacking dissimilar multi-class classification models, and can be used to explain the superiority of the ensemble approach in mining highly imbalanced digital datasets. The model goes beyond conventional propensity-to-buy models and contributes a propensity-to-respond model, which can prove highly effective on stream data. In addition, the case is novel in its setting of B2B engagement data. These changes have significant managerial implications, which can be discussed in class.
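As a rough illustration of the undersampling idea mentioned above, the following minimal Python sketch randomly reduces every class to the size of the rarest one before a classifier is trained. It is not the procedure used in the case: the function name `undersample`, the `no_response` label, and the toy class counts are hypothetical and were chosen only to mimic a highly imbalanced multi-class response variable.

```python
import random
from collections import Counter, defaultdict


def undersample(rows, labels, seed=0):
    """Randomly undersample every class down to the size of the rarest class."""
    rng = random.Random(seed)

    # Group the feature rows by their class label.
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    # The rarest class sets the number of rows kept per class.
    target = min(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample without replacement so no row is duplicated.
        for row in rng.sample(members, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: most visitors do not respond to any digital asset.
labels = ["no_response"] * 90 + ["download"] * 7 + ["webinar"] * 3
rows = [[i] for i in range(len(labels))]

balanced_rows, balanced_labels = undersample(rows, labels)
print(Counter(balanced_labels))
```

In practice the balanced sample would be passed to a classifier such as a random forest. Undersampling discards majority-class rows, so it is usually repeated over several random draws or combined with an ensemble.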