Abstract:
The COVID-19 pandemic has changed the lifestyles of people all over the world. Lockdowns forced people to stay at home for many months, which in turn changed purchasing behavior, for example through an increase in delivery orders. Using sales data for drinks sold during the pandemic, provided by a large beverage company, this research analyzes the changes in customer behavior during the pandemic by applying machine learning to cluster customers and observe the changes in purchases of each product type. We use clustering to group customers by purchase behavior and build prediction models that predict what customers will order from the purchase history of their customer group. We use K-means clustering, with the elbow method to choose K. We split the data into monthly sales and cluster each month separately, then cluster again on the data from each monthly cluster to find global clusters that can be compared directly across months. We then use the result to train three types of prediction model: LSTM, Random Forest Regression, and XGBoost. Finally, we compare the models trained on global cluster training data with the models trained on the customers' sales training data to see whether global cluster training data can compete with sales training data. We found that the models trained on global cluster training data performed similarly to those trained on sales training data but took much less time to train and run.
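
The two-stage clustering described above can be illustrated with a minimal sketch. It is one possible reading of the procedure, not the implementation used in this research. The assumptions are: each customer is represented per month by a hypothetical feature vector of purchase quantities per product type; the second stage clusters the monthly centroids; and the elbow is picked automatically as the K with the largest second difference of the inertia curve. The function names (`kmeans`, `elbow_k`, `global_clusters`) and the toy data are illustrative, and the code uses only the Python standard library so that it is self-contained.

```python
from math import dist


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points]


def kmeans(points, k, iters=50):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from its nearest existing center.
        centers.append(max(points,
                           key=lambda p: min(dist(p, c) for c in centers)))
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    labels = assign(points, centers)
    inertia = sum(dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
    return centers, labels, inertia


def elbow_k(points, k_max):
    """Assumed elbow heuristic: K with the largest second difference of inertia.

    Requires 3 <= k_max <= len(points).
    """
    inertias = [kmeans(points, k)[2] for k in range(1, k_max + 1)]
    bends = [inertias[i - 1] - 2 * inertias[i] + inertias[i + 1]
             for i in range(1, len(inertias) - 1)]
    return bends.index(max(bends)) + 2  # bends[0] corresponds to K = 2


def global_clusters(monthly, k_max=4):
    """monthly: {month: {customer_id: feature_tuple}} (hypothetical layout)."""
    centroids, owners, local = [], [], {}
    for month, customers in monthly.items():
        ids = list(customers)
        pts = [customers[c] for c in ids]
        centers, labels, _ = kmeans(pts, elbow_k(pts, k_max))
        for j, center in enumerate(centers):
            centroids.append(center)
            owners.append((month, j))
        local[month] = dict(zip(ids, labels))
    # Second stage: cluster the monthly centroids into global clusters.
    g_k = elbow_k(centroids, min(k_max, len(centroids)))
    _, g_labels, _ = kmeans(centroids, g_k)
    to_global = dict(zip(owners, g_labels))
    return {(month, cid): to_global[(month, lab)]
            for month, labs in local.items() for cid, lab in labs.items()}


# Toy data: (units of product type A, units of product type B) per customer.
monthly_sales = {
    "2020-03": {"a1": (10, 0), "a2": (11, 0), "a3": (10, 1),
                "b1": (0, 10), "b2": (0, 11), "b3": (1, 10)},
    "2020-04": {"a1": (12, 1), "a2": (13, 1), "a3": (12, 2),
                "b1": (1, 12), "b2": (1, 13), "b3": (2, 12)},
}
assignment = global_clusters(monthly_sales)
```

Here `assignment` maps each `(month, customer_id)` pair to a global cluster label, so that a customer's group can be compared across months even though each month was clustered independently. In practice, a library implementation such as scikit-learn's `KMeans` would likely replace the hand-written `kmeans` function.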