Abstract:
Promotional pricing is a major marketing tool for most retailers. However, predicting sales under a discount can be difficult because other factors make demand uncertain or highly variable. The objective of this research is to identify the most suitable prediction model for unit sales of beauty products in retail and to capture the effects of the factors that influence sales. The dataset, provided by the case-study retail company, covers January 2020 to December 2022 (36 months). Prediction models, including linear regression, random forest, XGBoost, artificial neural networks (ANN), and hybrid models, were constructed and evaluated using the mean absolute percentage error (MAPE). To select the most appropriate model, a weighted MAPE was then calculated to compare overall performance. Each machine learning model was trained under several input configurations: using either all independent variables or only the significant factors selected by the stepwise method, and either including or excluding factors of exogenous products in the same cluster, where clusters were formed by category, by subcategory, or by the K-means method. The results show that the series hybrid model of random forest and XGBoost achieved the lowest weighted MAPE, 27.65%, which was 0.5% lower than that of the random forest model but came with a runtime around 5 times longer. Because this small gain in accuracy does not justify the added runtime, the random forest model is considered the most suitable. Among the factors affecting sales, the promotion period was the most important, followed by discount percentage and price.
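To make the evaluation metric concrete, the following is a minimal Python sketch of how a per-product MAPE and a weighted MAPE could be computed. The abstract does not state the weighting scheme, so weighting each product by its total actual unit sales is an assumption made here for illustration. The product names and sales figures are hypothetical and are not taken from the study's dataset.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Periods with zero actual sales are skipped, since the percentage
    error is undefined there (an assumption, not the paper's stated rule).
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


def weighted_mape(mapes, weights):
    """Weighted average of per-product MAPE values."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(m * w for m, w in zip(mapes, weights)) / total


# Hypothetical unit sales for two products over two periods.
actual_by_product = {"product_a": [100, 200], "product_b": [10, 20]}
predicted_by_product = {"product_a": [110, 180], "product_b": [15, 10]}

# MAPE per product.
per_product_mape = {
    name: mape(actual_by_product[name], predicted_by_product[name])
    for name in actual_by_product
}

# Assumed weights: each product's total actual unit sales.
names = list(actual_by_product)
weights = [sum(actual_by_product[name]) for name in names]

overall = weighted_mape([per_product_mape[name] for name in names], weights)
print(per_product_mape)
print(round(overall, 2))
```

Under this assumed weighting, high-volume products dominate the overall score, so a large percentage error on a slow-moving item has less influence on model selection than it would under a simple average.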