Ensemble-based method of fraud detection at self-checkouts in retail

Abstract

The authors consider the problem of fraud detection at retail self-checkouts under the condition of an imbalanced data set. A new ensemble-based method is proposed to solve it effectively. The method involves two main steps: preprocessing and the Random Forest algorithm. The preprocessing stage applies the following procedures to the input data in sequence: scaling each column by its maximal element together with row-wise scaling by the Euclidean norm, weighting by correlation, and polynomial extension. The polynomial extension uses the Ito decomposition of the second degree. The method was simulated on real data, and performance was evaluated using a cost matrix. An experimental comparison of the developed ensemble-based method with a number of existing methods (both single models and ensembles) shows that the developed method performs best. Experiments that vary the Random Forest parameters, both for the basic algorithm and for the developed method, show a significant improvement in the investigated efficiency measures for the latter. This improvement results from applying all steps of the preprocessing stage of the developed method.
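The preprocessing stage named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not define the correlation weighting or the exact form of the polynomial extension, so the sketch assumes that each feature is multiplied by the absolute Pearson correlation between that feature and the class label, and that the second-degree extension appends all pairwise products x_i * x_j with i <= j to the original features. The function names `fit`, `transform`, `normalize_rows`, `correlation_weights`, and `second_degree_expansion` are hypothetical.

```python
import numpy as np


def normalize_rows(X):
    """Divide each row by its Euclidean norm (all-zero rows are left unchanged)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    return X / norms


def correlation_weights(X, y):
    """Assumed weighting: |Pearson correlation| of each column with y.

    Constant columns (or a constant y) get weight 0.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    safe = np.where(denom > 0, denom, 1.0)
    return np.where(denom > 0, np.abs(Xc.T @ yc) / safe, 0.0)


def second_degree_expansion(X):
    """Append all pairwise products x_i * x_j (i <= j) to the original columns."""
    i, j = np.triu_indices(X.shape[1])
    return np.hstack([X, X[:, i] * X[:, j]])


def fit(X_train, y_train):
    """Learn the column maxima and correlation weights on training data."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    col_max = np.abs(X_train).max(axis=0)
    col_max[col_max == 0] = 1.0  # avoid division by zero for all-zero columns
    scaled = normalize_rows(X_train / col_max)
    return col_max, correlation_weights(scaled, y_train)


def transform(X, col_max, weights):
    """Apply the steps in order: column scaling, row scaling, weighting, expansion."""
    X = np.asarray(X, dtype=float)
    scaled = normalize_rows(X / col_max)
    return second_degree_expansion(scaled * weights)
```

Under these assumptions, `col_max, w = fit(X_train, y_train)` would be computed once on the training split, and `transform(X, col_max, w)` would produce the extended feature matrix passed to a Random Forest classifier, for both training and test data.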

Keywords

classification, ensemble-based method, Random Forest, fraud detection, retail, Ito decomposition, imbalanced dataset

Citation

Vitynskyi P. Ensemble-based method of fraud detection at self-checkouts in retail / P. Vitynskyi, R. Tkachenko, I. Izonin // Econtechmod : scientific journal. — Lublin, 2019. — Vol 8. — No 4. — P. 3–8.