Marui DU*, Zuoquan ZHANG
School of Mathematics and Statistics, Beijing Jiaotong University, China
firstname.lastname@example.org (*Corresponding author), Zuoquanzhang@163.com
Abstract: Identifying defaulting enterprises and detecting abnormal behavior in the financial field both face a serious imbalance in the proportions of data samples, yet many machine learning classification models assume that the classes are represented in roughly equal proportions. As a result, when the data are imbalanced, classification models often have low recognition rates for the minority class and fail to achieve the desired classification performance. To solve this problem, this paper proposes a data balancing technique, HBSBoost, that combines hybrid sampling with a boosting algorithm. The model first uses hybrid sampling to construct a balanced dataset. The boosting algorithm is then employed to improve the ability of the machine learning algorithm to discriminate the minority class. The proposed method outperforms the random undersampling, SMOTE, hybrid sampling, SMOTEBoost, and RUSBoost algorithms on seven real-world datasets.
Keywords: Credit risk, Default recognition, Imbalance classification, Machine learning.
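The hybrid sampling step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `hybrid_resample`, the choice of the midpoint between the two class sizes as the common target, and the use of plain random duplication (rather than SMOTE-style synthetic samples) are all assumptions made for illustration. The boosting stage, which would then be trained on the balanced data, is not shown.

```python
import random


def hybrid_resample(X, y, minority_label=1, seed=0):
    """Balance a binary dataset by combining under- and oversampling.

    Assumptions (not taken from the paper): `minority_label` marks the
    smaller class, and both classes are brought to the midpoint of the
    two original class sizes.
    """
    rng = random.Random(seed)
    minority = [(x, lab) for x, lab in zip(X, y) if lab == minority_label]
    majority = [(x, lab) for x, lab in zip(X, y) if lab != minority_label]
    if not minority or not majority:
        raise ValueError("both classes must be present")

    # Common class size after balancing (illustrative choice).
    target = (len(minority) + len(majority)) // 2

    # Undersample the majority class without replacement.
    kept_majority = rng.sample(majority, min(target, len(majority)))

    # Oversample the minority class: keep every original sample and
    # add randomly chosen duplicates until the target size is reached.
    n_extra = max(0, target - len(minority))
    grown_minority = minority + [rng.choice(minority) for _ in range(n_extra)]

    balanced = kept_majority + grown_minority
    rng.shuffle(balanced)
    X_bal = [x for x, _ in balanced]
    y_bal = [lab for _, lab in balanced]
    return X_bal, y_bal


# Toy data: 90 majority samples (label 0) and 10 minority samples (label 1).
X = [[float(i)] for i in range(100)]
y = [0] * 90 + [1] * 10
X_bal, y_bal = hybrid_resample(X, y, minority_label=1, seed=42)
print(y_bal.count(0), y_bal.count(1))  # 50 50
```

Compared with undersampling alone, this discards fewer majority samples, and compared with oversampling alone it duplicates fewer minority samples, which is the usual motivation for a hybrid scheme.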
CITE THIS PAPER AS:
Marui DU, Zuoquan ZHANG, HBSBoost: A Hybrid Balancing Technique for Defaulting Enterprise Recognition, Studies in Informatics and Control, ISSN 1220-1766, vol. 31(4), pp. 67-77, 2022. https://doi.org/10.24846/v31i4y202207