Incremental Learning Method for Data with Delayed Labels

Authors

  • Haoran Gao, The MOE Key Laboratory of Embedded System and Service Computation, Tongji University, Shanghai, 201804, China
  • Zhijun Ding, The MOE Key Laboratory of Embedded System and Service Computation, Tongji University, Shanghai, 201804, China
  • Meiqin Pan, School of Business and Management, Shanghai International Studies University, Shanghai, 200083, China

DOI:

https://doi.org/10.31577/cai_2022_5_1260

Keywords:

Delayed labels, transfer learning, concept drift, incremental learning, credit scoring

Abstract

Most research on machine learning tasks assumes that true labels are available immediately after a prediction is made. In many cases, however, the ground-truth labels become available only after a non-negligible delay. In general, delayed labels create two problems. First, labelled data is scarce, because the labels for each data chunk arrive in several batches over time rather than all at once. Second, concept drift can occur because the data spans a long period. In this work, we propose a novel incremental ensemble learning method for settings in which labels are delayed. First, we build a sliding time window to preserve the historical data. Then we train an adaptive classifier on the labelled data in the sliding time window. Notably, we improve TrAdaBoost to augment the data of the latest time step when building an adaptive classifier; the improved version can correctly distinguish between the types of misclassification of source-domain samples. Finally, we integrate the various classifiers to make predictions. We apply our algorithms to synthetic and real credit scoring datasets. The experimental results indicate that our algorithms are superior in the delayed-labelling setting.
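
The window-and-ensemble steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `DelayedLabelWindow` and `majority_vote` are hypothetical, the improved TrAdaBoost step is omitted entirely, and a plain majority vote is assumed as a stand-in for whatever integration rule the paper uses.

```python
from collections import Counter, deque


class DelayedLabelWindow:
    """Hypothetical sliding window of recent chunks whose labels arrive late."""

    def __init__(self, max_chunks):
        # A bounded deque drops the oldest chunk once the window is full.
        self.chunks = deque(maxlen=max_chunks)

    def add_chunk(self, chunk_id, features):
        # Every sample starts unlabelled (None) until its label arrives.
        features = list(features)
        self.chunks.append(
            {"id": chunk_id, "X": features, "y": [None] * len(features)}
        )

    def add_labels(self, chunk_id, labels):
        # `labels` maps sample index -> label. Labels for a chunk may arrive
        # in several batches; batches for chunks that have already left the
        # window are silently ignored.
        for chunk in self.chunks:
            if chunk["id"] == chunk_id:
                for index, label in labels.items():
                    chunk["y"][index] = label

    def labelled(self):
        # Only the samples whose labels have arrived can be used for training.
        return [
            (x, y)
            for chunk in self.chunks
            for x, y in zip(chunk["X"], chunk["y"])
            if y is not None
        ]


def majority_vote(classifiers, x):
    # Assumed integration rule: each classifier is a callable returning a label.
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]
```

In such a setup, a new classifier would be trained on `window.labelled()` each time a label batch arrives, and predictions for incoming samples would come from `majority_vote` over the classifiers kept so far.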

Published

2022-12-31

How to Cite

Gao, H., Ding, Z., & Pan, M. (2022). Incremental Learning Method for Data with Delayed Labels. Computing and Informatics, 41(5), 1260–1283. https://doi.org/10.31577/cai_2022_5_1260