A Wasserstein Distance-Based Cost-Sensitive Framework for Imbalanced Data Classification

R. Feng, H. Ji, Z. Zhu, L. Wang

Issue: 3/2023
Journal: Radioengineering Journal
DOI: 10.13164/re.2023.0451

Keywords: Imbalanced classification, cost-sensitive, structural information, Wasserstein distance, radar emitter signal

Abstract: Class imbalance is a prevalent problem in many real-world applications, and an imbalanced data distribution can severely skew classifier performance. In general, the higher the imbalance ratio of a dataset, the harder it is to classify. However, standard classifiers still achieve good results on some highly imbalanced datasets. This suggests that class imbalance is only a superficial characteristic of the data, and that the underlying structural information is often the key factor affecting classification performance. As implicit prior knowledge, structural information has been shown to be crucial for designing a good classifier. This paper proposes a Wasserstein-based cost-sensitive support vector machine (CS-WSVM) for class imbalance learning that incorporates prior structural information and a cost-sensitive strategy. The Wasserstein distance is used to model the distributions of the majority and minority samples and thereby capture structural information, which is then used to weight the samples of both classes. Comprehensive experiments on synthetic and real-world datasets, in particular a radar emitter signal dataset, demonstrate that CS-WSVM achieves outstanding performance in imbalanced scenarios.
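
The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the CS-WSVM formulation: the abstract does not specify how the Wasserstein distance enters the sample weights. The sketch only shows a one-dimensional Wasserstein-1 distance between two empirical samples and a common inverse-frequency heuristic for class costs. The function names wasserstein_1d and balanced_class_costs and the toy data are assumptions made for illustration.

```python
from bisect import bisect_right
from collections import Counter


def wasserstein_1d(u, v):
    """Wasserstein-1 distance between two 1-D empirical samples.

    Computed as the integral of |F_u(x) - F_v(x)| over x, where F_u and
    F_v are the empirical cumulative distribution functions.
    """
    u_sorted, v_sorted = sorted(u), sorted(v)
    grid = sorted(list(u) + list(v))
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        cdf_u = bisect_right(u_sorted, left) / len(u_sorted)
        cdf_v = bisect_right(v_sorted, left) / len(v_sorted)
        total += abs(cdf_u - cdf_v) * (right - left)
    return total


def balanced_class_costs(labels):
    """Inverse-frequency costs: n_samples / (n_classes * class_count).

    This is a generic cost-sensitive heuristic, assumed here for
    illustration; it is not taken from the paper.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


# Toy one-dimensional feature values (hypothetical data).
majority = [0.0, 1.0, 3.0]
minority = [5.0, 6.0, 8.0]
distance = wasserstein_1d(majority, minority)

# Eight majority labels and two minority labels (hypothetical data).
costs = balanced_class_costs([0] * 8 + [1] * 2)
```

In this toy case the minority sample is the majority sample shifted by 5, so the distance equals the shift, and the minority class receives four times the cost of the majority class. Real feature vectors are multi-dimensional, where the Wasserstein distance requires solving an optimal transport problem rather than comparing cumulative distribution functions.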