Effective density-based concept drift detection for evolving data streams

conference contribution
posted on 2024-12-11, 00:11 authored by Z Cui, H Tian, Hong Shen
Concept drift is a common phenomenon in evolving data streams across a wide range of applications, including credit card fraud protection, weather forecasting, and network monitoring. For online data streams it is difficult to determine a proper sliding-window size for detecting concept drift, which makes existing dataset-distance-based algorithms ineffective in practice. In this paper, we propose a novel framework, Density-based Concept Drift Detection (DCDD), which detects concept drift in data streams by applying density-based clustering to a variable-size sliding window whose size is adjusted dynamically. DCDD uses XGBoost (eXtreme Gradient Boosting) to predict the amount of data belonging to the same concept and adjusts the size of the sliding window dynamically based on the information collected about concept drift. To detect concept drift between two datasets, DCDD calculates the distance between them using a new detection formula that uses the time attribute as a weight for old data. It calculates the distance between the data in the current sliding window and all data in the current concept, rather than between two adjacent windows as in the existing work DCDA [2]. This yields an observable improvement in detection accuracy and a significant improvement in detection efficiency. Experimental results show that our framework detects concept drift more accurately and efficiently than the existing work.
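The comparison described in the abstract (current window against all data in the current concept, with older data weighted by time) can be illustrated with a minimal sketch. This is not the DCDD detection formula, which this record does not reproduce: the exponential decay weighting, the mean-based Euclidean distance, the fixed threshold, and every function and parameter name below are assumptions chosen for illustration. The sketch also omits the density-based clustering and the XGBoost-driven window sizing.

```python
import math

# Hypothetical illustration only: the weighting scheme, distance measure,
# and threshold are assumptions, not the formula used by DCDD.

def time_weights(n, decay=0.1):
    """Weights for n points ordered oldest to newest; the newest gets weight 1."""
    return [math.exp(-decay * (n - 1 - i)) for i in range(n)]

def weighted_mean(points, weights):
    """Component-wise weighted mean of equal-length numeric tuples."""
    total = sum(weights)
    dim = len(points[0])
    return [
        sum(w * p[d] for p, w in zip(points, weights)) / total
        for d in range(dim)
    ]

def drift_distance(concept, window, decay=0.1):
    """Distance between the time-weighted mean of all data in the current
    concept and the unweighted mean of the current sliding window."""
    concept_mean = weighted_mean(concept, time_weights(len(concept), decay))
    window_mean = weighted_mean(window, [1.0] * len(window))
    return math.dist(concept_mean, window_mean)

def detect_drift(concept, window, threshold, decay=0.1):
    """Flag a drift when the distance exceeds an assumed fixed threshold."""
    return drift_distance(concept, window, decay) > threshold

if __name__ == "__main__":
    concept = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]
    print(detect_drift(concept, [(0.0, 0.0), (0.1, 0.1)], threshold=1.0))
    print(detect_drift(concept, [(5.0, 5.0), (5.1, 4.9)], threshold=1.0))
```

In this sketch a larger `decay` makes older points in the concept count less, so the reference against which the window is compared follows the most recent behaviour of the concept more closely.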

Funding

Category 2 - Other Public Sector Grants Category

History

Editor

Park JS; Takizawa H; Shen H; Park JJ

Volume

1112 LNEE

Start Page

190

End Page

201

Number of Pages

12

Start Date

2023-08-16

Finish Date

2023-08-18

eISSN

1876-1119

ISSN

1876-1100

ISBN-13

9789819982103

Location

Jeju, Korea

Publisher

Springer

Place of Publication

Singapore

Peer Reviewed

  • Yes

Open Access

  • No

Era Eligible

  • Yes

Name of Conference

The 24th International Conference on Parallel and Distributed Computing, Applications and Technologies

Parent Title

Parallel and Distributed Computing, Applications and Technologies, PDCAT Proceedings
