Anomaly Detection for Reoccurring Concept Drift in Smart Environments
- Authors: Agate V.; Drago S.; Ferraro P.; Lo Re G.
- Publication year: 2022
- Type: Conference proceedings contribution published in a volume
- OA Link: http://hdl.handle.net/10447/588276
Many crowdsensing applications today rely on learning algorithms applied to data streams to accurately classify information and events of interest in smart environments. Unfortunately, the statistical properties of the input data may change in unexpected ways. As a result, the definition of anomalous and normal data can vary over time, and machine learning models may need to be retrained incrementally. This problem is known as concept drift, and it has often been ignored by anomaly detection systems, resulting in significant performance degradation. In addition, the statistical distribution of past data often tends to repeat itself, so old learning models could be reused, avoiding costly retraining phases on new data that would waste computational and energy resources. In this paper, we propose a hybrid anomaly detection system for streaming data in smart environments that accounts for concept drift and minimizes the number of machine learning models that need to be retrained when shifts in the incoming data distribution are detected. The system is multi-tier and relies on two different concept drift detection modules and an ensemble of anomaly detection models. An extensive experimental evaluation has been carried out using two real datasets and a synthetic one; the results show the high performance achieved by the system on common metrics such as F1-score and accuracy.
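
The idea of reusing old models when a past data distribution reappears can be illustrated with a minimal sketch. This is not the authors' system: the abstract does not specify its drift detectors or anomaly models, so everything below is an assumption made for illustration. The names `ZScoreModel` and `ModelPool` are hypothetical, the mean-shift test stands in for a real drift detector, and a single z-score detector replaces the ensemble.

```python
import statistics
from dataclasses import dataclass


@dataclass
class ZScoreModel:
    """Toy anomaly detector (assumed, not from the paper)."""
    mean: float
    std: float

    @classmethod
    def fit(cls, window):
        # Guard against zero spread so the thresholds stay meaningful.
        return cls(statistics.fmean(window), statistics.pstdev(window) or 1e-9)

    def is_anomaly(self, x, k=3.0):
        # A value is anomalous if it lies more than k std from the mean.
        return abs(x - self.mean) > k * self.std

    def matches(self, window, k=3.0):
        # Stand-in drift test: the window fits this concept if its mean
        # lies within k std of the mean seen at training time.
        return abs(statistics.fmean(window) - self.mean) <= k * self.std


class ModelPool:
    """Keeps one model per concept seen so far and reuses them on recurrence."""

    def __init__(self):
        self.models = []
        self.active = None
        self.retrain_count = 0

    def update(self, window):
        # No drift: keep the active model.
        if self.active is not None and self.active.matches(window):
            return self.active
        # Drift: look for a previously learned concept before retraining.
        for model in self.models:
            if model.matches(window):
                self.active = model
                return model
        # Unseen concept: train a new model and store it for later reuse.
        self.active = ZScoreModel.fit(window)
        self.models.append(self.active)
        self.retrain_count += 1
        return self.active


pool = ModelPool()
first = pool.update([10.0, 10.2, 9.8, 10.1, 9.9])    # new concept, trains
second = pool.update([50.0, 50.3, 49.7, 50.1, 49.9])  # drift, trains again
third = pool.update([10.1, 9.9, 10.0, 10.2, 9.8])     # old concept returns
print(third is first, pool.retrain_count)
```

In this sketch the third window reactivates the model trained on the first one, so only two training phases occur across three windows. A realistic system would replace the mean-shift check with a proper drift detector and would bound the size of the pool.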