This method identifies groups of individual outlier values as well as transitions of the normal range of values in approximately normally distributed time series data. The data is not expected to have seasonality or trend components.
Outlier values are determined by applying the Generalized Extreme Studentized Deviate test (GESD: https://www.itl.nist.gov/div898/handbook/eda/section3/eda35h3.htm) to a time-window-based subset of the data. The GESD determines a threshold for the absolute deviation of a value from the mean, beyond which the value is considered an outlier. As an input parameter, the GESD requires the maximum number of potential outliers; the default configuration uses 1/8 of the number of data points in a time window. The start time of the fixed-width time window is advanced gradually from the start of the time series until all data has been analyzed. In a final step, multiple outlier values that occur close together in a short time span are grouped and tagged as one anomaly.
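The GESD test described above can be sketched in Python as follows. This is a minimal illustration of the standard test from the linked NIST handbook, not the product's actual implementation; the function name, the significance level alpha, and the return convention are illustrative assumptions.

```python
import numpy as np
from scipy import stats

def gesd_outliers(values, max_outliers, alpha=0.05):
    """Generalized ESD test: return the indices of detected outliers.

    At each step, the most extreme remaining value (largest absolute
    deviation from the mean) is removed and its test statistic R_i is
    compared against the critical value lambda_i derived from the
    t-distribution. The number of outliers is the largest i with
    R_i > lambda_i.
    """
    x = np.asarray(values, dtype=float)
    idx = np.arange(len(x))
    removed = []   # candidate outlier indices, most extreme first
    n_out = 0      # largest step i whose statistic exceeded lambda_i
    for i in range(1, max_outliers + 1):
        n = len(x)
        if n < 3:
            break
        mean, sd = x.mean(), x.std(ddof=1)
        if sd == 0:
            break
        dev = np.abs(x - mean)
        j = dev.argmax()
        r = dev[j] / sd
        # Critical value for step i (here n is the remaining sample size).
        p = 1 - alpha / (2 * n)
        t = stats.t.ppf(p, n - 2)
        lam = (n - 1) * t / np.sqrt((n - 2 + t * t) * n)
        removed.append(int(idx[j]))
        if r > lam:
            n_out = i
        x = np.delete(x, j)
        idx = np.delete(idx, j)
    return sorted(removed[:n_out])
```

Supplying the maximum number of potential outliers up front is what lets the GESD avoid the masking effect, where several extreme values inflate the standard deviation and hide each other from simpler threshold tests.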
This process can optionally be repeated with different window sizes; the default configuration uses 1x, 2x, and 3x the initial window size. The reason is that transitions of the normal range of values often occur over a longer time period, and the analyzed time window needs to cover a large part of the transition for it to be detected. On the other hand, larger time windows delay the identification of anomalies and can obscure smaller anomalies when a single window covers several of them.
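The sliding-window scan and the grouping step can be sketched as below. To keep the example self-contained, a simple z-score rule stands in for the full GESD test; the function names, the step size, the 3.0 threshold, and the grouping gap are all illustrative assumptions, not documented defaults.

```python
import numpy as np

def zscore_outliers(window, max_outliers, threshold=3.0):
    """Stand-in for the GESD test: flag at most max_outliers points
    whose absolute z-score exceeds a fixed threshold."""
    z = np.abs(window - window.mean()) / window.std(ddof=1)
    top = np.argsort(z)[::-1][:max_outliers]
    return [int(i) for i in top if z[i] > threshold]

def sliding_window_scan(series, base_window, step=1, scales=(1, 2, 3)):
    """Slide windows of 1x, 2x, and 3x the base size over the series
    and collect the global indices of all flagged values."""
    x = np.asarray(series, dtype=float)
    flagged = set()
    for scale in scales:
        w = base_window * scale
        max_out = max(1, w // 8)  # default: 1/8 of the window's points
        for start in range(0, len(x) - w + 1, step):
            for j in zscore_outliers(x[start:start + w], max_out):
                flagged.add(start + j)
    return sorted(flagged)

def group_anomalies(indices, max_gap=3):
    """Merge nearby outlier indices into (first, last) anomaly groups."""
    groups = []
    for i in indices:
        if groups and i - groups[-1][1] <= max_gap:
            groups[-1][1] = i
        else:
            groups.append([i, i])
    return [tuple(g) for g in groups]
```

The scales loop reflects the trade-off stated above: small windows react quickly to isolated spikes, while the 2x and 3x passes are needed to cover enough of a slow transition of the normal range for it to register as anomalous.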
Method for Anomaly Detection: Standard Deviation – Sliding Temporal Window