Anomaly Detection

The primary algorithm, Seasonal Hybrid ESD (S-H-ESD), builds upon the Generalized ESD test for detecting anomalies. S-H-ESD can detect both global and local anomalies. It does so by combining time series decomposition with robust statistical metrics, namely the median, together with ESD.
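To make the idea concrete, here is a minimal Python sketch of the general approach, not the package's actual implementation. It rests on several assumptions: the function name `seasonal_hybrid_esd_sketch` and its parameters are hypothetical, the seasonal component is estimated with a simple per-phase median rather than a full decomposition, the spread is measured with the median absolute deviation (MAD), and the Student's t quantile in the Generalized ESD critical value is replaced by a normal quantile so that only the standard library is needed.

```python
import math
from statistics import NormalDist, median


def seasonal_hybrid_esd_sketch(series, period, max_anoms=0.05, alpha=0.05):
    """Return sorted indices flagged as anomalous (illustrative sketch only)."""
    n = len(series)
    base = median(series)
    # Assumption: a per-phase median, centered on the global median,
    # stands in for a proper seasonal decomposition.
    seasonal = [median(series[phase::period]) - base for phase in range(period)]
    # Subtract the seasonal component and the median (instead of a trend).
    residuals = [x - seasonal[i % period] - base for i, x in enumerate(series)]

    k = max(1, int(max_anoms * n))  # upper bound on the number of anomalies
    remaining = list(range(n))
    candidates = []
    n_anoms = 0
    for i in range(1, k + 1):
        values = [residuals[j] for j in remaining]
        med = median(values)
        # 1.4826 makes the MAD comparable to a standard deviation
        # for normally distributed data.
        mad = 1.4826 * median(abs(v - med) for v in values)
        if mad == 0:
            break
        # Most extreme remaining point, measured robustly.
        idx = max(remaining, key=lambda j: abs(residuals[j] - med))
        r_stat = abs(residuals[idx] - med) / mad
        remaining.remove(idx)
        candidates.append(idx)

        # Generalized ESD critical value for step i.
        m = n - i
        p = 1 - alpha / (2 * (m + 1))
        t = NormalDist().inv_cdf(p)  # assumption: normal approximation to t
        lam = m * t / math.sqrt((m - 1 + t * t) * (m + 1))
        if r_stat > lam:
            n_anoms = i  # keep the largest i whose statistic exceeds lambda_i
    return sorted(candidates[:n_anoms])
```

Calling `seasonal_hybrid_esd_sketch(values, period=7)` on daily data with a weekly pattern returns the positions of the flagged points. Using the median and MAD rather than the mean and standard deviation keeps a few extreme values from inflating the very statistics used to judge them.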

In addition, for long time series, such as 6 months of per-minute data, the algorithm employs piecewise approximation. This is rooted in the fact that extracting the trend in the presence of anomalies is non-trivial.
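One way to picture piecewise approximation is a baseline built from the median of each consecutive chunk of the series. The Python sketch below illustrates that idea only. The function name `piecewise_median_baseline` and the `window` parameter are hypothetical, and the package may choose piece boundaries and lengths differently.

```python
from statistics import median


def piecewise_median_baseline(series, window):
    """Approximate a slowly changing level with one median per piece.

    Illustrative sketch: each consecutive block of `window` points is
    replaced by its median, which a handful of anomalies inside the
    block cannot shift very far.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    baseline = []
    for start in range(0, len(series), window):
        piece = series[start:start + window]
        # Repeat the piece's median so the baseline aligns with the input.
        baseline.extend([median(piece)] * len(piece))
    return baseline
```

Subtracting this baseline from the series leaves residuals in which a spike still stands out, because the median of a piece is barely affected by a few extreme points, whereas a fitted trend line can be pulled toward them.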

Early detection of anomalies plays a key role in ensuring high-fidelity data is available to our own product teams and those of our data partners.

This package helps us monitor spikes in user engagement on the platform around holidays, major sporting events, or breaking news.

Beyond surges in social engagement, exogenous factors – such as bots or spammers – may cause an anomaly in the number of favorites or followers. The package can be used to find such bots or spam, as well as to detect anomalies in system metrics after a new software release.

We’re open-sourcing AnomalyDetection because we’d like the public community to evolve the package and learn from it as we have.