How to perform anomaly detection with LOF – Towards Data Science

An introduction to performing outlier detection with the Local Outlier Factor (LOF) algorithm.

Anomaly detection, although useful, is a topic that often gets skipped in machine learning classes. There are many applications of anomaly detection, especially in areas such as fraud detection and system monitoring.

If you've followed my blog for some time, you'll remember that I previously wrote an article about using Isolation Forests for anomaly detection.

Aside from Isolation Forests, there is another anomaly detection algorithm, the Local Outlier Factor (LOF), that also performs well in practice. In this article, I will briefly go over the LOF algorithm and demonstrate how you can use it for anomaly detection in Python.

The LOF algorithm is an unsupervised algorithm for anomaly detection. It borrows concepts from the K-nearest neighbors algorithm and produces an anomaly score based on how isolated a point is from its local neighbors. The basic hypothesis of this algorithm is that outliers or anomalies sit in regions of lower density than other points, meaning their nearest neighbors are farther away.
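To make the idea concrete, here is a minimal sketch using scikit-learn's `LocalOutlierFactor`. The toy data (a tight cluster plus one distant point) and the choice of `n_neighbors=10` are assumptions made purely for illustration; they are not taken from the article.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.RandomState(42)

# Hypothetical toy data: 50 points in a tight cluster plus one far-away point.
inliers = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
outlier = np.array([[6.0, 6.0]])
X = np.vstack([inliers, outlier])

# n_neighbors plays the role of k; 10 is an arbitrary choice for this sketch.
lof = LocalOutlierFactor(n_neighbors=10)

# fit_predict returns 1 for inliers and -1 for outliers.
labels = lof.fit_predict(X)

# scikit-learn stores the negated LOF score, so flip the sign:
# larger values mean a point is more isolated from its neighbors.
scores = -lof.negative_outlier_factor_

print("label of the distant point:", labels[-1])
print("LOF score of the distant point:", scores[-1])
```

Points inside the cluster should score close to 1, while the isolated point should score much higher, which matches the low-density hypothesis described above.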

To fully explain how this algorithm computes anomaly scores, we need to understand four concepts in the following order:

The k-distance is the distance between a point and its k-th nearest neighbor. The value we select for k is a hyperparameter of the LOF algorithm that we can experiment with to produce different results. Consider the diagram below, where the second-closest point (or second-nearest neighbor) to point A is point B, so the k-distance with k=2 is the distance between A and B.
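The k-distance can be computed directly with NumPy. The sketch below is only an illustration: the helper name `k_distance` and the coordinates are hypothetical, chosen so that B is the second-nearest neighbor of A, and Euclidean distance is assumed.

```python
import numpy as np

def k_distance(points, index, k):
    """Euclidean distance from points[index] to its k-th nearest neighbor."""
    # Distances from the chosen point to every point in the dataset.
    dists = np.linalg.norm(points - points[index], axis=1)
    # Drop the zero distance from the point to itself.
    dists = np.delete(dists, index)
    # The k-th smallest remaining distance is the k-distance.
    return np.sort(dists)[k - 1]

# Hypothetical coordinates for illustration only.
points = np.array([
    [0.0, 0.0],  # A
    [3.0, 4.0],  # B (second-nearest neighbor of A)
    [1.0, 0.0],  # C (nearest neighbor of A)
    [6.0, 8.0],  # D
])

print(k_distance(points, index=0, k=2))  # distance between A and B
```

Changing `k` changes which neighbor defines the distance, which is why k acts as a hyperparameter for the rest of the algorithm.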
