Anomaly Detection in Large Sets of High-Dimensional Symbol Sequences (sequenceMiner)
TBMG-23491
12/01/2015
- Content
sequenceMiner was developed to detect and describe anomalies in large sets of high-dimensional symbol sequences. It performs unsupervised clustering (grouping) of sequences, using the normalized longest common subsequence (LCS) as the similarity measure, and then analyzes the outliers in detail to detect anomalies. sequenceMiner uses a new hybrid algorithm for computing the LCS that has been shown to outperform existing algorithms by a factor of five. It also includes new outlier-analysis algorithms that provide comprehensible indicators of why a particular sequence was deemed an outlier. This gives analysts a coherent description of the anomalies identified in a sequence and of why they differ from more normal sequences.
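To make the similarity measure concrete, the following is a minimal Python sketch of a normalized LCS score between two symbol sequences. It uses the textbook dynamic-programming LCS, not sequenceMiner's hybrid algorithm, which is not described here. The normalization by the geometric mean of the two sequence lengths is an assumption, as are the function names `lcs_length` and `normalized_lcs`.

```python
from math import sqrt


def lcs_length(a, b):
    """Length of the longest common subsequence of a and b.

    Textbook dynamic programming: O(len(a) * len(b)) time and
    O(min(len(a), len(b))) memory. This is NOT the hybrid algorithm
    used by sequenceMiner.
    """
    # Keep the shorter sequence on the inner loop to bound memory.
    if len(a) < len(b):
        a, b = b, a
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def normalized_lcs(a, b):
    """Similarity in [0, 1]: LCS length over sqrt(len(a) * len(b)).

    The geometric-mean normalization is an assumed choice for this sketch.
    """
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / sqrt(len(a) * len(b))
```

Under this sketch, identical sequences score 1.0 and sequences with no symbols in common score 0.0. A clustering step could use `1 - normalized_lcs(a, b)` as a distance, with sequences that sit far from every cluster treated as candidate outliers.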
- Citation
- "Anomaly Detection in Large Sets of High-Dimensional Symbol Sequences (sequenceMiner)," Mobility Engineering, December 1, 2015.