Significant Pattern Mining (Westfall-Young Light)
Summary
In this project, we developed an approach that uses permutation testing to improve statistical power in significant pattern mining.
Significant pattern mining algorithms must deal with a vast search space, often containing billions or even trillions of candidate patterns. However, these patterns are often heavily inter-related, resulting in pronounced statistical redundancies. Existing approaches either (1) ignore these redundancies, which leads to overly conservative significance thresholds and a loss of statistical power, or (2) are computationally demanding in both runtime and memory usage, which limits their applicability to small datasets.
Here, we propose a novel, fast and memory-efficient permutation testing algorithm for significant pattern mining that overcomes both limitations.
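To make the idea of a permutation-based significance threshold concrete, the following is a minimal sketch of the naive Westfall-Young min-p procedure on an explicit list of binary patterns. It is not the fast algorithm from the publication: it enumerates every pattern in every permutation, which is exactly the cost the proposed method is designed to avoid. The function names, the choice of a two-sided Fisher exact test, and the default parameters are illustrative assumptions.

```python
import math
import random


def fisher_pvalue(a, x, n1, n):
    """Two-sided Fisher exact p-value for a 2x2 contingency table.

    a: positive-class samples containing the pattern, x: samples containing
    the pattern, n1: size of the positive class, n: total number of samples.
    """
    denom = math.comb(n, x)

    def prob(k):
        # Hypergeometric probability of k pattern occurrences in the positive class.
        return math.comb(n1, k) * math.comb(n - n1, x - k) / denom

    lo, hi = max(0, x - (n - n1)), min(x, n1)
    p_obs = prob(a)
    # Sum over all tables at least as extreme (no more probable) than the observed one.
    total = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def min_pvalue(patterns, labels):
    """Smallest p-value over all patterns (each a 0/1 occurrence list)."""
    n, n1 = len(labels), sum(labels)
    best = 1.0
    for occ in patterns:
        x = sum(occ)
        a = sum(o for o, y in zip(occ, labels) if y)
        best = min(best, fisher_pvalue(a, x, n1, n))
    return best


def westfall_young_threshold(patterns, labels, alpha=0.05, n_perm=1000, seed=0):
    """Largest threshold whose empirical FWER under label permutation is <= alpha."""
    rng = random.Random(seed)
    perm = list(labels)
    min_ps = []
    for _ in range(n_perm):
        rng.shuffle(perm)  # permuting labels breaks any pattern-label association
        min_ps.append(min_pvalue(patterns, perm))
    threshold = 0.0
    for t in sorted(set(min_ps)):
        fwer = sum(p <= t for p in min_ps) / n_perm
        if fwer > alpha:
            break
        threshold = t
    return threshold
```

In this sketch, a pattern would be reported as significant if its p-value on the original labels is at most the returned threshold. Because the threshold depends only on the minimum p-value per permutation, adding exact copies of existing patterns leaves it unchanged, whereas a Bonferroni threshold of alpha divided by the number of patterns would keep shrinking.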
Code
A beta version of the code is available in our GitHub repository.
Publication
Fast and Memory-Efficient Significant Pattern Mining via Permutation Testing
Felipe Llinares-López, Mahito Sugiyama, Laetitia Papaxanthos and Karsten Borgwardt
Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD2015), 2015, 725-734
Online | ETH Research Collection | Project page | GitHub