Graduation Year

2017

Document Type

Thesis

Degree

M.S.

Degree Name

Master of Science (M.S.)

Degree Granting Department

Engineering

Major Professor

Yicheng Tu, Ph.D.

Committee Member

Yao Liu, Ph.D.

Committee Member

Paul Rosen, Ph.D.

Keywords

Data Mining, Vertex Overlap, Anti-monotonic, Linear Programming, Hypergraph

Abstract

In recent years, the popularity of graph databases has grown rapidly. This paper focuses on single-graph as an effective model to represent information and its related graph mining techniques. In frequent pattern mining in a single-graph setting, there are two main problems: support measure and search scheme. In this paper, we propose a novel framework for constructing support measures that brings together existing minimum-image-based and overlap-graph-based support measures. Our framework is built on the concept of occurrence / instance hypergraphs. Based on that, we present two new support measures: minimum instance (MI) measure and minimum vertex cover (MVC) measure, that combine the advantages of existing measures. In particular, we show that the existing minimum-image-based support measure is an upper bound of the MI measure, which is also linear-time computable and results in counts that are close to number of instances of a pattern. Although the MVC measure is NP-hard, it can be approximated to a constant factor in polynomial time. We also provide polynomial-time relaxations for both measures and bounding theorems for all presented support measures in the hypergraph setting. We further show that the hypergraph-based framework can unify all support measures studied in this paper. This framework is also flexible in that more variants of support measures can be defined and profiled in it.

Share

COinS