MS in Computer Science (M.S.C.S.)
Degree Granting Department
Computer Science and Engineering
Adriana Iamnitchi, Ph.D.
John Skvoretz, Ph.D.
Paul Rosen, Ph.D.
Graph Anonymization, Privacy Metric, Linkage Covariance
Real social graph datasets are fundamental to understanding a variety of phenomena, such as epidemics, crowd management, and political uprisings, yet releasing digital recordings of such datasets exposes the participants to privacy violations. A safer approach to making real social network topologies available is to anonymize them by modifying the graph structure enough to decouple node identity from social ties while preserving the graph characteristics in aggregate. At scale, this approach poses a significant challenge in computational complexity.
This thesis questions the need to structurally anonymize very large graphs. Intuitively, the larger the graph, the easier it is for an individual to be “lost in the crowd”. On the other hand, new topological structures may emerge at scale, and those can expose individual nodes in ways that smaller structures do not.
To address this question, this work introduces a set of metrics for measuring the indistinguishability of nodes in large-scale social networks independent of attack models, and shows that different graphs have different levels of inherent node indistinguishability. Moreover, we show that when varying the size of a graph, the inherent node indistinguishability increases with the size of the graph. In other words, the larger a graph of a given structure, the higher the indistinguishability of its nodes.
Scholar Commons Citation
Vadamalai, Subramanian Viswanathan, "Lost In The Crowd: Are Large Social Graphs Inherently Indistinguishable?" (2017). Graduate Theses and Dissertations.