Doctor of Philosophy (Ph.D.)
Degree Granting Department
Computer Science and Engineering
Yi-Cheng Tu, Ph.D.
Sagar Pandit, Ph.D.
Yao Liu, Ph.D.
Michael Weng, Ph.D.
Wen-Xiu Ma, Ph.D.
Big Data, Molecular Simulations, Push-Based, SDH, Streaming
Thanks to advances in modern computer simulation systems, many scientific applications generate, and require manipulation of, large volumes of data. Scientific exploration relies substantially on effective and accurate data analysis. The sheer size of the generated data, however, poses major challenges for analyzing the system. In this dissertation we propose novel techniques, and apply some known designs in novel ways, to improve scientific data analysis.
We develop an efficient method to compute an analytical query called the spatial distance histogram (SDH). Special heuristics are exploited to process SDH efficiently and accurately. We further develop a mathematical model to analyze the mechanism that leads to errors. This gives rise to a new approximate algorithm with an improved time/accuracy tradeoff.
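For orientation, an SDH is commonly understood as a histogram of the distances between all pairs of particles in a dataset. The sketch below is only the brute-force quadratic baseline that efficient or approximate methods aim to improve on; it is not the heuristic or approximate algorithm developed in the dissertation. The function name `brute_force_sdh`, its parameters, and the choice to clamp overlong distances into the last bucket are assumptions made for illustration.

```python
import math
from itertools import combinations


def brute_force_sdh(points, bucket_width, num_buckets):
    """Count pairwise distances into fixed-width buckets (O(N^2) baseline).

    Hypothetical helper for illustration only: distances beyond the last
    bucket boundary are clamped into the last bucket.
    """
    hist = [0] * num_buckets
    for p, q in combinations(points, 2):
        d = math.dist(p, q)  # Euclidean distance between the two points
        idx = min(int(d // bucket_width), num_buckets - 1)
        hist[idx] += 1
    return hist


# Three particles give three pairs, with distances 1, 3, and sqrt(10).
particles = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
print(brute_force_sdh(particles, bucket_width=2.0, num_buckets=2))  # [1, 2]
```

The bucket counts always sum to N(N-1)/2, the number of particle pairs, which is why the exact computation becomes expensive as N grows.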
Known molecular simulation (MS) analysis systems follow a pull-based design, in which each executing query requests the data it needs. Such a design introduces heavy, redundant I/O traffic as well as CPU and data latency. To remedy these issues, we design and implement a push-based system built on a sequential scan-based I/O framework that pushes the loaded data to a number of pre-programmed queries.
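The push-based idea can be sketched as a single sequential pass over the data in which every loaded frame is handed to all registered queries, so no query triggers its own read. This is a minimal sketch under stated assumptions, not the system implemented in the dissertation: the names `push_scan`, `FrameCount`, and `MeanX`, the `consume`/`result` interface, and the representation of a frame as a list of (x, y, z) tuples are all hypothetical.

```python
class FrameCount:
    """Hypothetical query: counts the frames it has been pushed."""
    name = "frame_count"

    def __init__(self):
        self.n = 0

    def consume(self, frame):
        self.n += 1

    def result(self):
        return self.n


class MeanX:
    """Hypothetical query: mean x-coordinate over all particles seen."""
    name = "mean_x"

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def consume(self, frame):
        for x, _y, _z in frame:
            self.total += x
            self.count += 1

    def result(self):
        return self.total / self.count if self.count else 0.0


def push_scan(frames, queries):
    """Scan the frames once, pushing each loaded frame to every query."""
    for frame in frames:          # one sequential read per frame
        for query in queries:     # data is pushed; queries never pull
            query.consume(frame)
    return {query.name: query.result() for query in queries}


frames = [[(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [(4.0, 0.0, 0.0)]]
print(push_scan(iter(frames), [FrameCount(), MeanX()]))
```

Because `push_scan` accepts any iterable, the frames can come from a generator that reads a trajectory file sequentially, and each frame is loaded exactly once no matter how many queries are registered.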
The efficiency of the proposed system and of the approximate SDH algorithms is supported by the results of extensive experiments on MS-generated data.
Scholar Commons Citation
Grupchev, Vladimir, "Improvements on Scientific System Analysis" (2015). Graduate Theses and Dissertations.