MS in Computer Science (M.S.C.S.)
Degree Granting Department
Computer Science and Engineering
Yicheng Tu, Ph.D.
Srinivas Katkoori, Ph.D.
Sameer Varma, Ph.D.
big data, CUDA, data processing, memory hierarchy, parallel computation, streaming
Modern simulation systems generate large amounts of data, which must then be analyzed in a timely fashion. Traditional database management systems follow the principle of pulling the needed data, processing it, and then returning the results. This approach is then optimized by caching, by storing the data in different structures, or by sacrificing some precision in the results to gain speed. For queries that require analysis of the entire data set, this design has the following disadvantages: considerable overhead from the traditional random disk I/O framework when reading the simulation output files; low data throughput, which results in long latency; and, if indexes are used to optimize selections, the overhead of storing them also becomes too large. In addition, indexing delays write operations, and because most queries work with entire data sets, indexing loses its purpose.
Scholar Commons Citation
Akhmedov, Iliiazbek, "Parallelization of Push-based System for Molecular Simulation Data Analysis with GPU" (2016). Graduate Theses and Dissertations.