Duggan et al. created a reference implementation of the BigDAWG system: a new architecture for future Big Data applications, guided by the philosophy that “one size does not fit all”.

Such applications call not only for large-scale analytics, but also for real-time streaming support, smaller analytics at interactive speeds, data visualization, and cross-storage-system queries.

The importance and effectiveness of such a system have been demonstrated in a hospital application using data from an intensive care unit (ICU).

In my master’s thesis, I implemented and evaluated a cross-system Query Executor. I focused on cross-engine shuffle joins that take the skew of the data distribution into account.
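To give a rough feel for what a skew-aware cross-engine shuffle join can look like, here is a minimal Python sketch. It is an illustration under simplifying assumptions, not the thesis implementation: the two engines are plain in-memory lists, the engine names "A" and "B" and the helper names `assign_keys` and `shuffle_join` are hypothetical, and the assignment rule (join each key on the engine that already holds more rows for it) is just one simple heuristic for reducing data movement when some keys are much more frequent than others.

```python
from collections import Counter, defaultdict


def assign_keys(left_rows, right_rows, key=lambda row: row[0]):
    """Pick, for every join key present on both sides, the engine to join on.

    Assumption: left_rows are stored on engine "A" and right_rows on
    engine "B". A key is assigned to the engine holding more rows for it,
    so the smaller side is the one that has to be shipped.
    """
    left_counts = Counter(key(r) for r in left_rows)
    right_counts = Counter(key(r) for r in right_rows)
    assignment = {}
    for k in left_counts.keys() & right_counts.keys():
        assignment[k] = "A" if left_counts[k] >= right_counts[k] else "B"
    return assignment


def shuffle_join(left_rows, right_rows, key=lambda row: row[0]):
    """Join both inputs key by key and count how many rows had to move."""
    assignment = assign_keys(left_rows, right_rows, key)

    # Group the rows that take part in the join by their join key.
    left_by_key, right_by_key = defaultdict(list), defaultdict(list)
    for r in left_rows:
        if key(r) in assignment:
            left_by_key[key(r)].append(r)
    for r in right_rows:
        if key(r) in assignment:
            right_by_key[key(r)].append(r)

    results, moved = [], 0
    for k, engine in assignment.items():
        # Rows on the engine that was not chosen are shipped to the other one.
        moved += len(right_by_key[k]) if engine == "A" else len(left_by_key[k])
        for l in left_by_key[k]:
            for r in right_by_key[k]:
                results.append((engine, l, r))
    return results, moved


if __name__ == "__main__":
    left = [(1, "a"), (1, "b"), (1, "c"), (2, "d")]   # key 1 is skewed on A
    right = [(1, "x"), (2, "y"), (2, "z"), (3, "w")]  # key 2 is heavier on B
    rows, moved = shuffle_join(left, right)
    print(len(rows), "joined rows,", moved, "rows moved")
```

In this toy input, key 1 is joined on engine "A" and key 2 on engine "B", so only the lighter side of each key is moved. A plain hash partitioning that ignores the distribution could instead ship the heavy side of a skewed key.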

Download the paper