
Pattern-Driven Design of Visualizations for High-Dimensional Data

M. Blumenschein

2020
Dissertation

Data-informed decision-making processes play a fundamental role across disciplines. To support these processes, knowledge needs to be extracted from high-dimensional (HD) and complex datasets. Visualizations play a key role here in identifying and understanding patterns within the data. However, the choice of visual mapping heavily influences the effectiveness of the visualization. While one design choice is useful for a particular task, the very same design can make another analysis task more difficult, or even impossible. This doctoral thesis advances the quality- and pattern-driven optimization of visualizations in two core areas by addressing the research question: “How can we effectively design visualizations to highlight patterns – using automatic and user-driven approaches?” The first part of the thesis deals with the question “How can we automatically measure the quality of a particular design to optimize the layout?” We summarize the state of the art in quality-metrics research, describe the underlying concepts, optimization goals, and constraints, and discuss the requirements of the algorithms. While numerous quality metrics exist for all major HD visualizations, research lacks empirical studies that would guide the choice of a particular technique for a given analysis task. In particular, for parallel coordinates plots (PCP) and star glyphs, two frequently used techniques for high-dimensional data, no study exists that evaluates the impact of different axis orderings. Therefore, this thesis contributes an empirical study and a novel quality metric for both techniques. Based on our findings in the PCP study, we also contribute a formalization of how standard parallel coordinates distort the perception of patterns, in particular clusters. To minimize this effect, we propose an automatic rendering technique.
The second part of the thesis is user-centered and addresses the question “How can analysts support the design of visualizations to highlight particular patterns?” We contribute two techniques. The v-plot designer is a chart authoring tool for designing custom hybrid charts for the comparative analysis of data distributions. It automatically recommends basic charts (e.g., box plots, violin-type visualizations, and bar charts) and optimizes a custom hybrid chart called v-plot based on a set of analysis tasks. SMARTexplore uses a table metaphor and combines easy-to-apply interaction with pattern-driven layouts of rows and columns and an automatically computed reliability analysis based on statistical measures. In summary, this thesis contributes quality metrics and user-driven approaches to advance the quality- and pattern-driven optimization of high-dimensional data visualizations. The quality metrics and the grounding of the user-centered techniques are derived from empirical user studies, while the effectiveness of the implemented tools is shown by domain expert evaluations.
