DAVI Visual Analytics

Posted Dec 19, 2024

By Wei Xiong

Lecture Recap Notes

Lecture Context:
This lecture served as a recap of key concepts covered in the first half of the semester in data visualization. The professor focused on reviewing terms, frameworks, and models that structure the process of creating and evaluating visualizations, as well as clarifying core concepts such as color perception and interaction techniques. These notes compile all discussed points, including the professor’s examples and how various concepts fit into the overall visualization process.

Key Conceptual Frameworks

The Visualization Pipeline

Definition: A foundational model describing how raw data is transformed into a final image and how user interaction loops back to influence this process.

Stages:

Raw Data: The initial dataset.
Preprocessing/Analysis: Transforming raw data into a more usable form (e.g., normalizing, aggregating).
- Output: Prepared Data
Filtering: Selecting a subset or focus of the data.
- Output: Focus Data
Mapping: Converting focus data into geometric primitives (e.g., points, lines) and their visual attributes.
- Output: Geometric Data
Rendering: Generating the final visual representation (the image).
- Output: Image
Perception (User): The user interprets the image.
Interaction (User): The user may modify parameters at any pipeline stage, creating a feedback loop. For example:
- Adjusting preprocessing steps.
- Changing mapping parameters (e.g., zooming in/out, re-encoding variables).
- Result: A new image is generated, continuing the exploration.

Key Point: Interaction is not simply at the end; it influences the entire pipeline by looping back and modifying operations at any stage.
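
The stages above can be sketched as a chain of small functions. This is only a toy illustration, not something shown in the lecture: the function names, the text-based "image", and the sample data are all assumptions made for the example. Interaction is modeled by re-running the pipeline with a different filter range.

```python
def preprocess(raw):
    """Preprocessing/Analysis: drop missing values and normalize to [0, 1]."""
    values = [v for v in raw if v is not None]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant data
    return [(v - lo) / span for v in values]


def filter_focus(prepared, start, stop):
    """Filtering: keep only the index range the user is focusing on."""
    return list(enumerate(prepared))[start:stop]


def map_to_geometry(focus, width, height):
    """Mapping: turn (index, value) pairs into point marks with x/y positions."""
    if not focus:
        return []
    first, last = focus[0][0], focus[-1][0]
    x_span = (last - first) or 1
    return [
        {"x": round((i - first) / x_span * (width - 1)),
         "y": round((1 - v) * (height - 1))}  # large values are drawn near the top
        for i, v in focus
    ]


def render(points, width, height):
    """Rendering: rasterize the point marks into a text 'image'."""
    grid = [[" "] * width for _ in range(height)]
    for p in points:
        grid[p["y"]][p["x"]] = "o"
    return "\n".join("".join(row) for row in grid)


def run_pipeline(raw, start, stop, width=20, height=5):
    prepared = preprocess(raw)                        # -> prepared data
    focus = filter_focus(prepared, start, stop)       # -> focus data
    geometry = map_to_geometry(focus, width, height)  # -> geometric data
    return render(geometry, width, height)            # -> image


raw_data = [3, 7, None, 4, 9, 12, 8, 5, 1, 6]  # made-up sample values
overview = run_pipeline(raw_data, 0, 9)  # initial image
detail = run_pipeline(raw_data, 2, 6)    # interaction: the user narrows the filter
```

Printing `overview` and then `detail` shows the feedback loop in miniature: the user changes a parameter of an earlier stage, and every later stage runs again to produce a new image.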

Munzner’s Nested Model

Definition: A four-level framework for designing and evaluating visualizations.

Levels:

Domain Situation: Understanding the real-world problem context.
Data & Task Abstraction: Identifying the data types and extracting the tasks users need to perform (abstracting away from domain specifics).
Visual Encoding & Interaction Idiom: Deciding how to represent and interact with the data visually.
Example: Choosing a bar chart vs. a scatterplot, and deciding how users will zoom, pan, or filter.
Algorithm: Implementing the chosen visualization and interaction techniques efficiently.

Usage:

Before you have an implemented visualization, you must define the problem (top levels).
Use data/task abstraction to ensure your visualization is both expressive (faithful to the data) and effective (supports the user’s tasks).

Van Wijk’s Model of Visualization

Definition: A conceptual model focusing on how visualization leads to knowledge gain and motivates further exploration.

Flow:

Data (D) → processed through visualization pipeline (V) → produces Image (I) → user Perceives (P) the image.
Perception leads to Knowledge (K) gain (insights about the data and underlying phenomena).
Acquired knowledge prompts Exploration (E), which modifies the Specification (S) of the visualization (e.g., changing parameters, choosing different views), looping the process.

Key Idea:

Visualization’s value is measured by how it increases knowledge (ΔK) in a given time.
Encourages iterative refinement: each new image and insight can prompt a new round of exploration and specification changes.
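
For reference, the flow above is usually written as a small set of equations. The block below is a sketch using the symbols from this section (D, S, V, I, P, K, E); the exact notation on the lecture slides may differ.

```latex
\begin{aligned}
I(t) &= V(D, S, t) \\
\frac{dK}{dt} &= P(I, K) \\
\frac{dS}{dt} &= E(K) \\
\Delta K &= K(t_1) - K(t_0) = \int_{t_0}^{t_1} P(I, K)\, dt
\end{aligned}
```

Read from top to bottom: the image depends on the data and the current specification, knowledge grows as the image is perceived, the specification changes as knowledge prompts exploration, and the knowledge gained over a time interval is the accumulated effect of perception.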

Task Abstraction (Andrienko & Andrienko)

Definition: A structured approach to define tasks by relating data context (where, when) and data content (values).

Examples of Tasks:

Lookup: Given a position in data context (e.g., a specific date), find the corresponding value (e.g., stock price on January 1, 2020).
Inverse Lookup: Given a value, find the context instances that match it (e.g., when was the stock price = $200?).
Comparison: Given multiple items in context, find differences or relations between their values (e.g., comparing stock prices on Jan 1, 2020 vs. Jan 1, 2021).
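
The three task types can be sketched as functions over a mapping from context (dates) to content (prices). The prices and function names below are made up for illustration.

```python
from datetime import date

# Hypothetical closing prices: context (date) -> content (price in dollars).
prices = {
    date(2020, 1, 1): 150.0,
    date(2020, 7, 1): 200.0,
    date(2021, 1, 1): 180.0,
    date(2021, 7, 1): 200.0,
}


def lookup(data, when):
    """Lookup: context is given, find the value."""
    return data[when]


def inverse_lookup(data, value):
    """Inverse lookup: value is given, find every matching context."""
    return sorted(d for d, v in data.items() if v == value)


def compare(data, first, second):
    """Comparison: relate the values at two contexts (here: their difference)."""
    return data[second] - data[first]
```

For example, `lookup(prices, date(2020, 1, 1))` answers "what was the price on January 1, 2020?", while `inverse_lookup(prices, 200.0)` answers "when was the price $200?".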

Importance:

Ties data abstraction directly to task abstraction, helping choose suitable encodings and interactions.

Color Concepts: Luminance vs. Brightness

Brightness: A technical parameter in certain color models (e.g., the V in HSV, or lightness in HSL) indicating how dark or light a color is. It is computed from the RGB components without regard to how the eye responds to different hues.

Luminance: The perceived brightness. Human eyes perceive some hues (e.g., yellow) as brighter than others (e.g., blue) even when they have the same technical brightness value. Accounting for luminance is crucial for creating perceptually uniform color scales.

Key Takeaway:

Use color spaces and scales that account for human perception (e.g., HCL) for more effective visualization.
This helps ensure that equal steps in the data appear as roughly equal perceptual steps.
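
The yellow vs. blue example can be checked numerically. The sketch below compares the HSV "value" with the sRGB relative luminance (Rec. 709 weights). Relative luminance is used here only as a rough stand-in for perceived brightness; it is an assumption of this example, not a formula given in the lecture.

```python
import colorsys


def hsv_value(rgb):
    """Technical brightness: the V component of HSV (the max of R, G, B)."""
    r, g, b = rgb
    return colorsys.rgb_to_hsv(r, g, b)[2]


def relative_luminance(rgb):
    """Relative luminance of an sRGB color with components in [0, 1]."""
    def linearize(c):
        # Undo the sRGB gamma encoding before weighting the channels.
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in rgb)
    # Green contributes most and blue least to perceived brightness.
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


yellow = (1.0, 1.0, 0.0)
blue = (0.0, 0.0, 1.0)

for name, rgb in (("yellow", yellow), ("blue", blue)):
    print(name, hsv_value(rgb), round(relative_luminance(rgb), 4))
```

Both colors have the same HSV value, yet their relative luminance differs by more than a factor of ten, which is why a scale built only on HSV brightness will not look uniform.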

Marks & Channels

Marks: Basic geometric elements that represent data items (e.g., points, lines, areas).

Channels: Visual variables you can adjust for each mark (e.g., position, size, color, shape). Channels encode data attributes onto marks.

Example:

A scatterplot uses position (x, y) to represent values of two variables. The mark is a point, and position channels encode data attributes.
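
A minimal sketch of the scatterplot example: each data row becomes one point mark, and linear scales assign its position channels. The `PointMark` class, the `linear_scale` helper, the attribute names, and the pixel ranges are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PointMark:
    """A point mark with its channels."""
    x: float                  # position channel (horizontal)
    y: float                  # position channel (vertical)
    size: float = 4.0         # size channel, unused here (constant)
    color: str = "steelblue"  # color channel, unused here (constant)


def linear_scale(domain, range_):
    """Return a function mapping data values in `domain` to `range_`."""
    d0, d1 = domain
    r0, r1 = range_

    def scale(value):
        return r0 + (value - d0) / (d1 - d0) * (r1 - r0)

    return scale


rows = [
    {"height_cm": 150, "weight_kg": 50},
    {"height_cm": 175, "weight_kg": 75},
    {"height_cm": 200, "weight_kg": 100},
]

x_scale = linear_scale((150, 200), (0, 400))  # data -> pixels, left to right
y_scale = linear_scale((50, 100), (300, 0))   # inverted: pixel y grows downward

marks = [
    PointMark(x=x_scale(r["height_cm"]), y=y_scale(r["weight_kg"]))
    for r in rows
]
```

Encoding a third attribute would mean feeding it through another scale into `size` or `color` instead of leaving those channels constant.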

Expressiveness & Effectiveness (Analyzing Charts)

Expressiveness:

Does the visualization show only what is present in the data?
Avoid showing structures or patterns not supported by the data.
Example of Non-Expressiveness: Using hue (categorical) for ordinal data, making it unclear that one data value is “larger” or “smaller.”

Effectiveness:

How easily can users perceive and interpret what’s shown?
Leverages perceptual principles and best practices.
Example: For comparison tasks, sorting bars by magnitude is more effective than alphabetical order.

Efficiency (optional third E):

Concerned with how quickly and effortlessly the user can gain insights.
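
Both checks can be sketched in a few lines. The table of which channels suit which data types is a deliberately simplified assumption (identity channels for nominal data, magnitude channels for ordered data), not a complete ranking.

```python
# Simplified, assumed suitability table: channel -> data types it can express.
SUITABLE_TYPES = {
    "position": {"nominal", "ordinal", "quantitative"},
    "length": {"ordinal", "quantitative"},
    "luminance": {"ordinal", "quantitative"},
    "hue": {"nominal"},
    "shape": {"nominal"},
}


def is_expressive(channel, data_type):
    """Expressiveness check: does the channel match the data type's properties?"""
    return data_type in SUITABLE_TYPES[channel]


def sort_bars(values):
    """Effectiveness aid: order bars by magnitude, largest first."""
    return sorted(values.items(), key=lambda item: item[1], reverse=True)


print(is_expressive("hue", "ordinal"))        # hue carries no inherent order
print(is_expressive("luminance", "ordinal"))  # light-to-dark implies an order
print(sort_bars({"B": 3, "A": 7, "C": 5}))
```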

Basic Interaction: Zoom & Pan

Zooming:

Adjusting the scale to see more or fewer data points at once (e.g., zooming in to see details, zooming out for the big picture).

Panning:

Shifting the visible window over the data space to examine different areas.

Smooth Navigation:

A combined approach to quickly move across scales and positions, supporting exploration tasks more fluidly.

Where It Fits:

Interaction occurs at multiple pipeline stages, often changing the mapping or rendering steps.
In design terms, zooming and panning are part of the “interaction idiom” choices.
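
Zooming and panning can be sketched as operations on a one-dimensional viewport over the data space. The `Viewport` class and its method names are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Viewport:
    """The visible window [start, end] over a 1D data space."""
    start: float
    end: float

    @property
    def width(self):
        return self.end - self.start

    def zoom(self, factor, center=None):
        """factor > 1 zooms in (narrower window); factor < 1 zooms out."""
        if center is None:
            center = (self.start + self.end) / 2
        new_width = self.width / factor
        # Keep `center` at the same relative position inside the window.
        ratio = (center - self.start) / self.width
        new_start = center - ratio * new_width
        return Viewport(new_start, new_start + new_width)

    def pan(self, delta):
        """Shift the window without changing its scale."""
        return Viewport(self.start + delta, self.end + delta)

    def visible(self, points):
        """The subset of data points that fall inside the window."""
        return [p for p in points if self.start <= p <= self.end]


data = list(range(0, 101, 10))
view = Viewport(0, 100)
zoomed = view.zoom(2)    # window is now [25, 75]
panned = zoomed.pan(10)  # window is now [35, 85]
print(zoomed.visible(data), panned.visible(data))
```

Smooth navigation would interpolate between such viewports over time rather than jumping from one to the next.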

Data Types & Charts

Data Types:

Nominal (categorical), Ordinal, Quantitative.
The chosen mapping (marks, channels) and color scales should respect data type properties.

Chart Knowledge:

Knowing basic chart types (bar charts, histograms, scatterplots, etc.) helps identify potential pitfalls.
Example: A histogram should have contiguous bars for continuous data; if spaced, it may mislead perception.
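
The contiguity of histogram bars follows directly from how bins are built: each bin ends exactly where the next begins. A minimal sketch with equal-width bins, assuming the values are not all identical (the function name and sample values are made up):

```python
def histogram(values, bin_count):
    """Count values in `bin_count` equal-width, contiguous bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bin_count  # assumes hi > lo
    # bin_count + 1 shared edges: the right edge of bin i is the left edge of bin i + 1.
    edges = [lo + i * width for i in range(bin_count + 1)]
    counts = [0] * bin_count
    for v in values:
        # The maximum value belongs to the last bin rather than a new one.
        index = min(int((v - lo) / width), bin_count - 1)
        counts[index] += 1
    return edges, counts


edges, counts = histogram([1, 2, 2, 3, 5, 8, 9, 10], bin_count=3)
print(edges, counts)
```

Because adjacent bins share an edge, drawing gaps between the bars would suggest ranges of values that the binning does not actually contain.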

Bivariate Color Scales

Definition:

Color scales encoding two variables simultaneously, often arranged in a 2D color matrix (e.g., a gradient from blue to red on one axis and from light to dark on another).
Useful for showing relationships between two attributes at once (e.g., mapping one variable to hue and another to brightness).
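
A small sketch of such a matrix: one variable steps the hue from blue toward red (passing through purple), the other steps the lightness from light to dark. The hue range, lightness range, and function name are arbitrary choices for illustration, and HSL is used only because it ships with Python; it is not perceptually uniform.

```python
import colorsys


def bivariate_palette(steps=3, hue_low=240 / 360, hue_high=1.0):
    """Return a steps x steps grid of hex colors.

    Columns vary hue (variable B); rows vary lightness (variable A).
    """
    palette = []
    for row in range(steps):
        lightness = 0.8 - 0.5 * row / (steps - 1)  # light at the top, dark at the bottom
        cells = []
        for col in range(steps):
            t = col / (steps - 1)
            hue = hue_low + t * (hue_high - hue_low)  # blue -> purple -> red
            r, g, b = colorsys.hls_to_rgb(hue, lightness, 1.0)
            cells.append("#{:02x}{:02x}{:02x}".format(
                round(r * 255), round(g * 255), round(b * 255)))
        palette.append(cells)
    return palette


for row in bivariate_palette():
    print(row)
```

Each data item would then be binned on both variables and colored with the cell at its (row, column) position.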

Gestalt Principles in Visualization

Gestalt Principles (e.g., proximity, similarity, enclosure):

Help convey groupings or structure among data points.
Example: If points share the same shape or enclosure, viewers perceive them as belonging together.

Purpose:

Move beyond individual data points to communicate relationships and clusters.

Design Activity Framework (Munzner’s Design Process)

Steps:

Understand:
- From domain situation → define data & task abstraction.
- Identify user goals, data types, and tasks they need to perform.
Ideate:
- Generate multiple design ideas (visual encodings and interaction techniques).
- Sketch a variety of approaches.
Make (Prototype):
- Build Hi-Fi prototypes to test design ideas.
- Show prototypes to users to get feedback.
Deploy:
- Implement the final, fully-featured visualization.
- Includes performance considerations, robust interaction handling, etc.

Relation to Munzner’s Nested Model:

“Understand” maps to domain & data/task abstraction stages.
“Ideate” corresponds to choosing visual encodings & interaction idioms.
“Make & Deploy” relate to implementing and refining algorithms and interfaces.

Summary of Integration

Visualization Pipeline: Defines the technical transformation from data to image and user feedback loop.
Munzner’s Nested Model: Guides how to design and evaluate at levels of domain, data/tasks, encoding/interaction, and algorithm.
Van Wijk’s Model: Highlights knowledge gain from iterative visualization usage.
Data & Task Abstraction (Andrienko & Andrienko): Provides a structured way to define user tasks relative to data context and content.
Expressiveness & Effectiveness: Ensures that chosen encodings are faithful to the data and supportive of user tasks.
Marks & Channels, Luminance vs. Brightness, Bivariate Color Scales, Gestalt Principles: Provide concrete guidelines for effective visual encoding.
Zoom & Pan, Interaction Techniques: Address how users explore and refine their view of the data, fitting into the pipeline’s feedback loop and supporting iterative analysis.

Key Takeaway:
All these concepts interrelate. To build a good visualization:

Start by understanding user needs and data.
Abstract those needs to tasks and data structures.
Choose appropriate encodings and interactions that are both expressive of the data and effective for the user.
Prototype early, refine, and ultimately deploy a solution that enables iterative exploration and knowledge gain.

Expressiveness & Effectiveness

Expressiveness: Are we showing exactly what’s in the data, nothing more/nothing less?
Effectiveness: Is the visualization easy to read, and does it support the user’s tasks?

   Data ----> Visualization ----> User Tasks
    |            Expressive?          |
    |                                  |
    |--------------Effective?----------|

Marks & Channels

Marks: Basic geometric elements (points, lines, areas)

     o (point)
     — (line)
     █ (area)

Channels: Properties we adjust to encode data
- Position (x,y)
- Size
- Color (hue, saturation, luminance)
- Shape
- Orientation

For example, a scatterplot:
Mark = Point
Channels = x-position, y-position, color, size

Luminance vs. Brightness

Brightness: Technical parameter (e.g., in HSV color model)
Luminance: Perceived brightness by the human eye

Even if two colors have the same "brightness" value:
   Yellow appears lighter than Blue.

When creating color scales:
Use luminance-aware color spaces (HCL) for perceptual uniformity.

Zoom & Pan (Scale-Space Diagram)

Scale Space:

Full Data: [-------------------------]  (All data at once)
Zoom in:          [-----]             (Focus on subset)
                  ↑ Pan along this axis to see different parts

Zoom: Change scale to see details or overview.

Pan: Move the “window” across the dataset at current scale.

Bivariate Color Scales

             Attribute B
          Low    Mid    High
        ┌──────┬──────┬──────┐
A  Low  │ clr1 │ clr2 │ clr3 │
        ├──────┼──────┼──────┤
   Mid  │ clr4 │ clr5 │ clr6 │
        ├──────┼──────┼──────┤
   High │ clr7 │ clr8 │ clr9 │
        └──────┴──────┴──────┘

Different hues or brightness steps on two axes
representing two different variables simultaneously.

Gestalt Principles (Grouping)

Gestalt principles help show groupings:
- Proximity: Elements close together are seen as a group.
- Similarity: Similar shape/color = same group.
- Enclosure: A boundary drawn around elements = group.

 o o o      [o o o]
 o o o  ->  [o o o]
 o o o      [o o o]

Enclosure groups these items visually.

Design Activity Framework

   ┌────────────┐      ┌────────────┐      ┌────────────┐      ┌────────────┐
   │ Understand │ ---> │   Ideate   │ ---> │    Make    │ ---> │   Deploy   │
   └─────┬──────┘      └─────┬──────┘      └─────┬──────┘      └─────┬──────┘
         │                   │                   │                   │
         ▼                   ▼                   ▼                   ▼
  Domain Situation       Data/Task          Prototypes         Final System
                        Abstraction      (Hi-fi versions)
                     Visual Encoding/
                     Interaction Idiom

This post is licensed under CC BY 4.0 by the author.
