DAVI Visual Analytics
Lecture Recap Notes
Lecture Context:
This lecture served as a recap of key concepts covered in the first half of the semester in data visualization. The professor focused on reviewing terms, frameworks, and models that structure the process of creating and evaluating visualizations, as well as clarifying core concepts such as color perception and interaction techniques. These notes compile all discussed points, including the professor's examples and how various concepts fit into the overall visualization process.
Key Conceptual Frameworks
The Visualization Pipeline
Definition: A foundational model describing how raw data is transformed into a final image and how user interaction loops back to influence this process.
Stages:
- Raw Data: The initial dataset.
- Preprocessing/Analysis: Transforming raw data into a more usable form (e.g., normalizing, aggregating).
- Output: Prepared Data
- Filtering: Selecting a subset or focus of the data.
- Output: Focus Data
- Mapping: Converting focus data into geometric primitives (e.g., points, lines) and their visual attributes.
- Output: Geometric Data
- Rendering: Generating the final visual representation (the image).
- Output: Image
- Perception (User): The user interprets the image.
- Interaction (User): The user may modify parameters at any pipeline stage, creating a feedback loop. For example:
- Adjusting preprocessing steps.
- Changing mapping parameters (e.g., zooming in/out, re-encoding variables).
- Result: A new image is generated, continuing the exploration.
Key Point: Interaction is not simply at the end; it influences the entire pipeline by looping back and modifying operations at any stage.
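The stages above can be sketched as a chain of small functions. This is a minimal illustration, not a reference implementation: the function names, the threshold filter, and the text-based "renderer" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Point:
    """A geometric primitive produced by the mapping stage."""
    x: float
    y: float


def preprocess(raw):
    """Raw data -> prepared data: drop missing values, normalize to [0, 1]."""
    values = [v for v in raw if v is not None]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant data
    return [(v - lo) / span for v in values]


def filter_focus(prepared, threshold):
    """Prepared data -> focus data: keep values at or above the threshold."""
    return [v for v in prepared if v >= threshold]


def map_to_geometry(focus, width):
    """Focus data -> geometric data: item index becomes x, value becomes y."""
    return [Point(x=i * width, y=v) for i, v in enumerate(focus)]


def render(points):
    """Geometric data -> image: a text stand-in for a real renderer."""
    return " ".join(f"({p.x:g},{p.y:.2f})" for p in points)


def run_pipeline(raw, threshold=0.0, width=1.0):
    """Run all four stages; the keyword arguments are the user-adjustable parameters."""
    prepared = preprocess(raw)
    focus = filter_focus(prepared, threshold)
    geometry = map_to_geometry(focus, width)
    return render(geometry)


raw = [10, None, 20, 40, 30]  # hypothetical raw data with one missing value
overview = run_pipeline(raw)
# Interaction: the user raises the filter threshold and the pipeline reruns.
detail = run_pipeline(raw, threshold=0.5)
```

Changing `threshold` feeds back into the filtering stage and changing `width` feeds back into the mapping stage, which mirrors the point that interaction can re-enter the pipeline anywhere rather than only at the end.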
Munzner's Nested Model
Definition: A four-level framework for designing and evaluating visualizations.
Levels:
- Domain Situation: Understanding the real-world problem context.
- Data & Task Abstraction: Identifying the data types and extracting the tasks users need to perform (abstracting away from domain specifics).
- Visual Encoding & Interaction Idiom: Deciding how to represent and interact with the data visually.
Example: Choosing a bar chart vs. a scatterplot, and deciding how users will zoom, pan, or filter.
- Algorithm: Implementing the chosen visualization and interaction techniques efficiently.
Usage:
- Before you have an implemented visualization, you must define the problem (top levels).
- Use data/task abstraction to ensure your visualization is both expressive (faithful to the data) and effective (supports the user's tasks).
Van Wijk's Model of Visualization
Definition: A conceptual model focusing on how visualization leads to knowledge gain and motivates further exploration.
Flow:
- Data (D) → processed through visualization pipeline (V) → produces Image (I) → user Perceives (P) the image.
- Perception leads to Knowledge (K) gain (insights about the data and underlying phenomena).
- Acquired knowledge prompts Exploration (E), which modifies the Specification (S) of the visualization (e.g., changing parameters, choosing different views), looping the process.
Key Idea:
- Visualization's value is measured by how it increases knowledge (ΔK) in a given time.
- Encourages iterative refinement: each new image and insight can prompt a new round of exploration and specification changes.
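The loop above is often summarized as a few coupled equations. The sketch below reuses the symbols D, V, I, P, K, E, and S from the flow; the time variable t, the initial knowledge K_0, and the integration variable \tau are notational additions that do not appear in these notes.

```latex
\begin{align}
I(t) &= V(D, S, t) \\
\frac{dK}{dt} &= P(I, K) \\
K(t) &= K_0 + \int_0^t P\bigl(I(\tau), K(\tau)\bigr)\, d\tau \\
\frac{dS}{dt} &= E(K)
\end{align}
```

Read this way, the knowledge gain over a session of length t is \Delta K = K(t) - K_0, and each change to S produces a new image I that feeds back into K.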
Task Abstraction (Andrienko & Andrienko)
Definition: A structured approach to defining tasks by relating data context (where, when) to data content (values).
Examples of Tasks:
- Lookup: Given a position in data context (e.g., a specific date), find the corresponding value (e.g., stock price on January 1, 2020).
- Inverse Lookup: Given a value, find the context instances that match it (e.g., when was the stock price = $200?).
- Comparison: Given multiple items in context, find differences or relations between their values (e.g., comparing stock prices on Jan 1, 2020 vs. Jan 1, 2021).
Importance:
- Ties data abstraction directly to task abstraction, helping choose suitable encodings and interactions.
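The three task types can be sketched as operations on a mapping from context (dates) to content (prices). The price values and function names below are hypothetical and only serve the illustration.

```python
# Hypothetical stock prices: context (date) -> content (price)
prices = {
    "2020-01-01": 200.0,
    "2020-07-01": 180.0,
    "2021-01-01": 250.0,
}


def lookup(data, context):
    """Lookup: given a context position, return its value."""
    return data[context]


def inverse_lookup(data, value):
    """Inverse lookup: given a value, return every context that matches it."""
    return [ctx for ctx, v in data.items() if v == value]


def compare(data, ctx_a, ctx_b):
    """Comparison: difference between the values at two context positions."""
    return data[ctx_b] - data[ctx_a]


price_on_date = lookup(prices, "2020-01-01")
dates_at_200 = inverse_lookup(prices, 200.0)
year_change = compare(prices, "2020-01-01", "2021-01-01")
```

Lookup goes from context to content, inverse lookup goes from content back to context, and comparison relates the content at several context positions.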
Color Concepts: Luminance vs. Brightness
Brightness: A technical attribute in certain color models (the V in HSV, or the closely related L in HSL) indicating how dark or light a color is.
Luminance: The perceived brightness. Human eyes perceive some hues (e.g., yellow) as brighter than others (e.g., blue) even if they have the same technical brightness values. Luminance is crucial for creating perceptually uniform color scales.
Key Takeaway:
- Use color spaces and scales that consider human perception (e.g., HCL) for more effective visualization.
- Doing so ensures that equal increments in the data produce proportional perceptual differences.
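The yellow vs. blue example can be checked numerically. The sketch below compares the HSV value of two colors with their sRGB relative luminance (Rec. 709 coefficients). Relative luminance is used here only as a rough stand-in for perceived lightness; it is an assumption of this example, not the lightness measure that HCL itself uses.

```python
import colorsys


def srgb_to_linear(c):
    """Undo the sRGB transfer curve for one channel in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(r, g, b):
    """Weighted sum of linear RGB; green contributes most, blue least."""
    rl, gl, bl = (srgb_to_linear(c) for c in (r, g, b))
    return 0.2126 * rl + 0.7152 * gl + 0.0722 * bl


yellow = (1.0, 1.0, 0.0)
blue = (0.0, 0.0, 1.0)

# Same technical "brightness" (HSV value) for both colors ...
v_yellow = colorsys.rgb_to_hsv(*yellow)[2]
v_blue = colorsys.rgb_to_hsv(*blue)[2]

# ... but very different luminance.
lum_yellow = relative_luminance(*yellow)
lum_blue = relative_luminance(*blue)
```

Both colors have an HSV value of 1.0, yet yellow's relative luminance is roughly 0.93 and blue's roughly 0.07, which is why a scale built on HSV value alone does not look evenly spaced.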
Marks & Channels
Marks: Basic geometric elements that represent data items (e.g., points, lines, areas).
Channels: Visual variables you can adjust for each mark (e.g., position, size, color, shape). Channels encode data attributes onto marks.
Example:
- A scatterplot uses position (x, y) to represent values of two variables. The mark is a point, and position channels encode data attributes.
Expressiveness & Effectiveness (Analyzing Charts)
Expressiveness:
- Does the visualization show only what is present in the data?
- Avoid showing structures or patterns not supported by the data.
- Example of Non-Expressiveness: Using hue (categorical) for ordinal data, making it unclear that one data value is "larger" or "smaller."
Effectiveness:
- How easily can users perceive and interpret what's shown?
- Leverages perceptual principles and best practices.
- Example: For comparison tasks, sorting bars by magnitude is more effective than alphabetical order.
Efficiency (optional third E):
- Concerned with how quickly and effortlessly the user can gain insights.
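The bar-ordering example can be made concrete with a small sketch. The category names and counts are hypothetical, and the text bars stand in for a real chart.

```python
# Hypothetical category totals
sales = {"Cherry": 12, "Apple": 30, "Banana": 7}


def bar_order(data, by="value"):
    """Return category names sorted by magnitude (descending) or by label."""
    if by == "value":
        return sorted(data, key=data.get, reverse=True)
    return sorted(data)


def text_bars(data, order):
    """Draw one text bar per category in the given order."""
    return [f"{name:<7}{'#' * data[name]}" for name in order]


by_magnitude = bar_order(sales)
alphabetical = bar_order(sales, by="label")
chart = text_bars(sales, by_magnitude)
```

Both orderings are equally expressive because they show the same data. The magnitude ordering is the more effective one for comparison because rank can be read directly from position.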
Basic Interaction: Zoom & Pan
Zooming:
- Adjusting the scale to see more or fewer data points at once (e.g., zooming in to see details, zooming out for the big picture).
Panning:
- Shifting the visible window over the data space to examine different areas.
Smooth Navigation:
- A combined approach to quickly move across scales and positions, supporting exploration tasks more fluidly.
Where It Fits:
- Interaction occurs at multiple pipeline stages, often changing the mapping or rendering steps.
- In design terms, zooming and panning are part of the "interaction idiom" choices.
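Zoom and pan can be sketched as two operations on a one-dimensional visible window. The `Viewport` class and its method names are assumptions made for this illustration; real toolkits typically expose the same idea through scale and translate parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Viewport:
    """The visible interval [start, end] of the data space."""
    start: float
    end: float

    @property
    def width(self):
        return self.end - self.start

    def zoom(self, factor):
        """Zoom about the window midpoint; factor > 1 zooms in, < 1 zooms out."""
        center = (self.start + self.end) / 2
        half = self.width / factor / 2
        return Viewport(center - half, center + half)

    def pan(self, delta):
        """Shift the window by delta without changing its scale."""
        return Viewport(self.start + delta, self.end + delta)

    def visible(self, xs):
        """Return the data positions that fall inside the window."""
        return [x for x in xs if self.start <= x <= self.end]


data = list(range(0, 101, 10))  # hypothetical data positions 0, 10, ..., 100
full = Viewport(0, 100)
zoomed = full.zoom(4)    # narrower window, fewer points, more detail
panned = zoomed.pan(20)  # same scale, different region
```

Zooming changes the window width while panning keeps it fixed, so a smooth navigation technique would interpolate both the midpoint and the width at the same time.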
Data Types & Charts
Data Types:
- Nominal (categorical), Ordinal, Quantitative.
- The chosen mapping (marks, channels) and color scales should respect data type properties.
Chart Knowledge:
- Knowing basic chart types (bar charts, histograms, scatterplots, etc.) helps identify potential pitfalls.
- Example: A histogram should have contiguous bars for continuous data; if spaced, it may mislead perception.
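The contiguity of histogram bars follows from how bins are built: each bin's right edge is the next bin's left edge. A minimal equal-width binning sketch, assuming the data contains at least two distinct values (the function name and sample values are hypothetical):

```python
def histogram(values, n_bins):
    """Equal-width binning; returns (edges, counts) with len(edges) == n_bins + 1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins  # assumes hi > lo
    counts = [0] * n_bins
    for v in values:
        # The maximum value would land one past the end, so clamp it into the last bin.
        i = min(int((v - lo) / width), n_bins - 1)
        counts[i] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


values = [1.0, 1.5, 2.0, 2.5, 4.0, 5.0]  # hypothetical continuous measurements
edges, counts = histogram(values, 4)
```

Because adjacent bins share an edge, the bars should touch. The only gap a reader should see is a genuinely empty bin, such as the interval from 3 to 4 in this sample.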
Bivariate Color Scales
Definition:
- Color scales encoding two variables simultaneously, often arranged in a 2D color matrix (e.g., a gradient from blue to red on one axis and from light to dark on another).
- Useful for showing relationships between two attributes at once (e.g., mapping one variable to hue and another to brightness).
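A bivariate scale can be sketched as a lookup into a small color matrix after each variable is classified into low, mid, or high. The palette entries below are placeholders (`clr1` to `clr9`), and the break values are hypothetical.

```python
def classify(value, breaks):
    """Return the class index: the number of breaks the value meets or exceeds."""
    return sum(value >= b for b in breaks)


# Placeholder 3x3 palette: rows follow attribute A, columns follow attribute B.
PALETTE = [
    ["clr1", "clr2", "clr3"],  # A low
    ["clr4", "clr5", "clr6"],  # A mid
    ["clr7", "clr8", "clr9"],  # A high
]


def bivariate_color(a, b, a_breaks, b_breaks):
    """Pick the palette cell for a pair of values."""
    return PALETTE[classify(a, a_breaks)][classify(b, b_breaks)]


breaks = (33, 66)  # hypothetical class boundaries for both attributes
low_a_high_b = bivariate_color(5, 95, breaks, breaks)
mid_mid = bivariate_color(50, 50, breaks, breaks)
high_a_low_b = bivariate_color(90, 10, breaks, breaks)
```

In a real palette, one axis would typically vary hue and the other lightness so that both variables stay readable in every cell.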
Gestalt Principles in Visualization
Gestalt Principles (e.g., proximity, similarity, enclosure):
- Help convey groupings or structure among data points.
- Example: If points share the same shape or enclosure, viewers perceive them as belonging together.
Purpose:
- Move beyond individual data points to communicate relationships and clusters.
Design Activity Framework (Munzner's Design Process)
Steps:
- Understand:
- From domain situation → define data & task abstraction.
- Identify user goals, data types, and tasks they need to perform.
- Ideate:
- Generate multiple design ideas (visual encodings and interaction techniques).
- Sketch a variety of approaches.
- Make (Prototype):
- Build Hi-Fi prototypes to test design ideas.
- Show prototypes to users to get feedback.
- Deploy:
- Implement the final, fully-featured visualization.
- Includes performance considerations, robust interaction handling, etc.
Relation to Munzner's Nested Model:
- "Understand" maps to domain & data/task abstraction stages.
- "Ideate" corresponds to choosing visual encodings & interaction idioms.
- "Make & Deploy" relate to implementing and refining algorithms and interfaces.
Summary of Integration
- Visualization Pipeline: Defines the technical transformation from data to image and the user feedback loop.
- Munzner's Nested Model: Guides how to design and evaluate at levels of domain, data/tasks, encoding/interaction, and algorithm.
- Van Wijk's Model: Highlights knowledge gain from iterative visualization usage.
- Data & Task Abstraction (Andrienko & Andrienko): Provides a structured way to define user tasks relative to data context and content.
- Expressiveness & Effectiveness: Ensures that chosen encodings are faithful to the data and supportive of user tasks.
- Marks & Channels, Luminance vs. Brightness, Bivariate Color Scales, Gestalt Principles: Provide concrete guidelines for effective visual encoding.
- Zoom & Pan, Interaction Techniques: Address how users explore and refine their view of the data, fitting into the pipeline's feedback loop and supporting iterative analysis.
Key Takeaway:
All these concepts interrelate. To build a good visualization:
- Start by understanding user needs and data.
- Abstract those needs to tasks and data structures.
- Choose appropriate encodings and interactions that are both expressive of the data and effective for the user.
- Prototype early, refine, and ultimately deploy a solution that enables iterative exploration and knowledge gain.
Expressiveness & Effectiveness
- Expressiveness: Are we showing exactly what's in the data, nothing more and nothing less?
- Effectiveness: Is the visualization easy to read, and does it support the user's tasks?
Data ----> Visualization ----> User Tasks
| Expressive? |
| |
|--------------Effective?----------|
Marks & Channels
Marks: Basic geometric elements (points, lines, areas)
o (point)
─ (line)
■ (area)
Channels: Properties we adjust to encode data
- Position (x,y)
- Size
- Color (hue, saturation, luminance)
- Shape
- Orientation
For example, a scatterplot:
Mark = Point
Channels = x-position, y-position, color, size
Luminance vs. Brightness
Brightness: Technical parameter (e.g., in HSV color model)
Luminance: Perceived brightness by the human eye
Even if two colors have same "brightness" value:
Yellow appears lighter than Blue.
When creating color scales:
Use luminance-aware color spaces (HCL) for perceptual uniformity.
Zoom & Pan (Scale-Space Diagram)
Scale Space:
Full Data: [-------------------------]  (All data at once)
Zoom in:            [-----]             (Focus on subset)
                    ↔ Pan along this axis to see different parts
Zoom: Change scale to see details or overview.
Pan: Move the "window" across the dataset at current scale.
Bivariate Color Scales
                Attribute B
             Low    Mid    High
           ┌──────┬──────┬──────┐
      Low  │ clr1 │ clr2 │ clr3 │
           ├──────┼──────┼──────┤
 A    Mid  │ clr4 │ clr5 │ clr6 │
           ├──────┼──────┼──────┤
      High │ clr7 │ clr8 │ clr9 │
           └──────┴──────┴──────┘
Different hues or brightness steps on two axes
representing two different variables simultaneously.
Gestalt Principles (Grouping)
Gestalt principles help show groupings:
- Proximity: Elements close together are seen as a group.
- Similarity: Similar shape/color = same group.
- Enclosure: A boundary drawn around elements = group.
o o o          o o o
o o o    ->   [o o o]
o o o         [o o o]
Enclosure groups these items visually.
Design Activity Framework
┌────────────┐      ┌────────────┐      ┌────────────┐      ┌────────────┐
│ Understand │ ---> │   Ideate   │ ---> │    Make    │ ---> │   Deploy   │
└─────┬──────┘      └─────┬──────┘      └─────┬──────┘      └─────┬──────┘
      │                   │                   │                   │
      ▼                   ▼                   ▼                   ▼
Domain Situation      Data/Task          Prototypes         Final System
                     Abstraction      (Hi-fi versions)
                  Visual Encoding/
                  Interaction Idiom