HCI Track-A Data Visualization Week46
Data Visualization Lecture Notes
Introduction
As we approach the end of the semester, we delve into one of the most exciting topics in Human-Computer Interaction (HCI): Data Visualization. This lecture is particularly special as it aligns with the professor’s research area and offers insights into why visualization is a powerful tool for understanding complex data.
Why Data Visualization?
- Amplifies Cognition: Visualization helps us process and understand large volumes of data by leveraging our advanced visual perception capabilities.
- External Cognition: By representing data visually, we offload cognitive processes onto external artifacts, making complex analysis more manageable.
- Insight over Numbers: As Richard Hamming put it, “The purpose of computing is insight, not numbers.” Visualization turns raw data into meaningful information.
Historical Examples
John Snow’s Cholera Map (1854)
- Context: In Soho, London, a cholera epidemic broke out in a poor neighborhood known for crime and unsanitary conditions.
- Challenge: At the time, the transmission of cholera was not well understood; many believed it was airborne.
- Solution: Dr. John Snow plotted cholera cases on a map, revealing a cluster around the Broad Street water pump.
- Outcome: After the pump handle was removed, the outbreak subsided. This was a pioneering use of data visualization to solve a real-world problem.
Key Points:
- Visualization transformed complex tables of data into an understandable format.
- It enabled the identification of the contaminated water source.
- It highlighted the importance of visual tools in epidemiology.
William Playfair’s Charts (Late 1700s - Early 1800s)
- Contribution: Invented several types of graphs, including the line chart, bar chart, and pie chart.
- Application: Used to represent economic data such as imports and exports.
- Impact: Made complex economic data accessible to policymakers who lacked advanced mathematical training.
Napoleon’s March by Charles Minard (1869)
- Visualization: A flow map depicting Napoleon’s 1812 Russian campaign.
- Features:
- The width of the line represents the size of the army.
- Temperature chart below illustrates the harsh winter conditions.
- Insights:
- Showed the devastating losses due to cold and starvation.
- Illustrated multiple variables (army size, temperature, geography) in a single graphic.
Why It’s Important:
- Widely regarded as one of the greatest statistical graphics ever made; Edward Tufte suggested it may be the best statistical graphic ever drawn.
- Demonstrates how visualization can tell a complex story succinctly.
Florence Nightingale’s Polar Area Chart (1858)
- Context: Crimean War and poor conditions in field hospitals.
- Visualization: Polar area chart, also known as “Nightingale’s Rose.”
- Purpose:
- Showed causes of death among soldiers.
- Highlighted that most deaths were due to preventable diseases, not battle wounds.
- Result: Helped persuade the British government to improve sanitary conditions in military hospitals, an early example of data-driven healthcare reform.
The Power of Visualization
Anscombe’s Quartet
- Description: Four datasets with nearly identical statistical properties (mean, variance, correlation).
- Visualization:
- Plotting the datasets reveals distinct patterns and outliers.
- Lesson:
- Statistical measures alone can be misleading.
- Visual exploration is crucial for accurate data interpretation.
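The point about summary statistics can be checked numerically. The sketch below uses the published quartet values (Anscombe, 1973) and only the Python standard library; the helper name `summarize` is an illustrative choice, not part of the lecture. It computes the mean, sample variance, and Pearson correlation for each dataset so the near-identical figures can be compared side by side.

```python
from math import sqrt

# Anscombe's quartet (Anscombe, 1973). Sets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return means, sample variances, and Pearson's r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "var_x": sxx / (n - 1),
        "var_y": syy / (n - 1),
        "r": sxy / sqrt(sxx * syy),
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

for name, stats in SUMMARIES.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

The four rows of output agree to about two decimal places, yet a scatter plot of each set looks entirely different: a linear trend, a curve, a line with one outlier, and a vertical stack with one outlier.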
Human Perception
- Visual System: Our most sophisticated sense, capable of detecting patterns, anomalies, and trends.
- Application: Visualization leverages this to help us understand complex data quickly.
Key Concepts in Data Visualization
Data Visualization Pipeline
- Raw Data: Collection of unprocessed data.
- Data Tables: Structured format like spreadsheets or databases.
- Visual Structures: Geometric representations (charts, graphs).
- Views: What the user actually sees on screen, produced from the visual structures by view transformations such as zooming, panning, or filtering.
- Human Interaction: Users interpret and manipulate the visualization.
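The pipeline stages can be sketched as a chain of small functions. This is a minimal illustration under stated assumptions, not a real visualization library: the CSV rows are hypothetical placeholders, the function names (`to_table`, `to_marks`, `to_view`) are made up for this example, and the scale assumes the data values are not all identical.

```python
import csv
import io

# Raw data: hypothetical placeholder rows, not real measurements.
RAW = """name,horsepower,displacement
car_a,100,200
car_b,150,300
car_c,50,100
"""


def to_table(raw_text):
    """Raw data -> data table: parse text into typed rows."""
    reader = csv.DictReader(io.StringIO(raw_text))
    return [
        {
            "name": row["name"],
            "horsepower": float(row["horsepower"]),
            "displacement": float(row["displacement"]),
        }
        for row in reader
    ]


def linear_scale(value, domain, out_range):
    """Map a value from a data domain to a pixel range (domain must be non-empty)."""
    lo, hi = domain
    start, end = out_range
    return start + (value - lo) / (hi - lo) * (end - start)


def to_marks(table, width=400, height=300):
    """Data table -> visual structure: one point mark per row."""
    xs = [row["displacement"] for row in table]
    ys = [row["horsepower"] for row in table]
    x_domain = (min(xs), max(xs))
    y_domain = (min(ys), max(ys))
    return [
        {
            "label": row["name"],
            "x": linear_scale(row["displacement"], x_domain, (0, width)),
            # Screen y grows downward, so the range is flipped.
            "y": linear_scale(row["horsepower"], y_domain, (height, 0)),
        }
        for row in table
    ]


def to_view(marks, x_min, x_max):
    """Visual structure -> view: keep only marks inside a horizontal window."""
    return [mark for mark in marks if x_min <= mark["x"] <= x_max]


table = to_table(RAW)
marks = to_marks(table)
view = to_view(marks, 100, 400)
print([mark["label"] for mark in view])
```

Human interaction corresponds to the user changing the arguments of a stage, for example a different window in `to_view` or a different column mapping in `to_marks`, and looking at the result again.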
Shneiderman’s Information-Seeking Mantra
- Overview First: Get a broad understanding of the data.
- Zoom and Filter: Focus on areas of interest.
- Details on Demand: Access specific information as needed.
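One way to read the mantra is as three operations over the same dataset. The sketch below is an assumption-laden illustration: the records are hypothetical placeholders and the function names simply mirror the three steps; a real tool would attach them to interactive controls.

```python
# Hypothetical placeholder records (not real measurements).
RECORDS = [
    {"id": 1, "origin": "Europe", "weight": 2100},
    {"id": 2, "origin": "Japan", "weight": 1900},
    {"id": 3, "origin": "USA", "weight": 3500},
    {"id": 4, "origin": "USA", "weight": 4100},
]


def overview(records):
    """Overview first: summarize the whole dataset at a glance."""
    weights = [r["weight"] for r in records]
    return {
        "count": len(records),
        "origins": sorted({r["origin"] for r in records}),
        "weight_range": (min(weights), max(weights)),
    }


def zoom_and_filter(records, origin=None, max_weight=None):
    """Zoom and filter: narrow the dataset to the area of interest."""
    result = records
    if origin is not None:
        result = [r for r in result if r["origin"] == origin]
    if max_weight is not None:
        result = [r for r in result if r["weight"] <= max_weight]
    return result


def details_on_demand(records, record_id):
    """Details on demand: return the full record for one selected item."""
    for record in records:
        if record["id"] == record_id:
            return record
    return None


print(overview(RECORDS))
print(zoom_and_filter(RECORDS, origin="USA", max_weight=4000))
print(details_on_demand(RECORDS, 3))
```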
Data Types and Visualization Techniques
- One-Dimensional Data: Line charts.
- Two-Dimensional Data: Scatter plots.
- Three-Dimensional Data: 3D models.
- Multidimensional Data: Parallel coordinates, scatter plot matrices.
- Temporal Data: Timelines, time-series plots.
- Network Data: Node-link diagrams, adjacency matrices.
- Spatial Data: Geographic maps, heatmaps.
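For network data, the two techniques listed above are two views of the same information. A node-link diagram draws each edge as a line, while an adjacency matrix marks each edge as a filled cell. The sketch below converts a small, made-up undirected edge list into a matrix; the function name `adjacency_matrix` is an illustrative choice.

```python
# Hypothetical undirected graph, given as an edge list.
EDGES = [("A", "B"), ("A", "C"), ("C", "D")]


def adjacency_matrix(edges):
    """Return (sorted node names, 0/1 matrix) for an undirected edge list."""
    nodes = sorted({node for edge in edges for node in edge})
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for u, v in edges:
        # Undirected: mark both (u, v) and (v, u).
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1
    return nodes, matrix


nodes, matrix = adjacency_matrix(EDGES)
print("  " + " ".join(nodes))
for name, row in zip(nodes, matrix):
    print(name, " ".join(str(cell) for cell in row))
```

Matrices tend to stay readable for dense graphs where node-link diagrams become cluttered, at the cost of making paths harder to follow.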
Tools for Data Visualization
D3.js
- Description: A JavaScript library for creating dynamic and interactive data visualizations in web browsers.
- Features: Binds data directly to page elements (typically SVG), giving fine-grained control over the final visual result at the cost of more code.
Vega-Lite
- Description: A high-level grammar of interactive graphics.
- Usage: Simplifies the creation of common charts and visualizations with less code.
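A Vega-Lite chart is a JSON specification that names the data, a mark type, and the encodings from data fields to visual channels. The sketch below builds such a specification as a Python dictionary and prints it as JSON. It uses the same encodings as the Tableau demonstration in these notes (position for displacement and horsepower, size for weight, color for origin); the inline data rows and lowercase field names are hypothetical placeholders, not the actual car dataset.

```python
import json

spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "description": "Scatter plot sketch: horsepower vs. displacement.",
    # Hypothetical placeholder rows (not real measurements).
    "data": {
        "values": [
            {"displacement": 100, "horsepower": 50, "weight": 1900, "origin": "Japan"},
            {"displacement": 200, "horsepower": 100, "weight": 2100, "origin": "Europe"},
            {"displacement": 300, "horsepower": 150, "weight": 3500, "origin": "USA"},
        ]
    },
    "mark": "point",
    "encoding": {
        "x": {"field": "displacement", "type": "quantitative"},
        "y": {"field": "horsepower", "type": "quantitative"},
        "size": {"field": "weight", "type": "quantitative"},
        "color": {"field": "origin", "type": "nominal"},
    },
}

SPEC_JSON = json.dumps(spec, indent=2)
print(SPEC_JSON)
```

The printed JSON can be pasted into the online Vega Editor to render the chart. Changing `"mark"` to `"bar"` or swapping a field in `"encoding"` yields a different chart without any drawing code, which is the sense in which the grammar is high-level.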
Tableau
- Overview: A powerful tool for creating interactive visualizations without coding.
- Features:
- Drag-and-drop interface.
- Supports various data sources.
- Ideal for dashboards and storytelling.
Tableau Demonstration:
- Dataset Used: Car specifications from the 1970s to 1980s.
- Visualizations Created:
- Scatter plots comparing horsepower and displacement.
- Size and color encoding to represent weight and origin.
- Insights Gained:
- Correlation between engine size and horsepower.
- Visual differentiation of cars based on geographical origin.
Importance of External Cognition
- Definition: Using the environment to aid cognitive processes.
- Benefits:
- Offloads memory demands.
- Facilitates problem-solving through visual means.
- Example: Solving mathematical problems with pen and paper.
Conclusion
Data visualization is a critical aspect of HCI that transforms raw data into meaningful insights. By leveraging our visual perception and cognitive abilities, we can interpret complex datasets, identify patterns, and make informed decisions. Tools like D3.js, Vega-Lite, and Tableau empower us to create effective visualizations tailored to various data types and analytical needs.