DAVI Graph Visualization
DAVI Graph Visualization
Lecture Notes: Graph Visualization
Introduction
- Importance of Graph Visualization: Graphs represent relational data without inherent temporal or spatial frames of reference, such as social networks, computer networks, call graphs, and traffic networks.
- Data Relations and Values: Nodes can represent entities with attributes (e.g., authors with publication counts), and edges represent relationships (e.g., citations between authors).
Node-Link Diagrams
Characteristics of Good Node-Link Diagrams
- Aesthetic Constraints:
- Straight or orthogonal edges.
- Minimize edge crossings.
- Minimize edge bends.
- Maintain uniform edge lengths.
- Maximize symmetries.
- Maximize crossing angles for readability.
Force-Directed Layouts
- Concept: Simulate physical forces to position nodes:
- Attractive Forces: Pull connected nodes closer (Hooke’s Law for springs).
- Repulsive Forces: Push all nodes apart to prevent overlap (Coulomb’s Law for electric charges).
Algorithm Steps
- Initialization: Place nodes randomly or use a pre-layout based on attributes.
- Iteration:
- Simulated Annealing:
Optimize the picture
Optimizations
- Barnes-Hut Algorithm: Reduces computation from \(O(n^2)\) to \(O(n \log n)\) by approximating distant node influences.
- Additional Forces:
- Cluster Attraction: Pull nodes in the same cluster closer.
- Cluster Repulsion: Push nodes in different clusters apart.
- Centering Force: Keeps the graph centered in the display area.
Handling Node Attributes
- Glyphs: Encode additional data using node size, color, or shape.
- Edge Attributes: Vary edge thickness or style to represent weights or types.
Matrix Representations
- Use Cases:
Robinson Matrices
- Definition: An adjacency matrix rearranged so that similar nodes are clustered, and values decrease away from the diagonal.
- Ordering: Critical for revealing patterns; algorithms aim to minimize “Robinson violations.”
Cost Functions for Ordering
- Number of Robinson Violations: Counts instances where the desired decreasing pattern is not followed.
- Sum of Deviations: Weights violations by how much they deviate from the expected pattern.
- Weighted Deviations: Further weights deviations based on their distance from the diagonal.
Implicit Tree Visualizations
Interval Graphs
- Definition: Nodes represent intervals; edges exist between overlapping intervals.
- Applications: DNA sequencing in bioinformatics.
Permutation Diagrams
- Definition: Nodes placed on parallel axes; crossing lines indicate connections.
- Limitation: Best suited for permutation graphs.
Tree Maps
Adjacency Examples
Both methods are suitable for hierarchical data visualization but exhibit different advantages in varying scenarios:
Feature | Icicle Plot | Sunburst Plot |
---|---|---|
Layout Structure | Linear nesting, displaying hierarchy vertically | Circular nesting, displaying hierarchy radially |
Applicable Scenarios | Deep hierarchical tree structures | Broader hierarchical relationships needing global perspective |
Space Utilization | Lower space efficiency | Higher space efficiency, accommodating more nodes |
Interactivity | Primarily traditional static representation, less interactive | Better suited for dynamic, interactive visualization |
Intuitiveness | Easy to understand parent-child relationships | Aesthetically pleasing, suitable for emphasizing global distribution |
Slice-and-Dice Treemap
- Method: Recursively divide space to represent hierarchical data.
- Issue: Can produce rectangles with poor aspect ratios, making area comparison difficult.
Squarified Treemap
- Goal: Optimize rectangle shapes to be closer to squares.
- Algorithm:
- Sort items by size.
- Place largest items first to fill space efficiently.
- Issue:Does not maintain order
Ordered Treemap
- Purpose: Maintain a given order of items while optimizing aspect ratios.
- Algorithm:
- Use a pivot element to divide items.
- Recursively apply to subgroups.
- Pivot Selection Strategies:
- By Size: Based on item weights.
- Middle: Middle item in the ordered list.
- Split Size: Divides list into two groups with equal total weights.
Cascaded Treemaps
- Overlap Representation: Children nodes slightly overlap parent nodes.
- Benefit: Allows labeling of internal nodes and provides interaction handles.
Visualization of Partially Ordered Sets
Sugiyama Framework
- Purpose: Provides a method for drawing layered graphs (e.g., food webs).
- Steps:
1. Layer Assignment
produces a hanging of the graph from its source node(s) -> assigns y-coordinates to each node
- Process: Assign nodes to layers based on their positions in the hierarchy.
- Handling Edges: Introduce dummy nodes for edges that span multiple layers to ensure edges only connect adjacent layers.
2. Crossing Minimization
determines an order of the nodes on each layer that minimizes edge crossings
- Barycentric Method:
- Calculate the average position of each node’s neighbors in the adjacent layer.
- Reorder nodes to minimize edge crossings.
- Perform iterative sweeps (top-down and bottom-up) until the order stabilizes.
3. Horizontal Coordinate Assignment
produces an x-coordinate for each node from the ordering
- Goal: Assign x-coordinates to minimize edge bends.
- Method: Position nodes to allow for straight-line edges where possible.
Visualization Critique Example
Gun Deaths Visualization Issues
- Inappropriate Use of Color:
- Time represented using a diverging color scale without a meaningful midpoint.
- Diverging scales imply a neutral value, which doesn’t exist in this context.
- Non-Mutually Exclusive Categories:
- Categories overlap (e.g., “Handgun” appearing in multiple segments), making comparisons invalid.
- Ineffective Data Representation:
- Stack bar charts are not ideal for time series data.
- Alternative: Use cumulative line graphs for clearer temporal comparisons.
Best Practices Highlighted
- Expressiveness: Choose visual encodings that accurately represent the data without introducing misleading implications.
- Effectiveness: Use visualization methods that allow for easy and accurate interpretation of the data.
- Avoid Overlapping Categories: Ensure categories are mutually exclusive when comparisons are intended.
- Appropriate Use of Color: Select color scales that align with the data type (sequential, diverging, qualitative).
Key Takeaways
- Graph Visualization Techniques:
- Force-directed layouts are intuitive but computationally intensive for large graphs.
- Matrix representations are useful for dense graphs but require careful ordering to reveal patterns.
- Implicit visualizations like treemaps efficiently represent hierarchical data but have limitations regarding order and aspect ratios.
- Optimizing Visualizations:
- Incorporate additional forces or constraints to highlight clusters or specific patterns.
- Use algorithms like the Sugiyama framework for layered graphs to minimize crossings and bends.
- Visualization Principles:
- Match the visualization technique to the data characteristics and the insights you wish to convey.
- Be mindful of visual encoding choices to maintain both expressiveness and effectiveness.
This post is licensed under CC BY 4.0 by the author.