Visual encoding is the process of mapping data into visual structures to create images on a screen. It acts as a translation layer between raw information and the visual elements (like colors, shapes, and sizes) that users perceive. Mastering this process allows you to present complex data in a way that viewers can understand and recall faster.
What is Visual Encoding?
In data visualization, encoding is a set of rules used to convert data points into specific visual marks. While the term may sound technical, it describes a common cognitive function: how the brain converts images, objects, and scenes into mental representations for storage and recall.
For marketers, visual encoding is essentially a formula: "Every time a specific data point changes, change a specific visual element." This consistency ensures that the audience can decode the information without confusion.
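The "formula" idea above can be sketched as a single function applied to every record. This is only an illustration: the field names (`channel`, `visits`), the `Mark` type, the color table, and the `pixels_per_visit` factor are all hypothetical and not taken from any particular charting library.

```python
from dataclasses import dataclass

# Hypothetical rule table: one fixed color per marketing channel.
CHANNEL_COLORS = {"email": "blue", "search": "green", "social": "orange"}


@dataclass(frozen=True)
class Mark:
    x: float       # position along the horizontal axis
    height: float  # bar height in pixels
    color: str     # fill color name


def encode(record: dict, index: int, pixels_per_visit: float = 0.1) -> Mark:
    """Apply the same data-to-visual rule to every record."""
    return Mark(
        x=float(index),                               # record order -> position
        height=record["visits"] * pixels_per_visit,   # quantity -> bar height
        color=CHANNEL_COLORS[record["channel"]],      # category -> hue
    )


records = [
    {"channel": "email", "visits": 1200},
    {"channel": "search", "visits": 3400},
]
marks = [encode(r, i) for i, r in enumerate(records)]
```

Because every record goes through the same `encode` rule, a change in `visits` can only ever change bar height, and a change in `channel` can only ever change color.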
Why Visual Encoding matters
- Faster comprehension: A well-chosen visual encoding lets viewers spot patterns and differences faster than they could by reading the same values as text or raw numbers.
- Improved recall: [Visual memory benefits from prolonged encoding time regardless of stimulus type] (Li et al.).
- Reduced cognitive load: Organizing information into patterns prevents the brain from being overwhelmed by raw numbers.
- Universal communication: Visual symbols often bypass language barriers, making data accessible to a wider audience.
- Increased accuracy: Clear encoding leads to more accurate data recall, which supports better decision making.
How Visual Encoding works
The process begins by identifying the type of data you are working with. Three primary categories are commonly distinguished:
- Quantitative: Exact numbers (e.g., traffic counts or revenue).
- Ordered (Qualitative): Items that can be ranked (e.g., priority levels like "High," "Medium," and "Low").
- Categorical: Distinct types that cannot be ranked (e.g., different marketing channels or fruit types).
Once the data type is defined, you apply visual variables to represent those differences. Humans are particularly sensitive to "retinal variables," a concept [introduced by Jacques Bertin] (Bertin) in his 1967 book Sémiologie graphique (Semiology of Graphics).
Types of Visual Variables
Planar Variables
These are the X and Y axes. They work for all data types and are highly effective for showing quantitative data. While a Z-axis (3D) exists, [3D charts look horrible on screen in 95.8% of cases] (Apptio) and should generally be avoided.
Retinal Variables
- Size: Effective for quantitative data. Large elements suggest importance or higher volume.
- Color Hue: Best for separating categories (e.g., using red for "Bugs" and blue for "User Stories").
- Color Value: Useful for ordered data. Darker shades often represent higher values in a sequence.
- Shape: Allows users to differentiate between categories, such as using circles for one group and squares for another.
- Texture: Less common in digital displays as it is harder to perceive than color or size.
- Orientation: Tricky to use, though humans can easily distinguish between horizontal and vertical lines.
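As a rough illustration of the Color Value variable, the sketch below maps ordered levels to lightness so that higher ranks come out darker. The function names, the lightness range, and the fixed hue of 0.6 are assumptions chosen for the example; only `colorsys.hls_to_rgb` comes from the Python standard library.

```python
import colorsys


def value_ramp(levels, lightest=0.85, darkest=0.25):
    """Map ordered levels (lowest first) to lightness; higher rank = darker."""
    if len(levels) == 1:
        return {levels[0]: darkest}
    step = (lightest - darkest) / (len(levels) - 1)
    return {level: round(lightest - i * step, 3) for i, level in enumerate(levels)}


def to_hex(hue, lightness, saturation=0.6):
    """Convert hue/lightness/saturation (all in 0..1) to a #rrggbb string."""
    r, g, b = colorsys.hls_to_rgb(hue, lightness, saturation)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))


# Same hue for every level; only the lightness changes with rank.
ramp = value_ramp(["Low", "Medium", "High"])
colors = {level: to_hex(0.6, lightness) for level, lightness in ramp.items()}
```

Keeping the hue constant and varying only lightness preserves the sense of order; switching hue per level would make the levels read as unrelated categories instead.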
Gestalt Principles in Encoding
The brain does not process visual elements in isolation. According to Gestalt theory, it organizes stimuli into meaningful patterns through several "laws of grouping":
- Similarity: We group things that look alike (size, shape, color).
- Proximity: We group elements that are physically close to each other.
- Continuity: We perceive elements arranged in a line or curve as related.
- Closure: The mind automatically fills in gaps to create complete shapes.
- Symmetry: Symmetrical items are perceived as a single group.
- Connectedness: Elements linked by lines or touch are seen as a unit.
Best practices
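Map data to area, not radii. When using circles to represent scalar data, you must map the data to the circle’s area rather than its radius. [Humans are better at comparing relative areas] (vis4.net). Because area grows with the square of the radius, doubling the radius for a doubled value produces a circle that looks four times as large.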
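A minimal sketch of area-based scaling: since a circle's area is proportional to the square of its radius, the radius should scale with the square root of the value. The function name and the `max_radius` default are illustrative assumptions.

```python
import math


def radius_for(value, max_value, max_radius=40.0):
    """Return a radius such that circle AREA is proportional to value."""
    if value < 0 or max_value <= 0:
        raise ValueError("value must be >= 0 and max_value must be > 0")
    return max_radius * math.sqrt(value / max_value)


# A value that is 25% of the maximum gets a radius of 20, not 10:
# (20 / 40) ** 2 == 0.25, so its area is 25% of the largest circle.
small = radius_for(25, 100)
large = radius_for(100, 100)
area_ratio = (small / large) ** 2
```

Mapping the value linearly to the radius instead would give the smaller circle a radius of 10 and an area of only 1/16 of the largest one, understating it by a factor of four.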
Limit your color palette. You should use no more than a dozen colors to encode categories. Exceeding this makes it difficult for the brain to differentiate between types quickly.
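One possible way to stay within the limit, shown here as an assumption rather than a prescribed method, is to keep the most frequent categories and fold the rest into a single "Other" bucket before assigning colors. The function name and the `other_label` parameter are hypothetical.

```python
from collections import Counter


def limit_categories(labels, max_colors=12, other_label="Other"):
    """Keep the most frequent categories; fold the remainder into one bucket."""
    counts = Counter(labels)
    if len(counts) <= max_colors:
        return list(labels)
    # Reserve one color slot for the catch-all bucket.
    keep = {label for label, _ in counts.most_common(max_colors - 1)}
    return [label if label in keep else other_label for label in labels]


labels = ["a"] * 5 + ["b"] * 3 + ["c", "d"]
reduced = limit_categories(labels, max_colors=3)
```

Here the two rare categories `c` and `d` end up sharing one color, so the chart needs three colors instead of four.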
Use diverging scales for bipolar data. If your data has positive and negative values (like temperature or profit/loss), use a diverging scale with two different colors. Using a single color for both positive and negative ranges is a mistake.
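A diverging scale can be sketched as interpolation from a midpoint color toward one of two end colors, depending on the sign of the value. The neutral midpoint at zero and the specific RGB triples below are assumptions for the example; the guidance above only requires two different colors.

```python
def diverging_color(value, max_abs,
                    negative=(33, 102, 172),   # blue end (assumed)
                    positive=(178, 24, 43),    # red end (assumed)
                    neutral=(247, 247, 247)):  # near-white midpoint (assumed)
    """Blend from neutral toward the negative or positive color by magnitude."""
    if max_abs <= 0:
        raise ValueError("max_abs must be > 0")
    t = min(abs(value) / max_abs, 1.0)          # 0 at the midpoint, 1 at the extremes
    end = positive if value >= 0 else negative
    return tuple(round(n + (e - n) * t) for n, e in zip(neutral, end))


loss = diverging_color(-10, 10)
break_even = diverging_color(0, 10)
profit = diverging_color(10, 10)
```

Values beyond `max_abs` are clamped to the end colors, so an outlier cannot push the scale outside the palette.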
Match variables to data types. Do not use color hues to represent exact numbers, and do not use size to represent categories that have no inherent rank.
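The matching rule can be expressed as a small lookup plus a check. The table below is a simplified reading of this article's guidance, not an authoritative standard, and the names `SUITABLE` and `check_encoding` are hypothetical.

```python
# Which data types each visual variable suits, per the guidance in this article.
SUITABLE = {
    "position": {"quantitative", "ordered", "categorical"},
    "size": {"quantitative", "ordered"},
    "color_value": {"quantitative", "ordered"},
    "color_hue": {"categorical"},
    "shape": {"categorical"},
}


def check_encoding(mapping, field_types):
    """Return warnings for fields mapped to a poorly suited visual variable."""
    warnings = []
    for field, variable in mapping.items():
        if variable not in SUITABLE:
            warnings.append(f"no guidance for variable '{variable}'")
            continue
        data_type = field_types[field]
        if data_type not in SUITABLE[variable]:
            warnings.append(f"{variable} is a poor fit for {data_type} field '{field}'")
    return warnings


field_types = {"revenue": "quantitative", "channel": "categorical"}
bad = check_encoding({"revenue": "color_hue", "channel": "size"}, field_types)
good = check_encoding({"revenue": "position", "channel": "color_hue"}, field_types)
```

The first mapping triggers two warnings (hue for exact numbers, size for unranked categories), while the second passes cleanly.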
Common mistakes
Mistake: Using 3D effects for flat data. Fix: Stick to X and Y axes for clarity; 3D often distorts the viewer's perception of size and position.
Mistake: Mapping quantity to circle radius. Fix: Map the quantity to the total area of the shape so the visual growth matches the data growth.
Mistake: Overloading a single chart. Fix: You can technically use every retinal variable at once, but doing so tends to cause cognitive overload. Aim for a maximum of three to four variables (e.g., X, Y, Color, and Size).
Mistake: Ignoring cultural contexts. Fix: Symbols and colors can have different meanings across cultures. Ensure your visual encoding is appropriate for your specific target audience.
Examples
The "Hamilton" Interactive: In a visualization of the musical Hamilton, color encodes which character is speaking (e.g., purple for Aaron Burr). Circle area encodes the length of lyrics, and grouping is used to cluster lines by specific songs.
Olympic Medal Mapping: A New York Times visualization used a world map (X and Y variables) where circle size encoded the medal count for each nation. Colors were used to differentiate between continents.
Sport Performance Heatmaps: Heatmaps on a basketball court use X and Y coordinates to show position. Color encodes the points per region (intensity), and the size of the marks indicates the number of shot attempts.
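To make the heatmap encoding concrete, the sketch below groups shots into square court regions and computes the two numbers that drive the marks: attempts (for size) and points (for color). It is not how any published graphic was built; the shot tuple layout `(x, y, points)` and the `cell` size are assumptions.

```python
from collections import defaultdict


def bin_shots(shots, cell=5.0):
    """Group (x, y, points) shots into square court cells and summarise each."""
    cells = defaultdict(lambda: {"attempts": 0, "points": 0})
    for x, y, points in shots:
        key = (int(x // cell), int(y // cell))   # planar variables: region position
        cells[key]["attempts"] += 1              # later mapped to mark size
        cells[key]["points"] += points           # later mapped to color intensity
    return dict(cells)


shots = [(1, 1, 2), (2, 3, 0), (12, 4, 3)]
regions = bin_shots(shots)
```

Each entry in `regions` then becomes one mark: its key gives the X and Y position, `attempts` sets the size, and `points` sets the color.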
FAQ
What is the difference between visual encoding and data visualization? Visual encoding is a specific step within data visualization. It is the technical and cognitive process of choosing which visual mark (like a blue triangle) represents which piece of data (like a "High Priority Bug"). Data visualization is the broader field of creating the final graphic.
Which visual variable is the most effective? Planar variables (X and Y position) are generally considered the most effective for all data types. Among retinal variables, color hue is excellent for categories, while size and color value are best for quantitative or ordered data.
How many variables should I use at once? For simple data sets, X and Y axes are often enough. You should only add retinal variables (like size or shape) when you need to present three or more data dimensions on the same chart.
Why is 3D encoding discouraged? 3D charts fail to communicate effectively in the vast majority of cases. 3D perspective makes it difficult to judge exact values on a flat screen and often obscures data points behind one another.