Eye tracking measures either the point of gaze (where someone is looking) or the motion of an eye relative to the head. For marketers and SEO practitioners, it provides behavioral evidence of attention, cognitive load, and decision-making moments that precede clicks and conversions. Unlike aggregate analytics, it reveals the specific visual elements that capture attention, create confusion, or get ignored entirely.
What is Eye Tracking?
Eye tracking is the process of measuring eye positions and movements to understand visual attention. An eye tracker is a device that captures this data using sensors or cameras to detect where a person looks, for how long, and in what sequence.
The technology records three core types of eye movement:
- Fixations: Relatively longer periods (0.2–0.6 seconds) of steady focus where the brain processes visual detail.
- Saccades: Rapid, point-to-point jumps (20–100 ms) between fixations that shift the foveal "spotlight" to new areas.
- Smooth pursuits: Continuous movements that track moving objects.
Developed from early reading research in the 1800s, modern eye tracking uses video-based methods, electrooculography (EOG), or specialized contact lenses to capture gaze data for applications in psychology, marketing, usability testing, and assistive technology.
Why Eye Tracking matters
Eye tracking bridges the gap between what users say they do and what they actually do. For conversion optimization and content strategy, it provides specific advantages:
- Quantify attention before action: Reveals what users look at but do not click, exposing missed opportunities and false affordances.
- Optimize information hierarchy: Validates whether visual flow matches intended conversion paths, ensuring critical CTAs fall within natural gaze patterns.
- Measure cognitive load: Pupil dilation and blink rates indicate mental effort, helping identify confusing interface elements that increase bounce risk.
- Test real-world environments: Wearable trackers capture authentic shopping behavior, shelf scanning, and mobile use outside the lab.
- Validate SERP features: Research indicates that authorship snippets received more attention than paid ads or the first organic result (Search Engine Journal).
- Accelerate recruitment screening: In a study of LinkedIn profile screening, recruiters spent just 3 seconds screening each candidate profile, with gaze heatmaps revealing which elements triggered immediate attention (Element's Blog).
How Eye Tracking works
Most commercial systems use video-based eye tracking with infrared or near-infrared illumination. The process follows these steps:
- Illumination: Near-infrared light directed at the eye creates corneal reflections (glints) and illuminates the pupil.
- Image capture: Cameras record the eye region, capturing the pupil center and corneal reflection (the first Purkinje image).
- Feature extraction: Algorithms detect the vector between the pupil center and the corneal reflection to calculate gaze direction.
- Calibration: The participant looks at specific points on screen to map eye features to screen coordinates. Accurate calibration is essential; misalignment produces erroneous data.
- Event classification: Software segments raw data into fixations, saccades, and blinks based on velocity thresholds and duration criteria.
- Visualization and analysis: Heatmaps aggregate fixation density, scanpaths show gaze sequences, and areas of interest (AOIs) calculate metrics like dwell time and time to first fixation.
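The event-classification step above is commonly implemented with a velocity-threshold algorithm (I-VT): inter-sample velocities above a cutoff are labeled saccades, the rest fixations. A minimal sketch in Python, assuming gaze samples in degrees of visual angle at a fixed sampling rate (the 30°/s threshold is a typical but illustrative value):

```python
# Minimal I-VT (velocity-threshold) event classifier.
# Assumptions (illustrative, not from any specific vendor SDK):
# - samples are (x, y) gaze positions in degrees of visual angle
# - constant sampling rate in Hz

from math import hypot

def classify_ivt(samples, rate_hz, threshold_deg_per_s=30.0):
    """Label each inter-sample interval as 'fixation' or 'saccade'."""
    labels = []
    for (x0, y0), (x1, y1) in zip(samples, samples[1:]):
        velocity = hypot(x1 - x0, y1 - y0) * rate_hz  # deg/s
        labels.append("saccade" if velocity > threshold_deg_per_s else "fixation")
    return labels

# Example: steady gaze, a rapid jump, then steady again (60 Hz samples).
gaze = [(10.0, 10.0), (10.01, 10.0), (10.02, 10.01),
        (14.0, 12.0), (14.01, 12.0)]
print(classify_ivt(gaze, rate_hz=60))
# → ['fixation', 'fixation', 'saccade', 'fixation']
```

Production systems add duration criteria (e.g. discarding "fixations" shorter than ~100 ms) and merge adjacent fixations, but the velocity test is the core of the classification.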
Advanced systems use deep learning for gaze estimation. A Deep Integrated Neural Network (DINN) trained on over 2,400 subjects classified eye states with 96–99.5% accuracy for applications like driver drowsiness detection (Zhao et al., 2017).
Types of Eye Tracking
| Type | Mechanism | Best For | Limitations |
|---|---|---|---|
| Screen-based (Remote) | Cameras mounted on or near a monitor track eyes from a distance | Web usability, A/B testing, reading studies | Requires stationary position; limited head movement |
| Wearable (Glasses) | Cameras mounted on glasses-like frames with forward-facing scene cameras | In-store behavior, mobile UX, sports training | Slippage affects accuracy; complex data analysis |
| Webcam-based | Uses computer vision on standard webcams | Remote unmoderated testing, large sample sizes | Lower accuracy than dedicated hardware |
| VR/AR Integrated | Sensors embedded in headsets | Immersive environment testing, foveated rendering | Expensive; requires specialized software |
| EOG | Electrodes measure electric potential around eyes | Sleep research, situations with eyes closed | Poor gaze-direction accuracy; invasive setup |
Screen-based systems use either bright-pupil (coaxial illumination creating red-eye effect) or dark-pupil (offset illumination) techniques. Bright-pupil tracking offers better contrast and works across different iris pigmentation and lighting conditions.
Best practices
Define AOIs before testing. Identify specific areas of interest (buttons, images, text blocks) prior to data collection. Post-hoc definitions introduce bias and prevent statistical comparison of dwell time across participants.
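Once AOIs are fixed in advance, per-AOI metrics such as dwell time and time to first fixation reduce to simple aggregation over classified fixations. A sketch under assumed input formats (rectangular AOIs in pixels; fixations as position, onset, and duration):

```python
# Sketch: dwell time and time-to-first-fixation (TTFF) per AOI.
# Assumed inputs (illustrative):
# - aois: name -> (left, top, right, bottom) in screen pixels
# - fixations: list of (x, y, start_ms, duration_ms), in time order

def aoi_metrics(aois, fixations):
    metrics = {name: {"dwell_ms": 0, "ttff_ms": None} for name in aois}
    for x, y, start_ms, duration_ms in fixations:
        for name, (left, top, right, bottom) in aois.items():
            if left <= x <= right and top <= y <= bottom:
                m = metrics[name]
                m["dwell_ms"] += duration_ms
                if m["ttff_ms"] is None:  # first hit = time to first fixation
                    m["ttff_ms"] = start_ms
    return metrics

# Hypothetical AOIs and fixation stream for one participant.
aois = {"cta_button": (600, 400, 760, 450), "hero_image": (0, 0, 800, 300)}
fixations = [(400, 150, 0, 250), (650, 420, 300, 400), (700, 430, 800, 200)]
print(aoi_metrics(aois, fixations))
```

Because the AOIs are defined before collection, the same rectangles apply to every participant, which is what makes dwell times statistically comparable across the sample.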
Script specific tasks. Eye movement depends heavily on the question asked. The Yarbus effect demonstrates that the same stimulus produces different scanpaths depending on the user's goal. Task-based scenarios ("Find the shipping cost") yield actionable data; free browsing produces noise.
Combine with qualitative methods. The strong eye-mind hypothesis assumes no lag between fixation and processing, but covert attention means users may think about elements they are not looking at. Pair eye tracking with think-aloud protocols or retrospective interviews to disambiguate gaze patterns.
Verify calibration. Signal drift and slippage reduce accuracy over time. Check calibration points at the start and end of sessions. The vector between pupil center and corneal reflection must remain stable for valid gaze estimation.
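The calibration mapping itself is a regression from pupil–corneal-reflection vectors to known screen coordinates. Commercial systems typically fit a second-order polynomial in both vector components; the sketch below fits an independent linear model per axis, which is a simplification for illustration (all calibration values are made up):

```python
# Sketch: map pupil-glint vectors to screen coordinates from calibration
# points. Real systems usually fit a 2nd-order polynomial per axis;
# this simplified version fits target = a + b * feature by least squares.

def fit_axis(features, targets):
    """Ordinary least squares for target = a + b * feature."""
    n = len(features)
    mean_f = sum(features) / n
    mean_t = sum(targets) / n
    b = (sum((f - mean_f) * (t - mean_t) for f, t in zip(features, targets))
         / sum((f - mean_f) ** 2 for f in features))
    return mean_t - b * mean_f, b

def calibrate(vectors, screen_points):
    ax, bx = fit_axis([v[0] for v in vectors], [p[0] for p in screen_points])
    ay, by = fit_axis([v[1] for v in vectors], [p[1] for p in screen_points])
    return lambda vx, vy: (ax + bx * vx, ay + by * vy)

# Hypothetical pupil-glint vectors observed while the participant
# fixates six known on-screen targets.
vectors = [(-0.2, -0.1), (0.0, -0.1), (0.2, -0.1),
           (-0.2, 0.1), (0.0, 0.1), (0.2, 0.1)]
targets = [(160, 120), (480, 120), (800, 120),
           (160, 600), (480, 600), (800, 600)]
gaze = calibrate(vectors, targets)
print(gaze(0.1, 0.0))  # → (640.0, 360.0)
```

Re-running the known calibration points at the end of a session and comparing predicted to actual coordinates is one practical way to quantify drift.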
Analyze scanpaths, not just heatmaps. Heatmaps aggregate data and hide individual differences. Scanpaths reveal the sequence of information processing and cognitive strategies that aggregate views obscure.
Account for the foveal spotlight. The fovea covers only about 2° of visual angle (roughly thumbnail width at arm's length). Users rely on peripheral vision for layout and saccade targeting, so ensure key navigational cues exist outside the immediate focal area (Pupil Labs).
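The 2° foveal span converts to on-screen size with basic trigonometry: size = 2 · d · tan(angle / 2). A quick sketch, where the viewing distance and pixel density are assumed values for a typical desktop setup:

```python
# Convert visual angle to on-screen size at a given viewing distance.
# Assumed defaults: ~60 cm viewing distance, ~96 px per inch.

import math

def visual_angle_to_pixels(angle_deg, distance_cm=60.0, px_per_cm=96 / 2.54):
    size_cm = 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)
    return size_cm * px_per_cm

print(round(visual_angle_to_pixels(2.0)))  # → 79 (fovea covers ~2 cm / ~79 px)
```

In other words, at normal viewing distance the fovea covers an area smaller than most buttons, so anything a user "sees" outside that patch is being resolved by peripheral vision.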
Common mistakes
Mistake: Assuming gaze equals interest. Users fixate on confusing elements, error messages, or unusual visuals out of necessity, not preference. You will see high dwell time on broken features or misleading graphics. Fix: Cross-reference fixation data with usability metrics, task success rates, or verbal protocols to distinguish interest from confusion.
Mistake: Testing without environmental controls. Varying light conditions, screen glare, or reflections from glasses corrupt corneal reflection detection. Fix: Control lighting for remote testing. For wearable outdoor studies, check data quality markers and exclude segments with excessive slippage or occlusion.
Mistake: Confusing eye tracking with gaze tracking. Eye tracking measures eye-in-head rotation. Gaze tracking requires adding head position to determine line of sight in world coordinates. Fix: Use head-mounted displays with scene cameras or remote systems with head tracking if you need world-relative gaze vectors.
Mistake: Over-interpreting AI predictions. Convolutional neural networks can predict chess moves from eye tracking with saliency maps more than 54% similar to actual player attention, but this does not reveal why the player chose that move (Louedec et al., 2019). Fix: Treat AI-derived insights as hypotheses requiring validation through behavioral outcomes, not conclusions.
Mistake: Ignoring accessibility needs. Some users with motor impairments rely on eye tracking as input devices (eye mice). Poor calibration or interface design that triggers false positives creates barriers. Fix: Test assistive technology interfaces with actual users who have severe speech and motor impairment (SSMI), not just able-bodied participants.
Examples
SERP Layout Optimization
An eye tracking study of search engine results pages found that authorship snippets (Google Authorship) captured more attention than paid advertisements or even the first organic result. This data supported strategic investment in author markup and rich snippets to maximize visibility above the fold.
Recruitment Workflow Analysis
Research using heat maps analyzed how recruiters screen LinkedIn profiles. The 3-second screening pattern revealed that recruiters focused on profile photos, current position titles, and previous employers while largely ignoring detailed descriptions. This insight drove recommendations for concise, scannable resume formatting.
Yellow Pages Advertising
A study of print advertising in Yellow Pages directories found that ad size, graphics, color, and copy all influence attention patterns. Eye tracking quantified how frequently consumers fixated on target logos versus adjacent competitor ads, allowing advertisers to benchmark visual attention share.
Driver Monitoring Systems
Eye tracking technology integrated with convolutional neural networks classifies eye states (open, closed, blinking) to detect drowsiness. With 96–99.5% accuracy on training data exceeding 2,400 subjects, these systems trigger alerts when fixation patterns indicate fatigue, reducing accident risk in commercial fleets.
FAQ
What is the difference between eye tracking and gaze tracking?
Eye tracking measures the rotation of the eye relative to the head. Gaze tracking determines the line of sight in world coordinates, which requires adding head position data to eye-in-head measurements. Remote screen-based systems assume a fixed head position, while wearable systems use scene cameras to calculate true gaze direction.
Can I use webcam eye tracking for serious UX research?
Webcam-based systems offer lower accuracy and sampling rates than dedicated hardware, but they enable large-scale remote studies. They are suitable for detecting general attention patterns and areas of interest, but less reliable for precise fixation duration measurements or saccade velocity analysis required in clinical or high-stakes usability contexts.
What does a fixation tell us about user intent?
A fixation indicates that high-resolution foveal vision is processing a specific area, but the cognitive state is ambiguous. The same fixation might indicate interest, confusion, recognition, or simply waiting for a page element to load. Always pair fixation data with task outcomes or retrospective interviews to interpret meaning.
How many participants do I need for an eye tracking study?
Unlike click analytics, eye tracking data exhibits high inter-individual variability in scanpaths. For quantitative claims about fixation duration or dwell time, plan for at least 30–40 participants per condition. For qualitative insights and pattern recognition, smaller samples (8–12) can suffice if you focus on scanpath similarities rather than statistical testing.
Why is calibration necessary?
Eye trackers calculate gaze direction from the relationship between pupil center and corneal reflection, but eye anatomy varies between individuals. Calibration maps these features to known screen coordinates. Without calibration, or with poor calibration, the system cannot accurately determine where the user is looking on the screen.
Is eye tracking data subject to privacy regulations?
Yes. Eye tracking data can indirectly reveal information about ethnicity, personality traits, emotional states, and health conditions through machine learning inference. With eye tracking becoming standard in smartphones and laptops, collection of this biometric data falls under GDPR, CCPA, and other privacy frameworks requiring explicit consent and secure storage.