An ETag (entity tag) is an HTTP response header that acts as a unique identifier for a specific version of a web resource. It allows browsers to check if a file has changed since the last time it was downloaded.
Using ETags improves site speed and reduces server load by preventing the transfer of unchanged data.
An ETag is an opaque string assigned by a web server to a specific version of a resource found at a URL. If the content at that URL changes, the server assigns a new ETag. These tags function like digital fingerprints, allowing systems to compare two versions of a resource quickly without examining the entire file.
Why ETag matters
ETags are a primary mechanism for web cache validation and optimistic concurrency control. They provide several performance and functional advantages:
- Bandwidth efficiency: Servers do not need to send a full response if the content has not changed.
- Faster load times: Browsers can use locally stored files immediately after a quick validation check.
- Collision prevention: They help detect "mid-air collisions" when multiple users try to update the same resource simultaneously.
- Resource monitoring: Web page monitoring systems use ETags to detect changes efficiently.
- Widespread adoption: [The ETag response header is used on roughly 25% of web responses] (Fastly).
How ETag works
The process of using an ETag to validate a resource is called revalidation. It follows a specific request-response cycle.
- Initial Request: The browser requests a resource (like an image or HTML file).
- Server Response: The server sends the file along with an ETag header, such as ETag: "675af34563dc-tr34".
- Caching: The browser stores the file and the ETag.
- Subsequent Request: When the resource is needed again but has become stale, the browser sends the ETag back in an If-None-Match header.
- Server Comparison: The server compares the browser's ETag with the current version on the server.
- Validation: If the tags match, the server sends a 304 Not Modified status with no body. If they do not match, the server sends the new version with a 200 OK status and a new ETag.
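The server side of this cycle can be sketched in a few lines of Python. This is a minimal illustration, not a production handler: the function name revalidate and its arguments are hypothetical, and the naive comma split assumes no ETag contains a comma. It applies the weak comparison that HTTP specifies for If-None-Match, so a W/ prefix is ignored when matching.

```python
def revalidate(if_none_match, current_etag, body):
    """Return (status, headers, body) for a conditional GET.

    if_none_match: raw If-None-Match header value, or None if absent.
    current_etag:  the ETag of the version currently on the server.
    """
    headers = {"ETag": current_etag}

    def opaque(tag):
        # Weak comparison ignores the W/ prefix.
        return tag[2:] if tag.startswith("W/") else tag

    if if_none_match is not None:
        # The header may carry several tags separated by commas.
        candidates = [t.strip() for t in if_none_match.split(",")]
        if "*" in candidates or opaque(current_etag) in {opaque(t) for t in candidates}:
            # Tags match: the cached copy is still good, send no body.
            return 304, headers, b""

    # No validator sent, or the resource changed: send the full response.
    return 200, headers, body


status, _, payload = revalidate('"675af34563dc-tr34"', '"675af34563dc-tr34"', b"<html>...</html>")
print(status, len(payload))
```

A first visit (no If-None-Match header) or a changed resource falls through to the 200 branch, which carries the current ETag for the browser to store.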
Strong and weak validation
The HTTP specification allows for two types of ETag validators. They are distinguished by the presence of a W/ prefix.
Strong validators
A strong ETag indicates that the resource is byte-for-byte identical. This is the default type. Strong ETags are required for byte-range requests, which allow a client to request only a portion of a large file.
Weak validators
A weak ETag starts with W/, such as ETag: W/"0815". This indicates that the two representations are semantically equivalent but not necessarily identical at the byte level. Weak ETags are easier for servers to generate, making them useful for dynamically generated content where a byte-for-byte match is impractical.
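The difference between the two validator types shows up in how tags are compared. The sketch below follows the strong and weak comparison rules from the HTTP specification; the helper names (is_weak, opaque, strong_match, weak_match) are illustrative and not taken from any library.

```python
def is_weak(tag: str) -> bool:
    # The W/ prefix is case-sensitive.
    return tag.startswith("W/")


def opaque(tag: str) -> str:
    # Strip the W/ prefix, keeping the quoted opaque string.
    return tag[2:] if is_weak(tag) else tag


def strong_match(a: str, b: str) -> bool:
    # Strong comparison: neither tag may be weak, and they must be identical.
    return not is_weak(a) and not is_weak(b) and a == b


def weak_match(a: str, b: str) -> bool:
    # Weak comparison: the opaque strings must be identical; W/ is ignored.
    return opaque(a) == opaque(b)


for a, b in [('"0815"', '"0815"'), ('W/"0815"', '"0815"'), ('W/"0815"', 'W/"0816"')]:
    print(a, b, strong_match(a, b), weak_match(a, b))
```

A weak tag never passes a strong comparison, even against itself, which is why weak ETags cannot be used for byte-range requests.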
Best practices
Follow these principles to ensure ETags improve performance rather than hindering it.
- Assign validators to all resources: Ensure every resource has either an ETag or a Last-Modified header so browsers can revalidate them.
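- Use content hashes: Generate ETags from a hash of the content, such as MD5 or SHA-1. Both are adequate for detecting changes, although neither is considered collision-resistant against deliberate attacks. [Analysis shows MD5 and SHA-1 are the most common hash lengths used on the web] (Fastly).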
- Ensure validity: Always surround ETag values with double quotes.
- Account for encoding: ETags should be content-coding aware. For example, a gzipped version of a file should have a different ETag than the uncompressed version.
- Prioritize Last-Modified for simplicity: If one-second resolution is sufficient, use the Last-Modified header instead, as it is often easier for servers to track.
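Several of these practices can be combined in one small helper. The following is a sketch under stated assumptions: make_etag is a hypothetical function name, SHA-1 is chosen only because the text mentions it, and the "+coding" suffix is one possible convention for making the tag content-coding aware rather than a required format.

```python
import hashlib


def make_etag(body: bytes, content_coding: str = "") -> str:
    """Build a strong, quoted ETag from the uncompressed content.

    content_coding: e.g. "gzip" when the response will be sent compressed;
    an empty string means the identity (uncompressed) representation.
    """
    digest = hashlib.sha1(body).hexdigest()
    # Distinguish compressed and uncompressed representations (assumed convention).
    suffix = f"+{content_coding}" if content_coding else ""
    # ETag values must always be wrapped in double quotes.
    return f'"{digest}{suffix}"'


page = b"<h1>Hello</h1>"
print(make_etag(page))
print(make_etag(page, "gzip"))
```

Because the tag is derived from the content itself, any change to the body produces a new ETag without extra bookkeeping.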
Common mistakes
- Mistake: Failing to update the ETag after changing the resource. Fix: Ensure the generation logic (hash or revision number) triggers a new tag immediately upon content updates to avoid serving stale data.
- Mistake: Using invalid syntax like spaces or missing quotes. Fix: Always wrap the opaque string in double quotes and avoid internal spaces.
- Mistake: Creating "double-weak" tags like W/W/"123". Fix: Use only a single W/ prefix for weak validation.
- Mistake: Using ETags for user tracking. Fix: Avoid persisting ETags to track users without consent. [Hulu and KISSmetrics faced legal action for using ETags as "undeletable" tracking cookies] (Wikipedia).
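The syntax mistakes above can be caught with a simple check before a tag is sent. This sketch encodes the entity-tag grammar from the HTTP specification as a regular expression; the function name is_valid_etag is hypothetical.

```python
import re

# Optional single W/ prefix, then a double-quoted string of visible characters.
# Spaces (0x20) and embedded double quotes (0x22) are not allowed inside the quotes.
ETAG_PATTERN = re.compile(r'(?:W/)?"[\x21\x23-\x7E\x80-\xFF]*"')


def is_valid_etag(value: str) -> bool:
    return ETAG_PATTERN.fullmatch(value) is not None


for candidate in ['"675af34563dc-tr34"', 'W/"123"', 'W/W/"123"', '675af34563dc', '"has space"']:
    print(candidate, is_valid_etag(candidate))
```

Running a check like this in tests or at the edge is one way to stop malformed tags from reaching clients, which may otherwise ignore them or fail to revalidate.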
ETag vs. Last-Modified
| Feature | ETag | Last-Modified |
|---|---|---|
| Logic | Unique identifier/hash | Timestamp of change |
| Precision | High (exact version) | Lower (one-second) |
| Complexity | Higher to generate | Lower to maintain |
| Range Requests | Supported (Strong) | Supported |
FAQ
What is the "mid-air collision" problem?
This occurs when two people edit the same page at the same time. If User A and User B both open a wiki page, they both receive the same ETag. If User A saves first, the server updates the ETag. When User B tries to save, their If-Match header will not match the new server ETag, and the server will return a 412 Precondition Failed error, preventing User B from accidentally overwriting User A's changes.
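The wiki scenario can be simulated with a small in-memory model. This is an illustrative sketch only: the Page class, the save function, and the version-counter ETag are assumptions made for the example, not part of any real wiki software.

```python
class Page:
    """A wiki page whose ETag is derived from a revision counter."""

    def __init__(self, text: str):
        self.text = text
        self.version = 1

    @property
    def etag(self) -> str:
        return f'"v{self.version}"'


def save(page: Page, if_match: str, new_text: str) -> int:
    """Apply an update only if the client's If-Match tag is still current."""
    # If-Match uses strong comparison, so the tags must be identical.
    if if_match != page.etag:
        return 412  # Precondition Failed: someone else saved first
    page.text = new_text
    page.version += 1  # the content changed, so the ETag changes too
    return 200


page = Page("original text")
tag_a = page.etag  # User A opens the page
tag_b = page.etag  # User B opens the same version

print(save(page, tag_a, "edit by A"))  # User A saves first
print(save(page, tag_b, "edit by B"))  # User B's tag is now out of date
print(page.text)
```

User B's client can then fetch the latest version, merge the changes, and retry with the new ETag.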
How does ETag affect SEO?
ETags indirectly help SEO by improving crawl efficiency. When search engine bots crawl a site, ETags allow them to identify unchanged content quickly. This saves the bot's "crawl budget," allowing it to spend more time indexing new or updated pages.
Can ETags be used for tracking?
Yes. Since ETags are stored in the browser cache and sent back to the server, a server can assign a unique ETag to a specific user to track them across sessions. This is often called a "zombie cookie" because it can persist even if a user deletes their standard cookies.
What happens if a website is "buggy" with ETags?
If a site fails to update an ETag after changing content, the browser will continue to use the old, cached version. [To detect this, some developers suggest occasionally omitting the If-None-Match header] (Wikipedia) to see if the server returns different content for the same ETag.
Does compression change the ETag?
Yes. A gzipped file is byte-for-byte different from the uncompressed file, so the two are distinct representations and should carry distinct ETags. A common practice is to add a suffix like +gzip to the ETag for compressed versions.