HTML Entities: Character References & Usage Guide

Use HTML entities to display reserved characters and symbols safely. This guide covers named references, numeric codes, and syntax best practices.

HTML entities are character sequences used to display reserved characters, symbols, and whitespace that browsers would otherwise interpret as markup or collapse. These sequences, also called character references, let you safely include signs like < or & in your content without breaking the page structure. Using entities ensures that browsers and search engines read your text exactly as intended.

What is an HTML Entity?

The term "HTML entity" comes from Standard Generalized Markup Language (SGML), where an entity is a named piece of content that is defined once and can be referenced throughout a document. In modern web development, the term is used as a synonym for a character reference.

When a browser encounters an entity, it replaces the code with the corresponding character. Because certain characters have special meanings in HTML (reserved characters), you must replace them with entities to prevent the browser from mistaking them for markup.
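As a rough illustration of that replacement step, here is a minimal sketch using Python's standard `html` module. The helper name `escape_text` and the sample string are assumptions made for this example; they are not part of any HTML standard.

```python
import html


def escape_text(raw: str) -> str:
    """Replace reserved characters with character references."""
    # quote=True also converts " and ' so the result is safe inside attribute values.
    return html.escape(raw, quote=True)


snippet = 'if a < b && name == "x"'
escaped = escape_text(snippet)
print(escaped)  # if a &lt; b &amp;&amp; name == &quot;x&quot;

# Decoding the references gives back the original text, as a browser would display it.
print(html.unescape(escaped) == snippet)  # True
```

Escaping at output time, rather than when the text is stored, keeps the original characters available for other uses.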

Why HTML Entities matter

  • Avoid broken code. If you use a less-than sign (<) in your text, the browser may think you are starting a new HTML tag and hide the subsequent text.
  • Controlled formatting. Entities like the non-breaking space (&nbsp;) prevent the browser from collapsing multiple spaces into one or breaking a line in a disruptive place.
  • Universal symbol display. They allow the display of currency symbols, mathematical operators, and Greek letters that may not be available on a standard keyboard.
  • Cross-language consistency. HTML5 adopts the XML entities as named character references (Wikipedia), so those symbols render consistently across markup languages.

How HTML Entities work

You can represent a character using either an entity name or a numeric reference. Numeric references use the character's Unicode code point.

  1. Entity Name: Uses a mnemonic name that is easy to remember.
    • Example: &amp; for an ampersand (&).
  2. Decimal Number: Uses a specific number code preceded by #.
    • Example: &#60; for a less-than sign (<).
  3. Hexadecimal Number: Uses a code preceded by #x.
    • Example: &#x3c; for a less-than sign (<).
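The three forms above can be sketched in Python as follows. The function name `numeric_references` is hypothetical; the sketch simply formats a character's Unicode code point in decimal and hexadecimal, then checks that all three forms decode to the same character.

```python
import html


def numeric_references(ch: str) -> tuple[str, str]:
    """Return the decimal and hexadecimal references for one character."""
    code_point = ord(ch)  # Unicode code point, e.g. 60 for "<"
    return f"&#{code_point};", f"&#x{code_point:x};"


decimal_ref, hex_ref = numeric_references("<")
print(decimal_ref, hex_ref)  # &#60; &#x3c;

# The named, decimal, and hexadecimal forms all decode to the same character.
forms = ["&lt;", decimal_ref, hex_ref]
print([html.unescape(form) for form in forms])  # ['<', '<', '<']
```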

In XML and XHTML, the trailing semicolon is mandatory for every reference. HTML parsers tolerate a missing semicolon only for a limited set of legacy names, so you should always include it to ensure interoperability.
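To see why the semicolon matters, the sketch below uses Python's `html.unescape`, which is documented as following the HTML5 rules for character references. It is an illustration of parser behavior under that assumption, not a substitute for testing in a real browser.

```python
import html

# "copy" is one of the legacy names that is still recognized without a semicolon.
print(html.unescape("&copy 2024"))  # © 2024

# "le" is not in that legacy set, so the text is left undecoded.
print(html.unescape("&le 5"))  # &le 5

# With the semicolon, the reference decodes as expected.
print(html.unescape("&le; 5"))  # ≤ 5
```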

Types of HTML Entities

The following table shows the most common reserved characters and symbols used in SEO and content management.

| Character | Description | Name | Number |
|---|---|---|---|
| < | Less than | &lt; | &#60; |
| > | Greater than | &gt; | &#62; |
| & | Ampersand | &amp; | &#38; |
| " | Double quote | &quot; | &#34; |
| ' | Single quote/Apostrophe | &apos; | &#39; |
| © | Copyright sign | &copy; | &#169; |
| ® | Registered trademark | &reg; | &#174; |
| ™ | Trademark | &trade; | &#8482; |
| (space) | Non-breaking space | &nbsp; | &#160; |
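The name and number columns can be cross-checked with a short script. This is a sketch: the dictionary `COMMON_ENTITIES` is a hand-copied version of the table above, and Python's `html.unescape` is assumed to be an acceptable stand-in for a browser's decoder.

```python
import html

# Entity name -> decimal code, copied from the table above.
COMMON_ENTITIES = {
    "lt": 60, "gt": 62, "amp": 38, "quot": 34, "apos": 39,
    "copy": 169, "reg": 174, "trade": 8482, "nbsp": 160,
}

for name, number in COMMON_ENTITIES.items():
    named = html.unescape(f"&{name};")
    numeric = html.unescape(f"&#{number};")
    # Both forms should decode to the character at that code point.
    assert named == numeric == chr(number)
    print(f"&{name};  &#{number};  U+{number:04X}")
```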

Best practices

  • Always include the semicolon. Even if a browser renders an entity without one, omitting it can cause parsing errors in XML-based systems and ambiguous results in HTML when the following characters continue the name.
  • Use entity names for readability. Mnemonic names like &copy; are easier for humans to read and audit in source code than numeric codes like &#169;.
  • Use non-breaking spaces for units. Keep measurements or times together (e.g., 10&nbsp;PM or 100&nbsp;km/h) to prevent confusing line breaks.
  • Prefer standard characters for Dutch ligatures. For words using "ij," use the separate letters "i" and "j" instead of the &ijlig; entity, as the latter is discouraged.
  • Apply diacritical marks for rare accents. Use constructs like a&#768; (the letter a followed by a combining grave accent) to produce characters not present in your standard character set.
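The combining-mark construct in the last point can be inspected with Python's standard library. This sketch assumes the reference is decoded with `html.unescape` and then normalized with `unicodedata`; a browser renders the two-character sequence directly without needing the normalization step.

```python
import html
import unicodedata

# "a" followed by U+0300 (combining grave accent, decimal 768).
decoded = html.unescape("a&#768;")
print(len(decoded))  # 2 (base letter plus combining mark)

# NFC normalization merges the pair into the single precomposed character U+00E0.
composed = unicodedata.normalize("NFC", decoded)
print(composed == "\u00e0", len(composed))  # True 1
```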

Common mistakes

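Mistake: Treating entity names as case insensitive. Fix: Copy the exact casing from the standard. Most names are lowercase, and casing can change the character: &eacute; is é, while &Eacute; is É. HTML5 defines a few uppercase aliases such as &LT; and &COPY;, but XML parsers reject them, so prefer the lowercase forms.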

Mistake: Using entities for common letters. Fix: Only use entities for characters that are reserved, not on your keyboard, or require special formatting. Overusing entities makes your HTML harder to read.

Mistake: Assuming every HTML version defines &quot;. Fix: &quot; was omitted from the HTML 3.2 specification but restored as of HTML 4.0 (Wikipedia). Modern browsers handle &quot; without issue, but confirm that legacy systems recognize it if you work with very old web archives.

Mistake: Using multiple &nbsp; to create page margins. Fix: Use CSS for layout and margins. Only use &nbsp; to add "real" spaces within text where the browser would normally collapse them.

Examples

Example scenario: Displaying code on a blog. To show the code <a> on a webpage without the browser creating a link, write: &lt;a&gt;

Example scenario: Legal footers. To display a copyright notice with the year, write: Copyright &copy; 2024 Your Company

Example scenario: Mathematical formulas. To write "Angle A is less than or equal to 90 degrees," use: &ang; A &le; 90&deg;
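The three scenarios above can be checked by decoding each snippet. This is a sketch that assumes Python's `html.unescape` mirrors what a browser would display; the `examples` dictionary is made up for this check.

```python
import html

# Source markup -> text the reader should see.
examples = {
    "&lt;a&gt;": "<a>",
    "Copyright &copy; 2024 Your Company": "Copyright \u00a9 2024 Your Company",
    "&ang; A &le; 90&deg;": "\u2220 A \u2264 90\u00b0",
}

for markup, expected in examples.items():
    rendered = html.unescape(markup)
    assert rendered == expected
    print(f"{markup!r} -> {rendered!r}")
```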

FAQ

What is a non-breaking space? A non-breaking space (&nbsp;) serves two purposes. First, it prevents the browser from breaking a line between two words. Second, it prevents the browser from collapsing multiple spaces. If you type ten normal spaces, the browser displays only one. Using &nbsp; ensures all ten spaces appear.
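The collapsing behavior can be approximated in a few lines. This is a simplified model, not a rendering engine: the hypothetical helper `collapse_ascii_whitespace` only merges runs of ordinary whitespace, which is enough to show that non-breaking spaces survive.

```python
import html
import re


def collapse_ascii_whitespace(text: str) -> str:
    """Merge runs of ordinary whitespace into one space, roughly as a browser does."""
    # Python's \s also matches U+00A0, so the collapsible characters are listed explicitly.
    return re.sub(r"[ \t\n\r\f]+", " ", text)


plain = collapse_ascii_whitespace("10" + " " * 10 + "PM")
print(repr(plain))  # '10 PM'

# Ten non-breaking spaces are not ordinary whitespace, so all of them remain.
kept = collapse_ascii_whitespace(html.unescape("10" + "&nbsp;" * 10 + "PM"))
print(kept.count("\u00a0"))  # 10
```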

How do I choose between an entity name and a number? Entity names are easier to remember and read. However, not every Unicode character has a name. Numeric references (decimal or hexadecimal) are available for almost every valid character in the character set.
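One way to apply this rule automatically is to prefer a name when one exists and fall back to a numeric reference otherwise. The sketch below uses `html.entities.codepoint2name`, which covers the HTML 4 entity names only; the function name `to_reference` is an assumption for this example.

```python
from html.entities import codepoint2name


def to_reference(ch: str) -> str:
    """Return a named reference if one is known, else a hexadecimal reference."""
    code_point = ord(ch)
    name = codepoint2name.get(code_point)  # HTML 4 names only
    if name is not None:
        return f"&{name};"
    return f"&#x{code_point:X};"


print(to_reference("\u00a9"))  # &copy;
print(to_reference("\u2603"))  # &#x2603; (the snowman has no entity name)
```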

Are HTML entities case sensitive? Yes. You must use the exact casing specified in the standard. For example, &Eacute; produces É while &eacute; produces é, and the name for a middle dot is written in lowercase as &middot;.
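A quick check shows how casing selects a different character. This sketch relies on Python's `html.unescape`, which uses the HTML5 list of named references; the `pairs` list is illustrative only.

```python
import html

# Same letters, different casing, different characters.
pairs = [
    ("&Eacute;", "\u00c9"),  # É, capital E with acute
    ("&eacute;", "\u00e9"),  # é, small e with acute
    ("&Alpha;", "\u0391"),   # Greek capital alpha
    ("&alpha;", "\u03b1"),   # Greek small alpha
]

for reference, expected in pairs:
    assert html.unescape(reference) == expected
    print(reference, "->", html.unescape(reference))
```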

Can I create my own HTML entities? No. HTML5 does not let authors define additional entities, because HTML documents are no longer parsed against a Document Type Definition (DTD). In XHTML, you can technically define them in a DTD, but this is not recommended for general web use.

What are diacritical marks? These are marks added to letters, such as the grave ( ` ) or acute ( ´ ) accent. You can combine them with ordinary letters to create accented characters that may not exist as a single character in your encoding.
