
Ngram Viewer: Research Linguistic & Cultural Trends

Use Ngram Viewer to chart word frequencies in digitized books. Discover cultural trends, configure search settings, and interpret historical data.

Entity Tracking:

  • Ngram Viewer: An online search engine that charts the frequency of specific search strings appearing in digitized books over time.
  • N-gram: A contiguous sequence of $n$ items from a text, such as a single word (1-gram) or a phrase like "nursery school" (2-gram).
  • Corpus: A specialized collection of digital texts used for linguistic analysis, such as "American English" or "English Fiction."
  • Culturomics: The study of cultural trends through the high-volume quantitative analysis of digitized texts.
  • Optical Character Recognition (OCR): Technology that converts scanned book pages into machine-readable text, occasionally introducing errors.
  • Smoothing: A graph setting that averages data over a range of years to make long-term trends easier to identify.
  • Wildcards: Search characters, such as asterisks, used to represent unknown words or varied phrasing in a query.

The Google Books Ngram Viewer is a search engine that graphs the frequency of words and phrases used in printed sources. By analyzing millions of books, it allows you to visualize linguistic and cultural changes over several centuries. Marketers and researchers use it to identify when terms entered the public lexicon or to find historical shifts in consumer interests.

What is Ngram Viewer?

Google software engineers and Harvard researchers developed the tool to open a new window into quantitative research. It was [released on December 16, 2010] (Wikipedia), providing a way to quantify the rate of linguistic change.

The tool operates on a massive scale. At its launch, the [database contained 500 billion words from 5.2 million books] (Huffington Post). The search engine [charts frequencies of search strings in sources published between 1500 and 2022] (Wikipedia). While the tool serves a scholarly audience, its simple interface makes it accessible for anyone looking to browse cultural trends throughout history.

Why Ngram Viewer matters

The tool has heuristic value: it helps you generate fresh research questions rather than offering definitive proof.

  • Track term popularity: See how specific keywords or brand-related terms have risen or fallen in usage over centuries.
  • Identify cultural origins: Discover when a concept first came into vogue. For instance, the word "embarrassing" started a significant rise in usage around 1750.
  • Compare linguistic trajectories: View multiple terms on one graph to see which one dominated a specific era.
  • Access direct sources: You can click the dates below the graph to view the specific books where your search terms appeared.

How Ngram Viewer works

The process for generating a chart is straightforward and relies on basic search inputs:

  1. Enter search terms: Type words or phrases into the search box, separating each with a comma.
  2. Select the corpus: Choose the language or specific regional database, such as British English or American English.
  3. Set the time frame: Define the year range for your search, starting as early as 1500.
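  4. Adjust smoothing: Set the smoothing level (commonly 0 to 5) to choose between raw yearly fluctuations and a moving average. A smoothing of 0 shows the raw data; a smoothing of 1 averages each year with one year on either side.
  5. View the Y-axis: The graph shows how often a term appears as a percentage of all n-grams of the same length in the selected corpus for that year (for a single word, a percentage of all words).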
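The smoothing setting in step 4 can be pictured as a centered moving average. The sketch below is illustrative only: the function name `smooth` and the sample values are made up, and shrinking the window at the edges of the series is an assumption rather than a documented detail of the viewer.

```python
def smooth(values, smoothing):
    """Centered moving average over a list of yearly values.

    Each point is averaged with up to `smoothing` neighbours on either
    side. Near the ends of the series the window simply shrinks
    (an assumption made for this sketch).
    """
    if smoothing < 0:
        raise ValueError("smoothing must be non-negative")
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - smoothing)
        end = min(len(values), i + smoothing + 1)
        window = values[start:end]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Made-up yearly percentages with sharp spikes.
yearly = [0.0, 3.0, 0.0, 6.0, 0.0]

print(smooth(yearly, 0))  # smoothing 0 leaves the raw values unchanged
print(smooth(yearly, 1))  # smoothing 1 flattens the spikes
```

A higher smoothing value trades year-level detail for a clearer long-term shape, which is why it helps with sparse early data.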

To ensure data quality, [matches are only displayed if they are found in at least 40 books] (Google Ngram Info).
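To make the Y-axis and the 40-book threshold concrete, here is a minimal sketch on a toy corpus. The function names, the toy books, and the decision to apply the threshold across the whole corpus (rather than per year) are assumptions for illustration; this is not Google's implementation.

```python
from collections import Counter, defaultdict


def ngrams(tokens, n):
    """Return every contiguous run of n tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def yearly_percentages(books, phrase, min_books=40):
    """Percentage of same-length n-grams per year that equal `phrase`.

    books: iterable of (year, list_of_tokens) pairs (toy data).
    Returns {} when the phrase occurs in fewer than `min_books` books,
    mirroring the display threshold described above.
    """
    n = len(phrase.split())
    matches = defaultdict(int)
    totals = defaultdict(int)
    books_with_phrase = 0
    for year, tokens in books:
        counts = Counter(ngrams(tokens, n))
        totals[year] += sum(counts.values())
        matches[year] += counts[phrase]
        if counts[phrase] > 0:
            books_with_phrase += 1
    if books_with_phrase < min_books:
        return {}
    return {
        year: 100.0 * matches[year] / totals[year]
        for year in sorted(totals)
        if totals[year] > 0
    }


toy_books = [
    (1950, "the nursery school opened".split()),
    (1950, "a school for all".split()),
    (1951, "nursery school nursery school".split()),
]

# Threshold lowered to 2 so the three-book toy corpus produces output.
print(yearly_percentages(toy_books, "nursery school", min_books=2))
# With the default threshold of 40 nothing is returned.
print(yearly_percentages(toy_books, "nursery school"))
```

Because the denominator is every 2-gram published that year, the value is a share of the corpus, not a raw count.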

Best practices

Use wildcards for flexibility. If you want to see what words frequently follow a term, use wildcards to uncover common associations in the literature.
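As a rough picture of what a wildcard query does, the sketch below ranks the phrases that match a pattern in which `*` stands for exactly one word. The counts are invented and the helper `top_substitutions` is hypothetical; it illustrates the idea and is not the viewer's own logic.

```python
from collections import Counter


def top_substitutions(ngram_counts, pattern, k=10):
    """Return the k most frequent n-grams matching `pattern`.

    A '*' in the pattern matches exactly one word; every other token
    must match literally.
    """
    parts = pattern.split()
    hits = Counter()
    for ngram, count in ngram_counts.items():
        words = ngram.split()
        if len(words) != len(parts):
            continue
        if all(p == "*" or p == w for p, w in zip(parts, words)):
            hits[ngram] += count
    return hits.most_common(k)


# Invented counts, for illustration only.
counts = {
    "embarrassing situation": 120,
    "embarrassing moment": 90,
    "embarrassing question": 45,
    "awkward moment": 70,
}

for phrase, count in top_substitutions(counts, "embarrassing *", k=2):
    print(phrase, count)
```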

Peruse the underlying books. Use the links provided at the bottom of the page to see the context of your keywords in Google Books. This helps you understand if a word's meaning has changed over time.

Compare related terms. Do not look at a single word in isolation. Graphing a term alongside its synonyms (e.g., comparing "shameful" and "mortifying") provides a more complete picture of usage trends.

Cross-reference with etymological tools. Because Ngram Viewer is not an etymological dictionary, use the Online Etymology Dictionary to confirm the earliest known use of an English word.

Common mistakes

Mistake: Treating the results as an exact count of word popularity in modern speech.
Fix: Remember that the data is limited to what Google has scanned. The corpus is subject to selection bias, and scientific literature makes up a growing share of it over time.

Mistake: Ignoring OCR errors in older texts.
Fix: Be cautious with texts from before 1800, where [OCR technology often confuses the "long s" (ſ) with the letter "f"] (Wired).
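One way to check for this problem is to chart the likely misreading next to the real word. The sketch below builds such a query under a deliberately simplified rule: every lowercase "s" except a word-final one is treated as a long s and therefore misread as "f". Historical typesetting rules were more varied, and the helper names are hypothetical.

```python
def long_s_misreading(word):
    """Return the spelling OCR might produce if each long s became 'f'.

    Simplified assumption: every lowercase 's' except a word-final one
    was printed as a long s.
    """
    if not word:
        return word
    body, last = word[:-1], word[-1]
    return body.replace("s", "f") + last


def comparison_query(word):
    """Build a comma-separated query charting the word and its misreading."""
    variant = long_s_misreading(word)
    return word if variant == word else f"{word},{variant}"


print(comparison_query("best"))     # best,beft
print(comparison_query("success"))  # success,fuccefs
print(comparison_query("is"))       # is
```

If the "f" spelling dominates before 1800 and then collapses, the apparent rise of the modern spelling is probably an OCR artifact rather than a real change in usage.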

Mistake: Assuming all language data is equally reliable.
Fix: Note that some languages have much smaller datasets. For example, [frequencies for Chinese may only be accurate from 1970 onward] (Wikipedia), with earlier data containing significant noise.

Mistake: Relying on outdated data for very recent trends.
Fix: While Google [updated the viewer in July 2020 with data through 2019] (Search Liaison), it does not provide real-time web search data.

Examples

Example scenario (Cultural Trend): A researcher wants to know when the concept of "nursery school" became common. By entering the term and setting the date range from 1800 to 2000, they can identify the decade in which the phrase's frequency began to climb.

Example scenario (Contextual Shift): A writer investigating the word "embarrassing" uses the tool to see its adoption in the late eighteenth century. By clicking on the 1750–1800 date range below the graph, they can read the specific passages from that era to see if the word meant the same thing then as it does today.

FAQ

What languages does Ngram Viewer support?

The tool supports searches in English, Chinese (Simplified), French, German, Hebrew, Italian, Russian, and Spanish. It also offers specialized corpora for American English, British English, and English Fiction.

How often is the data updated?

The data is updated periodically. The most recent major update mentioned [occurred in 2020, bringing the dataset forward to include books from 2019] (Search Liaison).

Is the Ngram Viewer the same as Google Trends?

No. Ngram Viewer tracks words in printed books over centuries, while Google Trends tracks terms people are currently typing into the Google search engine. Ngram Viewer is historical and literary; Google Trends is modern and behavior-based.

Why does the graph start to decline for some words even if they are still popular?

This can happen because of a surge in scientific or specialized literature in the corpus. As the total volume of words in the database increases with new technical books, the relative percentage of common cultural words may appear to decline even if their absolute usage remains high.
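A small worked example shows the effect. The numbers are invented: the term is used exactly as often in both years, but the corpus grows fourfold.

```python
def percentage(term_count, total_words):
    """Share of the corpus taken up by a term, as a percentage."""
    return 100.0 * term_count / total_words


# Invented figures: same absolute usage, much larger corpus later on.
early = percentage(term_count=1_000, total_words=1_000_000)
late = percentage(term_count=1_000, total_words=4_000_000)

print(early)  # 0.1
print(late)   # 0.025
```

The plotted line falls to a quarter of its earlier height even though the term appears the same number of times.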

Can I export the data for my own research?

Yes. The tool allows you to download the raw data or share the visualization via social media and iframes for embedding on other websites.
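Once downloaded, the raw data can be read with a few lines of code. This sketch assumes each line holds four tab-separated fields (n-gram, year, match count, volume count), which is how one earlier release of the datasets was laid out. Check the format notes for the version you download, as other releases differ. The sample rows are invented.

```python
import csv
import io


def read_ngram_rows(lines):
    """Yield (ngram, year, match_count, volume_count) tuples.

    Assumes tab-separated lines with exactly four fields; lines that
    do not fit that layout are skipped.
    """
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 4:
            continue
        ngram, year, match_count, volume_count = row
        yield ngram, int(year), int(match_count), int(volume_count)


# Invented sample standing in for a downloaded file.
sample = io.StringIO(
    "nursery school\t1950\t120\t45\n"
    "nursery school\t1951\t150\t52\n"
)

rows = list(read_ngram_rows(sample))
for ngram, year, match_count, volume_count in rows:
    print(year, ngram, match_count, volume_count)
```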
