BERT and MUM: How Google’s AI Models Understand Search Queries


Introduction: The Rise of AI-Powered Search Understanding

Google’s AI models BERT and MUM represent two of the most significant advances in how search engines understand human language and search intent. These models power much of the intelligence behind modern search results, including how pages are evaluated, ranked, and presented to users. Understanding how BERT and MUM work helps SEO professionals create content that aligns with Google’s most sophisticated evaluation systems, and crucially, with the way real people search today.

Search is no longer about matching keywords. It is about matching meaning. When someone types “best way to fix a leaking pipe without calling a plumber,” they are not just looking for pages that contain those exact words. They are looking for a solution to a problem, delivered in a way that is practical, actionable, and trustworthy. BERT and MUM are the systems that enable Google to bridge this gap between raw queries and genuine user needs.

This guide explains both models in detail, their architectural differences, their impact on search, and what you should do differently as a content strategist or SEO practitioner in the age of semantic search.

1. The Context: How Google Search Evolved Before BERT and MUM

To understand why BERT and MUM matter, you need to understand what came before them. For most of its early history, Google operated as a keyword-driven retrieval engine. You typed words, Google found pages containing those words, and rankings were determined largely by link authority and on-page keyword signals.

The first major shift toward semantic understanding came with Hummingbird in 2013. Hummingbird was designed to interpret natural language queries and understand the meaning behind search terms rather than just matching strings. It began the process of moving Google from “strings to things.”

In 2015, RankBrain introduced machine learning into the ranking mix. RankBrain could analyze new or unusual queries and map them to known concepts, even when the exact phrasing had never been seen before. This mattered because roughly 15% of the queries Google sees each day are entirely new. RankBrain is widely believed to learn from aggregated user-interaction signals, though Google has never confirmed the specifics.

In 2018, Neural Matching enhanced Google’s ability to understand how words relate to concepts, using neural networks to match queries to pages that covered the same topic even when different vocabulary was used. This was an important precursor to BERT’s deeper contextual analysis.

Then came 2019, when BERT changed everything, followed in 2021 by MUM, which expanded the boundaries of what search AI could do. These two models sit at the core of Google’s current search intelligence, and they are the focus of everything that follows.

2. BERT: Bidirectional Encoder Representations from Transformers

2.1 What Is BERT?

BERT, which stands for Bidirectional Encoder Representations from Transformers, was introduced by Google researchers in October 2018 and deployed in Google Search in October 2019. It represented a fundamental rethinking of how machines process language.

The defining innovation of BERT is bidirectionality. Earlier language models processed text in one direction, reading words sequentially from left to right (or right to left) and building understanding incrementally as they went. This meant that a word’s meaning was derived only from the words that preceded it, not from the full sentence context.

BERT reads the entire sentence simultaneously. It considers every word in relation to every other word, capturing dependencies that unidirectional models simply cannot detect. This is what makes BERT so effective at understanding nuance, ambiguity, and implied meaning.

2.2 How BERT Works Technically

BERT is built on the Transformer architecture, specifically using the encoder portion of the Transformer. The Transformer, introduced in the landmark 2017 paper “Attention Is All You Need,” revolutionized NLP by replacing recurrent processing with a parallel attention-based mechanism.

Three key technical components define BERT:

Self-Attention Mechanism: Every word in a sentence is compared against every other word to compute relevance scores. In the sentence “The animal didn’t cross the street because it was too tired,” a traditional model might not know whether “it” refers to the animal or the street. BERT’s attention mechanism computes the relationship between “it” and every other word, correctly linking “it” to “animal” through the context provided by “tired.”
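
To make the mechanism concrete, here is a toy NumPy sketch of the scaled dot-product attention formula at the heart of the Transformer. The tiny matrices are invented for illustration; real BERT uses learned projections, many attention heads, and hundreds of dimensions.

```python
# A toy sketch of scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V.
# The matrices below are invented for illustration, not real BERT weights.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # relevance of every word to every other word
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V  # each output row is a context-weighted mixture

# three "words", each represented by a 4-dimensional vector
x = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0, 0.0]])
print(scaled_dot_product_attention(x, x, x))
```

In self-attention, the same sequence supplies the queries, keys, and values, which is why `x` is passed three times.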

Masked Language Modeling (MLM): During training, BERT randomly masks 15% of the tokens in a sentence and learns to predict them from the surrounding context. This forces the model to develop deep bidirectional understanding. If you mask the word “bank” in “I sat on the river [MASK] and ate lunch,” BERT must use the surrounding words, especially “river” and “sat on,” to determine that the masked word refers to a shoreline rather than a financial institution.
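
You can see this objective in action with the open-source bert-base-uncased checkpoint. The following is a minimal sketch using the Hugging Face transformers library, an illustration of the training task rather than Google’s production system.

```python
# A minimal sketch of masked-token prediction with a public BERT checkpoint.
# Requires: pip install transformers torch
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")

# BERT fills in the blank using context from both directions.
for prediction in unmasker("I sat on the river [MASK] and ate my lunch."):
    print(prediction["token_str"], round(prediction["score"], 3))
```

The top predictions should favor the shoreline sense of the sentence, because the model weighs the full surrounding context when filling the blank.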

Next Sentence Prediction (NSP): BERT is also trained to predict whether two sentences logically follow each other. This helps the model understand relationships between sentences and paragraphs, which is critical for evaluating content coherence.

Google fine-tuned BERT specifically for search, training it on massive volumes of query and document pairs so that it could learn to match complex, conversational queries to the most relevant pages even when the exact keywords did not appear on those pages.
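
Google’s fine-tuned search models are not public, but the effect is easy to approximate with an open-source analogue. This sketch uses the sentence-transformers library and a small BERT-derived encoder to score how well candidate passages match a conversational query even with little keyword overlap; the model choice and passages are illustrative assumptions, not Google’s system.

```python
# A sketch of semantic query-to-passage matching with a public encoder.
# Requires: pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # small BERT-derived encoder

query = "can you get medicine for someone pharmacy"
passages = [
    "How to pick up a prescription on behalf of a family member.",
    "Pharmacy opening hours and store locations near you.",
]

query_embedding = model.encode(query, convert_to_tensor=True)
passage_embeddings = model.encode(passages, convert_to_tensor=True)
print(util.cos_sim(query_embedding, passage_embeddings))
```

The first passage should score higher despite sharing few exact words with the query, which is the behavior this kind of fine-tuning is designed to produce.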

2.3 BERT’s Real Impact on Search Queries

Google reported that BERT affected approximately 10% of all English-language search queries at launch, making it one of the largest single improvements in Search history. The queries that benefited most were longer, more conversational searches whose meaning hinges on prepositions and qualifiers.

Consider a query like “2019 brazil traveler to usa need a visa.” Before BERT, Google might have interpreted this as a general search about travel to the USA and shown generic visa information pages. After BERT, Google understands that the critical word is “to” — a Brazilian traveler is going to the USA, and the query is asking whether that traveler needs a US visa. The preposition “to” completely changes the meaning, and BERT captures that distinction.

Another classic example: “can you get medicine for someone pharmacy.” Without BERT, Google might show results about how to get medicine at a pharmacy generally. With BERT, it understands the nuance: the searcher is asking whether you can pick up someone else’s prescription. This is a fundamentally different intent, and BERT routes the query to pages that address that specific scenario.

2.4 What BERT Means for Content Creation

BERT rewards content that communicates clearly and naturally. Here is what that means in practical terms for anyone writing for search:

Write for humans first. BERT is designed to understand natural language, so unnatural keyword-stuffed prose does not perform better. It often performs worse because it obscures the meaning that BERT is trying to extract. If your content reads like it was written by a person trying to help another person, you are on the right track.

Resolve ambiguity explicitly. If your topic contains terms with multiple meanings, provide clear contextual clues early. If you are writing about a “bank” in the context of fishing, mention the river, the water, or the shoreline in the first paragraph. This helps BERT classify your content correctly and match it to the right queries.

Answer related questions within the same page. Since BERT evaluates how comprehensively a page addresses a topic, content that naturally incorporates answers to semantically related questions will be evaluated as more complete. Do not just answer the primary query. Anticipate what the reader wants to know next.

Use precise language. Prepositions, conjunctions, and qualifiers are not filler words. They carry significant meaning that BERT analyzes. “Best running shoes for flat feet” is different from “best running shoes with flat soles.” Be precise in how you phrase your headings and body text because BERT is paying attention to those distinctions.

Structure is signal. Clear heading hierarchies, logical paragraph flow, and well-organized information send strong structural signals. BERT can parse these signals to understand which sections cover which subtopics, improving how your content is matched to long-tail queries.

3. MUM: Multitask Unified Model

3.1 What Is MUM?

MUM, the Multitask Unified Model, was announced by Google in May 2021 and represents the next major evolution in search AI. Google described MUM as being 1,000 times more powerful than BERT, and its capabilities extend far beyond what BERT was designed to do.

MUM’s name reveals its three core design characteristics:

Multitask: MUM is trained to perform many different types of tasks simultaneously. It can understand language, generate language, translate between languages, analyze images, and reason across multiple information sources, all within a single unified model. This is fundamentally different from earlier models that were typically trained for one specific task at a time.

Unified: Rather than being a collection of separate models working in sequence, MUM is a single integrated system. This unified architecture allows it to combine insights from different modalities and languages in ways that separate systems cannot.

Model: MUM uses the T5 (Text-to-Text Transfer Transformer) framework, which treats every NLP problem as a text generation task. Whether the input is a question, an image, or a multilingual document, MUM processes and generates text output. This uniform approach simplifies the architecture while enabling extraordinary versatility.

3.2 MUM’s Three Defining Capabilities

Multimodality: Understanding Across Formats

MUM is natively multimodal. It can process and understand information across text, images, and video simultaneously, and Google has indicated that future expansions will include audio and other modalities.

This means MUM can evaluate a search query that references visual information and assess whether a page’s images, infographics, and video thumbnails match the intent behind that query. If someone searches “how to tie a bowline knot,” MUM can evaluate whether a page contains a clear step-by-step illustration in addition to textual instructions.

Google’s illustrative example is telling: in the future, you might take a photo of your hiking boots and ask “can I use these to hike Mt. Fuji?” MUM would analyze the image, understand what type of boots they are, connect that understanding to what it knows about the conditions on Mt. Fuji, and provide a reasoned answer.

Multilingual Knowledge Transfer

MUM is trained across 75 languages simultaneously, and it can transfer knowledge between languages. If MUM learns something from content published in Japanese, it can apply that understanding when answering queries in English, Spanish, or Hindi.

This breaks down one of the most significant barriers in global information access. High-quality content written in languages with smaller web presences can now inform search results for queries in widely spoken languages. For website owners, this means that content about a specific topic published in any language may contribute to that topic’s global knowledge graph and influence rankings across languages.
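
A multilingual embedding model makes this idea tangible. The sketch below again uses the open-source sentence-transformers library as an illustrative stand-in (MUM itself is not publicly available) to show that the same statement in two languages lands close together in a shared vector space.

```python
# A sketch of cross-lingual representation with a public multilingual encoder.
# Requires: pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

english = "Mount Fuji gets a lot of rain in the fall."
japanese = "富士山は秋に雨が多い。"  # roughly the same statement in Japanese
unrelated = "The stock market closed higher today."

embeddings = model.encode([english, japanese, unrelated], convert_to_tensor=True)
print(util.cos_sim(embeddings[0], embeddings[1]))  # high: same meaning
print(util.cos_sim(embeddings[0], embeddings[2]))  # low: unrelated meaning
```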

Multi-Step Reasoning

MUM can handle complex queries that require analyzing multiple pieces of information and drawing reasoned conclusions. This is the capability that most dramatically expands what a search engine can do.

Google’s canonical example asks MUM to handle the query: “I hiked Mount Adams and now I want to hike Mount Fuji next fall. What should I do differently to prepare?” Answering this requires understanding both mountains, comparing their characteristics (elevation, climate, terrain difficulty, seasonal conditions), and identifying specific differences relevant to preparation.

MUM can understand that both mountains are roughly the same elevation but that fall is the rainy season on Mt. Fuji, so the hiker might need waterproof gear. It can surface subtopics such as fitness recommendations, gear comparisons, and permit requirements, drawing from diverse sources across the web.

Google noted that users currently issue an average of eight separate queries to answer complex questions like this. MUM aims to collapse those eight searches into one comprehensive result.

3.3 MUM’s Technical Architecture

MUM is built on the T5 (Text-to-Text Transfer Transformer) framework, which treats every task as receiving text input and producing text output. This unified approach simplifies what would otherwise be a complex pipeline of separate models.

T5 uses an encoder-decoder Transformer architecture, which differs from BERT’s encoder-only design. The encoder processes the input (text, image descriptions, or multimodal tokens) to create a rich internal representation. The decoder then generates output text from that representation. This generation capability is something BERT, as an encoder-only model, cannot do.
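
MUM itself is proprietary, but the T5 framework it builds on is public. Here is a minimal sketch of the text-in, text-out interface using the t5-small checkpoint from Hugging Face; the task prefix convention is T5’s, and nothing here reflects MUM’s actual scale or multimodal training.

```python
# A minimal sketch of T5's text-to-text interface with a public checkpoint.
# Requires: pip install transformers torch sentencepiece
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Every task is phrased as text; here, a translation task prefix.
inputs = tokenizer("translate English to German: The house is wonderful.",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Summarization, classification, and question answering all use this same interface with different prefixes, which is the uniformity the “text-to-text” name refers to.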

The scale difference is enormous. While BERT-large has approximately 340 million parameters, MUM operates at a scale that Google describes as 1,000 times more powerful, leveraging massive training datasets across 75 languages and multiple content types.

This scale enables MUM to develop what Google calls a “more comprehensive understanding of information and world knowledge.” MUM does not just match queries to text. It understands concepts, relationships, entities, and context at a level that approaches how a knowledgeable human would process information.

4. BERT vs. MUM: A Side-by-Side Comparison

While both BERT and MUM are AI models that improve Google’s language understanding, they were designed for different eras of search with different capabilities and ambitions. Understanding the differences helps you prioritize your content strategy effectively.

| Feature | BERT (2019) | MUM (2021) |
| --- | --- | --- |
| Architecture | Transformer encoder-only | T5 encoder-decoder framework |
| Scale | ~340 million parameters | 1,000x more powerful than BERT |
| Input Modalities | Text only | Text, images, video (audio in future) |
| Language Support | English at launch, expanded to ~70 languages | 75 languages from launch, with cross-lingual transfer |
| Primary Strength | Contextual word understanding within sentences | Multi-step reasoning, cross-format analysis, knowledge transfer |
| Generation Capability | No (encoder-only, cannot generate text) | Yes (encoder-decoder, can generate text) |
| Training Approach | Masked language modeling + next sentence prediction | Multitask learning across many objectives simultaneously |
| SERP Impact | Better featured snippets, improved query matching | Richer SERPs, visual results, cross-lingual sources, topic exploration features |
| Query Types Handled | Conversational, preposition-sensitive queries | Complex multi-intent queries requiring research and comparison |
| SEO Implication | Write naturally, structure clearly, resolve ambiguity | Build comprehensive, multimodal, multilingual content ecosystems |

The most important thing to understand is that MUM does not replace BERT. These models serve different functions within Google’s search system. BERT remains the workhorse for day-to-day query understanding and content evaluation. MUM is deployed for the more complex tasks that require its advanced capabilities. They operate in parallel, complementing each other.

5. What Both Models Mean for SEO Strategy

5.1 The Fundamental Shift: From Keywords to Concepts

The progression from keyword-density ranking to BERT to MUM signals a clear and irreversible direction. Search engines are getting better at understanding meaning, and the distance between what a user types and what they truly want is shrinking with each new model iteration.

SEO strategies that succeed in this environment do not try to outsmart algorithms with mechanical tricks. They align with what the algorithms are trying to do: serve the best possible answer to the user’s question. This alignment is the single principle that unifies all effective modern SEO.

5.2 Content Depth Over Content Volume

Publishing many shallow pages targeting individual keywords is a strategy that BERT and MUM penalize implicitly. These models evaluate whether a single page comprehensively covers a topic. A page that addresses all the questions a user might have about a subject will, all else being equal, outrank five thin pages that each address only one narrow question.

Your content planning should start from topic clusters, not keyword lists. For every core topic you cover, ask: what would a user need to know before, during, and after engaging with this topic? Answer those questions within the same content ecosystem, using internal links to connect related pieces.

5.3 Entity-Based Content Architecture

Both BERT and MUM excel at understanding entities, which are the people, places, things, and concepts that give information its structure. Your content should explicitly define and connect entities within your topic area.

When you write about a subject, identify the key entities involved, use consistent terminology, and provide clear definitions for each. If you are writing about “BERT,” for example, explicitly connect it to related entities such as “Transformer architecture,” “natural language processing,” “attention mechanism,” “Google Search,” and “RankBrain.” This entity web helps Google understand the semantic territory your content covers.
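
To audit which entities a draft already names, a general-purpose NER model is a quick first pass. This sketch uses the open-source spaCy library and its small English model as an illustrative tool; Google’s entity understanding is far more sophisticated than any off-the-shelf tagger.

```python
# A sketch: list the named entities a draft mentions, using spaCy.
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
draft = ("BERT was introduced by Google in 2018 and builds on the "
         "Transformer architecture described in 2017.")

for ent in nlp(draft).ents:
    print(ent.text, ent.label_)  # each detected entity and its type
```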

Schema markup supports this by formally declaring entity relationships. Using Article, FAQ, HowTo, and Organization schema types gives BERT and MUM structured signals about what your content contains and how it connects to the broader knowledge graph.
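
As a sketch of what such a declaration looks like, here is one way to generate an Article schema block as JSON-LD; the author name is a placeholder, and the property selection is one reasonable choice from schema.org, not a required set.

```python
# A sketch: emitting Article schema as JSON-LD for embedding in a page.
import json

article_schema = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "BERT and MUM: How Google's AI Models Understand Search Queries",
    "author": {"@type": "Person", "name": "Jane Doe"},  # placeholder author
    "about": [  # entities the article covers
        {"@type": "Thing", "name": "BERT"},
        {"@type": "Thing", "name": "Multitask Unified Model"},
    ],
}

# Embed the output in a <script type="application/ld+json"> tag on the page.
print(json.dumps(article_schema, indent=2))
```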

5.4 Optimizing for Multimodal Discovery

MUM’s multimodality means that the images, videos, infographics, and diagrams on your pages are not decorative. They are content that MUM evaluates alongside your text. Here is how to ensure your visual content supports your SEO:

Descriptive file names and alt text: Every image should have a file name and alt attribute that describes what the image shows and how it relates to the page topic. Do not use “IMG_4721.jpg.” Use “bert-bidirectional-encoder-architecture-diagram.jpg.”

Video transcripts and captions: If you embed videos, include full transcripts on the page. MUM can process the transcript text to understand what the video covers, even if it cannot directly watch the video.

Infographics with surrounding context: Infographics are powerful, but MUM needs text to understand them. Always surround your infographics with explanatory paragraphs that describe the data, key takeaways, and conclusions shown in the visual.

Unique, original visuals: Stock photos that do not add real information may not hurt you, but they do not help you with MUM either. Original diagrams, charts, screenshots, and illustrations that contribute substantive information to the page are what MUM values.

5.5 E-E-A-T Signals Matter More Than Ever

BERT and MUM do not directly measure Experience, Expertise, Authoritativeness, and Trustworthiness (E-E-A-T). But they enhance Google’s ability to evaluate whether content actually demonstrates these qualities. A page that claims to be authoritative but contains factual errors, vague language, or contradictory information is more likely to be identified as low-quality by these models.

Demonstrate expertise by citing sources, linking to primary research, including author credentials, and acknowledging the limitations of your claims. Write with the precision and depth that signals genuine knowledge rather than surface-level content aggregation.

5.6 Internal Linking and Topical Clusters

BERT and MUM evaluate pages not in isolation but as part of the website’s broader topical structure. A strong internal linking architecture that groups related pages together helps these models understand that your site is a comprehensive resource on a subject.

Every piece of content should link to other relevant pages on your site. These internal links should use descriptive anchor text that clearly signals what the linked page covers. “Click here” tells BERT nothing. “Our guide to BERT’s attention mechanism” tells it exactly what to expect.

Build hub pages that serve as comprehensive overviews of broad topics, with spoke pages that dive deep into specific subtopics. This hub-and-spoke structure is exactly what BERT and MUM are optimized to understand and reward.

6. Practical Content Optimization Checklist for BERT and MUM

Here is a concrete checklist you can apply to any piece of content to ensure it is optimized for Google’s AI-driven search models:

1. Check for natural language. Read your content aloud. Does it sound like a human explaining something to another human? If it sounds robotic or keyword-stuffed, rewrite it.

2. Verify topic coverage. Does this page answer the top 10 related questions users ask about this topic? If not, expand your coverage. Use People Also Ask boxes and related searches for direction.

3. Define your entities. Are the key entities in your topic clearly named, defined, and connected? Can a reader unfamiliar with the subject follow your entity references?

4. Audit your headings. Do your H2s and H3s form a coherent outline of the topic? If someone read only your headings, would they understand the structure and scope of your content?

5. Review visual content. Does every image have descriptive alt text? Do your videos have transcripts? Do your visuals add substantive information rather than just decoration? A quick automated alt-text check is sketched after this checklist.

6. Validate internal links. Does this page link to at least 3-5 other relevant pages on your site? Do the anchor texts clearly describe the linked content?

7. Check for ambiguity. Are there terms in your content that could mean different things? Have you provided sufficient context to resolve those ambiguities?

8. Assess E-E-A-T signals. Does the page cite sources? Does it show who wrote it and why they are qualified? Does it acknowledge limitations or alternative perspectives?

9. Implement structured data. Have you applied the appropriate schema types (Article, FAQ, HowTo, BreadcrumbList) to your page?

10. Verify mobile and page speed. None of the above matters if your page loads slowly or breaks on mobile devices. Technical performance remains a prerequisite for everything else.
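
Several of these checks can be partially automated. As one example for item 5, here is a minimal sketch that flags images with missing or empty alt attributes, using the requests and BeautifulSoup libraries; the URL is a placeholder for your own page.

```python
# A sketch: flag <img> tags with missing or empty alt text on one page.
# Requires: pip install requests beautifulsoup4
import requests
from bs4 import BeautifulSoup

html = requests.get("https://example.com/your-article", timeout=10).text
soup = BeautifulSoup(html, "html.parser")

for img in soup.find_all("img"):
    alt = (img.get("alt") or "").strip()
    if not alt:
        print("Missing alt text:", img.get("src"))
```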

7. The Future: What Comes After MUM

BERT and MUM are not the end of Google’s AI roadmap. They are waypoints on a longer journey toward search that behaves more like a knowledgeable expert than a library catalog.

Google has already begun integrating MUM capabilities into visible search features, including generative AI-powered overviews, topic exploration tools, and visual search enhancements. The trend is toward fewer clicks, more answers directly in the SERP, and content discovery that spans formats and languages seamlessly.

For content creators and SEO professionals, the path forward is clear: build substantial, well-structured, multimedia-rich content ecosystems that genuinely serve user needs. The algorithms will continue to get smarter. Content that is genuinely useful to humans will always have a place in search results because that is what these algorithms are designed to find.

If you want to stay ahead, stop optimizing for today’s algorithms and start optimizing for what the algorithms are becoming: systems that evaluate content the way a knowledgeable, discerning human would.

Frequently Asked Questions

Has BERT been fully rolled out across all Google searches?

Yes. At its October 2019 launch, BERT affected roughly 10% of English-language queries; by late 2020, Google confirmed it was being used on almost every English query, and coverage has since expanded to over 70 languages. BERT is now a deeply integrated component of Google’s core ranking and query understanding systems. It is not a separate filter or penalty layer. It is part of how Google reads and interprets every search.

Is MUM replacing BERT in Google Search?

MUM is not a replacement for BERT. The two models serve different functions within Google’s search architecture. BERT handles the broad task of contextual language understanding for routine queries. MUM is deployed for complex, multimodal, or multilingual tasks that require reasoning across multiple information sources. They work together as complementary systems.

How should I optimize my content specifically for MUM?

Since MUM is multimodal and multilingual, optimize across formats. Create high-quality images with descriptive alt text and file names. Include video content with full transcripts. Use schema markup to define entities and relationships. Write with clear structure so MUM can identify which sections of your content address which user needs. The core principle is to create genuinely useful, well-organized content that works across text, images, and video.

Does BERT understand entities and concepts, or just words?

BERT processes text at the token level, but through its training on massive datasets, it develops internal representations that correspond to entities and concepts. It understands that “Eiffel Tower,” “Paris,” and “France” are related entities without having them explicitly defined in a knowledge graph. This entity awareness is what allows BERT to match queries to pages even when different vocabulary is used to describe the same concept.

What is the connection between BERT, MUM, and featured snippets?

Both BERT and MUM improve Google’s ability to identify content suitable for featured snippets. BERT helps match complex, preposition-sensitive queries to the specific paragraphs that answer them. MUM extends this capability to multimodal featured snippets that might include images, comparison tables, or translated content. Well-structured content with clear headings, concise answers, and supporting detail is most likely to be selected for featured snippets under both models.

Do I need to change my keyword research approach because of BERT and MUM?

Yes, but the change is evolutionary, not revolutionary. Continue doing keyword research, but treat individual keywords as entry points into broader topics, not as isolated targets. For every keyword you target, map out the related questions, the subtopics users explore before and after that query, and the entities involved. Build content that covers the entire topic, not just the single keyword. This topic-first approach aligns with how BERT and MUM evaluate content completeness.

How does MUM’s cross-lingual capability affect English-language content?

MUM can transfer knowledge from content written in any of its 75 supported languages into English-language search results. This means that if authoritative, detailed content about a niche topic exists primarily in Japanese, German, or Korean, MUM can use that knowledge to inform English search results. For English content creators, this means your content now competes in a truly global information market. Depth, accuracy, and genuine expertise matter more than ever because MUM can source better information from any language.