Indexing

Indexing is the process of storing and organizing discovered content so it can be considered for search results.

Updated Jun 3, 2026. Reviewed Jun 3, 2026.

Indexing is the process of storing, organizing, and understanding discovered content so it can be considered for search results. A page usually needs to be crawlable before it can be indexed, but crawling alone does not guarantee indexing.

For content teams, indexing is the point where a public page becomes part of a search system’s usable source library. If a glossary definition, guide, or report is not indexed, it cannot appear in that system’s organic results and is unlikely to become a supporting link in search-driven AI features.

Why it matters

Indexing is the bridge between publication and visibility. A page can be live on a site, linked internally, and listed in a sitemap, but if search systems do not index it, it has almost no way to appear in search results. That affects classic SEO and can also reduce the chance that AI answer systems discover or cite the page through search-connected retrieval.

Indexing is especially important for canonical and multilingual decisions. Search systems may index the canonical version of a duplicate set, ignore a thin variant, or attribute performance to a selected representative URL.

How it differs

Crawling is the act of fetching a URL. Crawlability is whether that fetch is permitted and technically possible. Indexing is the later decision to store and organize the fetched content.
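One part of crawlability, whether robots.txt rules permit a fetch, can be checked with Python's standard `urllib.robotparser`. The sketch below is illustrative only: the robots.txt content, the `ExampleBot` user agent, and the example.com URLs are assumptions, and a permitted fetch says nothing about whether the page will later be indexed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /drafts/
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A public glossary page is fetchable; a URL under /drafts/ is not.
print(is_crawlable(ROBOTS_TXT, "ExampleBot", "https://example.com/glossary/indexing"))
print(is_crawlable(ROBOTS_TXT, "ExampleBot", "https://example.com/drafts/notes"))
```

Server errors, authentication walls, and timeouts also affect crawlability and are outside what this check covers.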

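Noindex is an instruction asking search engines not to include a page in search results. Because a crawler has to fetch the page to see the instruction, blocking the same URL from crawling can keep the noindex from being read. A canonical URL is different: it asks systems to treat one URL as the representative version among duplicate or similar URLs.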
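To make the two signals concrete, here is a minimal sketch that reads both from a page's HTML using the standard `html.parser` module. The class name `IndexSignalParser`, the helper `read_signals`, and the sample markup are hypothetical. The sketch only looks at the robots meta tag and the canonical link element; a noindex sent through an HTTP response header would not be detected.

```python
from html.parser import HTMLParser


class IndexSignalParser(HTMLParser):
    """Collect robots meta directives and the canonical link from HTML."""

    def __init__(self):
        super().__init__()
        self.robots_directives = set()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attr.get("name", "").lower() == "robots":
            # content looks like "noindex, follow"
            for part in attr.get("content", "").lower().split(","):
                if part.strip():
                    self.robots_directives.add(part.strip())
        elif tag == "link" and "canonical" in attr.get("rel", "").lower().split():
            self.canonical = attr.get("href") or None


def read_signals(html):
    """Return (has_noindex, canonical_url_or_None) for an HTML document."""
    parser = IndexSignalParser()
    parser.feed(html)
    return "noindex" in parser.robots_directives, parser.canonical


# Hypothetical page that asks not to be indexed and names a canonical URL.
SAMPLE_HTML = (
    '<head><meta name="robots" content="noindex, follow">'
    '<link rel="canonical" href="https://example.com/glossary/indexing"></head>'
)
print(read_signals(SAMPLE_HTML))
```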

Example states

State	Meaning	Typical action
Discovered but not crawled	The URL is known but not fetched yet	Improve internal links and sitemap coverage
Crawled but not indexed	The page was fetched but not selected for the index	Review quality, duplication, canonical signals, and directives
Indexed	The page is stored and eligible for search retrieval	Monitor queries, clicks, and page relevance
Indexed under another URL	A different canonical was selected	Check canonical tags, redirects, internal links, and duplicate content
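The states in the table can be expressed as a small lookup, which is sometimes handy when triaging many URLs at once. This is a sketch under stated assumptions: the `crawled` flag and `indexed_url` value are hypothetical inputs that a team would have to obtain from its own search reporting tool, and the function names are invented for illustration.

```python
from enum import Enum
from typing import Optional


class IndexState(Enum):
    DISCOVERED_NOT_CRAWLED = "Discovered but not crawled"
    CRAWLED_NOT_INDEXED = "Crawled but not indexed"
    INDEXED = "Indexed"
    INDEXED_UNDER_ANOTHER_URL = "Indexed under another URL"


# Typical actions, copied from the table above.
TYPICAL_ACTION = {
    IndexState.DISCOVERED_NOT_CRAWLED: "Improve internal links and sitemap coverage",
    IndexState.CRAWLED_NOT_INDEXED: "Review quality, duplication, canonical signals, and directives",
    IndexState.INDEXED: "Monitor queries, clicks, and page relevance",
    IndexState.INDEXED_UNDER_ANOTHER_URL: "Check canonical tags, redirects, internal links, and duplicate content",
}


def classify(url: str, crawled: bool, indexed_url: Optional[str]) -> IndexState:
    """Map reported facts about a known URL to one of the four states.

    indexed_url is the URL the search system reports as indexed for this
    content, or None if nothing was indexed.
    """
    if not crawled:
        return IndexState.DISCOVERED_NOT_CRAWLED
    if indexed_url is None:
        return IndexState.CRAWLED_NOT_INDEXED
    if indexed_url != url:
        return IndexState.INDEXED_UNDER_ANOTHER_URL
    return IndexState.INDEXED


state = classify("https://example.com/a?ref=x", True, "https://example.com/a")
print(state.value, "->", TYPICAL_ACTION[state])
```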

How teams use it

Teams review indexing after publishing important pages, changing canonicals, adding noindex, launching localized routes, or cleaning thin pages. For Geolyze, that means accepted glossary pages should be indexable, while raw/ materials, drafts, internal source notes, and unreviewed content should stay out of the sitemap and public index.

A simple page-level diagnostic asks:

Can crawlers access the URL?
Is the page allowed to be indexed?
Does the canonical point to the intended page?
Is the content useful and distinct?
Is the URL present in the public sitemap only if it should be indexed?
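The five questions above can be sketched as a small checklist function. Everything here is an assumption made for illustration: the `PageFacts` fields are hypothetical values a team would collect by other means, `useful_and_distinct` stands in for an editorial judgment that code cannot make, and the wording of each finding is invented.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageFacts:
    url: str
    crawlable: bool            # can crawlers access the URL?
    noindex: bool              # does the page carry a noindex directive?
    canonical: str             # URL named by the page's canonical signal
    useful_and_distinct: bool  # editorial judgment, supplied by a reviewer
    in_sitemap: bool           # is the URL listed in the public sitemap?
    should_be_indexed: bool    # the team's intent for this page


def diagnose(page: PageFacts) -> List[str]:
    """Return the checklist questions this page fails, as short findings."""
    problems = []
    if page.in_sitemap and not page.should_be_indexed:
        problems.append("listed in the public sitemap but not meant to be indexed")
    if not page.should_be_indexed:
        return problems  # remaining checks only apply to pages meant for the index
    if not page.crawlable:
        problems.append("crawlers cannot access the URL")
    if page.noindex:
        problems.append("page is not allowed to be indexed")
    if page.canonical != page.url:
        problems.append("canonical points to a different URL")
    if not page.useful_and_distinct:
        problems.append("content judged thin or duplicative")
    return problems


page = PageFacts(
    url="https://example.com/glossary/indexing",
    crawlable=True,
    noindex=True,
    canonical="https://example.com/glossary/indexing",
    useful_and_distinct=True,
    in_sitemap=True,
    should_be_indexed=True,
)
print(diagnose(page))
```

An empty list means none of the five checks failed; it does not mean a search system will index the page.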

Common misunderstanding

Sitemap inclusion is not an indexing guarantee. A sitemap helps search engines discover important URLs, but search systems still decide whether each page is useful, allowed, canonical, and distinct enough to index.
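One way to see the gap between sitemap inclusion and indexing is to compare the two lists directly. The sketch below parses a sitemap with the standard `xml.etree.ElementTree` module and reports listed URLs that are missing from a set of indexed URLs. The sample sitemap, the `indexed` set, and the function names are hypothetical; in practice the indexed set would come from whatever reporting a team has access to.

```python
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemap XML format.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml):
    """Return the set of <loc> URLs found in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def listed_but_not_indexed(sitemap_xml, indexed_urls):
    """Return sitemap URLs that do not appear in the indexed set, sorted."""
    return sorted(sitemap_urls(sitemap_xml) - set(indexed_urls))


# Hypothetical sitemap and indexing report, for illustration only.
SAMPLE_SITEMAP = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/glossary/indexing</loc></url>"
    "<url><loc>https://example.com/glossary/crawling</loc></url>"
    "</urlset>"
)
indexed = {"https://example.com/glossary/indexing"}

print(listed_but_not_indexed(SAMPLE_SITEMAP, indexed))
```

URLs that show up in this list are candidates for the page-level review, not proof of a problem: a recently published page may simply not have been processed yet.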