RAG in ServiceNow
- Oliver Nowak
- Oct 10
- 4 min read
When it comes to AI, Retrieval-Augmented Generation (RAG) is one of the most fundamental building blocks of any intelligent application.
Most people think of RAG as an external process, a kind of bolt-on intelligence layer that fetches data from a document store, shoves it into a Large Language Model (LLM), and hopes for the best. But inside ServiceNow, RAG is not an afterthought. It's embedded right into the fabric of the platform through AI Search.
RAG is by no means a new concept; I first came across it months and months ago. But recently I've become much more interested in fully understanding how it works and the mechanics behind it, particularly in the face of slightly daunting words like indexing, retrieval, chunking, scoring, re-ranking and embeddings. It seems to me that once you understand how ServiceNow structures these, you can start to tune RAG for your own use cases, adapt it to different data sources, and unlock a new level of control over AI in the platform.

The Anatomy of Semantic Search
At the core of ServiceNow's RAG implementation is Semantic Search. This is built on a set of interlinked configuration tables that define how information is stored, indexed and retrieved.
The hierarchy looks something like this:
Search Application - The top-level container for the search experience (Now Assist, Virtual Agent, Portal, Mobile etc.)
Search Profile - Defines how the search behaves i.e. what data sources it uses, and how it ranks and filters results.
Search Source - A specific table, such as kb_knowledge or incident, that provides the data to be searched.
Indexed Source - Marks a source as enabled for semantic or hybrid search.
Semantic Index Config - The configuration that controls vector embeddings, the heart of semantic search.
Semantic Field - Individual fields within tables whose text is embedded into vectors.
Together, these form the semantic backbone of the platform. This is what converts ordinary records into vectorised meaning. When the RAG Search API is invoked, the system navigates through these relationships to fetch semantically similar results.
Think of the indexing path as:
Semantic Field → Semantic Index Config → Indexed Source → Search Source → Search Profile → Search Application
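To make the hierarchy concrete, here is a minimal sketch of those relationships as plain Python classes. The real objects are ServiceNow configuration records, not code; the class and field names below are illustrative stand-ins, not the platform's own table names.

```python
# Hypothetical model of the semantic search config hierarchy.
# Walking top-down mirrors how the platform resolves which tables
# and fields are enabled for semantic retrieval.
from dataclasses import dataclass, field

@dataclass
class SemanticField:
    name: str                       # a field whose text gets embedded, e.g. "text"

@dataclass
class SemanticIndexConfig:
    fields: list                    # SemanticField entries controlling embeddings

@dataclass
class IndexedSource:
    config: SemanticIndexConfig
    enabled: bool = True            # semantic/hybrid search switched on

@dataclass
class SearchSource:
    table: str                      # e.g. "kb_knowledge" or "incident"
    indexed: IndexedSource = None

@dataclass
class SearchProfile:
    sources: list = field(default_factory=list)

@dataclass
class SearchApplication:
    name: str                       # e.g. "Now Assist"
    profile: SearchProfile = None

def indexed_tables(app):
    """List the tables in this application's profile that are semantically indexed."""
    return [s.table for s in app.profile.sources
            if s.indexed is not None and s.indexed.enabled]

kb = SearchSource("kb_knowledge",
                  IndexedSource(SemanticIndexConfig([SemanticField("text")])))
app = SearchApplication("Now Assist", SearchProfile([kb]))
print(indexed_tables(app))  # ['kb_knowledge']
```

The point of the sketch is the direction of the arrows: each layer only knows about the one beneath it, which is why a single profile can serve several search applications.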
From Query to Generation
When a user or an AI agent asks a question like "How do I reset my VPN password?", ServiceNow translates that query into an embedding and compares it against the stored vectors. The RAG Search API handles this behind the scenes, returning the most semantically relevant results based on similarity scores and metadata filters.
This is the retrieval path:
User → RAG Search API → Search Profile → Semantic Index → Retrieve → LLM → Response
The result is not just a keyword match, but a contextually relevant response grounded in your own platform data, which is the essence of RAG.
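The retrieval step itself boils down to vector similarity. Here is a toy illustration of that idea, assuming records have already been embedded; a crude bag-of-words count stands in for the learned embeddings the platform actually uses, and the document IDs are invented.

```python
# Toy semantic retrieval: embed the query, score every stored vector
# by cosine similarity, return the best matches.
import math
import string
from collections import Counter

def embed(text):
    """Stand-in embedding: lowercase term counts (real systems use learned vectors)."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return Counter(cleaned.split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

documents = {
    "KB0001": "Reset your VPN password from the self-service portal",
    "KB0002": "Ordering a replacement laptop battery",
}

def retrieve(query, k=1):
    """Return the k document IDs most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(documents[d])),
                    reverse=True)
    return ranked[:k]

print(retrieve("How do I reset my VPN password?"))  # ['KB0001']
```

Swap the stand-in `embed` for a real embedding model and `documents` for an index of record vectors, and this is the shape of what happens when the RAG Search API is invoked.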
Going Beyond the Defaults
Where things get really interesting is when you start customising this process.
Behind the scenes, RAG in ServiceNow can be visualised as a layered pipeline. Each layer offers a place to inject intelligence or control:
Agent Orchestration Layer - Logic that decides when to call RAG, which profiles to use, and how to combine the data sources.
Retrieval Layer - Handles chunking, embedding and scoring. You can customise chunk sizes or embedding models for different content types.
Ranking Layer - Refines the results with hybrid or re-ranking logic. For example, boost recent content or weight technical documentation more heavily.
Prompt Composition Layer - Constructs the message to the LLM, inserting the retrieved text in a way that maximises relevance.
Post-Processing Layer - Verifies, formats, and cites the answer before surfacing it to the user.
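The ranking layer's "boost recent content" idea can be sketched in a few lines. The decay formula, field names, and dates below are assumptions for illustration, not a ServiceNow API.

```python
# Hypothetical re-ranking: decay each retrieval score by document age
# so fresher content rises, using a configurable half-life.
from datetime import date

results = [
    {"id": "KB0001", "score": 0.82, "updated": date(2022, 1, 10)},
    {"id": "KB0002", "score": 0.78, "updated": date(2025, 9, 1)},
]

def rerank_by_recency(results, today, half_life_days=365):
    """Halve a document's effective score for every half-life it has aged."""
    boosted = []
    for r in results:
        age_days = (today - r["updated"]).days
        decayed = r["score"] * 0.5 ** (age_days / half_life_days)
        boosted.append({**r, "score": decayed})
    return sorted(boosted, key=lambda r: r["score"], reverse=True)

order = [r["id"] for r in rerank_by_recency(results, today=date(2025, 10, 10))]
print(order)  # ['KB0002', 'KB0001'] -- the fresher article overtakes the stale one
```

The same pattern works for source weighting: multiply scores by a per-table factor to, say, trust technical documentation more heavily than forum posts.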
This architecture allows you to treat RAG as a programmable system, not just a static capability. For example, you could build a Scripted REST API that queries specific search profiles via the RAG Search API, re-ranks results with your own logic, and injects them into a Now LLM prompt.
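The prompt composition step of that pattern looks something like the sketch below. The template wording and function names are hypothetical; the point is that the retrieved chunks are embedded into the prompt with citations, rather than sent to the model bare.

```python
# Hypothetical prompt composition: number the retrieved chunks and wrap
# them in a grounding template so the LLM answers from platform data.
def compose_prompt(question, chunks, max_chunks=3):
    """Insert the top retrieved chunks into a grounded, citable prompt."""
    context = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks[:max_chunks]))
    return (
        "Answer using only the context below, and cite sources by number.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

retrieved = ["Reset your VPN password from the self-service portal."]
prompt = compose_prompt("How do I reset my VPN password?", retrieved)
print(prompt)
```

Tightening or loosening that instruction line is one of the simplest levers you have over hallucination in the generation step.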
That's where the magic happens. ServiceNow's native RAG capabilities become part of a broader agentic workflow, where agents reason, retrieve, and act. All within the boundaries of the same platform.
The Bigger Picture
AI Search and the RAG Search API are not just utilities but foundational concepts for agentic AI inside ServiceNow.
When you layer orchestration logic on top of them, you create an ecosystem where AI agents can reason contextually, fetch authoritative data, and execute tasks without leaving the platform.
You move from "asking questions" to operating workflows because retrieval is not just about finding information, but enabling intelligent action.
Closing Thoughts
RAG in ServiceNow isn’t a mystery box anymore. It’s a structured, transparent system that you can tune, extend, and experiment with. Once you understand the hierarchy of search applications, profiles, sources, and semantic indices, you realise you’re sitting on a fully-fledged RAG engine.
And as the era of AI agents continues to develop, this foundation matters more than ever. The systems that succeed won’t be the ones with the flashiest generative models, they’ll be the ones that know how to retrieve the right context, at the right time, for the right task.
Because in the end, the intelligence of an agent depends on what it knows, and what it knows depends on what it can find.