Introduction to Azure AI Search and Retrieval Augmented Generation (RAG)

July 8, 2024

Azure AI Search, previously known as "Azure Cognitive Search," offers secure and scalable information retrieval capabilities for user-owned content in various search applications, including traditional and conversational formats. Note that several of the images in this article are from Microsoft and they still use the term Azure Cognitive Search.

Image of Azure AI Search Logo
Figure 1: Azure AI Search Logo | Used with permission of Microsoft

This service is crucial for applications that handle text and vectors, such as catalog or document search, data exploration, and modern chat-style copilot apps using proprietary data. When setting up a search service with Azure AI Search, you have access to the following features:

  • Search engine capabilities: Supports vector search, full text, and hybrid search over a search index.
  • Advanced indexing: Offers rich indexing features, including integrated data chunking and vectorization (currently in preview). It also provides lexical analysis for text and optional AI enhancements for content extraction and transformation.
  • Rich query syntax: Enables a variety of query types such as vector queries, text search, hybrid queries, fuzzy search, autocomplete, geo-search, and more.
  • Azure's robust infrastructure: Benefits from Azure's scalability, security, and extensive reach.
  • Seamless Azure integration: Integrates smoothly with Azure's data layer, machine learning capabilities, Azure AI services, and Azure OpenAI.
Image of Azure AI architecture with prompts and data sources
Figure 2: How Azure AI Search works together with OpenAI Service | Used with permission of Microsoft

How to Set Up Azure AI Search in Your RAG Application

To set up Azure AI Search in your RAG (retrieval-augmented generation) application, follow these step-by-step instructions to seamlessly integrate powerful search capabilities into your system.

1. Create an Azure AI Search resource

  • Sign in to the Azure Portal: Go to the Azure Portal and log in with your credentials.
  • Create a new resource: Click Create a resource and search for Azure AI Search.
  • Configure the search service: Fill in the details such as name, subscription, resource group, location, and pricing tier.
  • Review and create: Review the configuration and click Create to deploy the Azure AI Search service.

2. Configure the search index

  • Access your search service: Navigate to the newly created Azure AI Search resource.
  • Create an Index: In the Azure AI Search blade, go to "Indexes" and create a new index. Define the schema for your index, including fields, data types, and attributes like retrievable, filterable, searchable, etc.
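
The same kind of schema can also be expressed in code rather than in the portal. The following is a minimal sketch, assuming the Azure.Search.Documents NuGet package is installed; the index name "docs-index" and the field names are hypothetical and only illustrate how attributes such as key, filterable, and searchable map to properties.

```csharp
using System;
using Azure.Search.Documents.Indexes.Models;

// Hypothetical index: the name and fields are illustrative only.
var index = new SearchIndex("docs-index")
{
    Fields =
    {
        // Key field: uniquely identifies each document
        new SimpleField("id", SearchFieldDataType.String) { IsKey = true, IsFilterable = true },
        // Full-text searchable fields
        new SearchableField("title") { IsFilterable = true, IsSortable = true },
        new SearchableField("content"),
        // Non-searchable field used only for filtering and faceting
        new SimpleField("category", SearchFieldDataType.String) { IsFilterable = true, IsFacetable = true }
    }
};

// The definition is built locally; nothing is sent to the service here.
Console.WriteLine($"{index.Name}: {index.Fields.Count} fields");
```

Building the definition is a local operation; it only reaches the service once it is passed to a SearchIndexClient.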

3. Upload data

  • Import data: Use the "Import data" wizard to connect to your data source (e.g., Azure SQL Database, Azure Blob Storage) and import data into your search index.
  • Sync from other systems: You can connect databases so that their information syncs automatically to Azure AI Search.
Screenshot of Azure AI Search content processed
Figure 3: Azure AI Search content process from ingest to exploration | Used with permission of Microsoft

How to Consume Azure AI Search Data in Your RAG Application

To consume Azure AI Search data in your RAG application, integrate the search API for efficient data retrieval.

Consuming Azure AI Search in a .NET Application

  1. Install Azure AI Search SDK. Use the NuGet package manager to install the Azure.Search.Documents package.
  2. Initialize the search client. Create an instance of SearchClient using your service endpoint and index name.

using System;
using Azure;
using Azure.Search.Documents;

var endpoint = new Uri("https://[service name].search.windows.net");
var indexName = "[index name]";
var apiKey = new AzureKeyCredential("[your api-key]");
var client = new SearchClient(endpoint, indexName, apiKey);

  3. Execute search queries. Use the SearchClient to send search queries and receive results.

var options = new SearchOptions { /* Set options like size, filters, etc. */ };
var response = client.Search<[YourModelType]>("[your query]", options);
foreach (var result in response.Value.GetResults())
{
    // Process each search result; result.Document is the matched [YourModelType]
}

How to Handle API Responses in Your Application from Azure AI Search

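The Search method returns a Response<SearchResults<T>>. Call GetResults() on its Value to iterate over the individual results; each SearchResult<T> exposes the matched document through its Document property, along with its relevance Score.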

Remember to replace placeholders like [service name], [index name], [your query], and [your api-key] with actual values from your Azure AI Search service. Ensure your data model ([YourModelType]) matches the schema of your search index for the .NET client.

What Are Vectors and How Are They Stored in Vector Databases?

Vectors and vector databases form the backbone of modern AI systems, particularly in the realm of information retrieval and organization.

  • Understanding vectors: In AI, vectors are numerical representations of data, whether text, images, or sounds. These representations allow AI systems to process and understand complex data forms efficiently.
  • Vector databases: These specialized databases store and manage vector representations. They enable quick retrieval of similar vectors, which is crucial in tasks like semantic search, where the system finds items similar in meaning rather than just in keyword match.
  • Impact on AI performance: The use of vectors and vector databases significantly enhances AI performance. They allow for more nuanced understanding and retrieval of information, facilitating more sophisticated AI applications.
Screenshot of Azure AI Vector Search data ingest and vectorization process
Figure 4: How to add data in Azure AI Vector Search as vector | Used with permission of Microsoft
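
To make "similar vectors" concrete, here is a minimal sketch of cosine similarity and a brute-force top-k lookup, using only the .NET base class library. The class and method names are hypothetical, and the toy two-dimensional vectors stand in for real embeddings, which typically have hundreds or thousands of dimensions. Production vector stores generally replace the linear scan with an approximate index such as HNSW.

```csharp
using System;
using System.Linq;

public static class VectorMath
{
    // Cosine similarity: dot(a, b) / (|a| * |b|); 1 = same direction, 0 = orthogonal.
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        // Convention chosen for this sketch: a zero vector is similar to nothing.
        if (normA == 0 || normB == 0) return 0;
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Brute-force nearest-neighbour search: score every item, keep the best k ids.
    public static string[] TopK(float[] query, (string Id, float[] Vector)[] items, int k) =>
        items.OrderByDescending(item => CosineSimilarity(query, item.Vector))
             .Take(k)
             .Select(item => item.Id)
             .ToArray();
}
```

For a query vector (1, 0) and items a = (0, 1), b = (1, 0), c = (1, 1), TopK with k = 2 ranks b first and c second, because b points in exactly the same direction as the query.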

How Azure AI Search Works with RAG

Azure AI Search enhances RAG applications by providing advanced search capabilities to retrieve relevant data quickly and accurately. By leveraging Azure AI Search, RAG applications can access vast amounts of information, ensuring more precise and contextually relevant responses.

Benefits of RAG

RAG techniques enhance search functionalities by not only retrieving the most relevant documents or data but also generating insights and answers based on the retrieved information. This approach is particularly useful in scenarios where the query is complex, or the information required is not directly available in a single document.

Once Azure AI Search has retrieved the relevant information, it is sent to the LLM. The LLM then generates a response grounded in that information, and it can infer or synthesize an answer that does not appear verbatim in any of the documents.

The Mechanics of RAG

In a RAG system, the retrieval component of Azure AI Search plays a crucial role. When a query is made, the system first retrieves a set of relevant documents or data points. The generative component then processes this retrieved information, using machine learning models to generate responses or insights that are contextually enriched by the underlying data.

Image of Azure AI RAG process
Figure 5: The Azure AI RAG method and the steps it requires | Used with permission of Microsoft
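
The retrieve-then-generate flow can be sketched independently of any particular service. In the hypothetical outline below, the retrieval and generation steps are passed in as delegates, so the same skeleton would work with Azure AI Search and any chat model; the RagPipeline class, its method names, and the prompt layout are assumptions made for illustration, not part of an SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class RagPipeline
{
    // retrieve: returns the passages most relevant to the question (e.g. from a search index).
    // generate: sends a prompt to a language model and returns its reply.
    public static async Task<string> AnswerAsync(
        string question,
        Func<string, Task<IReadOnlyList<string>>> retrieve,
        Func<string, Task<string>> generate)
    {
        IReadOnlyList<string> passages = await retrieve(question);
        string prompt = BuildPrompt(question, passages);
        return await generate(prompt);
    }

    // Grounds the model by placing the numbered passages ahead of the question.
    public static string BuildPrompt(string question, IReadOnlyList<string> passages)
    {
        string sources = passages.Count == 0
            ? "no source available."
            : string.Join("\n", passages.Select((p, i) => $"[{i + 1}] {p}"));
        return $"## Source ##\n{sources}\n## End ##\n\nQuestion: {question}";
    }
}
```

Keeping the two steps behind delegates makes the flow easy to test with stubbed retrieval and generation before wiring in live services.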

Integration of AI Models

Azure AI Search seamlessly integrates with various AI models available in the Azure ecosystem, such as natural language processing (NLP) models. This integration is key to implementing RAG, as it allows the system to understand and interpret queries in a human-like manner and generate more refined and accurate responses.

Image of RAG architecture
Figure 6: RAG architecture in Azure | Used with permission of Microsoft

Data Indexing and Retrieval

Efficient data indexing is critical in RAG systems. Azure AI Search provides tools and features to index data across various formats and sources. The retrieval process is optimized for speed and relevance, ensuring that the generative models have access to the best possible information for response generation.

Creating a RAG-based Search Application Using Azure AI Search with .NET

A RAG-based search application combines retrieval-augmented generation techniques with robust search capabilities to deliver precise and contextually relevant results. By using Azure AI Search with .NET, you can enhance your RAG-based search application, leveraging Azure's powerful search algorithms to improve data retrieval and user experience. Azure AI Search is not strictly required to create a RAG-based application, but it significantly enhances the application's efficiency and accuracy.

Setting Up the Environment

To create a RAG-based search application using Azure AI Search and .NET, developers first need to set up an Azure AI Search service through the Azure portal. This involves choosing the appropriate service tier, configuring indexes, and importing data. You can find the details in this article:

Indexing Data for RAG

Data indexing for RAG requires a strategic approach. Developers need to ensure that the data is comprehensive and well-structured. Using Azure's SDKs, particularly the .NET SDK, developers can index data by defining the schema, fields, and attributes. Code snippets in C# can illustrate how to connect to the Azure AI Search service, create indexes, and upload data.

Connecting to the Azure AI Search Service

using System;
using Azure;
using Azure.Search.Documents;
using Azure.Search.Documents.Indexes;
using Azure.Search.Documents.Indexes.Models;

// Define the endpoint and key for your Azure AI Search service
string endpoint = "https://<your-search-service-name>.search.windows.net";
string apiKey = "<your-api-key>";
Uri serviceEndpoint = new Uri(endpoint);
AzureKeyCredential credential = new AzureKeyCredential(apiKey);

// Create a SearchIndexClient to manage and create indexes
SearchIndexClient indexClient = new SearchIndexClient(serviceEndpoint, credential);

// Create a SearchClient to upload and search documents
SearchClient searchClient = new SearchClient(serviceEndpoint, "<index-name>", credential);
Image of Azure AI Search process to index data
Figure 7: Azure AI Search process to index data | Used with permission of Microsoft
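
With the two clients above, creating an index and uploading documents might look like the following sketch. It assumes the Azure.Search.Documents package; the HandbookPage model, the IndexSetup class, and their member names are hypothetical. The index definition is derived from the model's attributes with FieldBuilder, so the schema and the data model stay in step.

```csharp
using System;
using System.Threading.Tasks;
using Azure;
using Azure.Search.Documents;
using Azure.Search.Documents.Indexes;
using Azure.Search.Documents.Indexes.Models;
using Azure.Search.Documents.Models;

// Hypothetical document model; the attributes tell FieldBuilder how to index each property.
public class HandbookPage
{
    [SimpleField(IsKey = true, IsFilterable = true)]
    public string Id { get; set; } = "";

    [SearchableField(IsSortable = true)]
    public string Title { get; set; } = "";

    [SearchableField]
    public string Content { get; set; } = "";
}

public static class IndexSetup
{
    // Builds the index definition locally from the model type (no network call).
    public static SearchIndex BuildIndex(string indexName) =>
        new SearchIndex(indexName, new FieldBuilder().Build(typeof(HandbookPage)));

    // Creates (or updates) the index, then uploads the documents in one batch.
    public static async Task CreateAndUploadAsync(
        SearchIndexClient indexClient, SearchClient searchClient,
        string indexName, HandbookPage[] pages)
    {
        await indexClient.CreateOrUpdateIndexAsync(BuildIndex(indexName));

        IndexDocumentsBatch<HandbookPage> batch = IndexDocumentsBatch.Upload(pages);
        Response<IndexDocumentsResult> result = await searchClient.IndexDocumentsAsync(batch);

        // Report the outcome per document key.
        foreach (IndexingResult r in result.Value.Results)
        {
            Console.WriteLine($"{r.Key}: {(r.Succeeded ? "indexed" : r.ErrorMessage)}");
        }
    }
}
```

CreateAndUploadAsync needs a live search service and an admin key, whereas BuildIndex can be exercised offline to check the generated fields.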

Querying with RAG Techniques

Implementing RAG-based querying involves using .NET to send queries to the Azure AI Search service and processing the returned results. Developers can use the SearchClient in the Azure.Search.Documents library to execute search queries. The queries can be augmented with machine learning models to generate enriched responses.

// chat, overrides, embeddings, cancellationToken, and _searchClient are defined elsewhere in the application

// step 1
// use llm to get query if retrieval mode is not vector
string? query = null;
if (overrides?.RetrievalMode != "Vector")
{
    var getQueryChat = chat.CreateNewChat(@"You are a helpful AI assistant, generate a search query for follow-up questions");
    // Note: this excerpt does not show the user's question being added to getQueryChat

    var result = await chat.GetChatCompletionsAsync(getQueryChat, cancellationToken: cancellationToken);
    if (result.Count != 1)
    {
        throw new InvalidOperationException("Failed to get search query");
    }

    query = result[0].ModelResult.GetOpenAIChatResult().Choice.Message.Content;
}

// step 2
// use query to search related docs
var documentContentList = await _searchClient.QueryDocumentsAsync(query, embeddings, overrides, cancellationToken);
string documentContents = string.Empty;
if (documentContentList.Length == 0)
{
    documentContents = "no source available.";
}
else
{
    documentContents = string.Join("\r", documentContentList.Select(x => $"{x.Title}:{x.Content}"));
}
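
QueryDocumentsAsync in the excerpt above is not a method of the SDK's SearchClient, so it is presumably an application-defined helper. As a rough idea of what such a helper could do, the sketch below builds the options for a hybrid query, assuming Azure.Search.Documents 11.5.0 or later. The HybridQuery class and the field names "title", "content", and "contentVector" are hypothetical.

```csharp
using System;
using Azure.Search.Documents;
using Azure.Search.Documents.Models;

public static class HybridQuery
{
    // Builds options for a hybrid query: the text query is passed to SearchAsync,
    // and the embedding (if any) is attached here as a vector query.
    public static SearchOptions Build(float[]? embedding, int top = 3)
    {
        var options = new SearchOptions { Size = top };
        options.Select.Add("title");    // hypothetical field names
        options.Select.Add("content");

        if (embedding is not null)
        {
            var vectorQuery = new VectorizedQuery(embedding) { KNearestNeighborsCount = top };
            vectorQuery.Fields.Add("contentVector"); // hypothetical vector field
            options.VectorSearch = new VectorSearchOptions();
            options.VectorSearch.Queries.Add(vectorQuery);
        }

        return options;
    }
}

// Usage (requires a live service and an index with matching fields):
// var results = await searchClient.SearchAsync<SearchDocument>(query, HybridQuery.Build(embedding));
```

Passing both a text query and a vector query yields a hybrid search; passing a null text query with only the vector gives a pure vector search, which corresponds to the "Vector" retrieval mode checked in step 1.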


Handling Search Results

The search results from a RAG system can be complex, containing both retrieved documents and generated insights. Developers need to design their applications to handle and present these results effectively. This might involve parsing JSON responses and integrating them into the user interface of the application.

// step 3
// put together related docs and conversation history to generate answer
var answerChat = chat.CreateNewChat(
    "You are a system assistant who helps the company employees with their healthcare " +
    "plan questions, and questions about the employee handbook. Be brief in your answers");

// add chat history
foreach (var turn in history)
{
    if (turn.Bot is { } botMessage)
    {
        answerChat.AddAssistantMessage(botMessage);
    }
}

// format prompt: the retrieved documents from step 2 go between the Source markers
answerChat.AddUserMessage(@$" ## Source ##
{documentContents}
## End ##

Your answer needs to be a Json object with the following format.
{{
   ""answer"": // the answer to the question, add a source reference to the end of each sentence. e.g. Apple is a fruit [reference1.pdf] [reference2.pdf]. If no source is available, put the answer as I don't know.
   ""thoughts"": // brief thoughts on how you came up with the answer, e.g. what sources you used, what you thought about, etc.
}}");

// get answer
var answer = await chat.GetChatCompletionsAsync(answerChat, cancellationToken: cancellationToken);
var answerJson = answer[0].ModelResult.GetOpenAIChatResult().Choice.Message.Content;
var answerObject = JsonSerializer.Deserialize<JsonElement>(answerJson);
var ans = answerObject.GetProperty("answer").GetString() ?? throw new InvalidOperationException("Failed to get answer");
var thoughts = answerObject.GetProperty("thoughts").GetString() ?? throw new InvalidOperationException("Failed to get thoughts");

// step 4
// add follow up questions if requested
if (overrides?.SuggestFollowupQuestions is true)
{
    var followUpQuestionChat = chat.CreateNewChat(@"You are a helpful AI assistant");
    followUpQuestionChat.AddUserMessage($@"Generate three follow-up questions based on the answer you just generated.

# Answer
{ans}

# Format of the response
Return the follow-up questions as a Json string list. e.g.
[
   ""What is the deductible?"",
   ""What is the co-pay?"",
   ""What is the out-of-pocket maximum?""
]");

    var followUpQuestions = await chat.GetChatCompletionsAsync(followUpQuestionChat, cancellationToken: cancellationToken);
    var followUpQuestionsJson = followUpQuestions[0].ModelResult.GetOpenAIChatResult().Choice.Message.Content;

    var followUpQuestionsObject = JsonSerializer.Deserialize<JsonElement>(followUpQuestionsJson);
    var followUpQuestionsList = followUpQuestionsObject.EnumerateArray().Select(x => x.GetString()).ToList();

    // append each suggested question to the answer, wrapped in << >> markers
    foreach (var followUpQuestion in followUpQuestionsList)
    {
        ans += $" <<{followUpQuestion}>> ";
    }
}

Pros and Cons of Using Azure AI

The integration of Azure AI Search presents a mix of advantages and drawbacks, each impacting its application in different scenarios. Here's a summary of these pros and cons:

Pros

  • Enhanced quality of search results: The combination of RAG with Azure AI Search leads to search results that are not just relevant but also rich in contextual understanding.
  • Scalability: Azure AI Search's ability to efficiently handle large datasets and a high volume of queries is a significant advantage.
  • Access to advanced AI models: Integration within the broader Azure ecosystem allows for the utilization of cutting-edge AI and machine learning models, enhancing the generative capabilities of RAG.

Cons

  • Complex setup and management: Implementing and maintaining a RAG system in Azure AI Search can be complex, particularly for those who are new to Azure or lack machine learning expertise.
  • Cost: The use of advanced AI features in Azure AI Search can incur substantial costs, requiring careful budget planning by organizations.
  • Potential limitations in customization: While Azure AI Search does offer customization options, it may not provide the same level of flexibility as a fully custom-built solution.

Comparison With Other Platforms

Compared to other platforms like Elasticsearch or Amazon CloudSearch, Azure AI Search with RAG offers unique AI integration capabilities. However, these other platforms may offer more flexibility in certain areas or different pricing models that might be more suitable for specific use cases.

Licensing and Cost Considerations for Azure AI Search

When implementing Azure AI Search, it's essential to understand the licensing and cost implications. Azure AI Search offers various pricing tiers, enabling you to choose a plan that fits your budget and performance needs. Additionally, considering the licensing options ensures compliance and optimal usage of Azure's powerful search capabilities.

Licensing Models for Azure AI Search

  • Azure AI Search provides various licensing options, catering to different needs from small-scale applications to enterprise-level usage.
  • Licensing is generally pay-as-you-go, with different tiers based on the number of indexes, data volume, and query throughput.

You can learn more about pricing details here, but keep in mind that the prices will change as Microsoft adds more features:

Cost Structure for Azure AI Search

The cost of Azure AI Search in RAG applications varies based on several factors:

  • Basic costs are associated with the Azure AI Search service itself, which is priced according to scale and usage, as shown in Figure 8.
  • Additional costs stem from using AI and machine learning models, with variations depending on complexity and frequency of use.

Developers must be aware of the cost implications of various features and usage patterns.
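
As a rough illustration of how scale drives the base cost: billable tiers are charged per search unit, where the number of search units is replicas multiplied by partitions. The sketch below estimates a monthly figure from that product. The SearchCost class is hypothetical, the hourly rate is a placeholder rather than a real price, and 730 hours per month is a common approximation; take actual rates from the current pricing page for your tier and region.

```csharp
using System;

public static class SearchCost
{
    // Search units (SU) = replicas x partitions.
    public static int SearchUnits(int replicas, int partitions) => replicas * partitions;

    // Estimated monthly cost: SU x hourly rate per SU x hours in the month.
    // hourlyRatePerUnit is a placeholder supplied by the caller, not a published price.
    public static decimal MonthlyEstimate(
        int replicas, int partitions, decimal hourlyRatePerUnit, int hoursPerMonth = 730) =>
        SearchUnits(replicas, partitions) * hourlyRatePerUnit * hoursPerMonth;
}
```

For example, with 2 replicas, 1 partition, and a made-up rate of 0.10 per search unit per hour, MonthlyEstimate returns 146.00. This covers only the search service itself; model usage is billed separately.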

Screenshot of Azure AI Search pricing tier, costs are shown in Euros.
Figure 8: Azure AI Search pricing tier | Used with permission of Microsoft

Budgeting and Cost Optimization for Using Azure AI Search

  • Effective cost optimization requires careful planning of the Azure AI Search implementation, taking into account data scale, query frequency, and necessary AI features.
  • Utilizing Azure's cost management tools is beneficial for tracking and managing expenses.
  • Optimizing data indexing and query strategies can lead to reduced usage and thus lower costs.

Use Cases and Potential Applications of RAG with Azure AI Search

Discover how RAG combined with Azure AI Search can revolutionize your applications. From enhancing customer support systems to optimizing content management, explore the diverse use cases and powerful potential of integrating these technologies.

RAG with Azure AI Search Can Lead to Innovative Search Applications

RAG techniques, when implemented with Azure AI Search, can transform search applications across various domains. In e-commerce, for example, RAG can enhance product searches by providing more nuanced and contextually relevant results. In academic research, RAG can help researchers find and synthesize information from a vast array of sources more efficiently.

Image of Azure AI solution architecture of RAG application
Figure 9: Azure AI solution architecture of a RAG application where a user asks questions about their data | Used with permission of Microsoft

Real-World Scenarios Using RAG with Azure AI Search

In customer service, a RAG-based Azure AI Search system can improve the efficiency of knowledge base searches, helping customers find solutions faster and more accurately. In content management systems, RAG-based Azure AI Search can enable powerful content discovery features, allowing users to find relevant content based on a deep understanding of their queries.

Screenshot of a knowledge base application based on Azure AI Search
Figure 10: Knowledge base application based on Azure AI Search | Used with permission of Microsoft

Future of RAG in Search

The future of RAG in search applications is promising, with potential advancements in AI and machine learning models offering even more sophisticated and accurate search capabilities. As Azure AI Search continues to evolve, we can expect to see more innovative uses of RAG in various industries, driving forward the capabilities of search technology.

The integration of vector search and state-of-the-art retrieval with generative AI apps, as exemplified by Azure AI Search and the principles of RAG, heralds a new era in AI's capabilities. These technologies not only enhance the efficiency and effectiveness of AI applications but also open up new possibilities for their use in various sectors. As these technologies continue to evolve, we can expect even more innovative and impactful applications in the future.


Rubén Toribio


Rubén Toribio is a software developer with over 13 years of experience in the field, specializing in web development using Microsoft technologies such as SharePoint, .NET, and Azure. He holds the Microsoft Certified: Azure Developer Associate and Microsoft Certified: SharePoint Developer certifications.

Rubén has a deep understanding of SharePoint development and extensibility and builds custom solutions. Throughout his career, he has been involved in numerous complex projects. He is highly motivated and constantly seeks out new opportunities to learn and stay up to date.

Rubén is passionate about sharing his knowledge and helping others succeed. He is an active member of the tech community, regularly participating in speaking engagements, training sessions and workshops.