Prerequisites:
1. You should have a valid AWS account with SageMaker Studio access in your AWS Region.
2. You should have the embedding and LLM models running, with their endpoints available.
Introduction
In the earlier lab, you experienced using Retrieval Augmented Generation (RAG) as a design pattern for question-answering systems, where a generative AI model is aided with factual documents and information as additional context. The information can be retrieved from enterprise search systems, local databases, or even public search engines. However, a self-managed setup can introduce several challenges. The following three areas are some examples:
Complexity
- Large Model Size
- Model Sharding
- Model Serving
- Inference Workflows

Technical Expertise
- Infrastructure Setup
- Model Compilation
- Model Hosting Cost

Operational Overhead
- Model Management
- Model Compression
- Latency
- Throughput
In this lab, we show you how to build a similar domain-specific search application using Amazon Bedrock, AWS's fully managed, little-to-no-code service.
Key Components
LLM (Large Language Model): Anthropic's Claude models are available through Amazon Bedrock. The model is used to understand the retrieved document chunks and provide an answer in a human-friendly manner.
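The snippet below is a minimal sketch of invoking a Claude model through the Bedrock runtime API with boto3. The region, the model ID (anthropic.claude-instant-v1), and the placeholder context and question are assumptions for illustration; adjust them to the models you enabled and the chunks returned by your retriever.

```python
# Minimal sketch: asking a Claude model on Bedrock to answer a question
# using retrieved document chunks as context. Region, model ID, and the
# placeholder context/question are assumptions for illustration.
import json
import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

context_chunks = "..."  # text retrieved from the semantic store (placeholder)
question = "What is Amazon Bedrock?"

# Claude's text-completions format expects the Human/Assistant prompt markers.
prompt = (
    f"\n\nHuman: Use the following context to answer the question.\n"
    f"Context:\n{context_chunks}\n\nQuestion: {question}\n\nAssistant:"
)

body = json.dumps({
    "prompt": prompt,
    "max_tokens_to_sample": 300,
    "temperature": 0.1,
})

response = bedrock_runtime.invoke_model(
    modelId="anthropic.claude-instant-v1",
    body=body,
    accept="application/json",
    contentType="application/json",
)

# The completion text is returned in the "completion" field of the response body.
answer = json.loads(response["body"].read())["completion"]
print(answer)
```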
Semantic Store: Using Amazon Kendra, we get a fully managed and scalable semantic search index without writing any code. In this notebook, we use Kendra's connectors to crawl the documents and store both the documents and their embeddings.
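Assuming a Kendra index has already been created and populated by its connectors, a retrieval call can look like the minimal sketch below; the index ID, region, and query text are placeholders, not values from this lab.

```python
# Minimal sketch: retrieving relevant passages from an existing Amazon Kendra
# index to use as context for the LLM. Index ID and region are placeholders.
import boto3

kendra = boto3.client("kendra", region_name="us-east-1")

KENDRA_INDEX_ID = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"  # replace with your index ID

result = kendra.retrieve(
    IndexId=KENDRA_INDEX_ID,
    QueryText="What is Amazon SageMaker?",
)

# Each result item carries a passage of text plus the source document's title.
for item in result["ResultItems"][:3]:
    print(item["DocumentTitle"], "-", item["Content"][:200])
```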
Enable Bedrock Models
Bedrock setup and model access
In the AWS console, search for bedrock in the search bar, as shown below.
On the Bedrock page, click Model access in the left navigation menu.
On the model access page, click "Enable specific models".
In the pop-up menu, select these four models, as shown in the screenshot:
- Amazon: Titan Embeddings G1 - Text
- Amazon: Titan Text G1 - Express
- Anthropic: Claude
- Anthropic: Claude Instant
Then click Save changes at the bottom right.
Since our demo does not include any vision-based RAG use cases, it is fine to use models intended for text search and retrieval.
You will see a notification saying model access will be available in a few minutes.
For practice, you can now go to the Bedrock playground by clicking Chat, Text, or Image, and converse with Bedrock using different prompts.
Once access is granted, you will see something like the screenshot below.
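If you prefer to confirm access programmatically rather than in the console, the following sketch lists the foundation models visible to your account; the region is an assumption.

```python
# Minimal sketch: listing the foundation models visible to your account as a
# quick check after enabling model access. Uses the Bedrock control-plane client.
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

for model in bedrock.list_foundation_models()["modelSummaries"]:
    print(model["modelId"])
```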
Bedrock API Basics
In this section, you will learn:
- The basics of interacting with Amazon Bedrock models, both for generating text and for embedding text for semantic search (see the sketch after this list)
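As a preview of what the notebook covers, the sketch below embeds a short piece of text with the Titan Embeddings G1 - Text model; the region and sample sentence are assumptions, and the same invoke_model pattern applies to text generation.

```python
# Minimal sketch: embedding a piece of text with the Titan Embeddings model
# so it can be indexed or compared for semantic search. Region and input
# text are placeholders for illustration.
import json
import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

body = json.dumps({"inputText": "Amazon Bedrock is a fully managed service."})

response = bedrock_runtime.invoke_model(
    modelId="amazon.titan-embed-text-v1",
    body=body,
    accept="application/json",
    contentType="application/json",
)

# Titan Embeddings G1 - Text returns the vector under the "embedding" key.
embedding = json.loads(response["body"].read())["embedding"]
print(len(embedding))
```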
Getting Started
You need to have an Amazon SageMaker Studio instance open to start using the notebook. If you haven't completed the Amazon SageMaker Studio setup, follow this blog first.
To start the lab, open the notebook titled 01_workshop_setup.ipynb in the lab2/lab-notebooks directory, as shown below. To do this, double-click the notebook.