Introduction & Motivation

RAG, RAG, and more RAG.

Back in 2023, LLMs became the buzzword after the introduction of ChatGPT. Soon after, “prompt engineering” took off. Then came RAG. Next it was agents, multi-agent systems, and even “vibe coding.” Fast forward again and now MCP is in the spotlight. All of this is a reminder that trends come and go.

Behind the buzzwords, though, there are genuinely valuable skills worth learning. In my honest opinion, RAG is one of them.

The motivation for this project came from seeing so many RAG-powered apps and wanting to build something similar myself. When I built this for the first time, I also struggled to find a clear guide on cleaning up resources, which ultimately led to this blog post. I even built a RAG app called ChatIO recently, but like many of my side projects, it never made it past localhost 😅

What are we building?

We’ll be creating a Chat with PDF experience using Amazon Bedrock Knowledge Bases. We will build two versions:

One will use OpenSearch Serverless as the vector database.
The other will use S3 Vectors.

Later, we will compare their costs to see which option is more affordable.

This post will include real screenshots of error messages and their fixes because the best way to learn is by breaking things first.

Architecture

Before we start clicking buttons, let’s look at the architecture diagram above. Our application follows a standard RAG (Retrieval-Augmented Generation) workflow, which handles data in two distinct phases:

1. The Ingestion Phase (Preparation): This is what happens "offline" to get your data ready.

Storage: We upload the PDF to Amazon S3.
Vectorization: Amazon Bedrock picks up the file, splits it into smaller chunks, and uses the Amazon Titan model to convert those chunks into "embeddings" (numerical vectors).
Indexing: These vectors are stored in a managed vector store so they can be searched later.

2. The Retrieval Phase (Chatting): This is what happens in real time when you ask a question.

Search: You type a question in the console. Bedrock converts your question into vectors and searches your Knowledge Base for the most relevant PDF chunks.
Generate: Bedrock takes your question plus those relevant chunks and sends them to Anthropic Claude.
Answer: Claude writes a response based specifically on the information found in your PDF.
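To make the "splits it into smaller chunks" part of the ingestion phase concrete, here is a minimal sketch of fixed-size chunking with overlap. It counts whitespace-separated words instead of tokens, and `chunk_text`, `chunk_size`, and `overlap` are hypothetical names I made up for illustration. This is not Bedrock's actual chunker, just the general idea.

```python
def chunk_text(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks
```

The overlap is there so that a sentence cut at a chunk boundary still appears whole in at least one chunk, which tends to help retrieval.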

Amazon Bedrock

Amazon Bedrock is officially defined as a machine learning platform for building generative AI applications on AWS. But for developers like you and me, I prefer to think of it as a central generative AI hub.

Instead of having to build and manage complex, expensive server infrastructure for every separate model, Bedrock handles the heavy lifting. Like many AWS services, it’s ✨fully managed✨. You might notice a similarity to LiteLLM: both tools let you easily swap between different models without rewriting your entire application. But they sit at different parts of the stack. While LiteLLM is a code library that unifies your API calls, Bedrock is the actual infrastructure hosting the models. It’s the big boy!

This setup is perfect for our project because it gives us native access to a huge catalog of foundation models. We can use Amazon Titan for our 'finder' (embeddings) and Anthropic Claude for our 'explainer' (chat), all running securely within our AWS environment.

We’re also going to explore a feature called Knowledge Bases. Have you ever talked with a smart customer support bot? Those usually rely on the company’s knowledge base. Knowledge Bases are ways of augmenting AI tools that were trained on general data by giving them access to your specific data.

Implementation

We’ll achieve this in a couple of steps.

Step 1: Upload data source to S3

Before our Amazon Bedrock application can "chat with" your PDFs, we need a secure and scalable place to store them. Amazon S3 (Simple Storage Service) is the perfect, cost-effective solution for this. In this step, we'll navigate to the S3 console and create a new "bucket"—which is like a container for your files.

From the AWS Console, navigate to S3 and click Create bucket.

This will take you to the main configuration page. First, you'll need to give your bucket a globally unique name—something like myawsbucket will already be taken, so add your initials or a project name.

The most important setting here is the AWS Region. Make sure to choose the same Region you plan to use for Amazon Bedrock (like us-east-1 (N. Virginia)). Keeping your services in one Region avoids cross-Region latency and data transfer charges.

As you scroll down, we'll configure the security settings. For our app, we want to keep Block all public access checked. This is the secure default, and our application will be given private, secure access later. We also want to enable Default encryption using SSE-S3. This is a standard best practice that tells S3 to automatically encrypt our documents for us.

Once those settings are in place, you can click Create bucket.

Once the bucket has been created, upload your PDFs. This one is pretty straightforward.

Yay! Now, you have an S3 bucket with the documents.
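If you would rather script this step than click through the console, here is a rough sketch of the same settings. It assumes `s3` is a boto3 S3 client (for example `boto3.client("s3", region_name="us-east-1")`); the function names and the bucket name you pass in are hypothetical.

```python
import os


def create_private_bucket(s3, bucket: str, region: str) -> None:
    """Create a bucket with all public access blocked and SSE-S3 default encryption."""
    params = {"Bucket": bucket}
    if region != "us-east-1":
        # us-east-1 is the default location and is not passed as a constraint
        params["CreateBucketConfiguration"] = {"LocationConstraint": region}
    s3.create_bucket(**params)
    s3.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )
    s3.put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration={
            # "AES256" is how the API spells SSE-S3
            "Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]
        },
    )


def upload_pdfs(s3, bucket: str, paths: list[str]) -> list[str]:
    """Upload local files, using each file name as the object key."""
    keys = []
    for path in paths:
        key = os.path.basename(path)
        s3.upload_file(path, bucket, key)
        keys.append(key)
    return keys
```

Passing the client in as a parameter keeps the functions easy to try against a stub before pointing them at a real account.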

Step 2: Request model access

Now that we have a place to store our PDFs, we need to get the "brains" for our application. By default, most foundation models in Amazon Bedrock are not enabled. We need to manually request access before we can use them. In the Bedrock console, navigate to Model access (it's at the bottom of the left-hand navigation menu).

For this, we are going to be using the Titan Embedding model and Claude Sonnet as our chat model. You can use any embedding or chat model of your choice. Now, you may be wondering, what do these terms even mean?

To my understanding, an embedding model is basically a translator that turns your text into numerical representations that can be compared for similarity. Think of it like this: when you store PDFs or documents, they're just big blocks of text that are hard to search through efficiently. Bedrock breaks these documents down into smaller, digestible chunks, and the embedding model converts each chunk into what we call "vectors."

For example, let’s say you have a document that says "Our return policy allows 30 days for refunds." The embedding model might convert this into something like [0.2, 0.8, 0.1, 0.9...] - a bunch of numbers that represent the meaning.

When someone asks "How long do I have to return something?", that question also gets converted to numbers like [0.3, 0.7, 0.2, 0.8...]. The system then finds which document chunks have the most similar number patterns.
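One common way to measure "most similar number patterns" is cosine similarity. Here is a small sketch using only the first four numbers from the two examples above (real embeddings are much longer). The `unrelated_chunk` vector is made up for contrast, and I am assuming cosine similarity here for illustration; your vector store may be configured with a different distance metric.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: close to 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


policy_chunk = [0.2, 0.8, 0.1, 0.9]     # "Our return policy allows 30 days for refunds."
question = [0.3, 0.7, 0.2, 0.8]         # "How long do I have to return something?"
unrelated_chunk = [0.9, 0.1, 0.8, 0.1]  # made-up vector for an unrelated chunk

print(round(cosine_similarity(question, policy_chunk), 2))     # about 0.99
print(round(cosine_similarity(question, unrelated_chunk), 2))  # about 0.43
```

The return-policy chunk scores far higher than the unrelated one, so it is the chunk that would be handed to the chat model.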

The chat model (in our case, Claude Sonnet) is the one that actually talks to you. Once the embedding model finds the relevant information chunks, the chat model takes those chunks and your original question, then generates a natural, conversational response. So the embedding model is the "finder" and the chat model is the "explainer."

Step 3: Create the knowledge base with your data source

Alright, this is the main event! Now that we have our S3 bucket ready and our models enabled, we can tie everything together by creating the Knowledge Base itself.

From the Amazon Bedrock console, navigate to Knowledge Bases and click the Create Knowledge Base button.

This will open a multi-step wizard, which is what you see in the screenshot.

On the first page (Provide Knowledge Base details):

Knowledge Base name: Give your KB a clear, descriptive name, like my-pdf-chat-kb. You could also use the default.
IAM permissions: This step is critical. We need to give Bedrock permission to act on our behalf. If you already have a service role, you can use that. Otherwise, select Create and use a new service role and give the new role a descriptive name. This IAM role allows Bedrock to securely access our S3 bucket, call the embedding model, and (in the next step) write to our vector database.

Once you've filled that out, click Next.

On the second page (Configure data source):

This is where we tell our Knowledge Base where to find its, well... knowledge.

Data source name: You can give this a simple name, like pdf-document-source, or use the default.
Data source location: Click the Browse S3 button. A new window will pop up, allowing you to select the S3 bucket we created back in Step 1.
Chunking & Parsing: Leave the default options (Amazon Bedrock default parser and default chunking).

After you select your bucket, you can click Next. This will take us to the next step, where we'll choose our embedding model and vector database.

Step 4: Select the embedding model

Now we're on the "Configure data storage and processing" page.

Embeddings model: Select the Titan Embeddings model. This is the "translator" that will turn our text chunks into vectors.
Vector store: Choose Quick create a new vector store. This is the simplest, recommended option.
Vector store type: Select Amazon OpenSearch Serverless from the dropdown. This creates a powerful, high-performance vector database for you.

Once these are selected, click Next to go to the final review step.

Step 5: Test our setup

After you've configured your vector store, you'll be taken to a final "Review and create" page. Double-check that all your settings are correct and click Create Knowledge Base.

It will take a few minutes for the Knowledge Base to be created. Once it's ready, scroll down to the Data source section, select your data source and click Sync. This action kicks off the ingestion process, where Bedrock reads your PDFs from S3, uses the Titan model to create vectors, and loads them into your OpenSearch database. This may take a few minutes, depending on the size of your documents.
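The Sync button can also be driven from code. Below is a minimal sketch, assuming `agent` is a boto3 `bedrock-agent` client and that you already know your Knowledge Base ID and data source ID. The function name and the polling interval are my own choices, not anything prescribed by AWS.

```python
import time


def sync_data_source(agent, kb_id: str, ds_id: str,
                     poll_seconds: float = 10.0, sleep=time.sleep) -> str:
    """Start an ingestion job (what Sync does) and poll until it stops running."""
    job = agent.start_ingestion_job(
        knowledgeBaseId=kb_id, dataSourceId=ds_id
    )["ingestionJob"]
    # Keep polling while the job is still in a transitional state
    while job["status"] in ("STARTING", "IN_PROGRESS", "STOPPING"):
        sleep(poll_seconds)
        job = agent.get_ingestion_job(
            knowledgeBaseId=kb_id,
            dataSourceId=ds_id,
            ingestionJobId=job["ingestionJobId"],
        )["ingestionJob"]
    return job["status"]  # e.g. "COMPLETE" or "FAILED"
```

The `sleep` parameter exists only so the function can be exercised with a stub client without actually waiting.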

Once the "Last sync status" shows as Completed, it's time for the moment of truth!

Test Your Knowledge Base

Click the Test button in the top right (or on the left-hand menu).

Select Test Mode: In the "Configurations" window, make sure you select Retrieval and response generation. This tells Bedrock to use both our "finder" (the vector database) and our "explainer" (the chat model) to give a natural answer.
Select the Chat Model: Click the Select model button. This is where we choose our "explainer." As shown in the screenshot, a window will pop up. Navigate to the Anthropic provider and select Claude 3.5 Sonnet.
Chat with Your Data: Click "Apply," and the chat window is ready. Ask a question that can only be answered by your PDFs, and watch as Bedrock retrieves the information and uses Claude to give you a perfect, conversational answer.

And just like that, you've built a fully functional "Chat with your PDF" application using Amazon Bedrock! 🎉
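The console's "Retrieval and response generation" mode has an API counterpart, so the same question can be asked from code. A minimal sketch, assuming `runtime` is a boto3 `bedrock-agent-runtime` client, `kb_id` is your Knowledge Base ID, and `model_arn` is the ARN of the chat model you enabled (all three are placeholders you supply). The helper name is hypothetical.

```python
def ask_knowledge_base(runtime, kb_id: str, model_arn: str, question: str):
    """Send a question through retrieve-and-generate; return (answer, source URIs)."""
    response = runtime.retrieve_and_generate(
        input={"text": question},
        retrieveAndGenerateConfiguration={
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn,
            },
        },
    )
    # Collect the S3 URIs of the chunks the answer was grounded on, without duplicates
    sources = []
    for citation in response.get("citations", []):
        for ref in citation.get("retrievedReferences", []):
            uri = ref.get("location", {}).get("s3Location", {}).get("uri")
            if uri and uri not in sources:
                sources.append(uri)
    return response["output"]["text"], sources
```

Returning the source URIs alongside the answer makes it easier to spot-check whether the response really came from your PDFs.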

Testing, Errors, and the All-Important Cost

Now, one thing I've learned from my many AI escapades: don’t overestimate its ability. My first prompt is always a simple "What are your capabilities?" It gave a great summary of what it could do based on my bank statements. So far, so good.

But then I asked a more specific question, and it showed only half of the deposits. I kept re-testing and asking more questions to try to get a better answer, and finally ran into the infamous error:

Error Fetching
Too many requests, please wait before trying again. 
(Service: BedrockRuntime, Status Code: 429, Request ID: xxx-9226-xxx-8efe-xxxxx)

Anyways, the point is, it works & is accurate sometimes. That 429 error isn't a bug, by the way. It's throttling, which means you've hit the API rate limits. You just have to wait a few minutes and try again. When you "Sync" or test, Bedrock makes a ton of calls to the embedding model, and it's easy to hit the default quota.
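When you call Bedrock from code instead of the console, the usual way to live with throttling is to retry with exponential backoff. Here is a small, generic sketch. `call_with_backoff` and `is_throttling_error` are hypothetical helpers, and the error codes checked are my assumption about how a boto3 `ClientError` reports a 429, so adjust them to whatever your client actually raises.

```python
import random
import time


def is_throttling_error(exc: Exception) -> bool:
    """Assumes a boto3-style error that carries response['Error']['Code']."""
    code = getattr(exc, "response", {}).get("Error", {}).get("Code", "")
    return code in ("ThrottlingException", "TooManyRequestsException")


def call_with_backoff(fn, is_throttled=is_throttling_error, max_attempts: int = 5,
                      base_delay: float = 1.0, sleep=time.sleep):
    """Call fn(); on throttling, wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            # Give up on non-throttling errors and on the final attempt
            if not is_throttled(exc) or attempt == max_attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
```

You would wrap the actual request in a zero-argument function or lambda and pass it as `fn`. The random jitter keeps several clients from retrying in lockstep.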

Let's Check the Bill

Now for the most important part. What did this experiment cost?

Amazon OpenSearch Serverless: $4.25
Amazon Bedrock: $0.38
Route 53: $0.50
Total: $5.32

OpenSearch Serverless was about 80% of the cost 💀, and it keeps billing until the collection is fully deleted. Deletion takes about 5-10 minutes, and unfortunately you are charged for OCUs (OpenSearch Compute Units) during this time. S3 Vectors, on the other hand, stops billing immediately after the bucket is deleted, so there are no ongoing costs during deletion.
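For anyone who wants to check the arithmetic, here is a tiny sketch using the numbers from the bill above. One thing it surfaces: the three line items add up to $5.13, so $0.19 of the $5.32 total is not itemized in this post. The percentage is computed against the billed total.

```python
from decimal import Decimal

line_items = {
    "Amazon OpenSearch Serverless": Decimal("4.25"),
    "Amazon Bedrock": Decimal("0.38"),
    "Route 53": Decimal("0.50"),
}
billed_total = Decimal("5.32")


def share_of_total(amount: Decimal, total: Decimal) -> Decimal:
    """Percentage of the total, rounded to one decimal place."""
    return (amount / total * 100).quantize(Decimal("0.1"))


itemized = sum(line_items.values())
print(itemized)                 # 5.13
print(billed_total - itemized)  # 0.19
print(share_of_total(line_items["Amazon OpenSearch Serverless"], billed_total))  # 79.9
```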

Funny thing is, I deleted all the resources immediately after asking my questions (there’s a right and a wrong way to do this, by the way; check the clean-up section).

I have actually built something like this before and the accuracy was on point! A lot has happened between then and now, and it doesn’t seem to be as accurate anymore. Oh well!

I am curious about a few things though:

Why is it impossible to see the OpenSearch cluster?
Why do I keep running into a 'limit reached' error?

The Low-Cost Alternative: S3 Vectors

That high OpenSearch cost is exactly why the S3 Vectors option is appealing.

You can follow the exact same steps as above, but for the "Vector store type," just choose Amazon S3 Vectors.

Unfortunately, I couldn't test this out properly before running into the same 429 throttling error. S3 Vectors is still in preview, so I guess we'll have to wait. I'll update this post when S3 Vectors becomes more generally available.

What's Next?

This manual setup is great for learning. Now that we know what to do, we can automate it with IaC tools like Terraform or CloudFormation. That’s what we’ll be doing in Part 2 of this series. Then, for Part 3, we will build a simple React wrapper to handle the 429 errors and make this something we can use in prod. You can take this further by creating an agent on top of this Knowledge Base.

CLEAN-UP

When I built this for the first time, I couldn’t find a clear guide on cleaning it up. The most expensive part of this setup is the OpenSearch Serverless vector store. There’s a right and a wrong way to clean this up, so please do not be like me.

Step 1: Change Data Deletion Policy (CRITICAL FIRST STEP)

Before deleting anything, update the data deletion policy to RETAIN.

In the AWS Console:

Go to Bedrock Console → Knowledge bases
Select your Knowledge Base
Under Data source, select the data source
Click Edit
Expand Advanced settings
Change Data deletion policy from DELETE → RETAIN
Click Submit

Why this matters:
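With the policy set to DELETE, Bedrock tries to remove the embedded data from the vector store whenever the data source or Knowledge Base is deleted. If the vector store is already gone or unreachable, that deletion fails.
Setting the policy to RETAIN skips that clean-up, so the deletion can go through. Bedrock does not delete the vector store itself either way; you remove it manually in Step 4.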

Step 2: Delete the Data Source

In Bedrock Console → Knowledge bases, select your Knowledge Base.
In the Data source section, select the data source.
Click Delete and confirm.
Wait for the status to show Deleted before proceeding.

Step 3: Delete the Knowledge Base

Go to Bedrock Console → Knowledge bases
Select your Knowledge Base
Click Delete
Type delete to confirm and click Delete
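Steps 1 to 3 can also be scripted. Here is a minimal sketch, assuming `agent` is a boto3 `bedrock-agent` client. The function name, the polling loop, and its limits are my own choices. It reads the data source back first because, as far as I can tell, the update call expects the existing name and configuration to be sent again. It also ignores pagination when listing data sources, which should be fine for a Knowledge Base with only a handful of them.

```python
import time


def delete_kb_in_order(agent, kb_id: str, ds_id: str, poll_seconds: float = 5.0,
                       max_polls: int = 60, sleep=time.sleep) -> None:
    """Set RETAIN, delete the data source, wait for it to go, then delete the KB."""
    # Step 1: switch the data deletion policy to RETAIN
    ds = agent.get_data_source(knowledgeBaseId=kb_id, dataSourceId=ds_id)["dataSource"]
    agent.update_data_source(
        knowledgeBaseId=kb_id,
        dataSourceId=ds_id,
        name=ds["name"],
        dataSourceConfiguration=ds["dataSourceConfiguration"],
        dataDeletionPolicy="RETAIN",
    )
    # Step 2: delete the data source and wait until it no longer shows up
    agent.delete_data_source(knowledgeBaseId=kb_id, dataSourceId=ds_id)
    for _ in range(max_polls):
        summaries = agent.list_data_sources(knowledgeBaseId=kb_id)["dataSourceSummaries"]
        if all(s["dataSourceId"] != ds_id for s in summaries):
            break
        sleep(poll_seconds)
    else:
        raise TimeoutError(f"data source {ds_id} is still listed; not deleting the KB")
    # Step 3: delete the Knowledge Base itself
    agent.delete_knowledge_base(knowledgeBaseId=kb_id)
```

The vector store, its security policies, and the IAM role are deliberately left out here; those are the remaining steps below.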

Step 4: Delete the Vector Store

For S3 Vectors:

Open S3 Console → Buckets
Find the S3 vector bucket (usually includes “bedrock” in the name)
Select it and click Delete
Follow the deletion prompts

For OpenSearch Serverless:

Open OpenSearch Console → Serverless → Collections
Select your collection
Click Delete
Type the collection name to confirm and delete

Step 5: Delete Security Policies (OpenSearch only)

Go to OpenSearch Console → Serverless → Security policies
Delete in this order:

Data access policy
Network policy
Encryption policy

Select each policy and click Delete after confirming.
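For the OpenSearch Serverless path, Steps 4 and 5 look roughly like this in code. This is a sketch that assumes `aoss` is a boto3 `opensearchserverless` client and that you look up the collection ID and the three policy names yourself (Bedrock's quick-create generates them, so the names passed in here are placeholders).

```python
def delete_aoss_resources(aoss, collection_id: str, data_policy: str,
                          network_policy: str, encryption_policy: str) -> None:
    """Delete the collection, then its policies in the order listed above."""
    aoss.delete_collection(id=collection_id)
    # Data access policies and security policies are separate API calls
    aoss.delete_access_policy(name=data_policy, type="data")
    aoss.delete_security_policy(name=network_policy, type="network")
    aoss.delete_security_policy(name=encryption_policy, type="encryption")
```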

Step 6: Delete the IAM Role

Go to IAM Console → Roles
Search for your Bedrock KB role (e.g., AmazonBedrockExecutionRoleForKnowledgeBase_*)
Select it and click Delete
Confirm deletion

S3 Vectors

Delete the index first:

aws s3vectors delete-index --vector-bucket-name YOUR_VECTOR_BUCKET_NAME --index-name YOUR_VECTOR_INDEX_NAME --region YOUR_REGION

Then delete the vector bucket:

aws s3vectors delete-vector-bucket --vector-bucket-name YOUR_VECTOR_BUCKET_NAME --region YOUR_REGION
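The same "index first, then bucket" order can be sketched in Python. This assumes `s3v` is a boto3 `s3vectors` client and that the parameter names mirror the CLI flags above; the function name is hypothetical. It lists every index in the bucket rather than assuming there is only one.

```python
def delete_vector_bucket_and_indexes(s3v, bucket: str) -> list[str]:
    """Delete every index in a vector bucket, then the bucket; return deleted index names."""
    # Collect all index names first, following pagination tokens if present
    names = []
    kwargs = {"vectorBucketName": bucket}
    while True:
        page = s3v.list_indexes(**kwargs)
        names.extend(index["indexName"] for index in page.get("indexes", []))
        token = page.get("nextToken")
        if not token:
            break
        kwargs["nextToken"] = token
    # Indexes have to go before the bucket itself
    for name in names:
        s3v.delete_index(vectorBucketName=bucket, indexName=name)
    s3v.delete_vector_bucket(vectorBucketName=bucket)
    return names
```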

Common Deletion Errors & Solutions

If you’ve worked with Bedrock Knowledge Bases, you’ve probably seen deletion errors that seem impossible to resolve.
Below are some of the most common ones I ran into, what causes them, and what actually worked to fix them 👇

Error 1: "Failed to delete knowledge base - DELETE_UNSUCCESSFUL"

You may notice your Knowledge Base is stuck in the DELETE_UNSUCCESSFUL state with an error saying
“Unable to delete data from vector store for data source.”

This usually happens when the vector store is deleted first or when your data deletion policy is set to DELETE.
In some cases, it’s just a permissions issue on your vector store.

Here’s what helps:

Go to your Knowledge Base → Data Source → Edit.
Under Advanced settings, change the Data deletion policy to RETAIN and click Submit.
Try deleting the Knowledge Base again.

If it’s still stuck, recreate the vector store with the exact same name:

For OpenSearch: recreate the collection
For S3 Vectors: recreate the bucket

Once recreated, delete the KB again (with RETAIN), then manually delete the vector store afterward.

Here’s what it looks like in the console:

[Screenshot: Sync Data Source button in AWS Bedrock Console]

Error 2: Knowledge Base stuck in "DELETING"

Sometimes your KB stays in DELETING for hours or even days, and you can’t do anything with it.
This is often a backend cleanup delay or a known console language bug.

Try these steps:

Change your AWS Console language to English — it sounds odd, but this fixes the stuck state for many users.

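If that doesn’t work, use the AWS CLI. Note that update-data-source also expects the data source's current name and configuration, which you can read back with aws bedrock-agent get-data-source (YOUR_DS_NAME and data-source-config.json below are placeholders):

aws bedrock-agent update-data-source \
  --knowledge-base-id YOUR_KB_ID \
  --data-source-id YOUR_DS_ID \
  --name YOUR_DS_NAME \
  --data-source-configuration file://data-source-config.json \
  --data-deletion-policy RETAIN

aws bedrock-agent delete-knowledge-base --knowledge-base-id YOUR_KB_ID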

If it’s still not deleting after 24 hours, open an AWS Support ticket and include your KB ID.

Error 3: "Policy is attached to 0 entities but must be attached to a single role"

You might see this error when trying to change the data deletion policy.
It usually means the IAM role associated with your Knowledge Base was deleted, renamed, or had its permissions removed.

What to do:

Go to IAM → Roles → [Your KB Role].
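Make sure the role still exists and has the following permissions:
- AmazonBedrockFullAccess
- AmazonOpenSearchServiceFullAccess or AmazonS3FullAccess (depending on your vector store)
These managed policies are broad. Also note that OpenSearch Serverless and S3 Vectors use their own IAM action prefixes (aoss: and s3vectors:), so it is worth checking that the role's policies actually allow those actions on your collection or vector bucket.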
If the role was deleted or detached:
- Recreate or reattach the correct role to your data source.
If the role no longer exists and you can’t recover it:
- Contact AWS Support — the console won’t let you fix this manually.

Error 4: "vectorIngestionConfiguration.parsingConfiguration cannot be updated once created"

If you get this error while editing your data source, don’t panic — it’s not a blocker.
The parsing configuration becomes immutable after the data source is created, so AWS won’t let you modify it later.

Here’s what to do instead:

When editing your data source, avoid changing anything under Chunking and parsing configurations.
Expand Advanced settings → Data deletion policy and set it to RETAIN.
Leave everything else as-is, then click Submit.

That’s all you need to successfully retry your Knowledge Base deletion without triggering this error again.

Error 5: "Unable to delete Knowledge Base: Data Source still in use"

This one pops up when you try to delete your Knowledge Base before removing the attached data source.
It’s a dependency issue — Bedrock prevents you from deleting a KB that still references an active data source.

Fix it:

Go to your Knowledge Base in the AWS Console.
Under Data Sources, delete all connected data sources first.
Wait a few seconds for the status to change to Deleted.
Now, go back and delete your Knowledge Base — it should succeed this time.

If the data source deletion seems stuck in a “Deleting” state:

Refresh after 2–3 minutes.
If it still doesn’t clear, open the CloudWatch logs for more details, or retry deletion from the CLI using:
```
aws bedrock-agent delete-data-source --knowledge-base-id YOUR_KB_ID --data-source-id YOUR_DS_ID
```

Error 6: "The request failed because the vector store index was not found"

This occurs when Bedrock expects a vector store (an OpenSearch index or an S3 Vectors index) that was deleted, renamed, or moved outside of Bedrock.

How to fix it:

Confirm the vector store exists:
- OpenSearch: verify the collection/index name in the OpenSearch console.
- S3 Vectors: verify the vector bucket and index name in the S3 console.
If it was removed, either:
- Recreate the vector store with the same name, or
- Update the Knowledge Base configuration to point to the new store.
If the store exists but the error persists:
- Remove and re-add the data source in the console, then retry the operation, or reindex the data.

Error 7: "One or more data sources need to be synced before this Knowledge Base can be tested"

You’ll see this error when you try to test or query your Knowledge Base before syncing the connected data sources.
It simply means Bedrock hasn’t indexed the data yet.