
How to Build a Private ChatGPT on AWS with Bedrock (2026 Guide)


Your team wants ChatGPT. Your security officer wants to know where the data goes. Your compliance team wants GDPR and HIPAA guarantees. Your CFO is tired of per-seat subscription fees.
Amazon Bedrock solves all four problems at once, and this guide shows you exactly how to build a fully private, production-ready AI assistant on AWS that your legal, security, and finance teams will all sign off on.

What Is a “Private ChatGPT” and Why Do You Need One?

When companies use the public version of ChatGPT or similar tools, their prompts (which often contain internal documents, customer data, financial figures, or proprietary code) are sent to a third-party server outside their control. Even with enterprise agreements in place, you’re relying on contractual promises rather than architectural guarantees.

A private ChatGPT built on AWS Bedrock works differently: your data never leaves your AWS account. Queries are processed inside your own Virtual Private Cloud (VPC), your prompts are never used to train any model, and every interaction is logged in your own CloudTrail for audit purposes.

This matters most for:

  • Healthcare organizations handling Protected Health Information (PHI) under HIPAA
  • Financial services firms with PCI-DSS or SOC 2 requirements
  • EU-based companies with GDPR data residency obligations
  • SaaS companies that cannot risk exposing client data to third-party AI services
  • Any team with proprietary code, internal documents, or trade secrets

What You’ll Build

By the end of this guide, you’ll have a working private AI assistant that:

  • Answers questions using your own internal documents (via RAG)
  • Runs entirely inside your AWS account with no data leaving
  • Uses Claude (Anthropic) or any other Bedrock foundation model
  • Has proper IAM controls, encryption, and audit logging in place
  • Can be accessed by your team via a clean chat interface

Architecture overview:

Users > [Chat UI / API Gateway] > [Lambda] > [Bedrock (Claude / Llama / Titan)]
                                                    |
                                         [Bedrock Knowledge Base]
                                                    |
                                    [S3 (your documents)] > [OpenSearch / Vector DB]

All traffic stays inside your VPC via AWS PrivateLink

Prerequisites

Before you start, you’ll need:

  • An AWS account with administrator access (or at minimum, permissions for Bedrock, IAM, S3, Lambda, and API Gateway)
  • AWS CLI installed and configured (aws configure)
  • Basic familiarity with the AWS Console
  • Your internal documents in PDF, TXT, Word, or HTML format (for the RAG knowledge base)

Step 1: Enable Amazon Bedrock and Request Model Access

Amazon Bedrock is a fully managed service, so there is no infrastructure to provision. But you do need to explicitly request access to the foundation models you want to use.

In the AWS Console:

  1. Navigate to Amazon Bedrock > Model Access (left sidebar)
  2. Click Manage model access
  3. Select the models you want to enable. For a general-purpose private ChatGPT, we recommend starting with:
    • Anthropic Claude Sonnet 4.6 – best balance of intelligence and cost
    • Amazon Titan Text – AWS-native, no third-party model provider
    • Meta Llama 3 – strong open-source alternative
  4. Click Request model access – approval is typically instant for most models

Note: Model access is regional. If you’re deploying in eu-west-1 for GDPR compliance, request access in that specific region.
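Because access is regional, the ARN you reference in IAM policies and API calls must carry the matching region. A small helper (the function is ours, but the ARN format matches the IAM policy in Step 2) keeps this consistent:

```python
def foundation_model_arn(region: str, model_id: str) -> str:
    """Build a Bedrock foundation-model ARN for a given region.

    Foundation-model ARNs have an empty account field, which is why
    there are two colons before 'foundation-model'.
    """
    return f"arn:aws:bedrock:{region}::foundation-model/{model_id}"

# Example: the Claude model ARN used later in this guide
print(foundation_model_arn("eu-west-1", "anthropic.claude-sonnet-4-5-20251001"))
```

If you later deploy the same stack to a second region, regenerating the ARN this way is less error-prone than hand-editing policy documents.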

Step 2: Set Up IAM Roles with Least Privilege

This is the step most tutorials skip, and the one that matters most for security. Never use root credentials or overly broad policies with Bedrock.

Create a dedicated IAM role for your application:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream",
        "bedrock:RetrieveAndGenerate",
        "bedrock:Retrieve"
      ],
      "Resource": [
        "arn:aws:bedrock:eu-west-1::foundation-model/anthropic.claude-sonnet-4-5-20251001",
        "arn:aws:bedrock:eu-west-1:YOUR_ACCOUNT:knowledge-base/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::your-private-knowledge-base-bucket",
        "arn:aws:s3:::your-private-knowledge-base-bucket/*"
      ]
    }
  ]
}

Key principle: Only grant access to the specific model ARNs and knowledge base resources your application needs. Do not use wildcards (*) on Bedrock resources in production.
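One way to enforce that principle automatically is a small check in CI over your policy documents. This is a sketch of ours, not an AWS tool; it flags any statement that grants bedrock: actions against a bare * resource:

```python
import json

def bedrock_wildcard_violations(policy_json: str) -> list:
    """Return statements that grant bedrock:* actions on a bare '*' resource."""
    violations = []
    for stmt in json.loads(policy_json).get("Statement", []):
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        resources = stmt.get("Resource", [])
        if isinstance(resources, str):
            resources = [resources]
        if any(a.startswith("bedrock:") for a in actions) and "*" in resources:
            violations.append(stmt)
    return violations

policy = '{"Version": "2012-10-17", "Statement": [{"Effect": "Allow", "Action": "bedrock:InvokeModel", "Resource": "*"}]}'
print(len(bedrock_wildcard_violations(policy)))  # → 1
```

Run it against every policy your IaC pipeline produces and fail the build on a non-empty result.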

Step 3: Create Your Knowledge Base (RAG Setup)

This is what makes your private ChatGPT actually useful. Instead of answering from generic training data, it answers from your internal documents.

Bedrock Knowledge Bases automates the full RAG pipeline: you upload documents, and the service chunks them, embeds them as vectors, stores them in a vector database, and retrieves the relevant chunks at query time.

3a. Upload Your Documents to S3

# Create a private S3 bucket (block all public access)
aws s3api create-bucket \
  --bucket your-company-knowledge-base \
  --region eu-west-1 \
  --create-bucket-configuration LocationConstraint=eu-west-1

# Enable versioning and server-side encryption
aws s3api put-bucket-versioning \
  --bucket your-company-knowledge-base \
  --versioning-configuration Status=Enabled

aws s3api put-bucket-encryption \
  --bucket your-company-knowledge-base \
  --server-side-encryption-configuration '{
    "Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "aws:kms"}}]
  }'

# Upload your documents
aws s3 sync ./internal-docs/ s3://your-company-knowledge-base/docs/

Bedrock Knowledge Bases supports PDF, TXT, HTML, Markdown, Word (.docx), and CSV files.

3b. Create the Knowledge Base in the Console

  1. Go to Amazon Bedrock > Knowledge Bases > Create knowledge base
  2. Name: company-internal-kb
  3. IAM permissions: Create a new service role (Bedrock will configure the permissions automatically)
  4. Data source: Amazon S3 – select your bucket and prefix (docs/)
  5. Embedding model: Select Amazon Titan Embeddings v2 (best price/performance for most use cases)
  6. Vector store: For production, select Amazon OpenSearch Serverless. For lower-cost setups, use Amazon Aurora PostgreSQL with pgvector

Cost note: OpenSearch Serverless has a minimum cost of approximately $700/month (4 OCUs). For smaller teams, using Aurora PostgreSQL with pgvector significantly reduces this floor. See the cost breakdown section below.

  7. Click Create knowledge base and wait for the initial sync to complete (around 5 to 10 minutes for a few hundred documents)
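One thing the console flow doesn't make obvious: the knowledge base does not re-index automatically when you add or update documents in S3. You trigger a new ingestion job each time, either with the Sync button in the console or from the CLI (the IDs below are placeholders for your own knowledge base and data source):

```shell
aws bedrock-agent start-ingestion-job \
  --knowledge-base-id YOUR_KB_ID \
  --data-source-id YOUR_DATA_SOURCE_ID \
  --region eu-west-1
```

For frequently changing document sets, consider running this on a schedule (for example, from an EventBridge rule).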

Step 4: Build the API Layer

Now we connect the front end to Bedrock via a serverless API. This is the simplest production-grade setup: API Gateway > Lambda > Bedrock.

Lambda Function

import json
import boto3

bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='eu-west-1')

KNOWLEDGE_BASE_ID = 'YOUR_KB_ID'
MODEL_ARN = 'arn:aws:bedrock:eu-west-1::foundation-model/anthropic.claude-sonnet-4-5-20251001'

def lambda_handler(event, context):
    body = json.loads(event['body'])
    user_query = body.get('message', '')

    if not user_query:
        return {'statusCode': 400, 'body': json.dumps({'error': 'No message provided'})}

    try:
        response = bedrock_agent_runtime.retrieve_and_generate(
            input={'text': user_query},
            retrieveAndGenerateConfiguration={
                'type': 'KNOWLEDGE_BASE',
                'knowledgeBaseConfiguration': {
                    'knowledgeBaseId': KNOWLEDGE_BASE_ID,
                    'modelArn': MODEL_ARN,
                    'generationConfiguration': {
                        'promptTemplate': {
                            'textPromptTemplate': """You are a helpful assistant for our company.
                            Use the following context to answer the question.
                            If you don't know the answer, say so clearly.

                            Context: $search_results$

                            Question: $query$"""
                        }
                    }
                }
            }
        )

        answer = response['output']['text']
        citations = response.get('citations', [])

        return {
            'statusCode': 200,
            'headers': {
                'Content-Type': 'application/json',
                'Access-Control-Allow-Origin': 'https://your-internal-domain.com'
            },
            'body': json.dumps({
                'answer': answer,
                'sources': [ref for c in citations for ref in c.get('retrievedReferences', [])]
            })
        }

    except Exception as e:
        return {'statusCode': 500, 'body': json.dumps({'error': str(e)})}

Deploy this Lambda with the IAM role you created in Step 2, and attach it to an API Gateway HTTP API endpoint.
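Once deployed, you can smoke-test the endpoint from any machine with a short script. The URL below is a placeholder for your API Gateway invoke URL; the request shape matches what the Lambda above expects:

```python
import json
import urllib.request

# Placeholder: replace with your API Gateway invoke URL
API_URL = "https://YOUR_API_ID.execute-api.eu-west-1.amazonaws.com/chat"

def build_request(url: str, message: str) -> urllib.request.Request:
    """POST body matching the Lambda's expected {'message': ...} shape."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )

if __name__ == "__main__":
    with urllib.request.urlopen(build_request(API_URL, "What is our leave policy?")) as resp:
        print(json.loads(resp.read())["answer"])
```

If you get a 403 here, check the API Gateway authorizer and the CORS origin configured in the Lambda response before suspecting Bedrock.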

Step 5: Security Hardening

This is what separates a properly built internal ChatGPT from a toy prototype. These steps are non-negotiable for production deployments.

5a. VPC Endpoints (Keep Traffic Off the Public Internet)

By default, your Lambda function calls Bedrock over the public internet. For true data isolation, route all traffic through VPC endpoints:

# Create VPC endpoint for Bedrock
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-xxxxxxxx \
  --service-name com.amazonaws.eu-west-1.bedrock-runtime \
  --vpc-endpoint-type Interface \
  --subnet-ids subnet-xxxxxxxx \
  --security-group-ids sg-xxxxxxxx

# Also create endpoint for S3 (gateway type, free)
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-xxxxxxxx \
  --service-name com.amazonaws.eu-west-1.s3 \
  --vpc-endpoint-type Gateway \
  --route-table-ids rtb-xxxxxxxx

With VPC endpoints in place, traffic between your Lambda and Bedrock never leaves the AWS network.

5b. Encrypt Everything with KMS

# Create a customer-managed KMS key for your AI workload
aws kms create-key \
  --description "Private ChatGPT encryption key" \
  --key-usage ENCRYPT_DECRYPT \
  --tags TagKey=Project,TagValue=PrivateChatGPT

Use this key for S3 bucket encryption (knowledge base documents), CloudWatch Logs encryption (query logs), and any DynamoDB tables storing conversation history.

5c. Enable Bedrock Guardrails

Bedrock Guardrails let you define content filters, block specific topics, and detect and mask PII, all before responses reach your users.

In the AWS Console: Bedrock > Guardrails > Create guardrail

Key policies to configure:

  • Harmful content filters: Set to HIGH for hate speech, violence, and misconduct
  • Denied topics: Add your company-specific off-limits topics (e.g., “competitor products,” “legal advice”)
  • Sensitive information: Enable PII detection and masking (SSNs, credit card numbers, etc.)
  • Grounding check: Enable to reduce hallucinations by verifying responses against retrieved context

5d. Enable CloudTrail Logging

Every Bedrock API call should be logged for audit purposes:

aws cloudtrail create-trail \
  --name bedrock-audit-trail \
  --s3-bucket-name your-audit-logs-bucket \
  --include-global-service-events \
  --is-multi-region-trail

aws cloudtrail start-logging --name bedrock-audit-trail

This gives you a complete audit trail of who asked what, when. That is essential for HIPAA and SOC 2 compliance.
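To spot-check the trail once logging is on, you can pull recent Bedrock events directly from the CLI, since CloudTrail indexes events by source:

```shell
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventSource,AttributeValue=bedrock.amazonaws.com \
  --max-results 20
```

For recurring compliance reviews, querying the trail's S3 bucket with Athena scales better than ad-hoc lookups.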

Step 6: Add a Chat Interface

For an internal tool, you have two practical options:

Option A: Open WebUI (fastest to deploy)

Open WebUI is an open-source ChatGPT-style interface that connects to any OpenAI-compatible API. To use it with Bedrock, you need the Bedrock Access Gateway (BAG), an open-source AWS project that translates OpenAI API calls into native Bedrock requests. Deploy both on Amazon ECS with Fargate: your Open WebUI container and the BAG container run in private subnets, communicate via AWS Cloud Map (service discovery), and are exposed to your users through an Application Load Balancer with HTTPS.

Option B: Custom React frontend

For more control over the UI, branding, or UX (for example, showing source citations from your knowledge base), build a simple React app and deploy it via Amazon CloudFront and S3. Use Amazon Cognito for user authentication. This gives you SSO integration, MFA enforcement, and proper session management without building auth from scratch.

Compliance: What Bedrock Covers Out of the Box

One of the biggest advantages of building on Bedrock rather than rolling your own LLM infrastructure is the compliance coverage you get included:

Standard                   | Bedrock Status
GDPR                       | Compliant – data stays in your chosen AWS region
HIPAA                      | Eligible – you must sign an AWS BAA
SOC 1, 2, 3                | In scope
ISO 27001, 27017, 27018    | Certified
FedRAMP High               | Authorized (GovCloud US-West)
CSA STAR Level 2           | Certified

AWS Bedrock does not use your prompts or responses to train any model, and your data is never shared with model providers. This is an architectural guarantee. Bedrock processes your data using its infrastructure, but the model providers (Anthropic, Meta, etc.) never receive your inference data.

For GDPR specifically: deploy in an EU region (eu-west-1, eu-central-1, or eu-north-1) and your data never crosses EU borders.

Cost Breakdown

Understanding Bedrock pricing upfront prevents bill shock. Here is a realistic breakdown for a team of 50 people using an internal ChatGPT daily:

Inference costs (Claude Sonnet 4.6):

  • Input tokens: $3.00 per million tokens
  • Output tokens: $15.00 per million tokens
  • Typical query: ~2,000 input tokens + ~500 output tokens
  • At 500 queries per working day: approximately $150/month in inference
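The inference figure is easy to sanity-check. Assuming usage concentrated on roughly 22 working days per month (our assumption), the arithmetic works out as follows:

```python
# Per-query token assumptions from the bullets above
queries_per_day = 500
input_tokens, output_tokens = 2000, 500
working_days = 22  # assumption: usage concentrated on business days

# Claude Sonnet pricing: $3.00 / 1M input tokens, $15.00 / 1M output tokens
input_cost = queries_per_day * input_tokens * working_days / 1e6 * 3.00
output_cost = queries_per_day * output_tokens * working_days / 1e6 * 15.00

print(f"${input_cost:.2f} input + ${output_cost:.2f} output = ${input_cost + output_cost:.2f}/month")
# → $66.00 input + $82.50 output = $148.50/month
```

Scale the query count and token sizes to your own usage; output tokens dominate the bill, which is why prompt caching (input-side) helps less than trimming verbose responses.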

Knowledge Base infrastructure:

  • OpenSearch Serverless (production): ~$700/month minimum (4 OCUs)
  • Aurora PostgreSQL with pgvector (cost-optimized): ~$50 to $100/month
  • Embedding model (Amazon Titan Embeddings v2): ~$0.02 per million tokens (a one-time cost at document ingestion)

Other services:

  • Lambda: Essentially free at this scale (~$1 to $5/month)
  • API Gateway: ~$3.50/million requests
  • CloudWatch Logs: ~$5 to $15/month
  • VPC Endpoints: ~$7/endpoint/month x 2 = ~$14/month

Realistic total for a 50-person team:

  • Using OpenSearch Serverless: approximately $850 to $950/month
  • Using pgvector instead: approximately $200 to $300/month

Compare this to ChatGPT Enterprise at $30/user/month x 50 users = $1,500/month, with none of the data control or compliance coverage.

Cost optimization tips:

  • Use prompt caching for repeated system prompts – reduces input token costs by up to 90%
  • Use batch inference for non-real-time workloads (document summarization, analysis) – 50% cheaper than on-demand
  • Choose Claude Haiku for simple classification or routing tasks; reserve Sonnet/Opus for complex queries
  • Use pgvector on Aurora instead of OpenSearch Serverless if your knowledge base is under 50,000 documents

When to Go Further

This guide covers a solid production-ready setup for most teams. But some scenarios call for more:

Add Bedrock Agents if you want your AI assistant to take actions, not just answer questions. Agents can call APIs, update databases, trigger workflows, or use tools autonomously. This is the path to building an AI “employee” rather than just a chatbot.

Consider fine-tuning only if your domain is highly specialized and RAG is not achieving the accuracy you need. Fine-tuning on Bedrock requires significantly more investment in data preparation, compute, and ongoing maintenance. For most teams, a well-configured RAG setup delivers better ROI.

Upgrade to SageMaker AI if you need full control over model training, custom model architectures, or inference infrastructure. SageMaker is a more powerful but more complex platform than Bedrock.

Frequently Asked Questions

Does AWS Bedrock train on my data?
No. AWS explicitly states that your content is never used to improve base models and is never shared with model providers. Your data is encrypted in transit and at rest.

Can I use Bedrock for HIPAA-compliant applications?
Yes, with conditions. You must sign an AWS Business Associate Agreement (BAA), deploy in HIPAA-eligible regions, and implement proper access controls, encryption, and audit logging. Bedrock is HIPAA-eligible, but compliance is a shared responsibility.

What is the difference between RAG and fine-tuning?
RAG (Retrieval Augmented Generation) connects your AI to a searchable knowledge base at query time. No model changes are needed, and it is easy to update. Fine-tuning modifies the model’s weights using your data, which is better for style and format consistency but expensive and harder to maintain. For most enterprise use cases, RAG is the right starting point.

How is this different from just using the ChatGPT Enterprise tier?
ChatGPT Enterprise processes your data on OpenAI’s servers, covered by a data processing agreement. With Bedrock, your data never leaves your AWS account. It is an architectural guarantee rather than a contractual one. You also have full observability via CloudTrail and can enforce your own IAM and VPC policies.

Can I use models other than Claude?
Yes. Bedrock provides access to models from Anthropic (Claude), Meta (Llama), Amazon (Titan, Nova), Mistral, Cohere, and Stability AI through a single API. You can switch models by changing a single parameter without rebuilding your application.
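That single parameter is the model ID. A sketch using the Converse API request shape (the helper is ours; model IDs shown are illustrative, and not every model supports every feature):

```python
def chat_request(model_id: str, prompt: str) -> dict:
    """Keyword arguments for bedrock-runtime's converse() call.
    Switching providers means changing only model_id."""
    return {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
    }

# The same request, three different providers
for model in ["anthropic.claude-sonnet-4-5-20251001",
              "meta.llama3-70b-instruct-v1:0",
              "amazon.titan-text-express-v1"]:
    params = chat_request(model, "Summarize our Q3 report.")
```

In practice you would pass these kwargs to boto3's bedrock-runtime client, which is exactly why routing cheap queries to a smaller model (as suggested in the cost tips) is a one-line change.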

Summary

Building a private ChatGPT on AWS Bedrock gives you enterprise AI capabilities without sacrificing data control, compliance, or security. The architecture is straightforward: your documents in S3, a Knowledge Base for RAG, Lambda and API Gateway for the API layer, Guardrails for content safety, and VPC endpoints to keep everything off the public internet.

The result is a system that is faster to deploy than building your own LLM infrastructure, cheaper than per-seat ChatGPT subscriptions at scale, and architecturally guaranteed to keep your data private.

Need Help Setting This Up?

Cloudvisor specializes in helping companies build secure, cost-optimized AI infrastructure on AWS. Whether you are starting from scratch or trying to get an existing Bedrock setup production-ready, our team of certified AWS experts can handle the architecture, security hardening, and ongoing optimization.

Or explore our AI Readiness Assessment to understand how prepared your current AWS infrastructure is for AI workloads.
