INTERNAL

Knowledge Search with RAG & Bedrock

Unified search across 21+ enterprise data sources (Slack, Teams, Confluence, Google Drive, Jira) with hybrid RAG, cited AI answers, and automatic failover across 100+ LLM models.

AWS Bedrock · RAG · ECS · Terraform · LiteLLM · Celery
Internal — Enterprise

Lead architect and infrastructure engineer. Hub-and-spoke AWS deployment, cross-account Bedrock access, Terraform + Azure DevOps CI/CD across DEV/STG/PRD.

21+ Data Connectors
100+ LLM Models
5 ECS Services
3 Environments

ARCHITECTURE

Enterprise AI Search Platform architecture diagram

HIGHLIGHTS

HYBRID SEARCH

Vector similarity (pgvector) and full-text search (tsvector) run in parallel, and their ranked results are merged with Reciprocal Rank Fusion. Optional reranking via Pinecone, Cohere, or FlashRank.
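As a rough illustration of the fusion step, here is a minimal Reciprocal Rank Fusion sketch. It assumes each retriever returns an ordered list of chunk IDs; the function name, the IDs, and the constant k=60 (a commonly used default) are illustrative assumptions, not values taken from this project.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of IDs into one list, best first.

    Each ID scores 1 / (k + rank) per list it appears in (rank starts at 1),
    and the scores are summed across lists. k=60 is an assumed default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two retrievers.
vector_hits = ["c1", "c2", "c3"]    # e.g. pgvector similarity order
fulltext_hits = ["c2", "c4", "c1"]  # e.g. tsvector rank order
fused = reciprocal_rank_fusion([vector_hits, fulltext_hits])
```

Because RRF uses only rank positions, the two retrievers do not need comparable score scales, which is convenient when mixing cosine distances with full-text ranks.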

LLM ROUTING WITH FAILOVER

LiteLLM sits in front of 100+ models. If Claude is rate-limited, the request goes to GPT-4o without the user noticing. Cooldown logic keeps a failing provider out of rotation for a while so it is not hammered with retries.
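The failover-with-cooldown idea can be sketched in plain Python. This is not LiteLLM's implementation or API; the class, the exception, and the 60-second cooldown are hypothetical names and values chosen only to show the control flow.

```python
import time


class RateLimited(Exception):
    """Raised by a provider callable when it is throttled (hypothetical)."""


class FailoverRouter:
    """Try providers in priority order, skipping any that are cooling down."""

    def __init__(self, providers, cooldown_seconds=60.0, clock=time.monotonic):
        self.providers = providers            # ordered list of (name, callable)
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock                    # injectable so tests can fake time
        self.cooling_until = {}               # provider name -> time it may be retried

    def complete(self, prompt):
        now = self.clock()
        last_error = None
        for name, call in self.providers:
            if self.cooling_until.get(name, 0.0) > now:
                continue                      # still cooling down, do not hammer it
            try:
                return name, call(prompt)
            except RateLimited as exc:
                # Park this provider and fall through to the next one.
                self.cooling_until[name] = now + self.cooldown_seconds
                last_error = exc
        raise RuntimeError("all providers are cooling down or failed") from last_error
```

A caller would build the router once, for example FailoverRouter([("claude", call_claude), ("gpt-4o", call_gpt4o)]), where both callables are placeholders for real provider clients, and then call complete(prompt) per request.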

CROSS-ACCOUNT BEDROCK

ECS tasks in spoke accounts assume IAM roles in a separate AI account via STS, with external ID validation. The temporary credentials rotate hourly. Four AWS accounts in total.
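A minimal sketch of the cross-account hop, assuming a boto3-style STS client is passed in. The function name, the session name, and the example ARN are hypothetical; the one-hour DurationSeconds mirrors the hourly rotation mentioned above.

```python
def assume_bedrock_role(sts_client, role_arn, external_id,
                        session_name="knowledge-search"):
    """Exchange the task's own identity for temporary credentials in the AI account.

    sts_client is expected to behave like boto3.client("sts"). The role's
    trust policy in the AI account is assumed to require this external ID.
    """
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
        ExternalId=external_id,
        DurationSeconds=3600,  # one hour, so credentials must be refreshed hourly
    )
    creds = response["Credentials"]
    # Keys are named so the dict can be splatted into boto3.client(...).
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }
```

With boto3 installed, the result could be used as boto3.client("bedrock-runtime", region_name=..., **assume_bedrock_role(boto3.client("sts"), role_arn, external_id)), where the region and ARN depend on the deployment.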

CITED AI ANSWERS

Answers stream in with source citations. Chunk IDs are preserved through the pipeline, so users can click through to the original document.
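One way to keep citations clickable is to have the model emit chunk IDs in brackets and resolve them afterwards. The sketch below assumes that marker format; the Chunk fields, the regular expression, and the example URLs are illustrative and not the project's actual schema.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str    # ID carried unchanged from retrieval to the final answer
    source_url: str  # link back to the original document
    text: str


# Assumed marker format: the model cites a chunk as [chunk_id].
CITATION = re.compile(r"\[([A-Za-z0-9_-]+)\]")


def resolve_citations(answer, chunks):
    """Return (chunk_id, source_url) pairs in order of first mention.

    IDs that do not match a retrieved chunk are ignored, so a hallucinated
    marker never turns into a link.
    """
    by_id = {chunk.chunk_id: chunk for chunk in chunks}
    seen = []
    for chunk_id in CITATION.findall(answer):
        if chunk_id in by_id and chunk_id not in seen:
            seen.append(chunk_id)
    return [(chunk_id, by_id[chunk_id].source_url) for chunk_id in seen]


retrieved = [
    Chunk("c2", "https://example.com/wiki/rollbacks", "Rollbacks are manual."),
    Chunk("c7", "https://example.com/wiki/deploys", "Deploys run nightly."),
]
answer = "Deploys run nightly [c7]. Rollbacks are manual [c2][c7] [zz]."
links = resolve_citations(answer, retrieved)
```

For a streamed answer, the same lookup can run on the accumulated text each time a closing bracket arrives.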

TECH STACK
Frontend: Next.js 15 · React 19 · Vercel AI SDK · ElectricSQL
Backend: FastAPI · Python · Celery · SQLAlchemy (async) · Alembic
AI & Search: AWS Bedrock · LiteLLM · pgvector · tsvector · LangGraph
Infrastructure: Terraform · ECS Fargate · Aurora Serverless v2 · CloudFront · WAF