Knowledge Search with RAG & Bedrock
Unified search across 21+ enterprise data sources (Slack, Teams, Confluence, Google Drive, Jira) with hybrid RAG, cited AI answers, and automatic failover across 100+ LLMs.
Lead architect and infrastructure engineer. Hub-and-spoke AWS deployment, cross-account Bedrock access, Terraform + Azure DevOps CI/CD across DEV/STG/PRD.
ARCHITECTURE

HIGHLIGHTS
Vector similarity (pgvector) and full-text search (tsvector) run in parallel, merged with Reciprocal Rank Fusion. Optional reranking via Pinecone, Cohere, or Flashrank.
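A minimal sketch of the fusion step, assuming each search path returns a best-first list of chunk IDs; the k = 60 constant and the example IDs are illustrative, not production values.

```python
from collections import defaultdict

def reciprocal_rank_fusion(result_lists, k=60):
    """Merge ranked lists of chunk IDs with Reciprocal Rank Fusion.

    Each list is ordered best-first; a chunk's fused score is the sum of
    1 / (k + rank) over every list it appears in.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for rank, chunk_id in enumerate(results, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical ranked outputs from the two parallel searches
vector_hits = ["c12", "c07", "c33", "c81"]   # pgvector similarity order
keyword_hits = ["c33", "c12", "c54"]         # tsvector full-text order

merged = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(merged)  # chunks ranked highly in both lists rise to the top
```

RRF only looks at ranks, so the similarity scores from pgvector and the text-rank scores from tsvector never need to be normalised against each other.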
LiteLLM sits in front of 100+ models. If Claude is rate-limited, the request goes to GPT-4o without the user noticing. Cooldown logic prevents hammering failing providers.
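A hedged sketch of that failover using litellm's Router; the model IDs, fallback mapping, and cooldown numbers are placeholders rather than the deployment's real configuration.

```python
from litellm import Router

router = Router(
    model_list=[
        {   # primary: Claude served through Bedrock
            "model_name": "claude-sonnet",
            "litellm_params": {"model": "bedrock/anthropic.claude-3-sonnet-20240229-v1:0"},
        },
        {   # fallback: GPT-4o
            "model_name": "gpt-4o",
            "litellm_params": {"model": "gpt-4o"},
        },
    ],
    # If claude-sonnet errors or hits a rate limit, retry on gpt-4o instead
    fallbacks=[{"claude-sonnet": ["gpt-4o"]}],
    allowed_fails=3,    # failures tolerated before a deployment is cooled down
    cooldown_time=60,   # seconds a failing deployment is skipped
)

response = router.completion(
    model="claude-sonnet",
    messages=[{"role": "user", "content": "Summarise the onboarding guide."}],
)
print(response.choices[0].message.content)
```

The allowed_fails / cooldown_time pair is what keeps a struggling provider from being hammered: after a few consecutive failures the deployment is skipped for a while instead of retried immediately.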
ECS tasks in spoke accounts assume IAM roles in a separate AI account via STS with external ID validation. Credentials rotate hourly. Four AWS accounts in total.
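The cross-account hop can be sketched with boto3; the role ARN, external ID, and session name below are placeholders, and the one-hour credential lifetime matches the hourly rotation.

```python
import boto3

# Placeholder identifiers -- the real ARN and external ID live in the spoke's config
AI_ACCOUNT_ROLE_ARN = "arn:aws:iam::111111111111:role/bedrock-invoke-role"
EXTERNAL_ID = "example-external-id"

def bedrock_client_for_ai_account(region="us-east-1"):
    """Assume the Bedrock role in the central AI account from a spoke ECS task."""
    sts = boto3.client("sts")
    creds = sts.assume_role(
        RoleArn=AI_ACCOUNT_ROLE_ARN,
        RoleSessionName="knowledge-search-ecs-task",
        ExternalId=EXTERNAL_ID,   # rejected unless it matches the role's trust policy
        DurationSeconds=3600,     # temporary credentials expire after an hour
    )["Credentials"]
    return boto3.client(
        "bedrock-runtime",
        region_name=region,
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )
```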
Answers stream in with source citations. Chunk IDs are preserved through the pipeline so users can click through to the original document.
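One way the chunk IDs can ride along with the streamed answer, sketched as plain dataclasses; the field names and event shape are illustrative assumptions, not the service's actual schema.

```python
from dataclasses import dataclass

@dataclass
class Citation:
    chunk_id: str    # same ID assigned at ingestion, carried through retrieval
    source_url: str  # deep link back to the original Slack/Confluence/Drive document
    snippet: str

@dataclass
class AnswerEvent:
    """One event in the streamed response."""
    delta: str                              # next slice of generated text
    citations: list[Citation] | None = None # populated on the final event

def render(events):
    """Assemble the streamed deltas and append clickable source references."""
    answer, citations = "", []
    for event in events:
        answer += event.delta
        if event.citations:
            citations.extend(event.citations)
    refs = ", ".join(f"[{c.chunk_id}]({c.source_url})" for c in citations)
    return f"{answer}\n\nSources: {refs}"
```

Because the chunk_id travels from ingestion through retrieval into the final citation, the UI can resolve each reference straight back to the original document.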