Document RAG Chatbot for a Global Investment Bank (Okta SSO, Kubernetes)
A global investment bank
Overview
A production-grade, document-grounded RAG chatbot built for a global investment bank, and the most mature of the platform’s per-client deployments. It pairs a GPT-4 + LangChain + Pinecone retrieval pipeline with hardened enterprise authentication and a self-hosted Kubernetes deployment.
The Challenge
A global investment bank needed conversational access to dense internal documentation, delivered inside its own security perimeter and behind its own identity provider. That meant not just a working RAG chatbot, but also enterprise SSO, real-time streaming responses, and a deployment the client could host and operate to its own standards.
What We Built
A Next.js application, built on the gpt4-langchain-pdf-chatbot base, with a richer enterprise footprint than the sibling forks. Authentication is handled via Okta (@okta/okta-react, @okta/okta-auth-js, @okta/okta-signin-widget). The RAG path parses PDFs with pdf-parse, indexes them in Pinecone, and streams GPT-4 answers over both SSE (sse.js, @microsoft/fetch-event-source) and Socket.IO for real-time delivery. The UI combines Mantine and Radix components with React Hook Form + Zod for forms and validation, and uses html-to-text and react-markdown for content handling. The repository ships with full deployment tooling: a Dockerfile, an Nginx config, a PM2 ecosystem file, and Kubernetes manifests split across prod and dev, plus a visual guide and docs. Development ran from April 2023 to mid-2024, which reflects sustained, evolving delivery rather than a one-off build.
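The write-up does not say how parsed PDF text is split before it is indexed. As a minimal sketch of one common approach, the helper below assumes fixed-size character chunks with overlap; `chunkText`, the `Chunk` type, and the default sizes are hypothetical and are not taken from the project.

```typescript
// Hypothetical helper: split extracted PDF text into overlapping chunks
// before embedding. Names and default sizes are illustrative assumptions.
interface Chunk {
  index: number; // position of the chunk in the sequence
  start: number; // character offset of the chunk in the source text
  text: string;
}

function chunkText(text: string, chunkSize = 1000, overlap = 200): Chunk[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be in the range [0, chunkSize)");
  }

  const chunks: Chunk[] = [];
  const step = chunkSize - overlap; // how far the window advances each time

  for (let start = 0; start < text.length; start += step) {
    chunks.push({
      index: chunks.length,
      start,
      text: text.slice(start, start + chunkSize),
    });
    // Stop once this window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

Each chunk would then be embedded and stored in the vector index along with its offset, so that a retrieved passage can be traced back to its place in the source document. The overlap lowers the chance that a relevant sentence is cut in half at a chunk boundary.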
Technologies & Approach
The proven LangChain + Pinecone + GPT-4 RAG core, wrapped in the security and operational scaffolding a tier-one financial institution requires: Okta SSO, dual SSE/Socket.IO streaming, and a Kubernetes/Nginx/PM2 deployment the client controls.
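One way the dual streaming transport could be organised is to put both channels behind a single interface, so the answer pipeline does not need to know which one a given client uses. The sketch below is an assumption about structure, not the project's code: `TokenSink`, `formatSseEvent`, `sseSink`, `emitSink`, and `broadcast` are hypothetical names. It depends only on the Server-Sent Events wire format and on an emit-style callback of the shape a Socket.IO socket exposes.

```typescript
// Illustrative sketch only; all names here are hypothetical.
interface TokenSink {
  send(event: string, data: string): void;
}

// Build one Server-Sent Events frame. The format requires a separate
// "data:" line for each line of the payload and a blank line to end the frame.
function formatSseEvent(event: string, data: string): string {
  const dataLines = data
    .split(/\r\n|\r|\n/)
    .map((line) => `data: ${line}`)
    .join("\n");
  return `event: ${event}\n${dataLines}\n\n`;
}

// Sink that writes SSE frames through a supplied write function,
// for example the write method of an HTTP response.
function sseSink(write: (chunk: string) => void): TokenSink {
  return { send: (event, data) => write(formatSseEvent(event, data)) };
}

// Sink that forwards to an emit-style function, for example
// (event, payload) => socket.emit(event, payload) on a Socket.IO socket.
function emitSink(emit: (event: string, payload: string) => void): TokenSink {
  return { send: (event, data) => emit(event, data) };
}

// Deliver one event to every connected sink.
function broadcast(sinks: TokenSink[], event: string, data: string): void {
  for (const sink of sinks) {
    sink.send(event, data);
  }
}
```

In use, the server would call `broadcast(sinks, "token", token)` for each token the model produces and finish with a terminal event such as `broadcast(sinks, "done", "[DONE]")`. The event names and the terminal payload are also assumptions.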
Outcome / Impact
Delivered the platform’s flagship enterprise chatbot deployment: document-grounded answers served behind the bank’s own identity provider and on its own infrastructure, sustained and iterated on for more than a year.
Capabilities Demonstrated
- Enterprise SSO integration (Okta) for regulated finance
- Document-grounded GPT-4 Q&A with LangChain + Pinecone
- Dual streaming transport (SSE + Socket.IO)
- Self-hosted Kubernetes/Nginx/PM2 deployment
- Long-running, hardened enterprise delivery