Product · 2023

Healthcare Document RAG Chatbot (LangChain + Pinecone)

A US health-insurance provider

Overview

An early-stage RAG chatbot for a US health-insurance provider that answers questions over health and medical insurance documents. Based on the platform’s gpt4-langchain-pdf-chatbot foundation, it grounds GPT-4 answers in passages retrieved from insurance PDFs indexed in Pinecone.

Why It Exists

Health-insurance members and staff struggle to find answers buried in dense plan and contract documents. This build validated whether a document-grounded chatbot could answer plan questions accurately enough to be worth pursuing for the healthcare vertical.

What We Built

A Next.js chatbot over the docs/medical-insurance source set, using LangChain for the RAG pipeline. An ingestion script parses the PDFs (pdf-parse), chunks and embeds them, and writes the vectors to a Pinecone index (@pinecone-database/pinecone). The chat route streams GPT-4 answers grounded in retrieved passages; the client consumes the stream with @microsoft/fetch-event-source and renders the Markdown with react-markdown and remark-gfm. The UI is a lightweight Next.js + Tailwind + Radix surface. The commit history is short (a few days in May 2023), consistent with a focused vertical evaluation rather than a long-lived build.
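To make the ingest-then-retrieve flow concrete, here is a minimal TypeScript sketch of its two core ideas: splitting extracted text into overlapping chunks, and ranking stored chunks by cosine similarity to a query embedding. It is an illustration under stated assumptions, not the project's code. The names chunkText, cosineSimilarity, StoredChunk, and topK are hypothetical, the chunk size and overlap defaults are arbitrary, and the in-memory array stands in for the Pinecone index so the example runs without any external service.

```typescript
// Hypothetical helper: split extracted PDF text into overlapping character windows.
// The overlap keeps a sentence that straddles a boundary visible in both chunks.
function chunkText(text: string, chunkSize = 1000, overlap = 200): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("require chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the current window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

// Cosine similarity between two equal-length vectors; 0 if either is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Hypothetical record shape; a real vector index stores something similar.
interface StoredChunk {
  id: string;
  text: string;
  embedding: number[];
}

// In-memory stand-in for a vector-index query: return the k most similar chunks.
function topK(query: number[], store: StoredChunk[], k: number): StoredChunk[] {
  return store
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.chunk);
}

// Usage: chunkText("abcdefghij", 4, 2) returns ["abcd", "cdef", "efgh", "ghij"]
```

In a real pipeline the embeddings would come from an embedding model and the ranking would be done by the vector database; the toy vectors here only show the shape of the data moving through each step.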

Technologies & Approach

The stack is LangChain, Pinecone, and GPT-4 on Next.js: the same proven RAG-over-PDF pattern the platform applied across enterprise verticals, here pointed at healthcare insurance content, with streamed answers for responsiveness.
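The grounding step in this pattern amounts to placing the retrieved passages in the prompt and instructing the model to answer only from them. The sketch below shows one way that assembly could look. It is an assumption for illustration: the Passage shape, the buildGroundedPrompt name, the instruction wording, and the plan.pdf file name in the usage note are all hypothetical and are not taken from the project or from LangChain.

```typescript
// Hypothetical shape for a retrieved passage and the file it came from.
interface Passage {
  source: string;
  text: string;
}

// Build a prompt that restricts the model to the numbered, retrieved passages.
// The instruction wording is illustrative, not the project's actual prompt.
function buildGroundedPrompt(question: string, passages: Passage[]): string {
  const context = passages
    .map((p, i) => `[${i + 1}] (${p.source}) ${p.text.trim()}`)
    .join("\n");
  return [
    "Answer the question using only the numbered passages below.",
    "If the passages do not contain the answer, say you do not know.",
    "",
    "Passages:",
    context.length > 0 ? context : "(none retrieved)",
    "",
    `Question: ${question.trim()}`,
  ].join("\n");
}
```

For example, calling buildGroundedPrompt("What is the deductible?", [{ source: "plan.pdf", text: "The deductible is $500." }]) yields a prompt whose passage list contains "[1] (plan.pdf) The deductible is $500." and whose last line is the question. Numbering the passages also gives the model a simple way to cite which passage supports its answer.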

Outcome / Impact

Validated document-grounded Q&A over insurance material for the healthcare vertical and exercised the reusable PDF-to-Pinecone RAG pipeline. Scoped as a short evaluation, it fed into the platform’s broader, productized chatbot stack.

Capabilities Demonstrated

  • RAG over regulated healthcare/insurance documents
  • LangChain + Pinecone ingestion and retrieval
  • Streaming GPT-4 chat with Markdown rendering
  • Rapid delivery of a vertical-specific build