Product · 2024

Retrieval-Augmented Search over Media Content (LangChain + Meilisearch)

An influencer-marketing media-intelligence platform

Overview

A retrieval-augmented generation (RAG) prototype that adds natural-language Q&A and semantic search over the platform’s article and post corpus, using LangChain with Meilisearch as the vector store and OpenAI embeddings.

Why It Exists

The platform holds a large corpus of titled news articles and social posts. Keyword search alone misses semantic intent: it cannot match a query to content that expresses the same idea in different words. This R&D effort explored RAG-style retrieval, embedding posts and answering queries grounded in the retrieved content.

What We Built

Two Python scripts (index.py, search.py) wire LangChain’s Meilisearch vector store to OpenAI text-embedding-3-small embeddings (1536 dimensions), with a MultiQueryRetriever for retrieval and ChatOpenAI for answer synthesis. Each document is templated from a post’s title and description and indexed in a hosted Meilisearch instance.
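The templating step can be sketched in plain Python. This is a minimal illustration, not the code from index.py: the template wording, the `Doc` class (a stand-in for a LangChain document), the `build_document` helper, and the sample post are all assumptions.

```python
from dataclasses import dataclass, field
from string import Template

# Hypothetical template; the wording used in the original index.py is not shown.
DOC_TEMPLATE = Template("Title: $title\nDescription: $description")


@dataclass
class Doc:
    """Stand-in for a vector-store document: text to embed plus metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def build_document(post: dict) -> Doc:
    """Render one post's title and description into a single embeddable text."""
    title = (post.get("title") or "").strip()
    description = (post.get("description") or "").strip()
    text = DOC_TEMPLATE.substitute(title=title, description=description)
    # Keep the source id in metadata so a search hit can be traced to its post.
    return Doc(page_content=text, metadata={"id": post.get("id")})


# Invented sample data, for illustration only.
posts = [
    {"id": 1, "title": "Brand launches creator fund", "description": " A new programme for creators. "},
    {"id": 2, "title": "Untitled post", "description": None},
]
docs = [build_document(p) for p in posts]
```

Combining both fields into one text means a single embedding per post carries the headline and the summary together; missing fields fall back to an empty string rather than failing the indexing run.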

Technologies & Approach

LangChain for retrieval orchestration, Meilisearch for fast hybrid/vector search, OpenAI for embeddings and generation. The build is a lightweight set of scripts rather than a service.

Outcome / Impact

Validated semantic retrieval and RAG over the existing Meilisearch-indexed corpus, informing how AI search could be layered onto the core engine. (Note: the build contained hard-coded credentials, which were flagged for rotation before any production use.)
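One common way to avoid hard-coded credentials is to read them from the environment and fail fast when any are missing. The sketch below is an assumption about how that could look, not the remediation that was applied; the variable names and the `load_settings` helper are hypothetical.

```python
import os

# Hypothetical variable names; use whatever the deployment actually defines.
REQUIRED_VARS = ("MEILI_HTTP_ADDR", "MEILI_MASTER_KEY", "OPENAI_API_KEY")


def load_settings(env=None):
    """Return the required settings, raising if any are absent or empty."""
    source = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not source.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return {name: source[name] for name in REQUIRED_VARS}
```

Calling `load_settings()` at the top of each script keeps secrets out of the source tree and makes a missing key an immediate, readable error rather than a failed request later on.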

Capabilities Demonstrated

  • Building RAG pipelines with LangChain
  • Using Meilisearch as a vector store with OpenAI embeddings
  • Multi-query retrieval for improved semantic recall
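The idea behind multi-query retrieval can be shown without any external services: rephrase the question several ways, search with each phrasing, and keep the unique union of the hits. The sketch below is a stdlib-only illustration, not LangChain's implementation. The toy word-overlap search stands in for Meilisearch vector search, `stub_variants` stands in for the LLM that would generate rephrasings, the corpus is invented, and including the original question among the queries is a choice made for this sketch.

```python
from typing import Callable, List

# Invented three-document corpus, for illustration only.
CORPUS = {
    "a1": "creator campaign drives sneaker sales",
    "a2": "influencer partnership boosts brand awareness",
    "a3": "quarterly earnings report released",
}


def toy_search(query: str, k: int) -> List[str]:
    """Rank documents by word overlap with the query (stand-in for vector search)."""
    terms = set(query.lower().split())
    scored = []
    for doc_id, text in CORPUS.items():
        overlap = len(terms & set(text.split()))
        if overlap:
            scored.append((-overlap, doc_id))
    return [doc_id for _, doc_id in sorted(scored)[:k]]


def stub_variants(question: str) -> List[str]:
    """Fixed rephrasings; in a real pipeline an LLM would generate these."""
    return ["influencer brand partnership", "creator campaign sales"]


def multi_query_retrieve(
    question: str,
    generate_variants: Callable[[str], List[str]],
    search: Callable[[str, int], List[str]],
    k: int = 3,
) -> List[str]:
    """Search with the question and each variant, returning the unique union."""
    seen = set()
    merged = []
    for query in [question, *generate_variants(question)]:
        for doc_id in search(query, k):
            if doc_id not in seen:  # keep first occurrence, preserve order
                seen.add(doc_id)
                merged.append(doc_id)
    return merged


question = "influencer marketing results"
single = toy_search(question, 3)
combined = multi_query_retrieve(question, stub_variants, toy_search)
```

Here the single query finds only `a2`, while the combined result also surfaces `a1` through the "creator campaign sales" rephrasing, which is the recall gain the technique aims for.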