Music Metadata Enrichment Pipeline (Firebase / Genkit / LLM)
Overview
A serverless pipeline, built on Firebase Functions and Google’s Genkit AI framework, that enriches music metadata. It takes playlist and track data, enriches it through LLM-driven flows, and persists the results to Cloud Firestore. It also includes a custom model plugin and a prompt library.
Why It Exists
Raw playlist and track data is often sparse: it lacks the genres, descriptions, moods, and normalized attributes that make it useful downstream. This pipeline uses LLMs to fill those gaps programmatically, turning thin metadata into richer, structured records without manual curation.
What We Built
A Firebase Functions (Node 22) codebase organized around Genkit: a genkit.ts setup, genkit-functions, a prompts directory for prompt-driven flows, utils, and a custom deepseekPlugin model integration alongside the OpenAI plugin (genkitx-openai). Source data is gathered via the Apify client and enrichment results are written to Cloud Firestore (firestore.rules and indexes are defined). The project includes Genkit’s evaluator package for assessing flow output quality and runs locally against the Firebase and Firestore emulators.
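To make the shape of an enrichment step concrete, here is a minimal, dependency-free sketch. It is not the project's code and does not call Genkit, OpenAI, or DeepSeek. The `Track` shape, the `ModelProvider` type, and the helper names (`renderPrompt`, `parseEnrichment`, `mergeEnrichment`, `enrichTrack`) are all assumptions introduced for illustration. It shows one plausible way to render a prompt, try several pluggable providers in order, parse the reply defensively, and fill only the fields that are missing.

```typescript
// Hypothetical record shape; the project's real schema is not shown in this write-up.
interface Track {
  title: string;
  artist: string;
  genre?: string;
  mood?: string;
  description?: string;
}

type Enrichment = Pick<Track, "genre" | "mood" | "description">;

// Stand-in for a model plugin: anything that turns a prompt into text.
type ModelProvider = (prompt: string) => Promise<string>;

const ENRICHMENT_KEYS = ["genre", "mood", "description"] as const;

// Build the prompt text for one track (illustrative wording only).
function renderPrompt(track: Track): string {
  return [
    "Return a JSON object with the keys genre, mood and description",
    `for the track "${track.title}" by ${track.artist}.`,
  ].join(" ");
}

// Parse the model reply defensively: keep only known, non-empty string fields.
function parseEnrichment(raw: string): Enrichment {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return {};
  }
  if (typeof data !== "object" || data === null) return {};
  const source = data as Record<string, unknown>;
  const result: Enrichment = {};
  for (const key of ENRICHMENT_KEYS) {
    const value = source[key];
    if (typeof value === "string" && value.trim() !== "") {
      result[key] = value.trim();
    }
  }
  return result;
}

// Fill gaps only: values already present on the track are never overwritten.
function mergeEnrichment(track: Track, enrichment: Enrichment): Track {
  const merged: Track = { ...track };
  for (const key of ENRICHMENT_KEYS) {
    if (merged[key] === undefined && enrichment[key] !== undefined) {
      merged[key] = enrichment[key];
    }
  }
  return merged;
}

// Try each provider in order; fall through on errors or unusable replies.
async function enrichTrack(track: Track, providers: ModelProvider[]): Promise<Track> {
  const prompt = renderPrompt(track);
  for (const provider of providers) {
    try {
      const enrichment = parseEnrichment(await provider(prompt));
      if (Object.keys(enrichment).length > 0) {
        return mergeEnrichment(track, enrichment);
      }
    } catch {
      // This provider failed; try the next one.
    }
  }
  return track; // No provider produced usable output; return the record unchanged.
}
```

In this sketch a provider is just an async function, so a stub such as `async () => '{"mood":"calm"}'` can stand in for a real model during local testing. In a setup like the one described, the enriched record would then be written to Firestore.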
Technologies & Approach
TypeScript on Firebase Functions, with Genkit orchestrating AI flows over multiple model providers (OpenAI plus a custom DeepSeek plugin). Cloud Firestore is the system of record, Apify supplies source data, and Genkit evaluators provide a quality-assessment loop. The emulator-based workflow keeps iteration fast and local.
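The quality-assessment loop can be pictured with a small stand-in. The sketch below does not use Genkit's evaluator API; it substitutes a simple completeness heuristic, and the `EnrichedTrack` shape, the field list, the `completeness` and `evaluateBatch` helpers, and the 0.5 threshold are all assumptions chosen for illustration. It scores each record by the fraction of enrichment fields that were filled and flags low-scoring records for another pass.

```typescript
// Hypothetical enriched record; field names are illustrative, not the project's schema.
interface EnrichedTrack {
  title: string;
  genre?: string;
  mood?: string;
  description?: string;
}

const ENRICHED_FIELDS = ["genre", "mood", "description"] as const;

// Score one record as the fraction of enrichment fields that are non-empty (0 to 1).
function completeness(track: EnrichedTrack): number {
  const filled = ENRICHED_FIELDS.filter((field) => {
    const value = track[field];
    return typeof value === "string" && value.trim().length > 0;
  });
  return filled.length / ENRICHED_FIELDS.length;
}

interface EvaluationReport {
  averageScore: number;
  needsRetry: string[]; // titles of records scoring below the threshold
}

// Evaluate a batch and list the records that should be enriched again.
function evaluateBatch(tracks: EnrichedTrack[], threshold = 0.5): EvaluationReport {
  if (tracks.length === 0) return { averageScore: 0, needsRetry: [] };
  const scores = tracks.map(completeness);
  const total = scores.reduce((sum, score) => sum + score, 0);
  return {
    averageScore: total / tracks.length,
    needsRetry: tracks
      .filter((_, index) => scores[index] < threshold)
      .map((track) => track.title),
  };
}
```

A completeness ratio only measures whether fields were filled, not whether they are correct. A real evaluator would likely add checks on accuracy or consistency, but the loop has the same shape: score, compare against a threshold, and re-run what falls short.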
Outcome / Impact
The pipeline demonstrates a clean, serverless pattern for LLM-based data enrichment: pluggable model providers, prompt-driven flows, built-in evaluation, and Firestore persistence. The same pattern is a reusable blueprint for enriching any sparse catalog of structured records.
Capabilities Demonstrated
- Building LLM enrichment pipelines with Genkit flows and prompts
- Integrating multiple model providers: OpenAI via the genkitx-openai plugin and DeepSeek via a custom plugin
- Serverless data processing on Firebase Functions with Firestore persistence
- Output-quality evaluation with Genkit evaluators