Multi-Source Data Collection Services on Cloud Run
A media-monitoring / data-orchestration platform
Overview
Cloud-native data-collection services that pull content from many upstream providers into the platform. Each source runs as a configurable service on Google Cloud Run with managed PostgreSQL and Redis, dynamic service configuration, automated CI/CD, and Terraform-managed infrastructure.
The Challenge
A media-monitoring platform must integrate a wide range of heterogeneous data providers, each with its own API, schema, and rate limits, and run them reliably and independently at scale, while keeping infrastructure reproducible.
What We Built
A TypeScript codebase organized around a handler-per-source pattern. Google, LexisNexis, LinkedIn, Newscatcher, Podchaser, Truth Social, and X/Twitter each have a dedicated handler, a model, and (where needed) a service, all registered through a handler registry. Dependency injection via tsyringe wires the services together. Sequelize models cover the source-specific articles as well as connectors and subscriptions, ioredis backs caching and coordination, and Google Cloud Pub/Sub handles messaging. A rate-limiter service with a circuit-breaker decorator protects upstream calls. Per-source Dockerfiles, a Makefile, deploy scripts, and substantial Terraform (including Pub/Sub modules) drive the Cloud Run deployments, and each provider has its own Jest test suite.
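The handler-per-source pattern can be illustrated with a minimal sketch. This is not the project's code: the names `SourceHandler`, `CollectedItem`, `HandlerRegistry`, and `StubNewscatcherHandler` are hypothetical, the stub returns canned data instead of calling a provider, and the registry is a plain class rather than a tsyringe-managed service, so the example stays self-contained.

```typescript
// Hypothetical shape of a collected item; the real models are Sequelize-backed.
interface CollectedItem {
  source: string;
  externalId: string;
  title: string;
}

// Assumed contract: every source exposes the same small interface.
interface SourceHandler {
  readonly source: string;
  collect(query: string): Promise<CollectedItem[]>;
}

// Maps a source name to its handler so callers never branch on provider.
class HandlerRegistry {
  private readonly handlers = new Map<string, SourceHandler>();

  register(handler: SourceHandler): void {
    if (this.handlers.has(handler.source)) {
      throw new Error(`Handler already registered for source: ${handler.source}`);
    }
    this.handlers.set(handler.source, handler);
  }

  resolve(source: string): SourceHandler {
    const handler = this.handlers.get(source);
    if (!handler) {
      throw new Error(`No handler registered for source: ${source}`);
    }
    return handler;
  }

  sources(): string[] {
    return Array.from(this.handlers.keys());
  }
}

// Stub standing in for a real provider integration (no network call is made).
class StubNewscatcherHandler implements SourceHandler {
  readonly source = "newscatcher";

  async collect(query: string): Promise<CollectedItem[]> {
    return [{ source: this.source, externalId: "stub-1", title: `Result for ${query}` }];
  }
}

const registry = new HandlerRegistry();
registry.register(new StubNewscatcherHandler());
```

Under these assumptions, adding a provider means writing one more class that implements `SourceHandler` and registering it; code that calls `registry.resolve(source).collect(query)` does not change.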
Technologies & Approach
TypeScript + Express on Cloud Run, Sequelize over PostgreSQL, ioredis, and Google Cloud Pub/Sub. tsyringe provides clean DI; a circuit-breaker-wrapped rate limiter adds resilience. Terraform and shell deploy scripts make the whole environment reproducible and CI/CD-driven.
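As a rough illustration of how rate limiting and a circuit breaker can combine, the sketch below pairs a token bucket with a three-state breaker. Everything in it is an assumption made for the example: the class names, the thresholds, and the choice of a token bucket are hypothetical, and it uses a plain wrapper function (`guardedCall`) rather than a TypeScript decorator. The clock is injectable so the behavior can be checked without waiting on real time.

```typescript
type BreakerState = "closed" | "open" | "half-open";

// Opens after N consecutive failures, then allows a trial call after a cooldown.
class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";

  constructor(
    private readonly failureThreshold: number,
    private readonly resetTimeoutMs: number,
    private readonly now: () => number = () => Date.now()
  ) {}

  currentState(): BreakerState {
    // An open breaker moves to half-open once the cooldown has elapsed.
    if (this.state === "open" && this.now() - this.openedAt >= this.resetTimeoutMs) {
      this.state = "half-open";
    }
    return this.state;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  recordFailure(): void {
    this.failures += 1;
    // A failed trial call in half-open state re-opens the breaker immediately.
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.currentState() === "open") {
      throw new Error("Circuit open: upstream call skipped");
    }
    try {
      const result = await fn();
      this.recordSuccess();
      return result;
    } catch (err) {
      this.recordFailure();
      throw err;
    }
  }
}

// Token bucket: holds up to `capacity` tokens, refilled continuously over time.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private readonly capacity: number,
    private readonly refillPerSecond: number,
    private readonly now: () => number = () => Date.now()
  ) {
    this.tokens = capacity;
    this.lastRefill = this.now();
  }

  tryAcquire(): boolean {
    const current = this.now();
    const elapsedSeconds = (current - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefill = current;
    if (this.tokens < 1) {
      return false;
    }
    this.tokens -= 1;
    return true;
  }
}

// Checks the breaker first so an open circuit does not consume a token.
async function guardedCall<T>(
  limiter: TokenBucket,
  breaker: CircuitBreaker,
  fn: () => Promise<T>
): Promise<T> {
  if (breaker.currentState() === "open") {
    throw new Error("Circuit open: upstream call skipped");
  }
  if (!limiter.tryAcquire()) {
    throw new Error("Rate limit exceeded: retry later");
  }
  return breaker.call(fn);
}
```

In this sketch a call rejected by the limiter never reaches the provider, so it does not count toward the breaker's failure total; only real upstream errors open the circuit.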
Outcome / Impact
The services provide the breadth of source coverage that feeds the platform’s ingestion pipeline, and the extensible handler model makes adding a new provider a contained, testable change.
Capabilities Demonstrated
- Multi-provider ingestion behind a uniform handler/registry pattern
- Dependency-injected, testable TypeScript service architecture
- Resilience via rate limiting and circuit breakers
- Cloud Run + Pub/Sub + Terraform infrastructure-as-code