Infrastructure · 2025

Multi-Source Data Collection Services on Cloud Run

A media-monitoring / data-orchestration platform

Overview

Cloud-native data-collection services that pull content from many upstream providers into the platform. Each source runs as a configurable service on Google Cloud Run with managed PostgreSQL and Redis, dynamic service configuration, automated CI/CD, and Terraform-managed infrastructure.

The Challenge

A media-monitoring platform must integrate a wide range of heterogeneous data providers, each with its own API, schema, and rate limits. The collectors have to run reliably and independently at scale, and the infrastructure behind them has to stay reproducible.

What We Built

A TypeScript codebase organized around a handler-per-source pattern: Google, LexisNexis, LinkedIn, Newscatcher, Podchaser, Truth Social, and X/Twitter each have a dedicated handler, model, and (where needed) service, registered through a handler registry. Dependency injection via tsyringe wires the services together. Sequelize models map source-specific articles, connectors, and subscriptions; ioredis backs caching and coordination; and Google Cloud Pub/Sub handles messaging. A rate-limiter service with a circuit-breaker decorator protects upstream calls. Per-source Dockerfiles, a Makefile, deploy scripts, and substantial Terraform (including Pub/Sub modules) drive Cloud Run deployments, with Jest test suites per provider.
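
As a rough illustration of the handler-per-source idea, the sketch below shows one way a registry could map a source name to its handler. It is not the project's code: SourceHandler, HandlerRegistry, CollectedItem, and StubNewsHandler are hypothetical names, and the stub returns canned data instead of calling a real provider. Dependency injection is left out to keep the example self-contained.

```typescript
// Illustrative sketch only. All type and class names here are hypothetical
// and are not taken from the project's codebase.

interface CollectedItem {
  source: string;
  externalId: string;
  title: string;
}

// Contract every source-specific handler is assumed to implement.
interface SourceHandler {
  readonly source: string;
  collect(query: string): Promise<CollectedItem[]>;
}

// Maps a source name to the handler responsible for it.
class HandlerRegistry {
  private readonly handlers = new Map<string, SourceHandler>();

  register(handler: SourceHandler): void {
    if (this.handlers.has(handler.source)) {
      throw new Error(`Handler already registered for source: ${handler.source}`);
    }
    this.handlers.set(handler.source, handler);
  }

  resolve(source: string): SourceHandler {
    const handler = this.handlers.get(source);
    if (!handler) {
      throw new Error(`No handler registered for source: ${source}`);
    }
    return handler;
  }

  sources(): string[] {
    return Array.from(this.handlers.keys());
  }
}

// Stub handler that returns canned data; a real handler would call the
// provider's API and normalize the response.
class StubNewsHandler implements SourceHandler {
  readonly source = "newscatcher";

  async collect(query: string): Promise<CollectedItem[]> {
    return [
      { source: this.source, externalId: "stub-1", title: `Result for ${query}` },
    ];
  }
}
```

Under this shape, adding a provider means writing one more class that implements SourceHandler and registering it, which is what keeps the change contained and easy to test in isolation.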

Technologies & Approach

TypeScript + Express on Cloud Run, Sequelize over PostgreSQL, ioredis, and Google Cloud Pub/Sub. tsyringe provides clean DI; a circuit-breaker-wrapped rate limiter adds resilience. Terraform and shell deploy scripts make the whole environment reproducible and CI/CD-driven.
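
To make the resilience idea concrete, here is a minimal circuit-breaker sketch. It uses a plain higher-order function rather than a TypeScript decorator, covers only the breaker (not the rate limiter), and every name in it (withCircuitBreaker, BreakerOptions, CircuitOpenError) and every threshold value is an assumption for illustration, not the project's implementation.

```typescript
// Illustrative sketch only. Names and option fields are hypothetical.

type AsyncFn<A extends unknown[], R> = (...args: A) => Promise<R>;

interface BreakerOptions {
  failureThreshold: number; // consecutive failures before the circuit opens
  resetAfterMs: number;     // cool-down before a trial call is allowed
  now?: () => number;       // injectable clock, useful in tests
}

class CircuitOpenError extends Error {
  constructor() {
    super("Circuit is open; upstream call skipped");
    this.name = "CircuitOpenError";
  }
}

function withCircuitBreaker<A extends unknown[], R>(
  fn: AsyncFn<A, R>,
  options: BreakerOptions,
): AsyncFn<A, R> {
  const now = options.now ?? Date.now;
  let consecutiveFailures = 0;
  let openedAt: number | null = null;

  return async (...args: A): Promise<R> => {
    // While open and still cooling down, fail fast without calling upstream.
    if (openedAt !== null && now() - openedAt < options.resetAfterMs) {
      throw new CircuitOpenError();
    }
    // Otherwise the circuit is closed, or the cool-down has elapsed and this
    // call acts as the trial ("half-open") request.
    try {
      const result = await fn(...args);
      consecutiveFailures = 0;
      openedAt = null; // success closes the circuit
      return result;
    } catch (err) {
      consecutiveFailures += 1;
      // A failed trial call, or reaching the threshold, (re)opens the circuit.
      if (openedAt !== null || consecutiveFailures >= options.failureThreshold) {
        openedAt = now();
      }
      throw err;
    }
  };
}
```

A handler could wrap its upstream request once, for example `withCircuitBreaker(fetchPage, { failureThreshold: 5, resetAfterMs: 30000 })`, where `fetchPage` is a hypothetical function and the numbers are placeholders. Callers then see a fast CircuitOpenError instead of waiting on a provider that is already failing.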

Outcome / Impact

The services provide the breadth of source coverage that feeds the platform’s ingestion pipeline. The extensible handler model makes adding a new provider a contained, testable change.

Capabilities Demonstrated

  • Multi-provider ingestion behind a uniform handler/registry pattern
  • Dependency-injected, testable TypeScript service architecture
  • Resilience via rate limiting and circuit breakers
  • Cloud Run + Pub/Sub + Terraform infrastructure-as-code