Tooling · 2025

Metadata-Repair Utility (Metascraper / Browserless)

An influencer-marketing media-intelligence platform

Overview

A small Node.js utility for repairing and backfilling article metadata across the platform’s corpus, using Metascraper and Browserless to re-extract titles, authors, dates, images and descriptions.

Why It Exists

Ingested articles sometimes have missing or wrong metadata. This tool re-fetches the affected pages and re-extracts clean metadata to repair those records in the core store.

What We Built

A focused index.js script combining metascraper (with the title/author/date/image/description/url rules), html-get + browserless for robust headless-browser fetching, and url-metadata as a fallback extractor.
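The primary-plus-fallback arrangement can be sketched roughly as follows. This is an illustrative outline, not the project's actual index.js: the names FIELDS, isMissing, extractWithFallback and backfill are hypothetical, and the extractors are passed in as plain async functions so the logic runs without network access. In a real script the first extractor would presumably wrap metascraper fed with HTML from html-get and browserless, and the second would wrap url-metadata.

```javascript
// Hypothetical sketch: extractor functions are injected, so the fallback
// logic can be shown without any network or library dependency.
// Assumed field list, mirroring the metascraper rules named in the text.
const FIELDS = ['title', 'author', 'date', 'image', 'description', 'url'];

// True when a field value is absent or blank.
function isMissing(value) {
  return value === undefined || value === null || String(value).trim() === '';
}

// Run extractors in order. Each later extractor only fills fields that
// earlier ones left missing; an extractor that throws is skipped.
async function extractWithFallback(url, extractors) {
  const result = {};
  for (const extract of extractors) {
    if (FIELDS.every((f) => !isMissing(result[f]))) break; // nothing left to fill
    let found;
    try {
      found = await extract(url);
    } catch (err) {
      continue; // treat a failed fetch or parse as "no data"
    }
    for (const field of FIELDS) {
      if (isMissing(result[field]) && found && !isMissing(found[field])) {
        result[field] = found[field];
      }
    }
  }
  return result;
}

// Backfill: keep values already stored, fill only the gaps.
function backfill(stored, extracted) {
  const repaired = { ...stored };
  for (const field of FIELDS) {
    if (isMissing(repaired[field]) && !isMissing(extracted[field])) {
      repaired[field] = extracted[field];
    }
  }
  return repaired;
}
```

A call such as extractWithFallback(articleUrl, [primary, fallback]) followed by backfill(storedRecord, extracted) would then produce the repaired record. Note that this sketch only fills empty fields; correcting values that are present but wrong would need an explicit overwrite policy, which is left out here.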

Technologies & Approach

Metascraper for rule-based extraction, Browserless for rendering JS-heavy pages headlessly, and a secondary extractor for resilience. The result is a pragmatic, single-purpose data-fix tool.

Outcome / Impact

Improved metadata completeness and accuracy across the platform’s content without touching the main ingestion code path.

Capabilities Demonstrated

  • Robust web metadata extraction with Metascraper + headless browsers
  • Targeted data-quality remediation scripts