Tooling · 2014–15

University Timetable Scraper (CS Department)

Overview

A small scraping tool that pulls the class timetable (“orar”) for the Computer Science department of Ovidius University of Constanța and parses it into a usable structure. It is one of a pair of timetable utilities the developer built as a student.

Why It Exists

The university published timetables only as awkward HTML pages. This tool was written to scrape and re-process that schedule data so it could be consumed programmatically (e.g. for a cleaner personal timetable view).

What We Built

A PHP scraper (run.php) uses three Composer-installed HTML-parsing libraries (sunra/php-simple-html-dom-parser, amstaff/simplehtmldom and niels/ganon) to extract timetable rows from the source markup. A casper.js script, together with a bundled CasperJS install, handles headless-browser navigation where the schedule pages needed scripted interaction rather than a plain HTTP fetch.
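As a rough illustration of the row-extraction step described above, here is a minimal sketch. It is not the project's code: the actual scraper is written in PHP and relies on DOM-parser libraries, whereas this sketch uses plain JavaScript with a naive regular-expression pass so that it runs under Node.js without dependencies. The sample markup, the column order (day, time, course, room) and the function names cleanCell, extractRows and groupByDay are all assumptions made for the example.

```javascript
// Hypothetical sample markup; the real timetable pages are not reproduced here.
const sampleHtml = `
<table>
  <tr><th>Day</th><th>Time</th><th>Course</th><th>Room</th></tr>
  <tr><td>Monday</td><td>08:00-10:00</td><td>Algorithms</td><td>E12</td></tr>
  <tr><td>Monday</td><td>10:00-12:00</td><td>Databases &amp; SQL</td><td>E14</td></tr>
  <tr><td> Tuesday </td><td>12:00-14:00</td><td><b>Operating Systems</b></td><td>E12</td></tr>
</table>`;

// Strip nested tags, decode two common entities, and collapse whitespace.
function cleanCell(raw) {
  return raw
    .replace(/<[^>]*>/g, "")
    .replace(/&nbsp;/g, " ")
    .replace(/&amp;/g, "&")
    .replace(/\s+/g, " ")
    .trim();
}

// Return each <tr> as an array of cleaned <td> texts.
// Rows without any <td> (for example header rows built from <th>) are skipped.
function extractRows(html) {
  const rows = [];
  for (const rowMatch of html.matchAll(/<tr\b[^>]*>([\s\S]*?)<\/tr>/gi)) {
    const cells = [];
    for (const cellMatch of rowMatch[1].matchAll(/<td\b[^>]*>([\s\S]*?)<\/td>/gi)) {
      cells.push(cleanCell(cellMatch[1]));
    }
    if (cells.length > 0) rows.push(cells);
  }
  return rows;
}

// Group rows by day. The column order (day, time, course, room) is assumed.
function groupByDay(rows) {
  const timetable = {};
  for (const [day, time, course, room] of rows) {
    if (!timetable[day]) timetable[day] = [];
    timetable[day].push({ time, course, room });
  }
  return timetable;
}

const timetable = groupByDay(extractRows(sampleHtml));
console.log(JSON.stringify(timetable, null, 2));
```

A regex pass like this breaks on nested tables or unclosed tags, which is the kind of messy markup that makes a real DOM parser the safer choice.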

Technologies & Approach

PHP handles orchestration and DOM parsing; the use of three different HTML-DOM libraries reflects trial and error in handling messy markup. CasperJS/PhantomJS handles headless browsing. It is a pragmatic scrape-and-parse approach typical of student automation projects.

Outcome / Impact

The result is a working personal automation utility. It demonstrates early competence in web scraping, resilient HTML parsing, and headless-browser automation against real-world, non-API data sources.

Capabilities Demonstrated

  • Web scraping of real-world HTML sources
  • Robust HTML/DOM parsing (multiple parser libraries)
  • Headless-browser automation (CasperJS/PhantomJS)
  • Dependency management with Composer