University Timetable Scraper (CS Department)
Overview
A small scraping tool that pulls the class timetable (“orar”) for the Computer Science department of Ovidius University of Constanța and parses it into a usable structure. It is one of a pair of timetable utilities the developer built as a student.
Why It Exists
The university published timetables only as awkward HTML pages. This tool was written to scrape and re-process that schedule data so it could be consumed programmatically (e.g. for a cleaner personal timetable view).
What We Built
A PHP scraper (run.php) backed by three Composer-installed HTML-parsing libraries (sunra/php-simple-html-dom-parser, amstaff/simplehtmldom and niels/ganon), used to extract timetable rows from the source markup. A casper.js script, together with a bundled CasperJS install, handles headless-browser navigation where the schedule pages needed scripted interaction rather than a plain HTTP fetch.
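As a rough illustration of the "rows into a usable structure" step, the sketch below maps already-extracted cell text to plain objects. It is written in JavaScript, the language of the casper.js script, and is not taken from run.php or casper.js. The column order, the field names, and the helpers cleanCell and rowsToEntries are all assumptions made for the example, since the real timetable markup is not shown here.

```javascript
// Hypothetical column layout; the actual timetable columns may differ.
const COLUMNS = ["day", "time", "course", "room"];

// Collapse runs of whitespace (JavaScript's \s also matches non-breaking
// spaces, which are common in scraped table cells) and trim the ends.
function cleanCell(text) {
  return String(text).replace(/\s+/g, " ").trim();
}

// Convert rows of raw cell text into objects keyed by COLUMNS.
// Rows with an unexpected cell count (e.g. header or spacer rows in
// messy markup) and rows that are entirely empty are skipped.
function rowsToEntries(rows) {
  const entries = [];
  for (const row of rows) {
    const cells = row.map(cleanCell);
    if (cells.length !== COLUMNS.length) continue;
    if (cells.every((cell) => cell === "")) continue;
    const entry = {};
    COLUMNS.forEach((name, i) => {
      entry[name] = cells[i];
    });
    entries.push(entry);
  }
  return entries;
}

// Made-up sample rows, for illustration only.
const sample = [
  ["Luni", " 08:00-10:00 ", "Algoritmi\n", "E12"],
  ["", "", "", ""],
  ["Orar"],
];
console.log(JSON.stringify(rowsToEntries(sample), null, 2));
```

With the sample above, only the first row survives: the empty row and the single-cell row are dropped. Skipping malformed rows rather than failing is one plausible way to cope with inconsistent markup; the original tool may well have handled it differently.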
Technologies & Approach
PHP for orchestration and DOM parsing (three different HTML-DOM libraries, reflecting trial and error in handling messy markup), with CasperJS/PhantomJS for headless browsing. A pragmatic scrape-and-parse approach typical of student automation projects.
Outcome / Impact
A working personal automation utility. It demonstrates early competence in web scraping, resilient HTML parsing, and headless-browser automation against real-world, non-API data sources.
Capabilities Demonstrated
- Web scraping of real-world HTML sources
- Robust HTML/DOM parsing (multiple parser libraries)
- Headless-browser automation (CasperJS/PhantomJS)
- Dependency management with Composer