Saturday, 29 October 2011

Plasma PublicTransport

Public Transport Data in a KDE Plasma Desktop

I'm Friedrich Pülz and this is my first blog post ;) To introduce myself: I'm studying computer science in Bremen, Germany, and I'm the author of the PublicTransport project, among others (e.g. the KrossWordPuzzle game and the Glucose Plasma applet for diabetics).

The PublicTransport applet in action for a German stop

Motivation 
 
I started the PublicTransport project (a single Plasma applet at that time) because I was annoyed by having to use the websites of public transport service providers. To get a list of journeys you have to open a browser, navigate to the service provider's website, type in the origin and target stop names, and only then do you see the results. The PublicTransport applet simplifies this: it stores, for example, the name of your home stop and sits on your desktop or in your panel. You only have to type in the target stop name to view a journey list (or use a favorite journey search, see below). A departure or arrival board for your home stop is shown by default, as can be seen in the screenshot above.
This opens up possibilities for interesting features: alarms, filters, favorite journeys, using GPS to find nearby stops, a themeable design, or showing stop positions in a map application like Marble. It frees online timetable data from the browser.


PublicTransport Project
 
The project is no longer just a single applet. It now also includes two more applets, two data engines, a runner, a tool to add support for new service providers (TimetableMate) and a helper library (shared between the applets, the runner and TimetableMate). The data engines are named publictransport (to get timetable data) and openstreetmap (used in conjunction with the geolocation data engine to get public transport stops near the user). The other two applets are Flights (which only shows flight departures with their status) and GraphicalTimetableLine (which shows vehicles moving on a street, see below).

I'll now give an overview of the current state of the project.


PublicTransport Data Engine

The PublicTransport data engine provides timetable data from different service providers. It uses "accessors" to get that data.
Currently there is one main type of accessor, which downloads documents and parses them using a script (in JavaScript, Ruby or Python, using Kross). Another type parses XML files, but is only used by one accessor (de_rmv). A new type is currently being developed in a feature branch of the git repository: GTFS. It will be included in version 0.11 (which may become 1.0). It imports GTFS feeds into a local database (and will therefore work offline). Adding GTFS feeds of new service providers will be very easy.
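As a rough illustration of the GTFS idea, the sketch below imports the stops.txt table of a feed into a local SQLite database using only Python's standard library. This is not the code of the GTFS accessor type: the function name import_stops, the table layout and the sample rows are assumptions made for this example, only the column names come from the GTFS format.

```python
import csv
import io
import sqlite3


def import_stops(connection, stops_file):
    """Copy the rows of a GTFS stops.txt file into a local 'stops' table.

    The table layout is an assumption for this sketch, not the schema
    used by the data engine.
    """
    connection.execute(
        "CREATE TABLE IF NOT EXISTS stops ("
        "stop_id TEXT PRIMARY KEY, stop_name TEXT, stop_lat REAL, stop_lon REAL)"
    )
    reader = csv.DictReader(stops_file)
    rows = [
        (row["stop_id"], row["stop_name"], float(row["stop_lat"]), float(row["stop_lon"]))
        for row in reader
    ]
    connection.executemany("INSERT OR REPLACE INTO stops VALUES (?, ?, ?, ?)", rows)
    connection.commit()
    return len(rows)


# Made-up sample data in the stops.txt format.
SAMPLE_STOPS = io.StringIO(
    "stop_id,stop_name,stop_lat,stop_lon\n"
    "1,Main Station,53.0831,8.8138\n"
    "2,Market Square,53.0759,8.8072\n"
)

db = sqlite3.connect(":memory:")
imported = import_stops(db, SAMPLE_STOPS)
names = [name for (name,) in db.execute("SELECT stop_name FROM stops ORDER BY stop_id")]
print(imported, names)
```

Once the feed is in a local database like this, lookups no longer need a network connection, which is why this accessor type can work offline.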
There are currently 21 accessors available for Germany, Italy, the Czech Republic, Switzerland, Austria, Belgium, Denmark, France, Poland, Slovakia, Sweden and the USA. Not all of them cover a whole country, and for some countries there are multiple accessors. But at least for Germany, Switzerland, Austria and Poland, the data engine can be used for all public transport stops and train stations. Flight departures/arrivals for flights all over the world are retrieved from flightstats.com.
Writing new scripted accessors isn't too hard; I've even created a tool for that task (TimetableMate). Complete information is available in the documentation of the data engine (e.g. at http://publictransport.horizon-host.com/doc/engine/0.10/page_accessor_infos.html). An accessor needs an XML file with information about the service provider and the URLs to download departure documents from, plus a script to parse them. For parsing, specially named functions in the script get called with the downloaded timetable documents (mostly HTML).
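To give an idea of what such a parsing function does, here is a minimal, standalone Python sketch. The function name parse_departures, the HTML snippet and the dictionary keys are all made up for illustration; the real function names and the API available to accessor scripts are described in the data engine documentation.

```python
import re

# Hypothetical HTML, loosely modelled on rows of a departure table.
SAMPLE_HTML = """
<tr><td class="time">14:05</td><td class="line">Bus 25</td><td class="target">Central Station</td></tr>
<tr><td class="time">14:12</td><td class="line">Tram 6</td><td class="target">University</td></tr>
"""

# One named group per value we want to extract from a row.
ROW_PATTERN = re.compile(
    r'<td class="time">(?P<time>[^<]+)</td>'
    r'<td class="line">(?P<line>[^<]+)</td>'
    r'<td class="target">(?P<target>[^<]+)</td>'
)


def parse_departures(html):
    """Return one dictionary per departure row found in the document."""
    return [match.groupdict() for match in ROW_PATTERN.finditer(html)]


for departure in parse_departures(SAMPLE_HTML):
    print(departure["time"], departure["line"], "to", departure["target"])
```

A pattern like this breaks as soon as the website layout changes, which is the main weakness of parsing HTML.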

The data engine has multiple data sources, e.g. "Service Providers" to give information about installed service providers or "Locations" to give information about all supported countries. The more interesting data sources are "Departures ...", "Arrivals ...", "Journeys ..." and "Stops ..." (for stop suggestions). These data sources need more information, like a stop name to get departures for. A complete source name to get departures from "Bremen Hbf" using the service provider "de_db" looks like this: "Departures de_db|stop=Bremen Hbf". The service provider can be omitted; the data engine then uses the default service provider for the user's current country.
The data engine tries to get as much information from the timetable document as possible, such as delays, delay reasons, news, routes, platforms, operators and vehicle types.
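The source name format can be sketched with two small Python helpers. They are not part of the data engine; the function names are made up, and the sketch only handles the form shown above (with a service provider ID). That further parameters would be separated by "|" is an assumption of this example.

```python
def build_source_name(kind, provider, stop):
    """Build a source name such as 'Departures de_db|stop=Bremen Hbf'."""
    return f"{kind} {provider}|stop={stop}"


def parse_source_name(source_name):
    """Split a source name into (kind, provider, parameters).

    Assumes 'key=value' parameters separated by '|' after the provider ID.
    """
    head, _, parameter_part = source_name.partition("|")
    kind, _, provider = head.partition(" ")
    parameters = dict(
        item.split("=", 1) for item in parameter_part.split("|") if item
    )
    return kind, provider, parameters


name = build_source_name("Departures", "de_db", "Bremen Hbf")
print(name)
print(parse_source_name(name))
```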

All service providers in the data engine are covered by unit tests, so that accessors can be updated quickly when a service provider changes the layout of its website. Error messages from parser scripts get logged together with the HTML code where parsing failed (e.g. in ~/.kde/share/apps/plasma_engine_publictransport/accessors.log), which makes it easier to identify and fix problems.


PublicTransport Helper Library

This library can be used by applets, runners or normal applications that use the PublicTransport data engine. It offers classes for filters, an enumeration of vehicle types and flexible widgets/dialogs to configure stop settings/filters.
Originally the code was developed inside the publictransport applet, but since there is now also a runner, I put that code into a separate library and made it more flexible. It now also gets used by TimetableMate.
The library is well documented for almost everything and should help people who want to write another timetable applet or runner using the data engine (or maybe a wallpaper plugin which shows a bus stop with buses coming and going like they do in reality ;)).


PublicTransport Applet

The applet shows departure/arrival boards for configured stops. You can also use it to search for journeys. It has some advanced features like filters and alarms. Departures can be filtered by multiple constraints, e.g. a filter can be created to only show buses that go via a given stop. Alarms use the same filter classes to select the departures that should trigger an alarm.
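The "buses via a given stop" filter can be sketched in a few lines of Python. This is only an illustration of the idea, not the filter classes of the helper library; the Departure record, its field names, the via_filter function and the sample departures are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Departure:
    """Hypothetical departure record; the field names are made up."""
    line: str
    vehicle_type: str
    target: str
    route_stops: list = field(default_factory=list)


def via_filter(vehicle_type, via_stop):
    """Return a predicate matching departures of one vehicle type that pass via_stop."""
    def matches(departure):
        return (
            departure.vehicle_type == vehicle_type
            and via_stop in departure.route_stops
        )
    return matches


# Made-up sample departures.
departures = [
    Departure("25", "bus", "Harbour", ["Central Station", "Market Square"]),
    Departure("26", "bus", "Airport", ["Central Station", "Town Hall"]),
    Departure("6", "tram", "University", ["Central Station", "Market Square"]),
]

buses_via_market_square = via_filter("bus", "Market Square")
shown = [d for d in departures if buses_via_market_square(d)]
print([d.line for d in shown])
```

Because the filter is just a predicate over departures, the same object can decide which departures to show and which departures should raise an alarm.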
If enough data is available, the applet can show delays, news about the departures and stops on the route.
To make it easy to distinguish departures in the list, they are grouped by direction automatically. These groups are visualized by background colors. Each group can be turned off to filter out its departures.
The appearance of the applet is very flexible, e.g. its contents can be made very big (with big fonts/icons), which is useful if the applet is used like a big display panel.
A screenshot can be seen at the top of this post.
In the next version (0.10), frequently used journey searches can be saved as favorite journey searches with a meaningful name. These can then be executed with one click.


GraphicalTimetableLine Applet

This applet shows departures as vehicle icons moving on a street with nice animations.


GraphicalTimetableLine applet in action for a German stop with a tooltip


PublicTransport Runner

The runner can show departures, arrivals, journeys and stop suggestions using a simple query syntax, e.g. "Departures Bremen Hbf" to show departures from "Bremen Hbf". It automatically uses the default accessor for the user's country, so it can be used right away without configuration.
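A query of this form can be split into a keyword and a stop name as in the following Python sketch. It is not the runner's actual parser: only the "Departures" keyword is taken from the example above, while the other keywords and the function name parse_query are assumptions.

```python
# Only "departures" appears in the example above; the other keywords are assumed.
KEYWORDS = ("departures", "arrivals", "journeys", "stops")


def parse_query(query):
    """Split a runner query into (keyword, stop name), or return None."""
    keyword, _, stop = query.strip().partition(" ")
    keyword = keyword.lower()
    stop = stop.strip()
    if keyword not in KEYWORDS or not stop:
        return None
    return keyword, stop


print(parse_query("Departures Bremen Hbf"))
print(parse_query("Weather Bremen"))
```

Returning None for anything that does not start with a known keyword lets the runner ignore unrelated queries cheaply.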

The runner in action with a custom German keyword for "departures"

TimetableMate

This is a little IDE that helps with adding support for new service providers. It offers syntax completion, syntax error checking, complete script checking with sample data, an embedded web viewer (using KWebKit), a GUI for all accessor settings (name, author, template URLs, changelog, ...) and installation of new accessors.

TimetableMate with syntax completion


Problems

The biggest problem is getting the timetable data. Mostly, HTML documents need to be parsed, which of course isn't very nice. For some service providers there are better alternatives, at least for stop suggestions (e.g. JSON). For one service provider (de_rmv) an XML source is used to get departures/arrivals. I talked with "Deutsche Bahn" (de_db) about that, but so far I have only got a half-closed interface to get journeys (with much less information than the HTML source provides).
This will be partly resolved by the coming GTFS accessor type, which can be used for many new service providers (with ~10 lines of XML for each accessor). But here in Germany, for example, there is no publicly available GTFS feed.


Future
  • Better support for GHNS (Get Hot New Stuff) to download new accessors. This becomes more important with the many new GTFS accessors.
  • Finish GTFS support
  • New parser for journey search strings


Resources