Polyglot's Diary

Public Transport Mapping, why do we add the stop details several times over?

Posted by Polyglot on 27 July 2017 in English.

When we started mapping public transport stops, some people insisted on mapping them as nodes next to the way. Others thought the right way to do it was to add them as nodes that are part of the highway, thereby losing the information on which side of the road the bus stop was.

Then somebody came by with the idea to unite both ways of mapping. In itself that sounds great. But where do we add the details then? On both? That doesn’t really make sense. It’s a maintenance nightmare.

So we still have some people adding the stops as stop_position nodes on the highways and others mapping them as isolated nodes next to the ways as public_transport=platform. But of course a node is not a platform, so others map those as ways and areas. Nothing wrong with that, but why do we need to add all the details to these ways?

For some reason it was decided that both these stop_position nodes and the platform ways/nodes need to be added over and over again to the route relations. These route relations represent each and every variation of the public transport lines, so there are thousands of them. Another maintenance nightmare.

Why can’t we have a node next to the way, with all the relevant details and add those nodes to the route relations, then followed by a continuous string of ways? The node gets tagged public_transport=platform/highway=bus_stop.

The node isn’t always representing an actual platform. If there is a platform, nothing wrong with representing it as a way or an area. But there is no real need to duplicate all the details like name/ref/route_ref/zone to these ways. And there isn’t really a need to add them to the route relations.

For the simplest bus stops a node next to the way public_transport=platform/highway=bus_stop is all that is needed. It contains all the relevant information and it has coordinates, which makes it convenient to compare it to data from operators.
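As an illustration of why coordinates on a single node are convenient, here is a minimal Python sketch that compares such nodes with an operator's stop list. The dictionary layout, the matching on ref, the sample values and the 50 m threshold are all assumptions made for this example; no real operator feed or OSM API is involved.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def compare_with_operator(osm_stops, operator_stops, max_offset_m=50.0):
    """Match stops by ref; report refs missing in OSM and refs that are too far off."""
    by_ref = {s["tags"]["ref"]: s for s in osm_stops if "ref" in s["tags"]}
    missing, displaced = [], []
    for op in operator_stops:
        osm = by_ref.get(op["ref"])
        if osm is None:
            missing.append(op["ref"])
            continue
        d = haversine_m(osm["lat"], osm["lon"], op["lat"], op["lon"])
        if d > max_offset_m:
            displaced.append((op["ref"], round(d)))
    return missing, displaced


# Hypothetical sample data: two platform nodes in OSM, three stops from the operator.
osm_stops = [
    {"lat": 50.0, "lon": 4.0,
     "tags": {"highway": "bus_stop", "public_transport": "platform",
              "name": "Station", "ref": "1001"}},
    {"lat": 50.0, "lon": 4.1,
     "tags": {"highway": "bus_stop", "public_transport": "platform",
              "name": "Markt", "ref": "1003"}},
]
operator_stops = [
    {"ref": "1001", "lat": 50.0, "lon": 4.0},
    {"ref": "1002", "lat": 50.0, "lon": 4.05},
    {"ref": "1003", "lat": 50.01, "lon": 4.1},
]

missing, displaced = compare_with_operator(osm_stops, operator_stops)
```

With this sample data, stop 1002 is reported as missing and stop 1003 as displaced by roughly 1.1 km. The comparison only needs one object per stop, which is the point of keeping the details on the node.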

For more completely mapped bus stops, benches and waste_baskets can be added.

If you want to make explicit where the vehicle stops, a public_transport=stop_position can be added on the highway. For the first and last stops, the way should be split there, as it’s the beginning or end of the route.

But these stop_position nodes are not all that important, so no real need to map them for every stop. Also no reason why the stop details should be repeated on them and no real need to add them to the route relations. It’s enough to add them to stop_area relations.

Discussion

Comment from NunoCaldeira on 27 July 2017 at 11:37

PTv2 is the current schema. Some users are still using the old method as it’s easier to add stuff and you don’t have to be an “advanced” editor to edit relations and all that. Further info at https://wiki.openstreetmap.org/wiki/Public_transport#Different_Tagging_Schemas

From my point of view, it’s better to have bus stops in the old schema than not to have the bus stops at all. Clearly PTv2 is better, but it is more time-consuming to edit. Therefore advanced users should fix bus stops to PTv2 standards, simply because some users don’t know or care about relations, or don’t have time to edit in that schema.

Comment from Polyglot on 27 July 2017 at 12:22

I’m converting those older route relations to v2. That’s not the issue. What I’m concerned about is that v2 is far from perfect. It’s too complex and it has duplication of information everywhere.

I don’t mind redundancy, like having bench=yes on the bus stop node and an amenity=bench mapped explicitly on a node of its own. But having the details like name, ref, operator, network several times over and adding each stop to the route relations twice irks me.

To give mappers an easier time mapping public transport, or at least to help them not break those route relations when they make changes to the ways, I’m overseeing development of PT_Assistant.

But the scheme should be as user friendly as it possibly can be.

Polyglot

Comment from gileri on 27 July 2017 at 18:51

I don’t understand what the issue with PTv2 is. The previous schema was limited and imprecise.

The downside of having a more exact schema is that it’s less simple. But that’s the price to pay to have an accurate map. You can’t always map complex things with a simple schema; sometimes you need to accept a bit more complexity (let’s face it, it’s really not that complicated to map stops and lines).

What’s even better with PTv2, is that if you don’t want to map stop_positions for example, you can just map the platforms, and let someone else map the rest. Please just don’t delete correct information on the map because you don’t want “extra complexity”.

Comment from gileri on 27 July 2017 at 18:55

Also no reason why the stop details should be repeated on them and no real need to add them to the route relations

Why should the platforms be part of the route and not the stops? I don’t see why one should be prioritized above the other; in fact, I would argue that the stopping positions are more closely related to the route the bus/train/tram takes. So I think the best thing to do is to add them both to the relation; it’s really not a big deal.

Comment from Polyglot on 30 July 2017 at 18:10

Don’t get me wrong, we had to get away from the principle of throwing everything in a bag and hoping for the best. All v1 was good for was rendering where you might encounter a given line. There was no hope of being able to validate it automatically though. What I’m proposing is a way of mapping public transport without duplication of information. I don’t mind some redundancy like mapping bin=yes/bench=yes/shelter=yes and also mapping amenity waste_basket/bench and shelter as separate objects, but repeating the name, ref, operator, network, route_ref and so on will quickly get out of sync.
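To make "out of sync" concrete, here is a small Python sketch that compares the details repeated on two objects of the same stop. The key list follows the tags named above; the function name and the sample tag values are made up for the illustration.

```python
# Detail keys that tend to be repeated on several objects of one stop.
DETAIL_KEYS = ("name", "ref", "operator", "network", "route_ref")


def out_of_sync(platform_tags, other_tags, keys=DETAIL_KEYS):
    """Return {key: (platform_value, other_value)} for keys present on both
    objects whose values differ."""
    return {
        k: (platform_tags[k], other_tags[k])
        for k in keys
        if k in platform_tags and k in other_tags and platform_tags[k] != other_tags[k]
    }


# Hypothetical stop: line 3 was added to the platform node,
# but the copy on the stop_position was never updated.
platform = {"highway": "bus_stop", "public_transport": "platform",
            "name": "Station", "ref": "1001", "route_ref": "1;2;3"}
stop_position = {"public_transport": "stop_position",
                 "name": "Station", "route_ref": "1;2"}

conflicts = out_of_sync(platform, stop_position)
```

Here conflicts contains only route_ref, with the two differing values. If the details lived on one object only, there would be nothing to compare and nothing to drift apart.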

Of course, once all the details can be found on a node tagged highway=bus_stop public_transport=platform, it would make no sense at all to add other objects to the route relations. Especially given that there are a lot of route relations, one for each variation of a line.

Polyglot

Comment from AgusQui on 4 August 2017 at 04:02

On the stop_position node I only put the name, so that it is easy to order in the relation; the rest of the data goes on the platform.

Comment from xmd5a on 4 August 2017 at 21:20

I do not understand what you mean by duplicating the data, because it is possible to use a stop_area relation.

Comment from Polyglot on 4 August 2017 at 22:21

For me the ideal scenario is as follows:

Concentrate all the details (name, ref, zone, route_ref, operator, network) - on a node (as it has coordinates) - next to the highway (so it is immediately apparent in what direction the bus travels)

Add additional details like bench and waste_basket as separate nodes
Add additional detail like actual platform as way or area
Add shelter as area, like we do for buildings

Once all the details of the logical bus stop are on a single node, use this node in the route relations. To avoid clutter in the route relations, it would be better to not add them to the route relations.

The stop_area relations can be used to group all the objects together that make up the stop (platform node, stop_position node (if mapped), platform way (if present and mapped), shelter, waste_basket, bench (if present and mapped)). Repeating name on stop_area relations will make managing them easier, but no need to repeat the names on the platform ways or the stop_position nodes.
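A rough Python sketch of how a data consumer could use such a stop_area relation to find the details for any member without the details being repeated on it. The data layout (dictionaries keyed by type and id), the member roles "platform" and "stop", and all ids are assumptions for the example; it simply takes the platform node of the same stop_area as the object carrying the details.

```python
def details_for(obj_key, stop_areas, objects):
    """Look up the stop details for any member of a stop_area.

    obj_key is a (type, id) tuple. The details are taken from the platform
    node in the same stop_area, which is assumed to be the only object
    carrying name/ref/route_ref and so on.
    """
    for area in stop_areas:
        members = area["members"]
        if any((m["type"], m["ref"]) == obj_key for m in members):
            for m in members:
                if m["role"] == "platform" and m["type"] == "node":
                    return objects[(m["type"], m["ref"])]["tags"]
    return None


# Hypothetical objects: only the platform node carries the stop details.
objects = {
    ("node", 10): {"tags": {"highway": "bus_stop", "public_transport": "platform",
                            "name": "Station", "ref": "1001", "route_ref": "1;2;3"}},
    ("node", 20): {"tags": {"public_transport": "stop_position"}},
    ("way", 30): {"tags": {"public_transport": "platform"}},
    ("node", 40): {"tags": {"amenity": "bench"}},
}
stop_areas = [
    {"tags": {"type": "public_transport", "public_transport": "stop_area",
              "name": "Station"},
     "members": [
         {"type": "node", "ref": 10, "role": "platform"},
         {"type": "node", "ref": 20, "role": "stop"},
         {"type": "way", "ref": 30, "role": "platform"},
         {"type": "node", "ref": 40, "role": ""},
     ]},
]

stop_position_details = details_for(("node", 20), stop_areas, objects)
```

In this sketch the untagged stop_position node 20 resolves to the name and ref of platform node 10, so nothing has to be copied onto it.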

Comment from gileri on 4 August 2017 at 23:20

Concentrate all the details (name, ref, zone, route_ref, operator, network) - on a node (as it has coordinates) - next to the highway (so it is immediately apparent in what direction the bus travels)

Why is the node next to the road the “primary” feature on a PT route? Also, no, it may not be immediately apparent in a lot of cases, for example when the node is between two roads, or when buses run in an opposite lane.

Add additional details like bench and waste_basket as separate nodes Add additional detail like actual platform as way or area Add shelter as area, like we do for buildings

I agree, but I’m not sure there is an accurate enough method of drawing such features as of now.

Once all the details of the logical bus stop are on a single node, use this node in the route relations. To avoid clutter in the route relations, it would be better to not add them to the route relations.

Are you talking about “minor” features such as waste baskets and benches ?

The stop_area relations can be used to group all the objects together that make up the stop

Yes. However, linking features like waste_baskets is more debatable as they are not really related to the bus stop apart from proximity.

Repeating name on stop_area relations will make managing them easier, but no need to repeat the names on the platform ways or the stop_position nodes.

Ideally software and users would be able to parse information from the stop_area; sadly this may not always be the case. So I usually duplicate information there, but would not document or encourage this.

Comment from datendelphin on 5 August 2017 at 12:46

I think stop positions are emphasized too much in the wiki. I try to maintain the public transport stops in Switzerland, and I have a lot of awkward discussions with new mappers who help out, about how to map them. Too many times I see mappers moving stops from beside the road onto the road, thinking this is the new, correct way to map bus stops.

Your summary also cleared up a few misconceptions I had about the more detailed mapping scheme, thanks :)

I also think using just the node beside the road is the way to go for the simple cases. As gileri mentioned, there are situations that call for more information. Good to have that flexibility to use stop area and stop position in those cases. But using relations for every simple bus stop is a maintenance nightmare for me.

I will worry about shelter, waste basket and route relations after we are up to date with the stops themselves in Switzerland.

Comment from Polyglot on 6 August 2017 at 09:22

I think that we should be able to get from the simple cases to the fully mapped ones without the need to transfer the stop’s details from one object to another.

That’s another reason why I’d start with a public_transport=platform node and keep going with it, both to represent the stop and to add it to the route relations.

This also helps preserve the history of that stop, but most importantly it simplifies everything, regardless of whether a stop is already mapped in all its detail or not. It also helps consistency. Both stops in the middle of nowhere and stops that are part of complex bus stations are then mapped in the same way: on a node, next to the highway. Route relations are easy to sort. In case of ambiguity, we can add stop_area relations to show which platform node belongs to which stop_position / platform way. (This presupposes that stop_area relations contain pairs of platform node / stop_position node as a basis and not all stops that happen to have the same name; those can easily be found based on proximity.)

Polyglot

Comment from slhh on 8 August 2017 at 01:16

There are some valid thoughts included, but PTv2 has a much more important issue. It’s a nightmare for the maintenance of the highways/railways where many routes or variants exist, at least without accepting massive PT destruction. This is because the highway/railway ways are members of very many PT route relations, and this maintenance would often have to be done by users with little or no knowledge of PT mapping.

Any solution for this issue, which I can imagine, seems to become too complex to handle for data consumers without making heavy use of stop positions. Therefore, it doesn’t seem to be a good idea to drop stop positions or to make them less accessible from the route relation.

Comment from Polyglot on 8 August 2017 at 05:35

That’s exactly the reason why the PT_Assistant plug-in was created. It helps a lot to know where there are problems and some problems can be fixed (semi-)automatically.

I also live in hope that one day we’ll be able to use route relations containing other route relations like segments, such that each way only needs to be a member of 2 route relations, one for each direction of travel.

This will make maintenance a lot easier, but it comes with the price of some added complexity. At the moment it’s possible to visually check for continuity of the ways. We’ll need additional editor support in the relation editor though.

Comment from slhh on 9 August 2017 at 11:50

A tool like the PT_Assistant might help the PT expert to repair the damage, but it doesn’t solve the core issue. How should editors, like iD, handle PT relations for the normal users? If the editor protects the PT relations, it prevents maintenance of the highway/railway. If it doesn’t protect them, the PT experts get much work repairing the resulting damage. Are they really willing to do this work permanently?

We do need to get better editor support for PT, but not only for JOSM but also for iD. Excluding the majority of users from being able to map PT is quite bad for the data quality. We won’t get reasonable editor support as long as we haven’t defined and decided on a better PT data model. I can’t imagine reasonable editor support in iD for the current PTv2. Cloning JOSM’s relation editor into iD doesn’t make sense, because it would enable advanced users to map PT, but not the normal user. Therefore, it would be useless for the typical iD user.

Comment from slhh on 9 August 2017 at 23:08

I can imagine two different approaches to reduce the impact of highway/railway maintenance on PT data structures and vice versa. Both approaches are based on removing the ways from the route relations of directions and variants. This makes these relations quite different from traditional route relations. Therefore, I call these relations tour relations here.

The ways used by any direction or variant are added as members to the master route relation as if it were a PTv1 route. The tour relations contain a sorted list of the stops. A map can be easily rendered using the master route relations. Applications like passenger routing can be based on the tour relations, because the exact route (ways) is not really required for this application.

Applications requiring the exact route of a variant are likely rare. In this case data consumers need to autoroute based on the stop sequence from the tour relation and the limited set of ways from the master relation. A simple standardized routing algorithm has to be used in order to make the autorouting result well defined. The autorouting algorithm shall ignore tags, but respect forward/backward roles of the ways. We will need editor support or at least a QA tool to check that autorouting is possible and has a definite, correct result. If the result is incorrect, some via nodes can be added to the tour relation to fix the routing. We might also allow via ways to be used where applicable.

Stops shall contain nodes on the way (stop positions). Otherwise autorouting would become more complicated and error prone.

The ways are put into unsorted route segments. The route segments shall have a from and a to member specifying the start and end node of the route segment. This enables checking the completeness of the segment, and it helps to identify and find the segment. The route shall be split at stop positions, or in special cases at named via nodes, which have to be added to the tour relations. Because stop positions are named, the editor can generate a display name for the segment derived from the names of the from and to nodes. Because the route is split at nodes that are also included in the tour relation, the editor can find segments fitting the tour automatically, and offer them to be added as members of the tour.

We might even decide not to add the segments, except where it is ambiguous. In this case data consumers have to look up the segments. Maybe an extension of the Overpass API can help to do this.

We might also let the editor add the segments to the tour relation automatically if unambiguous.
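The segment lookup described above could be sketched in Python roughly as follows. This is only an illustration: the dictionary layout for segments, the function name and the ids are invented, and forked segments are not handled.

```python
from collections import defaultdict


def segments_for_tour(tour_nodes, segments):
    """For each consecutive pair of tour nodes, find the segments whose
    from/to members match, and classify the pair as unique, ambiguous or missing."""
    index = defaultdict(list)
    for seg in segments:
        index[(seg["from"], seg["to"])].append(seg["id"])

    resolved = []
    for a, b in zip(tour_nodes, tour_nodes[1:]):
        candidates = index.get((a, b), [])
        if len(candidates) == 1:
            status = "unique"      # could be added by the editor automatically
        elif candidates:
            status = "ambiguous"   # the segment must be an explicit member
        else:
            status = "missing"     # no segment has been mapped for this leg
        resolved.append(((a, b), status, list(candidates)))
    return resolved


# Hypothetical tour over four stop positions and three mapped segments.
tour = [10, 20, 30, 40]
segments = [
    {"id": "s1", "from": 10, "to": 20, "ways": [101, 102]},
    {"id": "s2", "from": 20, "to": 30, "ways": [103]},
    {"id": "s3", "from": 20, "to": 30, "ways": [104, 105]},  # alternative route
]

result = segments_for_tour(tour, segments)
statuses = [status for _, status, _ in result]
```

For this sample, the first leg resolves to s1 on its own, the second leg is ambiguous between s2 and s3, and the last leg has no segment yet.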

We might allow segments to have forked ends with multiple from or to members, but use should be limited to multiple stop positions of the same station. Otherwise generating a display name for the segment would be complicated. Using forked segments can reduce the number of segment relations significantly, but it adds some complexity for data consumers to remove unused branches.

Comment from Polyglot on 19 August 2017 at 07:06

I’m sorry slhh. I fail to see how such ‘hinting to itineraries’ could be validated automatically. So I don’t see that as a valid way forward. If we want to have less impact of route relations of ways, a better approach would be to make smaller segment relations and use those in the relations describing the itineraries. For the time being the ‘explicit’ way of describing the ways used by the buses and in which order is the most straightforward (and clear for both users and data consumers) way to go.

DatenDelphin, I would like it best if we can continue using that node next to the way, all through the lifetime of the stop. So no conversion to stop_position node or platform way.

Comment from slhh on 19 August 2017 at 20:27

If we want to have less impact of route relations of ways, a better approach would be to make smaller segment relations and use those in the relations describing the itineraries.

This is essentially the second approach, which I have proposed above.

I fail to see how such ‘hinting to itineraries’ could be validated automatically.

I would disagree on automatic validation being impossible. The relation could no longer be verified separately, but the other involved relations can be easily determined automatically, and validation can be done based on the determined complete set of relations.

In case of my first approach, the following automatic validations seem to be possible:

Routability of each itinerary based on sole use of the ways in the master route relation
Uniqueness of the routing result
No unused ways in the master route, which aren’t used by any itinerary
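A possible Python sketch of these three checks. The data layout is invented, and the routing used here (enumerating paths that visit no node twice, respecting forward/backward roles and ignoring tags) is only a stand-in for the "simple standardized routing algorithm", which has not been defined yet. In this sketch a way counts as used only if it lies on a uniquely routed leg.

```python
def build_edges(ways):
    """ways: {way_id: {"nodes": [...], "role": "" | "forward" | "backward"}}.
    Returns {node: [(next_node, way_id), ...]} respecting the member roles."""
    edges = {}
    for way_id, way in ways.items():
        nodes = way["nodes"]
        for a, b in zip(nodes, nodes[1:]):
            if way["role"] in ("", "forward"):
                edges.setdefault(a, []).append((b, way_id))
            if way["role"] in ("", "backward"):
                edges.setdefault(b, []).append((a, way_id))
    return edges


def simple_paths(edges, start, goal, limit=2):
    """Return up to `limit` paths without repeated nodes, as lists of way ids."""
    found = []

    def walk(node, visited, used):
        if len(found) >= limit:
            return
        if node == goal:
            found.append(list(used))
            return
        for nxt, way_id in edges.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                used.append(way_id)
                walk(nxt, visited, used)
                used.pop()
                visited.remove(nxt)

    walk(start, {start}, [])
    return found


def validate(ways, itineraries):
    """Check routability and uniqueness per leg, then list ways that are not
    on any uniquely routed leg."""
    edges = build_edges(ways)
    used_ways, problems = set(), []
    for name, stops in itineraries.items():
        for a, b in zip(stops, stops[1:]):
            paths = simple_paths(edges, a, b)
            if not paths:
                problems.append((name, a, b, "not routable"))
            elif len(paths) > 1:
                problems.append((name, a, b, "ambiguous"))
            else:
                used_ways.update(paths[0])
    return problems, sorted(set(ways) - used_ways)


# Hypothetical master route: w2 is one-way, w3 is a parallel two-way detour,
# w4 is a leftover way that no itinerary uses.
ways = {
    "w1": {"nodes": [1, 2], "role": ""},
    "w2": {"nodes": [2, 3], "role": "forward"},
    "w3": {"nodes": [2, 4, 3], "role": ""},
    "w4": {"nodes": [5, 6], "role": ""},
}
itineraries = {"outbound": [1, 2, 3], "inbound": [3, 2, 1]}

problems, unused = validate(ways, itineraries)
```

In this sample the outbound leg from 2 to 3 is reported as ambiguous, because both w2 and w3 lead there; a via node on one of them would be needed to make the result definite. The inbound direction is unique, since w2 cannot be travelled backwards.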

In case of my second approach, automatic validation seems to be quite straightforward. Even if we allow omitting the segment as a member where it is unique, the route can be automatically determined, and we can validate for existence and uniqueness automatically.

Comment from slhh on 19 August 2017 at 23:00

If you want to make explicit where the vehicle stops, a public_transport=stop_position can be added on the highway. For the first and last stops, the way should be split there, as it’s the beginning or end of the route.

But these stop_position nodes are not all that important, so no real need to map them for every stop. Also no reason why the stop details should be repeated on them and no real need to add them to the route relations. It’s enough to add them to stop_area relations.

There can be multiple stop positions per platform. If the route contains platform nodes only, the correct stop position cannot be determined. We would have to use the stop_area, which contains a single stop and a single platform, as a member of the route instead. This doesn’t seem to be convenient. Alternatively, we can place a separate platform node per stop position, but this would change the definition of public_transport=platform, including changing the meaning of existing data. Therefore, we must not use public_transport=platform in this case.

Comment from datendelphin on 29 August 2017 at 15:58

slhh: I’m not sure I understand your comment correctly. I thought route relations can contain both, the stop position and the platform. So why “not use public_transport=platform in this case”?
