MikeN's diary

Recent diary entries

US Rail Crossing Statistics

Posted by MikeN on 16 August 2017 in English (English)

Out of curiosity, I pulled some statistics on the US Rail network. This does cross into a bit of Canada and Mexico where the GeoFabrik extract approximated the boundary.

level_crossing	TOTAL	232167
crossing	TOTAL	8231

Rail_Bridge	47232
Rail_Tunnel	16394
Highway_Bridge	70811
Highway_Tunnel	8972

Total Layer Crossings 143409

A bridge or tunnel is counted as a single occurrence no matter how many rail lines are included. Each marked crossing node is counted, so a fully mapped rail yard could contain many crossing nodes.

For a closer look, here is a breakdown of the top 21 categories of level_crossing. The first column is the type of ‘highway’ and the second column is the type of ‘railway’ in OSM:

residential	rail	112659
service	rail	29171
tertiary	rail	26789
secondary	rail	16717
unclassified	rail	12732
track	rail	10058
primary	rail	6734
residential	disused	1628
residential	light_rail	1547
residential	tram	1413
trunk	rail	1122
tertiary	light_rail	1114
secondary	light_rail	1059
service	light_rail	676
residential	abandoned	660
primary	light_rail	517
tertiary	tram	493
service	disused	435
tertiary	disused	372
residential	Unknown	371
residential	preserved	353

A railway type of ‘Unknown’ usually means that the node is not on a railway at all. For example, a node that joins one ‘highway’ to another ‘highway’ has been tagged as level_crossing.

Similarly, here are the top 11 ‘crossing’ types:

footway	rail	2984
footway	light_rail	2083
path	rail	881
cycleway	rail	762
footway	tram	565
path	light_rail	116
cycleway	light_rail	96
footway	miniature	86
footway	preserved	62
residential	rail	56
service	rail	34

Note the presence of ‘residential’ and ‘service’. These are either road crossings that were newly and incorrectly marked as ‘crossing’, or a rail crossing that sits exactly at the junction of, for example, a ‘residential’ road and a ‘path’.

To see the complete breakdown, or to perform custom category groupings, obtain the raw .CSV files from Rail Crossing Counts.
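As a minimal sketch of one such custom grouping, the Python below rolls the counts up by railway type. It assumes the rows carry the same three tab-separated columns as the tables in this entry (highway type, railway type, count); the real .CSV layout may differ. The sample rows are copied from the level_crossing table, and `group_by_railway` is a hypothetical helper name.

```python
import csv
import io
from collections import Counter

# Sample rows copied from the level_crossing table in this entry.
# Assumed layout: highway type, railway type, count (tab-separated).
SAMPLE = (
    "residential\trail\t112659\n"
    "service\trail\t29171\n"
    "residential\tlight_rail\t1547\n"
    "residential\ttram\t1413\n"
    "tertiary\tlight_rail\t1114\n"
    "tertiary\ttram\t493\n"
)


def group_by_railway(text):
    """Sum the crossing counts for each railway type."""
    totals = Counter()
    for highway, railway, count in csv.reader(io.StringIO(text), delimiter="\t"):
        totals[railway] += int(count)
    return totals


totals = group_by_railway(SAMPLE)
for railway, count in totals.most_common():
    print(railway, count)
```

Swapping the key from `railway` to `highway`, or to a hand-made category map, gives other groupings with the same loop.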

Maproulette Rail Crossings completion

Posted by MikeN on 7 August 2017 in English (English)

The group of railroad crossing challenges listed in the previous diary entries is now complete. Many people worked on these challenges to make this happen. The tasks seemed to get more difficult as the challenge neared completion, since each task would correct a single crossing at a time.

Many of the tasks involved original TIGER highway crossings with poor alignment. I also worked on geometric alignment of the crossing approach roads to match the aerial imagery. Although almost none of these would be cross-checked against GPS traces, Bing imagery is almost always a factor of 10 more precise than the worst of the TIGER data. The resulting edits may have an effect on “TIGER desert analysis” - the study of large untouched areas in the US. Those “TIGER deserts” still exist but would be smaller. My personal contributions heat map now mostly shows areas in the US where there were rail crossings that needed fixing:

US Contributor Heat Map

I have fond memories/nightmares of a task landing in West Virginia or Kentucky, and realizing that the only way I could fix the crossing called out by the task would be to untangle the surrounding 30 miles of rail and roads.

The challenges took a total of 10 months to complete. I attempted to keep the quality of tasks relevant by re-running the analysis every week. Thus any fixes completed outside of MapRoulette or edits of surrounding crossings would automatically be marked complete in MapRoulette.

In the role of the “Monday Morning Quarterback” (with perfect hindsight), there are things that ideally would be done differently or included in a more detailed task definition page:

  • Separate tasks into additional challenges to reduce the time needed to analyze each one. For example, a ‘duplicate rail crossing nodes’ task would define a workflow even if there is already a visible ‘X’.
  • Create a video tutorial for the most popular editors to show the workflow for all the common tasks and how to fix them. Even having screenshots of how to identify the most common US bridge and tunnel types would help.
  • I was able to concentrate on tasks marked as Skipped or ‘Too hard’ by querying the API. If the crossing could not be seen in either the default or the alternate aerial imagery, for example because it was under a bridge stack that might be 3 layers deep, the final task completion was to remove the task and leave an OSM Fixme or OSM Note. It would be useful to add a preference for those task status types in MapRoulette.
  • Define a helpful sequence of alternate imagery: latest TIGER, NCOneMap (for North Carolina leaf-off imagery), SRTM elevation to tell a tunnel entrance from barriers on a decommissioned highway crossing, and USGS Large Scale Imagery / NAIP as the latest imagery for road realignments.
  • Define a sequence of analysis when encountering a crossing that looks correct: was it recently fixed by someone else, or is the problem difficult to see? (1. Try a node “J”oin in JOSM. 2. Check the date of the last crossing node edit and mark as already fixed if it was corrected in the past week. 3. Check the last edit date of both the highway and the railway and mark as already fixed if either was corrected in the past week.) The trickiest of these was a bridge-road crossing where all edits were much older than the last OSM synchronization. The problem was a ‘duplicate bridge’: two 2-node ways where one ‘bridge’ had no layer or bridge tag.
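The date checks in steps 2 and 3 of that last item could be sketched roughly as follows. This is only an illustration in Python: `already_fixed` is a hypothetical helper, and how the last-edit timestamps are obtained (editor history, API lookup) is left open because the workflow above was done by hand.

```python
from datetime import datetime, timedelta

RECENT = timedelta(days=7)  # "corrected in the past week"


def already_fixed(last_edits, now):
    """Return True if any related object was edited within the past week.

    last_edits: last-edit timestamps of the crossing node, the highway
    and the railway (hypothetical inputs, looked up by the reviewer).
    now: the time of the review.
    """
    return any(timedelta(0) <= now - edited <= RECENT for edited in last_edits)


review_time = datetime(2017, 1, 24)
node_edit = datetime(2017, 1, 20)     # edited 4 days before the review
highway_edit = datetime(2015, 6, 1)   # untouched for a long time
print(already_fixed([node_edit, highway_edit], review_time))
```

If the function returns False for all three objects, the problem is probably a hard-to-see one such as the ‘duplicate bridge’ case.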

As a side note about the internal detail of MapRoulette, it seemed as though the PostgreSQL RANDOM function was not truly random, as though there was an internal optimization or spatial caching of a previous RANDOM call. My only evidence of this was that when I encountered a 2-track crossing and corrected both level crossing nodes for the first task, the second crossing task would appear after only a small number of other tasks.

Because OSM is constantly being edited, there are now new unmarked crossings identified. I have not extended the Rail Crossings challenge because many of them come from new construction where there is no imagery to determine whether the crossing should be a level crossing or bridge. I may add these if I can find a way to ensure that just the solvable crossings are identified.

The end result of these challenges is that the Rail-Road network intersections in the US are much more accurate. They could serve as a reference for routing apps that generate an alert for level crossings. This application has one case that doesn’t fit easily in the OSM tagging convention: busy tram crossings in cities, where there may be many crossing alerts generated over a short distance. Although they are actual rail crossings that would result in serious damage in an accident, continuous alerts are a case of ‘crying wolf’. This was briefly discussed in “railway=level_crossing with in-street trams?”, but without any definite resolution. For now, the application developer also needs to examine the crossing rail type to screen out tram crossings, which is probably not an ideal solution.
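A rough sketch of that screening step, in Python, might look like the following. The dictionary keys `node_tag` and `rail_type` and the `should_alert` name are hypothetical; only the idea of dropping tram crossings comes from the paragraph above.

```python
# Rail types for which an app might suppress alerts. Only "tram" comes
# from the discussion above; treat the set as an adjustable assumption.
QUIET_RAIL_TYPES = {"tram"}


def should_alert(crossing):
    """Decide whether a level crossing should trigger a driver alert.

    crossing: dict with hypothetical keys "node_tag" (the railway=* value
    on the shared node) and "rail_type" (the railway=* value of the way).
    """
    if crossing.get("node_tag") != "level_crossing":
        return False
    return crossing.get("rail_type") not in QUIET_RAIL_TYPES


crossings = [
    {"node_tag": "level_crossing", "rail_type": "rail"},
    {"node_tag": "level_crossing", "rail_type": "tram"},
    {"node_tag": "crossing", "rail_type": "rail"},
]
alerts = [c for c in crossings if should_alert(c)]
print(len(alerts))
```

The drawback noted above still applies: the filter lives in the application rather than in the OSM tagging.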

Progress Report: Railway Crossings challenge for MapRoulette

Posted by MikeN on 24 January 2017 in English (English)

Update on challenge status:

Crossing Ways Rail-Highway 35.6% complete
Crossing Type Rail-Highway 53.3% complete

Both types of pedestrian crossings are 100% complete.

I’m taking time to do geometric repairs as well. The result will be a reasonably correct road network near rail crossings.

The links to the remaining challenges are:

[Crossing Ways: Highway-Railway, US]

[Crossing Type: Highway-Railway, US]

Progress Report: Railway Crossings challenge for MapRoulette

Posted by MikeN on 18 November 2016 in English (English)

As mentioned in a previous diary entry, the MapRoulette Rail challenge uses topological analysis of the rail network to generate the task list. Many tasks land on a multi-track crossing or include a rail yard, so many tasks fix more than 1 crossing per edit session. Also, anyone routinely reviewing an area might correct rail crossings outside of MapRoulette. By re-analyzing current OSM data, finished tasks were marked off. Some findings for the last interval:

  • 1221 tasks marked complete in MapRoulette
  • 5468 tasks auto-marked complete after detection as fixed
  • 4.5:1 node correction per task ratio
  • Overall topological rail challenge 16% complete 10925 / 68758 tasks
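The derived figures in this list can be reproduced from the raw counts. In the small Python check below, reading the 4.5:1 ratio as auto-marked tasks per manually completed task is one plausible interpretation that matches the numbers, not something stated in the list itself.

```python
manual = 1221               # tasks marked complete in MapRoulette
auto = 5468                 # tasks auto-marked complete after re-analysis
done, total = 10925, 68758  # overall challenge progress

ratio = auto / manual
percent = 100 * done / total

print(f"{ratio:.1f}:1")     # auto-marked per manual task
print(f"{percent:.0f}%")    # overall completion
```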

Because the first MapRoulette tasks are more likely to land in a ‘busy’ area, future fix rates and the node correction to task ratio will probably fall as each task is more likely to point to an isolated crossing.

Many tasks end up making geometric corrections to the rails and surrounding roads.

The links to these challenges are:

[Crossing Ways: Highway-Railway, US]

[Crossing Ways: Pedestrian-Railway, US]

[Crossing Type: Highway-Railway, US]

[Crossing Type: Pedestrian-Railway, US]

There was a moment of panic as the analysis also discovered thousands of new tasks! This had 2 causes:

  • An abandoned rail line / cycle trail was changed to be a railway. I have commented on that changeset asking for more information.
  • A node from changeset was moved across the country near the following changeset in JOSM. From a history examination, I’m not sure if the rail node was attached to Benton Street. Maybe there is a JOSM hotkey that does this. To try to avoid this in the future, I have changed MapRoulette to open JOSM in a new layer.

Railway Crossings challenge for MapRoulette

Posted by MikeN on 30 October 2016 in English (English)

One class of quality improvements in the US is Railway Crossings. The original TIGER import mostly connected railways to highways or crossed with a duplicate but unconnected road. There was no bridge or crossing information to assist.

It can be useful to have railway crossing information available for navigation. In 2015, MapRoulette defined a challenge to review all railway crossings. The first version of the Railway Crossings challenge at MapRoulette used the points defined by the US Federal Railroad Administration. Many of these crossings had already been corrected by map editors during normal QA. The challenge began with 120K crossings to review. This was reduced to some 70K points by the time MapRoulette V2 came out. The partially completed challenge was not migrated because the MapRoulette V2 features were being tested and improved. Although this is not the ideal type of task for MapRoulette, I enjoyed being able to knock out 5 in a row without much effort (unless in KY, PA or WV!). It also is ideal for an armchair challenge: only a few are difficult to make out from the air.

Because I would typically correct nearby crossings when fixing a task (others may have also), I wondered if identifying remaining crossings with a topological analysis would result in fewer false positives, and fewer already-completed tasks to review. So I set up a PostGIS instance and tried to construct queries that would identify problem crossings. That proved to be too difficult:

  • I couldn’t tell whether osm2pgsql or Osmosis populates the database with the full OSM topology, including where nodes are shared, and even if so, what such queries would look like.
  • I only had 8 GB of RAM, and didn’t know how to construct PostGIS queries that would handle US-sized data in a reasonable time.

In the end, I wrote a program in C# to analyze railway crossings. I filtered the raw OSM data in Osmosis so that my program only needed to deal with highways and railways. As I looked at some results, I realized that I could also quality check pedestrian-railway crossings with the same program, and create another challenge.

As I was looking at the first analysis, I found many bridges without a layer tag. Some would say that bridges imply a nonzero layer, but it is still better to specify. For this challenge however, I excluded all bridges, tunnels, and ways with a layer attribute. My thinking is that those locations do not have a typical railroad crossing, and someone has already done some review there. And there are other OSM QA tools that already address bridges with missing layer tags.

I also exclude these railway types: abandoned, razed, station, disused, dismantled, demolished, adjacent, platform. Although a ‘disused’ railway may cross a road, I saw too many of these with no X painted on the road and could not tell whether rails were even present. Often they would require local or RailFan knowledge to be accurate.
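The exclusions in the last two paragraphs amount to a simple filter on way tags. The Python below is a minimal sketch of that idea, not the actual C# program; `is_candidate_way` is a hypothetical name, and treating `bridge=no` / `tunnel=no` as "not a bridge or tunnel" is an assumption.

```python
# Railway types left out of the analysis, as listed above.
EXCLUDED_RAILWAY = {
    "abandoned", "razed", "station", "disused",
    "dismantled", "demolished", "adjacent", "platform",
}


def is_candidate_way(tags):
    """Return True if a way should take part in the crossing analysis.

    tags: dict of OSM tags for a highway or railway way.
    """
    # Skip bridges and tunnels (assumption: an explicit "no" does not count).
    if tags.get("bridge", "no") != "no" or tags.get("tunnel", "no") != "no":
        return False
    # Any layer tag suggests the spot has already been reviewed.
    if "layer" in tags:
        return False
    return tags.get("railway") not in EXCLUDED_RAILWAY


print(is_candidate_way({"railway": "rail"}))
print(is_candidate_way({"railway": "disused"}))
print(is_candidate_way({"highway": "residential", "bridge": "yes"}))
```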

When railways intersect a roadway and share a node, I check for a railway=level_crossing node tag. When railways intersect a sidewalk, path, or cycleway, I check for a railway=crossing tag. Many highway crossings are marked as a pedestrian crossing because the mapper’s natural choice is ‘this is a railway crossing’, therefore railway=crossing.
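In code form, the tag check on a shared node might look like this Python sketch. The function names are hypothetical, and mapping "sidewalk, path, or cycleway" to the highway values footway, path and cycleway is an assumption about how those ways are tagged.

```python
# Highway values treated as pedestrian/cycle ways (an assumption; the
# text names sidewalks, paths and cycleways).
PEDESTRIAN_HIGHWAYS = {"footway", "path", "cycleway"}


def expected_crossing_tag(highway_type):
    """Return the railway=* value expected on the shared node."""
    if highway_type in PEDESTRIAN_HIGHWAYS:
        return "crossing"
    return "level_crossing"


def check_shared_node(highway_type, node_tags):
    """Return None if the node is tagged as expected, else a description."""
    expected = expected_crossing_tag(highway_type)
    actual = node_tags.get("railway")
    if actual == expected:
        return None
    if actual is None:
        return f"missing railway={expected}"
    return f"railway={actual} should be railway={expected}"


# The common mistake described above: a road crossing tagged as pedestrian.
print(check_shared_node("residential", {"railway": "crossing"}))
```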

Because the OSM data is the starting reference, a crossing will not be flagged where a driveway crosses a railway on the ground but the driveway does not exist in OSM.

The links to these challenges are:

[Crossing Ways: Highway-Railway, US]

[Crossing Ways: Pedestrian-Railway, US]

[Crossing Type: Highway-Railway, US]

[Crossing Type: Pedestrian-Railway, US]

Some C# problems I encountered (with the Microsoft .NET library):

“640K of memory ought to be enough for anyone”. Out of memory! What?! It turns out that the default build properties are set to “Prefer 32-bit”. Unchecking that option lets the program run as a 64-bit process with a much larger memory limit. Be sure to do that for both Debug and Release!

Rerun - it gets further, but now “The dimension of the array exceeds the limits of addressing.”

“47995853 nodes ought to be enough for anyone”

The next problem encountered was the discovery that the default hash implementation supports only 47,995,853 objects before giving an out of memory error. Fortunately the error is easy to work around. In the application configuration file, configure the runtime to support very large objects with the gcAllowVeryLargeObjects element:

    <configuration>
      <startup>
        <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.6.1"/>
      </startup>
      <runtime>
        <gcAllowVeryLargeObjects enabled="true" />
      </runtime>
    </configuration>