baditaflorin's Diary

Recent diary entries

How to create a translation file to convert a shp file to osm

Posted by baditaflorin on 31 October 2017 in English (English). Last updated on 1 November 2017.

Obviously but very importantly:

  • Before copying anything into OSM the license needs to be properly vetted.
  • If you are planning to perform any kind of bulk update, the full extent of the Import Guidelines applies.

After comparing Tiger 2017 for 16 US states until now, using shp2cygnus, a tool that I written at Telenav to be able to compare big datasets in a automatic way, to identify roads that are present in the dataset, but not in OSM

After posting on the talk-us, some people told that it would be more useful to run this tool on other datasets, from state or local authorities.

In this article, I want to teach you how to create your own translation file, so that if you have a shp file, you can use Cygnus to compare this with OSM, using this link If you have a shp were the size is bigger then 50*50 km, you can send me a email and I can process it for you.

But you need to learn how to create your own translation file.

So, first step.

  1. Find your data

We will use as a example Philadelphia Each city or local authority should have a open data portal. After a google search for “Philadelphia Open data”, I found this site image

Search for either street or road


Click the data-set, and identify the shp, that you need to download


Extract the archive and open the shp into JOSM.

To be able to open shp in JOSM, you will need to install the OpenData Plugin


If we look at the tags, we see that they don`t have any connection with what we are used to, like highway=residential, tertiary, etc or oneway=yes

To be able to understand how to translate the shp to osm, you need the Street Centerline Metadata


Click Download, and you will see, at least for the Class, what is the highway type.


To see how you can create the translation for osm, you can look here for the OSM highway tags

After doing this, run this script, and then load the output.osm file into JOSM.

python -t –add-timestamp –add-version –positive-id input_shapefile.shp -o output.osm

It will not be perfect, this is after the first try.


Comparing this with osm data I see that I have to much yellow roads (tertiary)

Also, the trunk roads are actually motorway on OSM


We look at the shapefile and identify that Class=4 should be highway=residential, instead of highway=tertiary


We run again the converter image

And we get now a more better result, more accurate with what we have in OSM.


If we are happy with the result, we need to save the osm XML file to PBF, so that we can use them with Cygnus.

We do this by installing the pbf plugin in JOSM


Save the file as Pbf


We next go to the website, and load the pbf file.


Also, you can select some tags of interest, for me, this was oneway.


In the end you should see something like this


You can download the translation file for Philadelphia from here

Yesterday we had our anniversary OSM Time meetup, where we celebrated 13th years of OSM, and we also had a workshop for the participants, to learn how they can extract information’s from OSM, using overpass API and overpass-turbo.

Attaching 2 slides examples :

You can find the slides here

You can find the materials for the participants here

Location: Center, Cluj-Napoca, Cluj Metropolitan Area, Cluj, Romania

1. The most simple way is to count just the ways with the highway tag that are at version 1

But …

This fails when we approach the fact that when you split a way, the longer segment will remain with the WAY id, and the smaller way will get a new ID, meaning that it will be version 1. Alt text

2. We can use the nodes of the ways

Using the nodes as a identifier, because even if the way is new, all of the nodes are still from the old way, meaning that the timestamp will be different, the user that created the node will be a different user then the user of the way.

3. But how can we find out this in OverPass ?

Until now, i have this overpass turbo (Go To Halifax, Canada) script that tries to do some filtering, meaning that it will give just the ways that are version 1, and will filter ways that have more then 3 tags. (Because when we add roads, we add a maximum of 3 tags, name, highway type and oneway) If i split a way that have 6 tags, even if it will be version 1, it will be excluded because it have more then 3 tags.

But there should be a better method. And here we still get some ways that are splitted but have less then 3 tags. And in residential areas you get a lot of false positive.

Any ideas how to refine or other technique that i could use to achive the task of getting just the new roads ?

And don’t tell me use the full planet history, because there is no such think as a full planet history file that you can use, compile, run. Nope, it`s a legend.

If i try to use the copy and paste function from the keyboard, more times than not it will not work.

Only when using the menu Edit -> copy it will copy and paste in the first go.

Are you having the same problem or there is something local with me and my colleagues from the map analyst team that we are having this problem ?

I made a ticket

And for simplicity, also, a video showing the error.

How to find out the total length of highway per type of road / per Country in PostGIS

using the OSM postgis Script Repository This is a part of the final result.

  • Step 1. - Download the North-America osm.pbf file from Geofabrik __(7.4 Gb)__

The link is here You can use this command line to download the file in linux.

  • Step 2. - Load the data into PostGIS using the utility tool

You can find the script here The aim of this tool is to simplify the process of importing a osm.pbf file into PostGIS. Now the procedure is complicated and you first have to create the database, then to enable postGIS and Hstore on that database, etc

Here, you have a very simple process where you need to tell the name of the postgres user, the name of the database that you will create and the name of the osm.pbf file that you want to import, without the extension. At the end you will see something like this :

INFO: Total execution time: 93403048 milliseconds.


                          Map Analyst Team

 Find Postgis Scripts and snippents of code that you can use here
 It is a Open Source Project so you can also contribuite
 with you PostGIS Code to make the repository more complete
  • Step 3. - Wait 25 hours for the file to import.

In the end we will have a 280 Gb database, that is composed of over 59M ways, 855M nodes, and 500k relations

  • Step 4. - Find out if a node is from USA,Canada or Mexico.

Now, all of or nodes and ways have GPS coordinates, but we do not know if they are in Canada or Mexico. Fortunately, there is a great guy in Germany that goes by the name of Walter Nordmann that have a server where he hosts the OSM Boundaries Map using the data from OSM. He is doing this as a volunteer, so if yoi can help with any donations this will definitely help him to keep the server running. The website is To do this we will download the admin_level=2 (al_2) boundaries of Canada.Mexico and USA.

  • Step 5. - Load the al_2 boundaries into the same postGIS database

To do this i used QGIS, a free and open source GIS tool, where you first need to connect to the database by doing :

Layer -> Add Layer -> Add PostGIS Layer.

In the new windows, under Connections click New and setup the connection.

After this, click OK and close the windows

We then go to

Database -> DB Manager -> DB Manager

We find and select our database from the left part of the screen.

We click on the Table Menu and we select Import vector layer

We put the name of the new table to be al_2 and we also select the index and we click ok

  • Step 6. - Calculate the total road length per each type of road and for each Country

This is the simple part, because we already have a OSM github script page with PostGIS scripts that you can run it for a city, a country or the planet, and they will work the same. The link is this OSM PostGIS script repository and the script that we are interested can be found here highway_length_per_type_different_attr.sql

  • Step 7. - Run the script

If you would run the script from the osm PostGIS script repo, you will get the total value of all of the roads in North America. This is why we need to see where each of the roads are located, and we do this using a inner join and ST_contains

The modified script is available as a gist here

I have not used the admin_level=2 borders that i have imported because the geometry of them was to complicated and it would have taken to much to get the result.

Instead i over-simplified the geometry until i got something really cartoon-ish compared to the first example. I first used the plugin SimpliPy developed by Albert Ferràs. You can download the plugin from Qgis Web link

  • Step 8 - Test and optimize the script, the polygon for the st_contains

I put a limit of 500 ways and tested to see how it will compare if i try to run the 500 examples using the bulk admin_level=2 boundaries, if i simplify and if i simplify and then also delete a lot of nodes so that i will speed up the process of Geocoding.

admin_level=2 no simplification. al_2_simplipy al_2_simplipy_extreme
16 sec 0.7 sec 0.3 sec

You have to have in mind that at the borders you will get some mixed results, where some parts of a town in Canada can become counted as part of USA, because of the over-simplification, but this is a useful when we have 59M ways, from where 22M ways have highway informations.

If for 500 we had to wait 16 seconds, for 5.000.000 ways, or a quarter of the total number of ways that at the worst case scenario we will have to wait 160.000 seconds, or
44 hours until the task will be completed.
For the complete 22M ways, this would mean around one week until we get the results.! dsds

By doing the simplification we reduce the amount to 0.7 seconds * 10000 = 7000 seconds or 2 hours for 25 % of the dataset. In 8 hours it should give us the results.

The last simplification meant that we start to deleting almost everything, trying to get a square, at least where i know that there is only water.

  • Step 9 - Run the script and get the results

In the end the script completed in 2861 seconds, or around 50 minutes.

The table with the results for North America can be seen here

The table was generated using this link

Hello dear OSM community.

It was been a while since i posted in the diary.

I had made a new script that i had put in the OSM postGIS scripts repo

You can find it in the statistics folder, under the name highway_length_per_type_different_attr.sql

How can you do the same for your country or region ?

First, download and run the (linux) or the scope.bat (win) utility that you will use to load the file into postGIS.

After you load the file in postGIS, you just have to connect to the database either in PgAdmin or via Qgis Dbmanager and copy and paste the code.

It took 20 seconds to generate this result for the map of Romania (~150 MB pbf file)


Please help me with ideas of what other scripts would there be interesting for you. Thanks and see you at SOTM in 3 weeks

I define a grid city as a city formed from straight roads.

You can read more and see a video where i try to explain here

Now, if we have a matrix of 7*7 grid, we have to click almost 50 times to add the city.

With this solution, we should only add the vertical and horizontal lines, clicking 14 times, and then using the Validation, it will see that we have 49 Crossing ways, and would offer the possibility to fix, by merging the overlapping ways together


1 1 2 2 3 3 4 4 5 5 6 6 7 7


I will not explain here all the process, but if people are interested, i can do a short video explaining all the process.

I have used the OSM Postgis Script to load the data in PostGIS. This was made for Stereo70 projection, but can be adapted for wsg84

The result it`s a completely 3D map that you can navigate and see each all of the edits made in one year or for each month (not shown in the video or photos)

3d usage

In this image you can see the ways that have been created or last modified in 2016 (top layer), then 2015, then 2014, then 2013, then 2012, then 2011, 2010,2009,etc

3d usage

2015(upper part),2014(middle),2013(lower part)

3d usage

You can use this to split a city into the different users that edited in that city, or by versions,etc

You can see a live demo in this youtube video that i made

This script works in 2 parts.

1. First it will generate from the osm.pbf file the list with all the admin_level=4 (al_4)

Then it will remove the ones that are not closed. Then it will make a polygon from each al_4, using the ST_BuildArea

After it will have a inner join that will search for all the nodes and ways that are a restaurant.

2. Then, using st_contains we are searching to see in what polygon each restaurant fits.

Took 4 seconds to do it for Romania

You can find the query here

Admin_level_4 Count
Municipiul București 505
Cluj 249
Constanța 217
Iași 155
Prahova 128
Brașov 124
Timiș 121
Mureș 109
Hunedoara 88
Sibiu 78
Arad 74
Galați 68
Buzău 56
Neamț 53
Harghita 53
Dolj 52
Argeș 47
Bihor 41
Covasna 38
Ilfov 36
Brăila 35
Vâlcea 33
Alba 32
Maramureș 29
Suceava 28
Bacău 28
Mehedinți 28
Caraș Severin 26
Călărași 22
Tulcea 22
Olt 20
Satu Mare 19
Gorj 18
Dâmbovița 16
Giurgiu 13
Ialomița 13
Botoșani 13
Bistrița-Năsăud 12
Vaslui 11
Vrancea 11
Teleorman 7
Sălaj 5

Random OSM World Stats

Posted by baditaflorin on 25 March 2016 in English (English). Last updated on 31 March 2016.

The total lenght of the pedestrian roads in the world is 2.919.220 kilometers

For the tunnels and bridges, i will also present to you the global figure here 114.949 pedestrian tunnels are in the osm planet map 351.099 pedestrian bridges are in the osm planet map

Crossings Name 48872 Paris 27647 Madrid 26310 Milan 18959 Yokohama 16748 Helsinki 15445 Kawasaki 15418 Tokio 13542 Moskau 13117 Portland 13081 Phoenix

Top Tunnels

Count tunnels Name city 4304 MĂĽnchen 2552 Moskau 2255 Berlin 1948 Essen 1792 Paris 1417 Hamburg 1366 Milan 1366 Vienna 1268 Helsinki 1205 London 1091 Zurich 1082 Prag 1021 Stuttgart

Top Bridges

Count Name 3911 Hongkong 3844 Kawasaki 3648 Shenzhen 3544 Kowloon 3535 Tokio 3327 Yokohama 3070 Rotterdam 2330 London 2256 Amsterdam

I made a windows version of the script. The script helps you load a osm.pbf file into PostGIS, without needing to know osmosis or other low level tools. The fact is that you want to load and use the data, not think what command line should i use in postgres or osmosis for doing this.

Osm Diary Entry about Osm PostGIS Script Repo

## Github code ## YouTube Tutorial


Run the find_duplicate_nodes.sql script from Github repository to find all the possible duplicated nodes that exist in Romania. ## Hardware The hardware used is a Intel i7-4770 CPU @ 3.40GHz, 16 GB RAM ## Numbers The pbf file for Romania is 144 MB. Romania database have a total 18M nodes. The query took 20 minutes to run, and found 34674 duplicated nodes. # 34674 duplicated nodes loaded into JOSM ## 1 The way 336945939 have 1500 nodes, and does not contain now any relevant information. ## 2 Buildings not sharing same connecting nodes ## 3 All the city had been imported 2 times This means that every building have a duplicate, one of them should be deleted. ## 4 Duplicated Highway path Way 370332689 and way 363957545 are the same. ## 5 Bad Import Separate ways for each objects, when they should be connected and share the same way. ## 6 2 relations on 2 different ways Way 164811803 and way 164682871 should be combined where they shere the same path, and one of them deleted, and mode the relation into the remaining way ## 7 Bad Import - Building and fence not sharing same way. That is one of my imports. Ups. ## 8 - To much details ## 9 - Reduntant Nodes The line is strait, so the information is redundant.

Osm Diary Entry about Osm PostGIS Script Repo

## Github code ## YouTube Tutorial

New Osm Postgis Script – Find duplicated nodes.

The idea of using a single repository where everybody cand find and fork the same Postgis OpenStreetMap schema scripts, is allowing anybody in the world to run a query on every part of the OSM database.

Read more on the project here Github code here

Even if you want to load a city and run a analysis on that town, or you want to find out where are all the duplicated nodes in Mexico, you can reuse the code that other people had created. Or you can add your own script that will be compatible with anybody that will load the osm.pbf with Osmosis.

To make the loading of a osm.pbf more simple, i made the SCOPE loader script for linux users, that allow everybody to create a postgis database without needing to know how to use Osmosis and other command line tools. Youtube tutorial here

Mexico Example :

In Pgadmin i have run the command to find duplicated nodes on all the nodes that exist in Mexico on a intel octacore, 16 GB RAM Machine

Mexico have 13.000.000 nodes, and the query took 550 seconds to complete. It found 43.000 rows, meaning around 10.000 instances of possible duplicated nodes, ways. For example this building appers on the same location 23 times !!!

JOSM building

This garden way needs smoothing 369775636 Also this way 369775423

The find duplicate nodes it`s available here

Openstreetmap Postgis Script Repository For Osmosis Pgsnapshot Schema - QGIS compatibility scripts

Github code here

Youtube tutorial here ( 4 minutes )

The licence is AGPLV3+


Here at Telenav , in the Map Analyst Team we use Postgis a lot.

In the real world, Postgis and OpenStreetMap seems to be used more only in Academics or by people that know a medium or high degree of programming.

I think that we should change that and make more simple the availability of Postgis as a tool that can help us visualize information’s from the OSM database much faster.

3 Ways of making Postgis more easy to use with OSM data

# I. - SCOPE

For this i will release SCOPE ( databaSe Creator Osmosis Postgis loadEr ), a command line tool for easy loading of a osm.pbf file into Postgis, via osmosis


  • You will find scope in the repo

II. A central place for osm-postgis-scripts

The idea is to have a central place were we can find different Postgis scripts, so when we want to do a statistic or something regarding the OSM data, to have a place were you can download, fork and store the scripts that you are using in your Projects The same is when you want to make a map, visualize some things, etc

screenshoot # III. Qgis + Postgis = Speed

Because people from different parts of the world can use the same script, this will mean that we can also use the same QGIS Style , so in the repo you can also upload your Qgis Style that you have used with one of the scripts, and even if i will work on a different country, i will use the same DB schema and the same script, so the style will be compatible to use.


Check it out here

Bonus IV. Let`s use standards and templates so we can easily modify and create something new.

I have made this example as a blueprint for creating a new sql script file.

Need input from you, how should we do with the names of the files, so that they respect some sort of standard And also if people want to add a QGIS Style, to respect a format

If the sql script is called nodes_show_adresses.sql the Qgis Style should be nodes_show_adresses_QGIS_description.eml

I am bad at the last part, so need your input so we can create something nice for the future.

The Long name is :

Postgis Openstreetmap Script Repository for Osmosis Pgsnapshot Schema


Here at Telenav , in the Map Analyst Team we use Postgis a lot.

In the real world, Postgis and OpenStreetMap seems to be used more only in Academics or by people that know a medium or high degree of programming.

I think that we should change that and make more simple the availability of Postgis as a tool that can help us visualize information’s from the OSM database much faster.

3 Ways of making Postgis more easy to use with OSM data

# I. - SCOPE

For this i will release SCOPE ( databaSe Creator Osmosis Postgis loadEr ), a command line tool for easy loading of a osm.pbf file into Postgis, via osmosis


  • You can try scope in the repo
  • Also, you will find there a Postgis example that i posted there.

II. A central place for Postgis OSM Scripts

The idea is to have a central place were we can find different Postgis scripts, so when we want to do a statistic or something regarding the OSM data, to have a place were you can download, fork and store the scripts that you are using in your Projects

screenshoot # III. Give it a more sexy and short name It also need a more simple name, i tried for to hours to think what name to give it, and i stoped and put this long name to the github repo. Please suggest a better and shorter name or acronym

Location: Center, Cluj-Napoca, Cluj Metropolitan Area, Cluj, Romania

Turn Restriction part 2

Posted by baditaflorin on 30 December 2015 in English (English).

I will not elaborate now on the difficulties that i had getting ( either osmconvert or osmfilter removes the timestamp, user_id, leaving me with a extracted file that is 50 MB osm.pbf but cannot do nothing more then do some static statistics.

I cannot see how many where imports, how is evolving in time, nothing.

I could try to get the information via Overpass API that this would mean to query the database fro 500.000 relations

Until i figure out what i can do, i was able only do do reverse geocoding using POSTGIS, tried with the dataset in QGIS to count the points in each polygon, but it got stuck after 2 hours at 94 %

I will try to load the GADM dataset in postgis and do reverse geocoding from POSTGIS, so i can have a better lever or details about the spread of the turn restriction in each country

Country ISO Count DE 89105 IT 53901 US 45986 RU 42827 ES 34697 BR 32212 PL 12125 FR 9974 GB 9311 AR 6701 AT 6570 HU 5584 AU 5505 CA 5037 DZ 4516 CH 4453 NL 4357 UA 4121 SK 3942 GR 3932 CZ 3422 PT 3371 RO 3219 HK 3145 BE 2818 LT 2603 CO 2295 SE 2169 FI 2016 BY 1893 JP 1815 HR 1688 ID 1659 CU 1543 DK 1497 UY 1449 CL 1444 VE 1416 KZ 1379 IN 1239 LV 1226 ZA 1185 IL 1157 CN 947 MX 900 EE 807 EC 766 KR 763 IE 744 SI 703 RS 683 NZ 654 MK 650 NO 650 MY 626 PH 603 BG 582 MD 544 AE 529 AZ 491 TR 487 AM 465 OM 383 GE 381 BA 361 CR 351 QA 348 TH 345 LU 338 SA 333 BO 329 TW 313 IR 296 TJ 256 CW 252 IS 232 BN 227 UZ 207 -99 179 CY 154 TT 153 SG 147 BH 144 AL 139 MA 136 PE 135 LB 127 IQ 125 JM 124 ME 117 ZW 116 TN 113 PG 111 NA 96 KG 89 PS 89 VN 86 AO 80 PY 74 SV 73 NC 66 EG 62 KE 61 CI 61 SY 57 MN 53 MT 50 NI 47 LK 41 KW 38 MZ 38 UG 37 PR 36 SM 36 MU 35 DO 35 LY 32 NP 31 GT 27 IM 27 KP 26 PK 19 AD 16 GN 16 KH 16 JO 16 TM 15 SN 14 ML 13 GA 13 KN 13 LI 12 NE 12 TZ 12 GH 12 GQ 12 LR 12 BT 11 TL 11 SR 11 LA 9 HT 9 BF 9 MG 8 BB 8 TD 7 AI 6 MM 6 MO 5 RW 5 CV 5 BI 5 GW 5 GM 5 VA 5 CM 5 DM 5 TG 5 ST 4 BD 4 MW 4 LS 4 BZ 4 BW 3 KY 3 GG 3 ET 2 YE 2 BS 2 AG 2 FJ 2 GD 2 KM 2 MR 1 ZM 1 SZ 1 CG 1 MF 1 GU 1 VU 1 JE 1 BJ 1 NG 1 HN 1 VC 1 BL 1 (blank) Grand Total 451977