pnorman has commented on the following diary entries

Post When Comment
Importing 1 million New York City buildings and addresses 30 days ago

Now that we've got it imported, when will NYC be releasing new data, and how will we handle updating it?

Complex Intersections, or Why We Should Get Rid Of exit_to 3 months ago

I don't understand your questioning about the destination reference. Why not simply use the tag 'ref' on the link way if it's valid after the junction node?

Frequently the link road does not itself have a reference, or at least not the same reference.

In one example locally, there is an exit signed as the exit to get to Highway 17, but Highway 17 is 4-5 km away through secondary roads and over a 1km bridge. This is a bit extreme, but a good example of how the destination refs may be quite distinct from the refs of the roads themselves.

Adding to the support for destination, both MapQuest Open and GraphHopper are consumers who support it.

OpenStreetMap and the Open Data Movement 4 months ago

Yesterday I had a short exchange of tweets with somebody that was surprised that was using google maps instead of OSM. Given that it is rather a convoluted subject, explaining why this in fact is not surprising was a bit difficult in 140 letters and is what prompted me to create this post.

I'd probably boil it down to this: they're using closed data instead of open data on their page. Many people assume open data is government data. Some even go so far as to say that any government data posted is open data, regardless of license. There seems to be a widening awareness that there are many non-government open data sources, including OSM, and that they come from a variety of sources, including companies.

Note: I gave a talk on how open data isn't just from governments.

GraphHopper 0.3 released: Fast Route Planner Beyond Two Dimensions 4 months ago

Is GraphHopper now supporting turn restriction relations?

Heatmap made easy with uMap 5 months ago

Great to see more heatmap options. Are you able to change the colours used to a more traditional heatmap option?

If you're using Leaflet.heat, does it come with the same advantages and disadvantages?

I've looked at Leaflet.heat, but it falls down in the face of large datasets.

highway=bus_stop - Mappen für den Renderer 5 months ago

Also the assumption "1 node where PT users are waiting" is wrong. As I pointed out in a comment above there are over 100000 bus_stops mapped together with stop_position. Which means probably most of them are on the road.

Your numbers are wrong. 1142673/1295234 (88%) of highway=bus_stop are not part of any way. Of the 12% that are part of a way, a reasonable portion of those will be part of a footway, not part of the road. There's a pretty clear standard usage.

This is consistent with the long-standing method of the node being at the pole location, not the middle of the road.

In short, highway=bus_stop is the standard way to indicate the pole location of a bus stop, and will likely continue to be for the foreseeable future.

My Experience, Accuracy 6 months ago

You're not going to get an accuracy of better than 5 feet out of a cell phone GPS. In practice, the best accuracy is from properly aligned imagery, the type city GIS departments purchase.

OpenStreetMap Isn't All That Open, Let's Change That and Drop Share-Alike 6 months ago
  • If the NYC government wanted to copy buildings that have changed from OpenStreetMap to the NYC building dataset they couldn't as their data needs to remain in the public domain.

Are you proposing that OSM change to a public-domain-like license such as CC0 or PDDL?

OpenStreetMap Isn't All That Open, Let's Change That and Drop Share-Alike 6 months ago

I would be inclined to give the argument more credence if not for three significant shortcomings:

"Less" free

You start by stating that a copyleft license is less free. It is, at the very least, debatable if a license that does not guarantee continuing freedom to use the data and improvements is any freer. It allows anyone to convert the data into a proprietary dataset, taking away freedom.

Ignored costs of a divisive change

You've completely ignored the costs of changing the license. I’m not speaking solely of monetary and time costs, but also social costs. If you don't recall, the license change from CC BY-SA to ODbL was highly divisive and took considerable time, and taking such a divisive action has been compared to the project shooting itself in the foot. Any reasonable discussion has been derailed by people insisting on public domain (e.g. because that's all some governments can work with), completely ignoring the fact that public domain, CC0 or PDDL would render us unable to use most government, company, or crowd-sourced open data. I see this happening again, with you talking about the census bureau.

I did a significant chunk of the work cleaning up data not covered by the redaction bot. It was weeks of soul-draining work and late nights. Any future change that would require us to remove the CC BY-SA/ODbL dual-licensed data in the database would be, by my estimation, worse.

An argument coming from a company with a questionable track record on compliance

When considering Alex’s argument it’s important to consider its context, including where it’s coming from. Mapbox and Mapbox customers have a questionable record on meeting the attribution requirements of the ODbL. We're not talking about issues on space-constrained mobile apps, we're talking about full-screen webmaps on a desktop display with no attribution text at all. This isn’t something that’s ODbL specific – a significant portion of open data out there has attribution requirements. Until recently had no reference to OpenStreetMap, despite being based on essentially only OSM data at higher zooms. Even now, it hides the attribution in a means of questionable adequacy.

In conclusion, I found a particularly relevant quote from the Free Software Foundation

Proprietary software developers, seeking to deny the free competition an important advantage, will try to convince authors not to contribute libraries to the GPL-covered collection. For example, they may appeal to the ego, promising “more users for this library” if we let them use the code in proprietary software products. Popularity is tempting, and it is easy for a library developer to rationalize the idea that boosting the popularity of that one library is what the community needs above all.

But we should not listen to these temptations, because we can achieve much more if we stand together. We free software developers should support one another. By releasing libraries that are limited to free software only, we can help each other's free software packages outdo the proprietary alternatives. The whole free software movement will have more popularity, because free software as a whole will stack up better against the competition.

p.s. Unless you have a special license, you're not meeting the CC BY-SA license that the photo you're using is distributed under

OpenStreetMap Isn't All That Open, Let's Change That and Drop Share-Alike 6 months ago

Yet, I think we may be able to get a long way with the ODbL. I think it at least worthwhile to share more details on possible legal interpretations. I remember at SotM-US last year, a productive BoF about licensing. Seemed to me like one question, geocoding into 3rd party databases, seemed close to a workable and widely supported realization. There was a lawyer who was interested and seemed to have some ideas about how it could work.

One of the points out of that session was that the companies involved with issues were going to provide to the LWG a clear description of their geocoding use case on which they were having issues, and this hasn't happened.

The Mucky Pup 7 months ago

With a vector PDF the glitches are laid bare a little more than web maps we're used to (Some would normally be lost in sub-pixel fluff) especially as I had scaled it up to fill the whole A4 sheet. Certainly printout made for some fun pub conversations.

When I did work on print rendering I found mapnik PDF rendering was worse than AGG rendering for text. I'm not sure if this has improved with recent versions, but I ended up doing 600 DPI PNGs and embedding them in a PDF.
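As a rough illustration of what a 600 DPI raster for print involves, the sketch below computes the pixel dimensions of a full A4 page. The function name is hypothetical, and the numbers assume the image fills the whole sheet with no margins.

```python
MM_PER_INCH = 25.4
A4_MM = (210, 297)  # ISO A4 width and height in millimetres


def pixels_for_print(size_mm, dpi):
    """Return (width, height) in pixels for a page of size_mm rendered at dpi."""
    return tuple(round(mm / MM_PER_INCH * dpi) for mm in size_mm)


print(pixels_for_print(A4_MM, 600))  # (4961, 7016)
```

That is roughly 35 megapixels per page, which is worth keeping in mind when choosing render settings and PNG compression.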

Universal language 8 months ago

Yes, let's all talk in Esperanto.

Attribution and all that (a rant) 8 months ago

I do my best to attribute OpenStreetmap in my mobile app, one thing that is bothering me, however, is the requirement to make the logo clickable and it opens a webbrowser.

There is no such requirement. The attribution requirements for produced works are

You must include a notice associated with the Produced Work reasonably calculated to make any Person that uses, views, accesses, interacts with, or is otherwise exposed to the Produced Work aware that Content was obtained from [OpenStreetMap], and that it is available under [the ODbL].

Normally this is done with hyperlinks because that's by far the easiest way to meet the requirements in a medium where you can do hyperlinks, but if you've got one where it doesn't make sense, why not use plain text?

You could use the 4.3.a example notice from the ODbL, including the plain text of the URIs as suggested. This works out to "Contains information from, which is made available here under the Open Database License (ODbL) ("

Motivation for Contributing to OSM 10 months ago

Comment from mgehling on 6 December 2013 at 17:05


Pardon? Please assume good faith and be careful before sending a message accusing a mapper of spam. Particularly in a case like this where the message isn't spam.

OpenStreetMap and the Public Domain 10 months ago

Something worth noting is that with a public-domain database you are unable to use most government data. Most governments are releasing data under licenses that require attribution, which cannot be used as public domain data.

Reports about global administrative boundaries 12 months ago

Note - do NOT upload GADM data to OpenStreetMap. It is not under a compatible license.

Land Use versus Residential Private Property about 1 year ago

There is actually a general consensus against the importing of property lot data into OSM.

Stuff like fences, hedges, etc get mapped and are verifiable, but these don't always correlate to property lots. Many times what you'd think was one lot when looking on the ground is really multiple ones. Then you get buildings that sit on multiple lots, etc.

CanVec Data about 1 year ago

Much of the CanVec water data in many regions is over 30 years old. I'd be bold in deleting it if you can't see any signs of it.

OSM datasize in PostGIS over 1 year ago

The big difference between osm2pgsql and pgsnapshot is that osm2pgsql is lossy, so it can disregard most of the tags.

The other tips are

  • Use --write-pgsql-dump and if you want geometry columns build them with osmosis. It's a more manual process but it is way faster.
  • If you need geometry columns and have the RAM, give java 32GB of heap space, otherwise put the node location store temp files on a SSD
  • When building indexes, omit any indexes you're not planning to use. If you're just performing tag analysis, you could get away with no geometry indexes, and the nodes geom index takes more time than loading the data.

For reference, creating and loading the dump files takes 10h51m on my home dev server. With decent sequential disk speed, it is CPU bound if you have an in-memory node store.

The --read-pbf-fast option with as many workers as CPU cores may help a bit here.

Small fix to osm2pgsql commited over 1 year ago

I ran into a similar problem with addressmerge, except my addresses could be marginally outside the buildings. I solved it by adding a column in which I buffered the geometry with

UPDATE import_addresses SET buffered_geom = geometry(ST_Buffer(geography(geom),N));

where N was the distance, on the order of 1-2 meters (real meters, not mercator meters). For some purposes you might want to reproject but I was working in WGS84.

For mercator you could just use ST_Buffer and multiply N by a factor, but I had to do the reprojections.
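The factor for the mercator case can be sketched as follows, assuming the spherical Mercator model, where the scale distortion at a given latitude is 1/cos(latitude). The function name is hypothetical, and this is an approximation, not the code I used.

```python
import math


def mercator_buffer_distance(ground_m, lat_deg):
    """Distance in mercator units that covers ground_m real meters at lat_deg.

    Assumes the spherical Mercator scale factor 1 / cos(latitude).
    """
    return ground_m / math.cos(math.radians(lat_deg))


# A 2 m buffer at 49 degrees north needs about 3.05 mercator meters.
print(round(mercator_buffer_distance(2.0, 49.0), 2))
```

The result would take the place of N in a plain ST_Buffer call on mercator geometries.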

Another trick for doing the JOINs between addresses and buildings was to pre-filter with && and then do ST_DWithin(geography(...), geography(...), N)

In my code this was expressed as

a LEFT JOIN b ON != AND ST_Expand(a.geom, PRECOMPUTED_DEGREES) && b.geom AND ST_DWithin(geography(a.geom), geography(b.geom), DISTANCE)

with PRECOMPUTED_DEGREES being DISTANCE * scale, where scale is chosen from the point closest to the pole in the region I'm considering.
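One way to compute such a value is sketched below. It is not the code from addressmerge: the function name and the mean Earth radius constant are assumptions, and the spherical approximation is only meant to produce a safe pre-filter, since ST_DWithin on geography does the exact check afterwards.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; spherical approximation (assumption)


def expand_degrees(distance_m, max_abs_lat_deg):
    """Degrees that cover distance_m in any direction, anywhere in the region.

    max_abs_lat_deg is the latitude of the point closest to a pole. A degree
    of longitude is shortest there, so the scale derived from it also covers
    lower latitudes and the north-south direction.
    """
    if not abs(max_abs_lat_deg) < 90:
        raise ValueError("latitude must be strictly between -90 and 90")
    meters_per_degree_lat = math.pi * EARTH_RADIUS_M / 180.0
    meters_per_degree_lon = meters_per_degree_lat * math.cos(
        math.radians(max_abs_lat_deg)
    )
    return distance_m / meters_per_degree_lon


# Example: a 2 m search distance in a region reaching 60 degrees north.
print(expand_degrees(2.0, 60.0))
```

Because the bounding-box test only needs to avoid false negatives, padding the result slightly (for example by a few percent) is a cheap way to absorb the difference between the sphere and the WGS84 ellipsoid.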