Recent diary entries
Vespucci is now very near the feature freeze for the 0.9.5 release. Numerous tasks that have either been requested by users for a long time or have been on my personal to-do list have been completed, and there are only one or two left to do.
Some of the more interesting new functionality
- On device help and some usability improvements
- JOSM compatible OSM file reading and saving
- Auto download
- Fast address tags adding with house number prediction
- Import and upload of GPS tracks
- Function to add node at current GPS position
- Support for external GPS sources (for example RTKLIB)
- Basic conflict resolution
- and numerous "under the hood" changes
If you want to give the beta version a spin, it can be downloaded from
While I was fairly sure that the numbers were correct, I did know that I hadn't taken one potential systematic problem into account: users that had signed up and produced only empty changesets. Now it could be argued that such users at least tried to contribute and should be counted, but opinions may differ on that. It was clear that any effect on current trends would be minimal: all modern editors will typically only allow you to save if you have actually changed something, implying that an empty changeset is something that can only happen in an error situation (for example an editing conflict, or a crash of the application).
Here are the corrected graphs:
Removing the empty changesets lowers the overall number by roughly 25'000 users but, as expected, does not affect the general trend. At the end of June 2014 we had an accumulated total of 421'701 contributors, increasing at a rate of around 8'000 per month.
The following graphs compared to the previous ones show that most of the effect of the change is in 2009, which as we know is the year changesets were introduced in, so some issues there are not quite unexpected.
By mid-2014 we have already had more than 90'000 active contributors, so we can expect a substantial increase in the year total if the trend continues.
Back to the empty changesets, when and how much of an effect did removing them from the numbers have:
and the same relative to the new contributors per month:
(note: the above is the difference between the old numbers with zero edit changesets and the new ones, in some cases this simply caused users to be counted one month later, which explains the single negative month)
As already mentioned, all changesets before May 2009 were generated when changesets were introduced. I haven't been able to determine what caused the large numbers from May to November 2009, however the changesets in question do not have a created_by tag and I suspect that they were created after the fact by some kind of mechanical process. Nobody that I asked can remember; maybe a reader can shed some light on this.
After November 2009 most of the empty changesets were created by Potlatch 1 and a well founded suspicion is that they were caused by P1 live mode. This continues up to April 2011 when Potlatch 2 was made the default editor and the absolute numbers have remained stable in the 100-200 users per month range since then.
Now it is important to remind ourselves what we are looking at: the numbers are new users that signed up, tried to edit, failed and never had a successful edit after that.
In other words, users that wanted to participate that we lost (the total number of empty changesets is far higher, however for now I'll assume that regular contributors are more tolerant of things going wrong than first-timers).
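The filter described above (users whose every changeset is empty) can be sketched roughly as follows. This is a minimal illustration, not the script used for the graphs: it assumes a changeset dump in which each changeset element carries uid and num_changes attributes, and the sample data and the function name are made up for the example.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict


def users_with_only_empty_changesets(xml_source):
    """Return the set of user ids whose changesets all contain zero changes.

    Assumes each <changeset> element has 'uid' and 'num_changes' attributes
    (an assumption about the dump format, not taken from the post).
    """
    max_changes = defaultdict(int)  # uid -> largest num_changes seen so far
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == "changeset":
            uid = elem.get("uid")
            if uid is not None:  # skip changesets without a user id
                changes = int(elem.get("num_changes", "0"))
                max_changes[uid] = max(max_changes[uid], changes)
            elem.clear()  # drop attributes and children to limit memory use
    return {uid for uid, most in max_changes.items() if most == 0}


# Tiny hypothetical sample: user 10 failed once and later succeeded,
# user 11 only ever produced an empty changeset, user 12 succeeded.
SAMPLE = b"""<osm>
  <changeset id="1" uid="10" num_changes="0"/>
  <changeset id="2" uid="10" num_changes="5"/>
  <changeset id="3" uid="11" num_changes="0"/>
  <changeset id="4" uid="12" num_changes="3"/>
</osm>"""

only_empty = users_with_only_empty_changesets(io.BytesIO(SAMPLE))
print(only_empty)
```

Only user 11 would count as "lost" here; user 10 had a failed attempt but a successful edit afterwards and so stays in the contributor numbers.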
Done deeds are done deeds and there is nothing we can do about users that we have already lost, however what we can do is try and improve the situation going forward.
I've produced some numbers on which editors the users were using and nearly all (yes, including JOSM) turn up. However, the majority of the users affected are using iD (which is not a surprise given that it is the default editor) and, big surprise, "Go Map!" at nearly the same level. Given that "Go Map!" has a far smaller user base than iD (editor stats), this indicates that there may be a real problem with the app (this has in the meantime been resolved). Naturally, given that iD is what is usually used by potential new contributors, further investigation is warranted there too.
Anybody that has spent any amount of time fixing TIGER data in the US has seen all the artifacts that we have all come to love: over- and underruns in corners, lost fixes after being inside, wiggles where the surveyors stopped to read road signs, cross-country non-existent residentials and so on. Today I saw one thing I hadn't before:
(blue is the corrected version)
[As always this is just my own, personal, opinion, and is in no way an official statement by anybody]
Yesterday I had a short exchange of tweets with somebody that was surprised that http://opendata-hackday.de/ was using google maps instead of OSM. Given that it is rather a convoluted subject, explaining why this in fact is not surprising was a bit difficult in 140 letters and is what prompted me to create this post.
It is probably just natural that outsiders, even members of both the OSM community and the Open Data movement, simply assume that these are essentially the same and have a large overlap in motives and goals. Numerous OSM contributors are active in the Open Data movement and undoubtedly we are a very large consumer of open data in various forms.
However this apparent overlap shouldn't hide the fact that both our goals and motives are in large parts completely different. The Open Data movement is about liberating, accessing and exploiting data that is already there, and, please don't take this negatively, about improving the bottom line of the companies involved. One of the major arguments used in prying data out of the hands of government is that it will have a beneficial net effect for our economies and the involved companies and I don't have an argument with that. On top of that, the “we have already paid for it” justification is surely, at least in some ways, correct.
The OpenStreetMap project is very different: it is all about producing free and open geo-data and while we do utilize open sources, we are clearly at our best when the data has been surveyed and curated by mappers on the ground. Our goal is, in the end, to produce the best “map” of the world that is at the same time freely usable and re-usable. Yes, some of the economic arguments apply just as well to the OSM ecosystem as they do to the same in the Open Data movement. I think we should all be proud of the enlightened view the OSM community has had on commercial re-use of our data from the beginning.
But it has to be very clear: the OSM community created and owns the OSM data, and we control the terms on which it can be used, we have not been “already paid for it”.
Hopping off the soap box, I believe it is now understandable why complete alignment of goals cannot be expected. For “joe open data”, google is just as valid a member of the open data community as OSM. google may even fit the open data model better: it consumes open data and produces non-open products from it. We are just the crazies that spend our own time producing something free.
It seems, even though it has been available for quite a while, that the fixthemap page on openstreetmap.org is still not particularly well known. If you have an OSM based project and you don't want to create your own page for this purpose (as for example Mapbox has done now too) please provide a link to the page with a suitable text. You can provide coordinates with the URL (best probably the center of the map the user is viewing) and if the user chooses to add a note it will be positioned correctly.
An example of how to do this is the SOSM-run osm.ch site: http://www.osm.ch/#13/47.2095/8.5237
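For sites that generate such a link programmatically, a minimal sketch might look like this. The query parameter names lat, lon and zoom are my assumption of how the fixthemap page takes the coordinates, and fixthemap_url is a hypothetical helper, so check the link against the live page before relying on it.

```python
from urllib.parse import urlencode

# Landing page mentioned in the text; the parameter names used below
# (lat, lon, zoom) are an assumption, not confirmed by the post.
FIXTHEMAP_BASE = "https://www.openstreetmap.org/fixthemap"


def fixthemap_url(lat, lon, zoom=None):
    """Build a link to the fixthemap page centred on the given position."""
    params = {"lat": f"{lat:.5f}", "lon": f"{lon:.5f}"}
    if zoom is not None:
        params["zoom"] = str(int(zoom))
    return f"{FIXTHEMAP_BASE}?{urlencode(params)}"


# Centre of the map view used in the osm.ch example link.
example = fixthemap_url(47.2095, 8.5237, zoom=13)
print(example)
```

Passing the centre of the map the user is currently viewing means that a note added from the page ends up in the right place.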
Vespucci Release 0.9.4 Highlights
This release contains a lot of “under the hood” improvements and some work on making the UI more consistent and easier to use. In particular the following changes have been made
- selectable overlay layer.
- support for multiple simultaneous presets.
- added find action to look up a location with nominatim.
- add per zoom level imagery offsets with support for querying and saving to the imagery offset database or manual entry.
- added support for name suggestions and auto preset setting.
- added goto current GPS location.
- added action to arrange nodes of a closed way in a circle.
- limited support for geo: URIs and JOSM style remote control.
- add action to directly set position of node by entering coordinates.
- major rework of imagery provider configuration, now based on https://github.com/osmlab/editor-imagery-index .
- make https API default.
- major refactoring of projection code.
- lots of bug fixes and stability improvements.
The full change log is available here http://code.google.com/p/osmeditor4android/source/list
We will be updating the documentation to include the new features as soon as possible.
Upgrading from previous Versions
There are a few points that you may want to consider when upgrading from previous versions of Vespucci:
- some of the new features may cause degraded performance on older phones, see http://code.google.com/p/osmeditor4android/wiki/FAQ#Running_Vespucci_on_%22old_and_small%22_devices
- 0.9.4 uses a new configuration file format for imagery as mentioned above. As a consequence the internal identifiers for the background tile providers have changed with the exception of the “standard” mapnik tiles and Bing imagery. If you have extensively used any other sources you will be left with unused tiles using space on your device. You can delete unneeded directories and tiles by navigating to the andnav2/tiles directory on your system (where exactly the directory is located depends on the device and Android version, but it will be in the same place as the vespucci directory) and simply deleting the directories with the exception of BING and MAPNIK.
- some of the defaults for preferences have changed in 0.9.4, your old values will continue to remain the same, however the new defaults seem to make much more sense. See http://code.google.com/p/osmeditor4android/wiki/Tutorial
Last weekend I had a short discussion with a well-respected OSM community member on some aspects of the ODbL and it ended more or less on a question, "then when does share alike kick in?" Given that it was 2am my answer wasn't particularly good and so I thought I should expand it a bit in writing. Particularly because I may have given the impression that it is a fairly complex matter, when in reality it is fairly simple.
Disclaimer: this is the personal opinion of a non-lawyer and it is neither an official policy statement by the LWG nor the OSMF. There are a handful of grey areas that I will not touch on, on some of them the LWG is preparing clarifications for discussion that will be available soon, in other words I am staying on safe ground.
Further it is well known that I'm not particularly in love with the ODbL, but on the other hand I do think it is a lot better than it is made out to be.
The ODbL has 3 concepts that are relevant to triggering share alike (verbatim quotes from the ODbL text):
"Derivative Database" – Means a database based upon the Database, and includes any translation, adaptation, arrangement, modification, or any other alteration of the Database or of a Substantial part of the Contents. This includes, but is not limited to, Extracting or Re-utilising the whole or a Substantial part of the Contents in a new Database.
"Collective Database" – Means this Database in unmodified form as part of a collection of independent databases in themselves that together are assembled into a collective whole. A work that constitutes a Collective Database will not be considered a Derivative Database.
"Publicly" – means to Persons other than You or under Your control by either more than 50% ownership or by the power to direct their activities (such as contracting with an independent consultant).
Starting with the last concept, share alike only kicks in when you "Publicly Use" a derivative database (see ODbL 1.0: 4.4(a) and 4.5(c)). In-house use, use by a contractor on your behalf and similar do not trigger share alike and are not of interest. For the rest of this discussion please assume that whatever we are discussing, we are discussing it in the context of publicly using whatever you have created.
You are now probably already jumping up and down and shouting "And what about Produced Works?". Produced Works are only relevant to share alike in that if you "Publicly Use" a Produced Work (ODbL 1.0: 4.4(c)) any derivative database that was used in producing the Produced Work is considered "Publicly Used". Given that we already are assuming that, we do not need to consider Produced Works at all for the purpose of this discussion. Seems as if we have already considerably simplified the matter at hand.
If you read the ODbL, "Derivative Databases" are what share alike is in the end attached to: original OSM data, extracts and modifications to such are all datasets that are, no surprise, subject to mandatory ODbL licensing. But what happens if you are using other data together with OSM derived datasets? Going back to the definitions, we see that such use creates a Collective Database.
How does share alike apply to a Collective Database? Well according to 4.5(a) "For the avoidance of doubt, You are not required to license Collective Databases under this License if You incorporate this Database or a Derivative Database in the collection, but this License still applies to this Database or a Derivative Database as a part of the Collective Database;".
In other words if you simply lump together one or more datasets with data derived from OSM, you are only required to licence the OSM part of the Collective Database under the ODbL or a compatible licence.
Example: assume that you have a proprietary global database of waste bins and want to use that data together with OSM data. No problem, you can use your data together with OSM without any issue and there is no need to publish your proprietary dataset on ODbL terms.
Grey area alert: while the example is clear, there are some kinds of "lumping together" that need clarification.
Now given that OSM has a lot of waste bins already, the result might contain a lot of duplicates that you would like to remove. Again no problem, you can simply remove all waste bins from the OSM dataset. Now the resulting OSM data is clearly a Derivative Database and is subject to the share alike terms in the ODbL (as it was before), but it does not change the status of the collective whole which can still have different licences for its individual parts and the whole.
Grey area alert: this kind of Derivative Database (reduced and extracted unmodified OSM data) triggers a number of obligations that essentially nobody is adhering to.
This is the point the discussion had reached at 2am when the question "then when does share alike kick in?" was posed.
Well the answer is: "when you modify OSM data". The simplest example: you improve the position of a POI by changing the coordinates or you add further information to the POI, then you have to make the resulting dataset available on ODbL terms. Don't forget we are always assuming that you are Publicly Using the data.
A more interesting example: assume you have a proprietary database containing road geometry and associated with that geometry, road surface information and further that you have permission to integrate the surface information into OSM. You add surface tags to the OSM roads in your copy of the OSM data: yes you have to publish the improved OSM data on ODbL terms.
The important thing to note is that it does not affect your original proprietary database: there is no infection or tainting of that dataset, you simply cannot keep the changes to the OSM data to yourself.
And what about the other way around? Assume you notice that OSM has some surface data that is better than that in your proprietary database and you replace the original information with that? Then the resulting dataset is subject to share alike and you need to make it available on ODbL terms.
To sum it up: When does share alike kick in? When you modify OSM data or apply modifications from OSM to third party data and use the results publicly.
That's it really.
We will be meeting in Zürich tomorrow for the fiftieth Zürich OSM meetup. If you have time please feel free to join us.
The iD editor has used the name suggestion index for quite a while and it was something that I wanted to support in vespucci in the upcoming release. The one thing you do want to reduce in a mobile app is typing.
Basically the idea is to suggest correct spelling (and tagging) for some of the more prominent chains of restaurants and shops to the mapper. The way it works in vespucci now is slightly different from the current implementation in iD and at least some aspects might be worthwhile supporting there too.
Using the name as a short cut for tagging
If you are creating a new object, or adding tags to an object that previously didn't have relevant tags you can use the name as an "Ersatz"-preset (note vespucci supports normal JOSM-style presets too, and I may expand this functionality to automatically add further tags from the presets):
Enter the name tag
Start typing and get the auto-complete list
Further typing refines the list
Selecting the correct item sets the name and the corresponding tags
Adding the name to a "pre-tagged" object
In the example the building has already been pre-tagged with amenity=restaurant, which while not really wrong is not the best tagging for a fast food restaurant.
Enter the name tag
Start typing and get the auto-complete list
This includes suggestions not just for amenity=restaurant but for potentially further refined tagging.
Select the correct name after further refinement
Vespucci now asks for confirmation if the old tags should be replaced.
The new functionality is available in the vespucci test builds.
Over the last couple of weeks I've done a substantial amount of work on vespucci the OSM editor for Android http://code.google.com/p/osmeditor4android/, with the goal of a new release 0.9.4 some time in March. While a lot of the work is under the hood, there are a couple of new features that will make editing with vespucci a lot easier
- support for overlays (for example the OSM GPS tracks)
- imagery configuration from OSM imagery index hopefully making it easier to keep the available imagery updated
- support for imagery offsets
- multiple simultaneously active presets (vespucci uses JOSM presets)
- support for accessing the API with https and OAuth
and not to forget, you can now zoom out completely without getting sea sick :-).
One of the issues the OSM site has had is that we have been missing a trivial "Report a problem" / "Fix the map" landing page on openstreetmap.org. Not that it was in any way difficult to do, we just had never added one. I suspect that the absence of such a page is one of the reasons why the OSMF board now and then gets complaints about things missing on the map to its e-mail addresses and, even more often, the DMCA take down form is misused for such purposes.
Now, with the redesign live and google making the concept popular the time was ripe to add one at last, nothing fancy, but linking to OSM in the fashion below is a lot better than simply dumping people on our main page:
(a zoom parameter will work too).
Why is it that getting users of our data to attribute OpenStreetMap correctly seems to be such an uphill battle? This is not something new: browsing the wiki, mailing lists and forums shows that this has been an issue from day one.
Why is it so difficult for us to get the only compensation we ask for, when at the same time nobody has issues attributing google? Matter of fact it is not rare to see google attribution on OpenStreetMap derived content, to add insult to injury.
Why do OSS projects that would go ballistic if somebody violated their licence take such a cavalier attitude to our requirements?
Why do third party services based on our data go out of their way with smartass attribution links and buttons, just designed to hide the fact that their maps are OpenStreetMap derived?
The OSMF board at their last meeting laid down the law (again) http://wiki.osmfoundation.org/wiki/Board_Meeting_Minutes_2013-12-10
Google says on the same topic: "Without exception, we require attribution when Content is shown. Please do not ask to negotiate this requirement."
Please don't treat OpenStreetMap worse than mega-corporations!
Updated for the full year 2013
In May (2013) I produced a set of graphs and numbers on our contributor growth based on analysing the regular changeset dumps. Given that this was half a year ago I believe it is time for an update.
Our total contributor number has continued to grow at something around 7'000 per month giving a total of 391'367 at the end of December:
The number of active contributors per month continues to show a slight upward trend: The jump in April 2009 coincides with anonymous edits no longer being allowed.
Which reflects itself in the number of active contributors per year too (the year end number for 2013 is 129'000).
The above graph also shows that the number of contributors that contributed in a previous year is showing growth too. Note the lack of "old" editors going from 2008 to 2009 is likely due to anonymous edits going away in 2009. This is likely also the reason for other low counts in the early years.
And while it is a bit too speculative to state reasons, we do see, compared to active contributor growth, over proportional growth in the number of changesets, going over 500'000 per month for the first time in September of this year:
You may have noticed that the 0.9 Vespucci release (see http://code.google.com/p/osmeditor4android/downloads/list and on the google Play Store rsn) has an experimental geo-referenced photograph overlay:
This is already very helpful: you can walk around surveying, then sit down in peace and quiet, enter the information that you have collected and upload right there. Vespucci does not support taking photographs directly (yet?) so I'm doing this with osmtracker. Now and then it isn't really clear in which direction the photograph was taken and it would be helpful to have that information available too.
Most mobile phones today can measure magnetic fields and have what used to be called a compass :-), however there don't seem to be any apps (or at least none that are well known) that add this information to the picture EXIF information (there are a couple of Android "features" that need to be worked around to be able to do this). In any case, today I hacked together a modified version of osmtracker that actually does this and made the necessary changes to be able to read the information in Vespucci, and this is the result: the rotated camera icon is pointing due east, in the direction I took the photograph (of some grass so I'm not showing -that- here :-)).
There is still some work to do to make the actual photographing more robust, for example rotating the mobile phone into landscape mode will mess the compass reading up in the current modified osmtracker version, but the concept seems to have some merit.
We've moved the translation management to transifex (the same platform used for iD): https://www.transifex.com/projects/p/vespucci/ .... the sooner we have at least the major languages complete the faster we can release.
We are nearly ready for a release of Vespucci 0.9, the translations have to be updated and, more important, some more testing needs to be done. Head over to Vespucci Home for the current beta build.
Here's a short video showing the new features.
Maybe some can remember that I produced some statistics on per country address counts at the end of March this year. I've rerun the scripts twice since then, see http://qa.poole.ch/addresses/ .
What is naturally interesting is not so much the absolute number of addresses we have in the database (we know that we are just at the beginning of collecting addresses), but how fast are we adding them.
The full numbers can be seen on the website linked above, but for a quick comparison here are the overall numbers, Germany and the USA (which has had some address imports lately).
| | 2013-03-27 | 2013-05-03 | 2013-07-13 | Increase since March |
|---|---|---|---|---|
| Total | 20'168'470 | 21'072'447 | 22'939'124 | 13.74% |
| Germany | 3'659'043 | 3'836'410 | 4'144'198 | 13.26% |
| USA | 2'090'893 | 2'122'662 | 2'277'269 | 8.91% |
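The "Increase since March" figures are simply the relative change between the first and the last count. A short sketch of the arithmetic, using only the numbers quoted above (the variable and function names are made up for illustration):

```python
# Address counts on 2013-03-27 and 2013-07-13, as quoted in the text.
counts = {
    "Total": (20_168_470, 22_939_124),
    "Germany": (3_659_043, 4_144_198),
    "USA": (2_090_893, 2_277_269),
}


def pct_increase(old, new):
    """Relative growth from old to new, in percent."""
    return (new - old) / old * 100


increases = {name: round(pct_increase(old, new), 2)
             for name, (old, new) in counts.items()}

for name, pct in increases.items():
    print(f"{name}: {pct:.2f}%")
```

Rerunning this after each new count gives a quick check that the published percentages match the raw numbers.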
While obviously the numbers are still quite small (the number for Germany is roughly 10% of all addresses there) I believe it is encouraging that we are adding substantial amounts even without large scale imports and do not depend solely on addresses becoming available as open data (which will not happen everywhere).
I suspect that nearly everybody has heard that google has acquired waze for a substantial amount of money.
While waze has historically made noise about generating their own map and has occupied a substantial part of the mind share in the crowd sourced data space, the effort seems to have never really borne much fruit and inspection of the maps has mainly uncovered third party sources. Still, in the heads of potential consumers and contributors to OSM, waze has continued to be present as an OSM competitor.
The acquisition by google will remove waze from the equation, leaving the well known players and OSM as the only visible and viable players at a global level.
Very often, if not always, in such high visibility corporate acquisitions the resulting construct is less than the sum of the individual undertakings, and it is not unheard of for the net result in the long run to end up smaller than the larger player was in the beginning. Now that is unlikely in this case, but on the other hand, just from a numbers point of view, waze is just a pimple on google's user base. Naturally the constructors of such deals are not stupid and it is more likely, regardless of what google say in public, that this was not about acquiring additional market share and technology but more about denying them to a competitor.
Waze has in the past had an enthusiastic and loyal community that in the end is mainly responsible for its success. I believe it will be very difficult for google to maintain that community in its context and, given the complete integration of waze that will happen regardless of any statements now, it is likely to fall apart in a short time.
Any way I look at it, the google-waze deal opens up opportunities for companies and organisations in OSM-space to provide similar crowd based traffic reporting and avoidance services and to fill the void left by waze.
This is really good news for OSM.
I've added support for adding turn restrictions in the branch of vespucci I've been working on, nothing fancy and rather straight forward.
Select the "from" way
Select "Add restriction" from the drop down menu
Add the "via" and the "to" elements:
Then set the restriction type and you are finished:
We all know that while we have over a million, actually over 1.2 million now, user accounts, the number of actual contributors is quite a bit lower. Often you will see a number of 200'000 quoted, however that only considers the last editor of an object and not all the editors.
Analysing the mid May 2013 changeset dump gives a total of 335'000 unique user ids that have created a changeset. We can see this number increasing more or less linearly in recent years:
Over the last 12 months we have averaged roughly 7'000 new contributors per month, in total a good 80'000 per year:
The numbers only take half of May in to account so it is too early to see if the announcement of the iD editor will already have a noticeable effect. In the long run we hope for a clear increase in the number of account holders actually contributing.
May 2013 numbers now for the whole month, based on the early June changeset dump. Total number of contributors: 340'311, new in May: 9'357. While this isn't a new record it is a good 2nd place.