OpenStreetMap

Tile server usage - part 2

Posted by !i! on 29 October 2011 in English (English)

As some of you might have read in my previous blog post, we had problems over the past week: the massive use of our tile servers endangered the donated free hosting at the universities.
Prompted by the license change of Google Maps (I guess we will get dozens of new friends soon), I reviewed the newest server logs and tried to get in contact with the webmasters who embed our map. I got a lot of responses (most simply didn't know anything about our tile usage policy), found understanding, and some have already switched over to MapQuest. Even so, the OSM admins told me that simply embedding the map is currently not a problem. The problem is tile downloaders in apps that are unlimited in speed and capacity. So I started to analyze and compare the user-agent logs to see who likes OSM 'too much':

Number crunching


Just in case anybody is interested in how I did the calculations (all data here):
- split numbers, cluster agents with nearly the same name, using a little Python script
- open with Gnumeric
- review for duplicates in alphabetical order (e.g. "App" and "App lite")
-- look for bad names, general names (New York Map, Berlin Map, ...) and try to cluster manually
- create categories: general frameworks (no feedback on the app itself), browsers, obviously malformed URLs
- design pie charts
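
The first step of the list above could look roughly like the following. This is only a minimal sketch, not the script actually used for the charts: the function names `normalize` and `cluster_agents`, the similarity threshold of 0.8 and the use of `difflib` are all assumptions made for illustration.

```python
import re
from collections import Counter
from difflib import SequenceMatcher


def normalize(agent):
    """Drop version numbers and punctuation so 'App/1.2' and 'App 1.3' match."""
    name = re.sub(r"[\d.]+", " ", agent)       # strip version-like digit runs
    name = re.sub(r"[^A-Za-z ]+", " ", name)   # strip remaining punctuation
    return " ".join(name.lower().split())


def cluster_agents(counts, threshold=0.8):
    """Greedily merge agents whose normalized names are nearly the same.

    counts maps a raw user-agent string to its request count.
    threshold is an assumed similarity cut-off, not a value from the post.
    """
    clusters = Counter()
    # Start with the busiest agents so they become the cluster names.
    for agent, hits in sorted(counts.items(), key=lambda kv: -kv[1]):
        key = normalize(agent)
        for existing in clusters:
            if SequenceMatcher(None, key, existing).ratio() >= threshold:
                clusters[existing] += hits
                break
        else:
            clusters[key] += hits
    return clusters
```

With this threshold, names such as "App" and "App lite" are not merged automatically, which is why the manual review in the spreadsheet is still needed afterwards.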

I hope I didn't make too many mistakes (please review the data processing), and please keep in mind that faking a user agent is quite simple and that some apps seem to use the same simple agent as others (e.g. "GPS"). To resolve these conflicts, I tried an intelligent analysis using the version number and googled for further details.

Beginning of October





OK, so here is the first chart. It shows that most visitors are browsers (or apps faking these agents?) and, unfortunately, that a lot of devs use the default agents of general frameworks (such as curl, ...).

Now let's drop the browsers, the frameworks and the 'friendly OSM tools' such as JOSM, and have a look at the consumer apps only:



As you can see, MOBAC seems to be the biggest problem, but don't underestimate tools with 'only 1%'. Even the <0.5% candidates are major portals (CycleStreets, Wheelmap, ...), so this gives a good impression of the dimensions.

End of October


Traffic was only 2/3 of that in the weeks before, which seems to be a nice reduction (but this is speculation, because logs from the past are missing). Again, the two pie charts:

Conclusions


Well, if you compare how the ratios of the individual apps developed (I just picked all with >= 10,000 requests), you can see that on the one hand some of the top ten apps (Locus, OpenMaps, ...) reduced their traffic a lot (by more than 50%). On the other hand, there are tools whose traffic has grown massively (JTileDownloader, GAIA GPS, ...). I'm still not sure how our throttling affects these results...
Another big problem is that a lot of devs use third-party toolkits that all sail under the same general user-agent flag, and that some people create overly general strings (such as "Paris").
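
For anyone who wants to redo the comparison from the raw data, here is a rough sketch of one way to compute it. It is an assumption about the method, not the exact calculation behind the charts: it compares each app's share of total requests (because the overall traffic differs between the two periods), keeps only apps with at least 10,000 requests in the earlier period, and ignores the effect of throttling. The function name `share_changes` and the app names and counts in the usage example are hypothetical.

```python
def share_changes(early, late, min_requests=10_000):
    """Relative change of each app's traffic share between two periods.

    early and late map an app name to its request count in that period.
    Returns e.g. -0.5 for an app whose share of all requests was halved.
    """
    early_total = sum(early.values())
    late_total = sum(late.values())
    changes = {}
    for app, hits in early.items():
        if hits < min_requests:
            continue  # assumed cut-off: only apps with >= 10,000 early requests
        early_share = hits / early_total
        late_share = late.get(app, 0) / late_total
        changes[app] = (late_share - early_share) / early_share
    return changes


# Hypothetical counts, for illustration only (not taken from the logs).
early = {"app_a": 60_000, "app_b": 35_000, "app_c": 5_000}
late = {"app_a": 10_000, "app_b": 35_000, "app_c": 5_000}
changes = share_changes(early, late)
```

In this made-up example the share of `app_a` drops by about two thirds, the share of `app_b` doubles even though its absolute count is unchanged, and `app_c` is skipped because it is below the cut-off.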



Open question: Should the community contact map authors? An open letter to MOBAC users? A press report pointing out that OSM has only limited resources, too?
I have already contacted some vendors and pointed out some easy steps that might help to reduce our server load.

Comment from EvanE on 29 October 2011 at 19:36

The measures taken by the authors of MOBAC and Locus (see your blog post from the beginning of the month, 6 October) seem to have had a visible effect on their share of the download volume.

Comment from iandees on 29 October 2011 at 21:38

The "Browsers" you clumped together are almost exclusively faked User Agents. Most of the User Agents for real browsers are relatively unique and won't end up grouped together.

The sysadmins have been contacted by, or have themselves contacted, several of the tile-downloading clients. If I remember correctly, one of the app authors even made a donation to OSMF that helped pay for the SSD in yevaud.

Comment from !i! on 29 October 2011 at 22:13

Yes, this is what I expected, because this part seemed to be too much.

Comment from wieland on 29 October 2011 at 23:30

Shouldn't all the browsers be clumped by referer?

Apps should show whether a request is an online or an offline download, like OpenMaps does now.

Apps with more than x downloads need to provide a known contact.

Comment from !i! on 30 October 2011 at 10:40

Well, indeed some apps already ship an email address with the agent. And some are clever and use a central config file, so they can change the servers without any big update process.

I contacted all of the top 20 authors.

Comment from mobrob on 30 October 2011 at 10:57

Interesting graphs. But comparison between the two dates is difficult as there are no absolute values only relative values.
Additionally a break-down of the browser user agents would be interesting to see if it corresponds to other sites (allows to identify faked user agents).

Comment from !i! on 30 October 2011 at 11:06

Please have a look at the data itself to get the absolute values. But as noted above, the total traffic has fallen to 2/3.

This allows you to check the browsers, too. But I was surprised that there is no string for MS Internet Explorer. But I'm not a web expert ;)

Comment from EdLoach on 30 October 2011 at 11:52

See the Mozilla entries that have MSIE in them for Internet Explorer.

Comment from !i! on 30 October 2011 at 12:10

Ah ok, that was how I explained it to myself. So I was right ;)

BTW, some open-source tools are affected as well:
http://groups.google.com/group/osmand/browse_thread/thread/840a45868079fcd4
http://groups.yahoo.com/group/aprsisce/message/13480

Comment from Gnonthgol on 30 October 2011 at 14:45

A lot of the heavy users seem to want the same thing: gratis maps without internet. The only program I know of that does this without mass downloading is OsmAnd, which comes with its own tile renderer and an SQLite vector map. This is the same model that Google Maps has for its online application. Given a light tile server that can fit on an embedded device and uses a compact (filtered) database that can be updated with ease, how much would the traffic on the tile servers drop?

Comment from !i! on 30 October 2011 at 14:54

The idea is quite good, Gnonthgol. Unfortunately, the different aspects of a navigation app (rendering, routing, navigation, ...) each have to use different optimized data structures. So it's difficult to solve all of this in one go. And of course most of our developers are already involved in a lot of projects...

There are already thoughts about doing offline vector rendering or deploying only vector tiles. But it's not that simple...
http://wiki.openstreetmap.org/wiki/OSM_Mobile_Binary_Protocol
http://wiki.openstreetmap.org/wiki/Develop

With this post I tried to address the current (and maybe soon recurring) problem that some apps use the tile servers heavily. So I tried to make the bad ratios visible and to contact the authors. The first easy step is to check whether the tool complies with our policy and maybe to switch to another tile provider. Nobody says you shouldn't use OSM tiles anymore; the opposite is the case :)

Comment from chriscf on 31 October 2011 at 02:59

Of the major browsers, only Opera uses its own root UA. IE tends to use Mozilla/4.0 (compatible ...), and the others tend to be Mozilla/5.0. Looking down the list, there appear to be a couple of spoofed UAs starting with Mozilla/X.0, though they do include the actual name of the app in them. The entries "Mozilla" and "Microsft Internet Explorer" concern me - are those inferences made by the server, or has something actually sent those strings as UAs? If the latter, then they're fake.

Comment from wieland on 31 October 2011 at 11:50

Maybe we will have much more traffic soon.
Google maps will charge soon.

English: http://googlegeodevelopers.blogspot.com/2011/10/introduction-of-usage-limits-to-maps.html

German comment naming OSM as free alternative: http://www.spiegel.de/netzwelt/netzpolitik/0,1518,794935,00.html

Comment from !i! on 31 October 2011 at 12:16

As you can read in my introduction, this was the reason for this report ;)

Comment from wieland on 31 October 2011 at 13:42

But in this case, the referer or API keys are needed to analyse the traffic.
All the discussion about mobile apps versus browsers doesn't help.

Comment from !i! on 1 November 2011 at 08:30

The author of NaviComputer reacted as well: http://navicomputer.com/blog/?p=215

Comment from wieland on 2 November 2011 at 21:18

You wrote:
"-- look for bad names, general names (New York Map, Berlin Map, ...)"

These look like city apps:
http://itunes.apple.com/de/app/berlin-map/id398454229?mt=8
Have a look at the comments :-)

Comment from !i! on 2 November 2011 at 22:24

Well, yes, I think I identified all of them. Unfortunately, some agents are so general that I called the authors of the wrong apps. But hey, that's life ;)
