As some of you might have read in my previous blog post, we had problems this past week: the massive use of our tile servers endangered the donated free hosting at the universities.
Prompted by the license change of Google Maps (I guess we will get dozens of new friends soon), I reviewed the newest server logs and tried to get in contact with the webmasters who embed our map. I got a lot of responses (most simply didn't know anything about our tile usage policy) and a lot of understanding, and some have already switched to MapQuest. However, the OSM admins told me that simply embedding the map is currently not a problem. The problem is apps with tile downloaders that are unlimited in speed and capacity. So I started to analyze and compare the user-agent logs to see who likes OSM 'too much':
Just in case anybody is interested in how I did the calculations (all data here):
- split numbers, cluster agents with nearly the same name, using a little Python script
- open with Gnumeric
- review for duplicates in alphabetical order (e.g. "App" and "App lite")
-- look for bad names and generic names (New York Map, Berlin Map, ...) and try to cluster them manually
- create categories: general frameworks (no feedback on the app itself), browsers, obviously malformed URLs
- design pie charts
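The first clustering step could look roughly like the sketch below. This is not the script actually used for the analysis; it assumes that "split numbers" means stripping version numbers, and the function names, the suffix list and the input format (pairs of agent string and request count) are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical list of suffixes that mark a variant of the same app.
VARIANT_SUFFIXES = ("lite", "free", "pro")


def normalize_agent(agent: str) -> str:
    """Reduce a raw user-agent string to a rough, lowercase app name."""
    # Keep only the part before the first "/" or "(" (the product token).
    name = re.split(r"[/(]", agent, maxsplit=1)[0]
    # Remove standalone version-like numbers such as "1.2.3" or "v2".
    name = re.sub(r"\bv?\d+(\.\d+)*\b", " ", name, flags=re.IGNORECASE)
    words = name.lower().split()
    # Fold variants like "App lite" into "App".
    while words and words[-1] in VARIANT_SUFFIXES:
        words.pop()
    return " ".join(words)


def cluster_agents(rows):
    """Sum request counts per normalized agent name.

    rows: iterable of (agent_string, request_count) pairs.
    """
    totals = Counter()
    for agent, requests in rows:
        totals[normalize_agent(agent) or "(empty)"] += requests
    return totals
```

A script like this only catches the obvious cases. Generic names such as "GPS" or "Berlin Map" still need the manual review described in the list above.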
I hope I didn't make too many mistakes (please review the data processing). Please keep in mind that faking a user agent is quite simple and that some apps seem to use the same simple agent as others (e.g. "GPS"). I tried to resolve these conflicts with an intelligent analysis using the version number, and googled for further details.
Beginning of October
OK, so here is the first chart. It shows that most visitors are browsers (or apps faking these agents?) and, unfortunately, that a lot of devs use the default agents of general frameworks (such as curl, ...).
Now let's drop the browsers, the frameworks and the 'friendly OSM tools' such as JOSM, and have a look at the consumer apps only:
As you can see, MOBAC seems to be the biggest problem, but don't underestimate tools with 'only 1%'. Even the <0.5% candidates are major portals (CycleStreets, Wheelmap, ...), so this gives a good impression of the dimensions.
End of October
Traffic was only about 2/3 of that in the weeks before, which seems to be a nice reduction (but this is speculation, since logs from earlier periods are missing). Again, the two pie charts:
If you compare how the ratios of the individual apps developed (I just picked all with >=10,000 requests), you can see that, on the one hand, some of the top ten apps (Locus, OpenMaps, ...) reduced their traffic a lot (by more than 50%). On the other hand, there are tools with massively grown traffic (JTileDownloader, GAIA GPS, ...). I'm still not sure how our throttling affects these results...
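For anyone who wants to redo this comparison, here is a minimal sketch of one way to compute it. It works on the percentage change of raw request counts per app, which is an assumption about what is being compared; the function name and the rule that an app qualifies if it reaches the threshold in either period are also my own choices for the example.

```python
def traffic_changes(early, late, min_requests=10_000):
    """Percentage change in requests per app between two log periods.

    early, late: dicts mapping app name -> request count.
    Apps that stay below min_requests in both periods are skipped.
    """
    changes = {}
    for app, before in early.items():
        after = late.get(app, 0)
        if before == 0 or max(before, after) < min_requests:
            continue
        # Negative values mean the app reduced its traffic.
        changes[app] = (after - before) * 100 / before
    return changes
```

A value below -50 then corresponds to "reduced their traffic by more than 50%", and a large positive value marks the tools whose traffic grew. Since the second period had less total traffic, comparing shares instead of raw counts would give a somewhat different picture.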
Another big problem is that a lot of devs use third-party toolkits that all sail under the same generic user-agent flag, and that some people create overly generic strings (such as "Paris").
Open question: Should the community contact the map authors? Write an open letter to MOBAC users? Publish a press report pointing out that OSM has only limited resources, too?
I have already contacted some vendors and pointed out some easy steps that might help to reduce our server load.