Recent diary entries
I redrew the previous image.
Answers to some points:
@quantumstate: moving the license change upwards will not help. If someone wants to import Google Maps data, it is fine to write a converter and to make sure all tags are described on the wiki. But it is not fine to upload, and that is marked. If a person cannot look ahead on the list, she deserves that.
@Gnonthgol: the linear path is for easy understanding, to make sure nothing slips our mind. Before upload, you look at this again and check the whole process by following the arrows.
@LivingWithDragons: the note about revert is from the original import guidelines.
I am thinking about giving numbers to the blocks for easy reference. I need to make sure the community does not want to add more blocks, so the numbers do not change.
I will post a link to this on talk@ and imports@ for better discussion.
Multiple people asked me to review the import guidelines. I translated pages of text into this:
User diaries already hold 17 000 posts. Happy Birthday, OSM User Diaries!
I made a small table for better understanding.
The people in the list above are respected. They made big contributions.
There was a saying: "OSMF doesn't shape the community". It has now become "OSMF has nothing to do with community".
That is not the problem. The problem is that the most influential OSMF members have their own small businesses. They have to pay bills. And they think of their company.
It is a big problem.
You know why Frederik Ramm keeps getting upset about database bloat? GeoFabrik.de has low disk space. Not openstreetmap.org. OpenStreetMap just grows faster than his business. He even wants to delete buildings: http://osm.gryph.de/2012/06/openbuildingmap/
OpenStreetMap is changing its license. That is fine. But for two months already diffs have not been easy to set up, and no regular planet exists. It certainly gives more respect to GeoFabrik. Their extracts update daily. But it is simply a way to remove competitors.
Look at the license change bot.
Commits? Almost only Matt Amos and Dermot McNally. I cannot see Matt running his own small OSM business. And Dermot's business seems not connected to OSM.
If you want something from OSMF, all you get is: "We are volunteers. We run on donated everything. We get no money for that. Fuck off and do it yourself."
Correct the information if I am wrong.
Jochen Topf made http://openstreetmapdata.com/. It is good. It has good data. It could be on http://planet.openstreetmap.org/. But no. It is a small business. It has a Donate button. That goes to Jochen, who made the tool, not to OSM, which made the data.
Why did I mention Paul Norman? He helps clean up bad NHD imports. At least he tries to. Not all of OSMF is lazy.
OSMF runs precious servers. Does it have at least one full-time system administrator to maintain them? Or any paid person who will think of OSM, not of their own business?
Some users asked me about importing ID tags. I tried to make a small scheme about that. Hope it is clear enough.
Comments? Ideas? Additions?
I mean no offence. Forgive me if I speak wrongly. I state my mind in the best words I can find. If I fail to find words, please help me. Do not say "go away and stop criticizing".
I try to fix the worst things. First I thought they were tags. Trying to fix tags, I found a larger issue. I will no longer fix tags without approval. I got the idea of a "Caring community preview" proposed by user Matt. I do not want my posts to be read as "justification for unaccepted edits". I want my edits accepted. But when the reasons for not accepting them are technical, I want to fix the technical reasons, not get "no, do not, just because".
User woodpeck said "Ten guys like WorstFixer and we can fill a separate $15000 database server just with the likes of him". Think: ten users can kill a $15000 server.
I respect system administrators. They do a good job.
But it looks like we have a worse problem than database bloat: database server IO load. I read the wiki and munin. Here is how it seems to work.
Ramoth is the main OSM database server. It is not cheap. But it is upset. And users are angry about it.
The OSM database server does not calculate. It stores, reads and writes data. So it does not need a lot of CPU. The CPU is not used. Look:
It uses 2 cores at most. And look at the top of the graph. It always waits for IO. All it can do is wait for IO. And do nothing. And be upset.
Here is how it waits for data:
The midnight spikes are some database cron jobs. I think they show the top of possible performance.
Slowness comes when it needs to read faster. And it almost reaches top load when ordinary people edit. Not even WorstFixer.
Here is the list of things I propose:
1. $0. Calculate statistics.
I want a daily update of database table sizes, so people can really see the database bloat. And maybe start fixing code if it really is an issue.
To do this, system administrators need to add the results of this query to the http://www.openstreetmap.org/stats/data_stats.html page.
SELECT nspname || '.' || relname AS "relation",
       pg_size_pretty(pg_total_relation_size(C.oid)) AS "total_size"
  FROM pg_class C
  LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
 WHERE nspname NOT IN ('pg_catalog', 'information_schema')
   AND C.relkind <> 'i'
   AND nspname !~ '^pg_toast'
 ORDER BY pg_total_relation_size(C.oid) DESC;
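Here is a minimal sketch of how the query result could become a table for the stats page. It is only an illustration: the function name `render_size_table` is made up, the sample rows are placeholders and not real measurements, and I assume the rows arrive as (relation, total_size) pairs from whatever database client the administrators use.

```python
from html import escape


def render_size_table(rows):
    """Render (relation, total_size) pairs as an HTML table fragment.

    `rows` is assumed to be the result of the table size query above,
    already fetched as pairs of strings.
    """
    lines = ["<table>", "<tr><th>Relation</th><th>Total size</th></tr>"]
    for relation, total_size in rows:
        # Escape both cells so odd relation names cannot break the page.
        lines.append(
            f"<tr><td>{escape(relation)}</td><td>{escape(total_size)}</td></tr>"
        )
    lines.append("</table>")
    return "\n".join(lines)


# Placeholder rows for illustration only, not real sizes.
sample_rows = [("public.nodes", "100 GB"), ("public.current_nodes", "20 GB")]
print(render_size_table(sample_rows))
```

A daily cron job could run the query, pass the rows to such a function and write the fragment next to the existing statistics.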
2. Move all frequently accessed data to SSD.
There are already SSDs in the server. But less than half of their capacity is really used. They are used to speed up reads of some regions of disk. What they actually store needs checking. I am afraid they store GPS points for Germany and not map data.
2a. $2000. 2xIntel® SSD 320 Series 600GB.
Decide which tables can fit into a 600 GB array of two mirrored SSDs. Move all indexes to it. Preferably move all current_ tables too, if they fit. This needs no Rails code change. Just database tablespaces.
2b. $1000. 1xIntel® SSD 320 Series 600GB.
Reorganize disks. Use only 1 SSD for CacheCade. Use the two others as in 2a.
2c. $600. 1xCrucial 512 GB SSD.
Reorganize disks. Use only 1 SSD for CacheCade. Use the 512 GB disk for that. CacheCade does not use more anyway. Use the two others as in 2a.
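The "decide which tables fit" step from 2a can be sketched like this. It is a rough sketch under assumptions: the function name `plan_ssd_tablespace` is made up, the sizes in the example are placeholders and not measurements, and taking smaller relations first within each group is my own simplification. Indexes go first, then current_ tables, and everything else stays on the old disks.

```python
def plan_ssd_tablespace(relations, capacity_gb):
    """Pick relations for the SSD array without exceeding its capacity.

    relations: list of (name, kind, size_gb), kind is "index" or "table".
    Returns (chosen_names, used_gb).
    """
    def priority(name, kind):
        if kind == "index":
            return 0          # all indexes first
        if name.startswith("current_"):
            return 1          # then current_ tables, if they fit
        return 2              # history tables are not moved

    candidates = [r for r in relations if priority(r[0], r[1]) < 2]
    # Within one priority group, try smaller relations first (assumption).
    candidates.sort(key=lambda r: (priority(r[0], r[1]), r[2]))

    chosen, used = [], 0
    for name, kind, size_gb in candidates:
        if used + size_gb <= capacity_gb:
            chosen.append(name)
            used += size_gb
    return chosen, used


# Placeholder sizes for illustration only.
example = [
    ("nodes_pkey", "index", 200),
    ("current_nodes", "table", 300),
    ("current_ways", "table", 150),
    ("nodes", "table", 900),
]
print(plan_ssd_tablespace(example, 600))
```

With real numbers from the table size query, the output is the list of relations to assign to the SSD tablespace.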
If you like these proposals, you can donate to OpenStreetMap. http://donate.openstreetmap.org/ is easy. If OSM has money specially marked "for disks", it will surely buy better disks.
Are my calculations flawed? I want comments from the OSM administrators, if possible. And any other opinions.
I was told every edit creates database bloat. Here are some drawings.
This is a simple image showing how data circulates in OSM:
Why is revert grey? Storing a revert is cheap. "Version 1 is same as version 3". You keep the current version in the current database and just a pointer in the archive.
Look closer at the separation of archive and visible data:
It is a sane way to do servers for OSM. You need not tell contributors not to contribute because of a large database. Real users do not need the archive part.
If disk space is low: buy more! Ask for donations! Here is a rough simple list:
I remind you that donating to OpenStreetMap is easy: http://donate.openstreetmap.org/
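A small sketch of the "just a pointer in the archive" idea. It is not how the OSM database works today. It is a toy model under assumptions: the class names `ArchiveEntry` and `ObjectHistory` are made up, and an object is reduced to its tags.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ArchiveEntry:
    version: int
    tags: Optional[dict] = None    # full payload; None when this is a pointer
    same_as: Optional[int] = None  # earlier version with identical content


class ObjectHistory:
    """Toy history of one object: full current copy plus a compact archive."""

    def __init__(self):
        self.archive = []    # one ArchiveEntry per version
        self.current = None  # full copy of the latest version

    def save(self, tags):
        version = len(self.archive) + 1
        for entry in self.archive:
            # A revert: identical content already stored, keep only a pointer.
            if entry.tags is not None and entry.tags == tags:
                self.archive.append(ArchiveEntry(version, same_as=entry.version))
                break
        else:
            self.archive.append(ArchiveEntry(version, tags=dict(tags)))
        self.current = dict(tags)
        return version

    def load(self, version):
        entry = self.archive[version - 1]
        if entry.same_as is not None:
            entry = self.archive[entry.same_as - 1]
        return entry.tags


history = ObjectHistory()
history.save({"name": "A"})
history.save({"name": "B"})
history.save({"name": "A"})       # revert to version 1
print(history.archive[2])
```

Version 3 stores no payload of its own, only the note that it is the same as version 1.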
I am ready to discuss.
I listen to you attentively. I take notes.
Why no mailing list post?
Won't work. Ever. Maybe only if you write that to a dead local mailing list. Where nobody reads anyway.
Have a look at our license change. Workflow #1. When the fuck will they finish?!
I propose an alternative. It is not ideal. Nobody is ideal. But it will work. Because it needs people with editors. Not bureaucrats with mailboxes.
I will work by scheme #2 until someone proposes a better option in a non-blocking discussion.
If you want to make schemes, you can use XMind software for that. It is free. Double-click for a block, Ctrl+L for a line. Take a screenshot to export. http://www.xmind.net/
The problem of OSM is graphomania.
People want to write.
To write long books about OpenStreetMap.
To write long threads on OpenStreetMap. Even longer ones to prevent disputes.
When someone imports loads of shit, they keep silent.
When someone posts pictures of that, they laugh and make fun.
Nobody is perfect. I am not.
I did not want to write this post. But I have 24 hours of free time to think. I was banned, and will be banned in the future if the laws do not change. If you want to fix something in OSM you have to be a graphomaniac. You will write tons of letters to change tens of objects.
I was told to announce so I announce.
The is_in tag is an indicator of a bad import. Look for yourself at taginfo. If the object's author used the is_in tag instead of the Karlsruhe schema, he did his import carelessly.
The ele tag is another bad import indicator. Look for yourself at taginfo. ele=0 is used on carelessly imported parkings and untouched GPS waypoints. I am not saying anything about http://taginfo.openstreetmap.org/keys/ELEVATION, which sucks too.
For these reasons, I declare:
I will not upload any object with an is_in tag. I will remove that tag. So as not to lose data, I will store all data from it in the Karlsruhe schema.
As the first thing after the ban finishes, I will process all un-elevated objects. Remove ele=, latitude=, longitude= tags. See what to do with the others.
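The tag removal for un-elevated objects could look like this sketch. I assume here that "un-elevated" means an object whose ele tag is zero, and that tags are a plain dictionary of strings. The function name `clean_unelevated` is made up for illustration.

```python
DROP_KEYS = ("ele", "latitude", "longitude")


def clean_unelevated(tags):
    """Return a copy of tags; drop ele/latitude/longitude when ele is zero."""
    if tags.get("ele", "").strip() not in ("0", "0.0"):
        # Object has a real elevation or no ele tag at all: leave it alone.
        return dict(tags)
    return {key: value for key, value in tags.items() if key not in DROP_KEYS}


print(clean_unelevated({"amenity": "parking", "ele": "0", "latitude": "49.0"}))
print(clean_unelevated({"natural": "peak", "ele": "1200"}))
```

Objects with a real elevation keep all their tags. Only the zero-elevation ones lose the three keys.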
I want no war. OSMF and DWG will ban me if I edit too fast. I will wait for all the bans to finish. They limit the maximum speed. I limit the minimum speed. I set the minimum speed for myself at 50 000 nodes a day. If I edit 100 000 objects in one day, it is O.K. to ban me for a day. If 150 000, for two, and so on. No more please, if you want no war too.
Being the Worst OSM Fixer, and remembering that OSMF promised never to tell anyone who I really am, I collect any letters and tweets about shit-imports, bad tags and other fixable stuff.
I promise never to tell anyone who you are. Show me things you dislike. If the thing you show me convinces me that all is bad there, I will fix it.
My Twitter is https://twitter.com/#!/WorstFixer and I read it.