Last Saturday we officially kicked off the NYC building and address import with a community session hosted by OSM-NYC and Public Labs at the Pfizer building in Brooklyn. The goal was to get the local NYC OSM community involved in this large data undertaking and at the same time harden our import process.
Over 20 people attended, and we knocked out 158 of the more than 5,000 sub-tasks. Both the turnout and the number of tasks completed were great and exceeded what I expected for a casual Saturday afternoon event.
Working through this import we're learning very interesting lessons:
- OSM's data structure is significantly different from traditional GIS, and nailing down the conceptual differences when translating to OSM takes time.
- Importing is a high-inertia problem, partly due to sheer volume but also due to the lack of a solid toolchain, such as safe rollback tools or established conversion tools.
- Expect interesting quality issues in your source data. The NYC data, for instance, has inconsistent address formatting.
- Doing a fully automated import is non-trivial. For example, in NYC, buildings often intersect with misaligned TIGER roads. That's one big reason this import is not fully automated.
- Once all data is uploaded, we'll need a QA check on inconsistent data to catch any errors introduced by humans during the upload.
- This all feels a little like heart surgery.
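To make the point about buildings intersecting misaligned roads a bit more concrete, here is a minimal sketch of the kind of check that could flag a building for manual review. It is not the import's actual tooling: the function names (`building_crosses_road`, `flag_for_review`) are made up for illustration, and it assumes footprints and road centerlines are already available as lists of planar (projected) x/y coordinates.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def _orient(a: Point, b: Point, c: Point) -> float:
    """Cross product of (b - a) and (c - a); the sign gives the turn direction."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _in_bbox(a: Point, b: Point, c: Point) -> bool:
    """True if c lies within the bounding box of segment a-b."""
    return (min(a[0], b[0]) <= c[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= c[1] <= max(a[1], b[1]))


def segments_intersect(p1: Point, p2: Point, q1: Point, q2: Point) -> bool:
    """True if segment p1-p2 and segment q1-q2 cross or touch."""
    d1, d2 = _orient(q1, q2, p1), _orient(q1, q2, p2)
    d3, d4 = _orient(p1, p2, q1), _orient(p1, p2, q2)
    if d1 * d2 < 0 and d3 * d4 < 0:
        return True  # proper crossing
    # Collinear or touching cases.
    return ((d1 == 0 and _in_bbox(q1, q2, p1)) or (d2 == 0 and _in_bbox(q1, q2, p2))
            or (d3 == 0 and _in_bbox(p1, p2, q1)) or (d4 == 0 and _in_bbox(p1, p2, q2)))


def point_in_polygon(pt: Point, ring: List[Point]) -> bool:
    """Ray-casting test; ring is a list of vertices (implicitly closed)."""
    x, y = pt
    inside = False
    for i in range(len(ring)):
        (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % len(ring)]
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside


def building_crosses_road(footprint: List[Point], road: List[Point]) -> bool:
    """True if any road segment crosses the footprint outline or lies inside it."""
    n = len(footprint)
    edges = [(footprint[i], footprint[(i + 1) % n]) for i in range(n)]
    for r1, r2 in zip(road, road[1:]):
        if any(segments_intersect(r1, r2, e1, e2) for e1, e2 in edges):
            return True
    # No edge crossing: the road may still sit entirely inside the footprint.
    return bool(road) and point_in_polygon(road[0], footprint)


def flag_for_review(buildings: Dict[str, List[Point]],
                    roads: List[List[Point]]) -> List[str]:
    """Return ids of buildings that overlap at least one road centerline."""
    return [bid for bid, fp in buildings.items()
            if any(building_crosses_road(fp, road) for road in roads)]


# Toy data in arbitrary planar units (hypothetical, not from the NYC dataset).
buildings = {
    "a": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "b": [(20, 0), (30, 0), (30, 10), (20, 10)],
}
roads = [[(-5, 5), (15, 5)]]
flagged = flag_for_review(buildings, roads)
```

A check like this only tells a mapper where to look. Whether the building or the road is the misaligned one is still a human call, which is part of why a fully automated import is hard.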
Here are a couple of pictures and screenshots from the Saturday event. If you'd like to get involved, drop me a line. Again, the import is on hold until a couple of issues are sorted out, but you're welcome to join.