
jsmart09's Diary

Recent diary entries

2010-02-11

Posted by jsmart09 on 12 February 2010 in English.

Long story short: added VE woodland but removed it and all HD due to duplicate nodes. Took me a while to do that but learned about changesets, conflicts, validation etc.

Wrote a VBScript to generate a .bat file that runs the shp-to-osm jar file, and generated all the OSM files for 021G10. Used Sam's rules files (mostly unmodified).

Looked at the OSM attributes. Some comments to myself on whether translations are good, what's worth bringing in etc:

2010009_0 (spot height) is translated as man_made=survey_point; ele=*. I disagree with that. It is not man-made. It gets rendered with an icon showing a surveyor like a survey control point might be. Will not import. OSM does not seem very good for Z.

1460009 is rapids, good. But OSM map features does not have an approved tag it seems. ACT has proposed but it is "abandoned". Will import anyway.

1350012 is a pit, "an excavation from which gravel, sand or clay are removed." Translated as landuse:industrial. I disagree with the translation. It is not an industrial area in the sense of factories or industrial estate. Better translation would be landuse:quarry and resource=aggregate. Not sure if I will import. Could have topo significance?

1360019 is "domestic waste", i.e. a dump. Translated as landuse:landfill. I agree.

2110009 is a lumberyard. Translated as landuse:industrial, type=lumber_yard. I disagree. type is a property and properties should describe the nature of some entity (e.g. it is "narrow" or "disused"). I think there should be a proposed feature landuse:lumberyard.

1210009 is NTS map sheet boundary. Translates as boundary:administrative which I think is wrong. There is no place for this in the OSM catalogue.

1000039 has 1000032 cemetery. Translates as landuse:cemetery. Good.

2270009 has 2270012 = Park/Sports Field. No entry for this in the Google spreadsheet. Gets translated as sport:multi. I think it should be landuse:recreation_ground. I don't think we can qualify it more than that. No point in saying anything about sport as we don't know what sports are practiced there and recreation grounds are "sporty" by definition.

2420009 is trail. Translates as highway:path, foot:yes, sac_scale:hiking. I think not many people are walking these trails in NB. ATV and snowmobile or maybe pickup truck. Anyhow, that's their encoding in Canvec and the translation seems right. Quite a few of these so worth importing.

2490009 is a picnic site. Translates as tourism:picnic_site.

1580009 is toponymy. Sub-classed as island, channel, unincorporated area etc. Seems good. One qualm: we have name and name:fr. Should it not be name:en and name:fr? Canvec has NAMEEN and NAMEFR. I think I will change the rules file for this. Maybe I'll read more about OSM and locales first...

1020009 is railway. Canvec for 021G10 tells us what railways there were in 1975. Oh well, we can always bring them in then recategorize a lot of them to trail or whatever.

1190009 is runway. Translates as landuse:airport and aeroway:aerodrome. Shouldn't it be aeroway:runway?

1760009 is road segment. ... Lots to look at here.

1770009 is junction. I don't think I should care. I notice a tag delete_me:yes. Good idea...

1780009 is blocked passage. Translates as barrier:bollard. Barrier is good but who knows whether it is a bollard, which would be more common in urban areas. It is highly unlikely to be a bollard. More likely a gate or a ditch cut across the road. And the tags for horse, bicycle etc? How does anyone know? Maybe some rule adjustment needed here.

May not have much more time for this in the next few weeks. We'll see....

2010-02-07

Posted by jsmart09 on 8 February 2010 in English.

021G10: converted and uploaded:
021g10_4_0_HD_1470009_1 - single line watercourse
021g10_4_0_HD_1480009_2 - waterbody areas
021g10_4_0_SS_1320049_2 - saturated soil a.k.a. wetland

Rules files used:
HD_1470009_1_Single_line_watercourseRULES.txt
HD_1480009_2_WaterbodyRULES.txt
SS_1320049_2_WetlandRULES.txt

Modified the HD_1480009_2_WaterbodyRULES.txt to uncomment the permanency translation. This allows intermittent water areas (i.e. seasonally exposed sandbanks) to be tagged as something other than plain natural:water.

I don't know the significance of the "french rules" and "outer rules" files.

Sam has added me to the list of viewers for the Google spreadsheets which detail the recommended approach to the Canvec features. I note that single line watercourse is a "don't" but I do not see why we should not add these, and I also noticed other tiles to N of London ON that have these; and very good they look too. So I think NB should have single line watercourses too!

Did a little bit of editing to clip a pre-existing bit of double-line river from 021G10, so learned a couple more JOSM commands.

There are closing lines remaining e.g. at W edge of 021G10, half way across Oromocto Lake. I think the best approach will be to put all the waterbodies in for all tiles, then do another pass to tidy up. I'll try that idea with an adjacent tile, when I have time.

Tomorrow? Maybe some "vegetation" i.e. forest.

Location: 45.625, -66.750

2010-02-05

Posted by jsmart09 on 6 February 2010 in English.

Today I learned a bit about multipolygon-type relations in the OSM data model. My understanding is that a multipolygon is a relation object, and it contains one or more other objects. E.g. it could contain a lake and a couple of islands. I have skimmed through the [[Relation:multipolygon]] page.

I have looked at Sam Vekemans's rules files for HD_1480009_2_Waterbody. There are actually 3 versions of the file but they all seem pretty much the same. Essentially, the rules files filter on the geometry, attribute and value in the shapefile and map each match to an attribute and value in the OSM file.
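To make that filter-and-map idea concrete for myself, here is a minimal Python sketch. It is not the shp-to-osm code, and the rule layout is only my simplified model of a rules file: five comma-separated columns (geometry, source attribute, source value, OSM key, OSM value). The "#" comment marker, the "*" wildcard meaning "copy the source value across", and the function names are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical sample rules in the simplified five-column layout described above.
SAMPLE_RULES = """\
# geometry,source attribute,source value,OSM key,OSM value
outer,CODE,1480272,natural,water
outer,NAMEEN,*,name:en,*
outer,NAMEFR,*,name:fr,*
"""


def load_rules(text):
    """Parse rule lines into tuples, skipping blank and commented-out lines."""
    rules = []
    for row in csv.reader(io.StringIO(text)):
        if not row or row[0].lstrip().startswith("#"):
            continue
        geometry, src_key, src_value, osm_key, osm_value = (c.strip() for c in row)
        rules.append((geometry, src_key, src_value, osm_key, osm_value))
    return rules


def apply_rules(rules, geometry, attributes):
    """Return the OSM tags produced for one shapefile feature."""
    tags = {}
    for rule_geom, src_key, src_value, osm_key, osm_value in rules:
        if rule_geom != geometry:
            continue
        actual = attributes.get(src_key)
        if actual is None or actual == "":
            continue  # attribute missing or empty: the rule cannot match
        if src_value == "*":
            # Wildcard (assumed convention): match any value, optionally copy it.
            tags[osm_key] = str(actual) if osm_value == "*" else osm_value
        elif str(actual) == src_value:
            tags[osm_key] = osm_value
    return tags


rules = load_rules(SAMPLE_RULES)
feature = {"CODE": 1480272, "NAMEEN": "French Lake", "NAMEFR": ""}
print(apply_rules(rules, "outer", feature))
```

Commenting a rule in or out then changes the tagging without touching anything else.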

The shp-to-osm.jar is clever because it seems to split edges (ways) so that none has more than 2000 nodes, which presumably is some limit on ways in OSM. It also manages to deal with SHP file outer and inner types, and it creates relation:multipolygon objects for areas that have islands.
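A rough sketch of how such splitting could work, in Python. I have not read the shp-to-osm source, so this is only my guess at the general technique: cut the node list into consecutive pieces and repeat the boundary node so that neighbouring pieces stay connected. The function name and the default of 2000 are assumptions for illustration.

```python
def split_way(node_ids, max_nodes=2000):
    """Split a node list into connected pieces of at most max_nodes nodes.

    The last node of each piece is repeated as the first node of the next,
    so the pieces still join up end to end.
    """
    if max_nodes < 2:
        raise ValueError("max_nodes must be at least 2")
    if len(node_ids) <= max_nodes:
        return [list(node_ids)]
    pieces = []
    start = 0
    while start < len(node_ids) - 1:
        end = min(start + max_nodes, len(node_ids))
        pieces.append(list(node_ids[start:end]))
        start = end - 1  # share the boundary node with the next piece
    return pieces


shoreline = list(range(4500))  # stand-in node ids for a long lake edge
for piece in split_way(shoreline):
    print(len(piece), piece[0], piece[-1])
```

For an area, the pieces would then all have to become members of the same multipolygon relation, which matches what I see in the output.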

I found myself reading an ESRI tech description of SHP files: http://www.esri.com/library/whitepapers/pdfs/shapefile.pdf, briefly, just to get a better feel for how the outer and inner rings are done.

I ran the shp-to-osm jar on the HD_1480009_2_Waterbody SHP file. I get an .osm file out. I loaded that into Global Mapper 10 and it loads but it does not seem to know about the relations. Consequently picking on an edge just tells me attributes of that edge (and actually there are no attributes since shp-to-osm has put all the attributes on the relation).

JOSM loads the .osm and picking on an area object edge shows me that there is a relation:multipolygon object, and lets me see the members.

JOSM renders e.g. lake names using the name:en tag (if I have my preferences set to use English or default, which for me is English). Some of the waterbodies in 021G15 have names, while some waterbodies have multiple lakes and rivers massed into a single object, with no name. A cleanup phase could see me(?) go and close off lakes, give them names and remove the point toponymy names (see 2010-02-04 diary entry).

Looked at McGibbons Island (for example) in the St John River, where we have an inner ring, i.e. an island but there is another smallish water area that overlaps the island. These are sandbars (I can see from Toporama) but ... why is the area tagged as water? It's a 1480272 = waterbody in the SHP file. There must be some kind of product spec rules that dictate this in Canvec??

Enough for today. Friday night, time for a beer ... tomorrow's another day.

2010-02-04

Posted by jsmart09 on 5 February 2010 in English.

Today I have looked through the Canvec documentation, which is here: http://geogratis.cgdi.gc.ca/geogratis/en/collection/28954.html

In particular, the Data Product Specifications http://ftp2.cits.rncan.gc.ca/pub/canvec/doc/CanVec_product_specifications_en.pdf

and the Feature Catalogue: http://ftp2.cits.rncan.gc.ca/pub/canvec/doc/CanVec_feature_catalogue_en.pdf

I want to understand these so that I can understand the rules files that Sam Vekemans has set up.

Incidentally Geobase or Canvec? From what I can see Geobase is an initiative, to bring in data from multiple authorities. There is a variety of types of geo data. Canvec is basically the NTS topo map data but it has been reformatted / remodelled. New editions of Canvec are brought out, apparently every 6 months. We are onto edition 4 for NB at least for Fredericton area. The actual spatial data rarely seems to change and some parts are just as out of date in the newest edition (well actually they're more out of date!).

NRN (National Road Network) has been incorporated into Canvec so there seems to be no difference in the data whether you get roads from Canvec or from NRN files.

How to figure out what SHP file some entities / objects are in? Two letter code indicates general theme e.g. HD is hydrography, TR is transportation, TO is toponymy. Within the theme, there is a 7-digit code that indicates the entity type (what I'd think of as "object class"). E.g. 1480009 is the generic code for a Waterbody. Geometry (point, line, area) is split out into separate SHP files. So we have e.g.

021g15_4_0_HD_1480009_2.shp for:

NTS tile 021G15
Edition 4
Version 0
HD for hydrography
1480009 for waterbody objects
2 for area geometry

Inside this SHP file there are all the waterbody objects for the area of 021G15.
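The naming convention is regular enough to pick apart with a small script. A minimal Python sketch follows; the pattern and names are my own, and the geometry codes 0 = point and 1 = line are inferred from other file names I have seen (only 2 = area is spelled out above), so treat them as assumptions.

```python
import re
from typing import NamedTuple

# Assumed geometry suffixes: only "2" (area) is confirmed by the breakdown above.
GEOMETRY = {"0": "point", "1": "line", "2": "area"}

PATTERN = re.compile(
    r"^(?P<tile>\d{3}[a-z]\d{2})_(?P<edition>\d+)_(?P<version>\d+)"
    r"_(?P<theme>[a-z]{2})_(?P<entity>\d{7})_(?P<geom>[012])(?:\.shp)?$",
    re.IGNORECASE,
)


class CanvecFile(NamedTuple):
    tile: str
    edition: int
    version: int
    theme: str
    entity_code: str
    geometry: str


def parse_canvec_name(filename):
    """Split a Canvec SHP file name into its parts, or raise ValueError."""
    match = PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not a recognised Canvec file name: {filename!r}")
    return CanvecFile(
        tile=match["tile"].upper(),
        edition=int(match["edition"]),
        version=int(match["version"]),
        theme=match["theme"].upper(),
        entity_code=match["entity"],
        geometry=GEOMETRY[match["geom"]],
    )


print(parse_canvec_name("021g15_4_0_HD_1480009_2.shp"))
```

Something like this would let a script pick the right rules file for each SHP file from the theme, entity code and geometry.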

To encode the particular types of waterbody object there are further 7 digit codes, stored as the value of the CODE attribute. E.g. a CODE 1480272 is an "unknown / non-isolated" waterbody. A 1480092 is a "liquid waste, isolated" waterbody (i.e. a sewage holding pond).

Hydrography: I thought it might be simple to get lakes and rivers out of Canvec, but some of the data is a bit strange. In 021G15 I have Mactaquac Lake and a big chunk of the St John River as the same area object. In Edition 3 it all had the name "Mactaquac Lake". Edition 4 has removed the name. Likewise for the Oromocto River, a bunch of lakes adjacent to the river are all bundled into the same area object as the river.

So how does Toporama - http://atlas.nrcan.gc.ca/site/english/maps/topo/map?mapsize=1150%201350&lat=45.85204497&long=-66.48944242&mapxy=2170329.14209+130644.561916&scale=5000000&feature_na=Oromocto+River&location1=13&unique_key=0c803390849c20c3b25c26ef7b645fbd&searchstring=oromocto%20river&entity=RIV&layers=fapfeature+nodata_ntdb_50k%20north_arrow%20other_features%20roads%20hydrography%20boundary%20builtup%20vegetation%20populated_places%20railway%20power_network%20manmade_features%20designated_areas%20water_features%20water_saturated_soils%20relief%20contours%20toponymy%20contour&urlappend=%26unique_key%3D0c803390849c20c3b25c26ef7b645fbd%26map.layer[textzoom03]%3DFEATURE+POINTS+2169671.31541+142309.976588+END+TEXT+%22Oromocto%2BRiver%22+END%26map.layer[textzoom46]%3DFEATURE+POINTS+2169671.31541+142309.976588+END+TEXT+%22Oromocto%2BRiver%22+END%26map.layer[lineresultzoom0]%3DDATA+fap_rivers%26map.layer[lineresultzoom1]%3DDATA+fap_rivers%26map.layer[lineresultzoom2]%3DDATA+fap_rivers%26map.layer[arrowzoom03]%3DFEATURE+POINTS+2169671.31541+142309.976588+END+END

manage to get nice-ish looking lake and river names? Answer: there are TO (toponymy) theme point objects which have been digitized someplace inside the waterbody. As I mentioned above the waterbody could encompass multiple lakes and rivers. There is no relationship at all between the toponymy objects and the waterbodies.

TO objects seem to have a generic theme number 1580009. Each object also has a CODE but its value always seems to be 1580010, for all the names I've looked at. There is an additional attribute, CONCISECODE, and this is the one which differentiates e.g. French Lake (the lake, CONCISECODE=150) from French Lake (the hamlet, CONCISECODE=80). Actually if you look at the Feature Catalogue you can see that 80 is an "unincorporated area".

How does OSM deal with area names? It seems it's all free and easy and varies depending on your mood. I looked at a lake or two in Maine and the area objects had no names but there were point names, like Canvec has. I also looked at a couple of lakes in Switzerland and saw that they had names as attributes of the lake edges. The Osmarender renderer seems to pick up those attributes for display as names (if you zoom in far enough...). (Which raises another topic: where are the rules for the renderer? If I knew that, I could figure out what attributes are useful to include and what names to give them... I think.)

Back to the Canvec data. If I want to use the hydrography, I just don't like the approach of the unrelated point objects. I'd prefer to make the name an attribute of the area object. That would mean I had to split up those humungous waterbodies into separate lakes and rivers. More work. Should I do it the simple way first then make another pass later to improve? Seems like the better approach.
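One way to see how much splitting would be needed is to test which toponymy points fall inside which waterbody. Here is a minimal Python sketch using the standard ray-casting point-in-polygon test. It only looks at the outer ring (islands are ignored), treats lon/lat as plain x/y, and all the names and coordinates are made up for illustration.

```python
def point_in_ring(x, y, ring):
    """Ray-casting test: is (x, y) inside the ring of (x, y) vertices?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def names_for_areas(areas, name_points):
    """Map each area id to the list of names whose point lies inside it.

    areas: {area_id: outer ring}; name_points: list of (name, x, y).
    """
    return {
        area_id: [name for name, x, y in name_points if point_in_ring(x, y, ring)]
        for area_id, ring in areas.items()
    }


# Made-up geometry: one big waterbody holding two names, one small unnamed pond.
areas = {
    "big": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "pond": [(10, 10), (12, 10), (12, 12), (10, 12)],
}
names = [("Lake A", 1, 1), ("River B", 3, 3), ("Hamlet C", 6, 6)]
print(names_for_areas(areas, names))
```

An area with exactly one name could simply take it as an attribute; an area with several names is one of the bundled waterbodies that would have to be split by hand first.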

Maybe tomorrow I shall take another look at Sam Vekemans's rules files with a view to running the shp-to-osm.jar app to generate some .osm from one or two .shp files. Then I'll look at those in e.g. GM. I'll get a feel for what attribution the .osm is going to have. I should try to find whether there is any accepted standard for attribution for Canada at least.
Question (to myself, rhetorical!): do we want to

2010-02-03

Posted by jsmart09 on 4 February 2010 in English.

Played with Potlatch "revert" function. Bit painful since you can't step quickly through versions or see easily what has changed between versions (e.g. attribute change or spatial change, or spatial but not in the area you're zoomed in on).

Spent more time assessing accuracy. I have been using Global Mapper (GM) v10.02 which can load and display .osm, .shp, .sid. I have overlaid OSM / NRCan (Geobase / Canvec / whatever), SNB's 1:10k topo, SNB's SID orthophotos. Findings, in no particular order:

- the SIDs are very useful, since a lot of the Yahoo imagery is too coarse to resolve roads. From what I recall of the SID technical documentation, the accuracy is supposed to be fairly good; it says "+/- 6.0 metres for well-defined features" here:
http://www.snb.ca/gdam-igec/e/2900e_1b_i.asp

More tech. information here: http://www.snb.ca/topo/assistance/ORE1999R.pdf. It says 90% of well-defined features (i.e. not covered by vegetation) must fall within 4m of their true position. Whenever I've plotted GPS positions on these images I have been comfortable with the locations. I think I am going to have faith in the accuracy of the SODB (Soft Copy Orthophoto Database), as an underlay to vector data to support assessment or even for on-screen digitizing.

Currency: imagery dates from 1996 - 2002. Obviously some roads have changed since then but most have not.

Legality of using SNB's data: see e.g. http://www.snb.ca/gdam-igec/e/2900e_1b_v_.asp?OrthoNum=45756660. From my reading, it says: "do what you like with it but don't blame us if you have problems".

- the SNB Digital Topographic Database (DTDB) has data in SHP format. This is topo data at a nominal 1:10,000 scale. Excellent stuff but perhaps more detailed than necessary. There are 1894 map sheets.

- the NRCan Geobase has data in SHP format on NAD83. I am interested in roads, especially their accuracy. I am also interested in grabbing water, forest, swamp, to supply context.

A thought I am having: it may be best in general to use Geobase roads because the Geobase data all fits together well. Bad to use someone's hokey GPS trace for a road and find it crossing the edge of a Geobase forest or lake.

The Geobase roads (actually the NRN roads) seem to fit the SODB pretty well. A problem with the NRN roads is that they are too filtered, especially at curves: sometimes curves become too jagged and the straight segments get noticeably far from the true curved roads.

The NRN data for NB is relatively old (2003) and does not have a great deal of attribution. But it is pretty complete so very attractive for using.

Merging with existing OSM roads: fortunately there are not a huge number of roads in OSM already, for NB. This makes merging less of a problem. On the one hand I am inclined to replace existing OSM roads with NRN ones, for consistency and possible accuracy. On the other hand it does not seem right to go knocking out other people's possibly good work just because "it wasn't invented here".

For roads, it makes sense to do the main highways first starting with the TCH. I need to find out what the preferred attribution is supposed to be for Canada.

Water: I am attracted to the idea of doing rivers and lakes first as these do not interfere with anyone else's work. Again I need to find out about the preferred attribution.

Some other attributes: we should really record source, accuracy, source date.

Tracy: looked at GPS trace + OSM road on Rte 101 through Tracy. Hmm, the GPS trace seems about right but the OSM road is about 70m too far E.

How can I insert images into the diary ??

Oh yes - Yahoo imagery seems a bit off sometimes in Fredericton, a few metres, nothing drastic.
