Assessing the quality of electric vehicle charging station data (with a specific focus on the "capacity" tag)
Posted by iboates on 5 March 2023 in English. Last updated on 6 March 2023.
I am a developer at an energy research institute, and so the topic of electric vehicles and charging stations is one that comes up often in my discussions with colleagues. One colleague mentioned that they were struggling to do a study in a specific city, and lamented how hard it was to find data for it. Naturally, I suggested using OSM.
While not outright shot down, the idea was politely dismissed, citing that the OSM data is simply too unreliable and not detailed enough to perform the kind of analysis that they wanted. My initial reaction was to jump to the defense of OSM, but I realized that I don’t really know the data quality of this specific corner of the database.
I was already planning to attend the Karlsruhe OSM hackathon at Geofabrik on February 26th and 27th, and so I decided I would make it my mission to analyze the quality of the charging station data as deeply as I could in those two days (and some time thereafter). Obviously, as with any analysis of OSM quality, seemingly simple questions balloon into exponentially difficult answers, fraught with tedious subtleties.
Despite this, I have come to a few conclusions that I thought I should share that are specifically targeted at assessing the quality of OSM charging station data for use in electrical engineering research. First, here are some definitions I will use so as not to repeat tedious, specific technical definitions:
- Charging station: An OSM feature tagged with “amenity=charging_station”
- Charging station point: Such a feature with a “node” geometry type
- Charging station polygon: Such a feature with a “way” geometry type
I have also attached a dump of a PostGIS database which I was able to create using osm2pgsql (specifically the flex-output). I used this database in my analysis, and have written some queries in this post that are used to illustrate my points, when executed on an instance of this database. Special thanks to Jochen Topf and Sarah Hoffman for both developing these utilities and helping me directly in using them.
Confusion about capacity
It seems that there is a fair bit of confusion regarding the “capacity” tag on charging stations. The confusion is not unfounded, however. “Capacity” in the context of electrical engineering refers to the maximum power of an output device. As a result, there are many (not a majority, but still many) charging stations on OSM that store this “electrical capacity” as opposed to the “people capacity” which is more standard for anything on OSM, and is defined quite clearly on the wiki.
To dig into this a bit, we can observe that the vast majority of charging stations are points. At the time of writing, there are 87 098 points and 930 ways (as well as 38 relations, but I did not investigate this any further since the wiki indicates that it should not be used on those) (source).
It is quite difficult to determine if a charging station has its power rating mistagged as its capacity, since there is no reliable way to determine the actual capacity without going to the charging station itself and counting the number of sockets.
Despite this, I have come up with a few criteria that I think identify charging stations with capacity-related tagging issues.
Capacity is not completely numeric
The wiki is (at least at the time of writing) explicit on this:
“The number of vehicles that can be charged at the same time at a amenity=charging_station” (source)
In all cases, even outside of charging stations, it seems that this tag should be completely numeric. Any instance of a non-numeric character indicates that there is definitely something wrong, even if it is just a typo. All of the following cases, except the last one, can be seen when executing this query:
select
    cs.node_id,
    cs.capacity as capacity
from charging_station cs
where cs.capacity !~ '^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$';
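For readers without the database dump at hand, the same check can be sketched in Python. This is only an illustration: the sample values are made up for the example, and the function name is hypothetical. The pattern is copied from the query above, minus the anchors, which `fullmatch` makes redundant.

```python
import re

# Same pattern as in the SQL query: optional sign, digits with an optional
# decimal part, and an optional exponent. fullmatch() replaces the ^...$ anchors.
NUMERIC = re.compile(r'[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?')


def is_numeric_capacity(value: str) -> bool:
    """Return True if the raw tag value would pass the numeric pattern."""
    return NUMERIC.fullmatch(value) is not None


# Made-up sample values, loosely modelled on the cases described in this post.
samples = ["2", "22 kW", "2 Cars", "2 x 22 kW", "4.0", "1e2", ""]

# Mirrors the SQL "!~" filter: keep only the values that do NOT match.
non_numeric = [v for v in samples if not is_numeric_capacity(v)]
print(non_numeric)
```

Note that this is a general floating-point pattern, so values such as “4.0”, “1e2” or “-3” also count as numeric. A stricter check for whole, non-negative numbers would be something like `[0-9]+`.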
Specifying the power unit
This criterion catches a very common instance of this misconception: a mapper has tagged the capacity with a number, followed by a power unit such as kilowatts. Here is an example of this happening, although it takes many shapes and forms, with and without spaces, capitalizing either the “k” or the “w”, etc. To me, this is immediately obvious as a mistake, and these values should be moved to the “socket:<type>:output” subtag.
Unnecessarily specifying the number of cars
Don’t be too hasty, because sometimes the mistake isn’t simply that the power rating has been assigned to the capacity. Some mappers have input the correct value, but have polluted the input with unnecessary extra content. For example, here is an instance of a mapper having indicated that the capacity is “2 Cars”, which is technically a correct usage of the tag, but there shouldn’t be any need to specify that it refers to cars, because it should be implicit.
Specifying both power and number of cars
There are also instances of the capacity being specified correctly, but the power rating of the charging station is just shoved into the capacity tag anyway. A really common occurrence of this is in Germany, where there are several charging stations with capacity tagged as “2 x 22 kW”. There are variants of this all over the world; it just happens more in Germany. An example of this is here.
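The three non-numeric cases above can be told apart mechanically, at least roughly. The following is a minimal sketch, not the method used in the analysis: the category names, the regular expressions and the function name are all assumptions made for illustration, and the suggested capacity is only a hint for a human reviewer, never something to apply automatically.

```python
import re

# A number followed by "kW" in any capitalization, with or without a space.
POWER = re.compile(r'(\d+(?:[.,]\d+)?)\s*kw', re.IGNORECASE)
# A leading count followed by a multiplication sign, as in "2 x 22 kW".
COUNT_TIMES = re.compile(r'^\s*(\d+)\s*[x×*]', re.IGNORECASE)
# A leading count followed by a word, as in "2 Cars".
COUNT_WORD = re.compile(r'^\s*(\d+)\s+[a-z]+', re.IGNORECASE)


def classify_capacity(value: str):
    """Return (category, suggested_capacity) for a raw capacity tag value.

    The category labels are hypothetical names chosen for this sketch.
    """
    text = value.strip()
    if re.fullmatch(r'\d+', text):
        return ("numeric", int(text))
    has_power = POWER.search(text) is not None
    times = COUNT_TIMES.match(text)
    if has_power and times:
        # "2 x 22 kW": the leading number is probably the real capacity.
        return ("count_and_power", int(times.group(1)))
    if has_power:
        # "22 kW": a power rating only, the capacity itself is unknown.
        return ("power_only", None)
    word = COUNT_WORD.match(text)
    if word:
        # "2 Cars": correct number, unnecessary extra text.
        return ("count_with_word", int(word.group(1)))
    return ("other", None)


for raw in ["2", "22 kW", "2 Cars", "2 x 22 kW", "many"]:
    print(raw, "->", classify_capacity(raw))
```

Anything landing in the “other” bucket would still need to be looked at by hand.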
Feature is a polygon (and is small)
According to the wiki, a polygonal charging station is perfectly valid, and it makes sense for cases where there is a high concentration of charging stations, perhaps because a privately owned company provides many charging slots in a single charging location as a business model. In these cases, the polygon is likely to have a large area. An example of this is here, and we can be reasonably certain that this is a company due to the shape of the building around it (it looks like it is designed to fit many vehicles to maximize space efficiency). Unsurprisingly, the capacity is 28, a reasonable value for such a large area.
However, the polygon does not need to be large. Here is an example of the same company operating only six charging stations, packed into 6 adjacent parking spaces, and occupying a much smaller area as a result.
Problems arise however when we sort the list of polygonal charging stations by area (yes, calculated in 3857, I am well aware of the distortion issues, I only wanted to have an initial look. Proper analysis should certainly use a local, equal-area CRS). For example, this one is the smallest one that I found in the database, and it is so small that it barely even appears on the map. Thankfully, the parking spaces for it appear to actually be mapped, and the rest of the station seems to be properly tagged. So in this case, this feature could be easily re-mapped to be a point instead.
Another problem arises when a feature that is not primarily a charging station is tagged as one, probably because the mapper mistakenly thought that the tag value means “this place offers vehicle charging in addition to its regular function”. An example is here, where a campground in the USA is tagged with “amenity=charging_station”. This particular instance doesn’t actually define capacity, but if it did, it would logically follow that the capacity should refer to the number of campsites, not to the number of charging stations. It’s unclear how to fix this. On the one hand, I wouldn’t want to scrub real information from this place, but on the other hand, it causes a disconnect between the charging station’s capacity and the campground capacity. This could be especially problematic if the number of charging stations is quite high, as a GIS analysis of charging stations could report a wildly incorrect assessment of capacity in that specific area, despite everything being mapped technically correctly.
To see the polygonal charging stations, you can use this query:
select
    cs.node_id,
    c.name as country,
    -- Make sure to use a proper, local equal-area CRS when doing your own detailed analysis!
    ST_Area(ST_Transform(cs.geom, 3857)) as area
from charging_station cs
left join country c on ST_Intersects(c.geom, cs.geom)
where ST_Area(cs.geom) > 0;
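To give a feel for how large the 3857 distortion is, here is a rough Python sketch. It relies on the fact that spherical Web Mercator stretches lengths by about 1/cos(latitude), so the area of a small feature is inflated by about 1/cos²(latitude). The function name is hypothetical, and this is only a back-of-the-envelope correction for small polygons, not a replacement for a proper equal-area CRS.

```python
import math


def approx_true_area_m2(mercator_area_m2: float, latitude_deg: float) -> float:
    """Scale an area measured in EPSG:3857 back to an approximate ground area.

    Assumes the feature is small enough that a single latitude represents it,
    and ignores the difference between the sphere and the ellipsoid.
    """
    return mercator_area_m2 * math.cos(math.radians(latitude_deg)) ** 2


# Hypothetical example: a polygon measured as 100 m² in EPSG:3857.
for lat in (0.0, 30.0, 49.0, 60.0):
    print(lat, round(approx_true_area_m2(100.0, lat), 1))
```

At the equator the area is unchanged, while at 60° latitude the value reported in 3857 is roughly four times the true area. Sorting polygons by 3857 area therefore mixes features from different latitudes unfairly, which is another reason to treat the ordering as a first look only.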
Point charging station capacity tag value is suspiciously high
I also tried to isolate features that are technically valid but still “suspicious”: I cannot say with reasonable certainty that they are wrong, but I think any capacity value greater than four deserves a second look if the feature is a point. My reasoning is this:
Given that OSM happily supports detail up to (standard web map) zoom level 20, I posit that a charging station point is almost certainly meant to represent an individual installation about the size of a vending machine or ATM, features that are themselves widely mapped as points. Unless a charging station has extremely long cables, four is likely to be the maximum number of vehicles that it can service simultaneously, which happens when it is positioned at the intersection of the parking lines separating four parked cars in the typical double-columned parking layout that is popular in most of the world.
[Figure: Maximum “reasonable” density for a charging station point (servicing four vehicles simultaneously)]
While it is physically possible for this number to be higher (perhaps there is some facility somewhere in the world in which cars park in rings around a central charging station), I consider it unlikely. As such, I consider any “capacity” tag value greater than four on a point to be suspicious, and think it should be verified by a local mapper.
Again, I am not saying that it is significantly unlikely to have more than 4 sockets on a single device, only that it would be worth a local mapper taking a look to confirm that it is the case. Additionally, maybe it is worth re-mapping such cases as polygons to encompass the parking slots themselves, as was the case in some examples in the previous section.
You can get these “suspicious” charging stations with the following query:
select
    cs.node_id,
    c.name as country,
    cs.capacity::real as capacity
from charging_station cs
left join country c on ST_Intersects(c.geom, cs.geom)
where cs.capacity ~ '^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$'
    and cs.capacity::real > 4
    and ST_Area(cs.geom) = 0;
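The logic of this filter can also be sketched in Python, which makes the order of the steps explicit: first check that the value is numeric, then convert it, then compare it with the threshold. The tuples below are made-up sample data with hypothetical ids, and the names `suspicious_points` and `MAX_POINT_CAPACITY` are assumptions for this example.

```python
import re

# Same numeric pattern as in the SQL query (anchors replaced by fullmatch).
NUMERIC = re.compile(r'[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?')

# Threshold from the reasoning above; adjust it if you disagree with it.
MAX_POINT_CAPACITY = 4


def suspicious_points(stations, threshold=MAX_POINT_CAPACITY):
    """Return the ids of point features whose numeric capacity exceeds threshold.

    Each station is a tuple (osm_id, is_point, raw_capacity_or_None).
    """
    flagged = []
    for osm_id, is_point, raw in stations:
        # Skip polygons, untagged features and non-numeric values *before*
        # attempting the conversion, so float() never sees malformed input.
        if not is_point or raw is None or NUMERIC.fullmatch(raw) is None:
            continue
        if float(raw) > threshold:
            flagged.append(osm_id)
    return flagged


# Made-up sample rows: (hypothetical id, is_point, raw capacity value)
sample = [
    (1, True, "2"),
    (2, True, "22"),
    (3, False, "28"),
    (4, True, "22 kW"),
    (5, True, None),
    (6, True, "4"),
]
print(suspicious_points(sample))
```

Checking the pattern before converting is deliberate. As far as I understand, PostgreSQL does not guarantee the order in which the conditions of a `where` clause are evaluated, so if the cast in the query above ever fails on a non-numeric value, wrapping it in a `case` expression is one way to force the same ordering there.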
There are a lot more of these (>5000). Sadly I don’t think there is any way to realistically and reliably verify these. That is why I would propose a StreetComplete mapping campaign to attempt to clean them up.
I think we can divide the findings into two problem groups.
The first group are the easy fixes. They consist of the first few cases:
- Capacity tag value is not completely numeric
- Power unit is specified in capacity tag value
- Capacity tag value specifies “cars” or equivalent
- Both power and number of sockets are specified in capacity tag value
I have added a link to a .csv file containing all the OSM ids of features that have one (or more) of these problems, as I believe that at least most of them can be fixed by a dedicated mapper (or mappers), even without going to the charging station directly. Here is the .csv file.
The second group are the hard fixes. They consist of the last two cases:
- Feature is a polygon (and is small)
- Point charging station capacity is suspiciously high
There are definitely some totally correctly-tagged features in here, but I believe that there are some systematic problems mixed up in them that could stand to be verified by an on-site mapper, perhaps via a StreetComplete campaign. I have also added a list of the OSM ids of features with these potential problems, but keep in mind that fixing them will almost certainly require visiting and verifying the charging station. Here is the link
Finally, I have also added a link to a PostgreSQL database dump (~40MB compressed) of all charging stations worldwide, if anyone wants to dive in deeper.
The code I used to do all this analysis can be found in a GitHub repo with instructions on how to run it.
Comment from SimonPoole on 6 March 2023 at 13:08
We had a discussion back in October in the Swiss community with respect to best practices of mapping individual chargers vs. the area of larger charging facilities. (and tagging the area). See http://lists.openstreetmap.ch/pipermail/talk-ch/2022-October/011686.html It would be nice if we could get general agreement on how to handle detailed vs. rough mapping without losing information.
Some of the issues you have seen in numeric values are “normal” and not specific to charging station tagging, any use will have to normalize these.
PS: I need to remember that we need a plug-in value for authentication, while this is only common right now for Teslas it is likely to become much more common.
Comment from iboates on 7 March 2023 at 13:21
Thanks for the comment Simon, I don’t understand what you mean however by:
“Some of the issues you have seen in numeric values are “normal” and not specific to charging station tagging, any use will have to normalize these.”
What do you mean by “any use will have to normalize these”?
Comment from SimonPoole on 8 March 2023 at 07:44
The long tail of values in essentially any “numeric” tag in OSM will contain malformed data that has to be dealt with. How exactly could range from trying to parse them to extract some useful information, to simply ignoring them. What you can’t do is simply assume they will contain well formatted numbers.
Comment from Pieter Vander Vennet on 8 March 2023 at 17:03
Also see https://mapcomplete.osm.be/charging_stations - if stuff can be improved there, let us know in the issue tracker
Comment from NKA on 12 March 2023 at 13:55
Please note that most of the commercial charging stations are mapped at the “site” level (only one OSM object for all points/sockets at a location), and often just with one node, rather than with a node for each of the charging points at the station. Almost all of the Tesla Supercharger stations are mapped this way, for example. This is similar to how a fuel station is mapped.
There is no requirement to map a charging station (site) as a way rather than a node, so charging station nodes may easily have 20-30 sockets.
So I think your list of “suspicious” points is based on a misunderstanding of the OSM mapping, and a StreetComplete campaign should of course not be initiated.
Comment from 4004 on 12 March 2023 at 23:37
Good write up (and props to your colleagues for knowing what OSM is and how the data can be). One issue I found with the capacity tag and the last anomalous indicator you noted is sometimes people generalise a bit - eg where there is an installation of a few chargers, but a mapper only places one node, and then tries to correct this by summing up the capacities of each charger. Also, a few of the datasets (eg from infrastructure providers) I’ve seen have no or little detailed capacity data (ie split by socket type etc)
Comment from iboates on 13 March 2023 at 08:15
@NKA Indeed, after some more investigation I think that I was naive in my assessment. However, there are numerous instances of “capacity=22”, and since 22 kW is a very common charging station power rating, I think there are still grounds for suspicion regarding the capacity tag in general.
@4004 Thank you. I am currently focusing my efforts on finding ways to assess the completeness of charging stations as you said, to see if I can find any trends based on region.
Comment from AngocA on 13 March 2023 at 12:48
Hi iboates, thank you for the analysis and explanation about how to tag electrical chargers. Regarding the capacity, I have a comment about a particular charger in my city, Bogotá: there is a charger with 3 connectors (Schuko, Mennekes and SAE J1772 combo) that allows 3 cars to be plugged in. However, the electrical capacity is not enough, and when the Mennekes plug is connected, the combo plug does not provide power to charge the car. In this case, is the capacity 3 (cables/connectors) or 2 (simultaneously charging cars)?
Comment from Niels Elgaard Larsen on 13 March 2023 at 13:30
Where I live, charging stations are mostly located along sidewalks between 2 parking spaces (diagonal or parallel). I.e. capacity=2. Usually in groups of 3 or 4 chargers.
Sometimes they are all mapped as one charging station with e.g., capacity=8. I do not think that is wrong. It still works for people searching for charging stations.
I have done it myself a few times when adding charging stations in OsmAnd just because there is no easy copy function. Then at home, I can split it into 4 stations using aerial images if I have tagged e.g., the first station and remember in which direction they are.
Charging stations are installed quickly at this time and it can take a year before they appear on aerial images.
Comment from InfosReseaux on 13 March 2023 at 13:37
We’re currently discussing charging infrastructure on the community forum: https://community.openstreetmap.org/t/charging-stations-sites-or-individual-chargers/96810
Comment from Hedaja on 13 March 2023 at 14:41
Thanks for analyzing the data. Maybe you could open up a Maproulette challenge so we have an easy way to check the suspicious cases.
Comment from SLMapper on 14 March 2023 at 09:16
Hi @iboates and others.
Interesting investigation. This looks very focused on cars. However, charging stations also exist for other vehicles (e.g. bicycles, trucks, buses, scooters, boats), and some are even usable by different types of vehicles at the same time. All are mapped in OSM with amenity=charging_station.
So, does anybody have any more insights or tips on vehicle types?