Some numbers about mailing lists (part 2): Number of messages per mailing list and year, most active authors since 2016
Posted by Nakaner on 3 December 2018 in English.A new evening is a new evening to play around with SQL. This is part 2 of my series on the number of messages and the most active authors on the public mailing lists hosted a lists.openstreetmap.org. Part 1 was published on 2 December 2018.
Here are my latest queries and their results:
Numbers of messages per year
SELECT EXTRACT(YEAR FROM "date") AS year, COUNT(1) AS message_count
FROM mails
GROUP BY year
ORDER BY year;
year | message_count
------+---------------
2004 | 180
2005 | 2343
2006 | 10481
2007 | 34059
2008 | 70120
2009 | 102357
2010 | 96569
2011 | 66484
2012 | 69943
2013 | 67428
2014 | 55866
2015 | 52484
2016 | 38914
2017 | 36815
2018 | 34271
Numbers of messages per mailing list and year (2004–2008)
Mailing lists with less than 50 messages in all of these years are not shown in the following table. Rows are sorted by number of messages in their first year.
WITH
postings_years AS (
SELECT list_id, EXTRACT(YEAR FROM "date") AS year, COUNT(1) AS message_count
FROM mails
GROUP BY list_id, year
),
list_ids AS (
SELECT list_id
FROM mails
GROUP BY list_id
)
SELECT *
FROM (
SELECT
l.list_id AS list_id,
y2004.message_count AS "2004",
y2005.message_count AS "2005",
y2006.message_count AS "2006",
y2007.message_count AS "2007",
y2008.message_count AS "2008"
FROM list_ids AS l
LEFT OUTER JOIN postings_years AS y2004
ON (l.list_id = y2004.list_id AND y2004.year = 2004)
LEFT OUTER JOIN postings_years AS y2005
ON (l.list_id = y2005.list_id AND y2005.year = 2005)
LEFT OUTER JOIN postings_years AS y2006
ON (l.list_id = y2006.list_id AND y2006.year = 2006)
LEFT OUTER JOIN postings_years AS y2007
ON (l.list_id = y2007.list_id AND y2007.year = 2007)
LEFT OUTER JOIN postings_years AS y2008
ON (l.list_id = y2008.list_id AND y2008.year = 2008)
ORDER BY
y2004.message_count DESC NULLS LAST,
y2005.message_count DESC NULLS LAST,
y2006.message_count DESC NULLS LAST,
y2007.message_count DESC NULLS LAST,
y2008.message_count DESC NULLS LAST,
list_id
) AS stats
WHERE "2004" >= 50
OR "2005" >= 50
OR "2006" >= 50
OR "2007" >= 50
OR "2008" >= 50;
list_id | 2004 | 2005 | 2006 | 2007 | 2008
-------------------+------+------+------+-------+-------
talk | 180 | 1771 | 7772 | 11656 | 11310
dev | | 572 | 2119 | 5625 | 4979
talk-gb | | | 239 | 1952 | 1524
talk-de | | | 235 | 5953 | 26392
legal-talk | | | 47 | 452 | 1285
talk-fr | | | 42 | 763 | 4579
talk-it | | | 27 | 322 | 4381
talk-nl | | | | 4107 | 3873
talk-es | | | | 888 | 1473
josm-dev | | | | 597 | 1689
talk-cz | | | | 590 | 1727
talk-au | | | | 487 | 771
talk-za | | | | 184 | 104
routing | | | | 162 | 432
talk-tr | | | | 103 | 26
party | | | | 69 | 7
talk-ie | | | | 54 | 192
talk-fi | | | | 52 | 30
talk-se | | | | 19 | 58
talk-us | | | | 13 | 619
merkaartor | | | | | 1001
talk-ja | | | | | 897
talk-be | | | | | 612
talk-ca | | | | | 471
talk-at | | | | | 441
talk-ph | | | | | 261
talk-in | | | | | 220
talk-ar | | | | | 181
talk-gb-midanglia | | | | | 116
legal-general | | | | | 111
talk-co | | | | | 100
talk-il | | | | | 69
talk-is | | | | | 69
(33 rows)
Number of messages per mailing list and year (2009–2013)
Mailing lists with less than 50 messages in all of these years are not shown in the following table. Rows are sorted by number of messages in their first year.
WITH
postings_years AS (
SELECT list_id, EXTRACT(YEAR FROM "date") AS year, COUNT(1) AS message_count
FROM mails
GROUP BY list_id, year
),
list_ids AS (
SELECT list_id
FROM mails
GROUP BY list_id
)
SELECT *
FROM (
SELECT
l.list_id AS list_id,
y2009.message_count AS "2009",
y2010.message_count AS "2010",
y2011.message_count AS "2011",
y2012.message_count AS "2012",
y2013.message_count AS "2013"
FROM list_ids AS l
LEFT OUTER JOIN postings_years AS y2009 ON (l.list_id = y2009.list_id AND y2009.year = 2009)
LEFT OUTER JOIN postings_years AS y2010 ON (l.list_id = y2010.list_id AND y2010.year = 2010)
LEFT OUTER JOIN postings_years AS y2011 ON (l.list_id = y2011.list_id AND y2011.year = 2011)
LEFT OUTER JOIN postings_years AS y2012 ON (l.list_id = y2012.list_id AND y2012.year = 2012)
LEFT OUTER JOIN postings_years AS y2013 ON (l.list_id = y2013.list_id AND y2013.year = 2013)
ORDER BY
y2009.message_count DESC NULLS LAST,
y2010.message_count DESC NULLS LAST,
y2011.message_count DESC NULLS LAST,
y2012.message_count DESC NULLS LAST,
y2013.message_count DESC NULLS LAST,
list_id
) AS stats
WHERE "2009" >= 50
OR "2010" >= 50
OR "2011" >= 50
OR "2012" >= 50
OR "2013" >= 50;
list_id | 2009 | 2010 | 2011 | 2012 | 2013
-----------------------+-------+-------+-------+-------+-------
talk-de | 28125 | 20392 | 10276 | 8758 | 6088
talk | 13570 | 9296 | 5724 | 4095 | 3336
talk-fr | 11994 | 12072 | 9541 | 13755 | 12880
talk-it | 8267 | 7896 | 5182 | 6285 | 8046
dev | 4720 | 3398 | 2613 | 2235 | 1236
talk-au | 3515 | 2617 | 1291 | 1019 | 450
talk-gb | 3358 | 3428 | 2027 | 1669 | 1386
talk-nl | 2511 | 1824 | 1236 | 721 | 328
talk-us | 1942 | 2436 | 2101 | 2885 | 2385
talk-es | 1892 | 2281 | 2121 | 2265 | 1063
talk-cz | 1729 | 2019 | 890 | 1064 | 888
josm-dev | 1587 | 1217 | 792 | 464 | 428
talk-ca | 1539 | 1511 | 832 | 895 | 659
talk-ph | 1345 | 1185 | 835 | 584 | 639
talk-lv | 1304 | 967 | 1724 | 985 | 543
legal-talk | 1291 | 2468 | 1226 | 510 | 310
talk-at | 1278 | 939 | 934 | 1640 | 990
talk-ja | 1224 | 1844 | 1667 | 1222 | 917
tagging | 1011 | 5118 | 2900 | 3368 | 3551
talk-ro | 944 | 786 | 292 | 238 | 606
merkaartor | 942 | 1022 | 158 | 94 | 20
talk-br | 901 | 971 | 480 | 671 | 1728
talk-transit | 858 | 391 | 290 | 50 | 83
talk-in | 824 | 355 | 92 | 210 | 177
talk-hr | 553 | 478 | 316 | 344 | 285
imports | 440 | 280 | 504 | 548 | 848
osmosis-dev | 418 | 446 | 317 | 249 | 140
talk-be | 416 | 674 | 733 | 921 | 2098
talk-gb-westmidlands | 364 | 261 | 237 | 300 | 229
talk-co | 363 | 1148 | 964 | 500 | 321
talk-dk | 357 | 708 | 1055 | 1063 | 411
talk-is | 269 | 207 | 65 | 145 | 92
talk-za | 257 | 189 | 78 | 82 | 176
routing | 247 | 233 | 78 | 76 | 14
talk-ee | 228 | 548 | 357 | 199 | 154
talk-se | 211 | 411 | 414 | 676 | 572
talk-ar | 211 | 244 | 219 | 155 | 289
talk-lt | 148 | 484 | 285 | 251 | 334
talk-fr-bzh | 117 | 114 | 290 | 279 | 194
talk-gb-midanglia | 114 | 48 | 62 | 16 | 8
talk-cu | 108 | 34 | | 11 | 3
talk-si | 107 | 55 | 26 | 34 | 2
talk-gb-oxoncotswolds | 87 | 20 | 101 | 8 | 85
local-chapters | 65 | 118 | 20 | | 47
geocoding | 65 | 77 | 48 | 423 | 430
talk-ie | 61 | 50 | 64 | 26 | 44
moderation | 59 | 15 | | |
accessibility | 55 | 101 | 55 | 75 |
talk-vi | 54 | 1 | 1 | 7 |
potlatch-dev | 51 | 326 | 984 | 540 | 79
talk-cl | 33 | 857 | 120 | 83 | 36
talk-ko | 21 | 34 | 11 | 67 | 11
talk-pt | 14 | 70 | 232 | 171 | 178
talk-tr | 12 | 18 | 21 | 38 | 64
talk-pe | 3 | 65 | 53 | 127 | 40
hot | | 456 | 861 | 1179 | 1787
talk-ht | | 347 | 64 | 183 | 337
talk-it-fvg | | 206 | 75 | 66 | 180
strategic | | 185 | 398 | 14 |
dev-fr | | 140 | 480 | 821 | 449
mapcss | | 134 | 116 | 61 | 65
talk-ve | | 86 | 255 | 226 | 19
talk-tw | | 53 | 112 | 467 | 412
talk-rs | | 30 | 199 | 71 | 1
talk-uy | | 12 | 17 | 24 | 68
talk-it-piemonte | | | 281 | 143 | 47
rails-dev | | | 187 | 1408 | 3458
taginfo-dev | | | 74 | 52 | 59
talk-id | | | 70 | 64 | 21
talk-it-bici | | | 54 | 47 |
talk-it-lazio | | | 49 | 36 | 201
talk-it-trentino | | | 33 | 133 | 323
talk-lu | | | 33 | 59 | 16
talk-np | | | 14 | 163 | 31
talk-it-liguria | | | 13 | 513 | 153
talk-pl | | | 7 | 449 | 346
rebuild | | | | 353 |
historic | | | | 72 | 265
talk-sn | | | | 36 | 152
talk-ni | | | | 15 | 310
talk-baltics | | | | 5 | 91
sotm-asia | | | | 4 | 86
talk-mx | | | | 2 | 57
tile-serving | | | | | 778
imports-us | | | | | 481
osrm-talk | | | | | 359
talk-cat | | | | | 230
diversity-talk | | | | | 130
talk-us-nps | | | | | 106
talk-it-southtyrol | | | | | 69
talk-it-sardinia | | | | | 61
talk-ke | | | | | 61
(92 rows)
Some mailing lists changed their focus or became unnessary after some time. For example, the legal-talk mailing list became a bit more quiet after the license change in mid 2012 was over. The “rebuild” mailing list had the only purpose to discuss the decisions implemented in the Redaction Bot.
Number of messages per mailing list and year (2014 to November 2018)
Mailing lists with less than 50 messages in all of these years are not shown in the following table. Rows are sorted alphabetically by the ID of the mailing list.
WITH
postings_years AS (
SELECT list_id, EXTRACT(YEAR FROM "date") AS year, COUNT(1) AS message_count
FROM mails
GROUP BY list_id, year
),
list_ids AS (
SELECT list_id
FROM mails
GROUP BY list_id
)
SELECT *
FROM (
SELECT
l.list_id AS list_id,
y2014.message_count AS "2014",
y2015.message_count AS "2015",
y2016.message_count AS "2016",
y2017.message_count AS "2017",
y2018.message_count AS "2018"
FROM list_ids AS l
LEFT OUTER JOIN postings_years AS y2014 ON (l.list_id = y2014.list_id AND y2014.year = 2014)
LEFT OUTER JOIN postings_years AS y2015 ON (l.list_id = y2015.list_id AND y2015.year = 2015)
LEFT OUTER JOIN postings_years AS y2016 ON (l.list_id = y2016.list_id AND y2016.year = 2016)
LEFT OUTER JOIN postings_years AS y2017 ON (l.list_id = y2017.list_id AND y2017.year = 2017)
LEFT OUTER JOIN postings_years AS y2018 ON (l.list_id = y2018.list_id AND y2018.year = 2018)
ORDER BY
list_id
) AS stats
WHERE "2014" >= 50
OR "2015" >= 50
OR "2016" >= 50
OR "2017" >= 50
OR "2018" >= 50;
list_id | 2014 | 2015 | 2016 | 2017 | 2018
-----------------------+------+------+------+------+------
dev | 596 | 758 | 633 | 393 | 390
dev-fr | 365 | 144 | 74 | 54 | 14
dev-italia | 90 | 16 | 165 | 11 | 5
diversity-talk | 118 | 4 | | 5 | 100
geocoding | 318 | 300 | 104 | 57 | 29
historic | 205 | 380 | 55 | 83 | 59
hot | 2669 | 3729 | 2088 | 1272 | 503
hot-francophone | 305 | 497 | 182 | 123 | 107
imports | 914 | 604 | 518 | 564 | 515
imports-us | 154 | 90 | 47 | 12 | 52
josm-dev | 288 | 437 | 172 | 157 | 217
learnosm-coord | 50 | 57 | 4 | |
legal-talk | 367 | 276 | 197 | 74 | 29
osmf-membership | 2 | 59 | | |
osmosis-dev | 23 | 89 | 35 | 19 | 19
osrm-talk | 343 | 345 | 293 | 179 | 156
potlatch-dev | 14 | 22 | 2 | 71 | 12
rails-dev | 1309 | 1783 | 1781 | 2232 | 2494
tagging | 4685 | 7217 | 2803 | 3734 | 6602
talk | 2814 | 3584 | 1975 | 2668 | 1778
talk-africa | | 1 | 16 | 100 | 114
talk-ar | 228 | 77 | 4 | 11 |
talk-at | 851 | 938 | 596 | 334 | 729
talk-au | 248 | 279 | 370 | 416 | 592
talk-be | 1119 | 1569 | 1234 | 837 | 260
talk-bf | 84 | 45 | 27 | 18 | 1
talk-bj | 3 | 5 | 26 | 64 | 5
talk-blr | 3 | 52 | | |
talk-bo | 24 | 32 | 137 | 153 | 210
talk-br | 4412 | 1809 | 730 | 461 | 189
talk-ca | 370 | 311 | 906 | 611 | 679
talk-cat | 368 | 374 | 83 | 74 | 45
talk-cl | 106 | 110 | 165 | 159 | 83
talk-cm | 31 | 8 | 5 | 41 | 50
talk-co | 200 | 268 | 264 | 128 | 55
talk-cu | 1 | 13 | 131 | 42 | 35
talk-cz | 2290 | 1720 | 2594 | 2490 | 2140
talk-de | 3615 | 2306 | 1335 | 764 | 1138
talk-dk | 359 | 415 | 299 | 156 | 212
talk-es | 786 | 762 | 1121 | 814 | 856
talk-fr | 8479 | 5082 | 3690 | 4110 | 4008
talk-fr-bzh | 399 | 332 | 157 | 153 | 169
talk-gb | 1300 | 1136 | 1524 | 1390 | 1151
talk-gb-westmidlands | 262 | 161 | 123 | 180 | 65
talk-gh | 1 | 44 | 30 | 77 | 74
talk-hr | 238 | 164 | 35 | 32 | 65
talk-ht | 144 | 62 | 131 | 72 | 46
talk-ie | 352 | 481 | 297 | 298 | 159
talk-in | 125 | 362 | 271 | 275 | 189
talk-it | 5730 | 4646 | 5645 | 4843 | 3675
talk-it-cai | | | | 336 | 173
talk-it-fvg | 139 | 219 | 269 | 236 | 96
talk-it-lazio | 64 | 23 | 47 | 231 | 98
talk-it-liguria | 73 | 19 | 12 | 10 | 2
talk-it-piemonte | 118 | 49 | 17 | 73 | 113
talk-it-sardinia | 110 | 12 | 28 | |
talk-it-southtyrol | 69 | 10 | | 2 | 1
talk-it-trentino | 163 | 159 | 277 | 71 | 68
talk-ja | 635 | 543 | 356 | 378 | 370
talk-ko | 26 | 30 | 5 | 97 | 36
talk-latam | 165 | 235 | 167 | 119 | 70
talk-lt | 81 | 122 | 215 | 163 | 86
talk-lv | 153 | 132 | 57 | 63 | 28
talk-mg | | 68 | 20 | 28 | 46
talk-ml | 9 | 101 | 33 | 29 | 4
talk-mx | 139 | 252 | 127 | 79 | 53
talk-ni | 169 | 113 | 125 | 59 | 29
talk-nl | 417 | 199 | 133 | 53 | 33
talk-pe | 71 | 187 | 137 | 40 | 12
talk-ph | 547 | 356 | 177 | 155 | 117
talk-pl | 726 | 614 | 531 | 74 | 98
talk-pr | 142 | 92 | 53 | 29 | 4
talk-pt | 372 | 190 | 110 | 113 | 71
talk-ro | 204 | 134 | 39 | 2 | 3
talk-scotland | 26 | 69 | 42 | 96 | 39
talk-se | 230 | 146 | 247 | 263 | 129
talk-sn | 64 | 114 | 113 | 75 | 57
talk-tn | | | 58 | 58 | 37
talk-transit | 33 | 42 | 51 | 11 | 131
talk-tw | 91 | 54 | 64 | 13 | 18
talk-us | 1552 | 1771 | 1108 | 1300 | 847
talk-us-massachusetts | | | 37 | 114 | 248
tile-serving | 908 | 1870 | 709 | 1006 | 648
umap | 17 | 49 | 79 | 63 | 36
(84 rows)
Top Authors per Year
SELECT
row_number() OVER (ORDER BY message_count DESC) AS rank,
author,
message_count
FROM (
SELECT
from_short AS author,
COUNT(1) AS message_count
FROM mails
WHERE
EXTRACT(YEAR FROM "date") = 2018
AND from_short NOT IN ('notifications@github.com', 'theweekly.osm@gmail.com')
GROUP BY author
ORDER BY message_count DESC
) AS a
LIMIT 20;
rank | author | message_count
------+---------------------------+---------------
1 | Martin Koppenhoefer | 1819
2 | Warin | 569
3 | Marc Marc | 554
4 | Philippe Verdy | 506
5 | Mateusz Konieczny | 478
6 | Giovanni Cascafico | 344
7 | Paul Allen | 322
8 | Graeme Fitzpatrick | 295
9 | François Lacombe | 272
10 | Simone Girardelli | 271
11 | Marián Kyral | 271
12 | Christoph Hormann | 257
13 | Tom Ka | 249
14 | Volker Schmidt | 248
15 | Polyglot | 244
16 | majka | 244
17 | Frederik Ramm | 236
18 | Javier Sánchez Portero | 230
19 | John Whelan | 229
20 | Christan Quest | 226
(20 rows)
SELECT
row_number() OVER (ORDER BY message_count DESC) AS rank,
author,
message_count
FROM (
SELECT
from_short AS author,
COUNT(1) AS message_count
FROM mails
WHERE
EXTRACT(YEAR FROM "date") = 2017
AND from_short NOT IN ('notifications@github.com', 'theweekly.osm@gmail.com')
GROUP BY author
ORDER BY message_count DESC
) AS a
LIMIT 20;
rank | author | message_count
------+----------------------------------+---------------
1 | Martin Koppenhoefer | 1598
2 | Philippe Verdy | 581
3 | Marc Marc | 499
4 | Marián Kyral | 481
5 | Warin | 414
6 | Marc Gemis | 372
7 | Giovanni Cascafico | 357
8 | Simone Girardelli | 352
9 | John Whelan | 351
10 | osm.sanspourriel | 296
11 | Christian Quest | 291
12 | Andy Townsend | 248
13 | Volker Schmidt | 235
14 | LogicalViolinist/James2432 | 233
15 | Polyglot | 229
16 | Frederik Ramm | 225
17 | Dave F | 222
18 | Alessandro Palmas | 221
19 | Joost Schouppe | 210
20 | Paul Johnson | 210
(20 rows)
SELECT
row_number() OVER (ORDER BY message_count DESC) AS rank,
author,
message_count
FROM (
SELECT
from_short AS author,
COUNT(1) AS message_count
FROM mails
WHERE
EXTRACT(YEAR FROM "date") = 2016
AND from_short NOT IN ('notifications@github.com', 'theweekly.osm@gmail.com')
GROUP BY author
ORDER BY message_count DESC
) AS a
LIMIT 20;
rank | author | message_count
------+----------------------------------+---------------
1 | Martin Koppenhoefer | 1781
2 | Philippe Verdy | 506
3 | Marián Kyral | 431
4 | Simone Girardelli | 394
5 | Marc Gemis | 385
6 | John Whelan | 347
7 | osm.sanspourriel | 319
8 | Christan Quest | 315
9 | Giovanni Cascafico | 306
10 | Joost Schouppe | 302
11 | Warin | 267
12 | Polyglot | 245
13 | Luca Delucchi | 240
14 | Frederico Cortese | 224
15 | Maurizio Napolitano | 223
16 | Frederik Ramm | 223
17 | Andrea Lattmann | 222
18 | LogicalViolinist/James2432 | 220
19 | Colin Smale | 215
20 | Andy Townsend | 211
(20 rows)
I converted the email addresses into names manually by entering all email addresses into Google. While copying the results, I noticed that among the top 20 posters since 2016 only a few seem to have English as their native language. There are a lot of people from the French, Italian and Czech community. This is no surprise if you look for the most active mailing lists (see table above).
The OSMF-Talk mailing list was not taken into account for all these numbers (and it was not taken into account for the numbers published in part 1).
This list of the most active posters measured in messages per year will follow in the next part of this series. If you have any other suggestions what to query, don’t hesitate to comment.
I would like to ask all who intend to write a comment to stay on topic and don’t repeat themselves.
Discussion
Comment from amapanda ᚛ᚐᚋᚐᚅᚇᚐ᚜ 🏳️🌈 on 5 December 2018 at 17:08
OSM has (I presume) been growing for years, with more and more mappers and editors. I wonder why so many mailing lists show large declines in messages in recent years.
Comment from mmd on 5 December 2018 at 18:04
Pretty obvious I would say. Mailing lists are 1870’s technology, and lots of people use different communication channels today that are more familiar and convenient to them. I for one find it quite annoying that I cannot edit my mailing list posts after sending. If there’s one thing those figure are not indicative for, is that the overall osm project related traffic is declining. It’s simply spread across so many more channels.
Comment from Nakaner on 5 December 2018 at 18:30
@mmd I am happy to share the PostgreSQL dump and the scripts with you if you have scraped the forum, Reddit (does it have an API?), Facebook or whatever and want to do a combined analysis.
Comment from mmd on 5 December 2018 at 18:58
Looking at the huge number of communication channels on https://wiki.openstreetmap.org/wiki/Contact_channels and all related sites mentioned there, I think that would be quite a massive task.
Unlike the mailing list archive with its single server and uniform access, each platform has some different kind of API, if at all. Some platforms might require you to sign up first (e.g. Slack), and scraping websites might violate their ToUs.
Lots of project related communication also happens on various issue tracker (Github, Trac, etc.), and keeping track of them isn’t exactly trivial.
Also, people sometimes use different nicknames on different platforms, which makes it a bit more difficult to link profiles.
To sum up, we’re probably looking at hundreds if not thousands of different channels that would need to be covered for a somewhat comprehensive view. And all that keeps changing over time.
Comment from TheDutchMan13 on 8 December 2018 at 02:25
Thanks for posting this!