watmildon's Diary

Recent diary entries

Using the JOSM Conflation plugin to add 1500 addresses in 10 minutes

Posted by watmildon on 24 April 2023 in English. Last updated on 20 March 2024.

UPDATE

Newer versions of JOSM have changed the behavior of Replace Geometry, which impacts all usage of the conflation plugin. See the JOSM ticket for details and a workaround.

The setup

This diary is a follow-on to my previous entry, Adding addresses with JOSM and MapWithAI. You’ll need all of the setup there plus the Conflation Plugin.

Finding a good area

The key to doing this quickly is to find an area with:

  • High quality address data with good spatial positioning
  • High density of building outlines to act as targets for the address data
  • Extremely low current address density (resolving conflicts is important but does slow you down!)

The area I’ve been spending most of my time on recently is Phoenix, AZ, which has great address data from the [National Address Database] and a high level of building coverage. The suburbs are also quite sprawling, which means you can cover a lot of very regular ground very quickly. Here’s a good candidate for rapid addition:

Aerial image of a suburb of Phoenix. The houses are laid out very regularly

Using the same setup from the other diary entry, pull in the address data. Looks great so far:

Same aerial image of a suburb of Phoenix but now with dots representing address nodes. The houses and nodes match very well.

Let’s get conflating

The conflation plugin is extremely powerful. It can be used to ask “which things in set A are closely matched with things in set B”. I have described how to use it to find unmapped hospitals but this time we’ll be using it to match address nodes with building outlines.

The first set of things we need to select are the address nodes we want to add. These will be in the conflation tool as the “Reference”.

  • Activate the MapWithAI layer that has the address nodes using the Layers pane right click > Activate
  • Select the relevant set of nodes (this may be all the nodes if your download area is precise)
  • Click the “Configure” button on the conflation tool pane
  • Click the upper “Freeze” button on the “Configure conflation settings” pane

Next we need to get the building outlines into the conflation tool as the “Subject”.

  • Activate the layer with the OSM data using the Layers pane right click > Activate
  • Open the find element dialog Ctrl+F
  • Search for all building elements that are ways building=* type:way
  • Click the lower “Freeze” button

Screen shot of the configuration pane for the conflation plugin showing that the reference and subject have been set

Hit “Generate match…” to set the tool to work. It will relatively quickly generate a list of matched and unmatched items. This set has 23 unmatched items listed in the “Reference only” tab. Let’s go take a look.

conflation tool dialog. a list of matched and unmatched items

Unmatched items

Each unmatched item will need to be reviewed to determine what needs to be done. There’s lots of ways things can go unmatched but I’ll outline the most common here.

Example 1: An ambiguous area.

In this case the aerial imagery shows an incomplete construction site with no real match to the address data. It’s not clear at all that any of this is useful, so we delete the nodes and move on. Remember, we don’t have to add everything but we really, really want to avoid adding incorrect data.

Aerial view of a large construction site, mostly still brownfield, with address nodes scattered across it

Example 2: Features without buildings.

Here are three cases where the address node should be merged into the OSM layer: a house without a building object, a bit of utilities, and a park.

Aerial view of a house but no OSM building outline

Aerial view of a fenced in utility area

Aerial view of a park

Example 3: An extra address node.

No data set is perfect. Sometimes you’ll find address nodes that don’t obviously belong to anything on the ground. Delete them and move on.

Aerial image of a suburban street with an address node in the middle of the street

Example 4: Bad conflation due to misaligned data

Again, we have a data set issue that needs fixing. It’s clear from context that the address data for some houses is shifted a bit to the west. It is best to move the nodes where they belong and merge them by hand.

Aerial image of a suburban street with misaligned address nodes such that one house has no address node associated with it

Example 5: Large building or building that is a relation

Sometimes an address clearly goes with a building but the building shape makes the conflation difficult. Merge it by hand.

Aerial image of a large building with a lone address node nearby but not in the outline

Reviewing matched items

Now we need to review the matched items. Matches with a low “distance” mean that the address node was sitting very near the centroid of a building. Let’s start by reviewing the large distance matches.
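To make the “distance” column concrete, here is a minimal Python sketch of centroid-distance matching. It is not the plugin's code: it assumes planar coordinates, uses a simple vertex average as the centroid, and the function names and the default cutoff of 30 are only illustrative.

```python
import math

def centroid(outline):
    """Average of the outline's vertices (a simplification of a true polygon centroid)."""
    xs = [x for x, _ in outline]
    ys = [y for _, y in outline]
    return sum(xs) / len(xs), sum(ys) / len(ys)

def match_addresses(address_nodes, buildings, cutoff=30.0):
    """Pair each address node with the nearest building centroid within cutoff.

    address_nodes: dict of id -> (x, y)
    buildings: dict of id -> list of (x, y) vertices
    Returns (matches, unmatched), with matches sorted by decreasing distance.
    """
    centroids = {bid: centroid(pts) for bid, pts in buildings.items()}
    matches, unmatched = [], []
    for aid, (ax, ay) in address_nodes.items():
        best = None
        for bid, (cx, cy) in centroids.items():
            d = math.hypot(ax - cx, ay - cy)
            if d < cutoff and (best is None or d < best[1]):
                best = (bid, d)
        if best is None:
            unmatched.append(aid)  # would show up under "Reference only"
        else:
            matches.append((aid, best[0], best[1]))
    matches.sort(key=lambda m: m[2], reverse=True)
    return matches, unmatched
```

In this sketch several address nodes can land on the same building, which is one reason the large-distance matches deserve a manual look.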

Screenshot of the conflation matches sorted by decreasing "distance". The top item has a distance very near our cutoff of 30

Click through them one at a time and see if it looks like a reasonable match. You only need to intervene on any matches that are incorrect. Click through them until the distance is small enough that most items are inside the building outline. For this set that’s something like 15.

The most common mismatch looks like this:

A match where the address node is matched to the neighbor’s shed

Bulk conflation and final review

You’re now ready to hit the big “Conflate” button. This will merge all of the address data into the “matching” building outlines. The area will look very nice.

Zoomed out aerial view of the osm data now with address information

Pan through the area and see if anything looks amiss. When I did this I ended up needing to correct a set of houses where the address data had matched onto some smaller outbuildings and not the main structure. Moving the tags is easy with copy/paste in JOSM.

Aerial view of a cul-de-sac with sheds having address data that should be on the larger building outlines

Upload

Run the validator and fix up anything it notices before upload. Most commonly this means adding a directional or suffix to a roadway name. It’ll probably take a bit for your first few runs but this process takes me about 10 minutes for this chunk of the map.

Let me know if you have any questions or if you found this useful!

Happy mapping.

About HIFLD

OSM power user SherbetS has been documenting the HIFLD dataset. It is a large corpus of public domain licensed geospatial information about infrastructure around the United States.

Get all the data into JOSM

Download the dataset in GeoJSON format here. Fire up JOSM and open the file. This will create a layer that is just the HIFLD data. It won’t have any fields that look like OSM fields so DO NOT upload this directly.

Next we need to get the OSM data into JOSM. For this demo we will use the data for the state of Wyoming.

  • Click the green download button to open the Download dialog.
  • Click the “Download from Overpass API” tab at the top.
  • In the Overpass text box put something like:
// fetch area “Wyoming” to search in
{{geocodeArea:Wyoming}}->.searchArea;
// gather results
(
  // query part for: “amenity=hospital”
  nwr["amenity"="hospital"](area.searchArea);
  // query part for: “amenity=clinic”
  nwr["amenity"="clinic"](area.searchArea);
);
// print results
out body;
>;
out skel qt;
  • Hit the “Download into new layer” button at the bottom
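If you repeat this for other states or tags, the query text can be generated rather than hand-edited. The following Python sketch only builds the string to paste into the Overpass text box; `build_query` is a made-up helper, not part of JOSM or Overpass, and it assumes you only need exact key=value filters.

```python
def build_query(area_name, tag_filters):
    """Return Overpass query text for an area name and a list of (key, value) pairs."""
    lines = ["{{geocodeArea:" + area_name + "}}->.searchArea;", "("]
    for key, value in tag_filters:
        # one nwr (node/way/relation) statement per tag filter
        lines.append(f'  nwr["{key}"="{value}"](area.searchArea);')
    lines += [");", "out body;", ">;", "out skel qt;"]
    return "\n".join(lines)

print(build_query("Wyoming", [("amenity", "hospital"), ("amenity", "clinic")]))
```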

We now have 2 layers. One with the HIFLD data and one with the OSM data.

Finding unmapped items with the conflation plugin

We will now use the Conflation Plugin to match the nodes in the HIFLD dataset with OSM downloaded elements. Any elements that do not match to an OSM item should be reviewed and additions made. Any matched elements may be reviewed for completeness in OSM but that’s a separate matter.

We will start by selecting elements from the HIFLD dataset that are in our state of interest (“WY” in this case) and add this to the Conflation tool as the “Reference”. This is the set we’re trying to match elements to.

  • Make the HIFLD layer active using the Layers pane right click > Activate
  • Open the find element dialog Ctrl+F
  • Search for all elements tagged as being in Wyoming STATE="WY"
  • Click the “Configure” button on the conflation tool pane
  • Click the upper “Freeze” button on the “Configure conflation settings” pane

Now we need to add the OSM data to the “Subject” section of the conflation tool. This is the set of all the things the “Reference” set can match into.

  • Make the OSM layer active using the Layers pane right click > Activate
  • Open the find element dialog Ctrl+F
  • Search for all elements tagged as being in a hospital or clinic amenity=hospital OR amenity=clinic
  • Click the lower “Freeze” button

Because hospital grounds are typically large I changed the “Distance” parameter in the conflation settings to be “Centroid < 150”. This may need to be tuned based on variation.

Hit “Generate match…” to set the tool to work. It will relatively quickly generate a list of matched and unmatched items.

conflation tool dialog. a list of matched items with the names of various hospitals

conflation tool dialog. a list of unmatched nodes from the HIFLD dataset

Go through the results

All of the matched items may be reviewed for accuracy and for any interesting HIFLD data worth adding to OSM (address information, helipad availability, etc.). However, the most interesting items will be the unmatched ones, which indicate that a health facility may be missing from OSM. Here are a few from this search:

An aerial view of a large building with a red dot showing the location of the HIFLD node. No OSM data present

An aerial view of a large building with a red dot showing the location of the HIFLD node. No OSM data present

An aerial view of a large building with a red dot showing the location of the HIFLD node. A building outline is present from OSM data

Conclusion

You can now go and map any missing facility as per the usual OSM guidelines.

SherbetS has posted a walkthrough of how they adjust this kind of file to match OSM tagging guidelines. It should help speed up the import process and is worth a read.

Fixing tiny fonts in JOSM on Windows

Posted by watmildon on 14 March 2023 in English.

Act now to save from eye strain!

Did your JOSM UI text become a huge bother to read? Is it too tiny? Are the icons not as you’d expect? You may be suffering from a case of Too High DPI! It’s more common than you think!

Thankfully the medicine is easy enough to apply at home. You’ll need to find and edit the properties of the JOSM.exe file. It’ll be at a path like C:\Users\admin\AppData\Local\JOSM (you’ll need to change “admin” to whatever your user account is). Right click the exe > properties > Change high DPI settings > check “Use this setting to fix scaling problems…”

A screenshot of dialog boxes showing the workflow to enable high DPI correction on Windows

Using Google to improve OSM

Posted by watmildon on 12 March 2023 in English. Last updated on 19 March 2023.

The header from an incoming email from Google Alerts

Google alerts

In the fall of 2022, there was a notice issued renaming many (~600) natural features in the United States. This was very likely to initiate a cascade of renames for various map features. For example, the name for the road up the mountain peak will likely change to match the new name.

At the tail end of our renaming work, I set up a few Google Alerts related to the theme of the order and promptly forgot about them.

Map updates!

Starting a few months ago the alerts began firing! Each week I would get an email or two about town councils or other agencies planning to rename some feature or another.

Another one came into my inbox this morning announcing that a resort near a renamed waterway was updating its name. A bit of checking the resort website to confirm the new name, a few clicks in iD, et voila!

It will certainly be interesting to see whether the roadway, golf course, etc. get similar updates and I’ve made a note to myself to check back in a few months.

Try it out

Even something as simple as “road rename " should flag local news stories about things needing updates in OSM. I encourage you to try it out and let me know what changes/updates/notices it turns up in your area.

A list of changes from these alerts

United States - https://www.openstreetmap.org/changeset/133569531

Canada - https://www.openstreetmap.org/changeset/133845472

The idea

This entry will describe one application of our GNIS matching program using a very narrow slice of the dataset.

While looking through the Populated Place feature class, I noticed a rather substantial number of names for mobile home parks. I was immediately reminded of the MapSwipe effort in coordination with YouthMappers and the ASU Knowledge Exchange for Resilience.

The full GNIS file has ~7000 entries that look like they will very likely be the location and name of a mobile home park. Running the GNIS Matcher against the ones in Arizona gives 516 places to check the map for either tagging improvement or new geometry.

You can check out the project here.

How to map

There is an amazing guide produced by ASU that has a huge amount of detail about mapping mobile homes. Definitely look it over.

Tagging

An area drawn to represent a mobile home park should have tagging like:

landuse=residential
residential=trailer_park
name=Estrella Villa Mobile Home Park

There is occasionally some confusion between this and a similar tag tourism=caravan_site. This tagging may also show up in places with trailers/RVs but the intent is for that to mark temporary stopping locations, similar to tourism=camp_site.

The matcher would also like a GNIS ID included. The appropriate tagging looks like:

gnis:feature_id=2669824
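A quick way to sanity-check an area's tags before marking a task as fixed is sketched below in Python. The `check_park_tags` helper and its messages are illustrative only and are not part of the GNIS matcher; the sketch also assumes a single numeric GNIS ID per object.

```python
def check_park_tags(tags):
    """Return a list of problems for an area meant to be a mobile home park."""
    problems = []
    expected = {"landuse": "residential", "residential": "trailer_park"}
    for key, value in expected.items():
        if tags.get(key) != value:
            problems.append(f"expected {key}={value}")
    if not tags.get("name"):
        problems.append("missing name")
    if not tags.get("gnis:feature_id", "").isdigit():
        problems.append("missing or non-numeric gnis:feature_id")
    if tags.get("tourism") == "caravan_site":
        # caravan_site marks temporary stopping locations, not residences
        problems.append("tourism=caravan_site is for temporary stays")
    return problems
```

An empty list means the tags match the pattern described above.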

Drawing areas

In general, try to draw a boundary that encompasses all of the residences. There are often clues in aerial imagery that help delineate the specific park from other areas (guide). Look for fences, pavement differences, and road connectedness to guide your judgement. If there is high ambiguity, it may be helpful to leave a Note for local mappers.

Street side imagery from Mapillary or Bing Streetside is often helpful and has the appropriate license for OSM use.

Examples

No area found, needs mapping

Here are three pictures from tasks showing GNIS data points that the matcher has not found an appropriate OSM object for. The correct thing is to add an area with appropriate tags as described and mark the task as “I fixed it!”. The matcher is somewhat conservative so it may have missed the correct area. You should feel empowered to update an appropriate landuse=residential area if one already exists.

Aerial image showing a blue pin surrounded by what appear to be mobile homes

Aerial image showing a blue pin surrounded by what appear to be mobile homes

Aerial image showing a blue pin surrounded by what appear to be mobile homes

No area found, does not need mapping

The GNIS entries are not maintained and may point to areas that are no longer mobile home parks. In that case mark the task as “Not an issue”. Here’s an example:

Aerial image showing a blue pin surrounded by what appear to be larger office or residential structures

Likely area found, needs tag updates

Some mobile home parks have already been mapped but may be missing some of the necessary tagging to keep the matcher happy. Check that the area is correctly drawn and add/modify the tags as appropriate and suggested by the matcher. Here are two:

Aerial image showing a blue outline around what appears to be the boundary of a mobile home park

Aerial image showing a blue outline around what appears to be the boundary of a mobile home park

Area found but unlikely the correct area

While we try to keep false positives to a minimum, sometimes the matcher will make a guess that is incorrect. In that case, either map in the correct place or mark the task as “Too hard”. In the example below, the matcher has selected the incorrect area. It turns out this was because the correct area to the west was marked incorrectly as leisure=caravan_site.

Aerial image showing a blue outline around what does not appear to be the boundary of a mobile home park. A clear mobile home park is visible to the west of the blue outline

Next steps

If you have any questions or would like to have a project created for another set of GNIS information please let us know!

Improving the quality of OSM using the GNIS data set

Posted by watmildon on 2 March 2023 in English. Last updated on 29 May 2023.

Some background

If you’re new to this wonderful dataset I encourage you to read the OSM GNIS wiki entry or play around with the public GNIS web portal. Try feature ID 1629903!

The most important bits from a summary record are the feature Name, Class, and Coordinates.

Last fall, a group of mappers coordinated to address the name updates for Department of Interior Secretarial Order 3404. We cobbled together a collection of data scrapers and spreadsheets and declared victory a few weeks later. Since then, Kai and I have been thinking “there has to be a better way”.

What can the matcher do?

For any entry in the GNIS National File we can ask the matcher to search OSM. It does this using a private Overpass instance and a set of heuristics about likely tag combination and feature types. With reasonably high fidelity we can:

  • Find GNIS entries that are very likely not yet added to OSM
  • Find OSM objects that look incomplete or incorrect
    • Missing GNIS tag
    • Missing/different name from GNIS
    • Have a geographic bounds that does not agree with GNIS
  • Generate MapRoulette challenges for subsets of the full GNIS dataset

What does it look like in practice?

We’ve generated a handful of MapRoulette challenges to check the functionality and see how it improves mapping. Here are some examples that show off different feature types:

Sq___ Rename Validation - Final cleanup of rivers and other natural features listed in Department of Interior Secretarial Order 3404. It’s very good at noticing if a named waterway is mapped up the wrong tributary.

King County WA Place Update and San Diego County CA Place Update - Urban neighborhood names for the most part. This kind of task really does require local knowledge as the geo information for this class isn’t super precise.

Add and update Mobile Home Parks in AZ - Also from the “place” feature class but for an under-mapped feature type in OSM. This one ties nicely into a project run by MapSwipe and YouthMappers.

What’s next?

We would love any and all questions/feedback you have! If there’s a feature class you’re super interested in, let us know. If there’s a particular area/region you’d like to check all the GNIS entries for, let us know. If mountains are your jam and you want to map the 10% of “Summit” entries that aren’t on OSM, let us know!

PS, if you like this idea… you will LOVE Edward Betts’ OSM Wikidata matcher!

OSM US Board Candidate Statement 2023

Posted by watmildon on 6 February 2023 in English. Last updated on 8 February 2023.

About me

A photograph of the mountains near Gold Bar, Washington. A range of rocky peaks are in the distance with nearer hills covered in green trees. A river winds its way through the middle of the photo

I started mapping in June 2020 as a way to find parks and trails near my home in Redmond, Washington. My daughter loves adventures and provides huge motivation. I am a software engineer by trade and from that bring my passion for great tools. I love being able to work with others on a project that is expansive, vivid, and important.

My mapping

A map of the United States with a heatmap overlay. It shows activity in almost every state with heavy concentrations in the areas of Seattle, Milwaukee and Indianapolis

I have found mapping to be rewarding. My profile lists the projects that I’m most proud of. Most recently, I completed a review and addition of 150,000 addresses to fill out the greater Indianapolis area.

Much of my mapping is done in direct collaboration with other mappers. Organizing on the OSM US Slack instance and taking advantage of the OSM US Tasking Manager has been a huge boon for me. I know my contribution has been much greater and of better quality because of these resources. I’m quite active in the various OSM chat channels, so please say hello!

I’ve given back to the community by writing diary entries about leveraging the wide range of OSM tooling to multiply your impact. Either by better visualizing areas that need work, workflows to do work faster, or using old data in new ways.

Goals as a board member

Great organizing makes contributing easier and more engaging. The OSM US Slack, message board, Task Manager, etc. are great resources and their maintenance, curation, and expansion is critical. You’re much more likely to have a mapper add their 1000th change if they’re helped and encouraged along the way.

I believe the most successful map is the one we ALL build together. Harmonizing competing interests and misunderstandings is definitely a challenge, but one that is worthy of our attention. The board cannot solve all issues and disputes but they are in a unique position to listen and coordinate.

The map is great but the map tomorrow will be even better. There’s a lot to love about what has been built here but we should continue to look for ways to accelerate our priority of “maximizing the impact and value of OpenStreetMap and our community”. We must identify places where the map is not meeting this goal, and more importantly, the specific things OSM US can facilitate to unblock and encourage our passionate mapping community.

Some projects I will be working on this year to achieve a better map:

  • Continue adding data from the National Address Database in areas that have low address density
  • Find and curate novel datasets to improve the POI density
  • The cleanup and harmonization of tags from old imports of giant datasets (GNIS, TIGER, NHD)

Most of all, I love to collaborate with and support the goals of other mappers. I’m still relatively new to the community which means I have lots to learn and lots of folks to learn from. The board exists to help YOU so I expect to do a lot of listening!

A JOSM screenshot of the road network for Rifle Colorado

The problem with remote TIGER review (and solution!)

I recently did a major road alignment update for Rifle, Colorado. With JOSM and the to-do plugin, it only took a few hours and the road data is much improved. Doing the geometry check was time consuming but easy. Checking that the names are all correct? Isn’t that impossible from my desk? Thankfully no!

There are two kinds of name checking that I like to do. The first is to check that they are “sensible”, meaning that they have been expanded and there’s no obvious vandalism. The next is that they are “correct”. Because most road names originated from the TIGER import we shouldn’t use it to cross check for “correctness”. The good news is that the National Address Database has data for huge chunks of the country and provides us with another source to get a sense of how well the road network is named.

The workflow

Requirements: JOSM, the MapWithAI plugin, and an area that has NAD data available. I also recommend the AddressValidator paint style.

  1. Download an area you want to review
  2. Work through the geometry updates however you wish
  3. Use the MWAI plugin to pull down the set of address data for your area.
  4. Merge nodes into the OSM layer
  5. Run the validator.
  6. Review the MWAI “Addresses are not nearby a matching road” warnings.
  7. Delete any unreviewed MWAI data. (Find and delete: type:node AND new AND "addr:housenumber")

If you’d like to keep the addresses, good news! I’ve written a diary entry about working through that data.

Some common errors caught this way

Usually the roadway names match very well so this isn’t a totally onerous task. However, there are a few classes of errors that seem pretty common:

  1. A typo in the road name (ex: a road name that is close to a different common word and been manually entered incorrectly)
  2. Unexpanded road name (ex: E Diary Entry St -> East Diary Entry Street)
  3. A road that needs to be split because it actually changes names but isn’t reflected in OSM (most commonly a user extending an existing road without knowing)
  4. Missing/wrong directionality or quadrant (ex: 103 Street -> East 103 Street)

Of those, the 4th one seems somewhat optional if the rest of the roads locally do not have directionality markers. For my local area, I updated directionality but there’s some disagreement about what’s “correct”. Up to you.
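Error classes 2 and 4 are mechanical enough to sketch in code. The Python below is only an illustration of the comparison, not what the JOSM validator does; the abbreviation tables are short, made-up samples and the function names are hypothetical.

```python
# Illustrative, incomplete abbreviation tables (assumptions, not an official list).
DIRECTIONS = {"N": "North", "S": "South", "E": "East", "W": "West"}
SUFFIXES = {"St": "Street", "Ave": "Avenue", "Rd": "Road", "Dr": "Drive"}

def expand(name):
    """Expand a leading directional and a trailing suffix abbreviation."""
    words = name.split()
    if len(words) > 1 and words[0] in DIRECTIONS:
        words[0] = DIRECTIONS[words[0]]
    if len(words) > 1 and words[-1] in SUFFIXES:
        words[-1] = SUFFIXES[words[-1]]
    return " ".join(words)

def classify(addr_street, road_name):
    """Compare an addr:street value with the name of a nearby road."""
    if addr_street == road_name:
        return "match"
    if expand(road_name) == addr_street:
        return "unexpanded road name"
    a, r = addr_street.split(), road_name.split()
    if a[1:] == r and a[0] in DIRECTIONS.values():
        return "missing directional"
    return "review by hand"
```

Typos and roads that need splitting still fall through to “review by hand”, which matches how much judgement those cases need.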

Adding addresses with JOSM and MapWithAI

Posted by watmildon on 22 January 2023 in English. Last updated on 5 February 2023.

A computer screenshot of JOSM showing OSM building outlines and NAD address points shown for several houses with objects needing address data highlighted in red

What now?

In my previous diary entry I demonstrated using a Tableau visualization to find areas of the United States that could benefit from additional address data. If you find some place you’d like to work on, now what?

Getting set up (social)

OSM is a community project. It’s important to make sure mappers know what’s going on and it’s always a good thing to give a heads up about plans for any big data changes in an area. Because this dataset is US centric I’ve been using the OSMUS Slack to keep people up to date about what’s happening. Each state has a “local” channel where you can get feedback and find folks to work with.

Always remember, not everything in the dataset needs to be added to OSM. It doesn’t cost anything to leave stuff out but can be quite time consuming to clean up if not done well.

Getting set up (tools)

Here’s what you’re going to need

  1. Install JOSM
  2. Add the MapWithAI plugin
  3. In the MWAI plugin preferences activate the United States Addresses URL and deactivate any other endpoints (buildings, roads etc).
  4. Add an imagery layer to your view. I used Bing aerial exclusively.

Optionally you may wish to modify the “Address Tag Validator” Map Paint Style by adding:

/* highlight buildings without address */
area[building][!addr:street] {fill-color: #FF0000; color: #FF0000; fill-opacity:1.0}
area[building][!addr:housenumber] {fill-color: #FF0000; color: #FF0000; fill-opacity:1.0}

For some areas, it may be helpful to adjust the Advanced Preference “MapWithAI.duplicatenodedistance”. I found that 15 was helpful for reducing the headache of cleaning up duplicate entries for various apartment complexes and other dense areas.

Adding addresses

With all that in place you are now ready. Open the JOSM download dialog, ensure that the datasource checkboxes for OSM and MWAI are checked, navigate the slippy map and download. Et voila! You should now have a view similar to the image at the start.

The MWAI plugin is now ready to do almost all the heavy lifting for us.

The easy parts

For any building highlighted in red, select all address points that are inside the boundary of the building and hit Shift+A. If it’s moving a single node, this will move the tag data to the OSM layer and place it onto the building. For two or more nodes contained in the same boundary, it will simply move them all to the OSM layer.

If there are any addresses on top of buildings that are clearly visible in the imagery, move those to the upload layer with Shift+A.

If there’s an address node for a building that already has address info, delete it from the MWAI layer.

If there’s an address with no discernable building, delete it from the MWAI layer.

The harder parts

Some areas will have address data directly on top of where the buildings are. However, you will also find areas that have nodes needing to be moved. This can be a bit tedious but most of the time it’s still obvious where the address is meant to be.

An address node directly between two houses with no clear indication of which it belongs to

In rare circumstances you will find a truly ambiguous data point. Which house does this belong to? Where did the other house’s address go? Neither is knowable from NAD and imagery alone. You have two options:

  1. Sort it out from appropriate street level imagery
  2. Delete the node from your MWAI layer and let a local mapper sort it out

A building outline with 3 address nodes, two of which are the same

You will absolutely find duplicate address nodes in the National Address Database. The most common, for me, looked like the above. A duplex with two addresses with one of the addresses duplicated into the building center. For this case, you need to delete the extra address in the building center before moving the other two address nodes.

Three buildings with addresses overlaid.

The top building already has address info and one address node from NAD. It can safely be discarded. The bottom building has no address info and two addresses from NAD. Selecting both and using Shift+A will do exactly what we need. However, the middle building has SOME address info tagged on it but there are two addresses from NAD. Because buildings with more than one address (at least in the US) have addressing as nodes within the building outline, the tags on the building will need to be removed. For this building:

  1. Activate the OSM layer
  2. Select any buildings that have address data but more than one NAD node
  3. Delete the data tags from the building
  4. Activate the MWAI layer
  5. Select the address nodes and merge into the OSM layer with Shift+A

Validation and upload

Once you’ve added enough data for one upload hit the upload button and run the JOSM validation suite. You may wish to fix warnings and errors related to buildings you’ve touched. The most common was overlapping geometry. However, you MUST fix things related to the actual data you are adding. The most common issue for that is “Duplicate address”. Clean them up as you think sensible, with a bias toward removing your addition.

You may find that some of the roadways in OSM do not have street names that match the data in addr:street from NAD. Most commonly this is because the roadway name in OSM omits the directional. However, you will also discover cases where the roadway or NAD has a typo, an unexpanded abbreviation, or something in complete disagreement. I chose to fix typos and abbreviations, but leave most directionality as it was. In cases where it was quite jumbled I left a note for a local mapper to help get things sorted. Occasionally, street side imagery was useful.

Now that the validator is happy hit upload again. Add a nice descriptive comment like “Adding addresses into AreaOf CityName from the NAD someCoolHashTag”. Add “esri;National Address Database” to your sources list. Hit upload.

Congrats!

You’ve added some valuable information to the database! Wooooo! Did you find something interesting? Have other questions? Definitely let me know.

I’m hoping to have a few more posts in the near future. Some potential topics:

  1. More odd and ambiguous NAD situations and what to do.
  2. How to set up the Task Manager to share work with other folks.
  3. Using other tools to evaluate your additions and do some Quality Assurance.
  4. How to generate fancy images to show off your hard work.

Finding areas where OSM is low in address data density

Posted by watmildon on 9 December 2022 in English. Last updated on 28 March 2024.

Update: Check out my comment below about the state of the art being greatly improved!

Northeast US address density

The idea

After finishing my first address import I was looking for a good view of “where else needs addresses”. One trick that always pays dividends for me is to look at distributions of ratios of various quantities. In particular, I presumed there should be a pretty smooth distribution to the ratio of number of addresses vs number of buildings in any given area.

Building it

  • osmconvert to flatten the extract into a CSV of nodes:
    osmconvert alabama-latest.osm.pbf -o=alabama-nodes.csv --max-objects=50000000 --all-to-nodes --csv="@lat @lon addr:housenumber building"
  • A bit of C# to do the binning (I’m sure QGIS and other tools are great for this but you use what you know)
  • Tableau Public for the viz generation
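For anyone who would rather not write the binning in C#, here is a rough Python sketch of the same idea. It is not my original code. It assumes the CSV has four columns in the order given to --csv (lat, lon, addr:housenumber, building), is tab-separated and has no header row, which I believe is the osmconvert default; change the delimiter if yours differs. The function names are made up for illustration.

```python
import csv
import io
import math
from collections import defaultdict

BIN_SIZE = 0.01  # degrees on a side


def bin_key(lat: float, lon: float, size: float = BIN_SIZE):
    """Integer (row, column) index of the bin containing the point."""
    return (math.floor(lat / size), math.floor(lon / size))


def address_density(rows, size: float = BIN_SIZE):
    """Map each bin to (building count, housenumber count / building count).

    Bins without any buildings are dropped because the ratio is undefined.
    """
    buildings = defaultdict(int)
    addresses = defaultdict(int)
    for lat, lon, housenumber, building in rows:
        key = bin_key(float(lat), float(lon), size)
        if building:
            buildings[key] += 1
        if housenumber:
            addresses[key] += 1
    return {
        key: (count, addresses[key] / count)
        for key, count in buildings.items()
    }


def read_rows(text: str, delimiter: str = "\t"):
    """Parse CSV text into 4-column rows, padding short rows with blanks."""
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        if row:
            yield (row + [""] * 4)[:4]
```

Feeding the output of `address_density(read_rows(open("alabama-nodes.csv").read()))` to any plotting tool gives the same two quantities used in the viz: mark size from the building count and color from the ratio.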

The results

Milwaukee address density map

So what are we looking at? Here is the data for the area around Milwaukee, WI. Each data mark is a lat/long bin 0.01 degree on a side. The size of each mark indicates the number of OSM building objects in the bin, and the color is the ratio (number of addr:housenumber tags on objects) / (number of buildings). Bigger boxes mean more buildings. Darker means comparatively better address density.

It’s nice to see that the viz immediately shows the address import work we recently completed. So what does an area that needs work look like?

Indianapolis address density

Welcome to Indianapolis! Lighter areas with large mark size mean there are lots of unaddressed buildings. It’s particularly surprising to me that the core of the city looks quite under-addressed. A great candidate to spend a few hours contributing.

You can look around the data yourself here

Happy mapping.

Some notes

  • Some of the larger states (CA/NY etc) will have incomplete data as osmconvert aborted while processing. I presume the data is a reasonable representation of reality but have done no work to back that up.
  • I have only processed the lower United States but could generate this for your locality relatively easily. Let me know!
  • The scrolling performance is pretty bad due to the number of data points but the search in the upper left works great.

The Goal

While working through some edits in Indonesia I noticed objects with the key “nama”. A quick search revealed that this is Indonesian for “name” and that the objects can very likely be modified to use the standard English key. I wondered, how common is this? Is it easy enough to track down?

The Plan

As a test run I picked 4 tags: name, building, source, type. These show up in TagInfo in abundance and I’m sure there are lots of other good candidates.

The next step is to get usable translations. It turns out Google Sheets has a GOOGLETRANSLATE function that takes a word and returns its translation into a given language. I pulled in the two-letter language code list and built my sheet. After eliminating all languages that Google Translate didn’t support and all languages with non-Latin characters I was left with ~80 languages to check.

The last step was to pull usage information. Fortunately for me TagInfo has an exceptionally well documented REST API. Fifty lines of C# later and I had my results.
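The lookup itself is small enough to sketch in Python. This is not the C# I ran; it is a minimal illustration that assumes the TagInfo `/api/4/key/stats` endpoint and a response whose `data` list contains an entry with `type` equal to `all` holding the overall `count`. Check the TagInfo API documentation before relying on that shape. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; see the TagInfo API docs for the authoritative list.
TAGINFO_KEY_STATS = "https://taginfo.openstreetmap.org/api/4/key/stats"


def stats_url(key: str) -> str:
    """Build the request URL, percent-encoding non-ASCII keys like 'név'."""
    return f"{TAGINFO_KEY_STATS}?{urlencode({'key': key})}"


def total_count(payload: dict) -> int:
    """Pull the count for all object types out of a parsed response."""
    for entry in payload.get("data", []):
        if entry.get("type") == "all":
            return entry.get("count", 0)
    return 0


def fetch_count(key: str) -> int:
    """Ask TagInfo how many objects use the given key (needs network)."""
    with urlopen(stats_url(key), timeout=30) as response:
        return total_count(json.load(response))
```

Looping `fetch_count` over the translated words for each tag, and keeping only the non-zero results, gives a list like the one in the results below. Please be gentle with the server and pause between requests.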

The Results

Clicking through a few of these in TagInfo reveals some more likely candidates for cleanup.

name 92528930
nome 166
Name 7
Nom 62
Nome 31
non 6
név 1
nama 133
nombre 207
   
building 537924316
bangunan 4
Bangunan 1
budynek 2
   
source 242170152
bron 13
fonte 16
Source 66914
fuente 57
kaynak 8
   
type 10603620
tip 382
tipo 375
typ 65
Typ 6
genus 902468
tipas 2
Type 283
tur 8

Correcting addr:housenumber in the name field

Posted by watmildon on 19 November 2022 in English. Last updated on 20 November 2022.

The Issue

A common tagging mistake that I’ve encountered a few times is putting the addr:housenumber into the name field. Data of this type tends to be old so it seems the modern tools do a better job keeping this kind of edit from happening. However, when you do find this issue there’s usually a lot of objects to clean up which can be a bother without the right tools.

Finding objects

Here’s the overpass query I’ve used a few times that provides a reasonable starting place:

/*
This has been generated by the overpass-turbo wizard.
The original search was:
“"name"~"^[0-9]" and "building"”
*/
[out:json][timeout:25];
// gather results
(
  // query part for: “name~/^[0-9]/ and building”
  way["name"~"^[0-9]"]["building"]({{bbox}});
  relation["name"~"^[0-9]"]["building"]({{bbox}});
);
// print results
out body;
>;
out skel qt;

Update! User marczoutendijk from the Discord server has provided this overpass query that does a great job finding more instances of this with low noise:

(
  node["name"~"^[0-9]+$"]["addr:housenumber"~"^[0-9]+$"]["brand"!~"."]["shop"!~"."]["amenity"!~"."]["highway"!~"."];
);

The Cleanup

The easiest workflow I’ve found is to load the area into JOSM and run a search with some more specificity. Here’s the search for any building way that has a name tag with numbers at the beginning and doesn’t have an addr:housenumber tag:

name~"^[0-9]+" and building=* and type:way and -"addr:housenumber"

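Or for names that are up to 3 digits and nothing more (ex: 7, 25, 123); note the trailing $, without which the pattern also matches longer names that merely start with digits:

name~"^[0-9]{1,3}$" and building=* and type:way and -"addr:housenumber"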

Or a name that represents a range of housenumbers (ex: 12-16):

name~"^[0-9]+.[0-9]+" and building=* and type:way and -"addr:housenumber"

Once you have the right set of objects selected, the tag editor will let you rename the “name” tag to “addr:housenumber”. I like to do a spot check of values in the name tag to make sure there’s nothing surprising and then the changes can be uploaded.
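If the selection is large, the spot check can be partly automated outside JOSM. The sketch below is one hypothetical way to do it in Python: it sorts name values into plain numbers, hyphenated ranges, and everything else that starts with a digit and so needs a human look. The patterns are my own stricter variants (fully anchored, with a literal hyphen for ranges), not the exact JOSM searches above, and the category names are made up.

```python
import re
from collections import Counter

# Stricter, fully anchored variants of the searches above (assumptions).
PURE_NUMBER = re.compile(r"^[0-9]+$")          # e.g. "123"
NUMBER_RANGE = re.compile(r"^[0-9]+-[0-9]+$")  # e.g. "12-16"
LEADING_DIGIT = re.compile(r"^[0-9]")          # e.g. "1st Bank"


def classify_name(name: str) -> str:
    """Label a name value so the surprising ones are easy to find."""
    if PURE_NUMBER.search(name):
        return "number"
    if NUMBER_RANGE.search(name):
        return "range"
    if LEADING_DIGIT.search(name):
        return "review"  # starts with a digit but is not a bare housenumber
    return "other"


def summarize(names) -> Counter:
    """Count how many name values fall into each category."""
    return Counter(classify_name(name) for name in names)
```

Anything that lands in "review" is probably a real name that happens to start with a number and should be left out of the rename.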