OpenStreetMap

The Size of TIGER

Posted by rivermont on 26 October 2018 in English. Last updated on 29 October 2019.

There is a LOT of TIGER data, most of it still not even glanced at. And each TIGER road comes with a bunch of metadata tags.

way 16543325

Taginfo has the following statistics on common TIGER tags (as of Oct 6 2018):

  • 13,078,000 tiger:cfcc
  • 12,871,000 tiger:county
  • 11,874,000 tiger:reviewed (98% no)
  • 8,021,000 tiger:name_base
  • 6,880,000 tiger:name_type
  • 4,700,000 tiger:tlid
  • 4,700,000 tiger:source
  • 4,020,000 tiger:upload_uuid
  • 4,000,000 tiger:zip_left
  • 3,600,000 tiger:zip_right
  • 3,250,000 tiger:separated (99% no)
  • 1,275,000 tiger:name_direction_prefix
  • 1,127,140 tiger:name_base_1
  • 450,000 tiger:name_direction_suffix
  • 370,000 tiger:name_type_1
  • ~1,020,000 other tags with >20,000 usage

In the OSM XML format, each tag is structured like so:

<tag k="KEY" v="VALUE"/>

where KEY and VALUE are the tag's key and value. For example, a simple highway tagged with highway=residential + name=Cole Mill Road + surface=asphalt is:

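```
<way>
  <tag k="highway" v="residential"/>
  <tag k="name" v="Cole Mill Road"/>
  <tag k="surface" v="asphalt"/>
</way>
```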

Excluding all other metadata and node references needed to make an actual way, this comes out to 119 bytes in total, with 34, 34, and 30 bytes for the three tags respectively.
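As a quick sanity check on those numbers, here is a minimal Python sketch that measures the tags as UTF-8 text. The `tag_size` helper is a name made up for this illustration, and the bare `<way>` wrapper with two-space indentation is only an assumption about how the 119-byte total was counted. XML escaping of special characters is ignored.

```python
def tag_size(key: str, value: str) -> int:
    """Bytes used by one <tag .../> element in OSM XML (UTF-8, no indentation or newline)."""
    element = f'<tag k="{key}" v="{value}"/>'
    return len(element.encode("utf-8"))


# The three tags from the example highway.
tags = {"highway": "residential", "name": "Cole Mill Road", "surface": "asphalt"}

sizes = [tag_size(k, v) for k, v in tags.items()]
print(sizes)  # [34, 34, 30]

# Assumed layout: a bare <way> wrapper, two-space indentation,
# one newline after every line except the last.
way = "<way>\n" + "".join(f'  <tag k="{k}" v="{v}"/>\n' for k, v in tags.items()) + "</way>"
print(len(way.encode("utf-8")))  # 119
```

Note that every tag carries 16 bytes of fixed markup (`<tag k="`, `" v="` and `"/>`), so its size is simply 16 plus the lengths of the key and the value.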

Size calculation

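Applying this to all of the above TIGER tags (usage count times the bytes in one tag element, which is 16 bytes of XML markup plus the key and value lengths), total sizes come out as follows.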

  • tiger:cfcc (3-byte value): 380 MB
  • tiger:county (assuming an average value of 12 bytes): 514 MB
  • tiger:reviewed=no: 380 MB
  • tiger:name_base (assuming an average value of 12 bytes): 345 MB
  • tiger:name_type (2-byte value): 227 MB
  • tiger:tlid (values of around 200 bytes): 1.13 GB
  • tiger:source (30-byte value): 272 MB
  • tiger:upload_uuid (51-byte value): 334 MB
  • tiger:zip_left (5-byte value): 140 MB
  • tiger:zip_right (5-byte value): 126 MB
  • tiger:separated=no: 104 MB
  • tiger:name_direction_prefix (1-byte value): 56 MB
  • tiger:name_base_1 (assuming an average value of 13 bytes): 53 MB
  • tiger:name_direction_suffix (assuming an average value of 2 bytes): 20 MB
  • tiger:name_type_1 (2-byte value): 13 MB
  • ~40 MB other tags
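
The totals in this list can be roughly reproduced by multiplying each usage count by the size of one `<tag .../>` element. Below is a sketch in Python under stated assumptions: `TIGER_TAGS`, `tag_bytes` and `estimate_sizes` are names invented for this example, the value lengths are the averages assumed in the list, and tiger:tlid is taken as exactly 200 bytes. The results land close to the figures above but do not match every entry to the megabyte.

```python
# (key, usage count from the Taginfo list, assumed value length in bytes)
TIGER_TAGS = [
    ("tiger:cfcc", 13_078_000, 3),
    ("tiger:county", 12_871_000, 12),
    ("tiger:reviewed", 11_874_000, 2),   # value "no"
    ("tiger:name_base", 8_021_000, 12),
    ("tiger:name_type", 6_880_000, 2),
    ("tiger:tlid", 4_700_000, 200),      # "around 200", taken as exactly 200 here
    ("tiger:source", 4_700_000, 30),
    ("tiger:upload_uuid", 4_020_000, 51),
    ("tiger:zip_left", 4_000_000, 5),
    ("tiger:zip_right", 3_600_000, 5),
    ("tiger:separated", 3_250_000, 2),   # value "no"
    ("tiger:name_direction_prefix", 1_275_000, 1),
    ("tiger:name_base_1", 1_127_140, 13),
    ("tiger:name_direction_suffix", 450_000, 2),
    ("tiger:name_type_1", 370_000, 2),
]


def tag_bytes(key: str, value_len: int) -> int:
    """Bytes for one <tag k="..." v="..."/> element, given the value's length in bytes."""
    empty_value_element = f'<tag k="{key}" v=""/>'
    return len(empty_value_element.encode("utf-8")) + value_len


def estimate_sizes(tags):
    """Map each key to (usage count) x (bytes per tag element)."""
    return {key: count * tag_bytes(key, value_len) for key, count, value_len in tags}


sizes_by_key = estimate_sizes(TIGER_TAGS)
for key, size in sizes_by_key.items():
    print(f"{key:<30}{size / 1e6:>10.1f} MB")

total_bytes = sum(sizes_by_key.values())
print(f"{'total':<30}{total_bytes / 1e9:>10.2f} GB")
```

Under these assumptions the listed tags sum to a little over 4 GB, before counting the ~40 MB of other tags.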

Conclusion

That all adds up to over 4 GB of data, just from extraneous import tags (I’ll bet NHD imports are even larger … oh boy).
With the current planet.osm (uncompressed) sitting at around 960 GB, all these TIGER tags make up a whopping 0.42% of all OpenStreetMap data! Wow such large.

This doesn’t really conclude much, but it was a fun experiment. I had expected the number to be much larger, but even the vastness of TIGER doesn’t compare to the rest of the world.


Still, most TIGER data is misaligned, low-resolution, incorrectly classified, inconsistent, or straight-up wrong.
Help us cut down on bad TIGER data!

TIGER Gore

bad TIGER roads 1

bad TIGER roads 2

Discussion

Comment from Reinhart Previano on 29 October 2018 at 05:16

Wait, is this the TIGER map data that was imported into OSM?

Comment from LeifRasmussen on 30 October 2018 at 12:14

A while back, the OSM community imported tiger data for the whole United States, starting OSM in the United States with a “complete” road dataset. It’s very low quality, though.

Comment from amapanda ᚛ᚐᚋᚐᚅᚇᚐ᚜ 🏳️‍🌈 on 30 October 2018 at 16:30

Yes, the TIGER import was done in 2007. An initial attempt in 2005 was reverted. The quality is pretty bad.

The PBF format has numerous tricks to save space, which means common tags for a “chunk of objects” might only take 1 byte to store the key.
