Thematic extracts of OpenStreetMap data in cloud-native file formats
OpenStreetMap’s native file format is OSM PBF, but the 80 GB ‘planet file’ is unwieldy and not supported by all GIS software. Layercake is OSM data extracted into thematic layers (buildings, transportation, etc.) and converted to cloud-native file formats that are easy to use with software ranging from DuckDB to QGIS.
Layercake data is available from data.openstreetmap.us. Generally, you’ll put the URL for the layer you’d like to use into DuckDB or other software that supports GeoParquet files.
Schema
All Layercake layers are GeoParquet files with five top-level columns:
type (string): the OSM element type (node, way, or relation)
id (int64): the OSM element ID
tags (struct): a struct containing fields which correspond to OSM tags. The specific fields that are available vary between different Layercake layers. All fields are strings.
bbox (struct): the xmin, ymin, xmax, and ymax of the element’s geometry
geometry (binary): a WKB-encoded Geometry
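The geometry column holds standard well-known binary (WKB). As a rough illustration of what those bytes contain, here is a minimal Python sketch that decodes the simplest case, a 2D Point, using only the standard library. The function name decode_wkb_point and the sample coordinates are hypothetical. Real Layercake rows also contain linestrings and polygons, for which a dedicated geometry library is likely the more practical choice.

```python
import struct


def decode_wkb_point(wkb: bytes) -> tuple[float, float]:
    """Decode a 2D WKB Point into (x, y). Hypothetical helper, for illustration only."""
    # Byte 0 is the byte-order flag: 0 = big-endian, 1 = little-endian.
    if wkb[0] not in (0, 1):
        raise ValueError(f"unknown WKB byte-order flag {wkb[0]}")
    endian = "<" if wkb[0] == 1 else ">"
    # Bytes 1-4 hold the geometry type code; 1 means Point.
    (geom_type,) = struct.unpack_from(endian + "I", wkb, 1)
    if geom_type != 1:
        raise ValueError(f"expected a WKB Point (type 1), got type {geom_type}")
    # Bytes 5-20 hold x and y as two 8-byte doubles.
    x, y = struct.unpack_from(endian + "dd", wkb, 5)
    return x, y


# Made-up sample: a little-endian Point (lon, lat).
sample = struct.pack("<BIdd", 1, 1, -104.99, 39.74)
print(decode_wkb_point(sample))
```

For OSM nodes the geometry is a single point like this; ways and relations use other WKB geometry types, which this sketch deliberately rejects.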
The following layers are currently available:
buildings
URL: https://data.openstreetmap.us/layercake/buildings.parquet
Size: 611M features, 74 GiB
Available columns: building, building:levels, building:flats, building:material, building:colour, building:part, building:use, name, addr:housenumber, addr:street, addr:city, addr:postcode, website, wikipedia, wikidata, height, roof:shape, roof:levels, roof:colour, roof:material, roof:orientation, roof:height, start_date, access, wheelchair
highways
URL: https://data.openstreetmap.us/layercake/highways.parquet
Size: 260M features, 52 GiB
Available columns: highway, service, crossing, footway, construction, name, ref, bridge, covered, lanes, layer, lit, sidewalk, smoothness, surface, tracktype, tunnel, wheelchair, width, access, bicycle, bus, foot, hgv, maxspeed, motor_vehicle, motorcycle, oneway, toll
Examples
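One use for Layercake is to download a subset of data that is of interest for your use case. For example, you could download buildings in Colorado that have more than 5 floors, and write the results to a GeoJSON file for further processing. The bounding box below is Colorado's. Writing GeoJSON assumes a recent DuckDB with the spatial extension, which reads the GeoParquet geometry column as a geometry type and writes the file through GDAL.
D install spatial;
D load spatial;
D copy (
from 'https://data.openstreetmap.us/layercake/buildings.parquet'
select type, id, tags.name as name, tags."building:levels" as levels, geometry
where try_cast(tags."building:levels" as int) > 5
and bbox.xmin > -109.05
and bbox.ymin > 36.99
and bbox.xmax < -102.04
and bbox.ymax < 41.00
) to 'colorado_tall_buildings.geojson' with (format gdal, driver 'GeoJSON');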
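To make the filter in the query above concrete, here is a small Python sketch of the same two conditions applied to in-memory rows. It assumes rows shaped like the Layercake schema (a tags mapping of strings and a bbox with xmin, ymin, xmax, ymax). The helper names and sample rows are made up for illustration. Two behaviours are worth noting: try_cast yields NULL for values that cannot be cast, so those rows drop out, and the bbox comparison keeps only features whose bounding box lies entirely inside the query box.

```python
from typing import Optional

# Query box copied from the example above: (xmin, ymin, xmax, ymax).
QUERY_BOX = (-109.05, 36.99, -102.04, 41.00)


def try_cast_int(value: Optional[str]) -> Optional[int]:
    """Approximate stand-in for try_cast(... as int): None instead of an error.

    DuckDB's own casting rules differ in edge cases (for example decimal strings).
    """
    if value is None:
        return None
    try:
        return int(value.strip())
    except ValueError:
        return None


def bbox_within(bbox: dict, box: tuple) -> bool:
    """True only if the feature's bbox lies strictly inside the query box."""
    xmin, ymin, xmax, ymax = box
    return (bbox["xmin"] > xmin and bbox["ymin"] > ymin
            and bbox["xmax"] < xmax and bbox["ymax"] < ymax)


def is_match(row: dict) -> bool:
    levels = try_cast_int(row["tags"].get("building:levels"))
    return levels is not None and levels > 5 and bbox_within(row["bbox"], QUERY_BOX)


# Hypothetical rows, not real OSM data.
inside = {"xmin": -104.99, "ymin": 39.73, "xmax": -104.98, "ymax": 39.74}
outside = {"xmin": -122.42, "ymin": 37.77, "xmax": -122.41, "ymax": 37.78}
rows = [
    {"id": 1, "tags": {"building:levels": "12"}, "bbox": inside},
    {"id": 2, "tags": {"building:levels": "twelve"}, "bbox": inside},  # not an integer
    {"id": 3, "tags": {"building:levels": "8"}, "bbox": outside},      # outside the box
    {"id": 4, "tags": {}, "bbox": inside},                             # tag missing
]
print([r["id"] for r in rows if is_match(r)])  # prints [1]
```

A feature that straddles the edge of the query box is excluded by this test. An intersects-style test would compare the opposite corners instead.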
You can also do analytics queries on Layercake data. The example below shows how you can find all of the values of surface used on highways in OSM, and sort them by how common they are.
$ duckdb
D from 'https://data.openstreetmap.us/layercake/highways.parquet'
select tags.surface, count(*) as count
where type = 'way'
group by tags.surface
order by count desc;
┌──────────────────────┬───────────┐
│ surface │ count │
│ varchar │ int64 │
├──────────────────────┼───────────┤
│ NULL │ 179401445 │
│ asphalt │ 29916184 │
│ unpaved │ 12156065 │
│ paved │ 4095349 │
│ concrete │ 3923954 │
│ paving_stones │ 3771049 │
│ ground │ 3387599 │
│ gravel │ 2139921 │
│ dirt │ 1688384 │
│ compacted │ 1192208 │
│ grass │ 851898 │
│ sett │ 485161 │
│ fine_gravel │ 444866 │
│ sand │ 295180 │
│ wood │ 217383 │
│ concrete:plates │ 193862 │
│ earth │ 146823 │
│ cobblestone │ 139089 │
│ pebblestone │ 130414 │
│ metal │ 45100 │
│ · │ · │
│ · │ · │
│ · │ · │
│ metl │ 1 │
│ curved │ 1 │
│ Via de Joaquim Gomis │ 1 │
│ 0 │ 1 │
│ earth_grass │ 1 │
│ unkno │ 1 │
│ driving_plates │ 1 │
│ 砕石舗装w │ 1 │
│ trawaw │ 1 │
│ azaq │ 1 │
│ surface=asphalt │ 1 │
│ dirt/sand;paved │ 1 │
│ murrum │ 1 │
│ rubber car tires │ 1 │
│ آهنگ_۳ │ 1 │
│ ground,_gravel,_sand │ 1 │
│ ail │ 1 │
│ bewachsener_boden │ 1 │
│ pu │ 1 │
│ dirt4 │ 1 │
├──────────────────────┴───────────┤
│ 5410 rows (40 shown) 2 columns │
└──────────────────────────────────┘