Parameters

Here is the full list of configuration parameters you can specify in a config.json file.

country: string
The OSM QA Tile extract to download. The string should be a country matching one of the options in label_maker/countries.txt.
bounding_box: list of floats
The bounding box to create images from. This should be given in the form: [xmin, ymin, xmax, ymax] as longitude and latitude values between [-180, 180] and [-90, 90], respectively. Values should use the WGS84 datum, with longitude and latitude units in decimal degrees.
geojson: string
An input file containing a GeoJSON FeatureCollection representing labels. Adding this parameter will override the values in the country and bounding_box parameters.
zoom: int
The zoom level used to create images. This functions as a rough proxy for resolution. Value should be given as an int on the interval [0, 19].
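The documented ranges for bounding_box and zoom can be checked before a run. The following is a minimal sketch, not Label Maker's own validation: the function name validate_extent is hypothetical, the config values are illustrative, and the requirement that xmin < xmax and ymin < ymax is an assumption.

```python
def validate_extent(bounding_box, zoom):
    """Check bounding_box and zoom against the ranges documented above."""
    if len(bounding_box) != 4:
        raise ValueError("bounding_box must be [xmin, ymin, xmax, ymax]")
    xmin, ymin, xmax, ymax = bounding_box
    # Longitude in [-180, 180]; ordering xmin < xmax is an assumption.
    if not (-180 <= xmin < xmax <= 180):
        raise ValueError("longitudes must satisfy -180 <= xmin < xmax <= 180")
    # Latitude in [-90, 90]; ordering ymin < ymax is an assumption.
    if not (-90 <= ymin < ymax <= 90):
        raise ValueError("latitudes must satisfy -90 <= ymin < ymax <= 90")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(zoom, bool) or not isinstance(zoom, int) or not 0 <= zoom <= 19:
        raise ValueError("zoom must be an int on the interval [0, 19]")
    return True


# Illustrative values only; the country must appear in label_maker/countries.txt.
config = {
    "country": "vietnam",
    "bounding_box": [105.42, 21.10, 106.41, 21.53],
    "zoom": 17,
}
validate_extent(config["bounding_box"], config["zoom"])
```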
classes: list of dicts

The training classes. Each class is defined as a dict object with two required keys:

name: string
The class name.
filter: list of strings
A Mapbox GL Filter to define any vector features matching this class. Filters are applied with the standalone featureFilter from Mapbox GL JS.
buffer: int
Optional parameter to buffer labels in 'object-detection' and 'segmentation' tasks by an arbitrary number of pixels. Accepts both positive and negative integers. It uses Shapely object.buffer to calculate the final geometry. You can verify that your buffer options create the desired labels by inspecting the files created in data/labels/ after running the label-maker labels command.
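As an illustration of the classes structure described above, here is a small sketch written as a Python list that can be dumped to JSON. The class names, filters, and buffer value are hypothetical examples, and check_classes is an illustrative helper rather than part of Label Maker.

```python
import json

# Hypothetical classes; each filter uses the Mapbox GL filter array syntax.
classes = [
    {"name": "Roads", "filter": ["has", "highway"]},
    {"name": "Buildings", "filter": ["has", "building"], "buffer": 3},
]


def check_classes(classes):
    """Verify the two required keys and the optional integer buffer."""
    for cls in classes:
        missing = {"name", "filter"} - set(cls)
        if missing:
            raise ValueError(f"class is missing keys: {sorted(missing)}")
        if "buffer" in cls and not isinstance(cls["buffer"], int):
            raise ValueError("buffer must be an int (positive or negative)")
    return [cls["name"] for cls in classes]


check_classes(classes)
# Text suitable for pasting into the "classes" entry of config.json.
classes_json = json.dumps({"classes": classes}, indent=2)
```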
imagery: string

Label Maker expects to receive imagery tiles that are 256 x 256 pixels. You can specify the source of the imagery with one of:

  • A template string for a tiled imagery service. Ex: 'https://api.mapbox.com/v4/mapbox.satellite/{z}/{x}/{y}.jpg?access_token={ACCESS_TOKEN}'. Note that you will generally need an API key to obtain images and there may be associated costs; this example requires a Mapbox access token. The access token for TMS image formats can be read from an environment variable or added directly to the imagery string. Also see OpenAerialMap for open imagery.

  • A GeoTIFF file location. Works with both local and remote files. Ex: 'http://oin-hotosm.s3.amazonaws.com/593ede5ee407d70011386139/0/3041615b-2bdb-40c5-b834-36f580baca29.tif'

  • A remote WMS endpoint GetMap request. Fill out all necessary parameters except bbox, which should be set as {bbox}. Ex:

    'https://basemap.nationalmap.gov/arcgis/services/USGSImageryOnly/MapServer/WMSServer?SERVICE=WMS&REQUEST=GetMap&VERSION=1.1.1&LAYERS=0&STYLES=&FORMAT=image%2Fjpeg&TRANSPARENT=false&HEIGHT=256&WIDTH=256&SRS=EPSG%3A3857&BBOX={bbox}'
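To make the placeholders in the imagery strings concrete, the sketch below fills a tile template and a WMS {bbox} for a single tile. It is an assumption about how such templates can be expanded, not Label Maker's internal code: the helper names, the ACCESS_TOKEN environment variable name, and the six-decimal bbox formatting are all illustrative choices. The bounding box uses the standard Web Mercator (EPSG:3857) tile extent.

```python
import os

# Half-width of the EPSG:3857 world square, in metres.
WEB_MERCATOR_EXTENT = 20037508.342789244


def tms_url(template, x, y, z, token=None):
    """Fill {x}, {y}, {z} and {ACCESS_TOKEN} in a tile template string."""
    if token is None:
        # Assumed environment variable name.
        token = os.environ.get("ACCESS_TOKEN", "")
    return template.format(x=x, y=y, z=z, ACCESS_TOKEN=token)


def tile_bbox_3857(x, y, z):
    """Return (xmin, ymin, xmax, ymax) of an XYZ tile in EPSG:3857 metres."""
    size = 2 * WEB_MERCATOR_EXTENT / 2 ** z
    xmin = -WEB_MERCATOR_EXTENT + x * size
    ymax = WEB_MERCATOR_EXTENT - y * size  # tile rows count down from the top
    return (xmin, ymax - size, xmin + size, ymax)


def wms_url(template, x, y, z):
    """Fill the {bbox} placeholder of a WMS GetMap template for one tile."""
    bbox = ",".join(f"{value:.6f}" for value in tile_bbox_3857(x, y, z))
    return template.format(bbox=bbox)


TMS = "https://api.mapbox.com/v4/mapbox.satellite/{z}/{x}/{y}.jpg?access_token={ACCESS_TOKEN}"
example = tms_url(TMS, x=1, y=2, z=3, token="my_token")
```

A GetMap template whose other parameters are already filled in, such as the WMS example above, can be passed to wms_url unchanged because {bbox} is its only placeholder.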
    
http_auth: list
Optional parameter to specify a username and password for restricted WMS services. For example, ['my_username', 'my_password'].
background_ratio: float
Specify how many background (or “negative”) training examples to create. Label Maker will generate background_ratio times the total number of class tiles as background images.
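The arithmetic can be sketched as follows. The function name is hypothetical, and rounding to the nearest integer is an assumption; Label Maker may handle fractional counts differently.

```python
def n_background_tiles(n_class_tiles, background_ratio):
    """Number of background tiles for a given count of class tiles."""
    # Rounding to the nearest integer is an assumption.
    return int(round(n_class_tiles * background_ratio))


# 200 tiles containing at least one class, background_ratio of 0.5
n_background = n_background_tiles(200, 0.5)
```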
ml_type: string

One of 'classification', 'object-detection', or 'segmentation'. This defines the output format for the final label numpy arrays (y_train and y_test).

'classification'
Output is an array of len(classes) + 1. Each array value will be either 1 or 0 based on whether it matches the class at the same index. The additional array element belongs to the background class, which will always be the first element.
'object-detection'
Output is an array of bounding boxes of the form [xmin, ymin, width, height, class_index]. In this case, the values are pixel values measured from the upper left-hand corner (not latitude and longitude values). Each feature is tested against each class, so if a feature matches two or more classes, it will have the corresponding number of bounding boxes created.
'segmentation'
Output is an array of shape (256, 256) with values matching the class index label at that position. The classes are applied sequentially according to config.json, so later classes will be written over earlier class labels if there is overlap.
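The 'classification' and 'segmentation' layouts described above can be sketched with plain Python lists. This is a toy illustration, not Label Maker's implementation: the function names are hypothetical, the grid is shrunk from 256 x 256 for readability, and numbering the classes from 1 in config order (with 0 as background) is an assumption.

```python
def classification_label(matched, n_classes):
    """Build a vector of len(classes) + 1 with the background class first.

    matched: set of 1-based class indices found in the tile (assumed numbering).
    """
    label = [0] * (n_classes + 1)
    for class_index in matched:
        label[class_index] = 1
    if not matched:
        label[0] = 1  # a tile matching no class is background
    return label


def segmentation_label(masks, size=256):
    """Paint class indices onto a size x size grid in config order.

    masks: one set of (row, col) pixels per class, in config.json order.
    Later classes overwrite earlier ones where they overlap.
    """
    label = [[0] * size for _ in range(size)]
    for class_index, pixels in enumerate(masks, start=1):
        for row, col in pixels:
            label[row][col] = class_index
    return label


# Two classes on a 2 x 2 toy grid; pixel (0, 1) belongs to both classes.
toy = segmentation_label([{(0, 0), (0, 1)}, {(0, 1)}], size=2)
```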
seed: int
Random generator seed. Optional, use to make results reproducible.
split_vals: list

Default: [0.8, 0.2]

Proportion of data to put in each category listed in split_names. Must be a list of floats that sum to one and match the length of split_names. For train, validate, and test data, a list like [0.7, 0.2, 0.1] is suggested.

split_names: list

Default: ['train', 'test']

List of names for each subset of the data. Length of list must match length of split_vals.
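How seed, split_names, and split_vals interact can be sketched as below. This is an illustrative stand-in rather than Label Maker's own splitting code: split_tiles is a hypothetical helper, it shuffles with Python's random module, and the rounding of the cut points is an assumption.

```python
import random


def split_tiles(tiles, split_names=("train", "test"), split_vals=(0.8, 0.2), seed=None):
    """Shuffle tiles reproducibly and cut them into named subsets."""
    if len(split_names) != len(split_vals):
        raise ValueError("split_names and split_vals must have the same length")
    if abs(sum(split_vals) - 1.0) > 1e-9:
        raise ValueError("split_vals must sum to one")
    shuffled = list(tiles)
    random.Random(seed).shuffle(shuffled)  # same seed gives the same order
    subsets, start, cumulative = {}, 0, 0.0
    for position, (name, fraction) in enumerate(zip(split_names, split_vals)):
        cumulative += fraction
        end = round(cumulative * len(shuffled))
        if position == len(split_names) - 1:
            end = len(shuffled)  # the last subset takes whatever remains
        subsets[name] = shuffled[start:end]
        start = end
    return subsets


tiles = [f"tile-{i}" for i in range(10)]
subsets = split_tiles(tiles, ["train", "validate", "test"], [0.7, 0.2, 0.1], seed=42)
```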

imagery_offset: list of ints
An optional list of integers representing the number of pixels to offset imagery. For example [15, -5] will move the images 15 pixels right and 5 pixels up relative to the requested tile bounds.
tms_image_format: string
An optional string giving the downloaded imagery’s format, such as .jpg or .png, for use when the endpoint does not provide it.
over_zoom: int
An integer greater than 0. If set for XYZ tiles, it will fetch tiles from zoom + over_zoom, to create higher resolution tiles which fill out the bounds of the original zoom level.
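The tiles that cover one original tile at zoom + over_zoom follow from standard XYZ tile arithmetic, sketched here. The function name is hypothetical, and how Label Maker stitches the fetched tiles into a single image is not shown.

```python
def over_zoom_children(x, y, z, over_zoom):
    """List the (x, y, z) tiles at zoom z + over_zoom covering tile (x, y, z)."""
    if over_zoom < 1:
        raise ValueError("over_zoom must be an integer greater than 0")
    scale = 2 ** over_zoom  # each zoom step doubles the tiles along each axis
    return [
        (x * scale + dx, y * scale + dy, z + over_zoom)
        for dy in range(scale)
        for dx in range(scale)
    ]


# One extra zoom level: four child tiles, listed row by row.
children = over_zoom_children(1, 2, 3, over_zoom=1)
```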
band_indices: list

Default: [1, 2, 3]

A list of band indices to pull from a TIF. Using the SpaceNet Roads Challenge Data as an example, you can use [5, 3, 2, 7] to extract the Red, Green, Blue, and NIR bands respectively.
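The band selection can be illustrated with a small sketch. It assumes that band indices are 1-based, as the default of [1, 2, 3] suggests, and uses placeholder strings in place of real raster bands; select_bands is a hypothetical helper.

```python
def select_bands(bands, band_indices=(1, 2, 3)):
    """Pick bands by 1-based index (assumed), in the order requested."""
    for index in band_indices:
        if not 1 <= index <= len(bands):
            raise ValueError(f"band index {index} is out of range")
    return [bands[index - 1] for index in band_indices]


# Placeholder names standing in for the bands of an 8-band TIF.
bands = ["band1", "band2", "band3", "band4", "band5", "band6", "band7", "band8"]
selected = select_bands(bands, [5, 3, 2, 7])
```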