.. _geotiff_output_example:

# GeoTIFF outputs

By default, RiskScape saves geospatial outputs as *vector* data, but
it can also create *raster* outputs in the GeoTIFF format.

Saving raster data can be a useful way to represent risk when you don't have specific exposure data to model.
For example, a raster model output could represent the life-safety risk of constructing
a dwelling in a given location, even though an actual building does not exist in GIS data yet.

.. note::
  In order to use GeoTIFF outputs, you need to have the :ref:`beta-plugin` enabled.
  Saving GeoTIFF outputs is an advanced feature for expert users, so you should be familiar
  with :ref:`advanced_pipelines` first.

## Raster basics

A raster is a spatial file that represents a grid laid over part of the Earth.
The grid is divided into individual squares, called cells or pixels.
Each pixel can be assigned a single value, or if no value has been
assigned then the pixel will have the `NODATA` value.

The raster has *bounds*, which is the spatial extent that the raster covers.
The bounds determine which part of the Earth the raster covers and, together with the size of each pixel,
the total height (north to south) and width (east to west) of the raster in pixels.

The height and width of each *pixel* within the raster grid is called the *grid resolution*.
For example, if the grid resolution was 10 metres, each pixel would be 10m x 10m in area.

When saving a GeoTIFF, the `grid-resolution` and `bounds` options determine how many pixels are in the raster.
For example, if we have a bounds which is 10km wide and 5km high, and a grid-resolution of 10m,
then the raster will be 1000 pixels wide by 500 pixels high, or 500,000 pixels total.
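
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation, not RiskScape code: the `raster_size` helper is hypothetical, and rounding up partially covered rows and columns is an assumption.

```python
import math

def raster_size(width_m, height_m, grid_resolution_m):
    """Return (columns, rows, total pixels) for a raster covering the given extent."""
    # Assumption: a partially covered row or column still gets a whole pixel.
    columns = math.ceil(width_m / grid_resolution_m)
    rows = math.ceil(height_m / grid_resolution_m)
    return columns, rows, columns * rows

# Bounds 10km wide and 5km high, with a grid-resolution of 10m
print(raster_size(10_000, 5_000, 10))  # (1000, 500, 500000)
```

Note that doubling the grid-resolution to 20m would quarter the pixel count, because both the width and the height in pixels are halved.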

Currently RiskScape GeoTIFF outputs are limited to approximately 2 billion pixels in size (2³¹ - 1).
If your output would exceed this limit, you need to either reduce the bounds or
increase the grid-resolution (i.e. use larger pixels).
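
As a rough way to check whether a planned output stays under this limit, you could do the calculation by hand. The Python sketch below is illustrative only: `fits_in_geotiff` is a hypothetical helper, it assumes the bounds are measured in metres, and it assumes partial rows and columns are rounded up.

```python
import math

MAX_PIXELS = 2**31 - 1  # the limit stated above: 2,147,483,647 pixels

def fits_in_geotiff(width_m, height_m, grid_resolution_m):
    """Return True if a raster of this extent and resolution is within the pixel limit."""
    columns = math.ceil(width_m / grid_resolution_m)
    rows = math.ceil(height_m / grid_resolution_m)
    return columns * rows <= MAX_PIXELS

# A 500km x 500km region at 10m is 50,000 x 50,000 = 2.5 billion pixels: too large
print(fits_in_geotiff(500_000, 500_000, 10))  # False
# The same region at 20m is 25,000 x 25,000 = 625 million pixels: within the limit
print(fits_in_geotiff(500_000, 500_000, 20))  # True
```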

.. note::
    In general, GeoTIFF pixels do not have to be square.
    However, for simplicity RiskScape will assume the pixels are square,
    apart from some minor latitude skew when working in WGS84.

### Saving a feature's geometry

When a feature is saved to a raster output, any pixels that intersect with the feature's geometry will be
written with the feature's value.
A point geometry will only ever intersect a single pixel,
whereas line-string and polygon geometries may span multiple pixels, all of which will be written with
the feature's value.

The image below shows some polygon exposures (in yellow), some of which span multiple grid cells (in grey).

.. image:: exposures-over-grid-cells.png
    :target: ../_images/exposures-over-grid-cells.png
    :alt: image showing polygon exposures that span multiple grid cells

How you handle features that span multiple pixels will depend somewhat on the information being saved to GeoTIFF.
For example, if you were saving the total loss value to raster, then you would only want *one* pixel
written per building. However, if you were saving the average flood depth, then you would want *all*
overlapping pixels written, in order to represent the true extent of the hazard.
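
To make the difference concrete, the following Python sketch compares the grid cells touched by a whole footprint with the single cell that contains its centre. It is a simplification rather than a description of how RiskScape rasterizes geometry: the helpers `cell_of` and `cells_touched` are hypothetical, the footprint is approximated by its bounding box (which can over-count cells for irregular shapes), and rows are counted up from the grid origin for simplicity.

```python
import math

def cell_of(x, y, origin_x, origin_y, grid_resolution):
    """Return the (column, row) of the grid cell containing a point."""
    return (math.floor((x - origin_x) / grid_resolution),
            math.floor((y - origin_y) / grid_resolution))

def cells_touched(min_x, min_y, max_x, max_y, origin_x, origin_y, grid_resolution):
    """Return every (column, row) cell that a bounding box overlaps."""
    first_col, first_row = cell_of(min_x, min_y, origin_x, origin_y, grid_resolution)
    last_col, last_row = cell_of(max_x, max_y, origin_x, origin_y, grid_resolution)
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]

# A rectangular footprint from (8, 4) to (23, 12) on a 10m grid with origin (0, 0)
all_cells = cells_touched(8, 4, 23, 12, 0, 0, 10)
centre_cell = cell_of((8 + 23) / 2, (4 + 12) / 2, 0, 0, 10)

print(len(all_cells))  # 6 cells would be written for the whole footprint
print(centre_cell)     # (1, 0): the one cell written when using the centre point
```

Writing a total loss to all six cells would count it six times once the pixels are summed, whereas a flood depth written only to the centre cell would under-represent the flooded area.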

.. tip::
  To only write a single pixel per feature, use the centroid of the feature geometry, e.g. ``centroid(exposure)``.
  However, note that for some features, such as large line-strings, the centroid may fall outside
  the feature's original geometry.

### Saving a feature's value

When saving a GeoTIFF in RiskScape you must pick a single numeric value from your pipeline data to save.
You can use the `value` option to specify which value to use.

.. note::
    If you have multiple numeric values in your pipeline data and do not specify the `value` option,
    then RiskScape will pick an arbitrary attribute to save to GeoTIFF.
    Any values that are null or NaN will not be saved to GeoTIFF at all.

When saving data where there are multiple features per pixel,
you can use the `pixel-statistic` option to determine how the value gets set for each pixel.
The strategies supported for pixel-statistic include:

- Max: use the maximum value seen across any overlapping features
- Mean: combines the values for any overlapping features and uses the average
- Min: use the minimum value seen across any overlapping features
- Sum: combines the values for any overlapping features and uses the total

By default, the mean value is used.
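
The effect of each strategy can be illustrated with a small Python sketch. This is not RiskScape's implementation: `combine_per_pixel` and the `STATISTICS` table are hypothetical names, and each feature is assumed to have already been matched to a `(column, row)` pixel. Null and NaN values are skipped, in line with the note above.

```python
import math
from collections import defaultdict

# One combining function per pixel-statistic strategy
STATISTICS = {
    'max': max,
    'mean': lambda values: sum(values) / len(values),
    'min': min,
    'sum': sum,
}

def combine_per_pixel(features, pixel_statistic='mean'):
    """Combine (pixel, value) pairs into a {pixel: value} mapping."""
    values_by_pixel = defaultdict(list)
    for pixel, value in features:
        # Null (None) and NaN values are not written at all
        if value is None or math.isnan(value):
            continue
        values_by_pixel[pixel].append(value)
    combine = STATISTICS[pixel_statistic]
    return {pixel: combine(values) for pixel, values in values_by_pixel.items()}

features = [
    ((0, 0), 2.0), ((0, 0), 4.0),   # two features share pixel (0, 0)
    ((1, 0), 5.0), ((1, 0), None),  # the null value is ignored
    ((2, 0), float('nan')),         # pixel (2, 0) is never written
]

print(combine_per_pixel(features, 'mean'))  # {(0, 0): 3.0, (1, 0): 5.0}
print(combine_per_pixel(features, 'sum'))   # {(0, 0): 6.0, (1, 0): 5.0}
```

In this sketch, a pixel that only ever receives null or NaN values is absent from the result, which corresponds to the pixel keeping its `NODATA` value.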

## Save step examples

When saving a GeoTIFF output, you must always specify options in the `save()` pipeline step,
as RiskScape needs to know the `bounds` and `grid-resolution` of the GeoTIFF to create.

For simplicity, there is a `template` option, which lets you copy the bounds and grid-resolution
from an existing GeoTIFF. For example, to create a copy of the GeoTIFF file used in the :ref:`getting-started` project,
you could use the following pipeline:

```text
input('MaxEnv_All_Scenarios_10m.tif')
-> save('copy', format: 'geotiff',
        options: {
          template: 'MaxEnv_All_Scenarios_10m.tif'
        })
```

Alternatively, you can specify the `bounds` and `grid-resolution` instead of using a `template`, like this:

```text
input('MaxEnv_All_Scenarios_10m.tif')
-> save('copy', format: 'geotiff',
        options: {
          bounds: bounds(bookmark('MaxEnv_All_Scenarios_10m.tif')),
          grid-resolution: 5
        })
```

The `bounds` option should be a polygon geometry.
The easiest way to get a suitable polygon is to use the `bounds()` function with a bookmark or file
that represents the data being saved.

.. note::
    If any features fall outside the bounds that you have specified, they will be ignored and not saved in the GeoTIFF.
    However, you will get a warning when this occurs.

The underlying grid-resolution in the `MaxEnv_All_Scenarios_10m.tif` GeoTIFF is 5m,
even though it represents the hazard data to 10m of accuracy.
If you wanted to resize the hazard data to 10m grid-resolution, then you could use:

```text
input('MaxEnv_All_Scenarios_10m.tif')
-> save('copy', format: 'geotiff',
        options: {
          template: 'MaxEnv_All_Scenarios_10m.tif',
          grid-resolution: 10
        })
```

Note that in the above example, the `grid-resolution` option (10m) will override the grid-resolution from the `template` GeoTIFF (5m).

Instead of just reading and writing GeoTIFF pixels, you can turn any input data into a raster output.
For example, this will take the building data from the :ref:`getting-started` project
and write out a raster with the total floor area, at a 50m resolution.

```text
input('Buildings_SE_Upolu_CRS84.shp', name: 'exposure')
-> select({ centroid(exposure) as geom, measure(exposure) as Floor_Area })
-> save('building_floor_area', format: 'geotiff',
        options: {
            pixel-statistic: 'sum',
            grid-resolution: 50,
            bounds: bounds(bookmark('Buildings_SE_Upolu_CRS84.shp'))
       })
```

Note that this example altered the geometry and value to save to the GeoTIFF in a separate `select()` pipeline step.
Alternatively, we could specify the `value` and `geometry` to use in the `save()` step options like this:

```text
input('Buildings_SE_Upolu_CRS84.shp', name: 'exposure')
-> save('building_floor_area', format: 'geotiff',
        options: {
            pixel-statistic: 'sum',
            grid-resolution: 50,
            bounds: bounds(bookmark('Buildings_SE_Upolu_CRS84.shp')),
            value: measure(exposure),
            geometry: centroid(exposure)
       })
```
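
Conceptually, both of the building pipelines above place each building's centroid into a 50m grid cell and add up the floor area per cell. The Python sketch below illustrates that idea only; it is not how RiskScape is implemented. The `sum_into_grid` helper is hypothetical, the coordinates are assumed to be in metres (the real building data is in CRS84, which RiskScape handles for you), and the sample values are made up for illustration.

```python
import math
from collections import defaultdict

def sum_into_grid(points, bounds, grid_resolution):
    """Sum point values per grid cell.

    points: iterable of (x, y, value); bounds: (min_x, min_y, max_x, max_y).
    Returns ({(column, row): total}, number of points skipped).
    """
    min_x, min_y, max_x, max_y = bounds
    totals = defaultdict(float)
    skipped = 0
    for x, y, value in points:
        # Points outside the bounds are ignored, as described in the note above
        if not (min_x <= x < max_x and min_y < y <= max_y):
            skipped += 1
            continue
        column = math.floor((x - min_x) / grid_resolution)
        # Rows are counted down from the top (northern) edge of the bounds
        row = math.floor((max_y - y) / grid_resolution)
        totals[(column, row)] += value
    return dict(totals), skipped

# Made-up centroids with floor areas, on a 100m x 100m extent with 50m cells
centroids = [(10, 90, 120.0), (40, 60, 80.0), (70, 20, 200.0), (150, 50, 95.0)]
totals, skipped = sum_into_grid(centroids, (0, 0, 100, 100), 50)

print(totals)   # {(0, 0): 200.0, (1, 1): 200.0}
print(skipped)  # 1
```

The first two centroids land in the same cell, so their floor areas are added together, while the last centroid lies outside the bounds and is not written.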