Pipeline steps

`input`

Name	Type Name	Arity	Description
limit	`Long`	0..1	Optionally limit the number of tuples that are yielded from the input data. Similar to an SQL limit.
name	`String`	0..1	Nest each tuple from the relation in a parent tuple. More simply put, if given, this will be the name of the attribute that wraps each tuple from the relation. For example, if your source has attributes named `foo` and `bar`, setting name to `baz` will give results named `baz.foo` and `baz.bar`. This can be useful for distinguishing the source data from other computed data in your pipeline.
offset	`Long`	0..1	Optionally skip a number of tuples that are yielded from the input data. Similar to the offset part of an SQL limit clause.
relation	`Expression`	0..1	The source data to yield. Normally this is the ID of a bookmark from your project, but can also be a filename/URL.
value	`Expression`	0..1	An expression that provides a single input item. Used instead of `relation`. For example, `input(value: {make: 'Toyota', model: 'Corolla'})` creates an input with a single tuple. Note that `limit` and `offset` are ignored when `value` is used.

Makes input available to the pipeline. Input is most often from a relation (such as a Shapefile or CSV) but may also come from the value expression.
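The interaction of `limit`, `offset` and `name` can be modelled with a minimal Python sketch. This is not RiskScape code: the helper `read_input` and the sample rows are hypothetical, and the sketch assumes the offset is applied before the limit, as in SQL.

```python
from itertools import islice

def read_input(relation, limit=None, offset=0, name=None):
    """Hypothetical model of the input step: skip `offset` tuples, then yield at most `limit`."""
    # Assumption: offset is applied before limit, as in an SQL limit clause.
    stop = None if limit is None else offset + limit
    for row in islice(relation, offset, stop):
        # When `name` is given, each tuple is nested under a parent attribute.
        yield {name: row} if name is not None else row

source = [
    {"foo": 1, "bar": "a"},
    {"foo": 2, "bar": "b"},
    {"foo": 3, "bar": "c"},
]

# Skip one tuple, keep one tuple, and wrap it so members read as baz.foo and baz.bar.
rows = list(read_input(source, limit=1, offset=1, name="baz"))
```

Here `rows` holds a single tuple whose only attribute is `baz`, which in turn contains `foo` and `bar`.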

`filter`

Name	Type Name	Arity	Description
filter	`Expression`	1	A boolean-yielding expression. If the expression evaluates to FALSE, then the item is removed from the results.

Applies a filter to results, removing those that fail the filter test.

`join`

Name	Type Name	Arity	Description
initial-index-size	`Integer`	0..1	Sets the initial capacity of the index. This can improve performance when the initial size is large enough that the index does not need to grow as items are added. Setting to a large value will use more memory.
join-type	`JoinType`	1	Specifies what to do with left hand side (LHS) rows when no right hand side (RHS) row matches. One of `INNER` (any non-matching rows are dropped; the default) or `LEFT_OUTER` (any non-matching rows from the LHS are included).
on	`Expression`	1	Condition that evaluates whether a row of data from each relation ‘matches’ and should be joined together.

Joins two relations together. Each relation can be any chain of pipeline steps. When using `->` to chain pipeline steps to a join step, the inputs should be named: use `lhs` for the left hand side (LHS) relation and `rhs` for the right hand side (RHS) relation, e.g. `prev_step -> join.lhs`.
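The difference between the two join types can be illustrated with a small Python sketch. It is a model of the semantics only: the function `join`, the `same_zone` condition and the sample data are hypothetical, the output is shown as simple pairs rather than RiskScape's actual tuple layout, and a nested loop stands in for the index that the real step builds.

```python
def join(lhs, rhs, on, join_type="INNER"):
    """Hypothetical model: yield (left, right) pairs; right is None for unmatched LEFT_OUTER rows."""
    rhs = list(rhs)  # the RHS is scanned once for every LHS row
    for left in lhs:
        matched = False
        for right in rhs:
            if on(left, right):
                matched = True
                yield (left, right)
        # INNER drops LHS rows with no match; LEFT_OUTER keeps them.
        if not matched and join_type == "LEFT_OUTER":
            yield (left, None)

def same_zone(left, right):
    """Join condition, playing the role of the `on` expression."""
    return left["zone"] == right["zone"]

buildings = [{"id": 1, "zone": "A"}, {"id": 2, "zone": "Z"}]
zones = [{"zone": "A", "depth": 0.5}]

inner = list(join(buildings, zones, same_zone))
left_outer = list(join(buildings, zones, same_zone, join_type="LEFT_OUTER"))
```

With this data, `inner` contains only building 1, while `left_outer` also keeps building 2 paired with no RHS row.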

`unnest`

Name	Type Name	Arity	Description
emit-empty	`Boolean`	1	False by default, which means a list that is empty or null will not produce any rows of data. However, the row of data containing the empty/null list may also contain other important data, which would be lost. If set to true, an empty list will always emit a single row of data, containing a null list item.
index-key	`String`	0..1	Adds an additional attribute, with the given name, to each row of data. This attribute will have a unique numeric index representing the Nth item produced by the list or relation.
to-unnest	`Expression`	1	The list or relation attributes to expand, e.g. `unnest(list)` or `unnest([list1, relation1])` to unnest more than one list or relation at once.

Transforms a list, so that every item in the list becomes a separate row of data. The transformation happens ‘in place’, such that the list attribute in the pipeline data is replaced by the list item. When multiple lists are specified, the Nth row of data produced will contain the Nth item from each list (similar to zip() in Python, except null values will be used if one list is shorter than the other).
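The in-place expansion and the zip-like pairing of multiple lists can be sketched in Python. This is an illustrative model, not RiskScape's implementation: the function `unnest` and its keyword argument `emit_empty` are hypothetical stand-ins for the step and its `emit-empty` parameter, and the sketch covers lists only, not relations.

```python
from itertools import zip_longest

def unnest(row, to_unnest, emit_empty=False):
    """Hypothetical model: expand the list attributes named in `to_unnest` into separate rows."""
    # A null (None) list is treated like an empty list.
    lists = [row[attr] or [] for attr in to_unnest]
    # zip_longest pads shorter lists with None, matching the null-filling described above.
    items = list(zip_longest(*lists))
    if not items and emit_empty:
        # Keep the rest of the row by emitting one row with null list items.
        items = [tuple(None for _ in to_unnest)]
    for values in items:
        out = dict(row)
        out.update(zip(to_unnest, values))  # each list attribute is replaced by its item
        yield out

row = {"id": 7, "a": [1, 2, 3], "b": ["x"]}
expanded = list(unnest(row, ["a", "b"]))

empty = {"id": 8, "a": []}
dropped = list(unnest(empty, ["a"]))
kept = list(unnest(empty, ["a"], emit_empty=True))
```

In this sketch `expanded` has three rows, with `b` null in the second and third; `dropped` is empty, and `kept` holds one row that preserves `id` alongside a null `a`.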

Unnest can also apply to relations, which lets you dynamically load relational data from multiple files into the pipeline. This is an advanced concept - refer to the documentation for examples of how to do this.

`select`

Name	Type Name	Arity	Description
select	`Expression`	1	An expression that maps the input to output. Output is always a struct, even if only a non-struct value is returned, e.g. `select('')` will give a struct with a single text member.

Produces a new output tuple by applying the given select expression to the input tuple, e.g. `select({*, '$' + str(cost) as "Replacement Cost"})`. Don’t forget to surround your expression with braces (`{` and `}`) if you want to return more than one thing!

`sort`

Name	Type Name	Arity	Description
by	`Expression`	1	The attribute(s) to sort by. To sort on multiple attributes, use a list of attributes, e.g. `[consequence.loss, exposure.total_value]`
delta	`Lambda`	0..1	A lambda expression that calculates the delta between two consecutive tuples in the sorted output. The delta is added into the tuples. For example, if the tuples contained a `value` attribute and you wanted to know the difference between consecutive tuples, you could use `(prev, current) -> current.value - prev.value`.
delta-attribute	`String`	0..1	Name the attribute that the delta will be stored in. By default this will be `delta`.
delta-type	`String`	0..1	The type of the delta-attribute. If provided, it will be used to pass the delta from the previous step into the lambda expression (as `prev.delta`). This provides basic support for cascading hazards. Only required when the delta lambda expression uses the delta-attribute.
direction	`Direction`	0+	May be used to set the sort direction for attributes. `asc` to sort in ascending order, or `desc` to sort in descending order. Any attributes that do not have a direction set in this list will default to ascending. E.g. `['desc', 'asc']`

Produces output that is sorted based on one or more sort-by expressions. The sort step should only be used immediately before a save step, because any subsequent steps are processed in parallel and may change the order. Caution: output is collected in memory, so sorting large volumes of output may fail due to insufficient memory.
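How the `delta` lambda relates to the sorted output can be modelled with a short Python sketch. The function `sort_with_delta` is hypothetical, and the sketch assumes the first tuple, which has no predecessor, receives a null delta; the source does not state what RiskScape does in that case.

```python
def sort_with_delta(rows, by, delta, delta_attribute="delta", descending=False):
    """Hypothetical model: sort rows, then store the delta between consecutive rows."""
    ordered = sorted(rows, key=by, reverse=descending)
    result = []
    prev = None
    for current in ordered:
        current = dict(current)  # copy so the caller's rows are not modified
        # Assumption: the first row has no predecessor, so its delta is None.
        current[delta_attribute] = None if prev is None else delta(prev, current)
        result.append(current)
        prev = current  # prev carries its own delta, similar to prev.delta
    return result

rows = [{"value": 30}, {"value": 10}, {"value": 25}]

# Python equivalent of (prev, current) -> current.value - prev.value
result = sort_with_delta(
    rows,
    by=lambda row: row["value"],
    delta=lambda prev, current: current["value"] - prev["value"],
)
```

The rows come back in ascending order of `value`, each carrying the difference from the row before it.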

`group`

Name	Type Name	Arity	Description
by	`Expression`	0..1	An expression that groups the input so that each group is aggregated individually. For example, `group({category, sum(value) as total}, by: category)` would group all inputs by their category and calculate a total value for each category.
select	`Expression`	1	The aggregation expression to apply to input, e.g. `{sum(value) as total}`. Members of the group by expression can be referenced here, e.g. `group({category, sum(value) as total}, by: category)` is valid.

Applies an aggregate expression across optionally grouped input values, e.g. `group(count(exposures), by: damage_state)`. As with GROUP BY in SQL, you use RiskScape’s aggregate functions to perform operations like sum and count on grouped tuples, computing a single value for each group.
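As a rough Python analogue of `group({category, sum(value) as total}, by: category)`, the sketch below sums one attribute per group. The function `group_sum` and the sample rows are hypothetical and model the semantics only; the order of groups in real output is not implied by this sketch.

```python
from collections import defaultdict

def group_sum(rows, by, value):
    """Hypothetical model: total `value` for each distinct `by` attribute."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[by]] += row[value]
    # One output tuple per group, holding the group key and its aggregate.
    return [{by: key, "total": total} for key, total in totals.items()]

rows = [
    {"category": "a", "value": 1},
    {"category": "b", "value": 5},
    {"category": "a", "value": 2},
]

grouped = group_sum(rows, by="category", value="value")
```

Three input tuples collapse into two output tuples, one per category.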

`save`

Name	Type Name	Arity	Description
format	`Format`	0..1	The file format to use when saving data. See `riskscape format list` for the available formats. Defaults to `csv` if no geometry is present in the output, `shapefile` if there is.
name	`String`	0..1	A name to give to the output. May differ from the name ultimately given to any created files, depending on the format or the storage location (e.g. files saved to a directory with existing files may be renamed to avoid overwriting them).
options	`StructDeclaration`	0..1	Format-specific options. See `riskscape format save options FORMAT` for supported options.

Save results out to files or other supported storage systems.

`enlarge`

Name	Type Name	Arity	Description
distance	`Expression`	1	The distance (in metres) to enlarge the geometries.
geom-expression	`PropertyAccess`	0..1	Expression to the geometry to enlarge (or a struct that contains a geometry member). If not specified then the first geometry found will be the one to be enlarged.
mode	`EnlargeMode`	0..1	Controls how geometries are enlarged. Refer to `buffer` function for a description of how mode affects the enlarged geometry.
remove-overlaps	`Boolean`	0..1	When true, overlaps that exist after enlarging the geometries will be removed. Overlaps are removed for each geometry by 1) finding all other geometries that overlap it, and 2) removing the overlap from either the geometry being checked or the other geometry in an alternating manner (this prevents all overlaps being removed from the first geometry). 3) If removing an overlap would render any of the resulting geometries empty, then the overlap is not removed.

Enlarges a geometry by a specified amount. Primarily intended for use with line geometries.

`union`

Combines two pipeline chains into one. For example, this can combine multiple input relations into one. The resulting tuple is a combination of the attributes produced by each of the input steps. For best results, the input steps should produce the same attribute names and types. If an attribute is present in one branch but not another, it will become Nullable. If the attribute is present, but has a different type, then its type may become ‘Anything’.
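The way missing attributes become nullable can be sketched in Python. This models only the attribute-combining behaviour described above: the function `union` is hypothetical, type widening to ‘Anything’ is not modelled, and emitting one branch's rows before the other's is an assumption made for the illustration.

```python
def union(left, right):
    """Hypothetical model: combine two branches so every row has every attribute."""
    rows = list(left) + list(right)  # assumption: left rows first, then right rows
    # Collect the attribute names produced by either branch, in first-seen order.
    attributes = []
    for row in rows:
        for key in row:
            if key not in attributes:
                attributes.append(key)
    # Attributes missing from a row are filled with None, i.e. they become nullable.
    return [{key: row.get(key) for key in attributes} for row in rows]

branch_a = [{"id": 1, "name": "x"}]
branch_b = [{"id": 2, "height": 3.0}]

combined = union(branch_a, branch_b)
```

Each combined row carries `id`, `name` and `height`, with null filling whichever attribute its own branch did not produce.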
