.. _csv:

# CSV

CSV (Comma Separated Values) is a common text file format for tabular data.
RiskScape can read CSV files directly and use them in your models.
CSV files are useful for storing relational data, such as asset inventories, vulnerability tables, or lookup data.

## Default behaviour

When RiskScape loads data from a CSV file, it assigns the *text* type to each column by default.

For example, using the following `cities.csv` file as a model input would result in four attributes
(`name`, `latitude`, `longitude`, and `population`), all of which are text strings.

```csv
name,latitude,longitude,population
Wellington,-41.2924,174.7787,1816000
Auckland,-36.8509,174.7645,520971
```

This data could not be used for any spatial operations because the coordinates are *text* type, rather than *geometry* type.
The `population` attribute could not be multiplied by a scale factor because it is *text*, rather than *floating* or *integer* type.

To load the data into a more usable format, you would previously have needed to define a bookmark like this:

```ini
[bookmark cities]
location = cities.csv
set-attribute.geometry = create_point(latitude, longitude)
set-attribute.population = float(population)
crs-name = EPSG:4326
```

.. _csv-type-detection:

## Type detection

RiskScape can automatically detect and apply more appropriate data types when loading CSV files.
This makes it easier to work with numeric data and geometry without having to manually convert
text values via the `set-attribute` and `map-attribute` bookmark parameters.

To enable type detection, add `type-detection = true` to your bookmark, for example:

```ini
[bookmark my-csv-data]
location = PATH/TO/data.csv
format = csv
type-detection = true
```

.. note::
    Type detection scans all of the data in the CSV file. This is convenient for smaller CSV files,
    but can take a long time for very large files (e.g. more than 100MB).

Type detection is applied *after* any other :ref:`attribute mapping ` operations
(i.e. `set-attribute` or `map-attribute`).

### Numeric columns

Type detection scans all text columns looking for numeric values:

- If a column contains only numeric values, the type will be changed to `floating`.
- If a column contains a mix of numeric and empty values, the type will be changed to `nullable('floating')`.
  This allows RiskScape to handle missing data appropriately.

.. note::
    If *any* row contains text that cannot be converted to a number, RiskScape
    will assume the column is not numeric, even if all the other rows are.

### Geometry

When type detection finds numeric latitude and longitude columns in your CSV file, it can automatically
create a `geometry` attribute. This makes it easy to use CSV files containing point locations
(such as building centroids) in your models.

RiskScape will add a geometry attribute when:

- numeric columns are found for latitude and longitude coordinates
- there is no other geometry attribute already present

The geometry value added will be a point in the EPSG:4326 CRS (i.e. WGS84).

The columns to use for the latitude and longitude coordinates are determined by:

- Looking for an exact match in column name. RiskScape looks for columns called `longitude`, `long`, or `lon`
  for longitude, in that order, and `latitude` or `lat` for latitude.
- Searching for column names that contain `latitude` and `longitude` as sub-strings,
  e.g. `Building_latitude` or `Longitude_best_guess`.

In both cases the search:

- Is case insensitive, so `Lat` and `LAT` would both be acceptable latitude attribute names.
- Uses the first match, so if `lat` and `custom_latitude` are both present then `lat` would be used.

.. note::
    RiskScape does not currently automatically detect columns containing Well-Known Text (WKT),
    e.g. ``POINT(174.7787 -41.2924)``.
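
### Example: `cities.csv` with type detection

As an illustration, the `cities.csv` file from the start of this page could be loaded with type detection rather than explicit `set-attribute` conversions. Going by the rules described above, the `latitude`, `longitude`, and `population` columns should be detected as `floating`, `name` should remain text, and a point `geometry` attribute should be added from the `latitude` and `longitude` columns. This is a minimal sketch that combines settings already shown on this page. It has not been checked against a particular RiskScape version.

```ini
# Sketch only: reuses the cities.csv example and the type-detection setting shown above
[bookmark cities]
location = cities.csv
format = csv
type-detection = true
```

Unlike the earlier `cities` bookmark, this one sets no `crs-name`, because the detected geometry is described as a point in EPSG:4326.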