Data reading example 4 - FAOstat

To run this example, the file Emissions_Agriculture_Cultivated_Organic_Soils_E_All_Data_(Normalized).csv must be placed in the same folder as this notebook. The data is available from FAOstat.

[1]:
import primap2 as pm2

Dataset Specifications

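Here we define which columns of the CSV file contain the dimensions, which default values to use for dimensions that are missing from the file, which terminologies the area and category values follow, and how raw FAOstat values are mapped and filtered.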

[2]:
file = "Emissions_Agriculture_Cultivated_Organic_Soils_E_All_Data_(Normalized).csv"
# PRIMAP2 dimension -> name of the CSV column that holds it
coords_cols = {
    "unit": "Unit",
    "entity": "Element",
    "area": "Area",
    "category": "Item",
    "data": "Value",
    "time": "Year",
}
# constant values for dimensions that have no column in the CSV file
coords_defaults = {
    "source": "FAOstat",
}
# terminology (naming scheme) used by the values of each dimension
coords_terminologies = {
    "area": "FAOstat",
    "category": "FAOstat",
}

# TODO: proper mapping of the area to ISO3
coords_value_mapping = {
    "unit": {"gigagrams": "Gg N2O / year"},
    "entity": {"Emissions (N2O) (Cultivation of organic soils)": "N2O"},
}

# keep only the rows whose CSV column values match; "f1" is the filter's name
filter_keep = {
    "f1": {
        "Element": "Emissions (N2O) (Cultivation of organic soils)",
    },
}

meta_data = {
    "references": "http://www.fao.org/faostat/en/#data/GV/metadata"
}
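
To make the effect of filter_keep and coords_value_mapping more concrete, the following sketch applies the same idea to a tiny in-memory stand-in for the CSV file, using only the standard library. It is an illustration under assumptions, not how primap2 implements these options: the column order of the stand-in, the second "Element" row, and the names raw, keep_element, value_mapping and rows are hypothetical, and the mapping is keyed by CSV column name here rather than by dimension name as in the cell above.

```python
import csv
import io

# Tiny stand-in for the FAOstat file. The "Some other element" row is
# hypothetical and only exists to show what the filter drops.
raw = io.StringIO(
    "Area,Item,Element,Year,Unit,Value\n"
    "Africa,Cropland organic soils,"
    "Emissions (N2O) (Cultivation of organic soils),1990,gigagrams,17.4597\n"
    "Africa,Cropland organic soils,Some other element,1990,ha,1.0\n"
    "Albania,Cropland organic soils,"
    "Emissions (N2O) (Cultivation of organic soils),1990,gigagrams,0.0378\n"
)

keep_element = "Emissions (N2O) (Cultivation of organic soils)"

# Raw value -> PRIMAP2 style value, keyed by CSV column name.
value_mapping = {
    "Unit": {"gigagrams": "Gg N2O / year"},
    "Element": {keep_element: "N2O"},
}

rows = []
for row in csv.DictReader(raw):
    # Filtering step: keep only rows for the element we are interested in.
    if row["Element"] != keep_element:
        continue
    # Mapping step: translate raw values, leaving unmapped values untouched.
    for column, mapping in value_mapping.items():
        row[column] = mapping.get(row[column], row[column])
    rows.append(row)

for row in rows:
    print(row["Area"], row["Element"], row["Unit"], row["Value"])
```

Filtering happens on the raw values before they are renamed, which is why filter_keep in the cell above refers to the original FAOstat element name rather than to "N2O".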

Reading the data to interchange format

[3]:
AgN2O_if = pm2.pm2io.read_long_csv_file_if(
    file,
    coords_cols=coords_cols,
    coords_defaults=coords_defaults,
    coords_terminologies=coords_terminologies,
    coords_value_mapping=coords_value_mapping,
    filter_keep=filter_keep,
    meta_data=meta_data,
    time_format="%Y",
)
AgN2O_if.head()
[3]:
source area (FAOstat) entity unit category (FAOstat) 1990 1991 1992 1993 1994 ... 2012 2013 2014 2015 2016 2017 2018 2019 2030 2050
0 FAOstat Africa N2O Gg N2O / year Cropland and grassland organic soils 34.3224 34.3224 34.3224 34.3224 34.3211 ... 35.2262 35.2203 35.1451 35.1434 35.1327 35.1277 35.0546 35.0546 35.0546 35.0546
1 FAOstat Africa N2O Gg N2O / year Cropland organic soils 17.4597 17.4597 17.4597 17.4597 17.4588 ... 18.1675 18.1546 18.1318 18.1307 18.1014 18.0904 18.0273 18.0273 18.0273 18.0273
2 FAOstat Africa N2O Gg N2O / year Grassland organic soils 16.8627 16.8627 16.8627 16.8627 16.8623 ... 17.0586 17.0657 17.0133 17.0127 17.0313 17.0373 17.0272 17.0272 17.0272 17.0272
3 FAOstat Albania N2O Gg N2O / year Cropland and grassland organic soils 0.0471 0.0471 0.0471 0.0471 0.0471 ... 0.0470 0.0470 0.0470 0.0470 0.0468 0.0468 0.0460 0.0460 0.0460 0.0460
4 FAOstat Albania N2O Gg N2O / year Cropland organic soils 0.0378 0.0378 0.0378 0.0378 0.0378 ... 0.0377 0.0377 0.0377 0.0377 0.0376 0.0376 0.0369 0.0369 0.0369 0.0369

5 rows × 37 columns

[4]:
AgN2O_if.attrs
[4]:
{'attrs': {'references': 'http://www.fao.org/faostat/en/#data/GV/metadata',
  'area': 'area (FAOstat)',
  'cat': 'category (FAOstat)'},
 'time_format': '%Y',
 'dimensions': {'*': ['source',
   'area (FAOstat)',
   'entity',
   'unit',
   'category (FAOstat)']}}
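
The table above is wide: each row is one combination of the non-time dimensions, and each time step has its own column, whereas the CSV file is long, with one row per value. The sketch below shows that reshaping on a few values copied from the output above, using only the standard library. It is a rough illustration of the layout, not the code that read_long_csv_file_if runs; the names long_records, wide and years are hypothetical.

```python
from collections import defaultdict

# Long format: one record per (area, category, year).
# The values are taken from the interchange format output shown above.
long_records = [
    ("Africa", "Cropland organic soils", "1990", 17.4597),
    ("Africa", "Cropland organic soils", "1994", 17.4588),
    ("Albania", "Cropland organic soils", "1990", 0.0378),
    ("Albania", "Cropland organic soils", "1994", 0.0378),
]

# Wide format: one entry per (area, category), holding a value per year.
wide = defaultdict(dict)
for area, category, year, value in long_records:
    wide[(area, category)][year] = value

# The distinct years become the time columns of the wide table.
years = sorted({year for _, _, year, _ in long_records})

print("area", "category", *years, sep=" | ")
for (area, category), series in wide.items():
    print(area, category, *(series.get(year) for year in years), sep=" | ")
```

In the real interchange format the remaining dimensions (source, entity, unit) are additional index columns, and the attrs dictionary records which columns are dimensions and how the time columns are formatted.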

Transformation to PRIMAP2 xarray format

The transformation to the PRIMAP2 xarray format is done with the function from_interchange_format, which takes an interchange format DataFrame. The resulting xarray Dataset is already quantified, so its variables are pint arrays that carry a unit.

[5]:
AgN2O = pm2.pm2io.from_interchange_format(AgN2O_if)
AgN2O
2021-04-13 17:03:06.126 | DEBUG    | primap2.pm2io._interchange_format:from_interchange_format:266 - Expected array shapes: [[1, 147, 1, 3]], resulting in size 441.
2021-04-13 17:03:06.145 | INFO     | primap2._data_format:ensure_valid_attributes:245 - Reference information is not a DOI: 'http://www.fao.org/faostat/en/#data/GV/metadata'
[5]:
<xarray.Dataset>
Dimensions:             (area (FAOstat): 147, category (FAOstat): 3, source: 1, time: 32)
Coordinates:
  * time                (time) datetime64[ns] 1990-01-01 ... 2050-01-01
  * source              (source) object 'FAOstat'
  * category (FAOstat)  (category (FAOstat)) object 'Cropland and grassland o...
  * area (FAOstat)      (area (FAOstat)) object 'Africa' 'Albania' ... 'Zambia'
Data variables:
    N2O                 (time, source, category (FAOstat), area (FAOstat)) float64 [Gg·N2O/a] ...
Attributes:
    references:  http://www.fao.org/faostat/en/#data/GV/metadata
    area:        area (FAOstat)
    cat:         category (FAOstat)