Data reading example 4 - FAOstat
To run this example, the file Emissions_Agriculture_Cultivated_Organic_Soils_E_All_Data_(Normalized).csv
must be placed in the same folder as this notebook. The data is available from FAOstat.
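Before running the cells below, it can help to confirm that the file is where the notebook expects it. The following is a minimal sketch using only the standard library; the helper name data_file_available is made up for this illustration and is not part of primap2.

```python
from pathlib import Path

FILE = "Emissions_Agriculture_Cultivated_Organic_Soils_E_All_Data_(Normalized).csv"


def data_file_available(filename, folder="."):
    """Return True if `filename` exists as a regular file in `folder`."""
    return (Path(folder) / filename).is_file()


if not data_file_available(FILE):
    # Hypothetical hint for the reader; adjust the wording as needed.
    print(f"{FILE} not found; download it from FAOstat and place it next to this notebook.")
```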
[1]:
import primap2 as pm2
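Dataset Specifications
Here we define which columns of the CSV file hold which dimensions (coords_cols), default values for dimensions the file does not contain (coords_defaults), the terminologies used for area and category (coords_terminologies), the mapping of FAOstat values to the names used in PRIMAP2 (coords_value_mapping), and a filter that selects the rows to keep (filter_keep).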
[2]:
file = "Emissions_Agriculture_Cultivated_Organic_Soils_E_All_Data_(Normalized).csv"
coords_cols = {
    "unit": "Unit",
    "entity": "Element",
    "area": "Area",
    "category": "Item",
    "data": "Value",
    "time": "Year",
}
coords_defaults = {
    "source": "FAOstat",
}
coords_terminologies = {
    "area": "FAOstat",
    "category": "FAOstat",
}
# TODO: proper mapping of the area to ISO3
coords_value_mapping = {
    "unit": {"gigagrams": "Gg N2O / year"},
    "entity": {"Emissions (N2O) (Cultivation of organic soils)": "N2O"},
}
filter_keep = {
    "f1": {
        "Element": "Emissions (N2O) (Cultivation of organic soils)",
    },
}
meta_data = {
    "references": "http://www.fao.org/faostat/en/#data/GV/metadata"
}
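To make the roles of filter_keep and coords_value_mapping more concrete, here is a plain-Python sketch of the idea on a two-row stand-in for the CSV file. It is not primap2's implementation: the helper names are hypothetical, the second sample row is invented, and the rule that a row is kept when it matches any of the named filters is an assumption (with a single filter, as here, it makes no difference).

```python
import csv
import io

# Tiny stand-in for the FAOstat file. The first row uses a value from this
# example's data; the second row is invented purely for illustration.
SAMPLE = """Area,Item,Element,Year,Unit,Value
Albania,Cropland organic soils,Emissions (N2O) (Cultivation of organic soils),1990,gigagrams,0.0378
Albania,Cropland organic soils,Some other element,1990,ha,1.0
"""

coords_cols = {"unit": "Unit", "entity": "Element"}
coords_value_mapping = {
    "unit": {"gigagrams": "Gg N2O / year"},
    "entity": {"Emissions (N2O) (Cultivation of organic soils)": "N2O"},
}
filter_keep = {"f1": {"Element": "Emissions (N2O) (Cultivation of organic soils)"}}


def matches(row, condition):
    """True if every column named in the condition has the required value."""
    return all(row[col] == value for col, value in condition.items())


def filter_and_map(text):
    """Filter on the original values first, then rename the mapped values."""
    kept = []
    for row in csv.DictReader(io.StringIO(text)):
        # Assumption: a row is kept if it satisfies at least one filter.
        if not any(matches(row, cond) for cond in filter_keep.values()):
            continue
        for dim, mapping in coords_value_mapping.items():
            col = coords_cols[dim]
            row[col] = mapping.get(row[col], row[col])
        kept.append(row)
    return kept


result = filter_and_map(SAMPLE)
print(result[0]["Element"], "|", result[0]["Unit"])
```

Note that the filter refers to the CSV column name and the original value ("Element"), while the mappings are keyed by dimension name ("entity", "unit"), mirroring the dictionaries defined above.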
Reading the data to interchange format
[3]:
AgN2O_if = pm2.pm2io.read_long_csv_file_if(
    file,
    coords_cols=coords_cols,
    coords_defaults=coords_defaults,
    coords_terminologies=coords_terminologies,
    coords_value_mapping=coords_value_mapping,
    filter_keep=filter_keep,
    meta_data=meta_data,
    time_format="%Y",
)
AgN2O_if.head()
[3]:
| source | area (FAOstat) | entity | unit | category (FAOstat) | 1990 | 1991 | 1992 | 1993 | 1994 | ... | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2030 | 2050 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | FAOstat | Africa | N2O | Gg N2O / year | Cropland and grassland organic soils | 34.3224 | 34.3224 | 34.3224 | 34.3224 | 34.3211 | ... | 35.2262 | 35.2203 | 35.1451 | 35.1434 | 35.1327 | 35.1277 | 35.0546 | 35.0546 | 35.0546 | 35.0546 |
1 | FAOstat | Africa | N2O | Gg N2O / year | Cropland organic soils | 17.4597 | 17.4597 | 17.4597 | 17.4597 | 17.4588 | ... | 18.1675 | 18.1546 | 18.1318 | 18.1307 | 18.1014 | 18.0904 | 18.0273 | 18.0273 | 18.0273 | 18.0273 |
2 | FAOstat | Africa | N2O | Gg N2O / year | Grassland organic soils | 16.8627 | 16.8627 | 16.8627 | 16.8627 | 16.8623 | ... | 17.0586 | 17.0657 | 17.0133 | 17.0127 | 17.0313 | 17.0373 | 17.0272 | 17.0272 | 17.0272 | 17.0272 |
3 | FAOstat | Albania | N2O | Gg N2O / year | Cropland and grassland organic soils | 0.0471 | 0.0471 | 0.0471 | 0.0471 | 0.0471 | ... | 0.0470 | 0.0470 | 0.0470 | 0.0470 | 0.0468 | 0.0468 | 0.0460 | 0.0460 | 0.0460 | 0.0460 |
4 | FAOstat | Albania | N2O | Gg N2O / year | Cropland organic soils | 0.0378 | 0.0378 | 0.0378 | 0.0378 | 0.0378 | ... | 0.0377 | 0.0377 | 0.0377 | 0.0377 | 0.0376 | 0.0376 | 0.0369 | 0.0369 | 0.0369 | 0.0369 |
5 rows × 37 columns
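The table above is wide: one row per combination of the non-time dimensions and one column per year, whereas the CSV file is long, with one row per value. The reshaping can be sketched in plain Python as follows. This only illustrates the idea and is not how primap2 implements it; the function name to_wide is hypothetical, and the sample values are taken from the table above.

```python
from collections import defaultdict

# Long-format records: (area, category, year, value).
long_rows = [
    ("Albania", "Cropland organic soils", "1990", 0.0378),
    ("Albania", "Cropland organic soils", "2019", 0.0369),
    ("Africa", "Grassland organic soils", "1990", 16.8627),
]


def to_wide(rows):
    """Group values by (area, category) and spread the years into columns."""
    wide = defaultdict(dict)
    for area, category, year, value in rows:
        wide[(area, category)][year] = value
    return dict(wide)


wide = to_wide(long_rows)
for key, years in wide.items():
    print(key, years)
```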
[4]:
AgN2O_if.attrs
[4]:
{'attrs': {'references': 'http://www.fao.org/faostat/en/#data/GV/metadata',
'area': 'area (FAOstat)',
'cat': 'category (FAOstat)'},
'time_format': '%Y',
'dimensions': {'*': ['source',
'area (FAOstat)',
'entity',
'unit',
'category (FAOstat)']}}
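The time_format entry in the attributes records how the year column labels are to be read as dates. "%Y" is a standard strftime/strptime format code for a four-digit year. The sketch below uses the standard library to show how such labels parse; that primap2 interprets the labels in the same way is an assumption here.

```python
from datetime import datetime

time_format = "%Y"
year_labels = ["1990", "2019", "2030", "2050"]

# A bare year parses to midnight on 1 January of that year.
parsed = [datetime.strptime(label, time_format) for label in year_labels]
print(parsed[0].isoformat())  # 1990-01-01T00:00:00
```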
Transformation to PRIMAP2 xarray format
The transformation to the PRIMAP2 xarray format is done with the function from_interchange_format,
which takes an interchange format DataFrame. The resulting xarray Dataset is already quantified, so its variables are pint arrays that carry a unit.
[5]:
AgN2O = pm2.pm2io.from_interchange_format(AgN2O_if)
AgN2O
2021-04-13 17:03:06.126 | DEBUG | primap2.pm2io._interchange_format:from_interchange_format:266 - Expected array shapes: [[1, 147, 1, 3]], resulting in size 441.
2021-04-13 17:03:06.145 | INFO | primap2._data_format:ensure_valid_attributes:245 - Reference information is not a DOI: 'http://www.fao.org/faostat/en/#data/GV/metadata'
[5]:
<xarray.Dataset>
Dimensions:             (area (FAOstat): 147, category (FAOstat): 3, source: 1, time: 32)
Coordinates:
  * time                (time) datetime64[ns] 1990-01-01 ... 2050-01-01
  * source              (source) object 'FAOstat'
  * category (FAOstat)  (category (FAOstat)) object 'Cropland and grassland o...
  * area (FAOstat)      (area (FAOstat)) object 'Africa' 'Albania' ... 'Zambia'
Data variables:
    N2O                 (time, source, category (FAOstat), area (FAOstat)) float64 [Gg·N2O/a] ...
Attributes:
    references:  http://www.fao.org/faostat/en/#data/GV/metadata
    area:        area (FAOstat)
    cat:         category (FAOstat)
xarray.Dataset
- area (FAOstat): 147
- category (FAOstat): 3
- source: 1
- time: 32
- time(time)datetime64[ns]1990-01-01 ... 2050-01-01
array(['1990-01-01T00:00:00.000000000', '1991-01-01T00:00:00.000000000', '1992-01-01T00:00:00.000000000', '1993-01-01T00:00:00.000000000', '1994-01-01T00:00:00.000000000', '1995-01-01T00:00:00.000000000', '1996-01-01T00:00:00.000000000', '1997-01-01T00:00:00.000000000', '1998-01-01T00:00:00.000000000', '1999-01-01T00:00:00.000000000', '2000-01-01T00:00:00.000000000', '2001-01-01T00:00:00.000000000', '2002-01-01T00:00:00.000000000', '2003-01-01T00:00:00.000000000', '2004-01-01T00:00:00.000000000', '2005-01-01T00:00:00.000000000', '2006-01-01T00:00:00.000000000', '2007-01-01T00:00:00.000000000', '2008-01-01T00:00:00.000000000', '2009-01-01T00:00:00.000000000', '2010-01-01T00:00:00.000000000', '2011-01-01T00:00:00.000000000', '2012-01-01T00:00:00.000000000', '2013-01-01T00:00:00.000000000', '2014-01-01T00:00:00.000000000', '2015-01-01T00:00:00.000000000', '2016-01-01T00:00:00.000000000', '2017-01-01T00:00:00.000000000', '2018-01-01T00:00:00.000000000', '2019-01-01T00:00:00.000000000', '2030-01-01T00:00:00.000000000', '2050-01-01T00:00:00.000000000'], dtype='datetime64[ns]')
- source(source)object'FAOstat'
array(['FAOstat'], dtype=object)
- category (FAOstat)(category (FAOstat))object'Cropland and grassland organic ...
array(['Cropland and grassland organic soils', 'Cropland organic soils', 'Grassland organic soils'], dtype=object)
- area (FAOstat)(area (FAOstat))object'Africa' 'Albania' ... 'Zambia'
array(['Africa', 'Albania', 'Americas', 'Angola', 'Annex I countries', 'Argentina', 'Asia', 'Australia', 'Australia and New Zealand', 'Austria', 'Bangladesh', 'Belarus', 'Belgium', 'Belgium-Luxembourg', 'Belize', 'Bosnia and Herzegovina', 'Botswana', 'Brazil', 'Brunei Darussalam', 'Bulgaria', 'Burundi', 'Cameroon', 'Canada', 'Caribbean', 'Central African Republic', 'Central America', 'Chile', 'China', 'China, mainland', 'Colombia', 'Congo', 'Costa Rica', 'Croatia', 'Czechia', 'Czechoslovakia', "Côte d'Ivoire", "Democratic People's Republic of Korea", 'Democratic Republic of the Congo', 'Denmark', 'Eastern Africa', 'Eastern Asia', 'Eastern Europe', 'Ecuador', 'Equatorial Guinea', 'Eritrea', 'Estonia', 'Ethiopia', 'Ethiopia PDR', 'Europe', 'European Union', 'Falkland Islands (Malvinas)', 'Faroe Islands', 'Fiji', 'Finland', 'France', 'French Guiana', 'Gabon', 'Germany', 'Ghana', 'Greece', 'Guinea', 'Guinea-Bissau', 'Guyana', 'Hungary', 'Iceland', 'India', 'Indonesia', 'Ireland', 'Isle of Man', 'Italy', 'Jamaica', 'Japan', 'Kenya', 'Land Locked Developing Countries', 'Latvia', 'Least Developed Countries', 'Liberia', 'Lithuania', 'Low Income Food Deficit Countries', 'Luxembourg', 'Madagascar', 'Malawi', 'Malaysia', 'Melanesia', 'Middle Africa', 'Mongolia', 'Montenegro', 'Myanmar', 'Namibia', 'Nepal', 'Net Food Importing Developing Countries', 'Netherlands', 'New Zealand', 'Nicaragua', 'Non-Annex I countries', 'Northern Africa', 'Northern America', 'Northern Europe', 'Norway', 'OECD', 'Oceania', 'Panama', 'Papua New Guinea', 'Peru', 'Poland', 'Portugal', 'Puerto Rico', 'Republic of Moldova', 'Romania', 'Russian Federation', 'Rwanda', 'Serbia', 'Serbia and Montenegro', 'Slovakia', 'Slovenia', 'Small Island Developing States', 'Solomon Islands', 'South Africa', 'South America', 'South Sudan', 'South-Eastern Asia', 'Southern Africa', 'Southern Asia', 'Southern Europe', 'Spain', 'Sri Lanka', 'Sudan (former)', 'Suriname', 'Sweden', 'Switzerland', 'Thailand', 'Turkey', 'USSR', 
'Uganda', 'Ukraine', 'United Kingdom', 'United Republic of Tanzania', 'United States of America', 'Uruguay', 'Venezuela (Bolivarian Republic of)', 'Viet Nam', 'Western Africa', 'Western Asia', 'Western Europe', 'World', 'Yugoslav SFR', 'Zambia'], dtype=object)
- N2O(time, source, category (FAOstat), area (FAOstat))float64[Gg·N2O/a] 34.32 0.0471 ... 5.921
- entity :
- N2O
Magnitude [[[[34.3224 0.0471 52.6792 ... 382.035 0.1305 9.1669]
[17.4597 0.0378 30.6613 ... 250.4325 0.0731 3.1896]
[16.8627 0.0093 22.0179 ... 131.6025 0.0574 5.9773]]]
[[[34.3224 0.0471 52.6792 ... 382.035 0.1305 9.1669]
[17.4597 0.0378 30.6613 ... 250.4325 0.0731 3.1896]
[16.8627 0.0093 22.0179 ... 131.6025 0.0574 5.9773]]]
[[[34.3224 0.0471 52.6792 ... 382.035 nan 9.1669]
[17.4597 0.0378 30.6613 ... 250.4325 nan 3.1896]
[16.8627 0.0093 22.0179 ... 131.6025 nan 5.9773]]]
...
[[[35.0546 0.046 55.6044 ... 416.8022 nan 9.1346]
[18.0273 0.0369 32.0142 ... 280.9176 nan 3.2132]
[17.0272 0.0091 23.5902 ... 135.8846 nan 5.9214]]]
[[[35.0546 0.046 55.6044 ... 416.8022 nan 9.1346]
[18.0273 0.0369 32.0142 ... 280.9176 nan 3.2132]
[17.0272 0.0091 23.5902 ... 135.8846 nan 5.9214]]]
[[[35.0546 0.046 55.6044 ... 416.8022 nan 9.1346]
[18.0273 0.0369 32.0142 ... 280.9176 nan 3.2132]
[17.0272 0.0091 23.5902 ... 135.8846 nan 5.9214]]]]
Units N2O gigagram/year
- references :
- http://www.fao.org/faostat/en/#data/GV/metadata
- area :
- area (FAOstat)
- cat :
- category (FAOstat)
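As a quick sanity check, the sizes reported above are consistent with each other: the product of the non-time shape in the DEBUG log line (1 × 147 × 1 × 3) is the logged size of 441, and multiplying by the 32 time steps gives the number of cells in the N2O variable. A small sketch of that arithmetic, with the numbers copied from the output above:

```python
from math import prod

non_time_shape = [1, 147, 1, 3]  # from the DEBUG log line
n_time = 32                      # from the Dataset dimensions

cells_per_time_step = prod(non_time_shape)
total_cells = cells_per_time_step * n_time
print(cells_per_time_step, total_cells)  # 441 14112
```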