{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Data reading example 2 - PRIMAP-hist v2.2 #\n", "To run this example, place the file `PRIMAPHIST22__19-Jan-2021.csv` in the same folder as this notebook.\n", "The PRIMAP-hist data (doi:10.5281/zenodo.4479172) is available from Zenodo: https://zenodo.org/record/4479172" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# imports\n", "import primap2 as pm2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dataset Specifications ##\n", "Here we define which columns of the CSV file contain the coordinates.\n", "The dict `coords_cols` contains the mapping of CSV columns to PRIMAP2 dimensions.\n", "Default values are set using `coords_defaults`.\n", "The terminologies (e.g. IPCC2006 for categories or the ISO3 country codes for area) are set in the `coords_terminologies` dict.\n", "`coords_value_mapping` defines the conversion of metadata values, e.g. category codes.\n", "`filter_keep` and `filter_remove` filter the input data.\n", "Each entry in `filter_keep` specifies a subset of the input data that is kept, while the subsets defined by `filter_remove` are removed from the input data.\n", "\n", "For details, see the documentation of `read_wide_csv_file_if` in the `pm2io` module of PRIMAP2."
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "file = \"PRIMAPHIST22__19-Jan-2021.csv\"\n", "# PRIMAP2 dimension -> name of the CSV column holding it\n", "coords_cols = {\n", "    \"unit\": \"unit\",\n", "    \"entity\": \"entity\",\n", "    \"area\": \"country\",\n", "    \"scenario\": \"scenario\",\n", "    \"category\": \"category\",\n", "}\n", "# dimensions without a CSV column get a fixed default value\n", "coords_defaults = {\n", "    \"source\": \"PRIMAP-hist_v2.2\",\n", "}\n", "# terminology used for the values of each dimension\n", "coords_terminologies = {\n", "    \"area\": \"ISO3\",\n", "    \"category\": \"IPCC2006\",\n", "    \"scenario\": \"PRIMAP-hist\",\n", "}\n", "\n", "# conversion of metadata values, e.g. category codes\n", "coords_value_mapping = {\n", "    \"category\": \"PRIMAP1\",\n", "    \"unit\": \"PRIMAP1\",\n", "    \"entity\": \"PRIMAP1\",\n", "}\n", "\n", "# each entry selects a subset of the input data to keep\n", "filter_keep = {\n", "    \"f1\": {\n", "        \"entity\": \"CO2\",\n", "        \"category\": [\"IPC2\", \"IPC1\"],\n", "        \"country\": [\"AUS\", \"BRA\", \"CHN\", \"GBR\", \"AFG\"],\n", "    },\n", "    \"f2\": {\n", "        \"entity\": \"KYOTOGHG\",\n", "        \"category\": [\"IPCMAG\", \"IPC4\"],\n", "        \"country\": [\"AUS\", \"BRA\", \"CHN\", \"GBR\", \"AFG\"],\n", "    },\n", "}\n", "\n", "# each entry selects a subset of the input data to remove\n", "filter_remove = {\"f1\": {\"scenario\": \"HISTTP\"}}\n", "# alternative filter settings:\n", "# filter_keep = {\"f1\": {\"entity\": \"KYOTOGHG\", \"category\": [\"IPC2\", \"IPC1\"]},}\n", "# filter_keep = {}\n", "# filter_remove = {}\n", "\n", "meta_data = {\"references\": \"doi:10.5281/zenodo.4479172\"}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Reading the data to interchange format ##\n", "To make the PRIMAP2 data reading functionality more widely usable, we first read the data into the PRIMAP2 interchange format: a wide-format pandas DataFrame with the coordinates in columns, following the PRIMAP2 specifications.\n", "Additional metadata not captured in this format is stored in `DataFrame.attrs` as a dictionary.\n", "Because the `attrs` functionality in pandas is experimental, the metadata is only attached to the DataFrame returned by the reading functions and should be saved separately before doing any processing with the DataFrame.\n", "\n", "Here we read the data using the
`read_wide_csv_file_if()` function. We have specified restrictive filters above to limit the data included in this notebook." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": " source scenario (PRIMAP-hist) area (ISO3) entity \\\n0 PRIMAP-hist_v2.2 HISTCR AFG CO2 \n1 PRIMAP-hist_v2.2 HISTCR AFG CO2 \n2 PRIMAP-hist_v2.2 HISTCR AFG KYOTOGHG (SARGWP100) \n3 PRIMAP-hist_v2.2 HISTCR AFG KYOTOGHG (SARGWP100) \n4 PRIMAP-hist_v2.2 HISTCR AUS CO2 \n\n unit category (IPCC2006) 1850 1851 1852 1853 ... \\\n0 Gg CO2 / yr 1 0.147 0.155 0.163 0.172 ... \n1 Gg CO2 / yr 2 0.169 0.178 0.188 0.198 ... \n2 Gg CO2 / yr 4 155.000 154.000 154.000 153.000 ... \n3 Gg CO2 / yr M.AG 615.000 668.000 719.000 770.000 ... \n4 Gg CO2 / yr 1 0.000 0.000 0.000 0.000 ... \n\n 2009 2010 2011 2012 2013 2014 2015 \\\n0 6750.0 8440.0 12200.0 10700.0 9990.0 11000.0 11700.0 \n1 191.0 207.0 207.0 268.0 341.0 318.0 269.0 \n2 3080.0 3160.0 3270.0 3400.0 3510.0 3620.0 3730.0 \n3 12400.0 14100.0 14300.0 14200.0 14100.0 14600.0 13600.0 \n4 384000.0 380000.0 378000.0 383000.0 376000.0 372000.0 380000.0 \n\n 2016 2017 2018 \n0 12700.0 13100.0 18600.0 \n1 293.0 292.0 298.0 \n2 3800.0 3900.0 4010.0 \n3 13900.0 13800.0 13300.0 \n4 389000.0 393000.0 393000.0 \n\n[5 rows x 175 columns]", "text/html": "
" }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PMH_if = pm2.pm2io.read_wide_csv_file_if(\n", "    file,\n", "    coords_cols=coords_cols,\n", "    coords_defaults=coords_defaults,\n", "    coords_terminologies=coords_terminologies,\n", "    coords_value_mapping=coords_value_mapping,\n", "    filter_keep=filter_keep,\n", "    filter_remove=filter_remove,\n", "    meta_data=meta_data,\n", ")\n", "PMH_if.head()" ] }, { "cell_type": "code", "execution_count": 4, "outputs": [ { "data": { "text/plain": "{'attrs': {'references': 'doi:10.5281/zenodo.4479172',\n 'area': 'area (ISO3)',\n 'scen': 'scenario (PRIMAP-hist)',\n 'cat': 'category (IPCC2006)'},\n 'time_format': '%Y',\n 'dimensions': {'CO2': ['unit',\n 'entity',\n 'area (ISO3)',\n 'category (IPCC2006)',\n 'source',\n 'scenario (PRIMAP-hist)'],\n 'KYOTOGHG (SARGWP100)': ['unit',\n 'entity',\n 'area (ISO3)',\n 'category (IPCC2006)',\n 'source',\n 'scenario (PRIMAP-hist)']}}" }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PMH_if.attrs" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Transformation to PRIMAP2 xarray format ##\n", "The transformation to the PRIMAP2 xarray format is done with the function `from_interchange_format`, which takes an interchange format DataFrame.\n", "The resulting xarray Dataset is already quantified, so its variables are pint arrays that carry a unit."
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2021-03-31 11:46:22.696 | DEBUG | primap2.pm2io._interchange_format:from_interchange_format:252 - Expected array shapes: [[2, 5, 4, 1, 1], [2, 5, 4, 1, 1]], resulting in size 80.\n" ] }, { "data": { "text/plain": "\nDimensions: (area (ISO3): 5, category (IPCC2006): 4, scenario (PRIMAP-hist): 1, source: 1, time: 169)\nCoordinates:\n * category (IPCC2006) (category (IPCC2006)) object '1' '2' '4' 'M.AG'\n * time (time) datetime64[ns] 1850-01-01 ... 2018-01-01\n * area (ISO3) (area (ISO3)) object 'AFG' 'AUS' 'BRA' 'CHN' 'GBR'\n * source (source) object 'PRIMAP-hist_v2.2'\n * scenario (PRIMAP-hist) (scenario (PRIMAP-hist)) object 'HISTCR'\nData variables:\n CO2 (time, area (ISO3), category (IPCC2006), source, scenario (PRIMAP-hist)) float64 [CO2·Gg/annum] ...\n KYOTOGHG (SARGWP100) (time, area (ISO3), category (IPCC2006), source, scenario (PRIMAP-hist)) float64 [CO2·Gg/annum] ...\nAttributes:\n references: doi:10.5281/zenodo.4479172\n area: area (ISO3)\n scen: scenario (PRIMAP-hist)\n cat: category (IPCC2006)", "text/html": "
" }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PMH_pm2 = pm2.pm2io.from_interchange_format(PMH_if)\n", "PMH_pm2" ] }, { "cell_type": "code", "execution_count": 5, "outputs": [], "source": [], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }