Store and load datasets
The native storage format for primap2 datasets is netCDF, which can store all data and metadata in a single file and supports compression. We again use a toy example dataset to show how to store and reload datasets.
# setup logging for the docs - we don't need debug logs
import sys
from loguru import logger
logger.remove()
logger.add(sys.stderr, level="INFO")
import primap2
import primap2.tests
ds = primap2.tests.examples.toy_ds()
ds
<xarray.Dataset> Size: 3kB
Dimensions:              (time: 6, area (ISO3): 2, category (IPCC2006): 5, source: 2)
Coordinates:
  * time                 (time) datetime64[ns] 48B 2015-01-01 ... 2020-01-01
  * area (ISO3)          (area (ISO3)) <U3 24B 'COL' 'ARG'
  * category (IPCC2006)  (category (IPCC2006)) <U3 60B '0' '1' '2' '1.A' '1.B'
  * source               (source) <U8 64B 'RAND2020' 'RAND2021'
Data variables:
    CO2                  (time, area (ISO3), category (IPCC2006), source) float64 960B [CO2·Gg/a] ...
    CH4                  (time, area (ISO3), category (IPCC2006), source) float64 960B [CH4·Gg/a] ...
    CH4 (SARGWP100)      (time, area (ISO3), category (IPCC2006), source) float64 960B [CO2·Gg/a] ...
Attributes:
    area:     area (ISO3)
    cat:      category (IPCC2006)
Store to disk
To store a dataset to disk, use the xarray.Dataset.pr.to_netcdf() function.
import tempfile
import pathlib

# setup temporary directory to save things to in this example
with tempfile.TemporaryDirectory() as tdname:
    td = pathlib.Path(tdname)

    # simple saving without compression
    ds.pr.to_netcdf(td / "toy_ds.nc")

    # using zlib compression for all gases
    compression = {"zlib": True, "complevel": 9}
    encoding = {var: compression for var in ds.data_vars}
    ds.pr.to_netcdf(td / "toy_ds_compressed.nc", encoding=encoding)
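The `encoding` argument maps each variable name to its own dictionary of settings. As a minimal sketch in plain Python, the helper `build_encoding` below is hypothetical (it is not part of primap2) and builds such a mapping for a list of variable names; the names used here are those of the toy dataset above, and the compression levels are arbitrary illustrative choices.

```python
def build_encoding(var_names, complevel=9):
    """Return an encoding mapping with one zlib settings dict per variable.

    Hypothetical helper for illustration, not part of primap2.
    """
    if not 0 <= complevel <= 9:
        raise ValueError("complevel must be between 0 and 9")
    # a fresh dict per variable, so changing one entry later
    # does not silently change the others
    return {name: {"zlib": True, "complevel": complevel} for name in var_names}


# variable names taken from the toy dataset shown above
encoding = build_encoding(["CO2", "CH4", "CH4 (SARGWP100)"], complevel=4)

# lower the compression level for a single variable only
encoding["CO2"]["complevel"] = 1

print(encoding)
```

A mapping built this way could be passed as `encoding=encoding` to `ds.pr.to_netcdf()`. Giving each variable its own dictionary, rather than sharing one object across all variables, keeps a later per-variable change from affecting the rest.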
Load from disk
We also provide the function primap2.open_dataset() to load datasets back into memory.
In this example, we load a minimal dataset.
ds = primap2.open_dataset("../minimal_ds.nc")
ds
<xarray.Dataset> Size: 3kB
Dimensions:          (time: 21, area (ISO3): 4, source: 1)
Coordinates:
  * area (ISO3)      (area (ISO3)) <U3 48B 'COL' 'ARG' 'MEX' 'BOL'
  * source           (source) <U8 32B 'RAND2020'
  * time             (time) datetime64[ns] 168B 2000-01-01 ... 2020-01-01
Data variables:
    CH4              (time, area (ISO3), source) float64 672B [CH4·Gg/a] 0.75...
    CO2              (time, area (ISO3), source) float64 672B [CO2·Gg/a] 0.66...
    SF6              (time, area (ISO3), source) float64 672B [SF6·Gg/a] 0.00...
    SF6 (SARGWP100)  (time, area (ISO3), source) float64 672B [CO2·Gg/a] 43.0...
Attributes:
    area:     area (ISO3)
Note how units were read and attributes restored.