xarray.DataArray.pr.set
- DataArray.pr.set(dim: Hashable, key: Any, value: DataArray | ndarray, *, value_dims: list[Hashable] | None = None, existing: str = 'fillna_empty', new: str = 'extend') → DataArray
Set values, optionally expanding the given dimension as necessary.
The handling of already existing key values can be selected using the existing parameter.
- Parameters:
- dim: str
Dimension along which values should be set.
- key: scalar or list of scalars
Keys in the dimension which should be set. Key values which are missing in the dimension are inserted. The handling of key values which already exist in the dimension is determined by the existing parameter.
- value: xr.DataArray or np.ndarray
Values that will be inserted at the positions specified by key. value needs to be broadcastable to da[{dim: key}].
- value_dims: list of str, optional
Specifies the dimensions of value. If value is not a DataArray and da[{dim: key}] is higher-dimensional, it is necessary to specify the value dimensions.
- existing: “fillna_empty”, “error”, “overwrite”, or “fillna”, optional
How to handle existing keys. If existing="fillna_empty" (default), new values overwrite existing values only if all existing values are NaN. If existing="error", a ValueError is raised if any key already exists in the index. If existing="overwrite", new values overwrite current values for existing keys. If existing="fillna", the new values only overwrite NaN values for existing keys.
- new: “extend” or “error”, optional
How to handle new keys. If new="extend" (default), keys which do not exist so far are automatically inserted by extending the dimension. If new="error", a KeyError is raised if any key is not yet in the dimension.
- Returns:
- da: xr.DataArray
modified DataArray
Examples
>>> import pandas as pd
>>> import xarray as xr
>>> import numpy as np
>>> import primap2  # registers the .pr accessor on xarray objects
>>> da = xr.DataArray(
...     [[0.0, 1.0, 2.0, 3.0], [2.0, 3.0, 4.0, 5.0]],
...     coords=[
...         ("area (ISO3)", ["COL", "MEX"]),
...         ("time", pd.date_range("2000", "2003", freq="YS")),
...     ],
... )
>>> da
<xarray.DataArray (area (ISO3): 2, time: 4)> Size: 64B
array([[0., 1., 2., 3.],
       [2., 3., 4., 5.]])
Coordinates:
  * area (ISO3)  (area (ISO3)) <U3 24B 'COL' 'MEX'
  * time         (time) datetime64[ns] 32B 2000-01-01 2001-01-01 ... 2003-01-01
Setting an existing value
>>> da.pr.set("area", "COL", np.array([0.5, 0.6, 0.7, 0.8]))
Traceback (most recent call last):
    ...
ValueError: Values {'COL'} for 'area (ISO3)' already exist and contain data. ...
>>> da.pr.set("area", "COL", np.array([0.5, 0.6, 0.7, 0.8]), existing="overwrite")
<xarray.DataArray (area (ISO3): 2, time: 4)> Size: 64B
array([[0.5, 0.6, 0.7, 0.8],
       [2. , 3. , 4. , 5. ]])
Coordinates:
  * area (ISO3)  (area (ISO3)) <U3 24B 'COL' 'MEX'
  * time         (time) datetime64[ns] 32B 2000-01-01 2001-01-01 ... 2003-01-01
By default, existing values are only overwritten if all existing values are NaN
>>> da_partly_empty = da.copy(deep=True)
>>> da_partly_empty.pr.loc[{"area": "COL"}] = np.nan
>>> da_partly_empty
<xarray.DataArray (area (ISO3): 2, time: 4)> Size: 64B
array([[nan, nan, nan, nan],
       [ 2.,  3.,  4.,  5.]])
Coordinates:
  * area (ISO3)  (area (ISO3)) <U3 24B 'COL' 'MEX'
  * time         (time) datetime64[ns] 32B 2000-01-01 2001-01-01 ... 2003-01-01
>>> da_partly_empty.pr.set("area", "COL", np.array([0.5, 0.6, 0.7, 0.8]))
<xarray.DataArray (area (ISO3): 2, time: 4)> Size: 64B
array([[0.5, 0.6, 0.7, 0.8],
       [2. , 3. , 4. , 5. ]])
Coordinates:
  * area (ISO3)  (area (ISO3)) <U3 24B 'COL' 'MEX'
  * time         (time) datetime64[ns] 32B 2000-01-01 2001-01-01 ... 2003-01-01
>>> # if even one value contains data, the default is to raise an Error
>>> da_partly_empty.pr.loc[{"area": "COL", "time": "2001"}] = 0.6
>>> da_partly_empty
<xarray.DataArray (area (ISO3): 2, time: 4)> Size: 64B
array([[nan, 0.6, nan, nan],
       [2. , 3. , 4. , 5. ]])
Coordinates:
  * area (ISO3)  (area (ISO3)) <U3 24B 'COL' 'MEX'
  * time         (time) datetime64[ns] 32B 2000-01-01 2001-01-01 ... 2003-01-01
>>> da_partly_empty.pr.set("area", "COL", np.array([0.5, 0.6, 0.7, 0.8]))
Traceback (most recent call last):
    ...
ValueError: Values {'COL'} for 'area (ISO3)' already exist and contain data. ...
Introducing a new value uses the same syntax as modifying existing values
>>> da.pr.set("area", "ARG", np.array([0.5, 0.6, 0.7, 0.8]))
<xarray.DataArray (area (ISO3): 3, time: 4)> Size: 96B
array([[0.5, 0.6, 0.7, 0.8],
       [0. , 1. , 2. , 3. ],
       [2. , 3. , 4. , 5. ]])
Coordinates:
  * area (ISO3)  (area (ISO3)) <U3 36B 'ARG' 'COL' 'MEX'
  * time         (time) datetime64[ns] 32B 2000-01-01 2001-01-01 ... 2003-01-01
You can also mix existing and new values
>>> da.pr.set(
...     "area",
...     ["COL", "ARG"],
...     np.array([[0.5, 0.6, 0.7, 0.8], [5, 6, 7, 8]]),
...     existing="overwrite",
... )
<xarray.DataArray (area (ISO3): 3, time: 4)> Size: 96B
array([[5. , 6. , 7. , 8. ],
       [0.5, 0.6, 0.7, 0.8],
       [2. , 3. , 4. , 5. ]])
Coordinates:
  * area (ISO3)  (area (ISO3)) <U3 36B 'ARG' 'COL' 'MEX'
  * time         (time) datetime64[ns] 32B 2000-01-01 2001-01-01 ... 2003-01-01
If you don’t want to automatically extend the dimensions with new values, you can request checking that all keys already exist using new="error":
>>> da.pr.set("area", "ARG", np.array([0.5, 0.6, 0.7, 0.8]), new="error")
Traceback (most recent call last):
    ...
KeyError: "Values {'ARG'} not in 'area (ISO3)', use new='extend' to automatic...
If you want to use broadcasting or have more dimensions, the dimensions of your input can’t be determined automatically anymore. Use the value_dims parameter to supply this information.
>>> da.pr.set(
...     "area",
...     ["COL", "ARG"],
...     np.array([0.5, 0.6, 0.7, 0.8]),
...     existing="overwrite",
... )
Traceback (most recent call last):
    ...
ValueError: Could not automatically determine value dimensions, please use th...
>>> da.pr.set(
...     "area",
...     ["COL", "ARG"],
...     np.array([0.5, 0.6, 0.7, 0.8]),
...     value_dims=["time"],
...     existing="overwrite",
... )
<xarray.DataArray (area (ISO3): 3, time: 4)> Size: 96B
array([[0.5, 0.6, 0.7, 0.8],
       [0.5, 0.6, 0.7, 0.8],
       [2. , 3. , 4. , 5. ]])
Coordinates:
  * area (ISO3)  (area (ISO3)) <U3 36B 'ARG' 'COL' 'MEX'
  * time         (time) datetime64[ns] 32B 2000-01-01 2001-01-01 ... 2003-01-01
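To see why the value dimensions have to be named in this case, it can help to look at the shapes involved. The following is a minimal NumPy-only sketch, independent of the pr accessor: the selection for two keys has shape (2, 4), while the value has shape (4,). Once the value is known to run along "time", it can be repeated for each key. np.broadcast_to merely mimics that step here and is an assumption for illustration, not necessarily what the library calls internally.

```python
import numpy as np

value = np.array([0.5, 0.6, 0.7, 0.8])  # one entry per time step
n_keys, n_time = 2, 4  # target selection: 2 areas x 4 time steps

# Knowing that the value runs along "time", it is repeated once per key.
broadcast = np.broadcast_to(value, (n_keys, n_time))
print(broadcast.shape)  # (2, 4)
```

Both rows of the broadcast array hold the same four values, which matches the identical 'ARG' and 'COL' rows in the output above.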
Instead of overwriting existing values, you can also choose to only fill missing values.
>>> da.pr.loc[{"area": "COL", "time": "2001"}] = np.nan
>>> da
<xarray.DataArray (area (ISO3): 2, time: 4)> Size: 64B
array([[ 0., nan,  2.,  3.],
       [ 2.,  3.,  4.,  5.]])
Coordinates:
  * area (ISO3)  (area (ISO3)) <U3 24B 'COL' 'MEX'
  * time         (time) datetime64[ns] 32B 2000-01-01 2001-01-01 ... 2003-01-01
>>> da.pr.set(
...     "area",
...     ["COL", "ARG"],
...     np.array([0.5, 0.6, 0.7, 0.8]),
...     value_dims=["time"],
...     existing="fillna",
... )
<xarray.DataArray (area (ISO3): 3, time: 4)> Size: 96B
array([[0.5, 0.6, 0.7, 0.8],
       [0. , 0.6, 2. , 3. ],
       [2. , 3. , 4. , 5. ]])
Coordinates:
  * area (ISO3)  (area (ISO3)) <U3 36B 'ARG' 'COL' 'MEX'
  * time         (time) datetime64[ns] 32B 2000-01-01 2001-01-01 ... 2003-01-01
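The behaviour of the existing and new parameters shown in the examples above can be summarised in a few lines of plain Python. This is a minimal sketch of the documented semantics only, not the accessor's implementation: set_row is a hypothetical helper, and a dict of lists stands in for the DataArray, with each key mapping to one row of values.

```python
import math


def set_row(table, key, values, existing="fillna_empty", new="extend"):
    """Return a copy of table with values set at key (hypothetical helper)."""
    result = {k: list(row) for k, row in table.items()}

    if key not in result:
        if new == "error":
            raise KeyError(f"{key!r} not in table")
        result[key] = list(values)  # new="extend": insert the missing key
        return result

    row = result[key]
    has_data = any(not math.isnan(x) for x in row)
    if existing == "error":
        raise ValueError(f"{key!r} already exists")
    if existing == "fillna_empty" and has_data:
        raise ValueError(f"{key!r} already exists and contains data")
    if existing == "fillna":
        # Keep existing data and fill only the NaN positions.
        result[key] = [v if math.isnan(x) else x for x, v in zip(row, values)]
    else:
        # "overwrite", or "fillna_empty" with an all-NaN row.
        result[key] = list(values)
    return result


nan = math.nan
table = {"COL": [0.0, nan, 2.0, 3.0], "MEX": [2.0, 3.0, 4.0, 5.0]}
new_values = [0.5, 0.6, 0.7, 0.8]

filled = set_row(table, "COL", new_values, existing="fillna")
overwritten = set_row(table, "COL", new_values, existing="overwrite")
extended = set_row(table, "ARG", new_values)
print(filled["COL"])  # [0.0, 0.6, 2.0, 3.0]
```

As in the accessor examples, the input is left unchanged and a modified copy is returned; with the defaults, setting "COL" here would raise a ValueError because the row already contains data.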
Because you can also supply a DataArray as a value, it is easy to define values from existing values using arithmetic
>>> da.pr.set("area", "ARG", da.pr.loc[{"area": "COL"}] * 2)
<xarray.DataArray (area (ISO3): 3, time: 4)> Size: 96B
array([[ 0., nan,  4.,  6.],
       [ 0., nan,  2.,  3.],
       [ 2.,  3.,  4.,  5.]])
Coordinates:
  * area (ISO3)  (area (ISO3)) object 24B 'ARG' 'COL' 'MEX'
  * time         (time) datetime64[ns] 32B 2000-01-01 2001-01-01 ... 2003-01-01