primap2.pm2io.nir_add_unit_information
- primap2.pm2io.nir_add_unit_information(df_nir: DataFrame, *, unit_row: str | int, entity_row: str | int | None = None, regexp_entity: str, regexp_unit: str | None = None, manual_repl_unit: dict[str, str] | None = None, manual_repl_entity: dict[str, str] | None = None, default_unit: str) → DataFrame
Add unit information to a National Inventory Report (NIR) style DataFrame.
Add unit information to the header of an “entity-wide” file as present in the standard table format of National Inventory Reports (NIRs). Unit and entity are extracted from the combined unit and entity information in the row defined by unit_row. The parameters regexp_unit and regexp_entity determine how this is done, using regular expressions for the unit and the entity. Additionally, manual mappings can be defined in the manual_repl_unit and manual_repl_entity dicts. For each column, the routine tries to extract a unit using the regular expression. If this fails, it looks in the manual_repl_unit dict for unit information and in manual_repl_entity for entity information. If neither provides information, the default unit given in default_unit is used; in this case, the analyzed value is used unchanged as the entity.
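The fallback order described above can be sketched for a single header cell using only the standard library. This is a minimal illustration, not primap2's implementation: the helper name split_unit_and_entity, the example patterns, and the sample cell values are all hypothetical, and the reading that the entity falls back to the unchanged cell value when nothing matches is an assumption based on the description.

```python
import re


def split_unit_and_entity(value, *, regexp_entity, regexp_unit=None,
                          manual_repl_unit=None, manual_repl_entity=None,
                          default_unit):
    """Hypothetical helper: return (entity, unit) for one header cell."""
    manual_repl_unit = manual_repl_unit or {}
    manual_repl_entity = manual_repl_entity or {}
    text = str(value)

    # Unit: regular expression first, then manual mapping, then default.
    unit_matches = re.findall(regexp_unit, text) if regexp_unit is not None else []
    if unit_matches:
        unit = unit_matches[0]
    elif text in manual_repl_unit:
        unit = manual_repl_unit[text]
    else:
        unit = default_unit

    # Entity: regular expression first, then manual mapping,
    # otherwise the cell value is kept unchanged (assumed reading).
    entity_matches = re.findall(regexp_entity, text)
    if entity_matches:
        entity = entity_matches[0]
    elif text in manual_repl_entity:
        entity = manual_repl_entity[text]
    else:
        entity = text
    return entity, unit


# Illustrative settings; the patterns assume cells such as "CO2 (Gg)".
settings = dict(
    regexp_entity=r"^(.*?)\s*\(",   # text before the opening parenthesis
    regexp_unit=r"\((.*?)\)",       # text inside the parentheses
    manual_repl_unit={"Total GHG": "Gg CO2eq"},
    manual_repl_entity={"Total GHG": "KYOTOGHG"},
    default_unit="Gg",
)

for cell in ["CO2 (Gg)", "Total GHG", "NOx"]:
    print(cell, "->", split_unit_and_entity(cell, **settings))
```

With these assumed settings, the first cell is resolved by the regular expressions, the second by the manual mappings, and the third falls back to default_unit while keeping the cell value as the entity.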
- Parameters:
- df_nir : pd.DataFrame
Pandas DataFrame with the table to process.
- unit_row : str or int
The string “header” to indicate that the column header should be used to derive the unit information, or an integer specifying the row to use for unit information. If entity and unit information are given in the same row, use only unit_row.
- entity_row : str or int (optional)
The string “header” to indicate that the column header should be used to derive the entity information, or an integer specifying the row to use for entity information. If entity and unit information are given in the same row, use only unit_row.
- regexp_entity : str
Regular expression that extracts the entity from the cell value.
- regexp_unit : str (optional)
Regular expression that extracts the unit from the cell value.
- manual_repl_unit : dict (optional)
Dict defining the unit for given cell values.
- manual_repl_entity : dict (optional)
Dict defining the entity for given cell values.
- default_unit : str
Unit to be used if no unit can be extracted and no unit is given in manual_repl_unit.
- Returns:
- pd.DataFrame
DataFrame with explicit unit information (as column header)