sbmlsim.data

Module handling data (experiment and simulation).

Module Contents

Classes

Data

Data.

DataSeries

DataSeries - a pd.Series with additional unit information.

DataSet

DataSet.

Functions

load_pkdb_dataframe(sid, data_path[, sep, comment])

Load TSV data from PKDB figure or table id.

load_pkdb_dataframes_by_substance(sid, data_path, **kwargs)

Load dataframes from given PKDB figure/table id split on substance.

Attributes

logger

sbmlsim.data.logger[source]
class sbmlsim.data.Data(index, symbol=None, task=None, dataset=None, function=None, variables=None, parameters=None, sid=None)[source]

Bases: object

Data.

Main data generator class which uses data either from experimental data, simulations or via function calculations.

A Data object describes a tree of data operations and transformations. It is only a promise for data, which is fulfilled later with data from tasks.

Parameters:
  • index (str) –

  • symbol (Optional[Symbols]) –

  • task (str) –

  • dataset (str) –

  • function (str) –

  • variables (Dict[str, Data]) –

  • parameters (Dict[str, float]) –

  • sid (str) –

class Types[source]

Bases: enum.Enum

Data types.

TASK = 1[source]
DATASET = 2[source]
FUNCTION = 3[source]
class Symbols[source]

Bases: enum.Enum

Symbols.

TIME = 1[source]
AMOUNT = 2[source]
CONCENTRATION = 3[source]
property selection: str[source]

Get selection string.

Depending on symbol, different selections have to be performed.

Return type:

str

property sid: str[source]

Get id.

Return type:

str

property name: str[source]

Get name.

Return type:

str

property dtype: Data[source]

Get data type.

Return type:

Data

__repr__()[source]

Get string.

Return type:

str

is_task()[source]

Check if task.

Return type:

bool

is_dataset()[source]

Check if dataset.

Return type:

bool

is_function()[source]

Check if function.

to_dict()[source]

Convert to dictionary.

get_data(experiment, to_units=None)[source]

Return actual data from the data object.

The data is resolved from the available datasets and the injected Experiment.

Parameters:
Returns:

Return type:

sbmlsim.units.Quantity
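
The relationship between the three documented Data.Types values and the constructor arguments can be illustrated with a small stand-in class. This is a minimal sketch, not the sbmlsim implementation: the class name DataPromise, the precedence of task over dataset over function, and the example ids are assumptions made for illustration only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Types(Enum):
    """Mirrors the documented Data.Types values."""

    TASK = 1
    DATASET = 2
    FUNCTION = 3


@dataclass
class DataPromise:
    """Hypothetical stand-in for Data: stores references, not values."""

    index: str
    task: Optional[str] = None
    dataset: Optional[str] = None
    function: Optional[str] = None

    @property
    def dtype(self) -> Types:
        # Assumed precedence: task, then dataset, then function.
        if self.task is not None:
            return Types.TASK
        if self.dataset is not None:
            return Types.DATASET
        if self.function is not None:
            return Types.FUNCTION
        raise ValueError("one of task, dataset or function must be set")

    def is_task(self) -> bool:
        return self.dtype is Types.TASK


# Hypothetical ids; no data is loaded or simulated at this point.
from_simulation = DataPromise(index="time", task="task_example")
from_experiment = DataPromise(index="mean", dataset="Amchin1999_Tab1")
```

In this sketch the object only records where its values will come from, which is the sense in which the class description calls Data a promise.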

class sbmlsim.data.DataSeries(data=None, index=None, dtype=None, name=None, copy=False, fastpath=False)[source]

Bases: pandas.Series

DataSeries - a pd.Series with additional unit information.

Parameters:
  • dtype (Dtype | None) –

  • copy (bool) –

  • fastpath (bool) –

property _constructor[source]

Used when a manipulation result has the same dimensions as the original.

property _constructor_expanddim[source]

Used when a manipulation result has one higher dimension than the original, such as Series.to_frame().

_metadata = ['uinfo'][source]
class sbmlsim.data.DataSet(data=None, index=None, columns=None, dtype=None, copy=None)[source]

Bases: pandas.DataFrame

DataSet.

pd.DataFrame with additional unit information in the form of UnitInformations.

Parameters:
  • index (Axes | None) –

  • columns (Axes | None) –

  • dtype (Dtype | None) –

  • copy (bool | None) –

property _constructor[source]

Used when a manipulation result has the same dimensions as the original.

property _constructor_sliced[source]
_metadata = ['uinfo', 'Q_'][source]
get_quantity(key)[source]

Return quantity for given key.

Requires using the numpy data instead of the series.

Parameters:

key (str) –

__repr__()[source]

Return DataFrame with all columns.

Return type:

str

classmethod from_df(df, ureg, udict=None)[source]

Create DataSet from given pandas.DataFrame.

The DataFrame can have various formats which should be handled. Standard formats are

  1. units annotations based on ‘*_unit’ columns, with additional ‘*_sd’ or ‘*_se’ units

  2. units annotations based on ‘unit’ column which is applied on ‘mean’, ‘value’, ‘sd’ and ‘se’ columns

Parameters:
  • df (pandas.DataFrame) – pandas.DataFrame

  • uinfo – optional units information

  • ureg (sbmlsim.units.UnitRegistry) –

  • udict (Dict[str, str]) –

Returns:

dataset

Return type:

DataSet
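
The two standard unit layouts described for from_df can be illustrated without pandas. The following is a minimal sketch under stated assumptions: the helper collect_units is hypothetical, columns are plain lists keyed by name, and each unit column is assumed to hold a single unit repeated in every row (only the first row is read). It does not reproduce how from_df resolves units internally.

```python
from typing import Dict, List


def collect_units(columns: Dict[str, List[object]]) -> Dict[str, str]:
    """Map data column names to unit strings for the two documented layouts."""
    units: Dict[str, str] = {}

    # Layout 1: a '<name>_unit' column annotates '<name>' and, if present,
    # the '<name>_sd' and '<name>_se' columns.
    for name, values in columns.items():
        if name.endswith("_unit"):
            base = name[: -len("_unit")]
            for target in (base, f"{base}_sd", f"{base}_se"):
                if target in columns:
                    units[target] = str(values[0])

    # Layout 2: a single 'unit' column applies to 'mean', 'value', 'sd', 'se'.
    if "unit" in columns:
        for target in ("mean", "value", "sd", "se"):
            if target in columns:
                units[target] = str(columns["unit"][0])

    return units


# Hypothetical example tables, one per layout.
layout_1 = {
    "cpep": [1.0, 2.0],
    "cpep_sd": [0.5, 0.25],
    "cpep_unit": ["nmol/l", "nmol/l"],
}
layout_2 = {
    "mean": [1.0, 2.0],
    "sd": [0.5, 0.25],
    "unit": ["mg", "mg"],
}
units_1 = collect_units(layout_1)
units_2 = collect_units(layout_2)
```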

unit_conversion(key, factor)[source]

Convert the units of the given key in the dataset via key * factor.

Changes values in place in the DataSet.

The quantity in the dataset is multiplied with the conversion factor. In addition to the key, also the respective error measures are converted with the same factor, i.e.

  • {key}

  • {key}_sd

  • {key}_se

  • {key}_min

  • {key}_max

FIXME: in addition base keys should be updated in the table, i.e. if key in [mean, median, min, max, sd, se, cv] then the other keys should be updated; use default set of keys for automatic conversion

Parameters:
  • key – column key in dataset (this column is unit converted)

  • factor (sbmlsim.units.Quantity) – multiplicative Quantity factor for conversion

Returns:

None

Return type:

None
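
The in-place conversion of a key together with its error measures can be sketched as follows. This is an illustration only, assuming a plain dict of columns and a float factor; the helper name convert_columns is hypothetical, and the real method operates on a DataSet with a Quantity factor and also handles the unit information, which this sketch omits.

```python
from typing import Dict, List

# Suffixes of the columns converted together with the key, as listed above.
SUFFIXES = ("", "_sd", "_se", "_min", "_max")


def convert_columns(table: Dict[str, List[float]], key: str, factor: float) -> None:
    """Multiply key and its error columns by factor, modifying table in place."""
    for suffix in SUFFIXES:
        column = f"{key}{suffix}"
        if column in table:  # error columns are optional
            table[column] = [value * factor for value in table[column]]


# Hypothetical data: convert from g to mg with a factor of 1000.
table = {
    "time": [0.0, 1.0],
    "mean": [1.0, 2.0],
    "mean_sd": [0.5, 0.25],
}
convert_columns(table, "mean", 1000.0)
```

Columns that do not belong to the key, such as time in this example, are left untouched.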

sbmlsim.data.load_pkdb_dataframe(sid, data_path, sep='\t', comment='#', **kwargs)[source]

Load TSV data from PKDB figure or table id.

This is a simple helper function for directly loading the TSV data. It is recommended to use pkdb_analysis methods instead.

This function will be removed.

E.g. for ‘Amchin1999_Tab1’ the file

data_path / ‘Amchin1999’ / ‘.Amchin1999.tsv’

is loaded.

Parameters:
  • sid – figure or table id

  • data_path (Union[pathlib.Path, List[pathlib.Path]]) – base path of data or iterable of data_paths

  • sep – separator

  • comment – comment characters

  • kwargs – additional kwargs for csv parsing

Returns:

pandas DataFrame

Return type:

pandas.DataFrame
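
The effect of the sep and comment parameters can be illustrated with the standard library. This is a rough sketch, not the function itself: read_commented_tsv is a hypothetical helper that parses text directly instead of resolving a file from sid and data_path, it returns a list of row dicts rather than a DataFrame, and it only drops whole lines that start with the comment character.

```python
import csv
from typing import Dict, List


def read_commented_tsv(text: str, sep: str = "\t", comment: str = "#") -> List[Dict[str, str]]:
    """Parse separated text into row dicts, skipping blank and comment lines."""
    lines = [
        line
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith(comment)
    ]
    # The first remaining line is used as the header row.
    return list(csv.DictReader(lines, delimiter=sep))


# Hypothetical file content.
example_text = "# exported table\ntime\tmean\n0\t1.5\n1\t2.5\n"
rows = read_commented_tsv(example_text)
```

All values are returned as strings here; numeric conversion is left out to keep the sketch short.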

sbmlsim.data.load_pkdb_dataframes_by_substance(sid, data_path, **kwargs)[source]

Load dataframes from given PKDB figure/table id split on substance.

The DataFrame is split on the ‘substance’ key.

This is a simple helper function for directly loading the TSV data. It is recommended to use pkdb_analysis methods instead.

This function will be removed.

Parameters:
  • sid

  • data_path

  • kwargs

Returns:

Dict[substance, pd.DataFrame]

Return type:

Dict[str, pandas.DataFrame]
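
Splitting on the ‘substance’ key can be sketched with plain row dicts. This is a minimal illustration under assumptions: split_by_substance is a hypothetical helper, rows are dicts instead of DataFrame rows, and the example substances are made up; the documented function returns one pandas.DataFrame per substance.

```python
from collections import defaultdict
from typing import Dict, List


def split_by_substance(rows: List[Dict[str, str]]) -> Dict[str, List[Dict[str, str]]]:
    """Group rows by the value of their 'substance' entry."""
    groups: Dict[str, List[Dict[str, str]]] = defaultdict(list)
    for row in rows:
        groups[row["substance"]].append(row)
    return dict(groups)


# Hypothetical rows as they might be read from a TSV file.
example_rows = [
    {"substance": "glucose", "time": "0", "mean": "5.0"},
    {"substance": "insulin", "time": "0", "mean": "40.0"},
    {"substance": "glucose", "time": "1", "mean": "7.5"},
]
by_substance = split_by_substance(example_rows)
```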