sbmlsim.data

Module handling data (experiment and simulation).

Module Contents

Classes

Data

Data.

DataFunction

Functional data calculation.

DataSeries

DataSeries - a pd.Series with additional unit information.

DataSet

DataSet.

Functions

load_pkdb_dataframe(sid, data_path, sep='\t', comment='#', **kwargs)

Load TSV data from PKDB figure or table id.

load_pkdb_dataframes_by_substance(sid, data_path, **kwargs)

Load dataframes from given PKDB figure/table id split on substance.

Attributes

logger

sbmlsim.data.logger
class sbmlsim.data.Data(experiment, index, task=None, dataset=None, function=None, variables=None)

Bases: object

Data.

Main data generator class which uses data either from experimental data, simulations or via function calculations.

Handles all transformations of data and represents a tree of data operations.

Parameters
  • index (str) –

  • task (str) –

  • dataset (str) –

class Types

Bases: enum.Enum

Data types.

TASK = 1
DATASET = 2
FUNCTION = 3
data
_register_data(self)

Register data in simulation.

__str__(self)

Get string.

Return type

str

property sid(self)

Get id.

Return type

str

is_task(self)

Check if task.

Return type

bool

is_dataset(self)

Check if dataset.

Return type

bool

is_function(self)

Check if function.

property dtype(self)

Get data type.

Return type

Data
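The three Types members line up with the three optional sources accepted by the constructor (task, dataset, function). A minimal sketch of how such a type could be derived, assuming that exactly one source is set; `infer_type` and the local `Types` copy are hypothetical illustrations and not part of sbmlsim:

```python
from enum import Enum
from typing import Optional


class Types(Enum):
    """Local copy of the documented Data.Types members."""

    TASK = 1
    DATASET = 2
    FUNCTION = 3


def infer_type(
    task: Optional[str] = None,
    dataset: Optional[str] = None,
    function: Optional[str] = None,
) -> Types:
    """Hypothetical helper: pick the type from whichever source is given."""
    if task:
        return Types.TASK
    if dataset:
        return Types.DATASET
    if function:
        return Types.FUNCTION
    raise ValueError("one of task, dataset or function is required")


# Checks analogous to is_task() / is_dataset() / is_function().
kind = infer_type(dataset="Fig1")
is_dataset = kind is Types.DATASET
```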

to_dict(self)

Convert to dictionary.

get_data(self, to_units=None)

Return actual data from the data object.

Parameters

to_units (str) – units to convert to

Returns

class sbmlsim.data.DataFunction(index, formula, variables)

Bases: object

Functional data calculation.

The idea is to provide an object which can calculate a generic math function based on given input symbols.

An important challenge is handling the correct functional evaluation.
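To illustrate what evaluating a formula against input symbols can look like, here is a minimal sketch that walks the Python AST and only allows arithmetic on named variables. It is an assumption about the general technique, not the evaluation strategy sbmlsim actually uses; `evaluate_formula` is a hypothetical name.

```python
import ast
import operator

# Operators the sketch is willing to evaluate.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def evaluate_formula(formula, variables):
    """Hypothetical helper: evaluate an arithmetic formula on given symbols."""

    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            # Symbols are looked up in the supplied mapping only.
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](visit(node.operand))
        raise ValueError(f"unsupported expression: {ast.dump(node)}")

    return visit(ast.parse(formula, mode="eval"))


result = evaluate_formula("(a + b) / 2", {"a": 1.0, "b": 3.0})  # 2.0
```

Restricting the evaluator to a whitelist of node types avoids running arbitrary code, which a plain `eval` on a user-supplied formula would allow.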

class sbmlsim.data.DataSeries(data=None, index=None, dtype=None, name=None, copy=False, fastpath=False)

Bases: pandas.Series

DataSeries - a pd.Series with additional unit information.

Parameters
  • dtype (Dtype | None) –

  • copy (bool) –

  • fastpath (bool) –

_metadata = ['uinfo']
property _constructor(self)

Used when a manipulation result has the same dimensions as the original.

property _constructor_expanddim(self)

Used when a manipulation result has one higher dimension than the original, such as Series.to_frame().
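The `_metadata` attribute and the `_constructor*` properties follow the standard pandas subclassing pattern. A minimal sketch of that pattern, using the hypothetical classes `UnitSeries` and `UnitFrame` rather than the sbmlsim classes themselves:

```python
import pandas as pd


class UnitSeries(pd.Series):
    """Hypothetical Series subclass carrying a `uinfo` attribute."""

    _metadata = ["uinfo"]  # extra attributes registered with pandas

    @property
    def _constructor(self):
        # Same-dimension results (e.g. slicing) stay UnitSeries.
        return UnitSeries

    @property
    def _constructor_expanddim(self):
        # One-dimension-higher results (e.g. to_frame()) become UnitFrame.
        return UnitFrame


class UnitFrame(pd.DataFrame):
    """Hypothetical DataFrame counterpart."""

    _metadata = ["uinfo"]

    @property
    def _constructor(self):
        return UnitFrame

    @property
    def _constructor_sliced(self):
        # Selecting a single column yields a UnitSeries.
        return UnitSeries


s = UnitSeries([1.0, 2.0, 3.0], name="glucose")
s.uinfo = {"glucose": "mM"}  # unit name chosen for illustration only

head = s.head(2)      # still a UnitSeries
frame = s.to_frame()  # promoted to a UnitFrame
column = frame["glucose"]  # sliced back to a UnitSeries
```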

class sbmlsim.data.DataSet(data=None, index=None, columns=None, dtype=None, copy=None)

Bases: pandas.DataFrame

DataSet.

pd.DataFrame with additional unit information in the form of UnitInformations.

Parameters
  • index (Axes | None) –

  • columns (Axes | None) –

  • dtype (Dtype | None) –

  • copy (bool | None) –

_metadata = ['uinfo', 'Q_']
property _constructor(self)

Used when a manipulation result has the same dimensions as the original.

property _constructor_sliced(self)
get_quantity(self, key)

Return quantity for given key.

Requires using the numpy data instead of the series.

Parameters

key (str) –

__repr__(self)

Return DataFrame with all columns.

Return type

str

classmethod from_df(cls, df, ureg, udict=None)

Create DataSet from given pandas.DataFrame.

The DataFrame can have various formats which should be handled. Standard formats are:

  1. units annotations based on ‘*_unit’ columns, with additional ‘*_sd’ or ‘*_se’ units

  2. units annotations based on ‘unit’ column which is applied on ‘mean’, ‘value’, ‘sd’ and ‘se’ columns
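The two layouts can be sketched as follows. This is an illustration under assumptions, not the logic of from_df: `collect_units` is a hypothetical helper, it assumes every unit column holds exactly one distinct unit, and it assumes that in the first layout the ‘*_sd’ and ‘*_se’ columns share the unit of their base column.

```python
import pandas as pd


def collect_units(df: pd.DataFrame) -> dict:
    """Hypothetical helper: map data columns to units for both layouts."""
    units = {}

    # Layout 1: every '<key>_unit' column annotates the '<key>' column.
    for column in df.columns:
        if column.endswith("_unit"):
            values = df[column].dropna().unique()
            if len(values) != 1:
                raise ValueError(f"column '{column}' must hold exactly one unit")
            key = column[: -len("_unit")]
            units[key] = str(values[0])
            # Assumption: error columns share the unit of the base column.
            for suffix in ("_sd", "_se"):
                if f"{key}{suffix}" in df.columns:
                    units[f"{key}{suffix}"] = str(values[0])

    # Layout 2: a single 'unit' column applies to mean/value/sd/se.
    if "unit" in df.columns:
        values = df["unit"].dropna().unique()
        if len(values) != 1:
            raise ValueError("column 'unit' must hold exactly one unit")
        for key in ("mean", "value", "sd", "se"):
            if key in df.columns:
                units[key] = str(values[0])

    return units


# Example data invented for illustration.
layout1 = pd.DataFrame(
    {
        "time": [0, 1],
        "time_unit": ["hr", "hr"],
        "cpep": [1.0, 2.0],
        "cpep_sd": [0.5, 0.25],
        "cpep_unit": ["nM", "nM"],
    }
)
layout2 = pd.DataFrame({"mean": [1.0], "sd": [0.5], "unit": ["mg/l"]})

units1 = collect_units(layout1)
units2 = collect_units(layout2)
```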

Parameters
  • df (pandas.DataFrame) – pandas.DataFrame

  • uinfo – optional units information

  • ureg (sbmlsim.units.UnitRegistry) –

  • udict (Dict[str, str]) –

Returns

dataset

Return type

DataSet

unit_conversion(self, key, factor)

Convert the units of the given key in the dataset.

Changes values in place in the DataSet.

The quantity in the dataset is multiplied with the conversion factor. In addition to the key, also the respective error measures are converted with the same factor, i.e.
  • {key}

  • {key}_sd

  • {key}_se

  • {key}_min

  • {key}_max
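The in-place scaling of a key and its error columns can be sketched on a plain pandas.DataFrame. The sketch below assumes a plain float factor instead of a Quantity and does not update any unit bookkeeping; `convert_columns` is a hypothetical helper, not the method itself.

```python
import pandas as pd

# Suffixes of the columns converted together with the key itself.
SUFFIXES = ("", "_sd", "_se", "_min", "_max")


def convert_columns(df: pd.DataFrame, key: str, factor: float) -> None:
    """Hypothetical helper: scale `key` and its error columns in place."""
    for suffix in SUFFIXES:
        column = f"{key}{suffix}"
        if column in df.columns:
            df[column] = df[column] * factor


# Example data invented for illustration.
df = pd.DataFrame(
    {"time": [0, 1], "mean": [1.0, 2.0], "mean_sd": [0.5, 0.25]}
)
convert_columns(df, "mean", 1000.0)  # e.g. a factor of 1000 between two units
```

Columns that do not belong to the key (here time) are left untouched.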

FIXME: in addition base keys should be updated in the table, i.e. if key in [mean, median, min, max, sd, se, cv] then the other keys should be updated; use default set of keys for automatic conversion

Parameters
  • key – column key in dataset (this column is unit converted)

  • factor (sbmlsim.units.Quantity) – multiplicative Quantity factor for conversion

Returns

None

Return type

None

sbmlsim.data.load_pkdb_dataframe(sid, data_path, sep='\t', comment='#', **kwargs)

Load TSV data from PKDB figure or table id.

This is a simple helper function for directly loading the TSV data. It is recommended to use pkdb_analysis methods instead.

This function will be removed.

E.g. for ‘Amchin1999_Tab1’ the file

data_path / ‘Amchin1999’ / ‘.Amchin1999.tsv’

is loaded.
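The defaults `sep='\t'` and `comment='#'` correspond to the usual options for reading a commented TSV file with pandas. A minimal sketch using in-memory text, so that no file layout has to be assumed; the data is invented for illustration:

```python
import io

import pandas as pd

# Small TSV snippet with a leading comment line.
tsv_text = (
    "# exported example data\n"
    "time\tmean\tunit\n"
    "0\t1.5\tmg/l\n"
    "1\t2.5\tmg/l\n"
)

# Lines starting with '#' are skipped, columns are split on tabs.
df = pd.read_csv(io.StringIO(tsv_text), sep="\t", comment="#")
```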

Parameters
  • sid – figure or table id

  • data_path (Union[pathlib.Path, List[pathlib.Path]]) – base path of data or iterable of data_paths

  • sep – separator

  • comment – comment characters

  • kwargs – additional kwargs for csv parsing

Returns

pandas DataFrame

Return type

pandas.DataFrame

sbmlsim.data.load_pkdb_dataframes_by_substance(sid, data_path, **kwargs)

Load dataframes from given PKDB figure/table id split on substance.

The DataFrame is split on the ‘substance’ key.
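Splitting on the ‘substance’ key can be sketched with a pandas groupby. This is one possible way to build such a dictionary under the assumption that ‘substance’ is a regular column; the data is invented for illustration:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "substance": ["caffeine", "caffeine", "paraxanthine"],
        "time": [0, 1, 0],
        "mean": [1.0, 0.5, 0.1],
    }
)

# One DataFrame per distinct substance, keyed by the substance name.
frames = {
    substance: group.reset_index(drop=True)
    for substance, group in df.groupby("substance")
}
```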

This is a simple helper function for directly loading the TSV data. It is recommended to use pkdb_analysis methods instead.

This function will be removed.

Parameters
  • sid

  • data_path

  • kwargs

Returns

Dict[substance, pd.DataFrame]

Return type

Dict[str, pandas.DataFrame]