sbmlsim.comparison.diff

Helpers for numerical comparison of data.

Used in the context of comparing models and simulations between different simulators. Allows semi-automatic testing for problems with the various models, and is used to benchmark simulation results.

Module Contents

Classes

DataSetsComparison

Comparing multiple simulation results.

Functions

get_files_by_extension(base_path[, extension])

Get all files with the given extension.

Attributes

logger

sbmlsim.comparison.diff.logger[source]
sbmlsim.comparison.diff.get_files_by_extension(base_path, extension='.json')[source]

Get all files with the given extension.

Simulation definitions are JSON files.

Parameters:
  • base_path (pathlib.Path) –

  • extension (str) –

Return type:

Dict[str, str]

class sbmlsim.comparison.diff.DataSetsComparison(dfs_dict, columns_filter=None, time_column=True, title=None, selections=None, factors=None)[source]

Comparing multiple simulation results.

Only the subset of identical columns is compared. First, column names are matched across the datasets to find the subset of columns which can be compared.

The simulations must contain a “time” column with identical time points.

Parameters:
  • dfs_dict (Dict[str, pandas.DataFrame]) –

  • time_column (bool) –

  • title (str) –

  • selections (Dict[str, str]) –

  • factors (Dict[str, float]) –

tol_abs = 0.0001[source]
tol_rel = 0.0001[source]
eps_plot[source]
classmethod _process_columns(dataframes)[source]

Get the intersection and union of columns.

Parameters:

dataframes

Returns:
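The column matching described above can be sketched as a set intersection and union over the column names of the dataframes. This is an illustrative sketch, not the actual sbmlsim implementation, and the standalone function name is hypothetical.

```python
from typing import Iterable, Set, Tuple

import pandas as pd


def process_columns(dataframes: Iterable[pd.DataFrame]) -> Tuple[Set[str], Set[str]]:
    """Return (intersection, union) of column names across all dataframes.

    The intersection defines the comparable subset; the union shows
    which columns exist only in some datasets.
    """
    column_sets = [set(df.columns) for df in dataframes]
    intersection = set.intersection(*column_sets)
    union = set.union(*column_sets)
    return intersection, union
```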

classmethod _filter_dfs(dataframes, columns)[source]

Filter the dataframes using the column ids occurring in all datasets.

The common set of columns is used for comparison.

Parameters:
  • dataframes

  • columns

Returns:

List[pd.DataFrame], List[str] – the filtered dataframes and the corresponding simulator labels.
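A sketch of the filtering step, assuming the dataframes are passed as a label-to-dataframe mapping as in the class constructor; the function name and the column ordering are illustrative assumptions.

```python
from typing import Dict, Iterable, List, Tuple

import pandas as pd


def filter_dfs(
    dataframes: Dict[str, pd.DataFrame], columns: Iterable[str]
) -> Tuple[List[pd.DataFrame], List[str]]:
    """Subset every dataframe to the common columns.

    Sketch only: columns are sorted so that all returned dataframes
    share an identical column order. Labels follow dict insertion order
    (guaranteed in Python 3.7+).
    """
    ordered = sorted(columns)
    dfs = [df[ordered] for df in dataframes.values()]
    labels = list(dataframes.keys())
    return dfs, labels
```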

df_diff()[source]

Dataframe of all differences between the files.

Tolerances follow the SBML test suite conventions (https://github.com/sbmlteam/sbml-test-suite/blob/master/cases/semantic/README.md). Let the following variables be defined:

  • abs_tol stand for the absolute tolerance for a test case,

  • rel_tol stand for the relative tolerance for a test case,

  • c_ij stand for the expected correct value for row i, column j, of the result data set for the test case

  • u_ij stand for the corresponding value produced by a given software simulation system run by the user

These absolute and relative tolerances are used in the following way: a data point u_ij is considered to be within tolerances if and only if the following expression is true:

|c_ij - u_ij| <= (abs_tol + rel_tol * |c_ij|)
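The tolerance criterion above can be written as a single predicate; the default tolerances match the class attributes tol_abs and tol_rel (0.0001). The function name is illustrative.

```python
def within_tolerance(
    c_ij: float, u_ij: float, abs_tol: float = 1e-4, rel_tol: float = 1e-4
) -> bool:
    """SBML test suite tolerance criterion.

    A point u_ij is within tolerances iff
        |c_ij - u_ij| <= abs_tol + rel_tol * |c_ij|
    """
    return abs(c_ij - u_ij) <= abs_tol + rel_tol * abs(c_ij)
```

Note that the relative part scales with the expected value c_ij, so for c_ij = 0 only the absolute tolerance applies.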

is_equal()[source]

Check if DataFrames are identical within numerical tolerance.

__str__()[source]

Get string representation.

Return type:

str

__repr__()[source]

Get representation.

report_str()[source]

Get report as string.

Return type:

str

report()[source]

Report.

plot_diff()[source]

Plot lines for entries which are above the epsilon threshold.