mpa.PredictiveInfo

Overview

The PredictiveInfo class assesses the quality of a model inferred from a massively parallel dataset by estimating the mutual information between the model's predictions and the data.

Usage

>>> import mpathic
>>> loader = mpathic.io
>>> dataset_df = loader.load_dataset(mpathic.__path__[0] + '/data/sortseq/full-0/library.txt')
>>> mp_df = loader.load_model(mpathic.__path__[0] + '/examples/true_model.txt')
>>> ss = mpathic.SimulateSort(df=dataset_df, mp=mp_df)
>>> temp_ss = ss.output_df
>>> # keep only the count (bin) columns and the sequence column
>>> cols = ['ct', 'ct_0', 'ct_1', 'ct_2', 'ct_3', 'seq']
>>> temp_ss = temp_ss[cols]
>>> pi = mpathic.PredictiveInfo(data_df=temp_ss, model_df=mp_df, start=0)
>>> print(pi.out_MI)

Class Details

class predictive_info.PredictiveInfo(**kwargs)
Parameters:
data_df: (pandas dataframe)

Dataframe containing several columns representing bins and a sequence column. The integer values in the bin columns give the number of occurrences of the sequence in that bin.

model_df: (pandas dataframe)

The dataframe containing a model of the binding energy and a wild type sequence.

start: (int)

Starting position of the sequence.

end: (int)

End position of the sequence.

err: (bool)

If True, include the error in the mutual information estimate.

coarse_graining_level: (int)

Speeds up computation by coarse-graining the model predictions.
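
To make the idea of a mutual information estimate over coarse-grained model predictions more concrete, here is a minimal sketch in plain Python. It is not the mpathic implementation: the helper names coarse_grain and mutual_information_bits, the rank-based grouping, and the toy counts are all assumptions made for illustration. It groups predictions into a fixed number of levels and computes a simple plug-in estimate of the mutual information between bin and prediction level.

```python
import math
from collections import defaultdict


def coarse_grain(predictions, n_levels):
    """Hypothetical helper: map each prediction to a rank-based level
    in 0..n_levels-1 (assumed scheme, not taken from mpathic)."""
    order = sorted(range(len(predictions)), key=lambda i: predictions[i])
    levels = [0] * len(predictions)
    for rank, i in enumerate(order):
        levels[i] = rank * n_levels // len(predictions)
    return levels


def mutual_information_bits(bin_counts, levels):
    """Hypothetical helper: plug-in mutual information (in bits) between
    bin index and coarse-grained prediction level.

    bin_counts: one list of per-bin counts for each sequence
    levels:     one coarse-grained prediction level for each sequence
    """
    # Accumulate the joint count table over (bin, level).
    joint = defaultdict(float)
    for counts, level in zip(bin_counts, levels):
        for b, c in enumerate(counts):
            joint[(b, level)] += c
    total = sum(joint.values())

    # Marginal probabilities of bins and of prediction levels.
    p_bin = defaultdict(float)
    p_lvl = defaultdict(float)
    for (b, l), c in joint.items():
        p_bin[b] += c / total
        p_lvl[l] += c / total

    # Sum p(b, l) * log2(p(b, l) / (p(b) * p(l))) over nonzero cells.
    mi = 0.0
    for (b, l), c in joint.items():
        if c > 0:
            p = c / total
            mi += p * math.log2(p / (p_bin[b] * p_lvl[l]))
    return mi


# Toy data (made up): four sequences sorted into two bins.
bin_counts = [[10, 0], [10, 0], [0, 10], [0, 10]]

# Predictions that separate the two bins perfectly.
good = coarse_grain([0.1, 0.2, 0.8, 0.9], n_levels=2)
# Predictions that carry no information about the bin.
bad = coarse_grain([0.1, 0.8, 0.2, 0.9], n_levels=2)

mi_good = mutual_information_bits(bin_counts, good)
mi_bad = mutual_information_bits(bin_counts, bad)
print(mi_good, mi_bad)
```

In this toy case the informative predictions give 1 bit and the uninformative ones give 0 bits, which is the sense in which a higher value indicates a better model. Using fewer levels makes the joint table smaller and cheaper to build, at the cost of resolution.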