mpa.SimulateSort
Overview
SimulateSort is a program within the mpathic package that simulates performing a Sort-Seq experiment.
Usage
>>> import mpathic
>>> loader = mpathic.io
>>> mp_df = loader.load_model('./mpathic/examples/true_model.txt')
>>> dataset_df = loader.load_dataset('./mpathic/data/sortseq/full-0/library.txt')
>>> mpathic.SimulateSort(df=dataset_df, mp=mp_df)
Example Input and Output
The input table to this function must contain sequence (seq), count (ct), and energy (val) columns.
Example Input Table:
seq    ct  val
AGGTA  5   -.4
AGTTA  1   -.2
...
Example Output Table:
seq    ct  val  ct_1  ct_2  ct_3  ...
AGGTA  5   -.4  1     2     1
AGTTA  1   -.2  0     1     0
...
The output table contains all the original columns, along with the sorted count columns (ct_1, ct_2, …).
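To make the shape of the output table concrete, here is a minimal toy sketch that produces a table with the same layout using only NumPy and pandas. It is not mpathic's implementation: the function name `sketch_sort`, the Gaussian noise model, the quantile-based bin edges, and the rule that every counted cell lands in exactly one bin are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def sketch_sort(df, nbins=3, noise_width=0.2, seed=0):
    """Toy illustration only; NOT mpathic's implementation (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # One simulated cell per count: repeat each row index ct times.
    cell_rows = np.repeat(np.arange(len(df)), df["ct"].to_numpy())
    # Assumed measurement model: the row's val plus Gaussian noise.
    noise = rng.normal(0.0, noise_width, size=cell_rows.size)
    signal = df["val"].to_numpy()[cell_rows] + noise
    # Assumed gating: bin edges at evenly spaced quantiles of the signal.
    edges = np.quantile(signal, np.linspace(0, 1, nbins + 1)[1:-1])
    bins = np.searchsorted(edges, signal)  # bin index 0 .. nbins-1 per cell
    out = df.copy()
    for b in range(nbins):
        # Count how many cells from each input row fell into bin b.
        out[f"ct_{b + 1}"] = np.bincount(cell_rows[bins == b], minlength=len(df))
    return out


example_df = pd.DataFrame(
    {"seq": ["AGGTA", "AGTTA"], "ct": [5, 1], "val": [-0.4, -0.2]}
)
sorted_df = sketch_sort(example_df, nbins=3)
print(sorted_df)
```

The result keeps the seq, ct, and val columns and appends ct_1 through ct_3. The actual per-bin counts depend on the random seed and on the assumed noise model, so they will not match the example table above.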
Class Details
class simulate_sort.SimulateSort(**kwargs)
Simulate cell sorting based on expression.
Parameters:
- df: (pandas dataframe)
  Input data frame.
- mp: (pandas dataframe)
  Model data frame.
- noisetype: (string, None)
  Noise parameter string indicating what type of noise to include. Valid choices include None, ‘Normal’, ‘LogNormal’, ‘Plasmid’.
- npar: (list)
  Parameters to go with noisetype. E.g., for noisetype ‘Normal’, npar must contain the width of the normal distribution.
- nbins: (int)
  Number of bins that the different variants will get sorted into.
- sequence_library: (bool)
  A value of True corresponds to simulating sequencing the library in bin zero.
- start: (int)
  Position at which the analyzed region starts.
- end: (int)
  Position at which the analyzed region ends.
- chunksize: (int)
  Size of the chunks in which the data frame df is traversed.
Attributes:
- output_df: (pandas dataframe)
  Contains the output of the SimulateSort constructor.
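The chunksize parameter describes traversing df in pieces rather than all at once. A minimal sketch of what chunked traversal of a pandas data frame can look like is shown below. The helper `iter_chunks` and the toy data are hypothetical and are not part of mpathic; how SimulateSort processes each chunk internally is not documented here.

```python
import pandas as pd


def iter_chunks(df, chunksize):
    """Yield consecutive row slices of df, each at most chunksize rows long."""
    for start in range(0, len(df), chunksize):
        yield df.iloc[start:start + chunksize]


# Hypothetical toy table with three sequences.
toy_df = pd.DataFrame({"seq": ["AGGTA", "AGTTA", "AGGTT"], "ct": [5, 1, 2]})

# With chunksize=2 the three rows are visited as one chunk of 2 and one of 1.
chunk_lengths = [len(chunk) for chunk in iter_chunks(toy_df, chunksize=2)]

# Per-chunk work (here, summing counts) covers every row exactly once.
total_ct = sum(int(chunk["ct"].sum()) for chunk in iter_chunks(toy_df, chunksize=2))
```

Processing in chunks keeps peak memory bounded by the chunk size, which matters when the library table has many rows.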