mpa.ProfileFreq

Overview

ProfileFreq is a class in the mpathic package that calculates the fractional occurrence of each base or amino acid at each position.

Usage

>>> import mpathic as mpa
>>> mpa.ProfileFreq(dataset_df = dataset_df)

Example Input and Output

Input tables must contain a position column (labeled 'pos') and a count column for each base or amino acid (labeled ct_A, ct_C…).

Example Input Table:

pos ct_A ct_C ct_G ct_T
0   10   20   40   30
...

Example Output Table:

pos freq_A freq_C freq_G freq_T
0   0.1    0.2    0.4    0.3
...
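
The relationship between the two tables above can be sketched in plain pandas. This is an illustration of the arithmetic only, not mpathic's own implementation: the helper name counts_to_freqs is hypothetical, and the sketch assumes that each frequency is the count divided by the sum of all ct_ columns in the same row.

```python
import pandas as pd


def counts_to_freqs(counts_df):
    """Hypothetical helper: turn ct_* count columns into freq_* fractions."""
    # Select the per-character count columns (ct_A, ct_C, ...).
    ct_cols = [c for c in counts_df.columns if c.startswith("ct_")]
    # Total count observed at each position (assumed normalization).
    totals = counts_df[ct_cols].sum(axis=1)
    # Divide every count by its row total to get a value in [0.0, 1.0].
    freqs = counts_df[ct_cols].div(totals, axis=0)
    # Rename ct_X -> freq_X and put the position column first.
    freqs.columns = ["freq_" + c[len("ct_"):] for c in ct_cols]
    freqs.insert(0, "pos", counts_df["pos"])
    return freqs


# The single row shown in the example input table.
counts_df = pd.DataFrame(
    {"pos": [0], "ct_A": [10], "ct_C": [20], "ct_G": [40], "ct_T": [30]}
)
freq_df = counts_to_freqs(counts_df)
print(freq_df)
```

For the example row, the counts sum to 100, so the four frequencies are 0.1, 0.2, 0.4 and 0.3, matching the example output table.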

Class Details

class profile_freq.ProfileFreq(**kwargs)

ProfileFreq computes the frequency (0.0 to 1.0) of each character at each position.

Parameters:
dataset_df: (pandas dataframe)

A dataframe containing a valid dataset.

bin: (int)

A bin number specifying which counts to use.

start: (int)

An integer specifying the sequence start position.

end: (int)

An integer specifying the sequence end position.

Returns:
freq_df: (pd.DataFrame)

A dataframe containing the frequency of each nucleotide/amino acid character at each position.