This class consists entirely of static methods. When a column of a
dataframe contains floats, we want to group those floats into bins and
assign an int to each bin. We refer to this process as binning. This
class is mainly a wrapper around pandas methods that do most of the
heavy lifting.
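As a rough illustration of what binning means here, the sketch below uses the pandas functions cut() and qcut() directly on a small Series. The sample values and variable names are made up for the example and are not part of DataBinner.

```python
import pandas as pd

# Hypothetical sample data: four small floats and two large ones.
values = pd.Series([0.1, 0.2, 0.3, 0.4, 10.0, 20.0])

# Equal-length bins: the range [0.1, 20.0] is split into 2 intervals of
# equal width, so most points fall into the first bin.
equal_width = pd.cut(values, bins=2, labels=False)

# Quantile bins: the split point is the median, so each bin receives
# the same number of points.
quantile = pd.qcut(values, q=2, labels=False)

print(equal_width.tolist())  # [0, 0, 0, 0, 0, 1]
print(quantile.tolist())     # [0, 0, 0, 1, 1, 1]
```

Passing labels=False makes pandas return the integer bin index for each value instead of an interval object.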
def learning.DataBinner.DataBinner.bin_col(df, col_name, num_bins, do_qtls=True)   [static]
Bins INPLACE the column called col_name in the dataframe df. By
inplace we mean that df is changed: its column col_name is replaced
by a binned version. The function returns bin_edges, a list of the
edges of the bins, and bin_to_mean, a dictionary mapping bin number
to the mean value of the points inside the bin. This is mainly a
wrapper for the Pandas functions cut() and qcut().
Parameters
----------
df : pandas.DataFrame
col_name : str
name of the column that you wish to bin
num_bins : int
number of bins
do_qtls : bool
do quantiles. If True, bins into quantiles (each bin holds roughly
the same number of points). If False, uses equal-length bins.
Returns
-------
(list(float), dict(int, float))
bin_edges and bin_to_mean, as described above
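The behavior described above could be sketched roughly as follows. This is not the DataBinner source: the function name bin_col_sketch and the sample dataframe are hypothetical, and the sketch only assumes that the column is replaced by integer bin indices from qcut() or cut() and that bin_to_mean maps each bin index to the mean of the original values in that bin.

```python
import pandas as pd


def bin_col_sketch(df, col_name, num_bins, do_qtls=True):
    """Hypothetical stand-in for bin_col; an illustration only."""
    original = df[col_name].copy()
    if do_qtls:
        # Quantile bins: roughly equal number of points per bin.
        binned, bin_edges = pd.qcut(
            original, num_bins, labels=False, retbins=True)
    else:
        # Equal-length bins over the range of the column.
        binned, bin_edges = pd.cut(
            original, num_bins, labels=False, retbins=True)
    # Mean of the original floats that landed in each bin.
    bin_to_mean = original.groupby(binned).mean().to_dict()
    # Replace the column in the caller's dataframe (the "inplace" part).
    df[col_name] = binned
    return list(bin_edges), bin_to_mean


df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]})
bin_edges, bin_to_mean = bin_col_sketch(df, "x", num_bins=3)

print(df["x"].tolist())  # [0, 0, 1, 1, 2, 2]
print(bin_to_mean)       # {0: 1.5, 1: 3.5, 2: 5.5}
```

With num_bins bins there are num_bins + 1 edges, so bin_edges has four entries in this example.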