Maths
Stats Fitting
Module implementing methods to find the best fit of statistical distributions to data.
- pyhdtoolkit.maths.stats_fitting.best_fit_distribution(data: pd.Series | np.ndarray, bins: int = 200, ax: Axes = None) -> tuple[st.rv_continuous, tuple[float, ...]]
- Added in version 0.5.0.
- Model data by finding the best fit candidate distribution among those in DISTRIBUTIONS. One can find an example use of this function in the gallery.
- Parameters:
- data (Union[pd.Series, np.ndarray]) -- A pandas.Series or numpy.ndarray with your distribution data.
- bins (int) -- The number of bins to decompose your data in before fitting.
- ax (matplotlib.axes.Axes, optional) -- The matplotlib.axes.Axes on which to plot the probability density function of the different fitted distributions. This should be provided as the axis on which the distribution data is plotted, as it will add to that plot. If not provided, no plotting will be done.
 
- Returns:
- tuple[st.rv_continuous, tuple[float, ...]] -- A tuple containing the scipy.stats generator corresponding to the best fit to the data among the provided candidates, and the parameters for said generator to best fit the data.
- Example:

      best_fit_func, best_fit_params = best_fit_distribution(data, 200, axis)
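The selection described above can be sketched outside the library. This is a minimal illustration, not the pyhdtoolkit implementation: it assumes the "best fit" is the candidate whose fitted PDF has the smallest sum of squared errors against the normalised histogram of the data, which the text above does not confirm. The name sketch_best_fit and the candidate list are hypothetical.

```python
import warnings

import numpy as np
import scipy.stats as st


def sketch_best_fit(data, candidates, bins=200):
    """Return (distribution, params) with the lowest SSE against the data histogram."""
    # Normalised histogram, so bar heights are comparable to PDF values
    heights, edges = np.histogram(data, bins=bins, density=True)
    centers = (edges[:-1] + edges[1:]) / 2

    best_dist, best_params, best_sse = None, None, np.inf
    for dist in candidates:
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")  # silence warnings some fits may emit
            params = dist.fit(data)  # maximum-likelihood estimate of shape, loc and scale
        pdf = dist.pdf(centers, *params)
        sse = np.sum((heights - pdf) ** 2)  # assumed selection criterion
        if sse < best_sse:
            best_dist, best_params, best_sse = dist, params, sse
    return best_dist, best_params


# Synthetic normally distributed data, so the normal candidate should win
rng = np.random.default_rng(seed=0)
data = rng.normal(loc=2.0, scale=0.5, size=5000)
dist, params = sketch_best_fit(data, [st.norm, st.expon, st.uniform], bins=50)
```

The returned params tuple follows the scipy.stats convention of shape parameters first, then location and scale, which is why it can be unpacked directly into dist.pdf.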
- pyhdtoolkit.maths.stats_fitting.make_pdf(distribution: rv_continuous, params: tuple[float, ...], size: int = 25000) -> Series
- Added in version 0.5.0.
- Generates a pandas.Series for the distribution's probability density function (PDF). This Series has the axis values as index and the PDF values as its values. One can find an example use of this function in the distribution fitting gallery.
- Parameters:
- distribution (scipy.stats.rv_continuous) -- The scipy.stats generator for the distribution to generate the PDF from.
- params (tuple[float, ...]) -- The parameters for this generator, as returned by the fit to data or as guessed by the user.
- size (int) -- The number of points to evaluate the PDF on.
 
- Returns:
- pandas.Series -- A pandas.Series with the PDF as values, and the corresponding axis values as index.
- Example:

      best_fit_func, best_fit_params = best_fit_distribution(data, 200, axis)
      pdf = make_pdf(best_fit_func, best_fit_params)
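To make the shape of the returned Series concrete, here is a rough sketch of how such a PDF could be built. It is not the library's code: the name sketch_make_pdf is hypothetical, and evaluating between the 1% and 99% quantiles of the distribution is an assumption made here to get a finite axis range.

```python
import numpy as np
import pandas as pd
import scipy.stats as st


def sketch_make_pdf(distribution, params, size=25000):
    """Evaluate the PDF on `size` evenly spaced points and return it as a Series."""
    # Assumption: bound the axis by the 1% and 99% quantiles of the distribution
    start = distribution.ppf(0.01, *params)
    end = distribution.ppf(0.99, *params)
    x = np.linspace(start, end, size)
    y = distribution.pdf(x, *params)
    return pd.Series(y, index=x)  # axis values as index, PDF values as values


# Standard normal, with params given as (loc, scale)
pdf = sketch_make_pdf(st.norm, (0.0, 1.0), size=1001)
```

A Series laid out this way can be drawn on top of a histogram of the data with its plot method, since the index already carries the axis values.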
- pyhdtoolkit.maths.stats_fitting.set_distributions_dict(dist_dict: dict[rv_continuous, str]) -> None
- Added in version 0.5.0.
- Sets DISTRIBUTIONS as the provided dict. This allows the user to define the distributions to try and fit against the data. One can find an example use of this function in the distribution fitting gallery.
- Warning: This function modifies the global DISTRIBUTIONS dict that is used by other functions in this module. It's not the cleanest way to do things that you'll ever see.
- Parameters:
- dist_dict (dict[st.rv_continuous, str]) -- Dictionary of the wanted distributions, in the same format as the DISTRIBUTIONS dict, i.e. with a scipy.stats generator object as key and a string representation of its name as value.
- Example:

      import scipy.stats as st
      tested_dists = {st.chi: "Chi", st.expon: "Exponential", st.laplace: "Laplace"}
      set_distributions_dict(tested_dists)
Utilities
Module with utility functions used throughout the nonconvex_phase_sync
and stats_fitting modules.
- pyhdtoolkit.maths.utils.get_magnitude(value: float) -> int
- Added in version 0.8.2.
- Returns the determined magnitude of the provided value. This corresponds to the power of 10 that would be necessary to reduce value to a \(X \cdot 10^{n}\) form. In this case, n is the result.
- Parameters:
- value (float) -- Value to determine the magnitude of.
- Returns:
- int -- The magnitude of the provided value.
- Examples:

      get_magnitude(10)  # returns 1
      get_magnitude(0.0311)  # returns -2
      get_magnitude(1e-7)  # returns -7
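The definition above amounts to taking the floor of the base-10 logarithm of the absolute value. The following is a minimal sketch of that computation using only the standard library; the name sketch_get_magnitude is hypothetical, and returning 0 for a zero input is an assumption, since the text does not say how that case is handled.

```python
import math


def sketch_get_magnitude(value: float) -> int:
    """Return n such that abs(value) = X * 10**n with 1 <= X < 10."""
    if value == 0:
        return 0  # assumption: the logarithm is undefined at zero, so pick 0
    return math.floor(math.log10(abs(value)))


print(sketch_get_magnitude(10))       # 1
print(sketch_get_magnitude(0.0311))   # -2
print(sketch_get_magnitude(12345.6))  # 4
```

Taking the absolute value first means negative inputs get the same magnitude as their positive counterparts.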
- pyhdtoolkit.maths.utils.get_scaled_values_and_magnitude_string(values_array: pd.DataFrame | np.ndarray, force_magnitude: float | None = None) -> tuple[pd.DataFrame | np.ndarray, str]
- Added in version 0.8.2.
- Conveniently scales the provided values to the best determined magnitude, and returns the scaled values and the magnitude string to use in plot labels.
- Parameters:
- values_array (Union[pd.DataFrame, np.ndarray]) -- Vectorised structure containing the values to scale.
- force_magnitude (float, optional) -- A specific magnitude value to use for the scaling, if desired.
 
- Returns:
- tuple[pandas.DataFrame | numpy.ndarray, str] -- A tuple of the scaled values (same type as the provided ones) and the string to use for the scale in plot labels and legends.
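The idea of scaling values to a common power of ten can be sketched as follows. This is an illustration under stated assumptions, not the library's behaviour: it works on a plain list of floats rather than a DataFrame or ndarray, it takes the magnitude from the largest absolute value, and the label format "10^{n}" is hypothetical. The name sketch_scale_values is also hypothetical.

```python
import math


def sketch_scale_values(values, force_magnitude=None):
    """Scale a list of floats by a power of ten and return (scaled, label)."""
    if force_magnitude is None:
        largest = max(abs(v) for v in values)
        # Assumption: the magnitude is derived from the largest absolute value
        magnitude = math.floor(math.log10(largest)) if largest > 0 else 0
    else:
        magnitude = int(force_magnitude)
    scaled = [v / 10.0**magnitude for v in values]
    label = f"10^{{{magnitude}}}"  # hypothetical label format, e.g. "10^{-3}"
    return scaled, label


scaled, label = sketch_scale_values([0.0012, 0.0034, 0.0005])
forced_scaled, forced_label = sketch_scale_values([1500.0], force_magnitude=3)
```

Forcing the magnitude is useful when several plots must share the same scale regardless of their individual data ranges.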