clophfit.fitting.core
=====================

.. py:module:: clophfit.fitting.core

.. autoapi-nested-parse::

   Clophfit: Fitting of Cl- binding and pH titration curves.

   This module provides tools for analyzing titration data, particularly
   chloride binding and pH titration experiments common in biochemistry,
   such as those involving fluorescent probes.

   Core Functionality:
   -------------------

   1. **Data Modeling**: Implements a 1-site binding model suitable for both
      ligand concentration and pH titrations.

   2. **Spectral Data Processing**:

      - Processes raw spectral data (e.g., from fluorescence spectroscopy).
      - Offers two methods for data reduction:

        - Singular Value Decomposition (SVD) to extract the most significant
          spectral component.
        - Band integration over a specified wavelength range.

   3. **Curve Fitting**: Provides three distinct fitting backends to determine
      the dissociation constant (K) and other parameters:

      - **Least-Squares (LM)**: Uses the `lmfit` library for non-linear
        least-squares minimization. Supports iterative reweighting and
        outlier removal.
      - **Orthogonal Distance Regression (ODR)**: Uses `odrpack` to account
        for uncertainties in both x and y variables, which matters when
        x-values (e.g., pH measurements) have errors.
      - **Bayesian Modeling (PyMC)**: Implements a hierarchical Bayesian model
        using `pymc`. This approach is useful for:

        - Quantifying parameter uncertainties as full posterior distributions.
        - Modeling errors in x-values as latent variables.
        - Sharing information between multiple experiments (hierarchical
          fitting) to obtain more robust parameter estimates.

   4. **Result Visualization**: Includes plotting functions to visualize:

      - Raw and processed spectra.
      - Fitted curves with confidence intervals.
      - Diagnostic plots for SVD and Bayesian analyses (e.g., corner plots).


Functions
---------

.. autoapisummary::

   clophfit.fitting.core.weight_da
   clophfit.fitting.core.weight_multi_ds_titration
   clophfit.fitting.core.analyze_spectra
   clophfit.fitting.core.analyze_spectra_glob
   clophfit.fitting.core.fit_binding_glob


Module Contents
---------------

.. py:function:: weight_da(da, *, is_ph)

   Estimate initial weights for a DataArray by fitting it individually.

   The standard error of the residuals from this initial fit is used as the
   uncertainty (``y_err``) for subsequent weighted fits.

   :param da: The data array to be weighted.
   :type da: DataArray
   :param is_ph: Whether the titration is pH-based.
   :type is_ph: bool

   :returns: True if the weighting fit was successful, False otherwise.
   :rtype: bool


.. py:function:: weight_multi_ds_titration(ds)

   Assign weights to all DataArrays within a Dataset.

   Iterates through each `DataArray` in the `Dataset`, calling `weight_da` to
   estimate `y_err`. For any `DataArray` where weighting fails (e.g., due to
   insufficient data), a fallback error is assigned based on the errors from
   the successfully fitted arrays.

   Optimized version with reduced set operations and memory allocations.


.. py:function:: analyze_spectra(spectra, *, is_ph, band = None)

   Analyze spectra titration, fit the data, and plot the results.

   This function either performs Singular Value Decomposition (SVD) or
   integrates the spectra over a specified band.

   :param spectra: The DataFrame containing spectra (one spectrum for each column).
   :type spectra: pd.DataFrame
   :param is_ph: Whether the x-axis represents pH.
   :type is_ph: bool
   :param band: If provided, use the 'band' integration method. Otherwise, use 'svd'.
   :type band: tuple[int, int] | None

   :returns: An object containing the fit results and the summary plot.
   :rtype: FitResult[Minimizer]

   :raises ValueError: If the band parameters are not in the spectra's index
       when the band method is used.

   .. rubric:: Notes

   For the SVD method, creates plots of the spectra, principal component
   vectors, singular values, the fit of the first principal component, and
   PCA. For the band method, plots only the spectra and the fit.


.. py:function:: analyze_spectra_glob(titration, ds, dbands = None)

   Analyze multi-label spectra and visualize the results.


.. py:function:: fit_binding_glob(ds, *, method = 'lm', reweight = None, remove_outliers = None, max_iter = 15, tol = 0.01, scale_covar = True)

   Analyze multi-label titration datasets and visualize the results.

   Unified fitting function that supports standard least-squares and robust
   fitting with optional iterative reweighting and outlier detection.

   :param ds: Input dataset with *x*, *y*, and *y_err* for each label.
   :type ds: Dataset
   :param method: Fitting method: ``"lm"`` (default) for standard least-squares
       or ``"huber"`` for Huber-loss robust fitting (reduces outlier influence).
   :type method: str, optional
   :param reweight: Reweighting strategy to apply **after** each residual evaluation:

       * ``"irls"`` - iteratively reweighted least-squares (uniform scale per
         label from MA-residual).

       Default is ``None`` (no reweighting).
   :type reweight: str | None, optional
   :param remove_outliers: Outlier-removal specification of the form
       ``"zscore:threshold:min_keep"`` where *threshold* is the z-score cutoff
       and *min_keep* is the minimum number of points required per label.
       Default is ``None``.
   :type remove_outliers: str | None, optional
   :param max_iter: Maximum number of iterations for iterative procedures
       (reweighting). Default is 15.
   :type max_iter: int, optional
   :param tol: Convergence tolerance on the reduced chi-squared. The loop stops
       when the improvement drops below this value. Default is 0.01.
   :type tol: float, optional
   :param scale_covar: Whether to scale the covariance matrix. Default is ``True``.
   :type scale_covar: bool, optional

   :returns: An object containing the fit results, plot figure, minimizer, and
       dataset copy.
   :rtype: FitResult[Minimizer]

   :raises InsufficientDataError: If there are not enough data points for the
       number of parameters.

   .. rubric:: Notes

   Parameter uncertainties are scaled by :math:`\sqrt{\chi^2_\nu}` via lmfit's
   ``Minimizer(scale_covar=True)``, which improves coverage when errors are
   underestimated.

   Residuals returned are WEIGHTED (``weight * (observed - predicted)``) where
   ``weight = 1/y_err``. This is appropriate for heteroscedastic data where
   different observations have different uncertainties.
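
The weighting and scaling described in the notes above can be illustrated with a minimal, standard-library-only sketch. It does not call clophfit or lmfit. The helper names (``binding_1site``, ``weighted_residuals``, ``reduced_chi2``), the particular parameterization of the 1-site curve, the synthetic data, and the unscaled standard error are all assumptions made for illustration and may differ from the module's actual implementation.

```python
import math


def binding_1site(x, K, S0, S1, is_ph):
    """Generic 1-site curve (hypothetical helper, not clophfit's API).

    Assumed convention: S0 is the signal of the unbound state and S1 the
    signal of the bound state.
    """
    if is_ph:
        # pH titration: K acts as a pKa; bound (protonated) fraction.
        frac = 1.0 / (1.0 + 10.0 ** (x - K))
    else:
        # Ligand titration: K is a dissociation constant.
        frac = x / (K + x)
    return S0 + (S1 - S0) * frac


def weighted_residuals(x, y, y_err, params, is_ph):
    """Return weight * (observed - predicted) with weight = 1 / y_err."""
    K, S0, S1 = params
    return [
        (yi - binding_1site(xi, K, S0, S1, is_ph)) / ei
        for xi, yi, ei in zip(x, y, y_err)
    ]


def reduced_chi2(residuals, n_params):
    """Sum of squared weighted residuals divided by the degrees of freedom."""
    dof = len(residuals) - n_params
    if dof <= 0:
        raise ValueError("not enough data points for the number of parameters")
    return sum(r * r for r in residuals) / dof


# Synthetic pH titration: every point sits 0.02 above the model curve,
# while the stated uncertainty is only 0.01 (errors are underestimated).
params = (7.0, 0.0, 1.0)  # K, S0, S1 (illustrative values)
x = [5.0, 6.0, 7.0, 8.0, 9.0]
y = [binding_1site(xi, *params, is_ph=True) + 0.02 for xi in x]
y_err = [0.01] * len(x)

residuals = weighted_residuals(x, y, y_err, params, is_ph=True)
chi2_red = reduced_chi2(residuals, n_params=len(params))

# Scaling the covariance by the reduced chi-squared multiplies each
# standard error by its square root.
unscaled_stderr = 0.05  # made-up value for illustration
scaled_stderr = unscaled_stderr * math.sqrt(chi2_red)

print(residuals, chi2_red, scaled_stderr)
```

Because the stated ``y_err`` is half the actual deviation, the reduced chi-squared comes out well above 1 and the scaled standard error is correspondingly inflated, which is the behaviour the notes attribute to ``scale_covar=True``.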