BayesicFitting

Model Fitting and Evidence Calculation

View project on GitHub



class BaseFitter( object )Source

Base class for all Fitters.

The Fitter class is to be used in conjunction with *Model classes.

The Fitter class and its descendants fit data to a model. Fitter itself is the variant for linear models, ie. models linear in its parameters.

For both linear and nonlinear models it holds that once the optimal estimate of the parameters is found, a variety of calculations is exactly the same: standard deviations, noise scale, evidence and model errors. They all derive more or less from the inverse Hessian matrix ( aka the covariance matrix ). All these calculations are in this Fitter class. Other Fitter classes relegate their calculation in these issues to this one.

Examples

It is not possible to use this class. User Fitter, CurveFitter etc. in stead

Note Also

  1. The calculation of the evidence is an Gaussian approximation which is
        only exact for linear models with a fixed scale.
  2. Attributes labelled as read only should not be set by a user.
  • Author : Do Kester

Attributes

  • model : Model
         the model to be fitted

  • xdata : array_like
         independent variable(s)

  • nxdata : int (read only)
         length of the xdata vector(s)

  • ndim : int (read only)
         number of xdata vectors

  • weights : array_like
         the weights on the data from the last fit

  • imageAssistant : ImageAssistant
         to convert images to pixels indices, needed for a fit

  • keep : dict of {int : float}
         to keep the indexed (int) parameter at the provided value (float)

  • fitIndex : list of int (or None)
         list of parameter indices to fit (None is all)

  • npfit : int
         the number of parameters fitted in the last fit.

  • fixedScale : float
         the fixed noise scale.
         The presence of fixedScale has consequences for the definitions of chisq,
         (co)variance, stdevs and evidence

  • minimumScale : float
         introduce a minimum value for the noise scale

  • design : matrix (read only)
         the design matrix (partial of model to parameters)
         returns self.getDesign()

Attributes (available after a call to fit())

  • yfit : array_like
         The model result at the optimal value for the parameters.
         If map is true, a map is returned.
  • chisq : float (read only)
         chisquared of the fit
  • parameters : ndarray
         parameters fitted to the model
         returns self.model.parameters
  • stdevs, standardDeviations : array_like (read only)
         the standard deviations on the parameters
         returns self.getStandardDeviations()
  • hessian : matrix (read only)
         the hessian matrix
  • covariance : matrix (read only)
         the covariance matrix
         returns self.getCovarianceMatrix()
  • scale : float
         the noise scale
         returns self.getScale()
  • sumwgt : float (read only)
         sum of the weights
  • logZ : float (read only)
         the e-log of the evidence
         returns self.getLogZ()
  • evidence : float (read only)
         the 10log of the evidence (logZ / log(10))
         returns self.getEvidence()

Attributes (available after a call to getLogZ() or getEvidence())

  • logOccam : float (read only)
         Occam factor
  • logLikelihood : float (read only)
         log of the likelihood

BaseFitter( xdata, model, map=False, keep=None, fixedScale=None )

Create a new Fitter, providing inputs and model.

A Fitter class is defined by its model and the input vectors ( the independent variable ). When a fit to another model and/or another input vector is needed a new object should be created.

Parameters

  • xdata : array_like
         independent input variable(s)
  • model : Model
         the model function to be fitted
  • map : bool (False)
         When true, the xdata should be interpreted as a map.
         The fitting is done on the pixel indices of the map,
         using ImageAssistant
  • keep : dict of {int:float}
         dictionary of indices (int) to be kept at a fixed value (float)
         The values of keep will be used by the Fitter as long as the Fitter exists.
         See also fit( ..., keep=dict )
  • fixedScale : None or float
         None : the noise scale is not fixed
         float: value of the fixed noise scale
         The value of fixedScale only influences the evidence calculation

Raises

ValueError when one of the following is true
     1. Dimensionality of model and input does not match.
     2. Nans in input stream.
     3. Model is not the head of a compound model chain.

setMinimumScale( scale=0 )
Introduce a minimum in scale calculation and consequently in chisq.
     chi2 >= sumwgt * scale2

Parameters

  • scale : float
         minimum scale

fitprolog( ydata, weights=None, accuracy=None, keep=None )
Prolog for all Fitters.
  1. Checks data/weighs/accuracy for Nans
  2. Makes fitIndex.

Parameters

  • ydata : array_like
         the data vector to be fitted
  • weights : array_like
         weights pertaining to the data
  • accuracy : float or array_like
         accuracy of (individual) data
  • keep : dict of {int:float}
         dictionary of indices (int) to be kept at a fixed value (float)

Returns

  • fitIndex : ndarray of int
         Indices of the parameters that need fitting
  • ydata : ndarray
         Only different from input when ydata is a map
  • fitwgts : float or ndarray
         Combines weights and accuracy into ( weights / accuracy^2 )
         1.0 if both are None

fitpostscript( ydata, plot=False )
Produce a plot of the results.

keepFixed( keep=None )
Keeps parameters fixed at the provided values.
  1. The model will act exactly as if it were a model with less
        parameters, although slightly less efficient.
  2. Repeated calls start from scratch.
  3. Reset with keepFixed( None )

Parameters

  • keep : dict of {int:float}
         dictionary of indices (int) to be kept at a fixed value (float)

Returns

  • fitIndex : list of int (or None)
         list of parameter indices to be kept

insertParameters( fitpar, index=None, into=None )
Insert fitparameters into the parameters when fitIndex is present.

Parameters

  • fitpar : list of float
         (fitted) parameters
  • index : list of int
         list of parameter indices to be kept
  • into : list of float
         array into which the fitpar need to be inserted.

modelFit( ydata, weights=None, keep=None )
Return model fitted to the data.

Parameters

  • ydata : array_like
         the data vector to be fitted.
  • weights : None or array_like
         weights to be used
  • keep : dict of {int:float}
         dictionary of indices (int) to be kept at a fixed value (float)
         The values will override those at initialization.
         They are only used in this call of fit.

limitsFit( ydata, weights=None, keep=None )
Fit the data to the model. When a parameter(s) transgresses the limits, it set and fixed at that limit and the fit is done again, excluding the parameter(s) When the chisq landscape is largely monomodal (no local minima) this is OK.

Parameter

  • ydata : array_like
         data that the model needs to be fit to
  • weights : array_like
         weights partaining to the data.
  • keep : dict of {int:float}
         dictionary of indices (int) to be kept at a fixed value (float)
         The values will override those at initialization.
         They are only used in this call of fit.

Returns

  • pars : array_like
         the parameters of the fit

Raises

Warning when parameters have been reset at the limits.

fit( ydata, weights=None, keep=None )
Return model parameters fitted to the data.

Parameters

  • ydata : array_like
         the data vector to be fitted.
  • weights : array_like
         weights to be used
  • keep : dict of {int:float}
         dictionary of indices (int) to be kept at a fixed value (float)
         The values will override those at initialization.
         They are only used in this call of fit.

Raises

NotImplementedError. BaseFitter cannot perform fits by itself.

checkNan( ydata, weights=None, accuracy=None )
Check there are no Nans or Infs in ydata or weights or accuracy. Check also for zeros or negatives in accuracy.

Parameters

  • ydata : array_like
         data to be fitted.
  • weights : array_like
         weights pertaining to ydata
  • accuracy : float or array_like
         accuracy of (individual) data

Raises

ValueError.

getVector( ydata, index=None )
Return the β-vector.

It includes "normalized" data if present. See normalize().

Parameters

  • ydata : array_like
         the data vector to be fitted. When such is appliccable, it should be
         multiplied by weights and/or appended by normdata.
  • index : list of int
         index of parameters to be fixed

getHessian( params=None, weights=None, index=None )
Calculates the hessian matrix for a given set of model parameters.

It includes "normalized" data if present. See normalize()

Parameters

  • params : array_like
         the model parameters to be considered
  • weights : array_like
         weights to be used
  • index : list of int
         index of parameters to be fixed

getInverseHessian( params=None, weights=None, index=None )
Return the inverse of the Hessian Matrix, H.

Parameters

  • ydata : array_like
         the data vector to be fitted.
  • weights : array_like
         weights to be used
  • index : list of int
         index of parameters to be fixed

getCovarianceMatrix( )
Returns the inverse hessian matrix over the fitted parameters,
         multiplied by the variance.

Stdevs are found from this as np.sqrt( np.diag( covarianceMatrix ) )

makeVariance( scale=None )
Return the (calculated) variance of the remaining noise. I.e.
     var = chisq / dof when automatic noise scaling is requested or
     var = scale2 when we have a fixed scale.

Parameters

  • scale : float
         noise scale to be used

normalize( normdfdp, normdata, weight=1.0 )
If for some reason the model is degenerate, e.g when two parameters measure essentially the same thing, This method can disambiguate these parameters.

It is like adding a dummy measurement of one (or more) parameter to the data.

Parameters

  • normdfdp : array_like
         for each parameter to sum to value (same length as self.parameters)
  • normdata : float
         simulated data value
  • weight : float
         weight of this measurement

getDesign( params=None, xdata=None, index=None )
Return the design matrix, D. The design matrix is also known as the Jacobian Matrix.

Parameters

  • xdata : array_like
         the independent input data
  • params : array_like
         parameters of the model
  • index : list of int
         index of parameters to be fixed

chiSquared( ydata, params=None, weights=None )
Calculates Chi-Squared for data and weights.

It is the (weighted) sum of the squared residuals.

Parameters

  • ydata : array_like
         the data vector to be fitted.
  • params : array_like
         parameters for the model
  • weights : array_like
         weights to be used

Raises

ValueError when chisq <= 0.

getStandardDeviations( )
Calculates of standard deviations pertaining to the parameters.

     σi = s * sqrt( C[i,i] )

where C is the Covariance matrix, the inverse of the Hessian Matrix and s is the noiseScale.

Standard deviation are calculated for the fitted parameters only.

Note that the stdev will decrease with sqrt( N ) of the number of datapoints while the noise scale, s, does not.

monteCarloError( xdata=None, monteCarlo=None)
Calculates σ-confidence regions on the model given some inputs.

From the full covariance matrix (inverse of the Hessian) random samples are drawn, which are added to the parameters. With this new set of parameters the model is calculated. This procedure is done by default, 25 times. The standard deviation of the models is returned as the error bar.

The calculation of the confidence region is delegated to the class MonteCarlo. For tweaking of that class can be done outside BaseFitter.

Parameters

  • xdata : array_like
         input data over which to calculate the error bars.
  • monteCarlo : MonteCarlo
         a ready-made MonteCarlo class.

getScale( )
Return
  • float : the noise scale
         scale = sqrt( chisq / DOF )

Raise

RuntimeError when DoF <= 0. The number of (weighted) datapoints is too small.

getEvidence( limits=None, noiseLimits=None )
Calculation of the evidence, log10( Z ), for the model given the data.

     E = log10( P( Model | data ) )

The calculation of the evidence uses a Gaussion approximation of the Posterior probability. It needs to know the limits of the parameters (and the noise scale if applicable), either from the priors in the model or from keywords "limits/noiseLimits".

Parameters

  • limits : list of 2 floats/array_likes
         possible range of the parameters. ( [low,high] )
  • noiseLimits : list of 2 floats
         possible range on noise scale ( [low,high] )

Raises

ValueError when no Prior is available

getLogLikelihood( autoscale=False, var=1.0 )
Return the log likelihood.

It is implementing eq 19/20 last parts (Kester 2002) term by term

Parameters

  • autoscale : bool
         whether the noise scale is optimized too
  • var : float
         variance

getLogZ( limits=None, noiseLimits=None )
Calculation of the evidence, log( Z ), for the model given the data.

     logZ = log( P( Model | data ) )

The calculation of the evidence uses a Gaussion approximation of the Posterior probability. It needs to know the limits of the parameters (and the noise scale if applicable), either from the priors in the model or from keywords "limits/noiseLimits".

Parameters

  • limits : list of 2 floats/array_likes
         possible range of the parameters. ( [low,high] )
  • noiseLimits : list of 2 floats
         possible range on noise scale ( [low,high] )

Raises

ValueError when no Prior is available

RuntimeError when DoF <= 0. The number of (weighted) datapoints is too small.

plotResult( xdata=None, ydata=None, model=None, residuals=True, confidence=False, show=True )
Plot the results of the fit.

Parameters

  • xdata : array_like
         xdata of the problem
  • ydata : array_like
         ydata of the problem
  • model : Model
         the model the ydata are fitted to at xdata.
  • residuals : bool
         plot the residuals in a separate panel.
  • confidence : bool
         plot confidence region
  • show : bool
         display the plot.