BayesicFitting

Model Fitting and Evidence Calculation

View project on GitHub



class RobustShell( IterativeFitter )Source

RobustShell tries to make a fit more robust in the presence of outliers.

It is a shell around another fitter. Technically it is in itself a a Fitter, but with limited possiblities.

A RobustShell tries to make a fit more robust in the presence of outliers. It does it by manipulating the weights: outliers are downweighted. "Normal" points keep their weights, more or less.

Apart from methods specific to the robustification, RobustShell has a fit method and little else from the Fitter family. Methods to get chi2, the covariance matrix, the evidence, the noise scale etc. should be taken from the embedded fitter.

The theory behind robust fitting can be found in Wikipedia: Robust Statistics, and the references therein.

To determine which points are "normal" and which are "outliers" we need a previous fit. The difference to this earlier fit determines the normalcy. Its amount is used to adjust the weights.

The fact that we need a previous iteration makes robust estimation a iterative procedure. Typically the procedure should stabilize in half a dozen steps maximally.

There are several schemes to adjust the weights. But before we go into that we need two values in each scheme. Firstly the amount of noise present in the data. By default the noise is taken from the previous fit via Fitter.scale. The second value needed is the size of the influence domain in terms of the noise scale.

For all schemes the deviant is calculated as the difference between data and fit divided by noisescale and domainsize

d = ( data - fit ) / ( noisescale * domainsize )

The domainsize is defined such that deviants upto 3 times the noisescale fall within the fwhm.

A number of weighting schemes are provided.

Kernel domain support edges comment
Biweight 5.54 bound smooth Tukey: Default kernel
CosSquare 6.00 bound smooth
Tricube 5.08 bound smooth
Triweight 6.60 bound smooth
Uniform 3.00 bound hard clip outside domain
Cosine 4.50 bound hard
Triangle 6.00 bound hard
Parabola 4.50 bound hard
Huber 1.50 unbound smooth In domain mean; out domain median
Gauss 2.12 unbound smooth
Lorentz 3.00 unbound smooth

Other schemes can be written by making another Kernel or writing a function
     wgts = func( d ) where d is the deviant as above.

Attributes

  • fitter : BaseFitter
         The fitter to be used
  • kernel : Kernel or callable
         Kernel take function from this kernel
         callable in the form f(d), where d = ( data - mock ) / ( domain * scale )
  • domain : float
         domain of the kernel function
  • onesided : [-1,0,+1]
         -1 apply to negative residuals
          0 aplly to both sided (not onesided)
         +1 apply to positive residuals.

Notes

Robust fitting is even more dangerous than ordinary fitting. Never trust what you get without thorough checking.

Example

model = PolynomialModel( 1 )                # some model
x = numpy.arange( 100, dtype=float ) / 100  # some x values
y = numpy.arange( 100, dtype=float ) / 4    # digitization noise
y[1,11,35,67] += 10                         # create outliers
ftr = Fitter( x, model )                    # a fitter, a model and a x
rob = RobustShell( ftr, domain=7 )          # robust fitter using someFtr
par = rob.fit( y )                          # solution
print( rob )                                #
print( rob.weights )                        # print final weights
print( ftr.chisq )                          # get from someFtr

Author Do Kester.

RobustShell( fitter, kernel=Biweight, domain=None, onesided=None, **kwargs )

Create a new class, providing the fitter to be used.

Parameters

  • fitter : BaseFitter
          to be used
  • kernel : Kernel or callable
         All Kernels have a method result( d ) which is applied to the deviants.
         where d = ( data - model ) / ( domain * scale )
         If kernel is a callable method it is assumed to be a similar result mathod.
  • domain : None or float
         Width of the kernel.
         None : automatic calculation of domain according to table in class doc.
         float : overrides autocalculation.
  • onesided : None or "positive" or "p" or "negative" or "n"
         None : apply robust weights to positive and negative residuals
         "positive" : apply robust weights to positive residuals only
         "negative" : apply robust weights to negative residuals only

setKernel( kernel )
Set the robust kernel to be used.

Parameters

  • kernel : Kernel or callable
         All Kernels have a method result( d ) which is applied to the deviants.
         where d = ( data - model ) / ( domain * scale )
         If kernel is a callable method it is assumed to be a similar result mathod.

Raises

ValueError when kernel is not recognized.

setOneSided( onesided )
set self.onesided to either 0 or +1 or -1.

Parameters

  • onesided : None or "positive" or "negative"
         None : apply robust weights to positive and negative residuals
         "positive" : apply robust weights to positive residuals only
         "negative" : apply robust weights to negative residuals only

Raises

ValueError when onesided could not be interpreted.

fit( data, weights=None, kernel=None, domain=None, onesided=None, **kwargs )
Perform a robustification step.

Parameters

  • data : array_like
         the data as they go into a fitter
  • kwargs : dict
         keyword args to be passed to fitter.fit()
Methods inherited from IterativeFitter
Methods inherited from BaseFitter