Virtual U.org
Get Personal Training on VU Today
    
Top shadow
 
 register/help
User Name:

Password:

NEWMATNL.H Source File
Main Page   Namespace List   Class Hierarchy   Alphabetical List   Compound List   File List   Namespace Members   Compound Members   File Members  

NEWMATNL.H

Go to the documentation of this file.
00001 //Owner: Fred
00002 //$$ newmatnl.h           definition file for non-linear optimisation
00003 
00004 // Copyright (C) 1993,4,5: R B Davies
00005 
00006 #ifndef NEWMATNL_LIB
00007 #define NEWMATNL_LIB 0
00008 
00009 #include "newmat.h"
00010 
00011 #ifdef use_namespace
00012 namespace NEWMAT {
00013 #endif
00014 
00015     /*
00016 
00017       This is a beginning of a series of classes for non-linear optimisation.
00018 
00019       At present there are two classes. FindMaximum2 is the basic optimisation
00020       strategy when one is doing an optimisation where one has first
00021       derivatives and estimates of the second derivatives. Class
00022       NonLinearLeastSquares is derived from FindMaximum2. This provides the
00023       functions that calculate function values and derivatives.
00024 
00025       A third class is now added. This is for doing maximum-likelihood when
00026       you have first derviatives and something like the Fisher Information
00027       matrix (eg the variance covariance matrix of the first derivatives or
00028       minus the second derivatives - this matrix is assumed to be positive
00029       definite).
00030 
00035       class FindMaximum2
00036 
00037       Suppose T is the ColumnVector of parameters, F(T) the function we want
00038       to maximise, D(T) the ColumnVector of derivatives of F with respect to
00039       T, and S(T) the matrix of second derivatives.
00040 
00041       Then the basic iteration is given a value of T, update it to
00042 
00043       T - S.i() * D
00044 
00045       where .i() denotes inverse.
00046 
00047       If F was quadratic this would give exactly the right answer (except it
00048       might get a minimum rather than a maximum). Since F is not usually
00049       quadratic, the simple procedure would be to recalculate S and D with the
00050       new value of T and keep iterating until the process converges. This is
00051       known as the method of conjugate gradients.
00052 
00053       In practice, this method may not converge. FindMaximum2 considers an
00054       iteration of the form
00055 
00056       T - x * S.i() * D
00057 
00058       where x is a number. It tries x = 1 and uses the values of F and its
00059       slope with respect to x at x = 0 and x = 1 to fit a cubic in x. It then
00060       choses x to maximise the resulting function. This gives our new value of
00061       T. The program checks that the value of F is getting better and carries
00062       out a variety of strategies if it isn't.
00063 
00064       The program also has a second strategy. If the successive values of T
00065       seem to be lying along a curve - eg we are following along a curved
00066       ridge, the program will try to fit this ridge and project along it. This
00067       doesn't work at present and is commented out.
00068 
00069       FindMaximum2 has three virtual functions which need to be over-ridden by
00070       a derived class.
00071 
00072       void Value(const ColumnVector& T, bool wg, Real& f, bool& oorg);
00073 
00074       T is the column vector of parameters. The function returns the value of
00075       the function to f, but may instead set oorg to true if the parameter
00076       values are not valid. If wg is true it may also calculate and store the
00077       second derivative information.
00078 
00079       bool NextPoint(ColumnVector& H, Real& d);
00080 
00081       Using the value of T provided in the previous call of Value, find the
00082       conjugate gradients adjustment to T, that is - S.i() * D. Also return
00083 
00084       d = D.t() * S.i() * D.
00085 
00086       NextPoint should return true if it considers that the process has
00087       converged (d very small) and false otherwise. The previous call of Value
00088       will have set wg to true, so that S will be available.
00089 
00090       Real LastDerivative(const ColumnVector& H);
00091 
00092       Return the scalar product of H and the vector of derivatives at the last
00093       value of T.
00094 
00095       The function Fit is the function that calls the iteration.
00096 
00097       void Fit(ColumnVector&, int);
00098 
00099       The arguments are the trial parameter values as a ColumnVector and the
00100       maximum number of iterations. The program calls a DataException if the
00101       initial parameters are not valid and a ConvergenceException if the
00102       process fails to converge.
00103 
00107       class NonLinearLeastSquares
00108 
00109       This class is derived from FindMaximum2 and carries out a non-linear
00110       least squares fit. It uses a QR decomposition to carry out the
00111       operations required by FindMaximum2.
00112 
00113       A prototype class R1_Col_I_D is provided. The user needs to derive a
00115       class from this which includes functions the predicted value of each
00116       observation its derivatives. An object from this class has to be
00117       provided to class NonLinearLeastSquares.
00118 
00119       Suppose we observe n normal random variables with the same unknown
00120       variance and such the i-th one has expected value given by f(i,P)
00121       where P is a column vector of unknown parameters and f is a known
00122       function. We wish to estimate P.
00123 
00124       First derive a class from R1_Col_I_D and override Real operator()(int i)
00125       to give the value of the function f in terms of i and the ColumnVector
00126       para defined in class R1_CoL_I_D. Also override ReturnMatrix
00127       Derivatives() to give the derivates of f at para and the value of i
00128       used in the preceeding call to operator(). Return the result as a
00129       RowVector. Construct an object from this class. Suppose in what follows
00130       it is called pred.
00131 
00132       Now constuct a NonLinearLeastSquaresObject accessing pred and optionally
00133       an iteration limit and an accuracy critierion.
00134 
00135       NonLinearLeastSquares NLLS(pred, 1000, 0.0001);
00136 
00137       The accuracy critierion should be somewhat less than one and 0.0001 is
00138       about the smallest sensible value.
00139 
00140       Define a ColumnVector P containing a guess at the value of the unknown
00141       parameter, and a ColumnVector Y containing the unknown data. Call
00142 
00143       NLLS.Fit(Y,P);
00144 
00145       If the process converges, P will contain the estimates of the unknown
00146       parameters. If it doesn't converge an exception will be generated.
00147 
00148       The following member functions can be called after you have done a fit.
00149 
00150       Real ResidualVariance() const;
00151 
00152       The estimate of the variance of the observations.
00153 
00154       void GetResiduals(ColumnVector& Z) const;
00155 
00156       The residuals of the individual observations.
00157 
00158       void GetStandardErrors(ColumnVector&);
00159 
00160       The standard errors of the observations.
00161 
00162       void GetCorrelations(SymmetricMatrix&);
00163 
00164       The correlations of the observations.
00165 
00166       void GetHatDiagonal(DiagonalMatrix&) const;
00167 
00168       Forms a diagonal matrix of values between 0 and 1. If the i-th value is
00169       larger than, say 0.2, then the i-th data value could have an undue
00170       influence on your estimates.
00171 
00172     */
00173 
00178     class FindMaximum2 {
00179         virtual void Value(const ColumnVector&, bool, Real&, bool&) = 0;
00180         virtual bool NextPoint(ColumnVector&, Real&) = 0;
00181         virtual Real LastDerivative(const ColumnVector&) = 0;
00182     public:
00183         void Fit(ColumnVector&, int);
00184     };
00185 
00189     class R1_Col_I_D {
00190         // The prototype for a Real function of a ColumnVector and an
00191         // integer.
00192         // You need to derive your function from this one and put in your
00193         // function for operator() and Derivatives() at least.
00194         // You may also want to set up a constructor to enter in additional
00195         // parameter values (that won't vary during the solve).
00196 
00197     protected:
00198         ColumnVector para;                          // Current x value
00199 
00200     public:
00201         virtual bool IsValid() { return true; }
00202         // is the current x value OK
00203         virtual Real operator()(int i) = 0;         // i-th function value at current para
00204         virtual void Set(const ColumnVector& X) { para = X; }
00205         // set current para
00206         bool IsValid(const ColumnVector& X)
00207             { Set(X); return IsValid(); }
00208         // set para, check OK
00209         Real operator()(int i, const ColumnVector& X)
00210             { Set(X); return operator()(i); }
00211         // set para, return value
00212         virtual ReturnMatrix Derivatives() = 0;
00213         // return derivatives as RowVector
00214     };
00215 
00219     class NonLinearLeastSquares : public FindMaximum2 {
00220         // these replace the corresponding functions in FindMaximum2
00221         void Value(const ColumnVector&, bool, Real&, bool&);
00222         bool NextPoint(ColumnVector&, Real&);
00223         Real LastDerivative(const ColumnVector&);
00224 
00225         Matrix X;                                     // the things we need to do the
00226         ColumnVector Y;                               // QR triangularisation
00227         UpperTriangularMatrix U;                      // see the write-up in newmata.txt
00228         ColumnVector M;
00229         Real errorvar, criterion;
00230         int n_obs, n_param;
00231         const ColumnVector* DataPointer;
00232         RowVector Derivs;
00233         SymmetricMatrix Covariance;
00234         DiagonalMatrix SE;
00235         R1_Col_I_D& Pred;                             // Reference to predictor object
00236         int Lim;                                      // maximum number of iterations
00237 
00238     public:
00239         NonLinearLeastSquares(R1_Col_I_D& pred, int lim=1000, Real crit=0.0001)
00240             : Pred(pred), Lim(lim), criterion(crit) {}
00241         void Fit(const ColumnVector&, ColumnVector&);
00242         Real ResidualVariance() const { return errorvar; }
00243         void GetResiduals(ColumnVector& Z) const { Z = Y; }
00244         void GetStandardErrors(ColumnVector&);
00245         void GetCorrelations(SymmetricMatrix&);
00246         void GetHatDiagonal(DiagonalMatrix&) const;
00247 
00248     private:
00249         void MakeCovariance();
00250     };
00251 
00252     // The next class is the prototype class for calculating the
00253     // log-likelihood.
00254     // I assume first derivatives are available and something like the
00255     // Fisher Information or variance/covariance matrix of the first
00256     // derivatives or minus the matrix of second derivatives is
00257     // available. This matrix must be positive definite.
00258 
00265     class LL_D_FI {
00266     protected:
00267         ColumnVector para;                          // current parameter values
00268         bool wg;                                    // true if FI matrix wanted
00269 
00270     public:
00271         virtual void Set(const ColumnVector& X) { para = X; }
00272         // set parameter values
00273         virtual void WG(bool wgx) { wg = wgx; }
00274         // set wg
00275 
00276         virtual bool IsValid() { return true; }
00277         // return true is para is OK
00278         bool IsValid(const ColumnVector& X, bool wgx=true)
00279             { Set(X); WG(wgx); return IsValid(); }
00280 
00281         virtual Real LogLikelihood() = 0;           // return the loglikelihhod
00282         Real LogLikelihood(const ColumnVector& X, bool wgx=true)
00283             { Set(X); WG(wgx); return LogLikelihood(); }
00284 
00285         virtual ReturnMatrix Derivatives() = 0;
00286         // column vector of derivatives
00287         virtual ReturnMatrix FI() = 0;              // Fisher Information matrix
00288     };
00289 
00290     // This is the class for doing the maximum likelihood estimation
00291 
00294     class MLE_D_FI : public FindMaximum2 {
00295         // these replace the corresponding functions in FindMaximum2
00296         void Value(const ColumnVector&, bool, Real&, bool&);
00297         bool NextPoint(ColumnVector&, Real&);
00298         Real LastDerivative(const ColumnVector&);
00299 
00300         // the things we need for the analysis
00301         LL_D_FI& LL;                                  // reference to log-likelihood
00302         int Lim;                                      // maximum number of iterations
00303         Real Criterion;                               // convergence criterion
00304         ColumnVector Derivs;                          // for the derivatives
00305         LowerTriangularMatrix LT;                     // Cholesky decomposition of FI
00306         SymmetricMatrix Covariance;
00307         DiagonalMatrix SE;
00308 
00309     public:
00310         MLE_D_FI(LL_D_FI& ll, int lim=1000, Real criterion=0.0001)
00311             : LL(ll), Lim(lim), Criterion(criterion) {}
00312         void Fit(ColumnVector& Parameters);
00313         void GetStandardErrors(ColumnVector&);
00314         void GetCorrelations(SymmetricMatrix&);
00315 
00316     private:
00317         void MakeCovariance();
00318     };
00319 
00320 #ifdef use_namespace
00321 }
00322 #endif
00323 #endif

Generated on Fri Aug 23 01:37:11 2002 for VirtualU by doxygen1.2.17