Damping Factors
Unstable Refinements
In the section on the Rietveld method, it was stated that the process of non-linear least-squares could be used to calculate parameter shifts, δp_i, that could be added to the parameters, p_i, so as to obtain improved values, i.e.
    p_i(new) = p_i(old) + δp_i
the whole process being iterated many times so as to achieve minimisation. There was little discussion of the practical problems of least-squares other than a brief comment on the effects of a poor model (matrix difficult to invert) and poor data (bad shifts on the parameters). When the model and data are both good, the shifts on the refined parameters will be small: consequently, the refinement should converge smoothly to a well-defined minimum (as illustrated in the figure on least-squares).
However, sometimes our model may not be as close to reality as we desire, in which case the assumption made in our non-linear least-squares derivation will be flawed. Nonetheless, our model may be sufficiently close to reality that the direction given by the first derivatives is correct, but not the magnitude, i.e. the shifts on the parameters may be in the right direction but not by the right amount. If one were to use such calculated shifts, one would overshoot the minimum, possibly by so much that the refined model deviates even further from reality. One solution to this problem is to use damping factors.
Damping Factors
There are several types of damping factor in crystallography: some now in use in single-crystal refinement programs are quite sophisticated, while many available in Rietveld software can be quite crude. The simplest (and crudest) approach is to apply a damping multiplier, D, to the calculated shifts as follows:
    p_i(new) = p_i(old) + D δp_i
where the value of D lies in the range 0 to 1. Clearly, a zero value results in the parameters remaining fixed, while a value of unity corresponds to the full shift being applied.
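The damped update can be sketched in a few lines of Python. This is a minimal illustration that assumes the shifts have already been calculated by a least-squares cycle; the function name `apply_damped_shifts` and the list-based storage of parameters are hypothetical and are not taken from any particular Rietveld program.

```python
def apply_damped_shifts(params, shifts, damping=1.0):
    """Return the updated parameters p_i + D * dp_i.

    `damping` is the multiplier D: 0 leaves the parameters unchanged,
    1 applies the full calculated shift.
    """
    if not 0.0 <= damping <= 1.0:
        raise ValueError("damping factor D must lie in the range 0 to 1")
    if len(params) != len(shifts):
        raise ValueError("params and shifts must have the same length")
    return [p + damping * dp for p, dp in zip(params, shifts)]


# Hypothetical values: two parameters and their calculated shifts.
old_params = [1.0, 2.0]
calc_shifts = [0.5, -1.0]
half_damped = apply_damped_shifts(old_params, calc_shifts, damping=0.5)
```

With D = 0.5 each parameter moves only half of the way suggested by the least-squares cycle, which reduces the risk of overshooting the minimum.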
Due to the nature of the equations in the Rietveld refinement procedure, some parameters are more non-linear than others. Consequently, some programs may have separate damping parameters for various classes of parameter, e.g. atomic coordinates, thermal displacement parameters, scale and site occupancy factors, peak-shape parameters, and so on. As a rule of thumb, the more non-linear the parameter, the smaller the value of D required.
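A sketch of damping by parameter class is given below. The class labels and the values of D are invented purely for illustration (they are not recommendations, and real programs define their own classes); the function name `apply_class_damping` is likewise hypothetical.

```python
def apply_class_damping(params, shifts, classes, damping_by_class):
    """Apply a separate damping factor D to each class of parameter."""
    updated = []
    for p, dp, cls in zip(params, shifts, classes):
        d = damping_by_class.get(cls, 1.0)  # undamped if the class is not listed
        updated.append(p + d * dp)
    return updated


# Hypothetical damping factors, chosen only to illustrate the mechanism.
damping_by_class = {"scale": 1.0, "coordinate": 0.5, "profile": 0.25}
new_params = apply_class_damping(
    params=[10.0, 0.25, 0.5],
    shifts=[2.0, 0.5, 1.0],
    classes=["scale", "coordinate", "profile"],
    damping_by_class=damping_by_class,
)
```

Here the parameter labelled "scale" receives its full shift, while the one labelled "profile" receives only a quarter of its calculated shift.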
After a few cycles of refinement, damping factors should be reset to unity since the model should be sufficiently close to reality that the refinement will be stable. In addition, this will permit the minimisation point to be reached in an optimum number of least-squares cycles.
Shift-Limiting Constraints
Damping factors are crude, but functional. A better approach is to consider whether the calculated shifts are reasonable. As a simple example of this approach, consider the refinement of the atomic coordinates, x,y,z, of an atom: if our model is approximately correct, then it may reasonably be assumed that the starting position of the atom is, say, within 0.1 Å of its true position. Thus if the coordinate shifts, δx,δy,δz, result in a move of the atom by more than this distance, then it can be assumed that these specific coordinate shifts require damping. The solution is to move the atom by the reasonable distance, say 0.1 Å, along the vector determined by the excessive shift parameters. This approach has the big advantage that only the unstable parameters are targeted for damping rather than every parameter. Unfortunately, while this approach is common in single-crystal refinement programs, it is often missing in the equivalent Rietveld software.
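The shift-limiting idea can be sketched as follows. For simplicity, this example assumes that the coordinate shifts have already been converted from fractional coordinates to a Cartesian vector in Å (a step that requires the unit-cell parameters and is not shown); the function name `limit_shift` and the default limit of 0.1 Å are illustrative choices.

```python
import math


def limit_shift(shift, max_distance=0.1):
    """Shorten a Cartesian shift vector (in Å) to at most max_distance.

    The direction of the shift is preserved; only its length is reduced.
    """
    length = math.sqrt(sum(component * component for component in shift))
    if length <= max_distance:
        return tuple(shift)  # shift is reasonable: apply it in full
    scale = max_distance / length
    return tuple(component * scale for component in shift)


# Hypothetical shift of 0.5 Å, which exceeds the 0.1 Å limit.
limited = limit_shift((0.3, 0.0, 0.4), max_distance=0.1)
```

A shift that is already shorter than the limit is returned unchanged, so well-behaved atoms are not damped at all.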
Marquardt Parameter
Both of the above approaches modify the shifts to the parameters after the inversion of the least-squares matrix. Given that refinement instabilities often result from large correlations between refined parameters, an alternative approach is possible. Without going into mathematical detail, information concerning correlations between refined parameters is contained in the off-diagonal elements of the least-squares matrix. (You may recall that the off-diagonal elements of the inverted matrix provide the correlation coefficients.) The off-diagonal elements of the least-squares matrix may be down-weighted before inversion by increasing the value of the elements on the main diagonal of the matrix by an amount determined by the so-called Marquardt parameter, named after the mathematician who invented the method. This reduces correlations between parameters but, as with all damping methods, at the expense of a much slower convergence towards the true minimum. A few Rietveld programs have the Marquardt method installed as a default.
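The effect of increasing the diagonal elements can be illustrated with a toy two-parameter example. The sketch below assumes one common form of the method, in which each diagonal element is multiplied by (1 + λ), where λ is the damping parameter; individual programs may differ in detail. The matrix, the right-hand vector, and the function names are all invented for illustration.

```python
def marquardt_damped(matrix, lam):
    """Return a copy of the matrix with each diagonal element scaled by (1 + lam)."""
    n = len(matrix)
    return [
        [matrix[i][j] * (1.0 + lam) if i == j else matrix[i][j] for j in range(n)]
        for i in range(n)
    ]


def solve_2x2(a, b):
    """Solve the 2x2 linear system a * x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("matrix is singular")
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - a[1][0] * b[0]) / det
    return [x0, x1]


# Hypothetical least-squares matrix for two highly correlated parameters.
a = [[1.0, 0.99], [0.99, 1.0]]
b = [1.0, 0.0]
full_shifts = solve_2x2(a, b)
damped_shifts = solve_2x2(marquardt_damped(a, 0.1), b)
```

In this example the undamped shifts are very large and of opposite sign, which is typical of two strongly correlated parameters, whereas the damped matrix yields much smaller shifts.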
Block Diagonalisation
The ultimate form of down-weighting is to set the off-diagonal elements that link different groups of parameters to zero. This leaves small blocks of non-zero matrix on the main diagonal, each of which may be inverted independently. The rationale behind this approach is that some parameters should not be correlated: in single-crystal studies, this may well be true, but the problem in powder diffraction is that all of the parameters usually have some correlation with the profile parameters, thus making this approach less reasonable in Rietveld refinement.
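As a rough sketch of block diagonalisation, the function below zeroes every matrix element that links parameters belonging to different blocks. The grouping of parameters into blocks, the example matrix, and the function name `block_diagonalise` are assumptions made for illustration.

```python
def block_diagonalise(matrix, blocks):
    """Zero all elements that couple parameters in different blocks.

    `blocks` is a list of lists of parameter indices, e.g. [[0, 1], [2]].
    """
    n = len(matrix)
    block_of = {}
    for block_number, indices in enumerate(blocks):
        for index in indices:
            block_of[index] = block_number
    total = sum(len(indices) for indices in blocks)
    if total != n or sorted(block_of) != list(range(n)):
        raise ValueError("every parameter index must appear in exactly one block")
    return [
        [matrix[i][j] if block_of[i] == block_of[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical 3x3 matrix: parameters 0 and 1 form one block, parameter 2 another.
full_matrix = [[4.0, 1.0, 0.5], [1.0, 3.0, 0.2], [0.5, 0.2, 2.0]]
blocked = block_diagonalise(full_matrix, [[0, 1], [2]])
```

Each block of the result can then be inverted on its own, at the cost of ignoring any real correlation between parameters placed in different blocks.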
Wrong Model
It is often very tempting to apply one or more of the above damping methods to solve the problem of an unstable Rietveld refinement. However, the fact that the refinement is unstable should suggest that the model is not as good as it should be. For example, any attempt to refine parameters related by symmetry will usually result in an unstable least-squares matrix. The correct solution here is not to apply damping, but to change the least-squares matrix by changing the model. This may require more thought and so is often the last method tried: nevertheless, it should be considered more often.
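To see why damping does not address this kind of problem, consider a toy case in which two refined parameters are assumed to be equivalent by symmetry, so that they change the calculated pattern in exactly the same way. The derivative values below are invented, unit weights are assumed, and the function names are hypothetical.

```python
def normal_matrix(derivatives):
    """Build the least-squares (normal) matrix from rows of derivatives.

    Each row holds d(y_calc)/d(p_j) for one data point (unit weights assumed).
    """
    n = len(derivatives[0])
    return [
        [sum(row[i] * row[j] for row in derivatives) for j in range(n)]
        for i in range(n)
    ]


def determinant_2x2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Invented derivatives at three data points: the two columns are identical
# because the two parameters are assumed to be equivalent by symmetry.
derivatives = [[1, 1], [2, 2], [3, 3]]
matrix = normal_matrix(derivatives)
is_singular = determinant_2x2(matrix) == 0
```

The determinant is exactly zero, so the matrix cannot be inverted at all; refining only one of the two parameters and tying the other to it removes the problem at its source.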
Finally, one should emphasise again that unstable refinements of this type are different to the case of instabilities due to poor data: for the latter scenario, one solution is to improve the "observed" data by adding in more known information, e.g. by using bond length and angle restraints.
© Copyright 2002-2006. Birkbeck College, University of London. | Author(s): Jeremy Karl Cockcroft