Abstract
Background
The introduction and statistical formalisation of landmarkbased methods for analysing biological shape has made a major impact on comparative morphometric analyses. However, a satisfactory solution for including information from 2D/3D shapes represented by ‘semilandmarks’ alongside welldefined landmarks into the analyses is still missing. Also, there has not been an integration of a statistical treatment of measurement error in the current approaches.
Results
We propose a procedure based upon the description of landmarks with measurement covariance, which extends statistical linear modelling processes to semilandmarks for further analysis. Our formulation is based upon a self consistent approach to the construction of likelihoodbased parameter estimation and includes corrections for parameter bias, induced by the degrees of freedom within the linear model. The method has been implemented and tested on measurements from 2D fly wing, 2D mouse mandible and 3D mouse skull data. We use these data to explore possible advantages and disadvantages over the use of standard Procrustes/PCA analysis via a combination of MonteCarlo studies and quantitative statistical tests. In the process we show how appropriate weighting provides not only greater stability but also more efficient use of the available landmark data. The set of new landmarks generated in our procedure (‘ghost points’) can then be used in any further downstream statistical analysis.
Conclusions
Our approach provides a consistent way of including different forms of landmarks into an analysis and reduces instabilities due to poorly defined points. Our results suggest that the method has the potential to be utilised for the analysis of 2D/3D data, and in particular, for the inclusion of information from surfaces represented by multiple landmark points.
Introduction
The introduction of geometric morphometrics has laid the foundations for a quantitative description of shapes and shape differences, thus revolutionising the century old quest for comparing anatomical features of organisms [1]. It is now also increasingly used to link quantitative descriptions of shape with developmental processes and associated genetic factors [2]. This process generally involves the construction of a parametric model based upon exemplar biological shape specimens, and the most popular of these are linear models. These are used to quantify and predict the correlations in shape variation between and within species. The objectives of this paper are to improve the statistical efficiency of analysis techniques used in the genetic interpretation of shape variation (morphometrics) and to broaden the scope of problems which can be tackled with shape analysis tools. In particular we believe that much shape data is not suitable for use in current approaches, and ‘semilandmarks’ (those poorly localised in one direction and the majority of measurements for smooth 3D shape) cannot be appropriately utilised [3,4].
Over a decade during the 70’s, biomathematical and biometrical aspects of biological shape studies were treated separately. This early work was later criticised during the 80’s by Bookstein [5], Goodall [6] and Kendall [7]. Later, Bookstein [8] worked towards converging notations from Goodall, Kendall and himself, for the biometric analysis of landmark data in a biomathematically interpretable framework of shape. As a consequence of these efforts, the standard method for analysis of variation in landmark position is generally regarded as ‘Procrustes’. It comprises a leastsquares alignment of a set of landmark features to a mean shape, and this is often followed by eigenvector analysis of the linear correlations in variation around that mean. While the technique is now very popular the approach has several limitations with regard to the types of variation with which it can deal. One of these limitations is due to the assumption associated with taking leastsquares differences and eigenvector summaries of distributions. Though many regard these as simply definitional, and in particular associated with ‘shape’, any statistical interpretation suggests that data are measures with homogeneous noise. On the other hand, the Mantel test [9,10] has sometimes been used as an alternative to Procrustes distance to compute correlation between distance matrices (usually symmetric). Though many papers have been published in this area, we are aware of no work in this, or any related, area of point distribution modelling that has provided a framework to allow data to be analysed according to a measurement process.
Although landmarks are generally carefully chosen in order to allow accurate measurements of position within the image, problems will occur if ‘semilandmarks’, measured from smooth curves or surfaces and only accurately localised in one dimension, are input to the analysis. Landmarks with a high degree of variability can act as outliers in the alignment stage, generating correlated compensating shifts and rotations of the other points. As PCA aims to describe the main sources of variation, high levels of such correlated movement will then necessarily contaminate the extraction of eigenvectors [11]. This contamination cannot be considered a generic variation, as it has occurred purely due to the uncertainty in the measurement. This in turn follows from the subjective definition of the landmark leading to the view that problems can be avoided via appropriate definition. The mathematical concept of homology (and mapping) underlies many of the considerations behind much theoretical work that is described with the mathematical formalisms of isomorphism. Because of such restrictions on the definition of landmarks, semilandmarks were introduced [12] in order to allow inclusion of other points which are not homologous among the specimens. By this we mean that a unique corresponding location can not be defined. Measurement at these locations must be regularised by a constraint, such as bending energy [12,13], in order to recover the information missing due to the nature of local structure.
From a statistical perspective a homology (in this context) must be augmented by distributions indicative of the extent to which a correspondence can be established. The standard way to deal with inappropriate weighting of data in a leastsquares fit is to generalise the leastsquares cost to a Mahalanobis distance, computed using measurement covariances. By avoiding the requirement of specifying a unique homologous location, this has the advantage of accommodating varying precision in measured data without having to try to recreate missing data. There have been several attempts in the literature to include measurement errors for landmark points. For example, Fitzpatrick et al. [14] worked on the relationship between localisation error and registration error in rigidbody, pointbased registration. Chui and Rangarajan [15] proposed a general framework for nonrigid point matching, where outliers are effectively rejected. Rohlf and Slice [16], and Walker [17] investigated how to estimate measurement covariances in forms. However, Richtsmeier et al. [18], Adams et al. [1] and Rohlf [19] all stated that further research was needed in this area. Also, Walker [17] and Lele [20] concluded that generalised Procrustes analysis (GPA) estimators of the variancecovariance matrix are flawed. Despite the fact that some biologists have noticed these problems, they seem to know of no available alternatives and continue to use GPA to estimate covariances [21].
Text books [22] state that using weighted Procrustes does not lead to a Kendall’s shape space. Claiming that “statistical analysis cannot employ parametric models”, they suggested that resamplingbased methods must be used instead. Another reason for rejecting the idea of a weighted Procrustes was said to be a “lack of clear criteria for determining appropriate weighting of semilandmarks”. These criticisms can only really be interpreted once a method for weighting is specified. Goodall [23] suggested a method in which the same covariance was used for all landmarks. By this we mean there was no separate description of the perturbation of individual landmarks. It has been noted that such a matrix is inestimable [24]. Goodall himself acknowledged that “as a model of measurement error this is a drawback, as the direction of greatest variation may vary considerably between landmarks”. Despite this problem, later work [25] generalised this idea to a Bayesian framework. We believe that it makes sense instead to suggest an approach which can support the process of landmark location as measurement, with a covariance describing the localisation of each landmark separately (see [26,27] for example). Specifically, Rohr et al. [26] used covariance matrices in a Mahalanobis distance form for nonisotropic data, where covariances were estimated from image data through landmark localisation, i.e. using greyvalue information from local pixels around each landmark for matching an image area/volume structure to another through optimisation of a cost function. The minimal localisation uncertainty for each point were estimated using the CramerRao Bound (CRB). Also, smoothness was included as the second term in their functional and controlled using a regularisation parameter. To our knowledge, they have been the first to provide a relatively comprehensive approach for incorporating anisotropic covariances into image registration using splines. However, here we only deal with predefined landmark data and, unlike their method (and our recently published method [28]), do not attempt to extract landmarks and their corresponding covariances from image data. Specifically, in [28], we have applied smoothing to local edge data (where information is) prior to optimisation in order to remove the effects of spatial noise and obtain meaningful CRB estimates. However in our current study, the only input data fed to our method are a number of shapes represented by fixed landmark points. Hence, we do not take into account any information about the local structure surrounding each landmark. This way, the task of covariance estimation may be seen even more challenging. We are aware that in biological studies it is now commonly accepted that for pointbased shapes, extra information about the local/global pixels in the image plane/volume (for 2D/3D data) is usually available using modern imaging equipment. However, here our observation is that geometric morphometrics should originally be capable of dealing with the study of 2D/3D forms [18] even for nonbiological data or cases where information about the local structure around each landmark is missing or difficult to access or process. It is worth mentioning here that one reason why Procrustes still is popular is that apart from the forms (shapes) represented by landmark points it does not require any further data such as images from which the points have been originated. Hence, even though the datasets we use in our experiments are biological and one could also feed in the image data, in this study we chose to start the process from predefined landmarks only. Ideally, covariances extracted from image data (using other methods such as ours [28]) could be fed to our current method and be used, for instance, as initial estimates. This is however a subject for future investigation.
There have been further publications on anisotropic weighting, for instance in [29,30]. Mathematically, these methods are all equivalent to our approach, in that they use a Mahalanobis distance based upon anisotropic distributions of individual points. However, they do not have a welldefined mechanism for the estimation of these distributions. This is a key issue when applying these ideas to shape samples. Our work provides such a mechanism while incorporating corrections for estimation bias [7]. The basic concept can be implemented via a standard technique used in pattern recognition, often referred to as whitening [31]. For instance, in the context of shape analysis, the whitening transform and shape decorrelation were used as a preprocessing step in PCA/ICA analysis [32,33]. However, there is a difference between using whitening methods to model the signal variation of data (as used in these papers) and using the same technique to better construct a likelihood function that accounts for correlation in measured data (as we do here). Recently, the technique has been applied to the within group biological covariances [34], but again not to the process of noise on measurements. Here, we shall investigate possible generalisations of Procrustes along these lines, and the different ways such a measurement covariance may be estimated. As a key issue here is the computability of these covariances, the stability of the resulting analysis is an important question for investigation. The theory presented here can thus be classified in the same category as both Procrustes based shape analysis [35] and active shape models [36]. The main difference, however, being that our model is for a realisable system and selfconsistent estimation of the associated model parameters.
There has been an ongoing discussion in the biology literature regarding appropriate ways to deal with nonhomologous landmarks (points defined on smooth curves and surfaces) during statistical analysis. For instance, Klingenberg [37] has objected to Polly’s conclusions [38] regarding the benefits of existing homologyfree approaches. He believes that these approaches all depend critically on some sense of homology since they are not really free of assumptions about the correspondence of parts. Oxnard and O’Higgins [39] have recommended that it is biology that has to inform morphometrics in planning the landmark configuration (mainly mathematical landmarks, i.e. those computed using geometric constraints based on the neighbouring true landmarks) in relation to the hypothesis available. The approach to dealing with semilandmarks in the morphometric analysis of shape currently seems to be divided between two alternatives, both of which aim to adjust the position of these landmarks by optimising a specific metric, before constructing a linear model of variation about the mean. These metrics are bending energy (BE) and Procrustes distance (PD) [3]. Arguments for and against these approaches are based upon specific examples in biology. Although evidence has been reported of utility [40], Slice [41] has stated that the application of the BE approach to biomedical and anthropological problems has been minimal. Vignon and Pierre [4], and Prez et al. [42] have shown concern regarding the observation that different methods for handling semilandmarks could result in different conclusions in a discriminant analysis study. GomezRobles et al. [43] have examined the advantages and disadvantages of different novel methods in geometric morphometric analyses including homologyfree approaches, landmarkbased approaches, and combinations of both techniques.
Comparison between results from shape analysis and genetics is an important research topic in evolutionary biology. For instance, Frederich et al. [44] have attempted to estimate the statistical correlation between morphological, genetic and geographical distances. We offer an alternative shape analysis method that tackles the existing problem in the literature, so that well defined comparisons become statistically valid and informative.
Methods
Suppose that there are K shapes in our dataset and each shape vector w_{k} contains N landmark points, i.e. w_{k}=[w_{1x},w_{1y},w_{2x},w_{2y},...,w_{Nx},w_{Ny}]_{k} for the case of 2D data. We then apply a scale s_{k}, a rotation R_{k} and a translation t_{k} to the original data to get an aligned version of the data called z_{k}, where z_{k}=[z_{1x},z_{1y},z_{2x},z_{2y},...,z_{Nx},z_{Ny}]_{k} and z_{k}=s_{k}R_{k}(w_{k}−t_{k}).
The mathematical description of the model so far can accommodate any value of scale or orientation for the definition of mean model. We therefore define the orientation of mean shape so that the line between a specified pair of points is horizontal. This also has the benefit that initial estimates of alignment for sample k can be set according to the relative positions of these points. We also use the average distance between these same landmarks to rescale the mean shape at each iteration so that scale remains numerically defined.
For 2D data, we assume a different but fixed 2×2 covariance matrix for each landmark derived from the measurement process. These are composed into the matrix C. This is a tridiagonal matrix, the diagonal line of which contains data for individual landmarks. Outside of the 2×2 covariances, the off diagonal elements of C are zero, i.e. there are no correlations between landmarks. The use of a fixed data covariance cancels out when taking the weighted mean, to regenerate the conventional formula for the mean;
where m=[m_{1x},m_{1y},m_{2x},m_{2y},...,m_{Nx},m_{Ny}]. This definition for mean shape has previously been shown to provide unbiased estimates using MonteCarlo resampling studies [19], which is to be expected for a valid likelihood estimate of parameters.
The points z_{k} do not have uniform independent noise distributions, which is one of the assumptions for the application of PCA. However, this property can be obtained via a whitening transformation. Although transformation of data can be considered as a new space, it can also be interpreted as an affine reprojection. The points obtained by applying a whitening transformation are referred to here as ‘ghost points’. Ghost points are accordingly defined in the original coordinate system and, being scaled projections relative to the shape centroid, are an alternative way to summarise the original measurement relative to the observable structure. This is an important philosophical issue for those who believe that the original coordinate system is somehow more meaningful as a description of biological variation than any linear reprojection (see Discussions). The process amplifies the spatial variation in directions which are well measured relative to those which are not so that the resulting locations have isotropic errors (as required). In turn, this allows accurately measured structure to be encoded in the most significant eigenvectors (those with largest eigenvalues) of the linear model. We transform z_{k} to ghost points g_{k} using the matrix W so that .
By applying singular value decomposition to C^{−1}, i.e. C^{−1}=U^{T}VU, and making it equivalent to W^{T}IW, we find that the required whitening matrix is W=V^{1/2}U. Application of PCA to g_{k} follows for construction of the shape covariance, giving the eigenvectors e_{j} and eigenvalues μ_{j} for the whitened space of ghost points as those which minimise the unexplained variance for fixed J<N, where J is the number of eigenvectors used in the model. Hence, for any specific shape example k, linear factors λ_{jk} = e_{j}.g_{k} can be computed to best approximate z_{k} with the model ;
A genuine likelihood should be based upon the variation of the data around the assumed model. Failure to do this results in residuals which cannot be meaningfully interpreted ^{a}. Using this argument, if we wish to align to the mean shape we should use a covariance that is consistent with the distribution around the model. In order to find the best R_{k},t_{k},s_{k} parameters for each k, we minimise a Mahalanobis distance which is given by
This is simply the likelihood estimate for the location of the shape given the linear model and the assumed measurement covariances and can be interpreted directly as a χ^{2} statistic. By replacing C with I and with m this reduces to the leastsquares function for standard Procrustes. We can therefore interpret this as a generalisation of the standard approach. However, we do not wish to generalise further by using for example PPCA (probabilistic principal component analysis) [45], as an additional assumption of a Gaussian distribution over derived variables is generally invalidated in morphometric data sets.
Use of Eq. (3) requires an initial estimate of the model and transformed data z_{k}. By setting the initial estimates of the measurement covariance C to an identity matrix, these parameters are given by the Procrustes result. We can therefore use Procrustes to set up the initial transformation estimates. To reach the best possible alignment using our new method (anisotropic C), we iteratively reestimate R_{k}, t_{k} and s_{k} using the assumed e_{j}, m, C^{−1} and W^{−1}. This gives us a new z_{k}, and so a new m and F for construction of e_{j}. For fixed covariances, convergence can be monitored via construction of the total likelihood . One may use the final estimates of z_{k} and to construct the sample covariance;
For a well defined likelihood method this covariance should be consistent with the assumed distribution C. However, the use of free parameters during alignment and model construction introduces biases that must be addressed in an iterative analysis in order to avoid instabilities, which will now be described.
Covariance correction
When attempting to estimate C, the use of free parameters during model fitting reduces the sample covariance obtained from residuals. This mechanism is precisely that identified in [7], whereby the observable variation in any single shape sample is reduced onto a manifold in 2N−4 dimensions for a 2D shape defined for N points. This has generally been considered as a bias in the overall data distribution rather than being associated specifically with a statistical estimation error (as here). A possible outcome of this is the over weighting of landmarks leading to a runaway convergence on one landmark, during iterative estimation of C. However, this bias effect is estimable and therefore correctable, as will be illustrated by MonteCarlo simulation (below). For a single scale parameter associated with an approximate linear vector f we can use error propagation to estimate the expected average reduction Δ in the covariance for each 2×2 landmark component of the matrix arising from errors in parameter estimates Δ _{n}C as
Note that the denominator is the change in χ^{2} expected due to a unit change in f, and f_{n} = D_{n}f is the 2D component of f corresponding to landmark n (where D_{n} is an operator which zeros all but those quantities associated with the nth landmark and ⊗ is for outer product between two vectors). For an eigenvector e_{j} defined in the whitened ghost space, this would suggest a total correction of
The known structure of the covariance can be enforced by zeroing relevant offdiagonal terms. The parameters of the linear model, including scale, rotation, translation and linear model weightings can also be treated in this way. If Θ _{i} represents one of the direction vectors of these parameters (with 2N elements for 2D data), it follows that direction vectors corresponding to translation in x and y directions Θ _{1}=[1,0,1,0,...] and Θ _{2}=[0,1,0,1,...] are orthogonal, i.e. Θ _{1}.Θ _{2}=0. Similarly, direction vectors Θ _{3}=m=[m_{1x},m_{1y},m_{2x},m_{2y},...] and Θ _{4}=[−m_{1y},m_{1x},−m_{2y},m_{2x},...] corresponding to scaling and rotation are orthogonal, and so Θ _{3}.Θ _{4}=0. Note that m is identical to the mean vector defined in Eq. (1).
Strictly, Kendall’s definition of shape explicitly removes aspects of object transformation before model construction. Joint estimation of shape and alignment parameters is potentially unstable as estimated linear shape parameters can correlate with transformation parameters. Here we stabilise this process by removing first order correlations from the data covariance F prior to model construction.
Hence, to orthogonalise the model, we modify ghost points as follows.
where the unit vector is the normalised form of Θ _{i}. The new g_{k} is computed iteratively using each so that any variation about the mean that could have been described by an alignment parameter is removed from the correlation matrix F prior to model construction. The corresponding measurement covariance correction term is hence given by
Therefore, the measurement covariance is estimated using
Using the above formula, the contribution to the χ^{2} lost by using a scaling parameter associated with each vector e_{j} and Θ _{i} contributes a value of unity to the χ^{2} for each additional independent degree of freedom, totalling J+4. Our method for covariance correction is therefore consistent with a degree of freedom correction as described in conventional analysis approaches [46]. As a consequence the covariance estimation process can be considered equivalent to the ExpectationMaximisation (EM) algorithm, both in operation and parameter estimates, so that the conventional proof of convergence is applicable [47].
Extension from 2D to 3D
Here we outline the mechanism we use to extend 2D shape rotation analysis, and the extraction of corrected anisotropic measurement covariances, to 3D. The methods are demonstrated in the analysis of 3D mouse skull data, both as a test of the theory/software implementation and as an illustration of use for the identification of outlier landmarks.
The extension to 3D data is mainly involved with the mechanism of representing and estimating 3D shape rotations. We define a fixed orientation coordinate system from a set of 3D datapoints based upon a selection of three landmark points. We then represent a rotation matrix in terms of three separate rotations about the coordinate axes. Finally we compute the linear vectors which approximate the first order shifts seen in the 3D points due to these rotations. These are then used in the linearised approximation for sample covariance correction, as described earlier. These extensions are enough to support a quantitative analysis of 3D landmark data, for the estimation of landmark accuracy and identification of outlier data. The mathematical model used is described in detail here and in Appendix A. We provide quantitative tests in the Results and discussion section which demonstrate the numerical stability of the algorithms using MonteCarlo data.
Rotation matrix
Our first task is to define a coordinate system for a 3D dataset, from which we can define certain basic properties of orientation for the mean shape, and so that individual data samples can be approximately oriented prior to optimisation during linear model construction. In the 2D case this is done by defining the line between two landmark points in the mean model as horizontal. In 3D, in order to stay consistent with the 2D, we define 2 points to establish a horizontal, and then a third to define the vertical relative to the first two.
Given a 3D shape, we take three points P_{1}, P_{2}, P_{3}, with relatively large distance from each other (Figure 1) to define the orientation plane for the shape. The rotation matrix R^{T} is hence found based on basic vector calculations (see Appendix A).
Figure 1. The geometry shows how to define a baseplane in 3D (consistent with the baseline in 2D) using three landmark points.
Roll, pitch and yaw angles
Given the rotation matrix that brings a data set into alignment with the preferred coordinate system it is possible to represent the rotation as a sequence of rotations about three orthogonal axes. According to basic 3D rotation formulas, and using α, β, and γ as yaw, pitch, and roll respectively, the 3D rotation matrix is defined as three consecutive rotations around the z, y, and x coordinate axes.
By making the rotation matrix R^{T} equivalent to R_{xyz}, we find the yaw, pitch, and roll angles (see Appendix A). Thus we can convert easily between the rotation matrix and rotation parameters.
Orientation adjustments
We initialise the rotation angles, by computing the R^{T} matrix for every original shape in the data set based upon the three identified landmark points and extracting the corresponding α, β, and γ angles. These are then further adjusted during iterative alignment via optimisation of the anisotropic measurementbased Mahalanobis distance. We perform orientation adjustment on the mean shape following every iteration over the set of shape samples. In this case the set of yaw, pitch, and roll angles corresponding to the mean shape are subtracted from the corresponding rotation angles for each shape sample, so that the computed mean shape complies with the threepoint orientation constraint.
Direction vectors
In order to correct the covariances due to alignment parameters in 3D, we need the approximate linear direction vectors corresponding to translation, rotation and scale. Computing these for translation and scale is straightforward. If m=[m_{1x},m_{1y},m_{1z},m_{2x},m_{2y},m_{2z},...] is the vector corresponding to the 3D mean shape (with 3N elements), then the direction vectors due to translation in x, y and z directions are simply given by Θ _{1}=[1,0,0,1,0,0,...], Θ _{2}=[0,1,0,0,1,0,...] and Θ _{3}=[0,0,1,0,0,1,...]. Also, the direction vector due to scaling is Θ _{4}=m.
For rotation, we compute the direction vector corresponding to each individual rotations R_{z}, R_{y} and R_{x}. For the mean shape m rotated by the yaw angle α around the z axis, we have m^{′}=R_{z}(α)m. As α becomes very small, the tangential direction of movement in landmark point n due to this rotation is
By applying the same method, one can find the direction vectors due to rotation by the pitch angle β around the y axis and by the roll angle γ around the x axis (see Appendix A). Hence we have
It follows that Θ _{5}=[−m_{1y},m_{1x},0,−m_{2y},m_{2x},0,...], Θ _{6}=[m_{1z},0,−m_{1x},m_{2z},0,−m_{2x},...] and Θ _{7}=[0,−m_{1z},m_{1y},0,−m_{2z},m_{2y},...]. The set of vectors Θ _{1}, Θ _{2} and Θ _{3} on one hand, and the set of vectors Θ _{5}, Θ _{6} and Θ _{7} on the other hand are mutually orthogonal and orthogonal to the vector Θ _{4} due to scaling. These direction vectors now constitute the linearised parameterisations needed for corrections to the sample covariance (where I=7 in Eq. 9).
Procedures
Here, in Table 1, we provide the stepbystep procedure for our new shape analysis method that involves linear model construction, data alignment and anisotropic covariance estimation and correction. Note that there is an arbitrary order for the application of the transformation parameters which remains consistent throughout the whole process. In fact, whatever this order, the net effect of the covariance correction (Eqs. 79) is to subtract the same total linear subspace.
Table 1. The algorithmic procedure for our new method
Model selection
A method is needed to select appropriate linear model order based upon the outputs from our analysis. If the linear model is valid then estimated measurement covariances will combine two processes of statistical fluctuation. The first of these will be measurement precision σ_{r} (our ability to define homologous points reliably), and the second will be due to random (unmodellable) biological variation σ_{b}. So that the observed statistical variation seen in a given direction v for any landmark σ_{v} is
Unfortunately we cannot know the expected value of σ_{b} in advance. However, the first of these terms can be estimated via reproducibility experiments and compared to the measured directional covariances, using the observation that σ_{v}≥σ_{r}. Thus if we observe individual estimates of measurement covariance which begin to surpass the limiting accuracy known to be set by reproducibility tests, then the model must be overfitting the data and therefore has too many parameters. We check that for a given model order this inequality is satisfied within statistical limits by considering the principle axes of each landmark measurement distribution. We use a 1% confidence level to set the hypothesis test for overfitting. This test is expected to be most reliable for the largest variances.
MonteCarlo tests and outlier identification
As our method is based on likelihood, we require that the assumed distribution matches the corrected covariance. The standard way to validate this is through generating MonteCarlo (MC) data using the known distributions. In what follows we experiment with MC data and display a number of informative scatter plots for two forms of test; Test A: When applying our method to the MC data, the mean shape, eigenvectors and measurement covariances used are identical to the ones used when generating the simulated data. Test B: All parameters are estimated using the MC data in order to compare the measurement covariances estimated using the simulated data with those expected, i.e. the ones assumed when generating the MC data.
For Test A the covariances estimated using our method are expected to be within statistical sampling limits of the ones used when generating the MC data. Failure to do so is taken as an indication of a problem with the data sample (i.e. outliers). Outliers can be identified at early stages of analysis as those points which have the largest normalised residual errors.
We use 2.8 standard deviations of the error on the sample variance (or being allowed to have 1% of data falling outside the limits), where the error on the standard deviation σ is with K being the number of samples [48]. Additional variance is expected for Test B (beyond that seen in Test A), where the linear model must also be estimated. Therefore, having excluded the possibility of outliers using Test A, we can interpret variations beyond the statistical limits as due to instability in linear model construction (specifically the mean and eigenvectors).
χ^{2} test
A test is needed to confirm the equivalence of measurement covariances computed during repeatability experiments, in order to confirm that our methods generate estimates which are consistent. This can also be done by splitting the data into two separate groups if there are a sufficient number of samples. We perform a modified χ^{2} test based upon the construction of corrected covariances on one data set and then used for the calculation of χ^{2} for the second set. For large numbers of samples ( K>30) the resulting statistic when applied to each 2D landmark is expected to be approximately Gaussian with mean 2K and variance 4K. We set the statistical test for significant difference on the basis of an allowable range of χ^{2}/DoF corresponding to ±2.8 S.D., i.e. [0.8, 1.2] for 200 samples. The corresponding plot would confirm the stability of the method if 99% of the χ^{2}/DoF values fall inside the range expected.
Fisher information
Fisher information (FI) is a concept for quantifying the constraint on an estimated value associated with data. It has the useful property that the amount of estimated information is linear in the quantity of data. It is generally defined according to the second derivative of a loglikelihood function, but from the association of this function and the CRB we can also observe that, for good model fits, it is proportional to the inverse variance. An empirical estimate of the FI contained in data, and associated with a particular model, can therefore be obtained from the residual distributions following parameter estimation.
We use this idea here to summarise the amount of information that has been extracted from data for a specific analysis. As this quantity scales linearly with the quantity of data it allows us to make comparative statements regarding the statistical efficiency associated with the estimation process. For example, if the FI is seen to double on the same dataset when applying an alternative analysis then this is statistically equivalent to having four times as much data to begin with. A poor analysis method might need a lot more data to reach the same level of statistical equivalence in a hypothesis test than a good method.
Ethics
The animal datasets used in this paper have been approved according to German ethical standards. They were registered under number V31272241.12334 (978/07) and approved by the ethics commission of the Ministerium für Landwirtschaft, Umwelt und ländliche Räume on 27.12.2007.
Results and discussion
We have used example datasets to investigate the stability of covariance weighted shape analysis and to compare quantitative performance figures to the standard approach using Procrustes. We have selected several datasets in order to demonstrate behaviour with different quantities of data, data dimensionality (i.e. 2D and 3D) and model order.
As standard methods, even those including landmark weighting, are not conventionally used in a way that would support estimation of landmark variability we have made some assumptions regarding what would be the most straightforward approach. As mentioned earlier, in this paper we are interested in analysing pointbased shape datasets without seeking to obtain extra knowledge about local structures surrounding each landmark. Hence, in conjunction with our method, we have not used methods that estimate localisation errors from the original image data such as those described in [26,28]. For Procrustes we use the residuals from the fitted models to make an estimate of landmark measurement error (although this is widely concluded in the literature not to work [22]). For methods that would support anisotropic weighting, we use a variation of our own method (incorporating iterative reweighted alignment) to estimate the resulting residuals during iterative analysis. The difference between this and our preferred method is the lack of correction for degree of freedom biases, we therefore refer to this as the “uncorrected” method.
Data
We experiment with two 2D data sets of manual markups (Figure 2). The first data set, called MM1, corresponds to mouse mandible microCT images and consists of 337 samples with 14 landmarks per sample. We also have a repeat data set, called MM2, for which same mandible images have been used to markup the points.
Figure 2. Typical landmarks corresponding to sample images of fly wings (left) and mouse mandibles (right); for the fly wing data, landmarks 115 correspond to the original data sets FL1, FL2, FR1 and FR2, while landmarks 1619 were added later (to FL1) in order to experiment with semilandmarks (PFL1).
Next, we use some fly wing data in order to test the performance of our method on semilandmarks and also to test the statistical stability of our method. There are four original data sets available from left and right wings(L and R) of 200 female flies, called FL1, FL2, FR1 and FR2 [49]. Two images of each wing were taken from slightly different viewing positions (1 and 2), and used for markingup in order to perform reproducibility tests [49]. Each of these four data sets has 200 samples with 15 landmarks per sample. Further, as we had access to the fly wing images, we have added four semilandmarks to each sample of the original data set FL1. Once finished, we removed 5% outliers and stored 189 samples with 19 landmarks per sample. This resulting data set, which is called PFL1, plays an important role in our experiments with semilandmarks. In order to be able to test the repeatability with these added semilandmarks, we have repeated the markingup process only for the four new landmark points and using a subset of the left fly wing images.
We also experiment with the mouse skull (MS) 3D data of semiautomatic markups (Figure 3) produced based on training examples and the corresponding microCT images. This makes a typical 3D data set of interest in evolutionary biology research. We have used our automatic tool to localise landmarks on these mouse skulls based on few given manual markup examples [50,51]. This is based on landmark localisation technique recently described in [28] (where more details may be found). The automatic tool also identifies outliers for manual correction and so we do not expect any outlier in this data set. The markup data set obtained this way (MS) consists of 42 samples with 50 landmarks per sample. Further, there are two subsets of repeat data based on manual markups (on the mouse skulls) each consisting of 12 samples to be used in repeatability tests.
Figure 3. Typical landmarks for a sample volume (topleft) from the 3D mouse skull (MS) data when projected on the xy (topright), zy (bottomleft) and xz (bottomright) planes (the 50 points are too close to display full numbering).
Model selection
In Figures 4, 5 and 6, we plot the eigenvalues corresponding to the errors estimated against those computed from the repeat data. These are the magnitudes of the errors in the direction of major eigenvectors. It can be seen that while for the fly wing data the errors are comparable (Figure 4), for the mouse mandibles there are several landmarks for which the error estimates are much larger than expected (Figure 5). We cannot argue for an increased model order as this then reduces other values to well below the observed repeatability (over fitting). As the additional variance seen is due to the inability of the model to predict correlations in the data, our conclusion must be that either this data is not well described by a linear model, or the repeatability estimate systematically underestimates the true accuracy with which points can be meaningfully located. This can happen if local image features (which are themselves not well biologically related to the main structures, such as the brightest pixel) are used to identify locations. The plot for the more complex mouse skull data (Figure 6) suggests that 14 components is about the number needed by the linear model. Hence, when experimenting with the 3D MS data we use 14 model components, while using 6 components with the 2D MM data and 23 components with the 2D FL and PFL data. The 1% allowable range is set in accordance with 12 repeat data samples.
Figure 4. Fly wing data (PFL1): major eigenvalues of the error using our 1, 2 and 3 component models against those computed using a repeatability test on four new semilandmarks placed manually on a subset of data; the 2 component model gives closest agreement to the expected localisation values; the two dashed lines show the±2.8σ range.
Figure 5. Mouse mandible data: major eigenvalues of the error estimated using our 5, 6 and 7 component models on MM1 data against those computed using the corresponding repeatability test (MM1 and MM2); the two dashed lines show the±2.8σ range.
Figure 6. Mouse skull data: major eigenvalues of the error estimated using our 8, 10, 12 and 14 component models for the 3D MS data against those computed using the corresponding repeatability test; the two dashed lines show the±2.8σ range.
MonteCarlo tests
We show the MonteCarlo plots for the Test A in Figures 7, 8 and 9, while the Figures 10, 11 and 12 show those for the Test B. The results for the Test A on 2D data sets (Figures 7 and 8) indicate very little difference for the low parameter fly wing data, and a more obvious systematic underestimate of covariance (as expected) for the 6 dimensional mouse mandible data (prior to correction).
Figure 7. Fly wing data (PFL1): error eigenvalues estimated using the MonteCarlo data (where mean shape, eigenvectors, and measurement covariances are identical to the model which generated the simulated data) against the expected ones (Test A); using 2 model components; there is only marginal evidence of estimation bias before correction;the two dashed lines show the±2.8σ range.
Figure 8. Mouse mandible data (MM1): error eigenvalues estimated using the MonteCarlo data (where mean shape, eigenvectors, and measurement covariances are identical to the 6component model which generated the simulated data) against the expected ones (Test A); for this number of parameters there is now evidence of a systematic underestimate of covariance (prior to correction); the two dashed lines show the±2.8σ range.
Figure 9. Mouse skull data (MS): error eigenvalues estimated using the MonteCarlo data (where mean shape, eigenvectors, and measurement covariances are identical to the 14component model which generated the simulated data) against the expected ones (Test A); the two dashed lines show the±2.8σ range.
Figure 10. Fly wing data (PFL1): error eigenvalues estimated using the MonteCarlo data against the expected ones (estimated using the original data) which were used when generating the simulated data; independent models (Test B); using 2 model components; the two dashed lines show the±2.8σ range.
Figure 11. Mouse mandible data (MM1): error eigenvalues estimated using the MonteCarlo data against the expected ones (estimated using the original data) which were used when generating the simulated data; independent 6component models (Test B); for this number of linear model components there is considerable error in the uncorrected estimates; the two dashed lines show the±2.8σ range.
Figure 12. Mouse skull data (MS): error eigenvalues estimated using the MonteCarlo data against the expected ones (estimated using the original data) which were used when generating the simulated data; independent 14component models (Test B); the two dashed lines show the±2.8σ range.
Further, the results for the Test B (Figures 10 and 11) indicate that even for the mouse mandible data, the values of covariance are significantly different, due to the amplification of initial estimation bias during the process of iterative linear model estimation. The correction process now removes these instabilities bringing estimated covariances back close to the expected sampling limits and symmetrically around the expected correlation line.
Turning to the 3D MS data, for the Test A (Figure 9) the eigenvalues fall inside the allowable range (dashed lines). However for the Test B (Figure 12), the eigenvalues appear to fall under the lower bound. The underestimation seen is in accordance with a correction factor based upon the number of samples and model complexity (K−J)/K. Unlike the earlier biases this underestimation does not destabilise the analysis, as a common multiplicative change on all variance estimates leaves the estimated model parameters unaffected.
Note that the equivalent residual distributions estimated here from the conventional Procrustes analysis have no associated correction process and (along with uncorrected estimates from our own algorithm) are probably indicative of anything which could be attempted based upon estimating sample covariances for existing weighted methods.
Here, we compare the results obtained using Procrustes (Figures 13, 14 and 15) to those shown earlier (Figures 10, 11 and 12) using our likelihoodbased method. To produce such quantitative results, we apply Procrustes to the same MonteCarlo data which were generated based on our corrected covariances. Following Procrustes alignment, eigenvalues are computed using the remaining error residuals. These eigenvalues are then plotted against the expected ones, where values on horizontal axis are identical to those used in Figures 10–12. Clearly, (and in contrast to Figures 10–12) in all plots corresponding to Procrustes the measured values are not within the predicted statistical limits (dashed lines). By inference, the linear model vectors constructed using Procrustes are contaminated by random errors associated with poorly measured landmarks, as expected. When compared to the plots from our weighted method with the correction process switched off (e.g. Figure 13 compared to Figure 10), eigenvalues extracted from Procrustes residuals are further away from the expected values. As seen in the figures, changing the number of degrees of freedom of the model is also not sufficient to correct this issue. We can conclude that Procrustes generates a linear model which is a less efficient description of the true information contained in the data.
Figure 13. Fly wing data (PFL1): error eigenvalues computed using the residuals after Procrustes alignment on the MonteCarlo data, against the expected ones which were used when generating the simulated data; for 2, 3 and 4 model components; the two dashed lines show the±2.8σ range.
Figure 14. Mouse mandible data (MM1): error eigenvalues computed using the residuals after Procrustes alignment on the MonteCarlo data, against the expected ones which were used when generating the simulated data; for 4, 6 and 8 model components; the two dashed lines show the±2.8σ range.
Figure 15. Mouse skull data (MS): error eigenvalues computed using the residuals after Procrustes alignment on the MonteCarlo data, against the expected ones which were used when generating the simulated data; for 12, 14 and 16 model components; the two dashed lines show the±2.8σ range.
Shape analysis
In Figures 16, 17 and 18, we show the anisotropic error bars computed using the eigenvectors and eigenvalues of the 2×2 covariance matrices for the 2D data sets. All error bars are rescaled for visualisation purposes (see captions). Error bars for each landmark show the extent of an elliptical (nonisotropic) distribution around the corresponding point in the mean shape. Such distributions estimated using our method show exactly why we cannot assume isotropic distributions for the data as assumed in Procrustes. The PFL1 data used in Figure 16 consists of 19 landmarks (15 + 4) while the FL1 data used in Figure 17 consists of 15 common landmarks only. Using these plots, one can see that the semilandmarks have anisotropic covariances which match the expected localisation stability. Also we can see how after adding the 4 semilandmarks the anisotropic errors estimated using our method remain stable, while with Procrustes some change both in orientation and in size, e.g. landmark points 11 and 15 (see Figure 2 for landmark numbers). These error bars are shown again in Figures 19 and 20 with the corresponding aligned data superimposed. In these figures, the extent of error distributions illustrated by the error bars are not expected to match to those illustrated by the alignment, as general biological shape variation and measurement error are independent processes. Although, localisation is determined by local shape characteristics and measurement accuracy plays a role in the overall distribution of landmarks around the mean shape. As a consequence poorly measured landmarks may have a variation about the mean that is dominated by noise.
Figure 16. Fly wing data (PFL1): error bars (×20) estimated using our method (left), and computed from the residuals left using Procrustes (right); 2component models.
Figure 17. Fly wing data (FL1): error bars (×20) estimated using our method (left), and computed from the residuals left using Procrustes (right); 2component models.
Figure 18. Mouse mandible data (MM1): error bars (×20) estimated using our method (left), and computed from the residuals left using Procrustes (right); 6component models.
Figure 19. Fly wing data (PFL1): error bars (×20) estimated using our method (left), and computed from the residuals left using Procrustes (right), with the corresponding aligned data superimposed; 2component models.
Figure 20. Mouse mandible data (MM1): error bars (×20) estimated using our method (left), and computed from the residuals left using Procrustes (right), with the corresponding aligned data superimposed; 6component models.
Turning to 3D data, in Figure 21, we have shown the anisotropic error bars estimated using our method (subfigures on the top row) and those computed using Procrustes residuals (subfigures on the bottom row). In Figure 22, we have shown the corresponding aligned data using our method only, as these dense aligned data are visually quite similar for the two methods. In order to display the 3D results we have used their projections on three 2D planes in the original coordinate system.
Figure 21. Mouse skull data: error bars (×30) estimated using our covariancebased method (top); and those computed using Procrustes residuals (bottom); 14component models; projection planes: zy (left), xy (middle) and xz (right).
Figure 22. Mouse skull data: aligned data obtained using a 14component model based on our covariancebased method; projection planes: zy (left), xy (middle) and xz (right).
In both Figures 21 and 22, from the left to the right, we show the projected results on zy, xy and xz planes respectively. Using the mouse skull volume shown in Figure 3, one can see how these viewing planes (zy, xy and xz) correspond to the coronal, sagittal and transverse planes respectively. In these 42 data sets, five had a marked asymmetry of the nasal bones (affecting landmark 1), three had a partially open frontal suture (affecting landmark 3), and one exhibited both of these effects. In Figure 21, one can observe that the largest error bars estimated using our method are for the landmarks 3 and 1. This is consistent with the data clouds corresponding to these landmarks in Figure 22 where in each case some points stand away from the main cloud due to the deformations mentioned above. This is not the case for Procrustes where the error residuals left after alignment for landmarks 3 and 1 show severe underestimation. This is due to the fact that Procrustes translates strong shifts in one landmark position into smaller shifts in all landmarks. However, in this example the observed variation is largely restricted to deformations of the nasal bones (landmark 1) and partially open frontal suture (landmark 3) without displaying noticeable shape changes in other parts of the skull. Hence the larger error bars of our method give in this case a more accurate representation of the observed biological variation. This is in agreement with the results shown earlier in Figure 15 where for two landmarks Procrustes residuals are much smaller than the expected error values (standard deviations) with which the MonteCarlo data were generated. For our method, however, estimated errors are all comparable to the expected ones as shown earlier in Figure 12. In order to compare the magnitude of errors estimated using our method to those suggested by the repeat data, one should revisit Figure 6. The figure again suggests comparable error estimations. Finally, it is clear from the zy and xz projection planes that expected symmetry is achieved to a large extent in orientation and size for most corresponding error bars (in either method).
Now we turn to further comparing our method to Procrustes in a quantitative manner. The inconsistency observed earlier in error bars corresponding to the residuals left after applying Procrustes to the fly wing data (Figures 16 and 17) is displayed more clearly using a scatter plot in Figure 23. Here we plot the eigenvalues corresponding to the 15 common landmarks after the 4 semilandmarks are added against those without any additional landmarks. We plot this for both the likelihood and Procrustes methods. It is clear here that there are departures from the permitted scatter region when Procrustes is used. This indicates a significant change in the unexplained variance following linear model construction, which itself implies differences in the linear model itself, i.e. the Procrustes model is unstable following the addition of poorly measured landmarks.
Figure 23. Fly wing data: error eigenvalues estimated using the likelihood method and those computed using the residuals after Procrustes alignment, when each method is applied to PFL1 (4 semilandmarks (1619) added to FL1) against those when applied to FL1; the plot is for the 15 common landmarks (115); the Procrustes results have many points which are well beyond the expected limits, suggesting that model parameters are not consistently determined upon inclusion of semilandmarks; the two dashed lines show the±2.8σ range.
Further, we performed a χ^{2} test based upon the construction of corrected covariances on one data set (FL1/FR1) and then used for the calculation of χ^{2} for a second data set (FL2/FR2). The corresponding plot for χ^{2} test in Figure 24 confirms the stability of our method, as all χ^{2}/DoF values fall in the range expected. Further χ^{2} tests (not shown here) with different numbers of data samples and combinations of data sets indicate the appropriateness of the assumed linear model for the fixed number of components.
In Table 2, we list the Fisher Information (FI) value for the two methods and the three data sets studied. Again, for Procrustes, variances used to compute the FI value are obtained from the residuals left after alignment between the data and the simulated linear model. Our method, which is based on likelihood and measurement covariance, gives FI values roughly between two and four times those obtained using Procrustes. The largest difference corresponds to the 3D MS data with 14 model components. As FI is proportional to the quantity of data, this demonstrates that the changes away from the isotropic assumption inherent to Procrustes/PCA has a significant effect on the efficacy of the model, equivalent to having defined only a third as many landmarks from the outset. We can also see in this table the effect of adding 4 semilandmarks to the 15 original landmarks. The numbers in the parentheses show the contributions from the 4 added landmarks to the total FI values. The reason for the decrease in the FI values after adding 4 landmarks is that we are using the same number of degrees of freedom (DoF) to describe correlations between more points. Also these values are computed for uncorrected covariance estimates, because correction is not available when using Procrustes.
Table 2. Fisher information (FI) values: listed for the Procrustes and our method when applied to the fly wing data (PFL1), mouse mandible data (MM1) and 3D mouse skull data (MS)
Finally, the PCA analysis shows that in fly wing data 3 components can account for about 65% of variance, while for mouse mandibles 6 components are needed to achieve the same level. In both cases, the model order preferred by our analysis is significantly less than the heuristic limit of 90% used by some researchers.
Conclusions
Our analysis approach has been driven by the requirements of statistical estimation, quantitation and self consistency, i.e. distributions assumed during likelihood construction match the data and estimated parameters match those generating the data. From a more philosophical standpoint we can consider what we are doing when we identify landmark locations and attempt to compare them between sets (shapes). We do not expect that biology manipulates the locations of our chosen landmarks directly, they simply appear to move around as the net effect of distributed developmental and evolutionary influences. Recent considerations of biology have introduced the phrase “palimpsest” [52], as an analogy with repeatedly erasing and rewriting text in an ancient parchment, to describe the way that structures develop. Notice that the initial choice of landmarks is subjective, not only in terms of the features selected but also how we chose to define their locations. A landmark is the result of a localisation procedure (partly influenced by multiple biological considerations) which has an associated positional uncertainty. In this work we have associated the problems of working with semilandmarks in biological shape analysis as being a consequence of the statistical assumptions implicit to analysis techniques such as Procrustes/PCA. We have implemented a new method which takes appropriate account of measurement and landmark localisation stability in order to obtain a new form of analysis which is consistent with a likelihoodbased definition of the alignment and model building tasks. This method can be equivalently interpreted as a redefinition of the landmark location as ghost points.
The conventional interpretation of Procrustes is that the resulting linear model is a pure shape description which can be directly associated with biological processes. Some may argue that extending the approach to weight data, even to accommodate semilandmarks, breaks with this tradition. However, it is our belief that any distinction between the original landmark and our definition of a ghost point, as locations which are somehow true measurements of biology in one case but not the other, is arbitrary. Reweighting of data using a covariance is statistically equivalent to modifying the information available by changing the specified set of landmarks. Use of a leastsquares measure (which assumes isotropic errors) does not introduce some absolute measurement of biology. Both approaches need to be calibrated using known samples with identifiable biological cause in order to make any scientific interpretation.
Now that we have a specific definition for how to weight landmark data, we can see that using ghost points does not invalidate use of Kendall’s statistics as suggested in [22]. The use of these approaches follows due to scale normalisation of the shape data, it is not an intrinsic property of the use of the original landmarks coordinates perse. We can also reproject scaled (whitened) shapes onto the tangent space defined in the transformed ghost space if we wish, in order to remove local curvature arising from scale normalisation.
Far from there being no objective way to define these covariances [22,24], there are at least three; a) one can estimate them directly from repeatability of measurements (e.g. see [48]); b) they can be directly estimated via conventional statistical means when using likelihoodbased landmark location (CRB) (e.g. see [25,26,28]); c) they can be estimated as the unexplained stochastic variation (residuals) in fitted data (as in this paper and e.g. [27]). For the latter, when estimated using residuals of the fitted shape model, we will see contributions additional to the measurement process; this is the stochastic (therefore unmodellable) behaviour of the biology itself. Our results indicate that measurement covariances can be reliably estimated in our data for sample sizes at least as small as 40.
Our result indicate that the new method summarises the information content of the measured data better (improved FI scores), and with more stability than Procrustes/PCA (consistent models are generated following the addition of new points). Although we have not provided empirical evidence in this paper, the expected theoretical advantages of this approach are several; a) as all landmarks of fixed local structure have an associated measurement covariance, the approach described provides a consistent way of incorporating qualitatively different forms of landmark (type I, type II, semilandmarks, geometric landmarks, etc.) into the analysis; b) provided that landmark stability is well described by a Gaussian distribution, our method removes the instabilities inherent in the analysis due to poorly determined points; c) as the parameters for the linear model are now selfconsistently estimated for an identifiable generative scheme (embodied here via MonteCarlo simulation) it affords the application of an eigenvector analysis statistical rigour; d) it offers the possibility of interpreting the linear modelling process as a statistical approximation, with consequent interpretations of the requirement for the number of linear model components; e) finally, generalisation of the approach would seem to be possible which would support the analysis of dense landmarks on surfaces and curves.
We have also demonstrated how linear model order selection can be performed by comparing baseline reproducibility errors with those estimated from the model. Finally, we have shown how the use of repeated analysis on matched samples can be used to confirm the stability of the estimated anisotropic error. We believe that these tools are sufficient to allow use of this technique in biological studies. More study is needed in order to develop an understanding of the value of our new technique in a greater range of biological analyses.
The methods described in this paper are freely available from the TINA web site [51] via the Geometric Morphometric toolkit, as a system for quality assessment and validation of output data.
Endnote
^{a}Bookstein [53]: “Wherever there is partial registration the true value of a (vector deformation) is inaccessible.”
Appendix A
Rotation matrix
Based on the geometry shown in Figure 1, we first calculate the vectors , and .
The rotation matrix R^{T} is hence given by:
Roll, pitch and aw angles
The multiplication of the rotation matrices R_{x}(γ), R_{y}(β) and R_{z}(α) gives
[hpb!]
Hence by enforcing R^{T}=R_{xyz}, it is straightforward to find the rotation angles α, β and γ.
Direction vectors
At each landmark point n with m_{nx}, m_{ny} and m_{nz} as the mean coordinates, the rotated vector by angle α around the z axis is
The first derivatives of this vector with respect to α gives
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
HR undertook software and methods development, performed experiments, and produced the final version of the manuscript and responses to reviews. NAT conceived the new statistical methods for shape analysis and provided technical project coordination. PAB developed the automatic landmarking software and provided maintenance of software libraries, web pages and infrastructure. DT provided overall scientific management and coordination of the project. ACS participated in acquisition of datasets including manual/automatic landmark identification. All authors read and approved the final manuscript.
Acknowledgements
This work was funded by institutional resources of the MaxPlanck Society. The authors would like to thank Chris Klingenberg (at the University of Manchester) and Louis Boell (at the MaxPlanck Institute for Evolutionary Biology) for providing the fly wing data and the mouse mandible data respectively.
References

Adams DC, Rohlf FJ, Slice DE: Geometric morphometrics: ten years of progress following the ‘revolution’.
Ital J. Zool 2004, 71:516. Publisher Full Text

Klingenberg CP: Evolution and development of shape: integrating quantitative approaches.

Mitteroecker P, Gunz P: Advances in geometric morphometrics.
Evol Biol 2009, 36(2):235247. Publisher Full Text

Vignon M, Sasal P: The use of geometric morphometrics in understanding shape variability of sclerotized haptoral structures of monogeneans (Platyhelminthes) with insights into biogeographic variability.
Parasitol Int 2010, 59(2):183191. PubMed Abstract  Publisher Full Text

Bookstein FL: Tensor biometrics for changes in cranial shape.
Ann Human Biol 1984, 11:413437. Publisher Full Text

Goodall CR: The Statistical Analysis of Growth in Two Dimensions. USA. Harvard University: Department of Statistics; 1983.

Kendall DG: Shapemanifolds, procrustean metrics, and complex projective spaces.
Bull London Math Soc 1984, 16(2):81121. Publisher Full Text

Bookstein FL: Size and shape spaces for landmark data in two dimensions.
Stat Sci 1986, 1:181242. Publisher Full Text

Mantel NA: The detection of disease clustering and a generalized regression approach.
Cancer Res 1967, 27:209220. PubMed Abstract  Publisher Full Text

PeresNeto PR, Jackson DA: How well do multivariate data sets match?The advantages of a procrustean superimposition approach over the mantel test.
Oecologia 2001, 129:169178. Publisher Full Text

Hubert M, Rousseeuw PJ, Branden K: ROBPCA: A new approach to robust principal component analysis.
Technometrics 2005, 47:6479. Publisher Full Text

Gunz P, Mitteroecker P, Bookstein FL: Semilandmarks in three dimensions. In Modern Morphometrics in Physical Anthropology. Edited by Slice DE. New York: Kluwer Academic/Plenum Publishers; 2005:7398.

Bookstein FL: Landmark methods for forms without landmarks: morphometrics of group differences in outline shape.
Med Image Anal 1997, 1(2):225243. PubMed Abstract  Publisher Full Text

Fitzpatrick JM, West JB, Maurer CR: Predicting error in rigidbody pointbased registration.
IEEE Trans Med Imaging 1998, 17(5):694702. PubMed Abstract  Publisher Full Text

Chui H, Rangarajan A: A new point matching algorithm for nonrigid registration.
Comput Vision Image Underst 2003, 89(23):114141. Publisher Full Text

Rohlf FJ, Slice DE: Extensions of the procrustes method for the optimal superimposition of landmarks.
Syst Zool 1990, 39:4059. Publisher Full Text

Walker JA: Ability of geometric morphometric methods to estimate a known covariance matrix.
Syst Biol 2000, 49(4):686696. PubMed Abstract  Publisher Full Text

Richtsmeier JT, Deleon VB, Lele SR: The promise of geometric morphometrics.

Rohlf FJ: Bias and error in estimates of mean shape in geometric morphometrics.
J Human Evol 2003, 44:665683. Publisher Full Text

Lele S: Euclidean distance matrix analysis (EDMA): estimation of mean form and mean form difference.
Math Geol 1993, 25(5):573602. Publisher Full Text

MartinezAbadias N, Heuze Y, Wang Y, Jabs EW, Aldridge K, Richtsmeier JT: FGF/FGFR signaling coordinates skull development by modulating magnitude of morphological integration: evidence from Apert syndrome mouse models.

Zelditch ML, Swiderski DL, Sheets HD, Fink WL: Geometric Morphometrics for Biologists, A Primer. New York: Elsevier Academic Press; 2004.

Goodall CR: Procrustes methods in the statistical analysis of shape.

Lele S, Richtsmeier JT: Statistical models in morphometric: are they realistic?
Syst Zool 1990, 39:6069. Publisher Full Text

Theobald DL, Wuttke DS: Empirical bayes hierarchical models for regularizing maximum likelihood estimation in the matrix gaussian procrustes problem.
Proc Natl Acad Sci USA 2006, 103(49):1852118527. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Rohr K, Stiehl HS, Sprengel R, Buzug TM, Weese J, Kuhn MH: Landmarkbased elastic registration using approximating thinplate splines.
IEEE Trans Medical Imaging 2001, 20(6):526534. Publisher Full Text

Ragheb H: Morphometric shape analysis with measurement covariance estimates. In Proc. British Machine Vision Conference.. Dundee, UK; 2011.

Ragheb H, Thacker NA: Quantitative localisation of manually defined Landmarks. In Proc. Medical Image Understanding and Analysis. London, UK.; 2011:221225.

Balachandran R: Iterative solution for rigidbody pointbased registration with anisotropic weighting.
Proc. of SPIE (Medical Imaging), vol. 7261. 2009, 72613D1–72613D10.

Beinat A, Crosilla F: A generalized factored stochastic model for the optimal global registration of LIDAR range images.

Hyvarinen A, Karhunen J, Oja E: Independent Component Analysis. New York: John Wiley & Sons; 2001.

Ruto A, Lee M, Buxton B: Comparing principal and independent modes of variation in 3D human torso shape using PCA and ICA. In ICA Research Network International Workshop. Liverpool, UK; 2006:14.

Uzumcu M, Frangi RF, Reiber JHC, Lelieveldt BPF: Independent component analysis in statistical shape models.

Mitteroecker P, Bukstein F: Linear discrimination, ordination, and the visualization of selection gradients in Modern Morphometrics.
Evol Biol 2011, 38:100114. Publisher Full Text

Luo B, Hancock ER: Iterative procrustes alignment with the EM algorithm.

Cootes TF, Taylor CJ, Cooper DH, Graham J: Active shape modelstheir training and application.
Comput Vision Image Underst 1995, 61:3859. Publisher Full Text

Klingenberg CP: Novelty and “homologyfree” Morphometrics: What’s in a name?
Evol Biol 2008, 35(3):186190. Publisher Full Text

Polly PD: Developmental dynamics and GMatrices: Can morphometric spaces be used to model phenotypic evolution?
Evol Biol 2008, 35(2):8396. Publisher Full Text

Oxnard C, O’Higgins P: Biology clearly needs morphometrics. Does morphometrics need biology?
Biol Theory 2009, 4:8497. Publisher Full Text

Bookstein FL, Slice DE, Gunz P, Mitteroecker P: Anthropology Takes Control of Morphometrics. Vienna, Austria: Institute for Anthropology: University of Vienna; 2004.

Slice DE: Geometric morphometrics.
Annual Rev Anthropol 2007, 36:261281. Publisher Full Text

Perez SI, Bernal V, Gonzalez PN: Differences between sliding semilandmark methods in geometric morphometrics, with an application to human craniofacial and dental cariation.
J Anat 2006, 208(6):769784. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

GomezRobles A, Olejniczak AJ, MartinonTorres M, PradoSimon L, de Castro JMB: Evolutionary novelties and losses in geometric morphometrics: a Practical approach through Hominin molar morphology.
Evolution 2011, 65(6):17721790. PubMed Abstract  Publisher Full Text

Frederich B, Liu SYV, Dai CF: Morphological and genetic divergences in a coral reef damselfish, pomacentrus coelestis.
Evol Biol 2012, 39:359370. Publisher Full Text

Tipping ME, Bishop CM: Probabilistic principal component analysis.
J R Stat Soc B 1999, 61(3):611622. Publisher Full Text

Akaike H: A new look at the statistical model identification.
IEEE Trans Automatic Control 1974, 19(6):716723. Publisher Full Text

Dempster AP, Laird NM, Rubin DB: Maximum likelihood from incomplete data via the EM algorithm.

Barlow RJ: Statistics: A Guide to the Use of Statistical Methods in the Physical Sciences.. WileyBlackwell; 1989.

Klingenberg Lab, The University of Manchester.
[http://www.flywings.org.uk webcite]

Schunke AC: TINA manual landmarking tool: software for the precise digitization of 3D landmarks.

Bromiley PA: The TINA Geometric Morphometrics Toolkit. [http://www.tinavision.net/docs/memos/2010007.pdf webcite]

Halgrimsson B, Lieberman DE, Young NM, Parsons T, Wat S: Evolution of covariance in mammalian skull. In Novartis Foundation SymposiumTinkering: The Microevolution of Development,Volume 284. Edited by Hall BK, Lieberman DE. New York: WileyLiss; 2007:164184.

Bookstein FL: Registration error and functional image analysis. In Workshop on Biomedical Statistics. Leeds: UK; 2001.