System Evaluation » Filter Evaluation Metrics

Absolute Trajectory Error (ATE)

The Absolute Trajectory Error (ATE) is given by the simple difference between the estimated trajectory and groundtruth after it has been aligned so that it has minimal error. First the "best" transform between the groundtruth and estimate is computed, afterwhich the error is computed at every timestep and then averaged. We recommend reading Zhang and Scaramuzza [47] paper for details. For a given dataset with $N$ runs of the same algorithm with $K$ pose measurements, we can compute the following for an aligned estimated trajectory $\hat{\mathbf{x}}^+$ :

\begin{align*} e_{ATE} &= \frac{1}{N} \sum_{i=1}^{N} \sqrt{ \frac{1}{K} \sum_{k=1}^{K} ||\mathbf{x}_{k,i} \boxminus \hat{\mathbf{x}}^+_{k,i}||^2_{2} } \end{align*}

Relative Pose Error (RPE)

The Relative Pose Error (RPE) is calculated for segments of the dataset and allows for introspection of how localization solutions drift as the length of the trajectory increases. The other key advantage over ATE error is that it is less sensitive to jumps in estimation error due to sampling the trajectory over many smaller segments. This allows for a much fairer comparision of methods and is what we recommend all authors publish results for. We recommend reading Zhang and Scaramuzza [47] paper for details. We first define a set of segment lengths $\mathcal{D} = [d_1,~d_2,\cdots,~d_V]$ which we compute the relative error for. We can define the relative error for a trajectory split into $D_i$ segments of $d_i$ length as follows:

\begin{align*} \tilde{\mathbf{x}}_{r} &= \mathbf{x}_{k} \boxminus \mathbf{x}_{k+d_i} \\ e_{rpe,d_i} &= \frac{1}{D_i} \sum_{k=1}^{D_i} ||\tilde{\mathbf{x}}_{r} \boxminus \hat{\tilde{\mathbf{x}}}_{r}||_{2} \end{align*}

Root Mean Squared Error (RMSE)

When evaluating a system on a single dataset is the Root Mean Squared Error (RMSE) plots. This plots the RMSE at every timestep of the trajectory and thus can provide insight into timesteps where the estimation performance suffers. For a given dataset with $N$ runs of the same algorithm we can compute the following at each timestep $k$ :

\begin{align*} e_{rmse,k} &= \sqrt{ \frac{1}{N} \sum_{i=1}^{N} ||\mathbf{x}_{k,i} \boxminus \hat{\mathbf{x}}_{k,i}||^2_{2} } \end{align*}

Normalized Estimation Error Squared (NEES)

Normalized Estimation Error Squared (NEES) is a standard way to characterize if the estimator is being consistent or not. In general NEES is just the normalized error which should be the degrees of freedoms of the state variables. Thus in the case of position and orientation we should get a NEES of three at every timestep. To compute the average NEES for a dataset with $N$ runs of the same algorithm we can compute the following at each timestep $k$ :

\begin{align*} e_{nees,k} &= \frac{1}{N} \sum_{i=1}^{N} (\mathbf{x}_{k,i} \boxminus \hat{\mathbf{x}}_{k,i})^\top \mathbf{P}^{-1}_{k,i} (\mathbf{x}_{k,i} \boxminus \hat{\mathbf{x}}_{k,i}) \end{align*}

Single Run Consistency

When looking at a single run and wish to see if the system is consistent it is interesting to look a its error in respect to its estimated uncertainty. Specifically we plot the error and the estimator $3\sigma$ bound. This provides insight into if the estimator is becoming over confident at certain timesteps. Note this is for each component of the state, thus we need to plot x,y,z and orientation independently. We can directly compute the error at timestep $k$ :

\begin{align*} \mathbf{e}_k &= \mathbf{x}_{k} \boxminus \hat{\mathbf{x}}_{k} \\ &\textrm{where} ~~\mathbf{e}_k\sim \mathcal N (0, \mathbf P) \end{align*}