System Evaluation » Filter Timing Analysis

Collection

To profile the different parts of the system we record the timing information from directly inside the ov_msckf::VioManager. The file should be comma separated format, with the first column being the timing, and the last column being the total time (units are all in seconds). The middle columns should describe how much each component takes (whose names are extracted from the header of the csv file). You can use the bellow tools as long as you follow this format, and add or remove components as you see fit to the middle columns.

To evaluate the computational load (not computation time), we have a python script that leverages the psutil python package to record percent CPU and memory consumption. This can be included as an additional node in the launch file which only needs the node which you want the reported information of. This will poll the node for its percent memory, percent cpu, and total number of threads that it uses. This can be useful if you wish to compare different methods on the same platform, but doesn't make sense to use this to compare the running of the same algorithm or different algorithms across different hardware devices.

<node name="recorder_timing" pkg="ov_eval" type="pid_ros.py" output="screen">
    <param name="nodes"   type="str" value="/run_subscribe_msckf" />
    <param name="output"  type="str" value="/tmp/psutil_log.txt" />
</node>

It is also important to note that if the estimator has multiple nodes, you can subscribe to them all by specifying their names as a comma separated string. For example to evaluate the computation needed for VINS-Mono multi-node system we can do:

<node name="recorder_timing" pkg="ov_eval" type="pid_ros.py" output="screen">
    <param name="nodes"   type="str" value="/feature_tracker,/vins_estimator,/pose_graph" />
    <param name="output"  type="str" value="/tmp/psutil_log.txt" />
</node>

Processing & Plotting

Script "timing_flamegraph"

The flame graph script looks to recreate a FlameGraph of the key components of the system. While we do not trace all functions, the key "top level" function times are recorded to file to allow for insight into what is taking the majority of the computation. The file should be comma separated format, with the first column being the timing, and the last column being the total time. The middle columns should describe how much each component takes (whose names are extracted from the header of the csv file).

rosrun ov_eval timing_flamegraph <file_times.txt>
rosrun ov_eval timing_flamegraph timing_mono_ethV101.txt

Example output:

[TIME]: loaded 2817 timestamps from file (7 categories)!!
mean_time = 0.0037 | std = 0.0011 | 99th = 0.0063  | max = 0.0254 (tracking)
mean_time = 0.0004 | std = 0.0001 | 99th = 0.0006  | max = 0.0014 ( propagation)
mean_time = 0.0032 | std = 0.0022 | 99th = 0.0083  | max = 0.0309 (msckf update)
mean_time = 0.0034 | std = 0.0013 | 99th = 0.0063  | max = 0.0099 (slam update)
mean_time = 0.0012 | std = 0.0017 | 99th = 0.0052  | max = 0.0141 (slam delayed)
mean_time = 0.0009 | std = 0.0003 | 99th = 0.0015  | max = 0.0027 (marginalization)
mean_time = 0.0128 | std = 0.0035 | 99th = 0.0208  | max = 0.0403 (total)
Image

Script "timing_comparison"

This script is use to compare the run-time of different runs of the algorithm. This take in the same file as the flame graph and is recorded in the ov_msckf::VioManager. The file should be comma separated format, with the first column being the timing, and the last column being the total time. The middle columns should describe how much each component takes (whose names are extracted from the header of the csv file).

rosrun ov_eval timing_comparison <file_times1.txt> ... <file_timesN.txt>
rosrun ov_eval timing_comparison timing_mono_ethV101.txt timing_stereo_ethV101.txt

Example output:

======================================
[TIME]: loading data for timing_mono
[TIME]: loaded 2817 timestamps from file (7 categories)!!
mean_time = 0.0037 | std = 0.0011 | 99th = 0.0063  | max = 0.0254 (tracking)
mean_time = 0.0004 | std = 0.0001 | 99th = 0.0006  | max = 0.0014 ( propagation)
mean_time = 0.0032 | std = 0.0022 | 99th = 0.0083  | max = 0.0309 (msckf update)
mean_time = 0.0034 | std = 0.0013 | 99th = 0.0063  | max = 0.0099 (slam update)
mean_time = 0.0012 | std = 0.0017 | 99th = 0.0052  | max = 0.0141 (slam delayed)
mean_time = 0.0009 | std = 0.0003 | 99th = 0.0015  | max = 0.0027 (marginalization)
mean_time = 0.0128 | std = 0.0035 | 99th = 0.0208  | max = 0.0403 (total)
======================================
======================================
[TIME]: loading data for timing_stereo
[TIME]: loaded 2817 timestamps from file (7 categories)!!
mean_time = 0.0077 | std = 0.0020 | 99th = 0.0124  | max = 0.0219 (tracking)
mean_time = 0.0004 | std = 0.0001 | 99th = 0.0007  | max = 0.0023 ( propagation)
mean_time = 0.0081 | std = 0.0047 | 99th = 0.0189  | max = 0.0900 (msckf update)
mean_time = 0.0063 | std = 0.0023 | 99th = 0.0117  | max = 0.0151 (slam update)
mean_time = 0.0020 | std = 0.0026 | 99th = 0.0079  | max = 0.0205 (slam delayed)
mean_time = 0.0019 | std = 0.0005 | 99th = 0.0031  | max = 0.0052 (marginalization)
mean_time = 0.0263 | std = 0.0063 | 99th = 0.0410  | max = 0.0979 (total)
======================================
Image

Script "timing_percentages"

This utility allows for comparing the resources used by the algorithm on a specific platform. An example usage would be how the memory and cpu requirements increase as the state size grows or as more cameras are added. You will need to record the format using the pid_ros.py node (see Collection for details on how to use it). Remember that 100% cpu usage means that it takes one cpu thread to support the system, while 200% means it takes two cpu threads to support the system (see this link for an explanation). The folder structure needed is as follows:

psutil_logs/
    ov_mono/
        run1.txt
        run2.txt
        run3.txt
    ov_stereo/
        run1.txt
        run2.txt
        run3.txt
rosrun ov_eval timing_percentages <timings_folder>
rosrun ov_eval timing_percentages psutil_logs/

Example output:

======================================
[COMP]: processing ov_mono algorithm
    loaded 149 timestamps from file!!
    PREC: mean_cpu = 83.858 +- 17.130
    PREC: mean_mem = 0.266 +- 0.019
    THR: mean_threads = 12.893 +- 0.924
======================================
======================================
[COMP]: processing ov_stereo algorithm
    loaded 148 timestamps from file!!
    PREC: mean_cpu = 111.007 +- 16.519
    PREC: mean_mem = 5.139 +- 2.889
    THR: mean_threads = 12.905 +- 0.943
======================================
Image