Existing work on scene flow estimation focuses on autonomous driving and mobile robotics, whereas automated solutions are lacking for motion in nature, such as that exhibited by debris flows. We propose DeFlow, a model for 3D motion estimation of debris flows, together with a newly captured dataset. We adopt a novel multi-level sensor fusion architecture and self-supervision to incorporate the inductive biases of the scene. We further adopt a multi-frame temporal processing module to enable flow speed estimation over time. Our model achieves state-of-the-art optical flow and depth estimation on our dataset and fully automates motion estimation for debris flows.
Architecture overview. The inputs to our network are two consecutive camera images and the corresponding (synchronized) range maps generated by a LiDAR sensor. The image and depth information are encoded by an image encoder and a sparse depth encoder, respectively. Next, a multi-level sensor fusion scheme combines the depth features, the image features, and the feature correlation volume. The aggregated feature maps are fed into a multi-task decoder that outputs the depth and optical flow estimates. Downstream of the learnable components, we apply deterministic geometric relations to back-project the depth image into a point cloud and compute 3D motion from the correspondences defined by the optical flow.
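The geometric post-processing step can be sketched as follows. This is a minimal NumPy illustration, not the released implementation: it assumes dense metric depth maps, standard pinhole intrinsics K, and nearest-neighbor lookup of flow correspondences.

```python
import numpy as np

def backproject(depth, K):
    """Back-project a dense depth map (H, W) into a per-pixel 3D point
    cloud (H, W, 3) in the camera frame, using pinhole intrinsics K."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    x = (u - K[0, 2]) * depth / K[0, 0]
    y = (v - K[1, 2]) * depth / K[1, 1]
    return np.stack([x, y, depth], axis=-1)

def scene_flow(depth1, depth2, flow, K):
    """Per-pixel 3D motion vectors from two depth maps and the optical
    flow from frame 1 to frame 2 (flow has shape (H, W, 2), in pixels).
    Correspondences are found by rounding the flow-warped pixel
    coordinates (a simplification; sub-pixel sampling is also possible)."""
    H, W = depth1.shape
    pts1 = backproject(depth1, K)
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    u2 = np.clip(np.round(u + flow[..., 0]).astype(int), 0, W - 1)
    v2 = np.clip(np.round(v + flow[..., 1]).astype(int), 0, H - 1)
    pts2 = backproject(depth2, K)[v2, u2]
    return pts2 - pts1  # (H, W, 3) frame-to-frame 3D displacement
```

With accurate depth and flow, each displacement vector gives the 3D motion of the surface point observed at that pixel between the two frames.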
Multi-frame Smoothing Module
For the best viewing experience, please watch this video at high resolution (≥1080p).
3D Flow Speed Estimate
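Given the per-pixel 3D displacement between consecutive frames, a flow speed estimate follows directly from the capture frame rate. A minimal sketch; the frame rate, the region mask, and averaging over the debris-flow region are illustrative assumptions, not details from the paper:

```python
import numpy as np

def flow_speed(displacement, fps, mask=None):
    """Mean 3D flow speed in m/s from a frame-to-frame displacement
    field (H, W, 3) in metres, given the camera frame rate in Hz.
    An optional boolean mask restricts the average to the pixels
    covering the debris flow."""
    speed = np.linalg.norm(displacement, axis=-1) * fps  # per-pixel m/s
    if mask is not None:
        speed = speed[mask]
    return speed.mean()
```

For example, a uniform 0.1 m displacement per frame at 10 Hz corresponds to a flow speed of 1 m/s.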
Please cite both papers if you find our method and dataset useful.