Using COLMAP and OpenMVS to create a 3D model based on smartphone video
As mentioned in my previous post, I’ve been attempting to make a 3D model of a densely wooded area with structure-from-motion techniques. The main goal is to reconstruct the ground, with its height variations, as accurately as possible. After some trial and error, I finally found a workflow that gives reasonable results.
Data
For the data collection, I used an iPhone 13 to capture wide-angle video at 2K resolution. I collected data for roughly 15 minutes, with the phone in both vertical and horizontal orientations. Videos were captured at 24 frames per second while walking around the mapped area (approx. 50x50 metres) slowly and as steadily as I could in the rough terrain. The videos were downsampled with the previously presented blur-aware downsampling, resulting in a total of 1528 images.
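To put the downsampling in perspective, here is a quick back-of-the-envelope calculation in Python. It assumes exactly 15 minutes of footage, which is only approximate, so the numbers are rough estimates rather than measured values. The function name is just for illustration.

```python
# Rough estimate of how aggressively the video was downsampled.
# Assumption: exactly 15 minutes of footage (the real duration was "roughly" 15 min).
FPS = 24
DURATION_MIN = 15
KEPT_IMAGES = 1528


def sampling_summary(fps, duration_min, kept_images):
    """Return (total frames, average frames per kept image, kept images per second)."""
    duration_s = duration_min * 60
    total_frames = fps * duration_s
    frames_per_kept_image = total_frames / kept_images
    kept_per_second = kept_images / duration_s
    return total_frames, frames_per_kept_image, kept_per_second


if __name__ == "__main__":
    total, ratio, rate = sampling_summary(FPS, DURATION_MIN, KEPT_IMAGES)
    print(f"total frames:            {total}")
    print(f"frames per kept image:   {ratio:.1f}")
    print(f"kept images per second:  {rate:.2f}")
```

Under that assumption, on average about one frame in fourteen was kept, or a bit under two images per second of walking.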
COLMAP and OpenMVS
I used the available Docker containers to run both COLMAP and OpenMVS. I applied COLMAP to compute camera poses for the images, as well as a sparse point cloud reconstruction based on the extracted features. This was carried out with the automatic reconstructor tool, to avoid detailed parameter configuration. The poses and the sparse point cloud were then used as input for OpenMVS. With OpenMVS, I created a rough mesh, a refined mesh, and finally a textured mesh from the input data. I skipped computing a dense point cloud before meshing, since this would have required a ridiculous amount of memory (64 GB of RAM and a 500 GB swap file were not enough in my tests).
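The order of the steps can be sketched roughly as below. This is an illustration of the sequence, not a record of the exact commands I ran: the paths and file names are placeholders, the `build_pipeline` helper is hypothetical, and it assumes the automatic reconstructor leaves an undistorted workspace under `dense/0`. The way files are chained between the OpenMVS tools also differs between versions (newer releases may expect the mesh to be passed separately), so check the `--help` output of your build before relying on it.

```python
from pathlib import PurePosixPath


def build_pipeline(workspace, images):
    """Build the command lines for a COLMAP -> OpenMVS run (nothing is executed).

    Assumptions: placeholder paths inside a Linux container, and a COLMAP
    workspace where the undistorted output ends up in <workspace>/dense/0.
    """
    ws = PurePosixPath(workspace)
    dense = ws / "dense" / "0"          # assumed COLMAP output location
    mvs = ws / "mvs"                    # hypothetical folder for OpenMVS files

    scene = str(mvs / "scene.mvs")
    mesh = str(mvs / "scene_mesh.mvs")
    refined = str(mvs / "scene_mesh_refine.mvs")
    textured = str(mvs / "scene_mesh_refine_texture.mvs")

    return [
        # 1. Camera poses and sparse point cloud
        ["colmap", "automatic_reconstructor",
         "--workspace_path", str(ws), "--image_path", str(images)],
        # 2. Convert the COLMAP result into an OpenMVS scene
        ["InterfaceCOLMAP", "-i", str(dense), "-o", scene,
         "--image-folder", str(dense / "images")],
        # 3. Rough mesh straight from the sparse points (no densification)
        ["ReconstructMesh", "-i", scene, "-o", mesh],
        # 4. Refined mesh
        ["RefineMesh", "-i", mesh, "-o", refined],
        # 5. Textured mesh
        ["TextureMesh", "-i", refined, "-o", textured],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of running them.
    for cmd in build_pipeline("/data/workspace", "/data/images"):
        print(" ".join(cmd))
```

Keeping the commands as plain lists makes it easy to inspect them first and later hand each one to `subprocess.run` (or to `docker run`) once the paths match your own setup.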
Results
COLMAP did a very good job of recovering the camera poses and generating the sparse point cloud. This is visualised below, with the camera poses in red.
The resulting OpenMVS refined and textured mesh looked a little weird at first glance. This is shown below. The surrounding trees created a bowl-like effect around the area. However, the mesh inside the bowl looked good.
After a little bit of manual cleaning in MeshLab, the mesh of the area became clearly visible. As I was mainly interested in the shape of the ground for further modelling tasks, I also cleaned out most of the trees. The result is shown below.
Discussion
The resulting mesh nicely represented the shape and height differences of the ground, so it suited my purposes well. Overall, the mesh could have been a bit sharper, as the trees and other objects in the area (car, trailer…) came out a bit warped. This was likely due to meshing directly from the sparse point cloud instead of densifying it first.