Photogrammetry: How to create a 3D model from 360 video in an hour or less
Written by Axel Busch on 20 March 2024
The advantages of using 360 video for photogrammetry are price, speed, and ease of use. Unlike traditional photogrammetry setups, current 360 cameras make it very easy to create useful 3D models, especially in overhead environments (e.g. caves and wrecks).
On a recent dive trip we anchored near a cave (Venus Cave, NZ), and I wanted to find out if we could reconstruct a 3D model of it from a quick recording with the Insta360 ONE RS 1-inch 360 Edition camera in a Mantis RS360 housing. For lighting we had three Sola Video Pro 3800 video lights; you can see the setup in the photo above.
This is the 360 video we shot:
And this is the 3D model we reconstructed in under one hour:
In this article …
What are the advantages of 360 degree video compared to standard photography for photogrammetry?
What are the disadvantages of using 360 degree video for photogrammetry?
How do you take good videos/pictures with a 360 camera for photogrammetry?
How do I create a 3D model from 360 degree video with Metashape?
Can I use 360 video for photogrammetry?
The answer is 'yes.' Photogrammetry software such as Agisoft Metashape and 3DF Zephyr allows you to import and process 360 videos directly, without relying on other tools.
If the photogrammetry software does not support 360 videos, you can save screenshots, for example every 1 or 2 seconds, and import those.
You can even use photogrammetry software that does not support 360 degree spherical input at all, by splitting each 360 photo into six individual views (the faces of a cube). The free version of 3DF Zephyr and the free Meshroom include tools for this.
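If your tool needs cube faces rather than a spherical panorama, the split can be sketched with a little NumPy. This is a minimal illustration of the idea, not the exact algorithm Zephyr or Meshroom use; the face names and orientation conventions here are my own:

```python
import numpy as np

def equirect_to_cubefaces(img, face_size=512):
    """Split an equirectangular panorama (H x W x 3) into 6 cube faces.

    Nearest-neighbour sampling for brevity; real tools interpolate.
    """
    h, w = img.shape[:2]
    # (u, v) grid across one face, in [-1, 1]; v points up
    rng = (np.arange(face_size) + 0.5) / face_size * 2 - 1
    u, v = np.meshgrid(rng, -rng)
    one = np.ones_like(u)

    # Direction vectors (x, y, z) for each face of the unit cube
    faces_dirs = {
        "front": ( one,  u,   v),
        "back":  (-one, -u,   v),
        "right": (-u,   one,  v),
        "left":  ( u,  -one,  v),
        "up":    (-v,   u,  one),
        "down":  ( v,   u, -one),
    }

    faces = {}
    for name, (x, y, z) in faces_dirs.items():
        # Convert each ray direction to longitude/latitude ...
        lon = np.arctan2(y, x)                          # [-pi, pi]
        lat = np.arcsin(z / np.sqrt(x*x + y*y + z*z))   # [-pi/2, pi/2]
        # ... then to pixel coordinates in the equirectangular source
        px = ((lon / (2 * np.pi) + 0.5) * w).astype(int) % w
        py = ((0.5 - lat / np.pi) * h).astype(int).clip(0, h - 1)
        faces[name] = img[py, px]
    return faces
```

Feeding each of the six square faces into the photogrammetry software as a regular (pinhole-like) image is exactly what the built-in splitting tools do for you.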
What are the advantages of 360 degree video compared to standard photography for photogrammetry?
Put simply, the advantages of 360 degree video compared to standard photography are price, speed, and ease of use. 360 video makes it very affordable and easy to create useful 3D models of underwater structures.
Price: Single-shot 360 cameras are very inexpensive compared to a DSLR or mirrorless camera. A camera and housing cost only about $1,000-2,400, compared to $6,000-10,000 for a mirrorless setup (not including lights).
Speed: With 360 video you can record a large area in minutes that can take days to cover with traditional techniques.
Ease of use: 360 cameras automatically record the whole spherical panorama as well as direction and orientation in auto mode. This makes it possible to very quickly reconstruct a model including all turns and twists and elevation changes.
Traditional photogrammetry techniques rely on DSLR or mirrorless cameras on tripods, drones, or LIDAR. They produce very accurate models but are very time-consuming and difficult to use. You would typically set up a camera carefully on a tripod with a specialist panoramic head, take 8 or more photos, then move the tripod a little further and repeat the process.
With a recording 360 camera, on the other hand, you can simply walk, swim, or drive through a scene. The camera's sensor synchronisation, gyroscope, and calibration algorithms take care of the rest. Even curves in the path and changes in depth are reflected accurately in the 3D model. It is all remarkably effortless, a true win for technology.
What are the disadvantages of using 360 degree video for photogrammetry?
The biggest disadvantage is the limited resolution. 360 cameras record only 16-60 megapixels per frame covering the whole panorama, which is much less than the 200-300 megapixels you would get from 8 photos taken with a modern DSLR or mirrorless camera. That means the resulting model and texture will have much less detail.
But often that doesn’t matter, since we’re more interested in the shape and “floor plan” than in capturing millimetre details on the wall. And 360 video will do that with the minimum amount of time and cost.
If you need the highest surface detail possible, 360 cameras are not the first choice for photogrammetry. But 360 cameras are unbeatable when it comes to ease of use, speed, and price.
Which software is best for photogrammetry?
The most popular photogrammetry programs are Agisoft Metashape, RealityCapture, 3DF Zephyr, PhotoModeler, Pix4D, Meshroom (free) and WebODM (free).
They differ in price, speed, and unique capabilities making them more suitable for certain scenarios than for others. I tried Metashape, RealityCapture, 3DF Zephyr, Pix4D and Meshroom, and found that when it comes to photogrammetry from 360 photos/videos, Agisoft Metashape was ahead of the competition for the following reasons:
Metashape can process 360 degree videos and photos without needing other software or steps.
Metashape’s photo alignment algorithm works extremely well with 360 photos. We had no problem aligning our datasets with Metashape, while other photogrammetry software we tried failed (Zephyr, RealityCapture, Meshroom).
Metashape supports 360 degree videos and photos in the inexpensive Standard Edition. A free 30-day trial is available.
Metashape can produce very fast and very detailed models, depending on your priorities.
Metashape is available for Windows, macOS, and Linux.
For the examples in this article we used Metashape Standard Edition version 2.1, which costs US$179 for a perpetual license at the time of writing. This is considered very affordable in the photogrammetry software space.
There are, however, two notable limitations of the Standard Edition that we ran into during our tests:
The Standard Edition does not include tools to automatically de-light textures (remove baked-in lighting and shadows).
The Standard Edition does not include geo-referencing and scaling tools.
How do you take good videos/pictures with a 360 camera for photogrammetry?
Taking good photos/videos with a 360 camera for photogrammetry is generally the same as with a regular camera, but simpler: the 360 camera has a fixed lens and fixed focus, so it’s impossible to make mistakes in this regard.
There is, however, one important difference between shooting with a 360 camera for photogrammetry and shooting with a regular camera: stabilisation, which should be turned on for 360 photos/video.
When using a regular camera it is important to turn off image stabilisation, as it can interfere with the picture alignment algorithm of the photogrammetry software.
Why should we turn stabilisation on for 360 cameras? It’s about the stitching.
When using a 360 camera, we have an additional step between taking the photos/video and importing them into the photogrammetry software: stitching the 360 image. And for the stitching step it is best if stabilisation is enabled, because it ensures that the sky is up no matter how the camera was held at the time. And that is precisely what we want for our 360 photos/videos.
And one more thing, especially for underwater capture … use lights! You can read all about it in our article: What is the best lighting for underwater 360 video?
What is the best camera setup for photogrammetry?
While photogrammetry software can handle most images and camera settings, you will get better results with certain settings on your camera.
Fortunately for the Insta360 cameras supported by Mantis Sub housings, these are the same settings that we also recommend for general 360 photos/video, so you don't have to change the settings. You can use the same recording for your 360 video and 3D reconstruction.
Recommended 360 degree camera settings for photogrammetry:
Resolution: Highest resolution.
Image Quality: Highest quality.
White balance: Manual/fixed, not automatic.
Exposure: Automatic (generally).
Max ISO: 200–400 for sensors smaller than 1/2”, up to 800 for 1/2”, up to 1600 for 1” or larger.
RAW Images: Not needed (generally).
Sharpening: Off or 'soft'.
Stabilisation: On.
Direction lock: On.
If you want to learn more about how to set up your Insta360 camera for underwater photogrammetry, check out our other articles:
What computer specs do I need for photogrammetry?
Generally, any computer you would consider for professional video production, as a CAD workstation, or for high-end 3D gaming is also a good choice for photogrammetry. In 2024 that means:
Apple Macbook with M1/M2/M3 Pro/Max/Ultra processor, or a
PC with Core i7/Ryzen 7 or Core i9/Ryzen 9 processor and a Graphics card with at least 8GB VRAM (16GB recommended).
At least 32GB memory (for both Apple and PC)
Metashape in particular is very good at using multiple CPU cores, so a CPU with 16 or more cores such as a Core i9, Ryzen 9, or M1/M2/M3 Max or Ultra really makes a difference.
But don’t choose core count over CPU clock speed (GHz), because clock speed has the single biggest impact on processing times - as long as you have enough memory (RAM). If the computer does not have enough memory, the software has to temporarily store parts of the memory on disk (called 'swapping'), which makes computations run very slowly. It will still work, but it will take hours instead of minutes.
Regarding memory, 16GB RAM is the bare minimum for small models (e.g. fewer than 50 images). For more than 50 images, 32GB RAM is needed, and if you want to create very detailed textures, 64GB RAM or more is very much recommended.
For PCs, the Graphics card (GPU) can also have a big impact, but only if it has enough dedicated VRAM (Video memory) to store the data it’s working on. For smaller models 8GB VRAM can be enough, but for larger models (more than 100 images) and for more detailed textures 16GB are often required. If the VRAM is not large enough to hold the data, then the GPU cannot be used and all processing happens on the CPU and RAM.
GPU memory worries do not apply to Apple Silicon Macs, since the CPU and GPU cores share the same fast unified memory.
Rules of thumb for buying a photogrammetry computer:
If you have to choose between a faster CPU and a better GPU, get the faster CPU.
If you have to choose between a faster CPU and more RAM, get more RAM.
It does not have to be a new machine! Consider a refurbished machine instead of buying a low-spec new computer. The 9th generation Intel Core i7 and Core i9 CPUs and newer, and Apple M1 Max and newer are all excellent for photogrammetry.
When possible, choose a desktop PC over a laptop. Many laptop PCs run hot under sustained workloads and slow down dramatically (this rule does not apply to Apple laptops).
How do I create a 3D model from 360 degree video with Metashape?
There are five general steps involved in creating a 3D model through photogrammetry:
Import your video/photos - videos will be converted to photos.
Align photos - estimate the camera position and orientation at time of capture and match key points across images.
Build the point cloud - points in 3D space that represent points on the surface of the model.
Build the 3D Model - a triangle mesh model that represents the external surface of the model.
Create textures - a texture contains the colour information for every surface area of the model.
With these steps in mind Metashape is very easy, and almost intuitive to use. Simply choose the next step from the "Workflow" menu.
The dialog that comes up for each step is well laid out, with the most likely choice already selected. And while there is no context help for each option it is all explained very well in the accompanying PDF manual.
There are however a few tweaks that can have a big effect on the processing time and quality of the model, or where our choice might be different from the standard selection. Read on …
Step 1: Import video or photos
You can choose to import a video file or photos. Internally all photogrammetry software works on photos, so when you choose to import a video, Metashape will extract photo snapshots automatically.
To add a video click on the menu item File -> Import -> Video ...
To add photos click on the menu item Workflow -> Add Photos...
When importing a video, you can specify the interval at which snapshots are extracted using the "Frame step" parameter in the import dialog.
The Insta360 Pro2 and One RS cameras record 360 video at 30 frames per second, so a frame step of 30 means that one photo is saved every second.
You can also preview the video and specify a different start and end time, should you want to only import a shorter section.
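The relationship between frame rate, frame step, and the number of extracted photos is simple arithmetic. A small planning helper (the function names are my own, not part of Metashape):

```python
def frame_step(fps: float, seconds_between_frames: float) -> int:
    """Frame step value for the video import dialog: one photo is
    extracted every `seconds_between_frames` of footage."""
    return max(1, round(fps * seconds_between_frames))

def photo_count(duration_s: float, fps: float, step: int) -> int:
    """Approximate number of photos a given frame step will produce."""
    return int(duration_s * fps) // step + 1

# 30 fps footage, one photo per second -> frame step 30
print(frame_step(30, 1.0))        # 30
# A 3-minute clip at that step yields about 181 photos
print(photo_count(180, 30, 30))   # 181
```

This is useful for keeping the photo count in a range your hardware can handle: halving the interval doubles the number of photos to align.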
After importing save the project by clicking on the menu item File -> Save.
Step 2: Align photos
This processing step calculates the camera position and rotation at the time of capture and matches key points across images.
Camera positions and key points are then used to calculate depth maps for overlapping image pairs using dense stereo matching.
But first we have to tell Metashape that our images are spherical panoramas, or the alignment process will fail.
Click on the menu item Tools -> Camera Calibration
Then set the parameter "Camera type" to "Spherical" and press [OK].
Now click on the menu item Workflow -> Align Photos...
Then set the following parameters in the Align Photos dialog:
General:
Accuracy: High
Generic preselection: On
Reference preselection: On (Type: Sequential)
Advanced:
Exclude stationary tie points: On
Press [OK]
The option Sequential tells Metashape that the camera position advances with each image, i.e. the photos were taken in order along the path.
The option Exclude stationary tie points tells Metashape to exclude parts of the image that don't change between frames - e.g. the nadir or the camera pole.
For 98 source photos this step took 3 minutes on an i9-10850 CPU using all 20 cores at 4.5GHz and 15GB of RAM and produced 92,900 tie points and 98 depth maps.
The positions of the cameras are displayed as spheres. This can be toggled on and off with the Menu item Model -> Show/Hide Items -> Show Cameras.
Tie points can be edited within the Metashape environment; for example, unwanted points can be selected and deleted.
If you’re in a hurry, you can already create a 3D model from just the tie points. It takes only about 2 seconds, but the model has fewer than 100,000 faces.
Step 3: Build the point cloud
This processing step uses the calculated camera positions and rotations from the alignment step to calculate a large set of data points that represent the 3D shape of the model.
Click on the menu item Workflow -> Build Point Cloud...
Then set the following parameters in the Build Point Cloud dialog:
General:
Source data: Depth maps
Quality: High
Advanced:
Depth filtering: Mild
Calculate point colors: On
Calculate point confidence: On
Press [OK]
For 98 source photos this step took 10 minutes on an i9-10850 CPU using all 20 cores at 4.5GHz and 16GB of RAM.
The point cloud can now be edited within the Metashape environment; for example, unwanted points can be selected and deleted.
The point cloud can also be exported to an external tool for further analysis.
Step 4: Build the 3D model
After the point cloud has been reconstructed it is possible to generate a polygonal mesh model based on the point cloud data or depth maps data.
You can adjust the volume that should be reconstructed by changing the bounding box using the bounding box tools (resize region, move region, rotate region).
Metashape supports three reconstruction methods that vary in speed and quality of the generated model:
Tie-points based reconstruction: Very fast reconstruction of a low-detail model solely based on tie points. Takes only seconds.
Depth-maps based reconstruction: Slow reconstruction of a high-quality model using the GPU.
Point-cloud based reconstruction: Very slow reconstruction of a high-quality model based on the previously reconstructed or imported point cloud.
The recommended setting is "Depth-maps", unless the point cloud was edited prior to model reconstruction.
To start the reconstruction, click on the menu item Workflow -> Build Model...
Then set the following parameters in the Build Model dialog:
General:
Source data: Depth maps
Surface type: Arbitrary (3D)
Quality: High
Face count: High (or a desired number e.g. 1,000,000)
Advanced:
Interpolation: Enabled (default)
Depth filtering: Mild
Calculate vertex colors: On
Reuse depth maps: On
Press [OK]
You can create more than one model using different parameters. Just leave the parameter "Replace default model" unchecked and a new model will be saved.
When interpolation is used, Metashape will try to fill holes. This is usually desired and is the default setting. These extra surfaces can always be deleted later if they are unwanted.
We recommend creating at least two models:
One model with Face count: High, and then
another model with a face count below 1,000,000
The second model is more suitable to use for online sharing or in a game engine. Models with a lower face count have less details, but you can use the high-resolution model to create a normal map texture which can recover that detail very well when viewing. More about this in the next section.
The model now shows the shape of our cave very well, but the walls show only very little colour.
Timing: For 98 source photos this step (using Depth Maps) took 3 minutes on an i9-10850 CPU using all 20 cores at 4.7GHz, 12GB of RAM, and 20% GPU utilization using 6GB of VRAM (Nvidia RTX2080TI)
Step 5: Build textures
Textures are the colours in the model. They are tied to each reconstructed model. If you have created several 3D models in the previous step, remember to set the desired model as the default model before creating the texture.
To set a model as active, right-click on the model and press "Set as default"
To start the reconstruction, click on the menu item Workflow -> Build Texture...
Then set the following parameters in the Build Texture dialog to build the Diffuse map:
General:
Texture type: Diffuse map
Source data: Images
Mapping mode: Generic
Blending mode: Mosaic
Texture size/count: 8192 x 1
Advanced:
Enable hole filling: On
Enable ghosting filter: On
Press [OK]
Texture type
The diffuse map is the standard colour texture. There is another type of texture that is very useful for models with a lower polygon count: the normal map. Normal maps are used to add detail without using more polygons (faces). To create a normal map you need a higher-polygon model, which is why we recommend creating one model with Face count “High”, which will often produce several million polygons, and then a second one with a more usable face count below 1,000,000.
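To make the "bumpiness as colour" idea concrete: a normal map stores a surface direction in every pixel by mapping the x/y/z components of a unit normal from [-1, 1] into RGB values. A minimal sketch of that standard encoding (my own helper, not a Metashape function):

```python
import numpy as np

def encode_normals(normals: np.ndarray) -> np.ndarray:
    """Encode unit normal vectors (..., 3) as 8-bit RGB, the usual
    tangent-space normal-map convention: component -1 -> 0, +1 -> 255."""
    # Normalise first so slightly off-unit inputs still encode correctly
    n = normals / np.linalg.norm(normals, axis=-1, keepdims=True)
    return np.round((n * 0.5 + 0.5) * 255).astype(np.uint8)

# A normal pointing straight out of the surface, (0, 0, 1), encodes to
# (128, 128, 255) - the light blue that dominates most normal maps.
```

When the viewer shades the low-polygon model, it reads these per-pixel directions back out of the texture instead of relying on the coarse mesh normals, which is why the detail of the high-polygon model appears to return.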
Texture size/count
The texture size and count control how detailed the texture will be. Texture files are always square images, so a number of 8192 means 8192x8192 pixels = 67 MP. A count of 1 means that one texture is generated, so the detail for the whole model is packed into one image file with 67 MP. A count of 2 means two files, so 134MP.
While larger and more textures provide more detail, there are limits to what devices can display. Graphics engines and mobile phone memory used to be very limited, and a maximum texture size of 2048 was recommended. With smaller textures, increasing the texture count is a way to maintain the same detail level: because 2048 is 1/4th of 8192 and textures are square, a 2048 texture has only 1/16th of the area. That means 16 textures of size 2048x2048 are needed to store the same information as one 8192x8192 texture.
Fortunately, very few modern devices should have a problem with an 8192 texture size. But low-end phones and older mid-range phones might still struggle with the required memory, so it’s safer to use 4096 x 4 instead of 8192 x 1.
For larger models like our 100m cave (with an area of about 600 sqm), one 8192 texture is not enough for adequate detail. So we’re generating 4096 x 8, which holds as much as 8192 x 2. We also tested 8192 x 4, and while it used a lot more memory and time, it didn’t add noticeably more detail.
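The size/count arithmetic is easy to verify. A quick sketch (the helper name is my own):

```python
def texture_megapixels(size: int, count: int) -> float:
    """Total texture area in megapixels for a given size/count setting."""
    return size * size * count / 1e6

# One 8192 texture equals sixteen 2048 textures in total area:
assert texture_megapixels(8192, 1) == texture_megapixels(2048, 16)
# And 4096 x 8 holds the same detail budget as 8192 x 2:
assert texture_megapixels(4096, 8) == texture_megapixels(8192, 2)
print(texture_megapixels(8192, 1))  # ~67.1 MP
```

So when choosing a setting, think in terms of the total megapixel budget first, and then split it into a size each target device can load.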
Creating the normal map
The normal map adds detail to the low-polygon model by extracting “bumpiness” information from a higher-detail model.
Set the following parameters in the Build Texture dialog to build the Normal map:
General:
Texture type: Normal map
Source data: 3D model (highest face count)
Mapping mode: Keep uv
Blending mode: Mosaic
Texture size/count: (same as Diffuse map)
Press [OK]
Metashape can copy textures between models, so you don’t always have to recreate them.
And that’s it. Now you’re ready to export your model!
Export the model
The most popular format for 3D files is Wavefront OBJ (.obj). These files store the textures as external image files, with a material description file (.mtl) that ties the model and textures together.
If you prefer to store model and textures in one file, we recommend the Binary glTF (.glb) format.
Note: Due to a severe security vulnerability in the Autodesk FBX (.fbx) format, this is no longer recommended and it is no longer supported by popular viewers such as Microsoft 3D Viewer.
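To illustrate how an OBJ export hangs together, here is a toy example that writes the three-file structure by hand: the .obj points at the .mtl via `mtllib`, and the .mtl points at the texture image via `map_Kd`. All file names are made up for illustration; Metashape writes these files for you on export.

```python
from pathlib import Path

def write_textured_quad(out_dir: str,
                        texture_file: str = "cave_diffuse.png") -> Path:
    """Write a minimal OBJ/MTL pair for a single textured quad."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "model.mtl").write_text(
        "newmtl textured\n"
        f"map_Kd {texture_file}\n"  # the diffuse texture image
    )
    (out / "model.obj").write_text(
        "mtllib model.mtl\n"        # link to the material file
        "usemtl textured\n"
        "v 0 0 0\nv 1 0 0\nv 1 1 0\nv 0 1 0\n"   # vertices
        "vt 0 0\nvt 1 0\nvt 1 1\nvt 0 1\n"        # texture coordinates
        "f 1/1 2/2 3/3 4/4\n"                     # one quad face
    )
    return out / "model.obj"
```

This three-file layout is also why a shared OBJ model must travel together with its .mtl and texture images - which is exactly the problem the single-file Binary glTF (.glb) format solves.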
What can I do with a 3D model?
The most popular use case after exporting a model from the photogrammetry software is probably to share it online. Popular platforms are Sketchfab or Construkted, and we recommend a model size of less than 1M polygons.
Sometimes you might want to edit the model beyond what Metashape allows, or just view it on your computer. Popular free 3D viewing/editing software includes Blender and MeshLab. The most popular commercial packages are 3ds Max and Maya.
You can also import the model into a game engine like Unity, Godot, or Unreal and create a walkthrough experience or game.
And of course you can 3D print it!
More from the Mantis Sub Academy.
With the Mantis Sub Academy we want to provide a set of free resources to help you get started with creating 360/VR content. Here are some more articles: