The Future of Filmmaking: Cinematography in the Age of Photogrammetry

Since the dawn of cinema, a new technology has come along every few years that radically changes the way we make movies.

In the 1940s, filmmakers began to embrace color film stocks, which literally added another dimension to cinematic storytelling. At about the same time, broadcast TV began delivering live sporting events, variety shows, and other programs to mass audiences. Then the movie industry responded to television’s popularity with widescreen formats (including Cinerama and CinemaScope) and larger-gauge film (such as 65mm and VistaVision) that helped filmmakers of the 1950s and 1960s enter a new world of creative possibilities.

In the age of digital filmmaking, these transformative technologies are coming about faster and faster.

It wasn’t that long ago that shooting raw video demanded Hollywood budgets, and digital color correction was reserved for only the highest-end studios. And it’s really only in the last 10 years that 4K (and now 8K and even 12K) cameras have become technologically feasible (much less accessible). Now, these tools are all commonplace.

But as revolutionary as these technologies were in their time, they all do basically the same thing: they turn the living action in front of the lens into a flat, recorded image.

Today, we’re on the cusp of a technical innovation that will change what it means to truly capture a scene. Simply shooting beautiful footage is no longer enough, and filmmakers should pay attention.

In today’s article, we’ll explain what photogrammetry is, explore why it’s already a valuable tool in the creative industry, and discover how it can be used in video workflows of every scale.

All the world’s a stage

When we capture a scene, what are we really doing?

The way movie cameras work is basically unchanged from the last century: light is focused onto a flat plane (whether photochemical film or digital image sensor) that turns the action in front of the lens into a two-dimensional image. Even 3D camera systems work this way, just with multiple sensors.

But since the 2000s, productions have started capturing a lot more data on set than just that picture.

Many cameras now have accelerometers to record data about tilts, pans, and movements. There are also intelligent lens-mount systems that record metadata for iris, focus, and focal length.

These new types of data are vital for post-production teams, like VFX departments, whose job it is to convincingly mesh their digital creations into the real-world scene that was captured by the camera.

This is one of the key issues facing modern filmmakers: how do you capture more, better information about the world around your camera in a way that enables modern post-production techniques?

Bridging that gap between the real world and the digital world is something that modern productions need to prioritize, because our workflows and tools will continue to demand more and more data.

That’s where photogrammetry comes in.

What is photogrammetry?

Depending on who you talk to, photogrammetry can mean a lot of different things. So let’s start with the basics.

Strictly speaking, photogrammetry refers to ways of measuring physical objects and environments by analyzing photographic data.

Put simply, it’s a way of generating extra data about the real world in front of the camera or cameras. With a single camera rolling, filmmakers can capture one point of view on a scene. But with multiple cameras and a bit of photogrammetry, you can extract accurate, three-dimensional digital representations of physical objects or environments.

For example, photogrammetric tools can create 3D topographical maps from multiple 2D frames of aerial photography. This is achieved by analyzing multiple perspectives of real-world information (like the photographs themselves, altitude data, and terrain features) and then calculating new digital data that fills in gaps and enables new ways of interacting with or visualizing the images.
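Under the hood, that analysis comes down to matching the same features across overlapping photos, estimating where the cameras were, and intersecting their viewing rays. As a rough illustration only (not any specific tool mentioned in this article), here is a minimal Python sketch of that reconstruction step using OpenCV, assuming just two overlapping images and a known camera intrinsics matrix K; real photogrammetry packages run the same idea across hundreds or thousands of photos.

```python
import cv2
import numpy as np

def triangulate_pair(img_a, img_b, K):
    """Sketch: recover sparse 3D points from two overlapping photos, given
    shared camera intrinsics K (3x3). Points come back up to an unknown
    overall scale, which is inherent to two-view reconstruction."""
    # 1. Detect and match features that appear in both photographs.
    orb = cv2.ORB_create(4000)
    kp_a, des_a = orb.detectAndCompute(img_a, None)
    kp_b, des_b = orb.detectAndCompute(img_b, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des_a, des_b)
    pts_a = np.float32([kp_a[m.queryIdx].pt for m in matches])
    pts_b = np.float32([kp_b[m.trainIdx].pt for m in matches])

    # 2. Estimate how the second camera is rotated and translated relative to the first.
    E, mask = cv2.findEssentialMat(pts_a, pts_b, K, method=cv2.RANSAC)
    _, R, t, _ = cv2.recoverPose(E, pts_a, pts_b, K, mask=mask)

    # 3. Intersect the two sets of viewing rays to get 3D positions.
    P_a = K @ np.hstack([np.eye(3), np.zeros((3, 1))])   # first camera at the origin
    P_b = K @ np.hstack([R, t])                          # second camera, estimated pose
    pts_h = cv2.triangulatePoints(P_a, P_b, pts_a.T, pts_b.T)
    return (pts_h[:3] / pts_h[3]).T                      # N x 3 point cloud
```

Feed a point cloud like that into a meshing and texturing step and you are most of the way to the terrain models described above.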

This kind of analysis is part of the technology behind Microsoft Flight Simulator, which allows players to explore an insanely detailed 1:1 representation of the entire Earth. That entire model was created by applying photogrammetric analysis to 2D satellite images and aerial photography.

Game developers commonly employ photogrammetry to create precise digital replicas of well-known props, models, and locations from films, as EA did with Star Wars: Battlefront.

But photogrammetry is also appealing because it can save time and enable unique creative experiences.

Studio Kong Orange’s recently crowdfunded Vokabulantis is a videogame that used these techniques to capture animated stop-motion puppetry. Each pose is shot eight times, with light coming from six different directions. Those animations are then transferred into 3D-scanned environments that can be scaled and manipulated independently of the photogrammetric characters.

This gives the game a quaint, handmade look, but it also allows the team to create 3D assets in record time.

“Our tram was built by hand in two weeks, exterior and interior,” says filmmaker and stop-motion animator Johan Oettinger, citing just one richly detailed asset from the game. “It would take a CG artist at least five weeks to build it in a computer, to that level of detail, with texture. There is real time to be saved. And who doesn’t love handmade objects?”

How is photogrammetry used in movies and TV?

When it comes to film production, photogrammetry can manipulate, displace, and/or duplicate reality.

That’s how David Stump, ASC, used it in a characteristically spectacular action sequence in the James Bond film Quantum of Solace (2008) — one of the first cases of photogrammetry being used in a major motion picture.

“Do you remember the skydiving scene, where they jump out of the burning DC-3 with one parachute between them?” Stump asks.

The scene features Daniel Craig and co-star Olga Kurylenko in free fall; they are shot in close-up, from multiple camera angles and with their bodies and faces clearly identifiable, as they plummet toward the desert landscape below. It looks for all the world like Craig, Kurylenko, and a daredevil cameraman were all up there, falling through the sky together.

But it was a clever use of photogrammetry. “That scene was done in a vertical wind tunnel in Bedfordshire, England,” Stump reveals.

Craig and Kurylenko were shot from 17 different angles (the cameras were precisely synchronized with the cesium atomic clock at the National Institute of Standards and Technology in Boulder, CO) in a five-meter-wide wind tunnel that physically simulated free-falling at 120mph.

Using the synced frames from all the cameras, CGI meshes were generated from Craig’s and Kurylenko’s real bodies. The original photography of the actors was mapped onto those shapes, producing photoreal 3D geometry.

That allowed the VFX artists to synthesize completely new camera angles on both actors, irrespective of whether there had actually been a camera in that position, and relight the CG figures before compositing them into aerial photography taken over Mexico.

“There were no belly bands, no green screens, no blue screens, no fans in their faces,” Stump says. Because the actors were recorded under actual freefall conditions, with no cables or safety harnesses attached, their facial expressions and body movements were utterly authentic.

Earlier examples of photogrammetry and image-based rendering in major feature films include The Matrix (1999), Fight Club (1999), and Panic Room (2002). So these techniques have been in serious use for over two decades now.

A tool for every budget

But don’t get the idea that photogrammetry is just for James Bond movies. It was employed on Quantum of Solace for aesthetic reasons (director Marc Forster just doesn’t like the look of green-screen photography).

There are purely practical or logistical applications for this tech, too.

Photogrammetry has become a staple of VFX workflows because it can save time and money on all kinds of productions, not just action set pieces. That’s how HBO’s Big Little Lies and Sharp Objects became showcases for the tech.

Instead of driving co-stars Nicole Kidman, Reese Witherspoon and Shailene Woodley to Monterey every time they needed to shoot a scene that takes place at a real Monterey restaurant, Big Little Lies rebuilt the restaurant itself on set, shooting against a green-screen backdrop.

VFX studio Real by Fake photographed the harborside location from multiple angles and recreated it as a 3D environment using crude photogrammetric techniques. The results were seamlessly integrated as background elements in green-screen shots.

Later, Real by Fake deployed a drone to shoot photos of a Victorian-style house in Redwood Valley, CA, that served as a setting for HBO’s Sharp Objects, so that a digital twin of the location could be used as needed. (One consideration: it was fire season in California, and HBO couldn’t be certain the house wouldn’t burn down before the series finished production.)

More recently, the company scanned New York locations with lidar for use as environments in the Apple TV+ series The Morning Show, which shot entirely in Los Angeles.

While we’re on the subject of lidar, technically speaking it is not photogrammetry.

Lidar surveys a location by measuring reflected laser light and calculating the time of flight for the bounce, while photogrammetric calculations are based on photographic images. But the techniques (and motivations for using them) are very similar, in that they yield physically accurate 3D versions of real environments that can save real headaches in production and post.
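Reduced to arithmetic, the contrast is simple. The snippet below is purely illustrative, with hypothetical function names: lidar derives distance from the timing of a laser round trip, while the simplest photogrammetric (stereo) case derives depth from how far a feature shifts between two photographs.

```python
SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def lidar_range_m(time_of_flight_s: float) -> float:
    # The laser pulse travels out and back, so halve the round trip.
    # Example: a 200-nanosecond round trip works out to roughly 30 meters.
    return SPEED_OF_LIGHT * time_of_flight_s / 2.0

def stereo_depth_m(focal_length_px: float, baseline_m: float, disparity_px: float) -> float:
    # The farther away an object is, the less it appears to shift
    # (its disparity) between two camera positions a baseline apart.
    return focal_length_px * baseline_m / disparity_px
```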

Real by Fake works closely with editorial to generate any shots needed as early in the edit as possible. “If they come up with a new scene, or they want to replace a face or a background, we do a version 0, meaning we start not from the Avid material, but from the source in Nuke or another 3D application,” explains Real by Fake founder Marc Côté.

“I remember, the HBO executives had no clue those shots [in Big Little Lies] were not real,” he says. “It had no animation, it was just a background plate. But the focus was right, and the light was similar, so they were stoked by what they were seeing.”

If early cuts feature more VFX elements, they draw fewer notes from producers because it’s clearer what the finished shots will look like. And photogrammetric techniques make it easier than ever to quickly composite shots with accurate and realistic 3D geometry throughout the scene.

They make a huge difference on shows like The Mandalorian, whose virtual environments and huge LED screens use photogrammetry extensively.

Even without those virtual backdrops, getting a good scan or photogrammetric capture of a scene is so helpful that it has become standard operating procedure for many VFX supervisors.

“Lidar and photogrammetry have become essential tools not just for virtual production but also traditional post-production and VFX,” says Brian Drewes, founder and CEO of VFX studio Zero, whose portfolio includes impressive work for films including Little Women, Creed II, and Tomb Raider.

“When we’re supervising on set, we routinely do photogrammetry of the sets, even if we don’t see a real need for it. We’ll keep the data and only crunch it if it helps us with tracking the room or if we decide we’d like a clean plate. It’s saved us so many times that we just do it as a safety measure, no matter what.”

Whether your concern is creating photoreal VFX, ensuring the safety of your actors, or saving time and money by bringing faraway locations to your talent (instead of the other way around), photogrammetry is increasingly likely to offer inexpensive options for you.

Power to the (photogrammetric) people

Now that it’s becoming easier to save time and money with digital backgrounds and set extensions, the new frontier in photogrammetry is people.

Not necessarily for recreating a film’s heroes in 3D space, like in Quantum of Solace, but more as a source for digital characters that can be dropped into the middle ground or background as needed.

Imagine a crowd scene effortlessly populated by figures plucked from a stable of photoreal digital extras.

Sure, you can generate a crowd today using green screen techniques. But a photogrammetric character is a real 3D shape that can be rendered from literally any point of view, making it more versatile than any chroma-keyed element.

Here’s how it works: performers are shot in motion from a number of synchronized camera angles. A photogrammetric volume is a little like a motion-capture stage, except that while mocap is designed to record only motion vectors, motion photogrammetry records real photographic data, just like an ordinary camera. Then, selected portions of those performances are cut into animated loops and the original photography is mapped onto CG shapes created through analysis of the captured images.
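That last step, mapping the photography onto the reconstructed geometry, is at its core a camera projection. Here is a stripped-down sketch (a hypothetical helper, assuming the reconstruction has already produced mesh vertices plus a calibrated pose for one of the synchronized cameras); real pipelines blend many cameras, handle occlusion, and bake the result into texture maps.

```python
import numpy as np

def sample_vertex_colors(vertices, image, K, R, t):
    """Project reconstructed 3D vertices (N x 3, in world space) into one
    synchronized camera's image and sample a color for each vertex.
    K is the 3x3 intrinsics matrix; R and t are that camera's pose."""
    cam_pts = R @ vertices.T + t.reshape(3, 1)   # world space -> camera space (3 x N)
    proj = K @ cam_pts                           # pinhole projection
    px = proj[:2] / proj[2]                      # pixel coordinates (2 x N)
    u = np.clip(np.round(px[0]).astype(int), 0, image.shape[1] - 1)
    v = np.clip(np.round(px[1]).astype(int), 0, image.shape[0] - 1)
    return image[v, u]                           # one color sample per vertex
```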

The result? Animated CG characters with photoreal textures that can be rendered at synthesized angles that precisely match a moving camera.

What’s more, you can relight and otherwise manipulate those digital figures as if they were fully CG — at least to a degree.

“You don’t have ultimate control over shaders and you can’t swap textures,” Drewes says. “But you also don’t have to create a photoreal digital human for a background or middle-ground asset. You could have a backlot full of digital extras.”

Characters created with technology like Unreal Engine’s MetaHuman Creator look great, but they still need to be animated realistically, either by hand or with the help of mocap data. But if a character is created through photogrammetry, realistic movements (based on real physical performances) can be baked into the asset.

You can see examples of this in the live-action remake of Ghost in the Shell, which used a domed rig with 80 2K machine-vision cameras to capture performances at 24fps without using traditional green-screen work.

All of the camera images were fed into photogrammetry software called RealityCapture, which created a unique 3D model for each frame—32,000 in total. The moving 3D figures were then composited into the film’s fictional Japanese metropolis as holographic advertisements called “solograms.”

Ghost in the Shell’s solograms are supposed to look a little unreal, but the goal is for photogrammetric digital humans to be indistinguishable from the real thing.

Real by Fake expects that AI will make it possible to increase their realism by recognizing the shapes of bodies.

“If we have AI that can recognize the shape of the [point] cloud and make a bone in it, we can modify the animation and do some tweaks,” says Real by Fake CG Supervisor Robert Rioux. “That’s another area that we need to explore — to find a way to use AI to add a skeleton and make other modifications.”

What does this mean for cameras?

The fundamental advantage of photogrammetric capture is that, by acquiring multiple images of a scene from multiple points of view, cinematographers are no longer limited to a flat image with no real depth information.

Stereoscopic camera rigs, which record just two different points of view on a scene, can be used to deliver the illusion of depth. But if enough angles are captured, the object or environment can be analyzed to recreate it in 3D space.

And that means you can create an entirely new camera angle anywhere inside the capture volume.

Sam Nicholson, ASC, the CEO of virtual production company Stargate Studios, says the work of the Fraunhofer Institute may point the way. That team has developed a nine-camera rig that allows for the creation of virtual dolly and crane moves around a photogrammetrically captured object.

As multi-camera arrays become less expensive, Nicholson sees photogrammetry becoming more powerful. If nine cameras can create a convincing 3D scene, imagine what 100 cameras, augmented by AI and fast GPU-based image processing, could do.

He also singles out the pixel-shift technology found in Sony’s Alpha-series cameras as a possible harbinger of new photogrammetric techniques.

In pixel-shift mode, the camera takes a rapid series of exposures with a tiny sensor movement between each shot, just enough to shift the sensor’s color-filter array by a single pixel. That improves color resolution in the final image by allowing the camera to gather red, green and blue light at each photosite instead of just one filtered color.
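To make that concrete, here is a rough sketch (illustrative only, and not Sony’s actual processing) of how four one-pixel-shifted exposures from an RGGB sensor can be merged so that every photosite ends up with measured red, green, and blue values:

```python
import numpy as np

def merge_pixel_shift(raw_frames, shifts=((0, 0), (0, 1), (1, 1), (1, 0))):
    """Combine four Bayer exposures, each captured with the sensor offset
    by one photosite, into a full-RGB image. Assumes an RGGB filter layout
    and frames already registered to the same scene position."""
    h, w = raw_frames[0].shape
    rgb = np.zeros((h, w, 3))
    counts = np.zeros((h, w, 3))
    ys, xs = np.mgrid[0:h, 0:w]
    for raw, (dy, dx) in zip(raw_frames, shifts):
        # Which color filter covered each scene point in this exposure?
        row_odd, col_odd = (ys + dy) % 2, (xs + dx) % 2
        chan = np.where((row_odd == 0) & (col_odd == 0), 0,      # red
               np.where((row_odd == 1) & (col_odd == 1), 2, 1))  # blue, otherwise green
        for c in range(3):
            mask = chan == c
            rgb[..., c][mask] += raw[mask]
            counts[..., c][mask] += 1
    return rgb / counts  # green is sampled twice per site, so this averages it
```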

“What if it actually looks for depth data when it shifts?” Nicholson asks. “If it can shift back and forth fast enough, you could do photogrammetry with a single chip and a single lens. It’s photogrammetry, but on steroids.”

Nicholson thinks the acquisition of depth information will become so important to filmmakers that cameras may eventually be made without traditional lenses.

“I think the camera of the future will be a flat camera, about as thick as an iPad, that absorbs and times light and can calculate depth,” he says.

“Think about how small your cell phone lens is. Put a thousand of them together, right next to each other on a flat plate, and now you’re capturing 1,000 images, all offset a little bit and synchronized, and using AI, you put them all together. Each frame is a 1,000-input photogrammetry frame.”

A new type of storytelling

It’s clear that photogrammetry will have an impact on camera technology. But its influence on filmmaking technology won’t be limited to cameras and 3D VFX processes.

Côté sees photogrammetry having an impact on editorial, allowing an editor not just to select the best take but also to dictate the precise camera angle.

“In the Avid you could change the camera’s position to help with timing or even create a new shot if you don’t have the right angle,” he says. “Just imagine the Avid timeline with a window showing what you’re seeing from a given camera angle that allows you to go into the shot and change the angle.”

That capability — the power to edit space as well as time — would fundamentally change the way film editors build a scene.

Imagine the possibilities that would open up for martial-arts filmmakers who capture their biggest brawls in real time on a photogrammetric stage. That could allow them to later assemble every chop, kick and block from the most dramatic point of view.

And what if the next big Hamilton-esque Broadway musical could be shot with a massive photogrammetry rig consisting of dozens and dozens of cameras forming a dense dome overhead? The director could make retroactive choices about camera placement to get exactly the right angles on and around the stage without interfering with the performance.

If the concept still sounds exotic or expensive, consider that Apple debuted Object Capture, a photogrammetry API for macOS, just last week at WWDC 2021. This will allow developers to easily analyze photos and turn them into 3D meshes and material maps with basically any Mac or iPhone.

Supercharged by ever-increasing amounts of computational power, photogrammetry is challenging and expanding the boundaries of filmmaking itself. Making effective use of it is going to mean mastering new ideas about what it means to capture an image.

But it’s not going away.

Along with other powerful techniques for bridging the gap between the real and virtual worlds, photogrammetry is key to the future of both production and post.

And if the brief but fast-moving history of digital cinema has taught us anything, it’s that the future always arrives faster than you expect.

Featured image from The Mandalorian © Disney

Bryant Frazer

Bryant is a New York-based journalist specializing in filmmaking technology and technique. For many years, he was the editor of StudioDaily, a daily news source for artists, executives, and craftspeople working in production and post-production for film and television. His writing can also be found at Film Freak Central, where he reviews new and classic movies released on Blu-ray Disc and, since 1994, at Deep-Focus.com, one of the first generation of film sites on the Internet.