Enhancement Makes GRAND THEFT AUTO Look Photorealistic

The original Super Mario Bros. was once the height of video game technology. Just 35 years later, we have in-game graphics that almost look real. Almost. Despite monumental leaps in rendering and aesthetics, even the best-looking games can’t truly pass for real-life footage. But that might not be true for much longer. One company has created a photorealism enhancer that employs machine learning to make games like Grand Theft Auto V resemble actual footage filmed on city streets.

This process might not just be the future for video games; it might also be the future of filmmaking.

Intel Labs has shared a video tutorial (which we first came across at Gizmodo) showing how its new Photorealism Enhancement program works. Using machine learning techniques, it modifies images from video games to make them look realistic. The actual process is, as you can imagine, a lot more complex than just running rendered images through a photo enhancement program. Intel Labs explains:

“The images are enhanced by a convolutional network that leverages intermediate representations produced by conventional rendering pipelines. The network is trained via a novel adversarial objective, which provides strong supervision at multiple perceptual levels. We analyze scene layout distributions in commonly used datasets and find that they differ in important ways. We hypothesize that this is one of the causes of strong artifacts that can be observed in the results of many prior methods. To address this we propose a new strategy for sampling image patches during training. We also introduce multiple architectural improvements in the deep network modules used for photorealism enhancement. We confirm the benefits of our contributions in controlled experiments and report substantial gains in stability and realism in comparison to recent image-to-image translation methods and a variety of other baselines.”

A split screen of Grand Theft Auto V, with actual game footage on the right, and an enhanced realistic image on the left, both showing a city street with mountains in the distance (Intel Labs)

If you fully understand all of that, you may be qualified to work for Intel Labs. (You’ll at least want to read their accompanying paper and visit the program’s official site.) But the general concept is easy to understand, as is how it could be used for different games. For this GTA V sample, the company used images of German cities from the Cityscapes dataset as a base. Those photos helped the program transform the rendered graphics into something that looks real. The results are striking. Grand Theft Auto V goes from looking like a modern-day game to feeling like you’re seeing through the eyes of an actual person (wearing sunglasses) driving a car in the real world.
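For the curious, here is a toy Python sketch of two ideas from that abstract: feeding the network extra per-pixel data from the game's renderer alongside the image, and training on small patches cut from each frame. Everything in it is an illustrative assumption rather than Intel Labs' code. The choice of depth and surface normals as the extra buffers, the function names `stack_buffers` and `sample_patch`, and the tiny frame size are all made up for the example, and the uniform random crop is only a stand-in for the paper's own patch sampling strategy.

```python
import random


def stack_buffers(rgb, depth, normals):
    """Join per-pixel channels into one input: 3 (color) + 1 (depth) + 3 (normal)."""
    height, width = len(rgb), len(rgb[0])
    return [
        [list(rgb[y][x]) + [depth[y][x]] + list(normals[y][x]) for x in range(width)]
        for y in range(height)
    ]


def sample_patch(image, size, rng):
    """Cut a random size-by-size crop from a 2D grid of pixels.

    Placeholder only: a plain uniform crop, not the sampling scheme from the paper.
    """
    height, width = len(image), len(image[0])
    top = rng.randrange(height - size + 1)
    left = rng.randrange(width - size + 1)
    return [row[left:left + size] for row in image[top:top + size]]


# A hypothetical 4x6 "frame": flat gray color, constant depth, normals facing the camera.
HEIGHT, WIDTH = 4, 6
rgb = [[(0.5, 0.5, 0.5)] * WIDTH for _ in range(HEIGHT)]
depth = [[1.0] * WIDTH for _ in range(HEIGHT)]
normals = [[(0.0, 0.0, 1.0)] * WIDTH for _ in range(HEIGHT)]

network_input = stack_buffers(rgb, depth, normals)
patch = sample_patch(network_input, 2, random.Random(0))

print(len(network_input[0][0]))   # channels per pixel: 7
print(len(patch), len(patch[0]))  # patch height and width: 2 2
```

A real system would pass stacks like this through a trained convolutional network. The sketch only shows the shape of the data going in.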

Other video games would need a different set of real photos to pull from. But this method would still be much faster and more cost-effective than other programs. And that’s why this has the potential to one day change how movies and TV shows are filmed.

A split screen of Grand Theft Auto V, with actual game footage on the left, and an enhanced realistic image on the right, both showing a city street with mountains in the distance (Intel Labs)

Instead of shooting huge, expensive action sequences, you could render them like a video game first. Then run the footage through this Photorealism Enhancement program. You’d never have to worry about getting the shot right, or getting permission to close city streets. More importantly, you wouldn’t have to put any performers’ lives at risk. All while still providing a sequence that looks real. And this could give small-budget films enormous sequences they otherwise couldn’t afford.

Maybe this machine learning image enhancer won’t revolutionize video games tomorrow. Or replace CGI, green screens, and humans immediately. But that future might not be that far away. It wasn’t all that long ago that Super Mario Bros. was as good as it got.