The video above is the result of a new study from UC Berkeley researchers. In layman's terms, they used artificial intelligence to detect a dancer's moving pose in a source video, then mapped those movements onto footage of a different person. Or, as they describe it in their paper:
“Given a source video of a person dancing we can transfer that performance to a novel (amateur) target after only a few minutes of the target subject performing standard moves. We pose this problem as a per-frame image-to-image translation with spatio-temporal smoothing. Using pose detections as an intermediate representation between source and target, we learn a mapping from pose images to a target subject’s appearance. We adapt this setup for temporally coherent video generation including realistic face synthesis.”
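For the technically curious, that description boils down to three steps per frame: detect the source dancer's pose, render it as a skeleton image (the "intermediate representation"), and feed that skeleton to a generator network trained on a few minutes of the target subject. The sketch below is a minimal illustration of that idea, not the authors' code: it swaps in MediaPipe Pose for the pose detector (the paper used OpenPose-style detections), and `generator` is a hypothetical stand-in for a trained pix2pix-style translation network, as is the `source_dance.mp4` filename.

```python
# A minimal sketch of pose-as-intermediate-representation motion transfer.
# Assumptions: MediaPipe Pose stands in for the paper's pose detector, and
# `generator` is a placeholder for a trained image-to-image network.
import cv2
import mediapipe as mp
import numpy as np

mp_pose = mp.solutions.pose
mp_draw = mp.solutions.drawing_utils


def generator(pose_image: np.ndarray) -> np.ndarray:
    """Placeholder for a trained pose-to-appearance translation network.

    A real system would return a synthesized frame of the target subject
    striking this pose; here we just echo the skeleton back.
    """
    return pose_image


cap = cv2.VideoCapture("source_dance.mp4")  # hypothetical input path
with mp_pose.Pose(static_image_mode=False) as pose:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # 1. Detect the source dancer's pose in this frame.
        results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.pose_landmarks is None:
            continue
        # 2. Render the pose as a stick-figure image: the intermediate
        #    representation shared by the source and target subjects.
        skeleton = np.zeros_like(frame)
        mp_draw.draw_landmarks(skeleton, results.pose_landmarks,
                               mp_pose.POSE_CONNECTIONS)
        # 3. Translate the pose image into the target's appearance.
        synthesized = generator(skeleton)
        cv2.imshow("synthesized target frame", synthesized)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
cap.release()
cv2.destroyAllWindows()
```

The full method also smooths pose keypoints across neighboring frames (the "spatio-temporal smoothing" in the quote above) and trains a dedicated network for realistic face synthesis; both are omitted here for brevity.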
What results is a fairly realistic video of a person dancing just like the dancer in the original clip. There are some tells that the generated footage isn't real: the movement can look a bit stop-motion-y, and the outline of the body is sometimes inconsistent. Still, this is an impressive technological feat, one that will only get more convincing as research continues.
What other ways could you imagine this technology being used? Let us know what you think in the comments below!
Featured image: Ministerio de Cultura de la Nación Argentina/Flickr