We propose SCARF (Segmented Clothed Avatar Radiance Field), a hybrid model combining a mesh-based body with a neural radiance field. Integrating the mesh into the volumetric rendering, in combination with a differentiable rasterizer, enables us to optimize SCARF directly from monocular videos, without any 3D supervision. This hybrid modeling enables SCARF to (i) animate the clothed body avatar by changing body poses (including hand articulation and facial expressions), (ii) synthesize novel views of the avatar, and (iii) transfer clothing between avatars for virtual try-on applications. We demonstrate that SCARF reconstructs clothing with higher visual quality than existing methods, that the clothing deforms with changing body pose and body shape, and that clothing can be successfully transferred between avatars of different subjects.
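To make the hybrid rendering concrete, below is a minimal sketch of compositing a clothing NeRF in front of a rasterized body mesh along a single camera ray. It assumes a hypothetical clothing_nerf that maps sample points to (density, rgb), plus a per-ray mesh depth and color coming from a differentiable rasterizer; the quadrature follows standard NeRF volume rendering and is not the paper's exact formulation. The key design point is that ray samples stop at the mesh surface, so the opaque body acts as the far boundary for the volumetric clothing layer.

import torch

def render_ray(clothing_nerf, origin, direction, mesh_depth, mesh_rgb,
               n_samples=64, far=2.0):
    """Composite clothing NeRF samples in front of the rasterized body mesh."""
    # Sample up to the mesh intersection (or `far` if the ray misses the body).
    t_far = mesh_depth if torch.isfinite(mesh_depth) else torch.tensor(far)
    t = torch.linspace(1e-3, float(t_far), n_samples)             # (S,)
    pts = origin[None, :] + t[:, None] * direction[None, :]       # (S, 3)

    density, rgb = clothing_nerf(pts)                             # (S,), (S, 3)
    delta = (t[1] - t[0]).expand(n_samples)                       # uniform spacing
    alpha = 1.0 - torch.exp(-density * delta)                     # per-sample opacity
    trans = torch.cumprod(
        torch.cat([alpha.new_ones(1), 1.0 - alpha[:-1]]), dim=0)  # transmittance
    weights = alpha * trans

    clothing_color = (weights[:, None] * rgb).sum(dim=0)
    # Light that passes through all clothing samples hits the opaque body mesh.
    residual = trans[-1] * (1.0 - alpha[-1])
    return clothing_color + residual * mesh_rgb

# Toy usage with a constant-density clothing field (purely illustrative):
nerf = lambda p: (torch.full((p.shape[0],), 0.5),
                  torch.full((p.shape[0], 3), 0.8))
color = render_ray(nerf, torch.zeros(3), torch.tensor([0.0, 0.0, 1.0]),
                   mesh_depth=torch.tensor(1.5),
                   mesh_rgb=torch.tensor([0.6, 0.4, 0.3]))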
SCARF takes a monocular RGB video and clothing segmentation masks as input, and outputs a human avatar with separate body and clothing layers. Blue letters indicate optimizable modules or parameters.
Once the avatar is built, we can animate it with detailed control over the face and hands. When we alter the body shape, the clothing adapts accordingly. We can also transfer clothing captured from other subjects' videos to the given subject.
We run Marching Cubes to extract a mesh from the trained clothing NeRF and show it together with the explicitly learned body mesh. Green indicates the mesh extracted from the NeRF-based clothing.
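As a rough illustration of this extraction step, the sketch below queries a hypothetical clothing_nerf density field on a regular grid and extracts an iso-surface with scikit-image's Marching Cubes; the grid bounds, resolution, and density threshold are assumptions, not values from the paper.

import numpy as np
from skimage import measure

def extract_clothing_mesh(clothing_nerf, resolution=128, bound=1.0, level=10.0):
    """Query NeRF density on a regular grid and run Marching Cubes on it."""
    # Dense grid of query points covering the cube [-bound, bound]^3.
    xs = np.linspace(-bound, bound, resolution)
    grid = np.stack(np.meshgrid(xs, xs, xs, indexing="ij"), axis=-1)  # (R, R, R, 3)
    density = clothing_nerf(grid.reshape(-1, 3)).reshape(
        resolution, resolution, resolution)
    # Extract the iso-surface at the chosen density threshold.
    verts, faces, normals, _ = measure.marching_cubes(density, level=level)
    # Map voxel indices back to world coordinates.
    voxel_size = 2.0 * bound / (resolution - 1)
    return verts * voxel_size - bound, faces, normals

# Toy usage with a sphere-shaped density field (purely illustrative):
sphere = lambda p: 0.5 - np.linalg.norm(p, axis=-1)
verts, faces, normals = extract_clothing_mesh(sphere, resolution=64, level=0.0)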
We thank Sergey Prokudin, Weiyang Liu, Yuliang Xiu, Songyou Peng, and Qianli Ma for fruitful discussions, and Peter Kulits, Zhen Liu, Yandong Wen, Hongwei Yi, Xu Chen, Soubhik Sanyal, Omri Ben-Dov, and Shashank Tripathi for proofreading. We also thank Betty Mohler, Sarah Danes, Natalia Marciniak, Tsvetelina Alexiadis, Claudia Gallatz, and Andres Camilo Mendoza Patino for their support with data. This work was partially supported by the Max Planck ETH Center for Learning Systems.
Disclosure. MJB has received research gift funds from Adobe, Intel, Nvidia, Meta/Facebook, and Amazon. MJB has financial interests in Amazon, Datagen Technologies, and Meshcapade GmbH. While MJB is a part-time employee of Meshcapade, his research was performed solely at, and funded solely by, the Max Planck Society. While TB is a part-time employee of Amazon, this research was performed solely at, and funded solely by, MPI.
@inproceedings{Feng2022scarf,
  author = {Feng, Yao and Yang, Jinlong and Pollefeys, Marc and Black, Michael J. and Bolkart, Timo},
  title = {Capturing and Animation of Body and Clothing from Monocular Video},
  year = {2022},
  booktitle = {SIGGRAPH Asia 2022 Conference Papers},
  articleno = {45},
  numpages = {9},
  location = {Daegu, Republic of Korea},
  series = {SA '22}
}