ReMu
Reconstructing Multi-layer 3D Clothed Human from Images

BMVC 2025 Paper

Onat Vuran · Hsuan-I Ho

TL;DR

Project Teaser Image

ReMu reconstructs multi-layer garments from Image Layers. (a) Given a set of images captured by a static RGB camera, (b) our method reconstructs 3D garments in layers that fit on a unified human body. These layers are optimized to remove inter-penetrations, (c) making the resulting 3D garments suitable for downstream applications such as clothing simulation.

Abstract

The reconstruction of multi-layer 3D garments typically requires expensive multi-view capture setups and specialized 3D editing efforts. To support the creation of life-like clothed human avatars, we introduce ReMu, a method for reconstructing multi-layer clothed humans in a new setup, Image Layers, which captures a subject wearing different layers of clothing with a single RGB camera. To reconstruct physically plausible multi-layer 3D garments, a unified 3D representation is necessary to model these garments in a layered manner. Thus, we first reconstruct and register each garment layer in a shared coordinate system defined by the canonical body pose. Afterwards, we introduce a collision-aware optimization process to address interpenetration and further refine the garment boundaries by leveraging implicit neural fields. It is worth noting that our method is template-free and category-agnostic, which enables the reconstruction of 3D garments in diverse clothing styles. Through our experiments, we show that our method achieves competitive performance compared to category-specific methods and reconstructs penetration-free 3D clothed humans.

Method

Method Architecture Diagram

Given a set of k-layered Image Layers I_k, we register SMPL-X body models B and reconstruct watertight meshes M from the images. We then segment out the 3D garments S through multi-view parsing. Next, we deform the 3D meshes to the canonical body pose B_c through inverse LBS. The aligned garments are optimized to remove inter-layer penetrations, yielding M'. Finally, we fit implicit neural fields f to refine the garment surface geometry and boundaries.
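
The canonicalization step can be pictured with a short sketch. The snippet below is a minimal illustration of inverse LBS, assuming the SMPL-X skinning weights and posed per-joint transforms are available; all function and variable names are ours, not the paper's API.

    import numpy as np
    from scipy.spatial import cKDTree

    def canonicalize(garment_verts, body_verts, body_weights, joint_transforms):
        """Map posed garment vertices into the canonical-pose coordinate system.

        garment_verts:    (N, 3) posed garment vertices
        body_verts:       (V, 3) posed SMPL-X body vertices
        body_weights:     (V, J) SMPL-X skinning weights
        joint_transforms: (J, 4, 4) per-joint transforms (canonical -> posed)
        """
        # Borrow skinning weights from the nearest posed body vertex.
        _, nn = cKDTree(body_verts).query(garment_verts)
        weights = body_weights[nn]                                     # (N, J)

        # Blend the per-joint transforms, then invert them to undo the pose.
        blended = np.einsum('nj,jab->nab', weights, joint_transforms)  # (N, 4, 4)
        inverse = np.linalg.inv(blended)

        # Apply the inverse transform in homogeneous coordinates.
        homo = np.concatenate([garment_verts, np.ones((len(garment_verts), 1))], axis=1)
        return np.einsum('nab,nb->na', inverse, homo)[:, :3]

Because every layer is deformed with the same body's skinning weights, all garments land in one shared coordinate frame, which is what makes the subsequent inter-layer optimization possible.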

Technical Contributions:

  • Garment Registration: Each garment layer is reconstructed and registered in a shared coordinate system defined by the canonical body pose, without relying on garment templates or category priors.
  • Penetration Removal: A collision-aware optimization resolves inter-layer penetrations so that all garment layers fit on a unified body (see the sketches following this list).
  • Garment Refinement: Implicit neural fields refine the garment surface geometry and boundaries.
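
To make the collision-aware step concrete, here is a minimal standalone sketch of the core push-out operation: outer-layer vertices that fall inside (or too close to) the inner layer are moved outward along the inner surface normal. The paper's full objective optimizes the layers jointly; the margin, step size, and function names here are illustrative assumptions.

    import numpy as np
    import trimesh

    def resolve_penetrations(outer_verts, inner_mesh, margin=2e-3, step=0.5, iters=50):
        """Push outer-layer vertices outside the inner layer by a small margin."""
        verts = outer_verts.copy()
        for _ in range(iters):
            # Closest point on the inner surface and the face normal there.
            closest, _, tri_id = trimesh.proximity.closest_point(inner_mesh, verts)
            normals = inner_mesh.face_normals[tri_id]
            # Approximate signed distance: negative inside the inner layer.
            signed = np.einsum('ij,ij->i', verts - closest, normals)
            inside = signed < margin
            if not inside.any():
                break
            # Move offending vertices outward along the inner-surface normal.
            verts[inside] += step * (margin - signed[inside])[:, None] * normals[inside]
        return verts

In practice one would pair such a collision term with a smoothness regularizer (e.g., a Laplacian term) so that the displaced region stays locally smooth.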

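The refinement stage can likewise be sketched: an implicit neural field f (here, a small MLP predicting a signed distance) is fit to samples from the canonicalized, de-penetrated garment mesh, after which the refined surface can be extracted, e.g. with marching cubes. The architecture and L1 loss below are illustrative assumptions, not the paper's exact configuration.

    import torch
    import torch.nn as nn

    class SDFField(nn.Module):
        """Small MLP mapping a 3D point to a signed distance value."""
        def __init__(self, hidden=256, layers=4):
            super().__init__()
            dims = [3] + [hidden] * layers + [1]
            blocks = []
            for i in range(len(dims) - 1):
                blocks.append(nn.Linear(dims[i], dims[i + 1]))
                if i < len(dims) - 2:               # no activation on the output
                    blocks.append(nn.Softplus(beta=100))
            self.net = nn.Sequential(*blocks)

        def forward(self, x):
            return self.net(x)

    def fit_sdf(points, sdf_values, epochs=1000, lr=1e-4):
        """points: (N, 3) samples near the garment; sdf_values: (N,) signed distances."""
        field = SDFField()
        opt = torch.optim.Adam(field.parameters(), lr=lr)
        for _ in range(epochs):
            opt.zero_grad()
            loss = (field(points).squeeze(-1) - sdf_values).abs().mean()  # L1 SDF fit
            loss.backward()
            opt.step()
        return field
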
Results on 4D-DRESS

Qualitative comparisons on 4D-DRESS (figure columns: Input Image, SMPLicit, ClothWild, ISP, Ours).

In-the-wild Capture

In-the-wild capture results across four example scenes.

References

  • Ho et al., "SiTH: Single-view Textured Human Reconstruction with Image-Conditioned Diffusion." CVPR, 2024.
  • Kim et al., "GALA: Generating Animatable Layered Assets from a Single Scan." CVPR, 2024.
  • Wang et al., "4D-DRESS: A 4D Dataset of Real-world Human Clothing with Semantic Annotations." CVPR, 2024.
  • Corona et al., "SMPLicit: Topology-aware Generative Model for Clothed People." CVPR, 2021.
  • Moon et al., "ClothWild: 3D Clothed Human Reconstruction in the Wild." ECCV, 2022.
  • Li et al., "ISP: Multi-Layered Garment Draping with Implicit Sewing Patterns." NeurIPS, 2023.

Citation

    @inproceedings{vuran2025remu,
      title     = {ReMu: Reconstructing Multi-layer 3D Clothed Human from Images},
      author    = {Vuran, Onat and Ho, Hsuan-I},
      booktitle = {British Machine Vision Conference (BMVC)},
      year      = {2025}
    }