Denoising Functional Maps: Diffusion Models for Shape Correspondence
Accepted at CVPR 2025

We propose DenoisFM, a novel method for predicting shape correspondences in the form of functional maps using Denoising Diffusion Models.


Abstract


Estimating correspondences between pairs of deformable shapes remains a challenging problem. Despite substantial progress, existing methods lack broad generalization capabilities and require category-specific training data. To address these limitations, we propose a fundamentally new approach to shape correspondence based on denoising diffusion models. In our method, a diffusion model learns to directly predict the functional map, a low-dimensional representation of a point-wise map between shapes. We use a large dataset of synthetic human meshes for training and employ two steps to reduce the number of functional maps that need to be learned. First, the maps refer to a template rather than shape pairs. Second, the functional map is defined in a basis of eigenvectors of the Laplacian, which is not unique due to sign ambiguity. Therefore, we introduce an unsupervised approach to select a specific basis by correcting the signs of eigenvectors based on surface features. Our approach achieves performance competitive with existing descriptor-based and large-scale shape deformation methods on standard human datasets, meshes with anisotropic connectivity, non-isometric humanoid shapes, and animals.


Method



(a) Functional map prediction using a diffusion model. For each unique shape in the dataset, we obtain surface features using a feature extractor, as well as a sign-corrected eigenbasis. Both condition the diffusion model, which predicts the template-wise functional map. The pairwise functional maps are obtained through the map composition property, upsampled using ZoomOut, and converted to pointwise maps.
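The composition and conversion steps can be sketched as follows, using one common functional-map convention (here C_xt is assumed to map spectral coefficients of functions on shape X to the template T); all variable names are illustrative and the ZoomOut upsampling step is omitted.

import numpy as np
from scipy.spatial import cKDTree

def pairwise_map_from_template(C_xt, C_yt, phi_x, phi_y):
    """Sketch: combine template-wise functional maps into a pairwise
    point-to-point map.

    C_xt, C_yt : (k, k) functional maps X -> T and Y -> T
    phi_x      : (n_x, k) Laplacian eigenvectors of X
    phi_y      : (n_y, k) Laplacian eigenvectors of Y
    """
    # Map composition X -> T -> Y. For near-isometric shapes C_yt is
    # approximately orthogonal, so its pseudo-inverse ~ its transpose.
    C_xy = np.linalg.pinv(C_yt) @ C_xt

    # Convert to a pointwise map: for every vertex of Y, find the
    # nearest vertex of X in the aligned spectral embedding.
    tree = cKDTree(phi_x)
    _, p2p_y_to_x = tree.query(phi_y @ C_xy)   # (n_y,) indices into X
    return C_xy, p2p_y_to_x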


(b) Learned sign correction. The eigenvectors returned by a numerical eigensolver have random signs. To select a specific sign, we project each eigenvector onto a learned correction vector obtained with a feature extractor and flip the eigenvector so that the projection is positive. The sign correction network is trained in an unsupervised manner by learning to correct eigenvectors whose signs are randomized.
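A minimal sketch of the projection step, assuming the correction vectors have already been predicted by the feature extractor; the array names and shapes below are illustrative, not the paper's implementation.

import numpy as np

def correct_signs(phi, sigma):
    """Resolve the sign ambiguity of Laplacian eigenvectors.

    phi   : (n_vertices, k) eigenvectors with arbitrary signs from a
            numerical eigensolver
    sigma : (n_vertices, k) learned correction vectors, one per eigenvector
            (hypothetical output of the sign-correction network)

    Each eigenvector is flipped so that its projection onto the
    corresponding correction vector becomes positive.
    """
    proj = np.einsum('nk,nk->k', phi, sigma)   # per-eigenvector projections
    signs = np.where(proj >= 0.0, 1.0, -1.0)
    return phi * signs[None, :]

One way to realize the unsupervised training described above is to apply random sign flips to the eigenvectors and train the network to produce correction vectors that yield the same corrected basis regardless of the flips.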


Training Data


230,000 synthetic human meshes from the SURREAL dataset.

64,000 synthetic animal meshes generated by randomly varying the pose of 32 training shapes from the SMAL dataset.


Results: Shape Correspondence


Humans


Left: source shape, Right: ours

We evaluate our method on three benchmark datasets: FAUST, SCAPE, and SHREC'19. We compare it to two categories of baselines: large-scale shape deformation methods and small descriptor-based models. Our results are on par with both categories of baselines on all datasets, demonstrating competitive performance across diverse shape classes.


Humanoids


To evaluate our model on unseen data, we use the non-isometric DT4D dataset, which consists of 9 humanoid shape classes. Since we train only on human datasets, this evaluation can be viewed as a zero-shot setting. Our method achieves accurate matching for most shape classes in both intra- and inter-category settings, demonstrating its ability to generalize to data unseen during training.


Animals


We test our approach on a dataset containing 7 types of animals. For training, we use the fitted parameters of the SMAL parametric model and randomly vary the pose of 32 training animals, generating a total of 64,000 meshes. We achieve performance comparable to the baselines, which suggests that, given enough training data, our method can be applied to any shape class.


Results: Sign Correction of Laplacian Eigenvectors


Accuracy


Given a set of eigenvectors with random signs, our trained sign correction network selects the same basis combination in \({>}95\%\) of cases on standard human datasets: FAUST, SCAPE, and SHREC'19, in both regular (r) and anisotropic (a) versions.


Distribution of Functional Maps


For the FAUST dataset, we plot the first two PCA components for:

  1. functional maps between each shape and the template before sign correction
  2. functional maps after sign correction
  3. conditioning matrices after sign correction

Each entry is colored according to its shape class. Note the clusters that form for the functional maps and conditioning matrices after sign correction.
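A rough sketch of how such a plot can be produced, assuming the template-wise functional maps are stacked into one array with a class label per shape (all names below are illustrative):

import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

def plot_fmap_pca(fmaps, labels):
    """fmaps  : (n_shapes, k, k) template-wise functional maps
       labels : (n_shapes,) integer shape-class labels
    """
    X = fmaps.reshape(len(fmaps), -1)              # flatten each map to a vector
    coords = PCA(n_components=2).fit_transform(X)  # first two PCA components
    plt.scatter(coords[:, 0], coords[:, 1], c=labels, cmap='tab10', s=8)
    plt.xlabel('PC 1')
    plt.ylabel('PC 2')
    plt.show()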


Learned Correction Vectors



We visualize several of the learned correction vectors \(\varsigma_i\), along with the corresponding eigenvectors \(\phi_i\). The low-order correction vectors resemble the eigenvectors themselves, while the high-order ones are mostly concentrated in the arm and hand regions.


Limitations



The main limitations of our method include:

  • Shapes that differ significantly from the synthetic human data used for training (e.g., an alien with large claws or a mouse).
  • Partial shapes (with missing body parts).
  • Shapes with challenging topology that are not handled well by the functional map framework (holes, self-intersections).

BibTeX


Thank you for your interest in our work! If you find our paper useful, please consider citing it:

@inproceedings{zhuravlev2025denoising,
    title={Denoising Functional Maps: Diffusion Models for Shape Correspondence}, 
    author={Aleksei Zhuravlev and Zorah Lähner and Vladislav Golyanik},
    year={2025},
    eprint={2503.01845},
    archivePrefix={arXiv},
    primaryClass={cs.CV},
    url={https://arxiv.org/abs/2503.01845}, 
}


Template sources: D-NPC, Michaël Gharbi, Ref-NeRF.