2022
DOI: 10.48550/arxiv.2205.08535
Preprint

AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars

Abstract: texture generation. Moreover, by leveraging the priors learned in the motion VAE, a CLIP-guided reference-based motion synthesis method is proposed for the animation of the generated 3D avatar. Extensive qualitative and quantitative experiments validate the effectiveness and generalizability of AvatarCLIP on a wide range of avatars. Remarkably, AvatarCLIP can generate unseen 3D avatars with novel animations, achieving superior zero-shot capability. Codes are available at https://github.com/hongfz16/AvatarCLIP.

Citations: cited by 13 publications (21 citation statements)
References: 49 publications (66 reference statements)

“…2022) takes single-view in-the-wild images in training; as their data has not been released, we only evaluate our method on some of their categories. To evaluate the performance, we employ Fréchet Inception Distance (FID) Heusel et al (2017), Fréchet Point Distance (FPD) Liu et al (2022) to measure shape generation quality, and conduct a human perceptual evaluation to further assess text-shape consistency.…”
Section: Text-guided Stylization (mentioning)
confidence: 99%
“…To measure the shape generation quality, we employ Fréchet Inception Distance (FID) Heusel et al (2017) between five rendered images of the generated shape with different camera poses and a set of ground-truth ShapeNet or CO3D images. Further, we convert the generated shapes to 3D point clouds and adopt the metric Fréchet Point Distance (FPD) proposed in Liu et al (2022) to evaluate the generative quality. Note that Dream Field Jain et al (2022) does not produce 3D shapes directly, so that we cannot evaluate this work in this regard.…”
Section: B Implementation Details Metrics and Human Perceptual Evalua... (mentioning)
confidence: 99%
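
The FID and FPD metrics named in the statement above both compare two sets of feature vectors by fitting a Gaussian to each set and taking the Fréchet distance between the two Gaussians. The following is a minimal sketch of that distance only, assuming features have already been extracted; the function name `frechet_distance` is hypothetical, and the feature extractors (an Inception network for FID, a point-cloud network for FPD) are not shown. It is not the citing authors' evaluation code.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two feature sets.

    Each input is an array of shape (num_samples, feature_dim). In FID the
    rows would be Inception features of rendered images; here they are
    arbitrary arrays (an assumption made for illustration).
    """
    feats_a = np.asarray(feats_a, dtype=np.float64)
    feats_b = np.asarray(feats_b, dtype=np.float64)

    # Fit a Gaussian (mean, covariance) to each feature set.
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(feats_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(feats_b, rowvar=False))

    # Tr((cov_a cov_b)^{1/2}) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative up to
    # numerical error for covariance matrices.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(200, 4))         # stand-in for ground-truth features
    fake = rng.normal(size=(200, 4)) + 0.5   # stand-in for generated-shape features
    print(frechet_distance(real, fake))
```

Lower values indicate that the two feature distributions are closer. Identical feature sets give a distance of zero up to floating-point error, and shifting every sample by a constant vector adds the squared length of that vector.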