2021
DOI: 10.48550/arxiv.2112.03221
Preprint

Text2Mesh: Text-Driven Neural Stylization for Meshes

Abstract: [Figure 1, teaser; panel labels: Iron Man, Brick Lamp, Colorful Crochet Candle, Astronaut Horse] Text2Mesh produces color and geometric details over a variety of source meshes, driven by a target text prompt. Our stylization results coherently blend unique and ostensibly unrelated combinations of text, capturing both global semantics and part-aware attributes.

Cited by 10 publications (13 citation statements)
References 39 publications
“…We follow the model architecture and generation pipeline in Text2Mesh (Michel et al., 2021). Text2Mesh proposes a neural style field network, which directly outputs a displacement value along the mesh normal and a color at each vertex.…”
Section: Discussion (citation type: mentioning)
confidence: 99%
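That neural style field can be pictured as a small coordinate MLP that takes a vertex position and returns a per-vertex color plus a scalar offset applied along the vertex normal. The sketch below is a minimal, hedged illustration in PyTorch; the layer widths, activations, and the 0.1 displacement scale are illustrative assumptions, not the exact configuration used in Text2Mesh.

```python
import torch
import torch.nn as nn

class NeuralStyleField(nn.Module):
    """Sketch of a style field: vertex position -> (RGB color, displacement along normal).
    Layer sizes and output scaling are assumptions for illustration."""

    def __init__(self, hidden: int = 256):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.color_head = nn.Sequential(nn.Linear(hidden, 3), nn.Sigmoid())  # RGB in [0, 1]
        self.disp_head = nn.Sequential(nn.Linear(hidden, 1), nn.Tanh())      # bounded scalar offset

    def forward(self, verts: torch.Tensor, normals: torch.Tensor):
        feat = self.backbone(verts)            # (V, hidden) per-vertex features
        color = self.color_head(feat)          # (V, 3) per-vertex color
        disp = 0.1 * self.disp_head(feat)      # (V, 1) displacement magnitude (scale is an assumption)
        styled_verts = verts + disp * normals  # move each vertex along its normal
        return styled_verts, color
```

One appeal of parameterizing style as a field over vertex coordinates, rather than as raw per-vertex parameters, is that the MLP itself acts as a smoothness prior over the mesh surface.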
“…We first examine and understand our method in some toy examples, and then apply F_sum and F_max to more difficult deep learning applications: text-to-image (Liu et al., 2021; Ramesh et al., 2021), text-to-mesh (Michel et al., 2021), molecular conformation generation (Shi et al., 2021), and neural network ensembles. In all these cases, we verify and confirm that our method can serve as a plug-in module and can obtain 1) visually more diverse examples, and 2) a better trade-off between main loss (e.g.…”
Section: Methods (citation type: mentioning)
confidence: 99%
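The "plug-in module" usage described in that excerpt can be illustrated generically: a diversity term computed over a batch of generated samples is added to the main task loss with a trade-off weight. The mean pairwise-distance term below is a stand-in sketch under that assumption; it is not the specific F_sum / F_max construction from the citing paper, whose exact definitions are not reproduced here.

```python
import torch

def pairwise_diversity(samples: torch.Tensor) -> torch.Tensor:
    """Mean pairwise L2 distance over a batch of samples with shape (B, D).
    A generic stand-in for a diversity objective, not the cited F_sum / F_max terms."""
    dists = torch.cdist(samples, samples, p=2)  # (B, B), zero on the diagonal
    b = samples.shape[0]
    return dists.sum() / (b * (b - 1))          # average over off-diagonal pairs

def regularized_loss(main_loss: torch.Tensor, samples: torch.Tensor, weight: float = 0.1) -> torch.Tensor:
    """Plug-in usage: subtracting a weighted diversity term trades the main
    objective against sample diversity. `weight` is a hypothetical hyperparameter."""
    return main_loss - weight * pairwise_diversity(samples)
```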
“…Closer to our method are works that utilize the richness of CLIP outside the imagery domain. In the 3D domain, CLIP's latent space provides a useful objective that enables semantic manipulation [Sanghi et al. 2021; Michel et al. 2021; Wang et al. 2021a], where the domain gap is closed by neural rendering. CLIP has even been adopted in temporal domains [Guzhov et al. 2021; Luo et al. 2021; Fang et al. 2021] that utilize large datasets of video sequences paired with text and audio.…”
Section: CLIP-Aided Methods (citation type: mentioning)
confidence: 99%
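The way CLIP supplies a 3D objective, with the domain gap bridged by rendering, can be sketched as follows: render the asset to images with a differentiable renderer, embed the renders and the text prompt with CLIP, and minimize their negative cosine similarity. This is a minimal sketch assuming the OpenAI clip package; the renderer, image resolution, and input normalization are placeholders rather than any specific paper's pipeline.

```python
import torch
import clip  # OpenAI CLIP (pip install git+https://github.com/openai/CLIP.git)

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)

def clip_guidance_loss(rendered_views: torch.Tensor, prompt: str) -> torch.Tensor:
    """Negative cosine similarity between rendered views and a text prompt.
    `rendered_views` is assumed to be a (B, 3, 224, 224) batch produced by a
    differentiable renderer, already normalized with CLIP's image statistics."""
    image_emb = model.encode_image(rendered_views)                    # (B, 512)
    text_emb = model.encode_text(clip.tokenize([prompt]).to(device))  # (1, 512)
    image_emb = image_emb / image_emb.norm(dim=-1, keepdim=True)
    text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)
    return -(image_emb @ text_emb.T).mean()  # more negative = better aligned with the text
```

Because the renderer is differentiable, gradients of this loss flow back into the 3D parameters (e.g. the style field sketched above), which is what closes the image/3D domain gap mentioned in the excerpt.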
“…Another work generates shapes from text prompts [15], but it requires training an encoder and decoder on a set of predefined meshes, which limits generalizability, and it also uses a voxel representation, which lacks textures. [7,10] focus on stylization of pre-defined object meshes with text descriptions, while we tackle the problem of generating the entire shape and texture from a detailed natural language description. Concurrent to our work, [6] proposed zero-shot text-guided generation using a NeRF model [11].…”
Section: Related Work (citation type: mentioning)
confidence: 99%