ISS: Image as Stepping Stone for Text-Guided 3D Shape Generation

Liu, Zhengzhe; Dai, Pengcheng; Li, Ruihui; Qi, Xiaojuan; Fu, Chi‐Wing

doi:10.48550/arxiv.2209.04145

Cited by 2 publications

(2 citation statements)

References 9 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…FID utilises Frechet distance between the two image distributions, and assumes that the two are Gaussian. This is applied to 2D assets [7], [64], [105], [106], [107], [108], [109], 3D assets via 3D classifiers [37], [110] or rasterisation to 2D form [111], [112].The Frechet point cloud distance extends FID for applications in assessing the similarity of pointbased 3D shapes [113]. This has been used to evaluate many deep-learning based 3D point-cloud [71], [113], [114] and mesh [67] generators.…”

Section: Perceptual Similarity Metricsmentioning

confidence: 99%

Evaluation Metrics for Intelligent Generation of Graphical Game Assets: A Systematic Survey-Based Framework

Fukaya,

Daylamani-Zad,

Agius

2024

IEEE Trans. Pattern Anal. Mach. Intell.

View full text Add to dashboard Cite

Generative systems for graphical assets have the potential to provide users with high quality assets at the push of a button. However, there are many forms of assets, and many approaches for producing them. Quantitative evaluation of these methods is necessary if practitioners wish to validate or compare their implementations. Furthermore, providing benchmarks for new methods to strive for or surpass. While most methods are validated using tried-and-tested metrics within their own domains, there is no unified method of finding the most appropriate. We present a framework based on a literature pool of close to 200 papers, that provides guidance in selecting metrics to evaluate the validity and quality of artefacts produced, and the operational capabilities of the method.

show abstract

Section: Perceptual Similarity Metricsmentioning

confidence: 99%

Evaluation Metrics for Intelligent Generation of Graphical Game Assets: A Systematic Survey-Based Framework

Fukaya,

Daylamani-Zad,

Agius

2024

IEEE Trans. Pattern Anal. Mach. Intell.

View full text Add to dashboard Cite

show abstract

“…Some notable examples of such methods include CLIP-NeRF [58], PureCLIPNeRF [24], and Dream-Fields [19]. Additionally, recent studies have explored the fusion of CLIP with other algorithms, such as ISS [27] with SVR [37], CLIP-Forge [50] using a normalizing flow network [10], and AvatarCLIP [15] leveraging SMLP [30]. Furthermore, the diffusion model [48] has recently demonstrated impressive results in text-to-image generation, leading to its integration into the text-to-3D generation process.…”

Section: Text-to-3d Manipulation/generationmentioning

confidence: 99%

X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance

Ma¹,

Zhang²,

Sun³

et al. 2023

Preprint

View full text Add to dashboard Cite

Text-driven 3D stylization is a complex and crucial task in the fields of computer vision (CV) and computer graphics (CG), aimed at transforming a bare mesh to fit a target text. Prior methods adopt text-independent multilayer perceptrons (MLPs) to predict the attributes of the target mesh with the supervision of CLIP loss. However, such text-independent architecture lacks textual guidance during predicting attributes, thus leading to unsatisfactory stylization and slow convergence. To address these limitations, we present X-Mesh, an innovative text-driven 3D stylization framework that incorporates a novel Text-guided Dynamic Attention Module (TDAM). The TDAM dynamically integrates the guidance of the target text by utilizing textrelevant spatial and channel-wise attentions during vertex feature extraction, resulting in more accurate attribute prediction and faster convergence speed. Furthermore, existing works lack standard benchmarks and automated metrics for evaluation, often relying on subjective and nonreproducible user studies to assess the quality of stylized 3D assets. To overcome this limitation, we introduce a new standard text-mesh benchmark, namely MIT-30, and two automated metrics, which will enable future research to achieve fair and objective comparisons. Our extensive qualitative and quantitative experiments demonstrate that X-Mesh outperforms previous state-of-the-art methods. Our codes and results are available at our project webpage: https://xmu-xiaoma666.github.io/ Projects/X-Mesh/ * Corresponding author; ‡ Equal contributions. Neural Style NetworkSteve Jobs in a red sweater, blue jeans, brown leather shoes and colorful gloves . X-MeshSteve Jobs in a red sweater, blue jeans, brown leather shoes and colorful gloves .

show abstract

ISS: Image as Stepping Stone for Text-Guided 3D Shape Generation

Cited by 2 publications

References 9 publications

Evaluation Metrics for Intelligent Generation of Graphical Game Assets: A Systematic Survey-Based Framework

Evaluation Metrics for Intelligent Generation of Graphical Game Assets: A Systematic Survey-Based Framework

X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance

Contact Info

Product

Resources

About