Relightable 3D Faces from a Single Image via Diffusion Models
This paper presents an approach that uses diffusion models as a prior for highly accurate 3D facial BRDF reconstruction from a single image, yielding superior performance in both texture completion and reflectance reconstruction.
Can be used for face recognition in businesses, improving accuracy in low-light environments.
StarCoder: may the source be with you!
This paper introduces StarCoder and StarCoderBase, 15.5B-parameter models with an 8K context length, capable of infilling and of fast large-batch inference enabled by multi-query attention. StarCoder outperforms other open models on multiple programming languages, achieving 40% pass@1 on HumanEval, and is publicly available.
Can be used for code completion and generation in businesses, improving efficiency and accuracy in software development.
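The fast large-batch inference mentioned above comes from multi-query attention, in which every head keeps its own query projection but all heads share a single key and a single value projection, shrinking the KV cache by a factor of the head count. A minimal NumPy sketch of the idea (illustrative only, not StarCoder's actual implementation; the function and weight names are hypothetical):

```python
import numpy as np

def multi_query_attention(x, Wq, Wk, Wv, num_heads):
    # x: (seq, d_model). Each head gets its own query projection, but all
    # heads share ONE key head and ONE value head -- this shrinks the KV
    # cache num_heads-fold, which is what speeds up large-batch decoding.
    seq, d_model = x.shape
    head_dim = d_model // num_heads
    q = (x @ Wq).reshape(seq, num_heads, head_dim)  # per-head queries
    k = x @ Wk                                      # (seq, head_dim), shared
    v = x @ Wv                                      # (seq, head_dim), shared
    scores = np.einsum("qhd,kd->hqk", q, k) / np.sqrt(head_dim)
    # Causal mask: each position may only attend to itself and the past.
    mask = np.triu(np.ones((seq, seq), dtype=bool), k=1)
    scores = np.where(mask, -1e9, scores)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    out = np.einsum("hqk,kd->qhd", weights, v)      # shared v for all heads
    return out.reshape(seq, d_model)
```

In standard multi-head attention, k and v would each be (seq, num_heads, head_dim); sharing them across heads trades a small amount of model quality for a much smaller memory footprint during inference.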
Sketching the Future (STF): Applying Conditional Control Techniques to Text-to-Video Models
This paper proposes a novel approach that combines zero-shot text-to-video generation with ControlNet to improve the output of these models. Experiments demonstrate that the method excels at producing high-quality, consistent video content that aligns with the user's intended motion for the subject in the video.
Can be used for video generation in businesses, improving efficiency and creativity in video production.
HumanRF: High-Fidelity Neural Radiance Fields for Humans in Motion
HumanRF is a 4D dynamic neural scene representation that captures full-body appearance in motion from multi-view video input and enables playback from novel, unseen viewpoints. It effectively leverages 12MP footage from 160 cameras across 16 sequences with high-fidelity, per-frame mesh reconstructions.
HumanRF can improve the quality of human representation in applications such as film production, computer games, and videoconferencing, by capturing fine detail at high compression rates and representing high-resolution detail even under challenging motion.
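HumanRF builds on neural radiance fields, whose core rendering step composites per-sample colors along each camera ray, weighted by accumulated transmittance. A minimal NumPy sketch of that generic volume-rendering equation (the standard NeRF formulation, not HumanRF's specific 4D representation; the function name is hypothetical):

```python
import numpy as np

def composite_along_ray(densities, colors, deltas):
    # Standard NeRF volume rendering for one ray:
    #   densities: (n,) volume density sigma_i at each sample
    #   colors:    (n, 3) RGB color at each sample
    #   deltas:    (n,) distance between adjacent samples
    alphas = 1.0 - np.exp(-densities * deltas)    # opacity of each segment
    # Transmittance T_i: probability the ray reaches sample i unoccluded.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = trans * alphas                      # contribution of each sample
    return (weights[:, None] * colors).sum(axis=0)  # final RGB for the ray
```

A dense sample in front of the camera dominates the weights and occludes everything behind it, which is how the representation yields consistent renderings from novel viewpoints.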
Generative AI meets 3D: A Survey on Text-to-3D in AIGC Era
This paper conducts a comprehensive survey on text-to-3D, a newly emerging research field that combines text-to-image and 3D modeling technologies. It introduces 3D data representations and various foundation technologies, summarizes how recent works combine them to achieve satisfactory text-to-3D generation, and reviews applications of text-to-3D technology.
Text-to-3D technology can enable interaction between human instruction and AI-generated content (AIGC), with practical applications such as avatar generation, texture generation, shape transformation, and scene generation.