Thu Feb 16 2023 - Top Trending AI Papers

3D-aware Conditional Image Synthesis

Neural radiance fields

Computer vision

Image processing

3D object generation

Image synthesis

Interactive 3D editing

Pix2pix3D synthesizes 3D objects given a 2D label map, such as a segmentation or edge map.

Generates 3D objects and photorealistic images from 2D label maps, giving users explicit 3D control. Also provides an interactive 3D editing demo.

https://www.cs.cmu.edu/~pix2pix3D/

https://github.com/dunbar12138/pix2pix3D

https://arxiv.org/pdf/2302.08509.pdf

https://arxiv.org/abs/2302.08509

https://twitter.com/arankomatsuzaki/status/1626394778131922944/video/1

LEVER: Learning to Verify Language-to-Code Generation with Execution

Code language models

Natural language processing

Programming languages

Language-to-code generation

Program verification

LEVER improves language-to-code generation by learning to verify the generated programs with their execution results.

Improves language-to-code generation by combining CodeLM decoding with verifiers trained to judge whether a generated program is correct, given the natural language input, the program itself, and its execution results.
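A minimal sketch of how generation and verification scores could be combined to rerank sampled programs. It is an illustration under stated assumptions, not the authors' implementation: the function name `rerank`, the candidate field names, and the choice to sum scores over programs that share an execution result are all assumed for this example.

```python
import math
from collections import defaultdict

def rerank(candidates):
    """Pick a program from sampled candidates (hypothetical helper).

    Each candidate is a dict with assumed keys:
      'program'       - generated source text
      'lm_logprob'    - log-probability assigned by the code LM
      'verifier_prob' - verifier's estimated probability of correctness
      'result'        - execution result, as a hashable value (e.g. a string)
    """
    def joint(c):
        # Joint score: generation probability times verifier probability.
        return math.exp(c["lm_logprob"]) * c["verifier_prob"]

    # Assumption: programs with the same execution result pool their scores.
    totals = defaultdict(float)
    for c in candidates:
        totals[c["result"]] += joint(c)

    best_result = max(totals, key=totals.get)
    group = [c for c in candidates if c["result"] == best_result]
    return max(group, key=joint)["program"], dict(totals)
```

With this scoring, a program that the language model ranks first can still lose to a lower-ranked program whose execution result the verifier trusts more.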

https://arxiv.org/pdf/2302.08468.pdf

https://arxiv.org/abs/2302.08468

https://twitter.com/arankomatsuzaki/status/1626398888180736000/photo/1

Efficiency 360: Efficient Vision Transformers

Transformers

Computer vision

Machine learning

Vision transformer models

Image classification

Efficiency

Efficiency 360 compares various vision transformer models based on their performance, number of parameters, and number of floating point operations on multiple datasets.

Introduces the Efficiency 360 framework, which aims to make vision transformers more efficient for industrial applications.
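To make the two cost metrics concrete, here is a rough sketch that counts parameters and multiply-accumulate operations (MACs) for a plain transformer encoder stack. It is not taken from the paper, whose counting conventions may differ. The function names are hypothetical, and the formulas assume a standard encoder layer (four attention projections with biases, a two-layer MLP, and two layer norms), ignoring patch embedding, position embeddings, and the classification head.

```python
def encoder_params(dim, mlp_dim, depth):
    """Parameter count of `depth` standard encoder layers (assumed layout)."""
    attention = 4 * dim * dim + 4 * dim          # Q, K, V, output projections + biases
    mlp = 2 * dim * mlp_dim + mlp_dim + dim      # two linear layers + biases
    layer_norms = 2 * (2 * dim)                  # scale and shift for two layer norms
    return depth * (attention + mlp + layer_norms)

def encoder_macs(tokens, dim, mlp_dim, depth):
    """Approximate multiply-accumulates for one forward pass over `tokens` tokens."""
    linear = tokens * (4 * dim * dim + 2 * dim * mlp_dim)  # all linear layers
    attention = 2 * tokens * tokens * dim                  # QK^T and attention-weighted V
    return depth * (linear + attention)

# ViT-Base/16 style configuration at 224x224 input: 196 patches + 1 class token.
params = encoder_params(dim=768, mlp_dim=3072, depth=12)
macs = encoder_macs(tokens=197, dim=768, mlp_dim=3072, depth=12)
print(f"{params / 1e6:.1f}M parameters, {macs / 1e9:.1f}G MACs")
```

Note that the attention term grows with the square of the token count, which is why token reduction is a common target for efficiency work.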

https://arxiv.org/pdf/2302.08374.pdf

https://arxiv.org/abs/2302.08374

https://twitter.com/arankomatsuzaki/status/1626396145357570048/photo/1

Text-driven Visual Synthesis with Latent Diffusion Prior

Diffusion models

Image processing

Machine learning

Text-to-3D

StyleGAN adaptation

Layered image editing

Presents a generic approach using latent diffusion models as powerful image priors for various visual synthesis tasks, including text-to-3D.

Applies latent diffusion models as image priors to text-driven visual synthesis tasks such as text-to-3D, StyleGAN adaptation, and layered image editing. This could be useful for businesses looking to improve their image editing and customized generation processes.

https://latent-diffusion-prior.github.io/

https://arxiv.org/pdf/2302.08510.pdf

https://arxiv.org/abs/2302.08510

https://twitter.com/arankomatsuzaki/status/1626396909677199362/video/1

Shared Microexponents: A Little Shifting Goes a Long Way

Block Data Representations

Machine learning

Large-scale generative pretraining

Inference

Production-scale recommendation systems

Presents Block Data Representations, a framework for exploring and evaluating a wide spectrum of narrow-precision formats for deep learning.

Introduces a framework for comparing narrow-precision quantization formats for deep learning and identifies a new format, shared microexponents, that the authors report outperforms other state-of-the-art quantization approaches. This could be useful for businesses looking to optimize their machine learning models and improve their recommendation systems.
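For readers unfamiliar with block-based number formats, the sketch below shows the general idea with a simple single-level block floating point scheme: every value in a block shares one power-of-two exponent and keeps only a small integer mantissa. This is a generic illustration, not the shared microexponents format from the paper. The function name `quantize_block_fp` and the defaults for block size and mantissa width are assumptions chosen for the example.

```python
import numpy as np

def quantize_block_fp(x, block_size=16, mantissa_bits=8):
    """Quantize and dequantize a 1-D array with one shared exponent per block."""
    x = np.asarray(x, dtype=np.float64)
    if x.size % block_size != 0:
        raise ValueError("array length must be a multiple of block_size")
    blocks = x.reshape(-1, block_size)

    # Shared exponent: the power of two strictly above the block's largest magnitude.
    max_abs = np.max(np.abs(blocks), axis=1, keepdims=True)
    safe = np.where(max_abs > 0, max_abs, 1.0)   # avoid log2(0) for all-zero blocks
    shared_exp = np.floor(np.log2(safe)) + 1

    # Signed integer mantissas in [-2^(m-1), 2^(m-1) - 1], scaled by the shared exponent.
    q_max = 2 ** (mantissa_bits - 1) - 1
    scale = 2.0 ** shared_exp / 2 ** (mantissa_bits - 1)
    q = np.clip(np.round(blocks / scale), -q_max - 1, q_max)

    return (q * scale).reshape(x.shape), shared_exp.ravel().astype(int)

values = np.array([1.0, 0.5, 0.25, 0.0, 100.0, 0.3, 0.0, -3.0])
dequantized, exponents = quantize_block_fp(values, block_size=4)
print(dequantized, exponents)
```

In the second block the value 0.3 is rounded to zero because it shares an exponent with 100.0, which shows why the choice of block size and scaling granularity matters for narrow-precision formats.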

https://arxiv.org/pdf/2302.08007.pdf

https://arxiv.org/abs/2302.08007

https://twitter.com/arankomatsuzaki/status/1626400460801466369/photo/1