Mon Oct 17 2022 - Top Trending AI Papers

Imagic: Text-Based Real Image Editing with Diffusion Models

Text-to-image diffusion models

Computer Vision, Natural Language Processing

Image and video editing for businesses

Demonstrates, for the very first time, the ability to apply complex (e.g., non-rigid) text-guided semantic edits to a single real image using Imagen.

Lets businesses apply complex, text-guided semantic edits to real images within a single unified framework that needs only one input image and a target text, with no additional inputs such as image masks or extra views of the object.

https://arxiv.org/pdf/2210.09276.pdf

https://arxiv.org/abs/2210.09276

https://twitter.com/arankomatsuzaki/status/1582174859500847105/photo/1

Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them

Language Models, Chain-of-Thought Prompting

Natural Language Processing

Natural language processing for businesses

PaLM and Codex with chain-of-thought (CoT) prompting surpass the average human-rater performance on 10 and 17 of the 23 challenging BIG-Bench tasks, respectively.

Provides insights into how language models perform on challenging tasks, and how chain-of-thought (CoT) prompting can improve model performance, thereby offering businesses improved natural language processing capabilities.

https://arxiv.org/pdf/2210.09261.pdf

https://arxiv.org/abs/2210.09261

https://twitter.com/arankomatsuzaki/status/1582174051623059457/photo/1
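
As a rough illustration of the chain-of-thought prompting discussed in the BIG-Bench entry above, the sketch below assembles a few-shot prompt in two variants: answer-only, and with worked reasoning before each answer. The function name build_prompt, the "Q:/A:" layout, the "So the answer is" phrasing, and the arithmetic exemplar are all assumptions made for illustration, not the prompts used in the paper.

```python
from typing import List, Tuple

# Each exemplar is (question, reasoning, answer).
# The contents below are invented for illustration only.
Exemplar = Tuple[str, str, str]


def build_prompt(exemplars: List[Exemplar], question: str, use_cot: bool) -> str:
    """Assemble a few-shot prompt, with or without worked reasoning."""
    blocks = []
    for q, reasoning, answer in exemplars:
        if use_cot:
            # Chain-of-thought variant: show the reasoning, then the answer.
            blocks.append(f"Q: {q}\nA: {reasoning} So the answer is {answer}.")
        else:
            # Answer-only variant: show just the final answer.
            blocks.append(f"Q: {q}\nA: {answer}")
    # The new question is left open for the model to complete.
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


exemplars = [
    ("Is 17 + 5 even?", "17 + 5 = 22, and 22 is divisible by 2.", "yes"),
]
answer_only = build_prompt(exemplars, "Is 9 + 4 even?", use_cot=False)
with_cot = build_prompt(exemplars, "Is 9 + 4 even?", use_cot=True)
print(with_cot)
```

The only difference between the two variants is whether the exemplars demonstrate intermediate reasoning; the model call itself is left out because it depends on the provider.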

LAION-5B: An open large-scale dataset for training next generation image-text models

Large-scale language-vision models, CLIP-filtered image-text pairs

Computer Vision, Natural Language Processing

Training and improving language-vision models for businesses

The arXiv paper for LAION-5B. Shows successful replication and fine-tuning of foundational models such as CLIP, GLIDE, and Stable Diffusion using the dataset.

Offers businesses access to an open dataset of 5.85 billion CLIP-filtered image-text pairs for training and improving language-vision models, without the expensive, accurate labels that standard unimodal supervised vision learning depends on.

https://laion.ai/blog/laion-5b/

https://arxiv.org/pdf/2210.08402.pdf

https://arxiv.org/abs/2210.08402

https://twitter.com/arankomatsuzaki/status/1582178481554616320/photo/1
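
To make "CLIP-filtered" in the LAION-5B entry above more concrete, here is a minimal sketch of similarity-based filtering: an image-text pair is kept only if the cosine similarity between its image embedding and its text embedding reaches a threshold. The two-dimensional vectors stand in for real CLIP embeddings, and the function names and the 0.3 threshold are assumptions chosen for illustration, not the dataset's actual pipeline or settings.

```python
import math
from typing import List, Sequence, Tuple

# (caption, image_embedding, text_embedding); toy vectors, not real CLIP outputs.
Pair = Tuple[str, Sequence[float], Sequence[float]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_pairs(pairs: List[Pair], threshold: float) -> List[str]:
    """Return captions of pairs whose image and text embeddings agree enough."""
    return [
        caption
        for caption, image_emb, text_emb in pairs
        if cosine_similarity(image_emb, text_emb) >= threshold
    ]


pairs = [
    ("a photo of a dog", [1.0, 0.0], [0.9, 0.1]),   # image and text point the same way
    ("click here to buy", [1.0, 0.0], [0.0, 1.0]),  # image and text are unrelated
]
kept = filter_pairs(pairs, threshold=0.3)  # threshold is illustrative
print(kept)
```

In a real pipeline the embeddings would come from a CLIP image encoder and text encoder, and the threshold would be tuned on inspected samples.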