PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing
Presents a sparse language model with 1.085 trillion parameters, trained on 329B tokens by extending the dense PanGu-α backbone with Random Routed Experts (RRE).
Improves natural language understanding, generation, and reasoning, leading to stronger performance on a range of downstream NLP tasks.
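The routing in RRE is fixed rather than learned, so each token activates only a small subset of the experts. Below is a minimal sketch of this style of sparse expert layer, assuming routing by a hash of the token id; the module name and hyperparameters are illustrative, not the paper's implementation.

```python
# Minimal sketch of sparse expert routing in the spirit of Random Routed
# Experts (RRE): each token is dispatched to one expert via a fixed
# (non-learned) function of its token id, so only a fraction of the
# parameters is active per token. Illustrative only, not the paper's code.
import torch
import torch.nn as nn

class RandomRoutedExperts(nn.Module):
    def __init__(self, d_model: int, n_experts: int):
        super().__init__()
        self.n_experts = n_experts
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor, token_ids: torch.Tensor) -> torch.Tensor:
        # x: (n_tokens, d_model); token_ids: (n_tokens,)
        # Fixed routing: expert index derived from the token id, no router net.
        route = token_ids % self.n_experts
        out = torch.empty_like(x)
        for e in range(self.n_experts):
            mask = route == e
            if mask.any():
                out[mask] = self.experts[e](x[mask])
        return out

layer = RandomRoutedExperts(d_model=64, n_experts=8)
tokens = torch.randint(0, 50_000, (16,))
hidden = torch.randn(16, 64)
print(layer(hidden, tokens).shape)  # torch.Size([16, 64])
```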
A Comprehensive Capability Analysis of GPT-3 and GPT-3.5 Series Models
Evaluates six representative GPT models on a broad set of natural language understanding (NLU) tasks, comparing their capabilities and their evolution over time.
Provides insights into the strengths, weaknesses, and limitations of the GPT series across NLP tasks and scenarios, and highlights the need for further improvement in areas such as model robustness.
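For context, a cross-model comparison of this kind reduces to scoring each model on the same labeled examples. The sketch below shows one way such a harness can be structured; the `query_model` callable and model names are placeholders, not the paper's evaluation code.

```python
# Minimal sketch of a multi-model NLU comparison: compute each model's
# accuracy over shared (prompt, gold_label) pairs. `query_model` is a
# hypothetical stand-in for whatever API call returns a model's answer.
from typing import Callable

def evaluate(models: list[str],
             examples: list[tuple[str, str]],
             query_model: Callable[[str, str], str]) -> dict[str, float]:
    """Return per-model accuracy over (prompt, gold_label) pairs."""
    scores = {}
    for model in models:
        correct = sum(query_model(model, prompt).strip() == gold
                      for prompt, gold in examples)
        scores[model] = correct / len(examples)
    return scores

# Toy usage with a dummy model that always answers "positive".
dummy = lambda model, prompt: "positive"
examples = [("Review: great film. Sentiment?", "positive"),
            ("Review: dull plot. Sentiment?", "negative")]
print(evaluate(["model-a", "model-b"], examples, dummy))
```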
SVDiff: Compact Parameter Space for Diffusion Fine-Tuning
Proposes a novel approach to personalizing text-to-image diffusion models by fine-tuning only the singular values of the pretrained weight matrices, yielding a significantly smaller set of trainable parameters.
Enables efficient personalization of text-to-image diffusion models with reduced risk of overfitting and language drift, and improves the quality of multi-subject image generation and text-based image editing.
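The compact parameter space comes from decomposing each weight matrix once and then training only a small shift on its singular values (the paper's "spectral shifts"), while the singular vectors stay frozen. A minimal sketch of that idea for a single linear layer follows; the class and variable names are illustrative, not the authors' code.

```python
# Minimal sketch of SVDiff-style fine-tuning: W = U diag(s) V^T is
# computed once, U, s, and V^T are frozen, and only a small shift
# `delta` on the singular values is trained. Illustrative only.
import torch
import torch.nn as nn

class SpectralShiftLinear(nn.Module):
    def __init__(self, weight: torch.Tensor):
        super().__init__()
        U, s, Vh = torch.linalg.svd(weight, full_matrices=False)
        self.register_buffer("U", U)    # frozen left singular vectors
        self.register_buffer("s", s)    # frozen base singular values
        self.register_buffer("Vh", Vh)  # frozen right singular vectors
        self.delta = nn.Parameter(torch.zeros_like(s))  # trainable shift

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Keep shifted singular values non-negative before reassembly.
        shifted = torch.relu(self.s + self.delta)
        W = self.U @ torch.diag(shifted) @ self.Vh
        return x @ W.T

layer = SpectralShiftLinear(torch.randn(32, 16))
# Only the 16 singular-value shifts are trainable, not the 32*16 weights.
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # 16
```

Freezing the singular vectors is what keeps the personalization footprint tiny: the trainable state per layer is a single vector the length of the matrix's smaller dimension.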