Mon Mar 20 2023

PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing

Large Language Models
Natural Language Processing (NLP)
Artificial Intelligence (AI) Computing
Open-domain dialogue
Question answering
Machine translation
Code generation

Presents a sparse language model with 1T parameters, trained on 329B tokens.

Enables better natural language understanding, generation, and reasoning, leading to improved performance in various downstream NLP tasks.

A Comprehensive Capability Analysis of GPT-3 and GPT-3.5 Series Models

Language Models
Natural Language Processing (NLP)
Artificial Intelligence (AI) Computing
Natural language understanding (NLU) tasks

Evaluates six representative GPT-3 and GPT-3.5 series models on a range of NLU tasks, comparing their capabilities and how they evolved over time.

Provides insights into the strengths, weaknesses, and limitations of GPT series models for different NLP tasks and scenarios, and highlights the need for further improvement in areas such as model robustness.

SVDiff: Compact Parameter Space for Diffusion Fine-Tuning

Diffusion Models
Computer Vision
Natural Language Processing (NLP)
Text-to-image generation

Proposes a novel approach to personalizing text-to-image diffusion models that yields a significantly smaller model size.

Enables efficient personalization of text-to-image diffusion models with reduced risk of overfitting and language drift, and improves the quality of multi-subject image generation and text-based image editing.
