PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing
Presents a sparse language model with 1.085 trillion parameters, trained on 329B tokens by extending the dense PanGu-α backbone with Random Routed Experts (RRE).
Improves natural language understanding, generation, and reasoning, leading to stronger performance on a range of downstream NLP tasks.
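The routing in RRE is fixed rather than learned, so each token activates only a small subset of the experts. Below is a minimal sketch of this style of sparse expert layer, assuming routing by a hash of the token id; the module name and hyperparameters are illustrative, not the paper's implementation.

```python
# Minimal sketch of sparse expert routing in the spirit of Random Routed
# Experts (RRE): each token is dispatched to one expert via a fixed
# (non-learned) function of its token id, so only a fraction of the
# parameters is active per token. Illustrative only, not the paper's code.
import torch
import torch.nn as nn

class RandomRoutedExperts(nn.Module):
    def __init__(self, d_model: int, n_experts: int):
        super().__init__()
        self.n_experts = n_experts
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor, token_ids: torch.Tensor) -> torch.Tensor:
        # x: (n_tokens, d_model); token_ids: (n_tokens,)
        # Fixed routing: expert index derived from the token id, no router net.
        route = token_ids % self.n_experts
        out = torch.empty_like(x)
        for e in range(self.n_experts):
            mask = route == e
            if mask.any():
                out[mask] = self.experts[e](x[mask])
        return out

layer = RandomRoutedExperts(d_model=64, n_experts=8)
tokens = torch.randint(0, 50_000, (16,))
hidden = torch.randn(16, 64)
print(layer(hidden, tokens).shape)  # torch.Size([16, 64])
```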
A Comprehensive Capability Analysis of GPT-3 and GPT-3.5 Series Models
Evaluates six representative GPT models on a broad set of natural language understanding (NLU) tasks, comparing their capabilities and their evolution over time.
Provides insights into the strengths, weaknesses, and limitations of the GPT series across NLP tasks and scenarios, and highlights the need for further improvement in areas such as model robustness.
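For context, a cross-model comparison of this kind reduces to scoring each model on the same labeled examples. The sketch below shows one way such a harness can be structured; the `query_model` callable and model names are placeholders, not the paper's evaluation code.

```python
# Minimal sketch of a multi-model NLU comparison: compute each model's
# accuracy over shared (prompt, gold_label) pairs. `query_model` is a
# hypothetical stand-in for whatever API call returns a model's answer.
from typing import Callable

def evaluate(models: list[str],
             examples: list[tuple[str, str]],
             query_model: Callable[[str, str], str]) -> dict[str, float]:
    """Return per-model accuracy over (prompt, gold_label) pairs."""
    scores = {}
    for model in models:
        correct = sum(query_model(model, prompt).strip() == gold
                      for prompt, gold in examples)
        scores[model] = correct / len(examples)
    return scores

# Toy usage with a dummy model that always answers "positive".
dummy = lambda model, prompt: "positive"
examples = [("Review: great film. Sentiment?", "positive"),
            ("Review: dull plot. Sentiment?", "negative")]
print(evaluate(["model-a", "model-b"], examples, dummy))
```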
SVDiff: Compact Parameter Space for Diffusion Fine-Tuning
Proposes a novel approach to personalizing text-to-image diffusion models by fine-tuning only the singular values of the pretrained weight matrices, yielding a significantly smaller set of trainable parameters.
Enables efficient personalization of text-to-image diffusion models with reduced risk of overfitting and language drift, and improves the quality of multi-subject image generation and text-based image editing.
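The compact parameter space comes from decomposing each weight matrix once and then training only a small shift on its singular values (the paper's "spectral shifts"), while the singular vectors stay frozen. A minimal sketch of that idea for a single linear layer follows; the class and variable names are illustrative, not the authors' code.

```python
# Minimal sketch of SVDiff-style fine-tuning: W = U diag(s) V^T is
# computed once, U, s, and V^T are frozen, and only a small shift
# `delta` on the singular values is trained. Illustrative only.
import torch
import torch.nn as nn

class SpectralShiftLinear(nn.Module):
    def __init__(self, weight: torch.Tensor):
        super().__init__()
        U, s, Vh = torch.linalg.svd(weight, full_matrices=False)
        self.register_buffer("U", U)    # frozen left singular vectors
        self.register_buffer("s", s)    # frozen base singular values
        self.register_buffer("Vh", Vh)  # frozen right singular vectors
        self.delta = nn.Parameter(torch.zeros_like(s))  # trainable shift

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Keep shifted singular values non-negative before reassembly.
        shifted = torch.relu(self.s + self.delta)
        W = self.U @ torch.diag(shifted) @ self.Vh
        return x @ W.T

layer = SpectralShiftLinear(torch.randn(32, 16))
# Only the 16 singular-value shifts are trainable, not the 32*16 weights.
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # 16
```

Freezing the singular vectors is what keeps the personalization footprint tiny: the trainable state per layer is a single vector the length of the matrix's smaller dimension.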