OpenAGI: When LLM Meets Domain Experts
OpenAGI is an open-source AGI research platform that uses Large Language Models (LLMs) to select, synthesize, and execute various domain-specific expert models to solve complex tasks. It also proposes a Reinforcement Learning from Task Feedback (RLTF) mechanism that uses task-solving results as feedback to improve the LLM's task-solving ability, enabling a feedback loop for self-improving AI.
Businesses can use OpenAGI to build AI systems that solve complex tasks by orchestrating multiple domain-specific expert models, and can apply RLTF to improve those systems' performance over time.
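A minimal sketch of this orchestration pattern is shown below, assuming a planner that maps a task description to an ordered list of expert models; the expert names, the `llm_plan` stub, and the pipeline interface are hypothetical illustrations rather than the actual OpenAGI API.

```python
# Sketch of the OpenAGI idea: an LLM plans a chain of domain-expert models,
# a dispatcher executes the chain, and a task-feedback score could be fed
# back (RLTF) to improve future plans. All names below are hypothetical.

from typing import Callable, Dict, List

# Hypothetical expert models keyed by name; real ones would be vision/NLP models.
EXPERTS: Dict[str, Callable[[str], str]] = {
    "translate_de_en": lambda x: f"[English translation of: {x}]",
    "summarize":       lambda x: f"[summary of: {x}]",
    "sentiment":       lambda x: f"[sentiment of: {x}]",
}

def llm_plan(task: str) -> List[str]:
    """Stand-in for the LLM planner: returns an ordered list of expert names.
    In OpenAGI this plan would come from prompting the LLM with the task."""
    if "German review" in task:
        return ["translate_de_en", "summarize", "sentiment"]
    return ["summarize"]

def execute(task: str, payload: str) -> str:
    """Run the planned experts in sequence, piping each output to the next."""
    result = payload
    for name in llm_plan(task):
        result = EXPERTS[name](result)
    return result

if __name__ == "__main__":
    out = execute("Analyze this German review", "Das Produkt ist großartig.")
    print(out)
    # A feedback score on `out` (from a metric or a human) would serve as the
    # reward signal in RLTF, used to improve the planner LLM over time.
```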
Slide-Transformer: Hierarchical Vision Transformer with Local Self-Attention
Slide-Transformer is built around a local attention module that enables local feature learning and dynamic feature selection using common convolution operations. The module is an efficient and flexible implementation of the local attention paradigm and is compatible with various hardware devices. Integrated into advanced Vision Transformer models, it achieves consistently improved performance on comprehensive benchmarks.
Businesses can use Slide-Transformer to build Vision Transformer models with efficient, flexible local feature learning and dynamic feature selection.
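To make the local attention idea concrete, the sketch below computes single-head attention in which each spatial position attends only to its k x k neighborhood, using a plain unfold-based gather. The paper's contribution is a more efficient convolution-style reformulation of this pattern, which is not reproduced here; the shapes and the omission of query/key/value projections are simplifying assumptions.

```python
# Windowed local self-attention over a feature map: each position attends to
# its k x k neighborhood. Illustrative only; not the paper's optimized kernel.

import torch
import torch.nn.functional as F

def local_attention(x: torch.Tensor, k: int = 3) -> torch.Tensor:
    """x: (B, C, H, W) feature map; returns (B, C, H, W) after local attention."""
    B, C, H, W = x.shape
    pad = k // 2
    # Queries: one feature vector per spatial position (no projection here).
    q = x.view(B, C, 1, H * W)                          # (B, C, 1, H*W)
    # Keys/values: each position's k*k neighborhood, gathered via unfold.
    kv = F.unfold(x, kernel_size=k, padding=pad)        # (B, C*k*k, H*W)
    kv = kv.view(B, C, k * k, H * W)                    # (B, C, k*k, H*W)
    # Scaled dot products between each query and its k*k neighbors.
    attn = (q * kv).sum(dim=1, keepdim=True) / C ** 0.5  # (B, 1, k*k, H*W)
    attn = attn.softmax(dim=2)
    out = (attn * kv).sum(dim=2)                        # weighted sum of local values
    return out.view(B, C, H, W)

if __name__ == "__main__":
    feat = torch.randn(2, 16, 8, 8)
    print(local_attention(feat).shape)  # torch.Size([2, 16, 8, 8])
```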
Inference with Reference: Lossless Acceleration of Large Language Models
LLMA is an inference accelerator that losslessly speeds up Large Language Model (LLM) decoding with the help of a reference text: candidate tokens are copied from the reference and checked against the model's own predictions in parallel, so the output is identical to standard greedy decoding. It achieves over 2x speed-up in practical generation scenarios where outputs overlap significantly with an in-context reference (e.g., search engines and multi-turn conversations).
Businesses can use LLMA to speed up LLM inference whenever outputs overlap significantly with an available reference, such as in search engines and multi-turn conversations.
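The sketch below illustrates the copy-and-verify idea in simplified form, assuming a `greedy_next` callable that returns the model's greedy next token. The trigger length, copy length, and sequential verification loop are illustrative simplifications (LLMA verifies the copied tokens in parallel within a single decoding step); this is not the paper's exact procedure.

```python
# Reference-guided decoding sketch: when the recent output matches a span in
# the reference, copy the next few reference tokens and keep only those that
# agree with the model's own greedy choices, preserving greedy-decoding output.

from typing import Callable, List

def decode_with_reference(
    prompt: List[str],
    reference: List[str],
    greedy_next: Callable[[List[str]], str],  # returns the model's greedy next token
    max_new: int = 20,
    match_len: int = 2,   # trigger copying when the last 2 tokens match the reference
    copy_len: int = 4,    # number of reference tokens to copy and verify at once
) -> List[str]:
    out = list(prompt)
    while len(out) - len(prompt) < max_new:
        tail = out[-match_len:]
        span = None
        for i in range(len(reference) - match_len):
            if reference[i:i + match_len] == tail:
                span = reference[i + match_len:i + match_len + copy_len]
                break
        if span:
            # Verify copied tokens; in LLMA this check runs in parallel within
            # one decoding step, which is where the speed-up comes from.
            for tok in span:
                if greedy_next(out) == tok:
                    out.append(tok)
                else:
                    break
            else:
                continue
        out.append(greedy_next(out))  # fall back to normal greedy decoding
    return out
```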
Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition
This work proposes POMP, a prompt pre-training method for vision-language models. Once pre-trained, the prompt can be directly plugged into a variety of visual recognition tasks to boost recognition performance in a zero-shot manner.
POMP improves recognition performance on 21 downstream datasets spanning image classification, semantic segmentation, and object detection.
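As a rough illustration of how such a learned prompt is consumed at inference time, the sketch below performs CLIP-style zero-shot classification, where learned prompt vectors are combined with each class name and compared to an image embedding; the encoder stubs and shapes are placeholder assumptions, not POMP's actual code.

```python
# Zero-shot classification with a pre-trained prompt: encode (prompt + class
# name) for every class, then pick the class whose text embedding is closest
# to the image embedding. Encoders below are random stand-ins.

import torch

def zero_shot_classify(image_feat, class_names, text_encoder, prompt_vectors):
    """image_feat: (D,) image embedding; prompt_vectors: learned prompt tokens."""
    class_feats = torch.stack(
        [text_encoder(prompt_vectors, name) for name in class_names])  # (N, D)
    sims = torch.nn.functional.cosine_similarity(
        image_feat.unsqueeze(0), class_feats, dim=-1)                  # (N,)
    return class_names[int(sims.argmax())]

if __name__ == "__main__":
    D = 512
    prompt_vectors = torch.randn(16, D)                  # learned prompt context
    text_encoder = lambda ctx, name: torch.randn(D)      # stand-in text encoder
    image_feat = torch.randn(D)                          # stand-in image encoder output
    print(zero_shot_classify(image_feat, ["cat", "dog", "truck"],
                             text_encoder, prompt_vectors))
```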
WebBrain: Learning to Generate Factually Correct Articles for Queries by Grounding on Large Web Corpus
This paper introduces a new NLP task called WebBrain, which generates short factual articles for queries by mining supporting evidence from the web. The authors construct a large-scale dataset, WebBrain-Raw, by extracting English Wikipedia articles and their references.
ReGen, the generation framework proposed for this task, outperforms all baselines in both automatic and human evaluations, with improved factual correctness.
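A schematic retrieve-then-generate pipeline for this task setup is sketched below, assuming hypothetical `search_corpus` and `generate` callables; it illustrates grounding generation on retrieved evidence in general and is not ReGen's actual architecture.

```python
# WebBrain-style task setup: retrieve supporting passages for a query, then
# generate an article conditioned on that evidence. All callables are stand-ins.

from typing import Callable, List

def write_grounded_article(
    query: str,
    search_corpus: Callable[[str, int], List[str]],  # returns top-k evidence passages
    generate: Callable[[str], str],                  # seq2seq / LLM generator
    k: int = 5,
) -> str:
    evidence = search_corpus(query, k)
    # Condition generation on the query plus retrieved evidence so that the
    # produced statements can be traced back to supporting passages.
    prompt = f"Query: {query}\n" + "\n".join(
        f"[{i + 1}] {p}" for i, p in enumerate(evidence))
    return generate(prompt)
```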