We’re Afraid Language Models Aren’t Modeling Ambiguity
This research explores the challenge of managing ambiguity in natural language and presents an annotated benchmark for evaluating how well pretrained language models recognize and disentangle possible meanings. It shows that current language models, including the recent GPT-4, struggle to handle ambiguity, and it emphasizes the importance of ambiguity-sensitive tools for NLP applications.
The benchmark offers a way to measure how well a language model recognizes and disentangles the possible meanings of ambiguous text. The research recommends developing ambiguity-sensitive tools to improve the performance of NLP applications.
ChatVideo: A Tracklet-centric Multimodal and Versatile Video Understanding System
This research introduces a prototype system for multimodal and versatile video understanding built on a tracklet-centric paradigm, which treats tracklets as the basic video unit and employs Video Foundation Models (ViFMs) to annotate their properties. The system demonstrates its effectiveness in addressing a range of video-related problems through extensive case studies on different types of in-the-wild videos.
By treating tracklets as the unit of analysis and using ViFMs to annotate their properties, the prototype supports multimodal and versatile video understanding. The extensive case studies provide evidence that the system is effective at solving various video-related problems.
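To make the tracklet-centric idea more concrete, here is a minimal Python sketch of a store of annotated tracklets that can be filtered by category and frame. It is not the ChatVideo implementation: the `Tracklet` schema, the `find_tracklets` helper, and the sample annotations are all hypothetical and invented for illustration, and in a real system the properties would presumably come from the ViFMs rather than being typed in by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Tracklet:
    """One tracked object across a span of frames (hypothetical schema)."""
    tracklet_id: int
    category: str
    start_frame: int
    end_frame: int
    # Free-form properties that an annotation model might attach.
    properties: dict = field(default_factory=dict)


def find_tracklets(tracklets, category=None, frame=None):
    """Return tracklets matching a category and/or visible at a given frame."""
    matches = []
    for t in tracklets:
        if category is not None and t.category != category:
            continue
        if frame is not None and not (t.start_frame <= frame <= t.end_frame):
            continue
        matches.append(t)
    return matches


# Toy annotations, invented for illustration only.
store = [
    Tracklet(0, "person", 0, 120, {"appearance": "red jacket", "action": "walking"}),
    Tracklet(1, "dog", 30, 90, {"appearance": "small, brown", "action": "running"}),
    Tracklet(2, "person", 100, 200, {"appearance": "blue shirt", "action": "sitting"}),
]

people_at_110 = find_tracklets(store, category="person", frame=110)
print([t.tracklet_id for t in people_at_110])  # prints [0, 2]
```

Keeping each tracklet as a self-contained record means a question such as "who is in the frame at this moment?" reduces to a filter over the store, which is one plausible reason for choosing tracklets as the basic unit.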
JaxPruner: A concise library for sparsity research
This research introduces JaxPruner, an open-source JAX-based pruning and sparse training library for machine learning research that aims to accelerate research on sparse neural networks by providing concise implementations of popular pruning and sparse training algorithms with minimal memory and latency overhead. It demonstrates the ease of integration of JaxPruner with existing JAX-based libraries and provides examples in four different codebases. The researchers believe that JaxPruner has the potential to accelerate sparsity research in the field of machine learning.
JaxPruner gives machine learning researchers concise, open-source implementations of popular pruning and sparse training algorithms in JAX. The research highlights how easily the library integrates with existing JAX-based libraries and its potential to accelerate sparsity research.
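For readers unfamiliar with pruning, the sketch below shows magnitude pruning, one widely used pruning technique, in plain Python. It does not use JaxPruner or reproduce its API: the function name `magnitude_prune`, its parameters, and the sample weights are assumptions made purely for illustration.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights.

    Returns (pruned_weights, mask), where mask holds 1 for kept weights
    and 0 for pruned ones.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices sorted by ascending magnitude; the first n_prune are removed.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:n_prune])
    mask = [0 if i in pruned_idx else 1 for i in range(len(weights))]
    pruned = [w * m for w, m in zip(weights, mask)]
    return pruned, mask


# Toy weight vector, invented for illustration only.
weights = [0.8, -0.05, 0.3, -1.2, 0.01, 0.6]
pruned, mask = magnitude_prune(weights, sparsity=0.5)
print(mask)  # prints [1, 0, 0, 1, 0, 1]
```

Sparse training methods typically recompute a mask like this during training rather than once at the end, which is why low memory and latency overhead matters for a library in this space.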
Putting People in Their Place: Affordance-Aware Human Insertion into Scenes
The paper presents a method for inferring scene affordances and realistically inserting people into scenes. The approach learns to re-pose humans in video clips: a large-scale diffusion model is trained on a dataset of 2.4M video clips, and the resulting model produces diverse, plausible poses while respecting the scene context.
The proposed method can be useful for businesses working with augmented reality, virtual reality, or computer graphics, allowing them to realistically insert human subjects into scenes to showcase their products or services. Moreover, it can also automate the process of image or video editing that requires people to be inserted into scenes, saving time and costs.
Multi-Party Chat: Conversational Agents in Group Settings with Humans and Models
The paper studies multi-party conversations and evaluates the ability of language models to act as one or more characters in such conversations. The paper introduces a new dataset called MultiLIGHT and compares models trained on this dataset to existing pairwise-trained dialogue models and large language models with few-shot prompting.
The proposed method can be useful for businesses working with chatbots or conversational agents that are designed to interact with multiple customers or users simultaneously in group settings. It can improve the quality and coherence of the chatbot's responses, leading to better customer satisfaction and engagement. Moreover, the MultiLIGHT dataset can be used to improve existing chatbots or to train new ones for multi-party conversations.
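As a rough illustration of how a multi-party setting differs from a pairwise one, the Python sketch below flattens a conversation among several speakers into a speaker-labeled context and marks which character should speak next. The `Turn` class, the `build_context` helper, the prompt format, and the sample dialogue are all hypothetical; they are not taken from the paper or from the MultiLIGHT dataset.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """A single utterance attributed to a named speaker."""
    speaker: str
    text: str


def build_context(turns, next_speaker):
    """Flatten a multi-party history into a speaker-labeled text prompt.

    The final line names the character the model is asked to play next.
    """
    lines = [f"{t.speaker}: {t.text}" for t in turns]
    lines.append(f"{next_speaker}:")
    return "\n".join(lines)


# Toy dialogue, invented for illustration only.
history = [
    Turn("Traveler", "Is there a room free tonight?"),
    Turn("Innkeeper", "One left, by the stairs."),
    Turn("Bard", "Take it, the other rooms are drafty."),
]

context = build_context(history, next_speaker="Innkeeper")
print(context)
```

With more than two participants, every turn needs an explicit speaker label, and the system has to decide which character speaks next, a choice that does not arise in a strictly two-party exchange.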