Sun Apr 30 2023
Thu Apr 27 2023

We’re Afraid Language Models Aren’t Modeling Ambiguity

Linguistics
Language Models
Machine Learning
Natural Language Processing
Dialogue Interfaces
Writing Aids

This research explores the challenge of managing ambiguity in natural language and presents an annotated benchmark for evaluating how well pretrained language models recognize and disentangle possible meanings. It shows that current language models, including the recent GPT-4, struggle to handle ambiguity, and it emphasizes the importance of ambiguity-sensitive tools for NLP applications.

This research highlights the challenges in managing ambiguity in language models and provides a benchmark to evaluate their performance. It recommends the development of ambiguity-sensitive tools to improve the performance of NLP applications in recognizing and disentangling possible meanings.

ChatVideo: A Tracklet-centric Multimodal and Versatile Video Understanding System

Video Analytics
Computer Vision
Deep Learning
Video Understanding
Multimodal Systems
Video Foundation Models

This research introduces a prototype system for multimodal and versatile video understanding built upon a tracklet-centric paradigm, which treats tracklets as the basic video unit and employs Video Foundation Models (ViFMs) to annotate their properties. The system demonstrates its effectiveness in answering various video-related problems through extensive case studies on different types of in-the-wild videos.

This research presents a prototype system for multimodal and versatile video understanding that uses a tracklet-centric paradigm and ViFMs to annotate the properties of tracklets. It provides evidence of the effectiveness of the system in solving various video-related problems through extensive case studies.
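To make the tracklet-centric idea concrete, here is a minimal sketch of how tracklets and their annotated properties might be stored and queried. The `Tracklet` fields, the `find_tracklets` helper, and the sample values are all hypothetical illustrations, not the schema or API used by ChatVideo.

```python
from dataclasses import dataclass, field


@dataclass
class Tracklet:
    """One tracked object instance (hypothetical fields, not ChatVideo's schema)."""
    tracklet_id: int
    category: str
    start_frame: int
    end_frame: int
    # Properties a video foundation model might attach, e.g. action or appearance.
    properties: dict = field(default_factory=dict)


def find_tracklets(tracklets, category=None, frame=None):
    """Return tracklets matching an optional category and visible at an optional frame."""
    results = []
    for t in tracklets:
        if category is not None and t.category != category:
            continue
        if frame is not None and not (t.start_frame <= frame <= t.end_frame):
            continue
        results.append(t)
    return results


# Made-up annotations for illustration only.
tracklets = [
    Tracklet(0, "person", 0, 120, {"action": "walking"}),
    Tracklet(1, "dog", 30, 90, {"appearance": "brown"}),
    Tracklet(2, "person", 100, 200, {"action": "sitting"}),
]

people_at_110 = find_tracklets(tracklets, category="person", frame=110)
print([(t.tracklet_id, t.properties["action"]) for t in people_at_110])
```

Under this assumed layout, a question such as "what are the people doing at frame 110?" reduces to a filter over tracklet records followed by a lookup of their annotated properties.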

JaxPruner: A concise library for sparsity research

Neural Architecture
Machine Learning
Deep Learning
Pruning
Sparse Training
Neural Networks

This research introduces JaxPruner, an open-source JAX-based pruning and sparse training library for machine learning research that aims to accelerate research on sparse neural networks by providing concise implementations of popular pruning and sparse training algorithms with minimal memory and latency overhead. It demonstrates the ease of integration of JaxPruner with existing JAX-based libraries and provides examples in four different codebases. The researchers believe that JaxPruner has the potential to accelerate sparsity research in the field of machine learning.

This research introduces JaxPruner, an open-source JAX-based pruning and sparse training library for machine learning research that aims to provide concise implementations of popular pruning and sparse training algorithms. The research highlights the ease of integration of JaxPruner with existing JAX-based libraries and its potential to accelerate sparsity research.
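As an illustration of the kind of algorithm such a library implements, the sketch below shows one-shot magnitude pruning in plain Python. It does not use JaxPruner or JAX, and the function name `magnitude_prune` and its parameters are assumptions made for this example, not part of the library's API.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude entries of a flat list of weights.

    sparsity is the fraction of weights to remove, between 0 and 1.
    Returns (pruned_weights, mask), where mask holds 1 for kept weights.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    num_to_prune = int(len(weights) * sparsity)
    # Rank indices by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:num_to_prune])
    mask = [0 if i in pruned else 1 for i in range(len(weights))]
    pruned_weights = [w if m else 0.0 for w, m in zip(weights, mask)]
    return pruned_weights, mask


# Made-up weights for illustration only.
weights = [0.5, -0.1, 0.05, -2.0, 0.3, 0.01]
pruned_weights, mask = magnitude_prune(weights, sparsity=0.5)
print(mask)
print(pruned_weights)
```

Sparse training methods typically recompute or update a mask like this during training rather than applying it once; the sketch covers only the single pruning step.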

Putting People in Their Place: Affordance-Aware Human Insertion into Scenes

Scene understanding
Computer vision
Image and video editing
Augmented reality
Virtual reality
Computer graphics

The paper presents a method for inferring scene affordances and realistically inserting people into scenes. The approach learns to re-pose humans in video clips, training a large-scale diffusion model on a dataset of 2.4M video clips; the model produces diverse, plausible poses while respecting the scene context.

The proposed method can be useful for businesses working with augmented reality, virtual reality, or computer graphics, allowing them to realistically insert human subjects into scenes to showcase their products or services. It can also automate image or video editing tasks that require inserting people into scenes, saving time and cost.

Multi-Party Chat: Conversational Agents in Group Settings with Humans and Models

Multi-party conversation modeling
Natural language processing
Dialogue systems
Chatbots
Conversational agents
Customer service

The paper studies multi-party conversations and evaluates the ability of language models to act as one or more characters in such conversations. The paper introduces a new dataset called MultiLIGHT and compares models trained on this dataset to existing pairwise-trained dialogue models and large language models with few-shot prompting.

The proposed method can be useful for businesses whose chatbots or conversational agents interact with multiple customers or users at once in group settings. It can improve the quality and coherence of the chatbot's responses, leading to better customer satisfaction and engagement. The MultiLIGHT dataset can also be used to improve existing chatbots or to train new ones for multi-party conversations.
