Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agents
Finds that properly instructed ChatGPT and GPT-4 can deliver results that are competitive with, and even superior to, supervised methods on popular IR benchmarks.
Using generative LLMs such as ChatGPT and GPT-4 for relevance ranking in Information Retrieval (IR) can deliver better results than supervised methods.
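To make the idea concrete, here is a minimal Python sketch of permutation-style re-ranking: the LLM is shown the query and numbered passages and asked to output a ranked list of passage identifiers. The `chat` helper, the prompt wording, and the `[n] > [m]` output format are illustrative assumptions, not the paper's exact setup.

```python
import re

def rerank_with_llm(query, passages, chat):
    """Re-rank candidate passages by asking an instruction-following LLM
    to emit a permutation of passage identifiers, most relevant first.
    `chat` is an assumed helper mapping a prompt string to a completion."""
    numbered = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    prompt = (
        "Rank the following passages by relevance to the query.\n"
        f"Query: {query}\n\nPassages:\n{numbered}\n\n"
        "Answer with passage numbers only, most relevant first, "
        "e.g. [2] > [1] > [3]."
    )
    reply = chat(prompt)
    # Parse identifiers like [2]; anything the model omits (or invents)
    # falls back to its original position at the end of the ranking.
    seen, order = set(), []
    for match in re.findall(r"\[(\d+)\]", reply):
        idx = int(match) - 1
        if 0 <= idx < len(passages) and idx not in seen:
            seen.add(idx)
            order.append(idx)
    order.extend(i for i in range(len(passages)) if i not in seen)
    return [passages[i] for i in order]
```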
Evaluating Verifiability in Generative Search Engines
Finds that responses frequently contain unsupported statements and inaccurate citations.
Users should be aware that responses generated by existing generative search engines may contain unsupported statements and inaccurate citations.
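As a rough illustration of the paper's two headline metrics, the sketch below computes citation recall (the fraction of sentences supported by their citations) and citation precision (the fraction of citations that support their sentence) from human annotations. The simplified schema here, where a sentence counts as supported if any of its citations supports it, is an assumption for illustration, not the paper's data format.

```python
def citation_metrics(annotated_sentences):
    """Compute citation recall and precision from human annotations.

    Assumed (simplified) schema: each sentence is a list of booleans,
    one per citation, marking whether that citation supports the sentence.
    """
    n_sentences = len(annotated_sentences)
    supported = sum(1 for cites in annotated_sentences if any(cites))
    all_cites = [c for cites in annotated_sentences for c in cites]
    recall = supported / n_sentences if n_sentences else 0.0
    precision = sum(all_cites) / len(all_cites) if all_cites else 0.0
    return recall, precision

# Example: three sentences; the second has only an unsupportive citation.
# citation_metrics([[True], [False], [True, False]]) -> (0.666..., 0.5)
```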
NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models
Introduces NeuralField-LDM, a generative model that synthesizes complex 3D environments and substantially improves over existing state-of-the-art scene-generation models.
Use NeuralField-LDM to generate high-quality 3D scenes for applications such as virtual reality and robotics simulation.
Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models
Chameleon is a plug-and-play compositional reasoning framework that augments LLMs to address their inherent limitations, such as the inability to access up-to-date information, use external tools, or perform precise mathematical reasoning. It synthesizes programs that compose various tools tailored to user interests and infers the appropriate sequence of tools to execute in order to generate a final response. The framework demonstrates its adaptability and effectiveness on two tasks: ScienceQA and TabMWP. Chameleon with GPT-4 achieves 86.54% accuracy on ScienceQA; on TabMWP, it delivers a 17.8% increase over the state-of-the-art model, reaching 98.78% overall accuracy.
Chameleon can help businesses improve their natural language processing pipelines by addressing the inherent limitations of Large Language Models: it can access up-to-date information, use external tools, and perform precise mathematical reasoning. The synthesized programs can be tailored to user interests, and the framework can adapt to new tasks to generate final responses that are more accurate and effective.
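The plan-then-execute pattern is easy to sketch. Below is a toy Python version in which an LLM-backed planner returns a list of tool names and an executor threads a shared state through them; the tool implementations and the `plan_with_llm` callback are stand-ins, not Chameleon's actual modules.

```python
def search(state):
    # Stand-in retriever; Chameleon's real modules include web search,
    # vision models, Python functions, and rule-based components.
    return {**state, "context": f"retrieved facts about {state['query']!r}"}

def solution_generator(state):
    # Stand-in answer module conditioned on whatever earlier tools produced.
    return {**state, "answer": state.get("context", state["query"])}

TOOLS = {"search": search, "solution_generator": solution_generator}

def run_pipeline(query, plan_with_llm):
    """Plan-then-execute: an LLM proposes a tool sequence (a "program"),
    then each tool transforms a shared state in order."""
    state = {"query": query}
    for name in plan_with_llm(query, list(TOOLS)):
        if name in TOOLS:  # ignore any tool names the planner hallucinated
            state = TOOLS[name](state)
    return state.get("answer")

# Trivial stand-in planner for demonstration:
print(run_pipeline("Who wrote Dune?",
                   lambda q, tools: ["search", "solution_generator"]))
```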
Pretrained Language Models as Visual Planners for Human Assistance
Visual Planning for Assistance (VPA) is the task proposed in this paper: given a goal described in natural language and a video of the user's progress, produce a plan, i.e., a sequence of actions, that achieves the goal. The task requires assessing the user's progress from the untrimmed video, relating it to the goal's requirements, and handling long video histories and arbitrarily complex action dependencies. The paper presents the Visual Language Model based Planner (VLaMP), which uses a pretrained LM as the sequence model and performs significantly better than baselines on all metrics that evaluate the generated plan.
VLaMP can help businesses improve their multi-modal AI assistants, which guide users toward complex multi-step goals. Given a goal described in natural language and a user's progress in a video, the framework produces a plan by assessing that progress from the untrimmed video and relating it to the goal's requirements. Using a pretrained LM as the sequence model helps handle long video histories and arbitrarily complex action dependencies, and VLaMP performs significantly better than existing baselines.
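As a rough illustration of the planning loop (not the paper's implementation), the sketch below conditions a language model on the goal plus the action history recognized from the video and decodes the plan one step at a time; `lm_next_action` and the prompt format are assumptions.

```python
def plan_actions(goal, observed_steps, lm_next_action, horizon=4):
    """Decode a plan one action at a time with a pretrained LM.

    `observed_steps` are actions recognized from the untrimmed video by an
    upstream perception module; `lm_next_action` is an assumed wrapper
    mapping a prompt string to the LM's next-step completion."""
    plan = []
    for _ in range(horizon):
        history = ", ".join(observed_steps + plan) or "nothing yet"
        prompt = f"Goal: {goal}\nDone so far: {history}\nNext step:"
        step = lm_next_action(prompt).strip()
        if not step or step.lower() == "done":
            break
        plan.append(step)
    return plan
```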