Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback
LLM-Augmenter significantly reduces ChatGPT’s hallucinations without sacrificing the fluency and informativeness of its responses.
Implement LLM-Augmenter to improve the performance of large language models in real-world mission-critical applications such as task-oriented dialog and question answering.
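A minimal sketch of the augment-verify-retry loop the paper describes; `retrieve_evidence`, `call_llm`, and `utility_score` below are illustrative stubs, not components from the paper's released code:

```python
# Minimal sketch of an LLM-Augmenter-style augment-verify-retry loop.
# retrieve_evidence, call_llm, and utility_score are illustrative stubs,
# not components from the paper's released code.

def retrieve_evidence(query):
    # Placeholder: in practice, query a search engine or knowledge base
    # and consolidate the results into evidence snippets.
    return "consolidated external knowledge for: " + query

def call_llm(prompt):
    # Placeholder: in practice, call a ChatGPT-style API with the prompt.
    return "candidate response given: " + prompt[:60]

def utility_score(response, evidence):
    # Placeholder: in practice, check the response's factual consistency
    # against the evidence and produce natural-language feedback.
    grounded = "external knowledge" in response
    return (1.0 if grounded else 0.0), "ground the answer in the retrieved evidence"

def answer_with_feedback(query, max_retries=3, threshold=0.5):
    evidence = retrieve_evidence(query)
    feedback, candidate = "", ""
    for _ in range(max_retries):
        prompt = f"Evidence: {evidence}\nFeedback: {feedback}\nQuestion: {query}"
        candidate = call_llm(prompt)
        score, critique = utility_score(candidate, evidence)
        if score >= threshold:      # grounded enough: accept the response
            return candidate
        feedback = critique         # otherwise revise the prompt and retry
    return candidate                # fall back to the last attempt

print(answer_with_feedback("Who wrote the paper?"))
```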
Decoupling Human and Camera Motion from Videos in the Wild
The optimization method proposed in the paper decouples camera motion from human motion, placing all people in a shared world coordinate frame.
Use the proposed optimization method to reconstruct global human trajectories from videos in challenging in-the-wild scenarios, improving the performance of downstream tracking on PoseTrack.
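A rough sketch of the core idea, assuming per-frame camera poses from SLAM (known only up to scale) and human root positions estimated in the camera frame; the variable names and the smoothness regularizer are illustrative, not the paper's exact objective:

```python
# Sketch of placing people in a shared world frame by composing per-frame
# camera poses (e.g., from SLAM, known only up to scale) with human root
# positions estimated in camera coordinates.

import numpy as np

def human_world_trajectory(R_wc, t_wc, p_cam):
    # R_wc: (T, 3, 3) camera-to-world rotations; t_wc: (T, 3) camera
    # positions in world; p_cam: (T, 3) human root in the camera frame.
    return np.einsum("tij,tj->ti", R_wc, p_cam) + t_wc

def smoothness_loss(traj):
    # Penalize frame-to-frame accelerations; a prior like this lets the
    # optimization resolve the unknown scale of the camera trajectory.
    accel = traj[2:] - 2 * traj[1:-1] + traj[:-2]
    return float((accel ** 2).sum())

T = 5
R = np.tile(np.eye(3), (T, 1, 1))                     # static camera orientation
t = np.linspace([0.0, 0.0, 0.0], [4.0, 0.0, 0.0], T)  # camera moving along x
p = np.zeros((T, 3))                                  # person fixed in camera frame
traj = human_world_trajectory(R, t, p)
print(smoothness_loss(traj))                          # 0.0: constant-velocity path
```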
Language-Driven Representation Learning for Robotics
Voltron's language-driven representations strictly outperform the prior art.
Implement Voltron to learn from human videos and their associated captions for language-conditioned imitation learning and intent scoring in human-robot collaboration, among a diverse set of other robot learning problems.
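An illustrative sketch of intent scoring with a Voltron-style language-driven encoder: rank candidate instructions by their similarity to a video clip. `encode_video` and `encode_text` are hypothetical stand-ins for the pretrained model's heads, not Voltron's actual API:

```python
# Illustrative intent scoring with a Voltron-style language-driven encoder:
# rank candidate instructions by similarity to a video clip.

import numpy as np

def encode_video(frames):
    # Placeholder for the pretrained visual encoder; frames: (T, H, W, 3).
    v = frames.mean(axis=(0, 1, 2))
    return v / (np.linalg.norm(v) + 1e-8)

def encode_text(caption):
    # Placeholder for the language encoder (deterministic only within one
    # process, since Python string hashing is seeded per run).
    rng = np.random.default_rng(abs(hash(caption)) % (2 ** 32))
    v = rng.standard_normal(3)
    return v / np.linalg.norm(v)

def intent_scores(frames, candidate_captions):
    # Cosine similarity between the clip embedding and each caption embedding.
    z_video = encode_video(frames)
    return {c: float(encode_text(c) @ z_video) for c in candidate_captions}

frames = np.random.rand(8, 64, 64, 3)
print(intent_scores(frames, ["pick up the mug", "open the drawer"]))
```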
Modulating Pretrained Diffusion Models for Multimodal Image Synthesis
Enables conditional image synthesis with pretrained diffusion models via multimodal conditioning modules (MCM).
Gives users control over the spatial layout of generated images and improves alignment with the conditioning inputs, while remaining cheap to train with limited examples.
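A hedged sketch of what an MCM-style module could look like: a small trainable network predicts FiLM-like scale and shift parameters from the conditioning input and modulates a frozen diffusion backbone's activations. The class and tensor shapes are assumptions for illustration, not the paper's exact architecture:

```python
# Hedged sketch of an MCM-style conditioning module: a small trainable
# network predicts FiLM-like scale/shift from the conditioning input and
# modulates a frozen diffusion backbone's intermediate activations.

import torch
import torch.nn as nn

class ConditioningModule(nn.Module):
    def __init__(self, cond_channels, feat_channels):
        super().__init__()
        # Maps the conditioning image (e.g., a sketch or segmentation map)
        # to per-feature scale (gamma) and shift (beta).
        self.net = nn.Conv2d(cond_channels, 2 * feat_channels, 3, padding=1)

    def forward(self, features, cond):
        cond = nn.functional.interpolate(cond, size=features.shape[-2:])
        gamma, beta = self.net(cond).chunk(2, dim=1)
        return features * (1 + gamma) + beta   # frozen backbone stays untouched

features = torch.randn(1, 64, 32, 32)   # activations from a frozen U-Net block
cond = torch.randn(1, 3, 256, 256)      # e.g., a sketch as the spatial condition
mcm = ConditioningModule(cond_channels=3, feat_channels=64)
out = mcm(features, cond)               # only the small module needs training
```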
MUX-PLMs: Pre-training Language Models with Data Multiplexing
Pre-trains multiplexed language models (MUX-PLMs) to improve inference efficiency on downstream tasks.
Achieves a 2x/5x inference speedup with a minimal drop in performance on GLUE and token-level tasks. Pre-trained checkpoints for several configurations have been released.
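A toy sketch of the data-multiplexing idea: N instances are mixed into a single input with fixed per-slot projections, processed in one forward pass, and recovered with lightweight demultiplexing heads. The projection scheme and shapes here are illustrative, not the paper's exact modules:

```python
# Toy sketch of data multiplexing: N instances are mixed into one input,
# processed in a single forward pass, and recovered with lightweight heads.

import torch

def multiplex(batch, projections):
    # batch: (N, L, D) instances; projections: (N, D, D) fixed per-slot maps.
    # Project each instance with its slot's map, then average into one (L, D).
    return torch.einsum("nld,ndk->lk", batch, projections) / batch.shape[0]

def demultiplex(hidden, heads):
    # One lightweight head per slot recovers that instance's representation.
    return [head(hidden) for head in heads]

N, L, D = 4, 16, 32
batch = torch.randn(N, L, D)
projections = torch.randn(N, D, D) / D ** 0.5
mixed = multiplex(batch, projections)   # one "superposed" (L, D) input
# `mixed` would be fed through the PLM once, amortizing the forward pass
# over N instances; here we demultiplex it directly for illustration.
heads = torch.nn.ModuleList(torch.nn.Linear(D, D) for _ in range(N))
outputs = demultiplex(mixed, heads)     # N per-slot representations
```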