Tue Mar 21 2023 - Top Trending AI Papers

MM-REACT: Prompting ChatGPT for Multimodal Reasoning and Action

Computer vision, natural language processing

AI integration with visual intelligence

Multimodal reasoning in scenarios that require advanced visual understanding.

Proposes MM-REACT, a system that integrates ChatGPT with a pool of vision experts to perform multimodal reasoning and action, and demonstrates its effectiveness on tasks that require advanced visual understanding.

Can improve advanced visual understanding in scenarios that require multimodal information processing.
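To make the idea of pairing a chat model with a pool of vision experts more concrete, here is a minimal sketch of a dispatch loop. Everything in it is an assumption made for illustration: the `CALL <expert> <image_path>` convention, the expert names, and the scripted stand-in for the chat model are hypothetical and are not taken from the paper or from any ChatGPT API.

```python
import re
from typing import Callable, Dict, List

# Hypothetical vision experts: each takes an image path and returns text.
# Real experts would wrap captioning, OCR, or detection models.
def caption_expert(image_path: str) -> str:
    return f"a caption for {image_path}"

def ocr_expert(image_path: str) -> str:
    return f"text found in {image_path}"

EXPERTS: Dict[str, Callable[[str], str]] = {
    "caption": caption_expert,
    "ocr": ocr_expert,
}

# Assumed convention (not from the paper): the language model asks for
# help by emitting a line of the form "CALL <expert> <image_path>".
ACTION_PATTERN = re.compile(r"^CALL (\w+) (\S+)$", re.MULTILINE)

def run_dialogue(llm: Callable[[List[str]], str], user_msg: str,
                 max_turns: int = 5) -> str:
    """Alternate between the language model and the experts it requests."""
    history = [f"user: {user_msg}"]
    for _ in range(max_turns):
        reply = llm(history)
        history.append(f"assistant: {reply}")
        match = ACTION_PATTERN.search(reply)
        if match is None:
            # No expert requested, so treat the reply as the final answer.
            return reply
        name, image_path = match.groups()
        expert = EXPERTS.get(name)
        observation = expert(image_path) if expert else f"unknown expert {name}"
        # Feed the expert's text output back to the language model.
        history.append(f"observation: {observation}")
    return history[-1]

def scripted_llm(history: List[str]) -> str:
    """Stand-in for a chat model: request a caption once, then answer."""
    if not any(h.startswith("observation:") for h in history):
        return "CALL caption photo.jpg"
    return "Final answer based on: " + history[-1]
```

Calling `run_dialogue(scripted_llm, "What is in photo.jpg?")` runs one expert call and then returns the model's final reply. The point of the design is that the language model only ever sees text, so any vision model that can describe its output in text can be added to the pool.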

https://multimodal-react.github.io/

https://arxiv.org/pdf/2303.11381.pdf

https://arxiv.org/abs/2303.11381

https://huggingface.co/spaces/microsoft-cognitive-service/mm-react

https://twitter.com/arankomatsuzaki/status/1638344451910238209/photo/1

Visual Representation Learning from Unlabeled Video using Contrastive Masked Autoencoders

Computer vision

Visual representation learning

Transfer learning from video to images on ImageNet-1k

Competitive transfer learning performance on the Kinetics-400 video classification benchmark

Proposes ViC-MAE, a method that combines masked autoencoders and contrastive learning for visual representation learning, and demonstrates improved transfer learning from video to images on ImageNet-1k and competitive transfer learning performance on the Kinetics-400 video classification benchmark.

Can improve transfer learning from video to images and performance on video classification benchmarks.
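As a rough illustration of what combining masked autoencoding with contrastive learning can look like, the sketch below adds a reconstruction loss over masked patches to a contrastive loss over frame embeddings. It is a generic, pure-Python toy under stated assumptions: InfoNCE is used as one common contrastive objective, and the mask ratio and loss weight are arbitrary. The digest does not say which objective, weighting, or architecture the paper uses.

```python
import math
import random
from typing import List, Sequence

Vector = List[float]

def random_mask(num_patches: int, mask_ratio: float,
                rng: random.Random) -> List[int]:
    """Return sorted indices of the patches hidden from the encoder."""
    num_masked = int(num_patches * mask_ratio)
    return sorted(rng.sample(range(num_patches), num_masked))

def masked_mse(target: Sequence[Vector], predicted: Sequence[Vector],
               masked: Sequence[int]) -> float:
    """Mean squared error computed only over the masked patches."""
    total, count = 0.0, 0
    for i in masked:
        for t, p in zip(target[i], predicted[i]):
            total += (t - p) ** 2
            count += 1
    return total / count if count else 0.0

def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def info_nce(anchors: Sequence[Vector], positives: Sequence[Vector],
             temperature: float = 0.1) -> float:
    """InfoNCE: anchor i should match positive i among all positives."""
    loss = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        peak = max(logits)  # subtract the max for numerical stability
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        loss += log_denom - logits[i]
    return loss / len(anchors)

def combined_loss(reconstruction: float, contrastive: float,
                  weight: float = 0.5) -> float:
    """Weighted sum of the two objectives; the weight is an assumption."""
    return reconstruction + weight * contrastive
```

In a video setting, the anchors and positives would typically be embeddings of two frames from the same clip, with frames from other clips in the batch acting as negatives. That pairing is an assumption of this sketch.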

https://arxiv.org/pdf/2303.12001.pdf

https://arxiv.org/abs/2303.12001

https://twitter.com/arankomatsuzaki/status/1638340408471068672/photo/1

Large Language Models Can Be Used to Estimate the Ideologies of Politicians in a Zero-Shot Learning Setting

Natural language processing

Language models in social sciences

Measuring latent ideology in the social sciences

Demonstrates the potential of large language models for measuring latent ideology in the social sciences: pairwise liberal-conservative comparisons between members of the U.S. Senate, elicited by prompting ChatGPT, are scaled into a measure that correlates strongly with widely used liberal-conservative scales such as DW-NOMINATE.

Can potentially offer new solutions to problems of observability and measurement in the social sciences.
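To show how a set of pairwise comparisons can be turned into a single scale, here is a minimal sketch that fits a Bradley-Terry model with the classic iterative update. Bradley-Terry is used as one standard scaling method for paired comparisons. The digest does not name the method used in the paper, so treat that choice, the function name, and the toy data format as assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

def bradley_terry(comparisons: Iterable[Tuple[str, str]],
                  iterations: int = 200) -> Dict[str, float]:
    """Fit Bradley-Terry strengths from (winner, loser) pairs.

    Here "winner" means the member judged more conservative in one
    comparison, so a larger strength means "more conservative".
    Assumes every item wins at least once; an item with zero wins
    is driven to strength 0.
    """
    wins = defaultdict(int)
    pair_counts = defaultdict(int)
    items = set()
    for winner, loser in comparisons:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        items.update((winner, loser))

    strength = {item: 1.0 for item in items}
    for _ in range(iterations):
        updated = {}
        for i in items:
            denom = 0.0
            for j in items:
                if j == i:
                    continue
                n = pair_counts.get(frozenset((i, j)), 0)
                if n:
                    denom += n / (strength[i] + strength[j])
            updated[i] = wins[i] / denom if denom else strength[i]
        # Rescale so the strengths sum to the number of items.
        total = sum(updated.values())
        strength = {i: v * len(items) / total for i, v in updated.items()}
    return strength
```

With each of three hypothetical members A, B, and C compared four times per pair, where C wins 3 of 4 against both others and B wins 3 of 4 against A, the fitted strengths come out ordered C > B > A. The resulting scale could then be correlated against an external measure such as DW-NOMINATE.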

https://arxiv.org/pdf/2303.12057.pdf

https://arxiv.org/abs/2303.12057

https://twitter.com/arankomatsuzaki/status/1638339392929517568/photo/1