Teaching Large Language Models to Self-Debug
Proposes Self-Debugging to teach a large language model to debug its predicted program via few-shot demonstrations, achieving state-of-the-art performance on several code generation benchmarks.
Can improve code generation performance by teaching large language models to debug their predicted programs.
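As a rough illustration of the idea, the sketch below shows a generate–execute–refine loop in which execution feedback is fed back to the model. The `llm_complete` placeholder, the feedback format, and the iteration budget are assumptions for illustration, not the paper's actual prompts or demonstrations.

```python
# Minimal sketch of a self-debugging loop, assuming a hypothetical
# `llm_complete` call to a few-shot-prompted code LLM.
import traceback

def llm_complete(prompt: str) -> str:
    """Placeholder for a call to a code-generation model."""
    raise NotImplementedError

def run_tests(program: str, tests: str) -> tuple[bool, str]:
    """Execute the candidate program together with its unit tests."""
    env: dict = {}
    try:
        exec(program + "\n" + tests, env)  # tests raise AssertionError on failure
        return True, "all tests passed"
    except Exception:
        return False, traceback.format_exc(limit=1)

def self_debug(task: str, tests: str, max_turns: int = 3) -> str:
    program = llm_complete(f"# Task:\n{task}\n# Write a Python solution.\n")
    for _ in range(max_turns):
        ok, feedback = run_tests(program, tests)
        if ok:
            break
        # Feed the execution feedback back to the model and ask for a fix,
        # mimicking the "explain and correct" style of the demonstrations.
        program = llm_complete(
            f"# Task:\n{task}\n# Previous attempt:\n{program}\n"
            f"# Execution feedback:\n# {feedback}\n# Corrected solution:\n"
        )
    return program
```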
Neural Lens Modeling
Proposes NeuroLens, a neural lens model for distortion and vignetting that supports both point projection and ray casting and can be optimized through either operation, outperforming standard packages and recent approaches while being much easier to use and extend.
Can achieve higher-quality and easier camera calibration for 3D reconstruction by using NeuroLens, a neural lens model.
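A minimal sketch of what a neural lens model can look like: a small residual MLP that predicts distortion and vignetting, used forward for point projection and inverted by gradient descent for ray casting. The layer sizes, the residual parameterization, and the optimization-based inversion are illustrative assumptions, not the NeuroLens architecture.

```python
# Sketch of a differentiable lens model (assumed architecture, not NeuroLens itself).
import torch
import torch.nn as nn

class NeuralLens(nn.Module):
    def __init__(self, hidden: int = 64):
        super().__init__()
        # Maps ideal (undistorted) normalized image coordinates to a 2D
        # distortion offset and a scalar vignetting attenuation.
        self.net = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def project(self, xy: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        """Point projection: ideal coords -> distorted coords and vignetting."""
        out = self.net(xy)
        distorted = xy + out[..., :2]             # residual distortion field
        vignetting = torch.sigmoid(out[..., 2:])  # attenuation in (0, 1)
        return distorted, vignetting

    def undistort(self, uv: torch.Tensor, iters: int = 20, lr: float = 0.1) -> torch.Tensor:
        """Used for ray casting: invert the distortion by optimization (possible
        because `project` is differentiable), recovering the ideal coordinate
        whose backprojection through the pinhole model gives the ray."""
        xy = uv.clone().requires_grad_(True)
        opt = torch.optim.Adam([xy], lr=lr)
        for _ in range(iters):
            opt.zero_grad()
            loss = ((self.project(xy)[0] - uv) ** 2).sum()
            loss.backward()
            opt.step()
        return xy.detach()
```

Because both operations are differentiable, the same model can be calibrated end to end from reprojection-style losses.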
Reinforcement Learning from Passive Data via Latent Intentions
Proposes learning from passive data by modeling latent intentions and using a temporal-difference learning objective to reason about them, resulting in an algorithm that resembles conventional RL but learns entirely from passive data and yields features amenable to value prediction on downstream tasks.
Can use passive observational data to learn features that accelerate downstream reinforcement learning.
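The sketch below illustrates the flavor of TD learning about intentions from action-free trajectories: an intention is represented here simply by a goal observation drawn from the same trajectory, and a value network is trained with a one-step TD backup. The network shapes, the exact-match 0/1 reward, and the goal-as-intention simplification are assumptions for illustration, not the paper's formulation.

```python
# Sketch of intention-conditioned TD learning from passive (action-free) data.
import torch
import torch.nn as nn
import torch.nn.functional as F

class IntentionValue(nn.Module):
    """V(s, z): value of state s under a latent intention z."""
    def __init__(self, obs_dim: int, z_dim: int = 32, hidden: int = 256):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, z_dim)   # latent intention from an outcome
        self.value = nn.Sequential(
            nn.Linear(obs_dim + z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, goal_obs):
        z = self.encoder(goal_obs)
        return self.value(torch.cat([obs, z], dim=-1)).squeeze(-1)

def td_loss(model, target_model, obs, next_obs, goal_obs, gamma=0.99):
    """One-step TD backup on passive (obs, next_obs) pairs; no actions needed."""
    reward = (obs == goal_obs).all(-1).float()     # 1 if the intention is achieved
    with torch.no_grad():
        target = reward + gamma * (1 - reward) * target_model(next_obs, goal_obs)
    return F.mse_loss(model(obs, goal_obs), target)
```

The learned encoder and value features can then be reused to initialize or shape a conventional RL agent downstream.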
RRHF: Rank Responses to Align Language Models with Human Feedback without tears
RRHF is a novel learning paradigm that makes it easier to align large language models with human preferences: it scores responses generated by different sampling policies and learns to align them with those preferences through a ranking loss.
RRHF can efficiently align language model output probabilities with human preferences, is as robust as fine-tuning, and needs only 1 to 2 models during tuning, simplifying the alignment of language models with human preferences.
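A minimal sketch of an RRHF-style objective: the model's length-normalized log-likelihoods of several candidate responses are pushed to agree with their reward ordering via a pairwise ranking loss, plus a supervised term on the best response. Tensor shapes, the reward source, and the use of the normalized log-probability in the supervised term are simplifying assumptions.

```python
# Sketch of a ranking-based alignment loss in the spirit of RRHF.
import torch
import torch.nn.functional as F

def rrhf_loss(logprobs: torch.Tensor, rewards: torch.Tensor) -> torch.Tensor:
    """
    logprobs: (k,) length-normalized log p(response_i | prompt) under the policy
    rewards:  (k,) scores from a reward model or human annotations
    """
    p_i = logprobs.unsqueeze(1)   # (k, 1): log-prob of the preferred candidate
    p_j = logprobs.unsqueeze(0)   # (1, k): log-prob of the less-preferred one
    r_i = rewards.unsqueeze(1)
    r_j = rewards.unsqueeze(0)
    # Penalize every pair whose likelihood ordering disagrees with the rewards.
    mask = (r_i > r_j).float()
    rank_loss = (F.relu(p_j - p_i) * mask).sum()
    # Supervised term: maximize likelihood of the highest-reward response.
    sft_loss = -logprobs[rewards.argmax()]
    return rank_loss + sft_loss

# Example: three sampled responses with their log-probs and reward scores.
loss = rrhf_loss(torch.tensor([-1.2, -0.8, -2.0]), torch.tensor([0.3, 0.9, 0.1]))
```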
Toxicity in ChatGPT: Analyzing Persona-assigned Language Models
This paper systematically evaluates toxicity in over half a million generations of ChatGPT, a popular dialogue-based LLM, and finds that assigning ChatGPT a persona via its system parameter significantly increases the toxicity of its generations.
Developers should be aware that assigning a persona to a language model can result in toxicity and potentially defamatory or harmful outputs. The AI community should rethink the efficacy of current safety guardrails and develop better techniques that lead to robust, safe, and trustworthy AI systems.
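For context, persona assignment works through the system role of the chat API, as sketched below. The client usage reflects the current OpenAI Python SDK, and the persona string and prompt are illustrative, not the paper's evaluation setup; generations would then be scored with an external toxicity classifier (the paper uses the Perspective API).

```python
# Sketch of persona assignment via the system parameter (illustrative prompt).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_with_persona(persona: str, user_prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            # The persona is injected through the system role.
            {"role": "system", "content": f"Speak like {persona}."},
            {"role": "user", "content": user_prompt},
        ],
    )
    return response.choices[0].message.content
```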