In this article you'll learn:
- How Long Ouyang and his team at OpenAI trained InstructGPT to follow human instructions
- How fine-tuning with reinforcement learning from human feedback can produce better results with less data at a lower cost
- How alignment can unleash untapped potential in existing models