Learning from human preferences
OpenAI and DeepMind collaborated on an algorithm that infers what humans want from their judgments of which of two proposed behaviors is better. The aim is to improve AI safety by reducing the need for hand-written goal functions: instead of specifying an exact goal, a human simply picks the better behavior from each pair. This could help prevent the undesirable behavior that arises when a hand-written goal is subtly wrong.
Key takeaways
- Algorithm infers human preferences from pairwise comparisons.
- Reduces the need for hand-written goal functions in AI development.
- Aims to improve AI safety by reducing the risk of misaligned goals.
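
To make the idea of learning from pairwise comparisons concrete, here is a minimal sketch of one common way to fit a reward function from "which of these two is better" judgments. It assumes a linear reward over hand-picked behavior features and a Bradley-Terry (logistic) preference model; neither assumption is stated in this summary, and the function names, the simulated "human", and the hidden weights are all hypothetical.

```python
import math
import random


def reward(w, x):
    """Linear reward: dot product of weights and behavior features (an assumption)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def prefer_prob(w, a, b):
    """Bradley-Terry probability that behavior a is preferred over behavior b."""
    return 1.0 / (1.0 + math.exp(reward(w, b) - reward(w, a)))


def fit_reward(comparisons, dim, lr=0.5, epochs=200):
    """Fit reward weights by gradient ascent on the log-likelihood of the choices.

    comparisons: list of (winner, loser) feature vectors chosen by the human.
    """
    w = [0.0] * dim
    for _ in range(epochs):
        grad = [0.0] * dim
        for winner, loser in comparisons:
            p = prefer_prob(w, winner, loser)
            # d/dw log p = (1 - p) * (winner - loser)
            for i in range(dim):
                grad[i] += (1.0 - p) * (winner[i] - loser[i])
        w = [wi + lr * g / len(comparisons) for wi, g in zip(w, grad)]
    return w


def simulate_comparisons(true_w, n, dim, rng):
    """Hypothetical stand-in for a human: always picks the higher hidden reward."""
    pairs = []
    for _ in range(n):
        a = [rng.uniform(-1, 1) for _ in range(dim)]
        b = [rng.uniform(-1, 1) for _ in range(dim)]
        if reward(true_w, a) >= reward(true_w, b):
            pairs.append((a, b))
        else:
            pairs.append((b, a))
    return pairs


rng = random.Random(0)
hidden_w = [2.0, -1.0]  # the preference the learner never sees directly
data = simulate_comparisons(hidden_w, 200, 2, rng)
learned_w = fit_reward(data, 2)

# Fraction of training pairs where the learned reward agrees with the choice.
agreement = sum(
    reward(learned_w, win) > reward(learned_w, lose) for win, lose in data
) / len(data)
print("learned weights:", learned_w)
print("agreement with choices:", agreement)
```

The learner only ever sees which behavior won each comparison, never the hidden weights, yet the fitted reward recovers their direction. In a setting like the one described, such a fitted reward would stand in for a hand-written goal function.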