Reinforcement Learning Explained Simply

What it is

Reinforcement learning is a way of training an AI by rewarding it when it does something right and penalising it when it does something wrong. It's basically the same way you'd train a dog, except the dog is software and the treats are numbers. The AI tries things, sees what gets rewarded, and gradually figures out the strategy that earns the most reward over time. It's a big part of how AlphaGo learned to beat one of the world's top Go players, and it's a key part of how ChatGPT was trained to give helpful answers instead of unhinged ones.
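For readers who like to see the idea in code, here is a minimal sketch of that try, get rewarded, adjust loop. It is a toy "slot machine" problem (often called a multi-armed bandit) solved with a simple strategy called epsilon-greedy: mostly pick whatever has paid off best so far, but occasionally try something at random. The function name `train_bandit`, the parameters `reward_probs` and `epsilon`, and the payout numbers are all made up for illustration. This is not how AlphaGo or ChatGPT were actually trained, just the same basic principle at the smallest possible scale.

```python
import random


def train_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Learn which action pays off best purely by trial and reward.

    reward_probs: hidden chance of a reward for each action (illustrative values).
    epsilon: fraction of the time the learner explores a random action.
    """
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    counts = [0] * n_actions    # how often each action has been tried
    values = [0.0] * n_actions  # running average reward seen for each action

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action to see what happens.
            action = rng.randrange(n_actions)
        else:
            # Exploit: pick the action that has looked best so far.
            action = max(range(n_actions), key=lambda a: values[a])

        # The environment hands back a "treat" (1) or nothing (0).
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Nudge the running average for that action towards the new reward.
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]

    return values, counts


if __name__ == "__main__":
    # Three hypothetical actions; the learner is never told these odds.
    values, counts = train_bandit([0.2, 0.5, 0.8])
    best = max(range(len(values)), key=lambda a: values[a])
    print("Learned value estimates:", [round(v, 2) for v in values])
    print("Times each action was tried:", counts)
    print("Action the learner now prefers:", best)
```

The learner is never told which action is best. It works that out from rewards alone, which is why the quality of the reward signal matters so much.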

Why it matters for your job

Reinforcement learning is behind many of the AI systems that are getting better at complex decision-making: game playing, robotics, trading strategies, resource allocation. It's also how AI chatbots get tuned to be actually useful, often through a technique called reinforcement learning from human feedback (RLHF), where people rate the AI's answers and those ratings become the reward. If your job involves making sequential decisions or optimising outcomes, reinforcement learning is the flavour of AI most likely to be relevant to your work.

What to do about it

If AI tools in your workplace seem to improve over time based on user feedback, that may be reinforcement learning, or a closely related feedback-driven technique, at work. Be thoughtful about the feedback you give, whether that's a thumbs up, a rating or a correction. You may be helping to train the system that ends up doing parts of your job. Make sure it learns the right lessons.

This glossary is part of the full guide, along with role-specific playbooks and redundancy rights cheat sheets.
