What is Reinforcement Learning?
To understand Reinforcement Learning, let’s start with the big picture.
The big picture
The idea behind Reinforcement Learning is that an agent (an AI) will learn from the environment by interacting with it (through trial and error) and receiving rewards (negative or positive) as feedback for performing actions.
Learning from interactions with the environment comes from our natural experiences.
For instance, imagine putting your little brother in front of a video game he has never played, giving him a controller, and leaving him alone.
Your brother will interact with the environment (the video game) by pressing the right button (action). He gets a coin: that’s a +1 reward. Because the reward is positive, he has just learned that in this game he must get the coins.
But then he presses the right button again and touches an enemy. He dies, so that’s a -1 reward.
By interacting with his environment through trial and error, your little brother understands that he needs to get coins in this environment but avoid the enemies.
Without any supervision, the child will get better and better at playing the game.
That’s how humans and animals learn: through interaction. Reinforcement Learning is just a computational approach to learning from actions.
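To make the video game story concrete, here is a minimal sketch of the same trial-and-error loop in Python. Everything in it is an assumption made for illustration: the two situations (`coin_ahead`, `enemy_ahead`), the two actions (`right`, `jump`), the reward table, and the simple running-average update are hypothetical and not part of any library. Each interaction lasts a single step, so this is a deliberately simplified toy rather than a full Reinforcement Learning algorithm.

```python
import random

# Hypothetical toy game (an assumption for illustration): the reward the
# environment gives for each action in each situation.
REWARDS = {
    "coin_ahead": {"right": 1, "jump": 0},    # moving right collects the coin
    "enemy_ahead": {"right": -1, "jump": 0},  # moving right hits the enemy
}
ACTIONS = ["right", "jump"]


def train(episodes=500, epsilon=0.1, step_size=0.1, seed=0):
    """Learn reward estimates purely from trial and error."""
    rng = random.Random(seed)
    # The agent's current estimate of the reward for each (situation, action).
    values = {s: {a: 0.0 for a in ACTIONS} for s in REWARDS}
    for _ in range(episodes):
        state = rng.choice(list(REWARDS))  # the game shows a situation
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)   # sometimes try something random
        else:
            # otherwise pick the action that currently looks best
            action = max(ACTIONS, key=lambda a: values[state][a])
        reward = REWARDS[state][action]    # feedback from the environment
        # Nudge the estimate toward the reward that was just observed.
        values[state][action] += step_size * (reward - values[state][action])
    return values


def greedy_policy(values):
    """For each situation, return the action with the highest estimate."""
    return {s: max(ACTIONS, key=lambda a: values[s][a]) for s in values}


policy = greedy_policy(train())
print(policy)
```

Nobody tells the agent which button is correct. It only sees the +1 and -1 rewards, and in this toy setup that is enough for it to end up moving right toward coins and jumping when an enemy is ahead.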
A formal definition
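We can now state a formal definition: Reinforcement Learning is a framework for solving decision problems by building agents that learn from an environment by interacting with it through trial and error and receiving rewards (positive or negative) as feedback.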
But how does Reinforcement Learning work?