| 1. |
What is Q-learning? |
|
Answer» Q-learning is a model-free reinforcement learning algorithm that learns the best course of action in an environment based on the state the agent is currently in (an agent is the entity that makes decisions and acts in the environment). Model-free means the agent does not need a model of the environment: it does not have to know or predict the environment's transitions and rewards in order to learn. Instead, it learns by trial and error from the rewards it receives after each action. The goal is to determine the optimal action for the current state. Q-learning is called off-policy because it learns the value of the optimal (greedy) policy regardless of the policy the agent actually follows while exploring, so the agent may act outside that policy, for example by taking random actions. The agent's experience is stored in the Q-table, where each value estimates the long-term reward of taking a particular action in a particular state. Using the Q-table, the agent can choose the action that maximizes the expected reward in a given state. |
|
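To make the Q-table idea concrete, here is a minimal sketch of tabular Q-learning using the standard update Q(s, a) ← Q(s, a) + α · (r + γ · max Q(s', a') − Q(s, a)). The five-state corridor environment, the `step` and `train_q_table` helpers, and all hyperparameter values are assumptions chosen for illustration; they do not come from the answer above.

```python
import random


def step(state, action, n_states=5):
    """Hypothetical 1-D corridor: action 0 moves left, action 1 moves right.

    Reaching the last state yields reward 1.0 and ends the episode.
    """
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(n_states - 1, state + 1)
    done = next_state == n_states - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train_q_table(n_states=5, n_actions=2, episodes=500,
                  alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table by trial and error (illustrative hyperparameters)."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Behaviour policy: epsilon-greedy, with random tie-breaking.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(n_actions) if q[state][a] == best])
            next_state, reward, done = step(state, action, n_states)
            # Off-policy target: uses the greedy value of the next state,
            # whatever action the behaviour policy takes next.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train_q_table()
# Greedy action for each non-terminal state, read straight from the Q-table.
greedy_policy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(4)]
print(greedy_policy)
```

The `max(q[next_state])` term in the target is what makes this off-policy: the agent explores with the epsilon-greedy rule, but the table is updated toward the value of always acting greedily. In this toy setup the learned greedy policy should favour moving right, toward the rewarding state.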