Exploring Reinforcement Learning in Artificial Intelligence

What is reinforcement learning? How does it relate to other ML techniques?

Reinforcement learning (RL) is a machine learning technique in which an agent learns in an interactive environment by trial and error, using feedback from its own actions and experiences.

Both supervised learning and reinforcement learning learn a mapping from inputs to outputs. In supervised learning, however, the feedback given to the learner is the correct set of actions for performing a task, whereas reinforcement learning uses rewards and punishments as signals for positive and negative behavior.

Reinforcement learning also differs from unsupervised learning in its goal. The goal in unsupervised learning is to find similarities and differences between data points, whereas the goal in reinforcement learning is to find a suitable action model that maximizes the agent's total cumulative reward. The figure below represents the basic idea and elements involved in a reinforcement learning model.
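The interaction loop described above (the agent acts, the environment responds with a new state and a reward, and the rewards accumulate) can be sketched in a few lines of Python. This is only an illustrative sketch: the `Corridor` environment, its reward values, and the two policies are hypothetical and do not come from any particular library.

```python
import random


class Corridor:
    """Hypothetical toy environment: positions 0..4, the goal is the last one."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # action is -1 (move left) or +1 (move right); walls clamp the move.
        self.position = min(max(self.position + action, 0), self.length - 1)
        done = self.position == self.length - 1
        # Assumed reward signal: +1.0 at the goal, a small penalty otherwise.
        reward = 1.0 if done else -0.1
        return self.position, reward, done


def run_episode(env, policy, max_steps=100):
    """Run one episode and return the total cumulative reward."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)                  # agent chooses an action
        state, reward, done = env.step(action)  # environment gives feedback
        total_reward += reward
        if done:
            break
    return total_reward


def random_policy(state):
    """Pure trial and error: ignore the state and pick a direction at random."""
    return random.choice([-1, 1])


def always_right(state):
    """A hand-written policy that happens to be optimal in this corridor."""
    return 1


if __name__ == "__main__":
    print("random policy:", run_episode(Corridor(), random_policy))
    print("always right: ", run_episode(Corridor(), always_right))
```

In this toy setting the random policy can never collect more reward than `always_right`, which is the kind of gap an RL algorithm tries to close by learning from the reward signal.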

What are some of the most used reinforcement learning algorithms?

Q-learning and SARSA (State-Action-Reward-State-Action) are two commonly used model-free RL algorithms. They differ in their exploration strategies, while their exploitation strategies are similar. Q-learning is an off-policy method: the agent learns the value based on the action a* derived from another policy (the greedy one). SARSA is an on-policy method: the agent learns the value based on the action a derived from its current policy. Both methods are simple to implement, but in their tabular form they lack generality because they cannot estimate values for unseen states.
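The off-policy versus on-policy distinction is easiest to see in the update rules themselves. Below is a minimal tabular sketch of the two standard updates; the `ACTIONS` list, the state names, and the `alpha` and `gamma` defaults are assumptions chosen for illustration.

```python
from collections import defaultdict

ACTIONS = [0, 1]  # hypothetical discrete action set


def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy: bootstrap from the greedy action a* in the next state,
    regardless of which action the agent will actually take there."""
    best_next = max(Q[(s_next, b)] for b in ACTIONS)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy: bootstrap from a_next, the action the current policy
    actually selected in the next state."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])


if __name__ == "__main__":
    Q = defaultdict(float)  # unseen (state, action) pairs default to 0.0
    Q[("s1", 0)] = 1.0
    Q[("s1", 1)] = 5.0

    # Same transition, but the policy happened to pick the worse action 0 next.
    q_learning_update(Q, "s0", 0, 0.0, "s1")
    sarsa_update(Q, "s0", 1, 0.0, "s1", a_next=0)
    print(Q[("s0", 0)], Q[("s0", 1)])
```

With these numbers the Q-learning target uses the best next value (5.0) and moves the estimate to about 0.45, while the SARSA target uses the value of the action actually taken (1.0) and moves it to about 0.09. Note also that the table only holds entries for state-action pairs that have been visited, which is the generality limit mentioned above.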

This limitation can be overcome by more advanced algorithms such as Deep Q-Networks (DQNs), which use neural networks to estimate Q-values. However, DQNs can only handle discrete, low-dimensional action spaces. Deep Deterministic Policy Gradient (DDPG) is a model-free, off-policy, actor-critic algorithm that tackles this problem by learning policies in high-dimensional, continuous action spaces.
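To illustrate why replacing the table with a function approximator helps with unseen states, here is a small sketch that uses a linear model as a stand-in for the neural network. It is not a DQN (there is no replay buffer, target network, or deep model); the feature vectors and hyperparameters are assumptions made up for the example.

```python
def q_values(weights, features):
    """Estimate Q(s, a) for every discrete action with a linear model:
    one weight row per action, dotted with the state's feature vector."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def greedy_action(weights, features):
    """Pick the action with the highest estimate. This enumerates every
    action, which is only possible when the action set is discrete."""
    q = q_values(weights, features)
    return max(range(len(q)), key=q.__getitem__)


def td_update(weights, s, a, r, s_next, done, alpha=0.1, gamma=0.9):
    """One semi-gradient Q-learning step on the weights of action a."""
    target = r if done else r + gamma * max(q_values(weights, s_next))
    td_error = target - q_values(weights, s)[a]
    # For a linear model, the gradient of Q(s, a) w.r.t. weights[a] is s.
    weights[a] = [w + alpha * td_error * x for w, x in zip(weights[a], s)]
    return td_error


if __name__ == "__main__":
    weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # 2 actions, 3 features
    seen = [1.0, 0.0, 0.5]
    td_update(weights, seen, a=1, r=1.0, s_next=seen, done=True)

    unseen = [0.5, 1.0, 0.5]  # never visited, but shares features with `seen`
    print(q_values(weights, unseen), greedy_action(weights, unseen))
```

Because the estimate depends on shared features rather than on a table entry, a single update on one state already changes the estimate for a state that was never visited. The `max` over a finite list of actions in `greedy_action` and `td_update` also hints at why Q-value methods of this kind are tied to discrete action spaces, which is the gap DDPG addresses with a separate actor.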

Built With

artificialintelligence
deeplearning
reinforcementlearning

Updates

Sahibdeep Singh Sodhi started this project — Jan 16, 2021 11:33 AM EST

Leave feedback in the comments!
