Reinforcement Learning Study - 7
Deep Reinforcement Learning

In the real world, the state or action space is often too large to record everything about the model in a (value) table. To generalize across states instead, a neural network, one of the most powerful general-purpose function approximators, is used. The node is the fundamental component of a neural network: each node linearly combines its inputs (WX + b) and then outputs the result after applying a nonlinear function (sigmoid, ReLU, etc.).

Value-based agent

Value-based learning is a method in which a neural network is used to predict the value function. The network's parameters are updated using a loss function, which measures the difference between the predicted value and the real value. In Q-learning, the value function is the action-value function $Q(s, a)$. Writing the network's prediction as $Q_\theta(s, a)$ and the real value as $Q_*(s, a)$, the loss takes the form

$$L(\theta) = \mathbb{E}\left[\left(Q_*(s, a) - Q_\theta(s, a)\right)^2\right]$$

In fact, because we don't know the real value $Q_*$, we can't use the above equation directly. Fortunately, there is a smart way to solve this problem. That's the expected val...
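
As a small illustration of the two ideas described above (a node that computes WX + b followed by a nonlinearity, and a loss that compares a predicted value with a target value), here is a minimal sketch in plain Python. The function names, the example weights, and the target value of 1.0 are all hypothetical, chosen only for illustration. A squared difference is assumed as the form of the loss, and the target is simply passed in as a number rather than derived from any particular update rule.

```python
import math


def relu(z):
    """ReLU nonlinearity: passes positive values, clips negatives to 0."""
    return max(0.0, z)


def sigmoid(z):
    """Sigmoid nonlinearity: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def node_output(weights, inputs, bias, activation=relu):
    """One node: linear combination w.x + b, then a nonlinear function."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


def squared_error_loss(predicted, target):
    """Squared difference between a target value and the predicted value."""
    return (target - predicted) ** 2


# Hypothetical numbers, for illustration only.
inputs = [1.0, 2.0]      # features describing a state-action pair
weights = [0.5, -0.25]   # parameters of the node
bias = 0.1

q_pred = node_output(weights, inputs, bias)     # predicted value
loss = squared_error_loss(q_pred, target=1.0)   # target is a placeholder
print(q_pred, loss)
```

A real value network stacks many such nodes into layers and adjusts the weights and biases to reduce the loss; this sketch only shows the forward computation of a single node and the loss for a single prediction.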