15 puzzle java source code

5/18/2023

Whether we are playing a game or solving a physics problem or predicting the stock market with RL, we are always facing a current situation(or state) and a set of actions that we can take from that state. We might get stuck in a local maxima which is not a solution really. That is a greedy strategy and might not yield a globally optimal solution.

We don’t want to necessarily maximize the reward with each action. The reward might be given to us after each action we take or could be a lump-sum reward at the end of the game or activity(That is if there is an end). The reward is like a signal which tells us if we are doing good or bad.

The very abstract and general idea behind RL is that we solve the problem by experiencing(playing) it and get rewards for actions we take.

0 Comments

15 puzzle java source code

Leave a Reply.

Author

Archives

Categories