Exploration
The reinforcement learning problem as described requires clever exploration mechanisms. Selecting actions purely at random is known to give rise to very poor performance. The case of (small) finite MDPs is relatively well understood by now. However, due to the lack of algorithms that would provably scale well with the number of states (or scale to problems with infinite state spaces), in practice people resort to simple exploration methods. One such method is ε-greedy, in which the agent chooses the action that it believes has the best long-term effect with probability 1 − ε, and chooses an action uniformly at random otherwise. Here, ε (with 0 < ε < 1) is a tuning parameter, which is sometimes changed, either according to a fixed schedule (making the agent explore less as time goes by), or adaptively based on some heuristics (Tokic & Palm, 2011).
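The ε-greedy rule described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names `epsilon_greedy`, `decayed_epsilon`, and `q_values` are hypothetical, the action-value estimates are assumed to be stored in a plain list, and the linear decay is just one possible fixed schedule (it is not the adaptive heuristic of Tokic & Palm).

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """Choose an action index given estimated action values (hypothetical helper).

    With probability epsilon, explore by picking an action uniformly at random;
    otherwise exploit by picking the action with the highest estimated value.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    # exploit: index of the largest estimate (first index wins on ties)
    return max(range(len(q_values)), key=lambda a: q_values[a])

def decayed_epsilon(step, start=1.0, end=0.05, decay_steps=1000):
    """Assumed linear schedule: epsilon shrinks from start to end, then stays at end."""
    fraction = min(step / decay_steps, 1.0)
    return start + fraction * (end - start)

# Example: the agent explores less as time goes by.
estimates = [0.1, 0.9, 0.3]
for t in (0, 500, 1000):
    eps = decayed_epsilon(t)
    action = epsilon_greedy(estimates, eps)
    print(t, round(eps, 3), action)
```

With `epsilon` set to 0 the rule is purely greedy, and with `epsilon` set to 1 it reduces to uniformly random action selection, so the parameter interpolates between the two extremes discussed in the text.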