Reinforcement Learning - Theory

The theory for small, finite MDPs is quite mature. Both the asymptotic and the finite-sample behavior of most algorithms are well understood. As mentioned earlier, algorithms with provably good online performance (addressing the exploration issue) are known.

The theory of large MDPs needs more work. Efficient exploration is largely untouched (except in the case of bandit problems). Although finite-time performance bounds have appeared for many algorithms in recent years, these bounds are expected to be rather loose, so more work is needed to better understand the relative advantages, as well as the limitations, of these algorithms. For incremental algorithms, asymptotic convergence issues have been settled. Recently, new incremental, temporal-difference-based algorithms have appeared that converge under a much wider set of conditions than was previously possible (for example, when used with arbitrary, smooth function approximation).
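As an illustration of the small, finite case, an incremental temporal-difference update can be sketched as follows. This is a minimal sketch, not a method taken from this article: it assumes a five-state symmetric random walk (a common textbook chain), no discounting, and a constant step size. The names `td0_random_walk` and `true_values`, the step size, and the episode count are all illustrative choices.

```python
import random


def td0_random_walk(num_states=5, episodes=5000, alpha=0.02, seed=0):
    """Tabular TD(0) policy evaluation on a symmetric random-walk chain.

    Assumed setup (illustrative only): the agent moves left or right with
    equal probability; leaving the chain on the right yields reward 1,
    leaving on the left yields reward 0, and both exits end the episode.
    """
    rng = random.Random(seed)
    values = [0.5] * num_states  # arbitrary initial estimates
    for _ in range(episodes):
        state = num_states // 2  # every episode starts in the middle
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:
                # Left exit: reward 0, terminal value 0.
                reward, bootstrap, done = 0.0, 0.0, True
            elif next_state >= num_states:
                # Right exit: reward 1, terminal value 0.
                reward, bootstrap, done = 1.0, 0.0, True
            else:
                reward, bootstrap, done = 0.0, values[next_state], False
            # Incremental TD(0) update: move the estimate a small step
            # toward the one-step bootstrapped target (undiscounted).
            values[state] += alpha * (reward + bootstrap - values[state])
            if done:
                break
            state = next_state
    return values


def true_values(num_states=5):
    """Exact values for this chain: probability of exiting on the right."""
    return [(i + 1) / (num_states + 1) for i in range(num_states)]


if __name__ == "__main__":
    for estimate, exact in zip(td0_random_walk(), true_values()):
        print(f"estimate={estimate:.3f}  exact={exact:.3f}")
```

With these defaults the estimates should settle close to the exact values 1/6, 2/6, ..., 5/6. A constant step size leaves some residual noise in the estimates, which is why asymptotic convergence results are typically stated for suitably decreasing step sizes.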
