Reinforcement Learning - Theory

The theory for small, finite MDPs is quite mature. Both the asymptotic and the finite-sample behavior of most algorithms are well understood. As mentioned earlier, algorithms with provably good online performance (addressing the exploration issue) are known. The theory of large MDPs needs more work. Efficient exploration is largely untouched (except for the case of bandit problems). Although finite-time performance bounds have appeared for many algorithms in recent years, these bounds are expected to be rather loose, so more work is needed to better understand the relative advantages, as well as the limitations, of these algorithms. For incremental algorithms, asymptotic convergence issues have been settled. Recently, new incremental, temporal-difference-based algorithms have appeared that converge under a much wider set of conditions than was previously possible (for example, when used with arbitrary, smooth function approximation).
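
To make the small, finite case concrete, the sketch below runs tabular TD(0), one simple incremental temporal-difference algorithm, on a five-state random walk whose true state values are known in closed form. The environment, the step-size schedule (a per-state decay with exponent 0.7), the episode count, and all function and variable names are assumptions chosen for illustration and do not come from the text above. It is a minimal sketch of the kind of asymptotic convergence the paragraph refers to, not a reference implementation.

```python
import random


def td0_random_walk(num_episodes=20000, num_states=5, seed=0):
    """Estimate state values of a symmetric random walk with tabular TD(0).

    Hypothetical setup: states 0..num_states-1, each episode starts in the
    middle state, every step moves left or right with equal probability,
    stepping off the right end pays reward 1, off the left end reward 0,
    and there is no discounting.
    """
    rng = random.Random(seed)
    values = [0.5] * num_states   # arbitrary initial guess
    visits = [0] * num_states     # per-state update counts

    for _ in range(num_episodes):
        state = num_states // 2
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:                 # left terminal
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= num_states:     # right terminal
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False

            # Decaying step size: the sum diverges, the sum of squares converges.
            visits[state] += 1
            alpha = visits[state] ** -0.7

            # Incremental TD(0) update toward the one-step bootstrapped target.
            values[state] += alpha * (reward + next_value - values[state])

            if done:
                break
            state = next_state
    return values


# For this walk the true value of state i is (i + 1) / 6.
true_values = [(i + 1) / 6 for i in range(5)]
estimates = td0_random_walk()
```

Comparing `estimates` with `true_values` gives a rough check that the incremental updates settle near the correct values. How quickly they do so is a finite-sample question, which this sketch does not address.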
