Decision Tree Learning - General

General

Decision tree learning is a method commonly used in data mining. The goal is to create a model that predicts the value of a target variable based on several input variables. An example is shown on the right. Each interior node corresponds to one of the input variables; there are edges to children for each of the possible values of that input variable. Each leaf represents a value of the target variable given the values of the input variables represented by the path from the root to the leaf.
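As a minimal sketch of the structure just described, the snippet below represents a tree as nested Python dictionaries and predicts by following the path from the root to a leaf. The weather attributes, their values, and the names `tree` and `predict` are hypothetical and chosen only for illustration; they do not come from the original text.

```python
# Hypothetical toy tree (illustrative assumption, not from the source).
# An interior node is a dict naming one input variable, with one child
# per possible value of that variable. A leaf is a plain target value.
tree = {
    "attribute": "outlook",
    "children": {
        "sunny": {
            "attribute": "windy",
            "children": {"yes": "stay in", "no": "play"},
        },
        "rainy": "stay in",
        "overcast": "play",
    },
}


def predict(node, record):
    """Follow the edges matching the record's values until a leaf is reached."""
    while isinstance(node, dict):
        value = record[node["attribute"]]
        node = node["children"][value]
    return node


prediction = predict(tree, {"outlook": "sunny", "windy": "no"})
```

Here the record follows the "sunny" edge from the root and then the "no" edge from the "windy" node, so the leaf it reaches gives the predicted target value.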

A tree can be "learned" by splitting the source set into subsets based on an attribute value test. This process is repeated recursively on each derived subset, a procedure called recursive partitioning. The recursion stops when every record in the subset at a node has the same value of the target variable, or when splitting no longer adds value to the predictions. This process of top-down induction of decision trees (TDIDT) is an example of a greedy algorithm. It is by far the most common strategy for learning decision trees from data, but it is not the only one: some approaches have been developed that perform tree induction in a bottom-up fashion.
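Recursive partitioning can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not a reference implementation: the text does not name a splitting criterion, so the code assumes a simple one (pick the attribute whose split leaves the fewest misclassified records under majority vote), and it treats "splitting no longer adds value" as "no attribute reduces that error count". The function names and the toy weather records are hypothetical.

```python
from collections import Counter


def majority(labels):
    """Most common target value among the labels."""
    return Counter(labels).most_common(1)[0][0]


def errors(records):
    """Records that a majority-vote leaf would misclassify."""
    labels = [y for _, y in records]
    return len(labels) - Counter(labels).most_common(1)[0][1]


def split(records, attribute):
    """Partition records by their value of the given attribute."""
    subsets = {}
    for x, y in records:
        subsets.setdefault(x[attribute], []).append((x, y))
    return subsets


def split_errors(records, attribute):
    """Total majority-vote errors across the subsets of a split."""
    return sum(errors(s) for s in split(records, attribute).values())


def learn_tree(records, attributes):
    labels = [y for _, y in records]
    # Stop: all records at this node share the same target value.
    if len(set(labels)) == 1:
        return labels[0]
    # Stop: nothing left to split on.
    if not attributes:
        return majority(labels)
    # Greedy step (assumed criterion): the locally best attribute.
    best = min(attributes, key=lambda a: split_errors(records, a))
    # Stop: splitting no longer adds value under the assumed criterion.
    if split_errors(records, best) >= errors(records):
        return majority(labels)
    remaining = [a for a in attributes if a != best]
    return {
        "attribute": best,
        "children": {
            value: learn_tree(subset, remaining)
            for value, subset in split(records, best).items()
        },
    }


# Hypothetical toy data: (input variables, target value) pairs.
records = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "stay in"),
    ({"outlook": "rainy", "windy": "no"}, "stay in"),
    ({"outlook": "rainy", "windy": "yes"}, "stay in"),
    ({"outlook": "overcast", "windy": "no"}, "play"),
    ({"outlook": "overcast", "windy": "yes"}, "play"),
]
learned = learn_tree(records, ["outlook", "windy"])
```

Because each choice is made locally and never revisited, the procedure is greedy: it can settle on a tree that is not the smallest or most accurate one possible. Practical learners usually replace the error count assumed here with a measure such as information gain or Gini impurity.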

In data mining, decision trees can also be described as a combination of mathematical and computational techniques that aid the description, categorisation and generalisation of a given set of data.

Data comes in records of the form:

(x, Y) = (x1, x2, x3, ..., Y)

The dependent variable, Y, is the target variable that we are trying to understand, classify or generalise. The vector x is composed of the input variables, x1, x2, x3 etc., that are used for that task.
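One possible way to hold such records in code is as pairs of an input vector and a target value. The values below are hypothetical placeholders chosen for illustration, and the tuple layout is an assumption rather than a required format.

```python
# Each record is (x, Y): x is the vector of input variables and
# Y is the target variable. All values are hypothetical.
records = [
    (("sunny", "no"), "play"),
    (("rainy", "yes"), "stay in"),
]

# Separate the input vectors from the target values.
inputs = [x for x, _ in records]
targets = [y for _, y in records]
```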
