Features and Advantages
The advantages of random forest are:
- It is among the more accurate general-purpose learning algorithms: on many data sets it produces a highly accurate classifier.
- It runs efficiently on large databases.
- It can handle thousands of input variables without variable deletion.
- It gives estimates of what variables are important in the classification.
- It generates an internal unbiased estimate of the generalization error (the out-of-bag error) as the forest building progresses.
- It has an effective method for estimating missing data and maintains accuracy when a large proportion of the data are missing.
- It has methods for balancing error in data sets with unbalanced class populations.
- It computes prototypes that give information about the relation between the variables and the classification.
- It computes proximities between pairs of cases that can be used for clustering, for locating outliers, or (by scaling) for producing informative views of the data.
- These capabilities can be extended to unlabeled data, enabling unsupervised clustering, data views, and outlier detection.
- It offers an experimental method for detecting variable interactions.
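
The internal error estimate mentioned in the list above can be illustrated with a small sketch. The code below is not the reference random forest implementation: it assumes a toy data set, uses one-split "stumps" in place of full trees, and all function names (`fit_stump`, `forest_oob_error`, and so on) are hypothetical. Each tree is fit on a bootstrap sample, and each case is scored only by the trees that did not see it, which is the idea behind the out-of-bag estimate. The per-feature split counts are only a crude stand-in for a variable importance measure.

```python
import random
from collections import Counter

def majority(labels):
    """Most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(X, y, rows, features):
    """Pick the (feature, threshold) with the fewest training errors on `rows`."""
    fallback = majority([y[i] for i in rows])
    best = (len(rows) + 1, features[0], 0.0, fallback, fallback)
    for f in features:
        for t in sorted({X[i][f] for i in rows}):
            left = [y[i] for i in rows if X[i][f] <= t]
            right = [y[i] for i in rows if X[i][f] > t]
            if not right:  # threshold at the maximum value: no real split
                continue
            l_lab, r_lab = majority(left), majority(right)
            errors = sum(v != l_lab for v in left) + sum(v != r_lab for v in right)
            if errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]  # (feature, threshold, left_label, right_label)

def predict_stump(stump, row):
    f, t, l_lab, r_lab = stump
    return l_lab if row[f] <= t else r_lab

def forest_oob_error(X, y, n_trees=40, mtry=2, seed=0):
    """Grow a forest of stumps; return (out-of-bag error, split counts per feature)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    oob_votes = [Counter() for _ in range(n)]
    split_counts = Counter()
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]  # bootstrap sample, with replacement
        in_bag = set(boot)
        # Each stump may only choose among a random subset of `mtry` features.
        stump = fit_stump(X, y, boot, rng.sample(range(p), mtry))
        split_counts[stump[0]] += 1
        for i in range(n):
            if i not in in_bag:  # case i was left out of this tree's sample
                oob_votes[i][predict_stump(stump, X[i])] += 1
    scored = [i for i in range(n) if oob_votes[i]]
    wrong = sum(oob_votes[i].most_common(1)[0][0] != y[i] for i in scored)
    return wrong / len(scored), split_counts

# Toy data (an assumption for illustration): the label depends on
# features 0 and 1, while feature 2 is pure noise.
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random(), data_rng.random()] for _ in range(120)]
y = [1 if a + b > 1.0 else 0 for a, b, _ in X]

oob_error, split_counts = forest_oob_error(X, y)
print(f"OOB error estimate: {oob_error:.3f}")
print("splits chosen per feature:", dict(split_counts))
```

Because every case is left out of roughly a third of the bootstrap samples, the estimate needs no separate held-out set. Stumps are weak learners, so the error here is much higher than a forest of full trees would give on the same data.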