Studies Comparing Methods
Nakhleh et al. carried out a comparison of six analysis methods using an IE database. The methods compared were UPGMA, NJ MP, MC, WMC and GA. The PAUP software package was used for UPGMA, NJ, and MC as well as computing the majority consensus trees. The RWT database was used but 40 characters were removed due to evidence of polymorphism. Then a screened database was produced excluding all characters that clearly exhibited parallel development, so eliminating 38 features. The trees were evaluated on the basis of the number of incompatible characters and on agreement with established sub-grouping results. They found that UPGMA was clearly worst but there was not a lot of difference between the other methods. The results depended on the data set used. It was found that weighting the characters was important, which requires linguistic judgement.
A comparison of coding methods was carried out by Rexova et al.. They created a reduced data set from the Dyen database but with the addition of Hittite. They produced a standard multistate matrix where the 141 character states corresponds to individual cognate classes, allowing polymorphism. They also joined some cognate classes, to reduce subjectivity and polymorphic states were not allowed. Lastly they produced a binary matrix where each class of words was treated as a separate character. The matrices were analysed by PAUP. It was found that using the binary matrix produced changes near the root of the tree.
Barbancon et al. studied various tree reconstruction methods using simulated data. Their simulated data varied in the number of contact edges, the degree of homoplasy, the deviation from a lexical clock, and the deviation from the rates-across-sites assumption. It was found that the accuracy of the unweighted methods (MP, NJ, UPGMA, and GA) were consistent in all the conditions studied, with MP being the best. The accuracy of the two weighted methods (WMC and WMP) depended on the appropriateness of the weighting scheme. With low homoplasy the weighted methods generally produced the more accurate results but inappropriate weighting could make these worse than MP or GA under moderate or high homoplasy levels.
McMahon and McMahon used three PHYLIP programs (NJ, Fitch and Kitch) on the DKB dataset. They found that the results produced were very similar. Bootstrapping was used to test the robustness of any part of the tree. Later they used subsets of the data to assess its retentiveness and reconstructability. The outputs showed topological differences which were attributed to borrowing. They then also used Network, Split Decomposition, Neighbor-net and Splitstree on several data sets. Significant differences were found between the latter two methods. Neighbor-net was considered optimal for discerning language contact.
Cysouw et al. compared Holm's original method with NJ, Fitch, MP and SD. They found Holm's method to be less accurate than the others.
Saunders compared NJ, MP, GA and Neighbor-Net on a combination of lexical and typological data. He recommended use of the GA method but Nichols and Warnow have some concerns about the study methodology.
Read more about this topic: Quantitative Comparative Linguistics
Famous quotes containing the words studies, comparing and/or methods:
“These studies which stimulate the young, divert the old, are an ornament in prosperity and a refuge and comfort in adversity; they delight us at home, are no impediment in public life, keep us company at night, in our travels, and whenever we retire to the country.”
—Marcus Tullius Cicero (10643 B.C.)
“There is no comparing the brutality and cynicism of todays pop culture with that of forty years ago: from High Noon to Robocop is a long descent.”
—Charles Krauthammer (b. 1950)
“If you want to know the taste of a pear, you must change the pear by eating it yourself.... If you want to know the theory and methods of revolution, you must take part in revolution. All genuine knowledge originates in direct experience.”
—Mao Zedong (18931976)