Quantitative Comparative Linguistics - Studies Comparing Methods

Studies Comparing Methods

Nakhleh et al. compared six analysis methods using an Indo-European (IE) database: UPGMA, NJ, MP, MC, WMC and GA. The PAUP software package was used for UPGMA, NJ and MC, as well as for computing the majority consensus trees. The RWT database was used, but 40 characters were removed because they showed evidence of polymorphism. A screened database was then produced by excluding all characters that clearly exhibited parallel development, thereby eliminating 38 features. The trees were evaluated on the number of incompatible characters and on their agreement with established sub-grouping results. UPGMA was clearly the worst, but there was little difference between the other methods, and the results depended on the data set used. Weighting the characters was found to be important, and this requires linguistic judgement.
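The "number of incompatible characters" criterion can be illustrated with a minimal sketch. It assumes a rooted, fully resolved tree written as nested tuples and one state per language, and it treats a character as incompatible when it needs more changes on the tree than its number of states minus one. The tree, the language names and the toy characters are hypothetical, and the Fitch counting procedure is a standard technique chosen here for illustration, not necessarily the procedure used in the study.

```python
def fitch_changes(tree, states):
    """Minimum number of state changes for one character on a rooted binary tree.

    `tree` is a nested 2-tuple of leaf names; `states` maps leaf name -> state.
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):          # leaf: its observed state
            return {states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:                         # children agree: no change needed here
            return common
        changes += 1                       # children disagree: one change
        return left | right

    visit(tree)
    return changes


def count_incompatible(tree, characters):
    """Count characters that cannot evolve on `tree` without homoplasy."""
    return sum(
        1
        for states in characters
        if fitch_changes(tree, states) > len(set(states.values())) - 1
    )


# Hypothetical four-language tree and two toy characters.
toy_tree = (("A", "B"), ("C", "D"))
toy_characters = [
    {"A": 0, "B": 0, "C": 1, "D": 1},  # fits the tree with a single change
    {"A": 0, "B": 1, "C": 0, "D": 1},  # needs two changes, so it is incompatible
]
n_incompatible = count_incompatible(toy_tree, toy_characters)
```

On this toy input only the second character conflicts with the tree, so a lower count indicates a tree that fits the data better under this criterion.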

A comparison of coding methods was carried out by Rexova et al. They created a reduced data set from the Dyen database, with the addition of Hittite. First, they produced a standard multistate matrix in which the 141 character states correspond to individual cognate classes, allowing polymorphism. Second, they joined some cognate classes to reduce subjectivity; in this matrix polymorphic states were not allowed. Lastly, they produced a binary matrix in which each class of words was treated as a separate character. The matrices were analysed with PAUP. It was found that using the binary matrix produced changes near the root of the tree.
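The difference between multistate and binary coding can be sketched as follows. This is a minimal illustration, assuming the multistate data are stored as a mapping from meaning to language to a set of cognate classes (a set with more than one member represents polymorphism). The data and the function name `to_binary` are hypothetical and are not taken from the Dyen database.

```python
# Hypothetical toy data: meaning -> language -> set of cognate classes.
multistate = {
    "hand": {"English": {"A"}, "German": {"A"}, "French": {"B"}},
    "water": {"English": {"A"}, "German": {"A"}, "French": {"B", "C"}},
}


def to_binary(matrix):
    """Recode a multistate matrix so each cognate class becomes its own 0/1 character."""
    columns = []
    for meaning in sorted(matrix):
        classes = set().union(*matrix[meaning].values())
        columns.extend((meaning, cognate) for cognate in sorted(classes))
    languages = sorted({lang for by_lang in matrix.values() for lang in by_lang})
    rows = {
        lang: [
            1 if cognate in matrix[meaning].get(lang, set()) else 0
            for meaning, cognate in columns
        ]
        for lang in languages
    }
    return columns, rows


columns, rows = to_binary(multistate)
```

Here two multistate characters become five binary characters, and the polymorphic French entry for "water" is simply coded as present in two columns.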

Barbancon et al. studied various tree reconstruction methods using simulated data. Their simulated data varied in the number of contact edges, the degree of homoplasy, the deviation from a lexical clock, and the deviation from the rates-across-sites assumption. The accuracy of the unweighted methods (MP, NJ, UPGMA and GA) was found to be consistent across all the conditions studied, with MP being the best. The accuracy of the two weighted methods (WMC and WMP) depended on the appropriateness of the weighting scheme. With low homoplasy the weighted methods generally produced the more accurate results, but under moderate or high homoplasy inappropriate weighting could make them worse than MP or GA.

McMahon and McMahon used three PHYLIP programs (NJ, Fitch and Kitch) on the DKB dataset and found that the results produced were very similar. Bootstrapping was used to test the robustness of any part of the tree. Later they used subsets of the data to assess its retentiveness and reconstructability. The outputs showed topological differences, which were attributed to borrowing. They then also used Network, Split Decomposition, Neighbor-net and SplitsTree on several data sets. Significant differences were found between the latter two methods. Neighbor-net was considered optimal for discerning language contact.
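The resampling step behind bootstrapping can be sketched as below. This is a minimal illustration, assuming a character matrix stored as language to list of states. Each replicate draws character columns with replacement, and in a full analysis a tree would then be rebuilt from every replicate to see how often each grouping recurs. The function name and the toy matrix are hypothetical, and the tree-building step (done with PHYLIP in the study) is not shown.

```python
import random


def bootstrap_replicates(matrix, n_replicates, seed=0):
    """Return pseudo-replicate matrices made by resampling character columns."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    n_chars = len(next(iter(matrix.values())))
    replicates = []
    for _ in range(n_replicates):
        # Draw as many column indices as there are characters, with replacement.
        cols = [rng.randrange(n_chars) for _ in range(n_chars)]
        replicates.append(
            {lang: [row[c] for c in cols] for lang, row in matrix.items()}
        )
    return replicates


# Hypothetical toy matrix: three languages, four binary characters.
toy_matrix = {
    "L1": [1, 0, 1, 1],
    "L2": [1, 0, 0, 1],
    "L3": [0, 1, 0, 0],
}
replicates = bootstrap_replicates(toy_matrix, n_replicates=100)
```

Because whole columns are resampled, every language keeps the same character in the same position within a replicate, which is what allows the replicates to be analysed in the same way as the original data.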

Cysouw et al. compared Holm's original method with NJ, Fitch, MP and SD. They found Holm's method to be less accurate than the others.

Saunders compared NJ, MP, GA and Neighbor-Net on a combination of lexical and typological data. He recommended use of the GA method, but Nichols and Warnow have raised some concerns about the study's methodology.
