Abstract: This paper describes a method of comparing corpora which uses frequency profiling. The method can be used to discover key words in
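As a rough illustration of frequency profiling, the sketch below builds a word frequency list for each corpus and ranks words by how strongly their frequencies differ. The excerpt does not say which statistic the method uses, so the log-likelihood (G2) score here is an assumption, chosen because it is a widely used keyness measure. The names `log_likelihood` and `key_words` are hypothetical, and tokenisation is assumed to have been done already.

```python
import math
from collections import Counter


def log_likelihood(a, b, n1, n2):
    """G2 score for a word occurring a times in corpus 1 (n1 tokens)
    and b times in corpus 2 (n2 tokens)."""
    total = a + b
    # Expected counts if the word were equally frequent in both corpora.
    e1 = n1 * total / (n1 + n2)
    e2 = n2 * total / (n1 + n2)
    ll = 0.0
    # A zero count contributes nothing (0 * log 0 is treated as 0).
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2.0 * ll


def key_words(tokens1, tokens2, top=10):
    """Return the `top` words with the largest G2 score, highest first."""
    f1, f2 = Counter(tokens1), Counter(tokens2)
    n1, n2 = len(tokens1), len(tokens2)
    scores = {
        w: log_likelihood(f1[w], f2[w], n1, n2)
        for w in set(f1) | set(f2)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top]
```

A word with the same relative frequency in both corpora scores 0, and the score grows as the two relative frequencies diverge. The score alone does not say which corpus overuses the word, so the relative frequencies should be compared as well.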
Almost all the techniques considered work with word frequencies. While a full comparison between any two corpora would, of course, cover many
17 Oct 2016 · these issues across nine languages, including starting from loosely comparable We assess a corpus by comparing its word frequency list
As corpora often differ in size, a critically important assumption in this field states that the use of a normalized frequency threshold, such as 20 occurrences
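To make the idea of a normalized frequency concrete, here is a minimal sketch that rescales a raw count to a common base so that corpora of different sizes can be compared. The excerpt is cut off before it states the unit of its threshold, so the per-million base used as the default is an assumption, and `normalized_frequency` is a hypothetical helper name.

```python
def normalized_frequency(count, corpus_size, per=1_000_000):
    """Rescale a raw count to occurrences per `per` tokens.

    The default base (one million tokens) is an assumption for
    illustration; use whatever base the threshold is defined on.
    """
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return count * per / corpus_size


# The same normalized frequency can come from very different raw counts:
small = normalized_frequency(20, 1_000_000)   # 20 hits in 1M tokens
large = normalized_frequency(50, 2_500_000)   # 50 hits in 2.5M tokens
```

Both calls give the same normalized value, even though the raw counts differ by a factor of 2.5.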
In the case of comparing the frequencies of a given word in two corpora the test statistic is the difference between these frequencies and, put simply, the null
We recommend the use of the t-test, Wilcoxon rank-sum test, or bootstrap test for comparing word frequencies across corpora. 1 Introduction Comparison
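Of the three tests named above, the bootstrap test is the easiest to sketch without external libraries. The version below is one common formulation and not necessarily the one the authors use. It assumes each corpus is represented by a list of per-document relative frequencies of the word, takes the difference in means as the test statistic, and imposes the null hypothesis by shifting both samples to the pooled mean before resampling. The function name, the default of 2000 resamples, and the fixed seed are all hypothetical choices.

```python
import random
from statistics import mean


def bootstrap_mean_diff_test(x, y, n_resamples=2000, seed=0):
    """Two-sided bootstrap test for a difference in means.

    x, y: per-document relative frequencies of one word in two corpora.
    Returns (observed difference, approximate p-value).
    """
    rng = random.Random(seed)
    mean_x, mean_y = mean(x), mean(y)
    observed = mean_x - mean_y
    pooled_mean = mean(list(x) + list(y))

    # Shift each sample so both have the pooled mean (the null hypothesis),
    # while keeping each sample's own spread.
    x0 = [v - mean_x + pooled_mean for v in x]
    y0 = [v - mean_y + pooled_mean for v in y]

    extreme = 0
    for _ in range(n_resamples):
        xs = rng.choices(x0, k=len(x0))  # resample with replacement
        ys = rng.choices(y0, k=len(y0))
        if abs(mean(xs) - mean(ys)) >= abs(observed):
            extreme += 1

    # Add-one correction keeps the p-value strictly above zero.
    return observed, (extreme + 1) / (n_resamples + 1)
```

Working at the document level treats documents, rather than individual word tokens, as the sampling unit, so a word that is very frequent in a single document does not by itself produce a small p-value.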
5 Mar 2019 · Why describe / compare corpora? Over 60 normalised feature counts at the document level frequency of the same word in corpus B