Abstract:
Heylighen and Dewaele’s (2002) F-score, a measure of formality based on the frequencies of word categories, is used as a starting point for an investigation of an online diary corpus. Results for the main corpus of diary entries are compared with those for a smaller corpus of diary comments and with previously calculated F-scores for similar types of data (Nowson, Oberlander & Gill, 2005). While the overall F-score is similar in the two corpora, the results show that the internal make-up of the categories on which the calculation is based can differ. This suggests that while the F-score is a good measure of formality/contextuality and is useful for distinguishing between genres on a large scale, more detailed analyses are required to describe genres more completely and to situate them with respect to one another.
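
As an illustration of the point about internal make-up, the following is a minimal sketch of the F-score calculation as defined by Heylighen and Dewaele: the summed percentage frequencies of nouns, adjectives, prepositions and articles, minus those of pronouns, verbs, adverbs and interjections, plus 100, all divided by 2. The function name `f_score` and the two frequency profiles are hypothetical and chosen only for illustration; they are not figures from the corpora described here.

```python
# Categories that raise the F-score (more formal, less context-dependent)
FORMAL = ("noun", "adjective", "preposition", "article")
# Categories that lower the F-score (more deictic, context-dependent)
DEICTIC = ("pronoun", "verb", "adverb", "interjection")


def f_score(freq):
    """Compute the F-score from a mapping of category -> percentage of all words."""
    formal = sum(freq.get(c, 0) for c in FORMAL)
    deictic = sum(freq.get(c, 0) for c in DEICTIC)
    return (formal - deictic + 100) / 2


# Hypothetical percentage profiles, invented for illustration only
corpus_a = {"noun": 20, "adjective": 6, "preposition": 10, "article": 8,
            "pronoun": 12, "verb": 18, "adverb": 6, "interjection": 1}
corpus_b = {"noun": 24, "adjective": 4, "preposition": 9, "article": 7,
            "pronoun": 10, "verb": 20, "adverb": 5, "interjection": 2}

print(f_score(corpus_a), f_score(corpus_b))
```

Both invented profiles give the same overall score even though every category frequency differs, which is the kind of difference a single summary value cannot show.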
DOI:
10.1609/icwsm.v3i1.14004