Rhys Price Jones, J. Fernando Naveda, Paul Roetling, Steven J. Harrington, and Nishant Thakkar
We identify three aspects of style pertaining to documents. The first, which we call literary style, includes the word and sentence constructions and the choice of illustrations traditionally associated with authorship. The second, which we call informative style, includes formatting and iconic choices that convey additional information such as the document’s genre or corporate identity. The third, which we call intent style, covers the degrees of freedom remaining for the author and is used to convey the author’s intent. Literary style is the realm of academic scholarship and discourse and is beyond the scope of the present article. Informative and intent style, however, can be quantified by measuring many different attributes. For example, the density of text, the colorfulness of images, the regularity of image positioning, and the diversity of fonts and typefaces all contribute to the document’s overall style. Indeed, we have identified more than 150 different value functions, each of which can be measured and each of which can contribute to a document’s overall stylistic appearance. Measuring these value functions effectively places a document as a point in a style space. The 150 value functions are not independent, however, so we describe a heuristic approach for investigating whether basis vectors can be found for intent space.
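The idea of a document as a point in style space, and of value functions that are not independent, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `Page` fields, the three toy value functions, and the use of Pearson correlation to expose dependence are hypothetical and are not the article's 150 value functions or its heuristic.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Page:
    # Hypothetical, simplified page description (not from the article).
    text_area: float   # area covered by text
    page_area: float   # total page area
    fonts: list        # font name of each text run


def text_density(p):
    # Fraction of the page covered by text.
    return p.text_area / p.page_area


def font_diversity(p):
    # Number of distinct fonts divided by the number of text runs.
    return len(set(p.fonts)) / len(p.fonts) if p.fonts else 0.0


def white_space(p):
    # Deliberately redundant: fully determined by text_density.
    return 1.0 - text_density(p)


# Illustrative stand-ins for the article's value functions.
VALUE_FUNCTIONS = [text_density, font_diversity, white_space]


def style_vector(p):
    # One coordinate per value function: the page's point in style space.
    return [f(p) for f in VALUE_FUNCTIONS]


def pearson(xs, ys):
    # Pearson correlation coefficient; returns 0.0 if either input is constant.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


pages = [
    Page(30, 100, ["Times"] * 4),
    Page(60, 100, ["Times", "Arial", "Times", "Courier"]),
    Page(80, 100, ["Arial", "Arial"]),
]
vectors = [style_vector(p) for p in pages]
density = [v[0] for v in vectors]
white = [v[2] for v in vectors]

print(vectors)
print(pearson(density, white))  # very close to -1: the two measures are dependent
```

In this toy setting, `white_space` adds no new dimension because it is a linear function of `text_density`. Detecting and removing redundancy of this kind across many measures is the sort of problem that a search for basis vectors has to address.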