Abstract:
In this paper we summarize the T-Recs block segmentation and layout analysis approach for tabular documents before discussing an additional processing step for the proper recognition of potential tables. While the results of the T-Recs processing steps so far look good on documents such as articles, which may occasionally contain a table among regular paragraphs, the identification of tables gets confused by logical objects such as the recipient address, date, or company-specific printing in business letter heads if those layout objects occur in a horizontal (left-of, right-of) neighborhood. To increase the precision for correctly recognized tables in the above-mentioned domain of business letters, we developed an additional processing step whose purpose is to determine clusters of cells that show more evident features of a table than just the left-of, right-of relation between some blocks.