Creating RSS for News Archives and Beyond

Authors

Sandip Debnath

Track:

All Papers

Abstract:

RSS, or Rich Site Summary, is becoming an invaluable format and tool for news feeds. More and more news publishing organizations are realizing its benefits, and content publishers are joining the already crowded RSS club. In the era of information explosion and peer-to-peer sharing, RSS is a great format for content publishing, archiving, sharing and much more. However, it came late. It should have started when the Internet became popular and news organizations were making their online debut. During the last decade, an enormous number of news articles were published and, at the same time, improperly archived because no flexible and widely accepted archival format existed. However, better late than never. As we now explore the possibilities of RSS, this is the time to make the transition smooth for old unformatted news articles and to make the format uniform across all news articles, new and old. Extracting the metadata of old news articles is one way to create their RSS versions. In this paper we describe our progress in extracting news metadata with support vector classifiers and show that applying the classifiers in a particular order is more useful than applying them in random order. We also show preliminary results on applying TIMEX tags to extract news events, which can be very useful for going beyond RSS to create individual event lines instead of placing the whole story under a single timeline.
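To illustrate the final step described above, the following is a minimal sketch of how already-extracted metadata could be serialized as an RSS 2.0 feed. It is not the paper's method: the function name `build_rss`, the dictionary keys (`title`, `link`, `summary`, `published`) and the example URLs are hypothetical, and the metadata is assumed to have been produced by some earlier extraction stage.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_rss(channel_title, channel_link, channel_description, articles):
    """Serialize extracted article metadata as an RSS 2.0 document (string).

    `articles` is a list of dicts with the hypothetical keys
    'title', 'link', 'summary' and 'published' (a timezone-aware datetime).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link and description on the channel.
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = channel_description

    for article in articles:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = article["title"]
        ET.SubElement(item, "link").text = article["link"]
        ET.SubElement(item, "description").text = article["summary"]
        # pubDate uses the RFC 822 style date format expected by RSS 2.0.
        ET.SubElement(item, "pubDate").text = format_datetime(article["published"])

    return ET.tostring(rss, encoding="unicode")


# Example with made-up metadata for one archived article.
feed = build_rss(
    "Example News Archive",
    "http://example.com/",
    "RSS version of an old news archive",
    [
        {
            "title": "Example headline",
            "link": "http://example.com/2000/02/01/story.html",
            "summary": "First paragraph of the archived story.",
            "published": datetime(2000, 2, 1, 12, 0, tzinfo=timezone.utc),
        }
    ],
)
print(feed)
```

Each archived article becomes one `item` element, so old and new articles can be exposed through the same feed structure once their metadata is available.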
