Please use this identifier to cite or link to this item:
https://rfos.fon.bg.ac.rs/handle/123456789/1404

Title: Automated classification and localization of daily deal content from the Web
Authors: Cuzzola, John; Jovanović, Jelena; Bagheri, Ebrahim; Gašević, Dragan
Keywords: Web classification; Segmentation; Information extraction
Issue Date: 2015
Publisher: Elsevier, Amsterdam
Abstract: Websites offering daily deal offers have received widespread attention from the end-users. The objective of such Websites is to provide time limited discounts on goods and services in the hope of enticing more customers to purchase such goods or services. The success of daily deal Websites has given rise to meta-level daily deal aggregator services that collect daily deal information from across the Web. Due to some of the unique characteristics of daily deal Websites such as high update frequency, time sensitivity, and lack of coherent information representation, many deal aggregators rely on human intervention to identify and extract deal information. In this paper, we propose an approach where daily deal information is identified, classified and properly segmented and localized. Our approach is based on a semi-supervised method that uses sentence-level features of daily deal information on a given Web page. Our work offers (i) a set of computationally inexpensive discriminative features that are able to effectively distinguish Web pages that contain daily deal information; (ii) the construction and systematic evaluation of machine learning techniques based on these features to automatically classify daily deal Web pages; and (iii) the development of an accurate segmentation algorithm that is able to localize and extract individual deals from within a complex Web page. We have extensively evaluated our approach from different perspectives, the results of which show notable performance.
URI: https://rfos.fon.bg.ac.rs/handle/123456789/1404
ISSN: 1568-4946
Appears in Collections: Radovi istraživača / Researchers’ publications
