Please use this identifier to cite or link to this item: https://rfos.fon.bg.ac.rs/handle/123456789/1575
Full metadata record
DC Field | Value | Language
dc.creator | Kirchner, Kathrin |
dc.creator | Zec, Jelena |
dc.creator | Delibašić, Boris |
dc.date.accessioned | 2023-05-12T11:03:16Z | -
dc.date.available | 2023-05-12T11:03:16Z | -
dc.date.issued | 2016 |
dc.identifier.issn | 0269-2821 |
dc.identifier.uri | https://rfos.fon.bg.ac.rs/handle/123456789/1575 | -
dc.description.abstract | Clustering is among the most popular data mining algorithm families. Before applying clustering algorithms to datasets, it is usually necessary to preprocess the data properly. Data preprocessing is a crucial yet still neglected step in data mining. Although preprocessing techniques and algorithms are well known, the preprocessing process is very complex and usually takes a lot of time. Instead of being handled more systematically, preprocessing is usually undervalued, i.e. more emphasis is put on choosing the appropriate clustering algorithm and setting its parameters. In our opinion, this is not because preprocessing is less important, but because it is difficult to choose the best sequence of preprocessing algorithms. We argue that it is important to better standardize this process so that it is performed efficiently. Therefore, this paper proposes a generic framework for data preprocessing. It is based on a survey of data mining experts, as well as a literature and software review. The framework enables pipelining preprocessing algorithms and methods which facilitate further automated preprocessing design and the selection of a suitable preprocessing stream. The proposed framework is easily extendible, so it can be applied to other data mining algorithm families that have their own idiosyncrasies. | en
dc.publisher | Springer, Dordrecht |
dc.rights | restrictedAccess |
dc.source | Artificial Intelligence Review |
dc.subject | Preprocessing stream selection | en
dc.subject | Preprocessing in data mining | en
dc.subject | Generic framework | en
dc.subject | Clustering algorithm | en
dc.title | Facilitating data preprocessing by a generic framework: a proposal for clustering | en
dc.type | article |
dc.rights.license | ARR |
dc.citation.epage | 297 |
dc.citation.issue | 3 |
dc.citation.other | 45(3): 271-297 |
dc.citation.rank | M21 |
dc.citation.spage | 271 |
dc.citation.volume | 45 |
dc.identifier.doi | 10.1007/s10462-015-9446-6 |
dc.identifier.rcub | conv_1823 |
dc.identifier.scopus | 2-s2.0-84957849248 |
dc.identifier.wos | 000377827400001 |
dc.type.version | publishedVersion |
item.cerifentitytype | Publications | -
item.fulltext | With Fulltext | -
item.grantfulltext | restricted | -
item.openairetype | article | -
item.openairecristype | http://purl.org/coar/resource_type/c_18cf | -
Appears in Collections: Radovi istraživača / Researchers’ publications
Files in This Item:
File | Description | Size | Format
1571.pdf (Restricted Access) | | 889.06 kB | Adobe PDF

Scopus citations: 27 (checked on Nov 17, 2025)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.