Please use this identifier to cite or link to this item: https://rfos.fon.bg.ac.rs/handle/123456789/2959
Full metadata record
DC Field | Value | Language
dc.creator | Mahovac, Zoran | en_US
dc.creator | Petrović, Andrija | en_US
dc.creator | Radovanović, Sandro | en_US
dc.creator | Delibašić, Boris | en_US
dc.date.accessioned | 2025-12-04T09:12:17Z | -
dc.date.available | 2025-12-04T09:12:17Z | -
dc.date.issued | 2025 | -
dc.identifier.isbn | 978-86-7680-484-9 | -
dc.identifier.uri | https://rfos.fon.bg.ac.rs/handle/123456789/2959 | -
dc.description.abstract | Tabular data, such as relational tables, Web tables and CSV files, is among the most primitive and essential forms of data in machine learning, characterized by excellent structural properties, readability, and interpretability. However, acquiring substantial amounts of high-quality tabular data for ML model training remains a persistent challenge. This study evaluates the performance of six generative models (TVAE, RTVAE, CTGAN, ADSGAN, BNN, and Marginal Distributions) on synthetic data generation. The evaluation is based on three key metrics: Fidelity, Diversity, and Generalization. Fidelity measures the quality of synthetic data, Diversity assesses how well the samples cover the variability of the real dataset, and Generalization quantifies the risk of overfitting. The research applies these metrics to four datasets: Abalone, Acute Inflammation, Census Income, and Pittsburgh Bridges. Results show that CTGAN consistently outperforms other models measured by IPα and IRβ metrics, while RTVAE excels in the Census Income dataset in terms of Generalization. Marginal Distributions stands out in preserving data authenticity. This study offers a refined method of evaluating generative models, emphasizing precision–recall analysis grounded in minimum volume sets, thus providing a deeper understanding of model performance across multiple dimensions. | en_US
dc.language.iso | en | en_US
dc.publisher | Univerzitet u Beogradu – Fakultet organizacionih nauka | en_US
dc.rights | openAccess | en_US
dc.source | Proceedings of the 11th International Conference on Decision Support System Technology (ICDSST 2025) | en_US
dc.subject | Generative models | en_US
dc.subject | synthetic data | en_US
dc.subject | Fidelity | en_US
dc.subject | Diversity | en_US
dc.subject | Generalization | en_US
dc.subject | tabular data generation | en_US
dc.title | Evaluating Generative Models for Synthetic Tabular Data: A Comparative Analysis of Fidelity, Diversity, and Generalization | en_US
dc.type | conferenceObject | en_US
dc.citation.epage | 20 | en_US
dc.citation.spage | 12 | en_US
dc.type.version | publishedVersion | en_US
item.fulltext | With Fulltext | -
item.openairetype | conferenceObject | -
item.grantfulltext | open | -
item.cerifentitytype | Publications | -
item.openairecristype | http://purl.org/coar/resource_type/c_18cf | -
item.languageiso639-1 | en | -
Appears in Collections: Radovi istraživača / Researchers’ publications
Files in This Item:
File | Description | Size | Format
133_mahovac_et_al_icdsst_2025.pdf | | 1.38 MB | Adobe PDF

Page view(s): 18 (checked on Dec 14, 2025)
Download(s): 38 (checked on Dec 14, 2025)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.