Novel approach to data discretization
11 September 2015
Grzegorz Borowik, Karol Kowalski, Cezary Jankowski
Proceedings Volume 9662, Photonics Applications in Astronomy, Communications, Industry, and High-Energy Physics Experiments 2015; 96623U (2015) https://doi.org/10.1117/12.2205916
Event: XXXVI Symposium on Photonics Applications in Astronomy, Communications, Industry, and High-Energy Physics Experiments (Wilga 2015), 2015, Wilga, Poland
Abstract
Discretization is an important preprocessing step in data mining. A data discretization method determines ranges of values for numeric attributes; these ranges become the discrete intervals of new attributes. The ranges induced by a proposed set of cuts are analyzed in order to obtain a minimal set of ranges that still allows classification. For this purpose, a special discernibility function can be constructed as a conjunction of disjunctions of cuts, with one disjunction for each pair of objects that have different decisions, containing the cuts that discern those two objects. However, data mining methods based on the discernibility matrix are impractical for large databases. This paper proposes and implements a new data discretization algorithm that is based on statistics of attribute values and avoids building the discernibility matrix explicitly. Evaluation of time complexity has shown that, for large data sets, the proposed method is much more efficient than currently available solutions.
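To illustrate the idea of scoring cuts from value statistics rather than from an explicit discernibility matrix, here is a minimal Python sketch. It is not the authors' algorithm, which the abstract does not specify; it only shows one well-known way to count how many pairs of objects with different decisions a single cut discerns, using per-class counts on each side of the cut. The function names `candidate_cuts`, `discerned_pairs`, and `best_cut`, and the choice of midpoints between consecutive distinct values as candidate cuts, are assumptions made for this example.

```python
from collections import Counter


def candidate_cuts(values):
    """Candidate cuts (assumed here): midpoints between consecutive distinct values."""
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]


def discerned_pairs(values, decisions, cut):
    """Count pairs of objects with different decisions that lie on opposite
    sides of `cut`, using only class counts (no pairwise matrix)."""
    left = Counter(d for v, d in zip(values, decisions) if v < cut)
    right = Counter(d for v, d in zip(values, decisions) if v >= cut)
    n_left, n_right = sum(left.values()), sum(right.values())
    # Pairs split by the cut that share the same decision are not discerned.
    same_decision = sum(left[d] * right[d] for d in left)
    return n_left * n_right - same_decision


def best_cut(values, decisions):
    """Greedy choice (an assumption): the cut that discerns the most pairs."""
    return max(candidate_cuts(values),
               key=lambda c: discerned_pairs(values, decisions, c))


if __name__ == "__main__":
    values = [1.0, 2.0, 3.0, 4.0]
    decisions = ["a", "a", "b", "b"]
    for cut in candidate_cuts(values):
        print(cut, discerned_pairs(values, decisions, cut))
    print("best cut:", best_cut(values, decisions))
```

For this four-object attribute, the cut at 2.5 separates both "a" objects from both "b" objects and so discerns all four differing pairs, while the other two candidate cuts discern two pairs each. The counts come from two frequency tables per cut, so no table of object pairs is ever materialized.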
© (2015) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Grzegorz Borowik, Karol Kowalski, and Cezary Jankowski "Novel approach to data discretization", Proc. SPIE 9662, Photonics Applications in Astronomy, Communications, Industry, and High-Energy Physics Experiments 2015, 96623U (11 September 2015); https://doi.org/10.1117/12.2205916
KEYWORDS
Data mining
Databases
FDA class I medical device development
Protactinium
Algorithm development
Computing systems
Data processing
