6 April 2000 Distance functions in dynamic integration of data mining techniques
Seppo Jumani Puuronen, Alexey Tsymbal, Vagan Terziyan
Abstract
One of the most important directions in the improvement of data mining and knowledge discovery is the integration of multiple data mining techniques. An integration method needs to be able either to evaluate and select the most appropriate data mining technique or to combine two or more techniques efficiently. A recent method for the dynamic integration of multiple data mining techniques is based on the assumption that each technique is the best one inside a certain subarea of the whole domain area. This method uses an instance-based learning approach to collect information about the competence areas of the mining techniques and applies a distance function to determine how close a new instance is to each instance of the training set. The nearest instance or instances are used to predict the performance of the data mining techniques. Because the quality of the integration depends heavily on the suitability of the distance function used, our goal is to analyze the characteristics of different distance functions. In this paper we investigate several distance functions, among them the very commonly used Euclidean distance function, the Heterogeneous Euclidean-Overlap Metric (HEOM), and the Heterogeneous Value Difference Metric (HVDM). We analyze how the choice of distance function affects the accuracy achieved by dynamic integration as the parameters describing the datasets vary. We also include results of our experiments with different datasets that contain both nominal and continuous attributes.
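As an illustration of the idea described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the commonly cited form of HEOM (overlap distance for nominal attributes, range-normalized difference for continuous ones, distance 1 for missing values) and a simple selection rule that picks the technique with the fewest errors on the nearest training instances. The function names, the `errors` table layout, and the parameter `k` are hypothetical and chosen only for this example.

```python
import math


def heom(x, y, nominal, ranges):
    """HEOM distance between two instances given as sequences of values.

    nominal: set of indices of nominal attributes (assumed to be known).
    ranges:  dict mapping a continuous attribute index to (max - min)
             observed over the training set.
    None marks a missing value.
    """
    total = 0.0
    for a, (xa, ya) in enumerate(zip(x, y)):
        if xa is None or ya is None:
            d = 1.0                              # missing value: maximal distance
        elif a in nominal:
            d = 0.0 if xa == ya else 1.0         # overlap metric
        else:
            r = ranges[a]
            d = abs(xa - ya) / r if r else 0.0   # range-normalized difference
        total += d * d
    return math.sqrt(total)


def select_technique(new_instance, training, errors, nominal, ranges, k=1):
    """Return the index of the technique with the fewest errors
    on the k training instances nearest to new_instance.

    errors[i][j] is 1 if technique j misclassified training instance i,
    else 0 (a hypothetical layout, e.g. filled in by cross-validation).
    """
    by_distance = sorted(
        range(len(training)),
        key=lambda i: heom(new_instance, training[i], nominal, ranges),
    )
    nearest = by_distance[:k]
    n_techniques = len(errors[0])
    local_error = [sum(errors[i][j] for i in nearest) for j in range(n_techniques)]
    return min(range(n_techniques), key=local_error.__getitem__)


# Toy data: attribute 0 is continuous, attribute 1 is nominal.
training = [(1.0, "red"), (2.0, "red"), (9.0, "blue"), (10.0, "blue")]
errors = [[0, 1], [0, 1], [1, 0], [1, 0]]   # technique 0 is good on "red", 1 on "blue"
nominal = {1}
ranges = {0: 9.0}

best_for_red = select_technique((1.5, "red"), training, errors, nominal, ranges, k=2)
best_for_blue = select_technique((9.5, "blue"), training, errors, nominal, ranges, k=2)
```

In this toy setup the two techniques have disjoint competence areas, so the selected technique changes with the neighbourhood of the new instance. Swapping `heom` for another distance function (for example plain Euclidean distance on the continuous attributes) changes which training instances count as nearest, which is the effect the paper sets out to study.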
© (2000) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Seppo Jumani Puuronen, Alexey Tsymbal, and Vagan Terziyan "Distance functions in dynamic integration of data mining techniques", Proc. SPIE 4057, Data Mining and Knowledge Discovery: Theory, Tools, and Technology II, (6 April 2000); https://doi.org/10.1117/12.381747
KEYWORDS
Data mining

Feature selection

Knowledge discovery

Error analysis

Data modeling

Machine learning

Databases
