Proceedings:
No. 8: AAAI-22 Technical Tracks 8
Volume
Issue:
Proceedings of the AAAI Conference on Artificial Intelligence, 36
Track:
AAAI Technical Track on Machine Learning III
Downloads:
Abstract:
Categorical data is common and, however, special in that its possible values exist only on a nominal scale so that many statistical operations such as mean, variance, and covariance become not applicable. Following the basic idea of the neighbour correlation coefficient (nCor), in this study, we propose a new measure named the categorical nCor (CnCor) to examine the association between categorical variables through using indicator functions to reform the distance metric and product-moment correlation coefficient. The proposed measure is easy to compute, and enables a direct test of statistical dependence without the need of converting the qualitative variables to quantitative ones. Compare to previous approaches, it is much more robust and effective in dealing with multi-categorical target variables especially when highly nonlinear relationships occurs in the multivariate case. We also applied the CnCor to implementing feature selection by the scheme of backward elimination. Finally, extensive experiments performed on both synthetic and real-world datasets are conducted to demonstrate the outstanding performance of the proposed methods, and draw comparisons with state-of-the-art association measures and feature selection algorithms.
DOI:
10.1609/aaai.v36i8.20889
AAAI
Proceedings of the AAAI Conference on Artificial Intelligence, 36