Discovery of Dependencies and Models
Discovery of regularities in data involves search in many spaces, for instance in the space of equations, boolean expressions, statistical patterns, and the like. If data do not fit any hypothesis in a particular space, much time could be saved if that space was not searched at all and the search effort was directed to other spaces. A test which determines the non-existence of a solution in a particular space can prevent unneeded search. We concentrate on construction of a test which detects the existence of a functional relationship between two variables in a set of numerical data. The test is passed when the data satisfy the functionality definition. The test is general and computationally simple. It works on data which include error, limited number of outliers, and background noise. Our algorithm detects the functional dependence, but it does not recognize a particular form of that dependence. That task is left to an equation finder, which is called when the data pass the functionality test. We show, how our functionality test works in database exploration within the 49er (read:A Forty-Niner) system, as a trigger for the computationally expensive search in the space of equations. Results of tests show the time saved when the test has been applied. Then we discuss how the functionality test, in conjunction with a continuity test can be used to recognize multifunctions, and to split the data so that each part includes a single functional dependency and can be searched for a single equation.