

All rights reserved. Infringing the author's rights by copying or printing exposes the offender to legal liability.

For nominal data, a correlation relationship between two attributes, A and B, can be discovered by a χ² (chi-square) test:

$$ \chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}} $$

The larger the χ² value, the more likely the variables are related. The cells that contribute the most to the χ² value are those whose actual count is very different from the expected count.

Correlation does not imply causality. For example, the number of hospitals and the number of car thefts in a city are correlated, but both are causally linked to a third variable: population.

Example:

                            Play chess    Not play chess    Sum (row)
Like science fiction        250 (90)      200 (360)         450
Not like science fiction    50 (210)      1000 (840)        1050
Sum (col.)                  300           1200              1500

χ² (chi-square) calculation (numbers in parentheses are expected counts calculated based on the data distribution in the two categories):

$$ \chi^2 = \frac{(250 - 90)^2}{90} + \frac{(50 - 210)^2}{210} + \frac{(200 - 360)^2}{360} + \frac{(1000 - 840)^2}{840} = 507.93 $$

It shows that like_science_fiction and play_chess are correlated in the group.

b- For numeric data: we can evaluate the correlation between two attributes, A and B, by computing the correlation coefficient (also known as Pearson's product moment coefficient):

$$ r_{A,B} = \frac{\sum_{i=1}^{n} (a_i - \bar{A})(b_i - \bar{B})}{(n-1)\,\sigma_A \sigma_B} = \frac{\sum_{i=1}^{n} (a_i b_i) - n\,\bar{A}\,\bar{B}}{(n-1)\,\sigma_A \sigma_B} $$
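The two measures on this page can be checked with a short script. The following is only a minimal sketch in plain Python, not a reference implementation. The function names chi_square and pearson_r are made up for illustration. The sketch assumes that each expected count is (row sum × column sum) / total, which reproduces the counts in parentheses in the example table, and that σ_A and σ_B are sample standard deviations (divided by n − 1), which matches the (n − 1) factor in the formula. The numeric lists a and b are invented sample data, not taken from the text.

```python
import math


def chi_square(observed):
    """Return (chi2, expected) for a contingency table given as a list of rows."""
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(col) for col in zip(*observed)]
    total = sum(row_sums)
    # Assumption: expected count = row sum * column sum / grand total.
    expected = [[r * c / total for c in col_sums] for r in row_sums]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    return chi2, expected


def pearson_r(a, b):
    """Pearson correlation using the second form of the formula on this page."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    # Assumption: sample standard deviations (n - 1 in the denominator).
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x in a) / (n - 1))
    sd_b = math.sqrt(sum((y - mean_b) ** 2 for y in b) / (n - 1))
    numerator = sum(x * y for x, y in zip(a, b)) - n * mean_a * mean_b
    return numerator / ((n - 1) * sd_a * sd_b)


# Observed counts from the example table.
# Rows: like / not like science fiction. Columns: play / not play chess.
table = [[250, 200], [50, 1000]]
chi2, expected = chi_square(table)
print(expected)          # expected counts per cell
print(f"{chi2:.4f}")     # chi-square statistic

# Invented sample data: b is exactly twice a, so r should be very close to 1.
a = [1, 2, 3, 4, 5]
b = [2, 4, 6, 8, 10]
r = pearson_r(a, b)
print(r)
```

The unrounded statistic is about 507.94; the 507.93 quoted in the example is the same value cut off after two decimals.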
                                