How can you determine relationships between two variables?

Description: The notes for week 10 cover the classes from 3/29 and 3/31. They cover Contingency Tables to Sign Tests.
14.4 Contingency Tables

Experimental units: people

                        Height
                        below 1.75 m (short)    above 1.75 m (tall)
Gender      Female
            Male

Is there any relationship between these two variables?

We know that women usually tend to be shorter.

So, intuitively, the two variables are not independent.

But, what if we choose another variable?

Number of kids in family:    ≤ 3 (small)    > 3 (large)



Intuitively, height and number of kids are probably independent.

Now let us look at a concrete example.

We wish to classify the defects found on furniture produced by a manufacturer according to two different methods of classification:

(1) the type of defect

(2) the production shift prob of defect

n = 309 defective pieces of furniture

The defects were classified as one of 4 types (A, B, C, D) and also by shift (1, 2, 3).

$H_0$: the type of defect is independent of the production shift.

$H_a$: the type of defect depends on the production shift.

Restrictions:

A. $p_A + p_B + p_C + p_D = 1$        (column probabilities)

B. $p_1 + p_2 + p_3 = 1$        (row probabilities)

We want to test that these events are independent.

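Define the events

A: the defect is of type A

B: the defect comes from shift 1

So the probability for the first cell is $p_{1A} = P(A \cap B)$.

If the events are independent, the cell probability is

$p_{1A} = P(A)\,P(B) = p_A\,p_1$

where $p_A = P(A)$ and $p_1 = P(B)$ are the marginal probabilities for type A and shift 1.

However, we do not know $p_A$ and $p_1$.

We must estimate them from the data.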

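In general, for i = 1, 2, 3 and j = A, B, C, D,

$\hat{E}(n_{ij}) = n\,\hat{p}_i\,\hat{p}_j = n \cdot \frac{r_i}{n} \cdot \frac{c_j}{n} = \frac{r_i c_j}{n}$

where $n_{ij}$ is the number in the respective cell, $r_i$ is the total for row i, and $c_j$ is the total for column j.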
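As a rough sketch of this estimate: under independence, each expected cell count is estimated by (row total)(column total)/n. The cell counts below are an assumption for illustration only (the counts are not listed in these notes); they were chosen to sum to n = 309. The function name expected_counts is hypothetical.

```python
# ASSUMED cell counts (not given in these notes), for illustration only.
# Rows are shifts 1, 2, 3; columns are defect types A, B, C, D.
observed = [
    [15, 21, 45, 13],
    [26, 31, 34, 5],
    [33, 17, 49, 20],
]


def expected_counts(table):
    """Estimated expected counts r_i * c_j / n under independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
for shift, row in enumerate(expected, start=1):
    print("shift", shift, [round(e, 2) for e in row])
```

The expected counts keep the same row and column totals as the observed table, so they also sum to n.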

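We can use a chi-squared test to test $H_0$:

$X^2 = \sum_{i} \sum_{j} \frac{\left(n_{ij} - \hat{E}(n_{ij})\right)^2}{\hat{E}(n_{ij})} \sim \chi^2$ (approximately)

where $\hat{E}(n_{ij})$ is the estimated expected count for cell (i, j). The approximation is reasonable because all expected values are large enough (the usual rule of thumb is at least 5).

The df is (r-1)(c-1), where r is the number of rows and c is the number of columns. Here, (3-1)(4-1) = 2 x 3 = 6 df.

When $H_0$ is true, $X^2$ should be small.

So, we reject $H_0$ when $X^2$ is large.

So, $X^2 = 19.17$, and the $\alpha = 0.05$ critical value with 6 df is $\chi^2_{0.05} = 12.592$.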

Therefore, since 19.17 > 12.592, we reject the null and conclude that the type of defect and production shift are dependent.

Therefore, again, we would conclude that we should reject the null hypothesis.
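The test can be sketched in a few lines of Python. This is a minimal sketch, not the calculation from class: the cell counts are an assumption (they are not listed in these notes), chosen so that they sum to n = 309 and give a statistic close to the 19.17 quoted above. The critical value 12.592 is taken from the notes rather than computed, and the name chi_square_statistic is hypothetical.

```python
def chi_square_statistic(observed):
    """Return (X^2, df) for a test of independence on an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(observed):
        for j, n_ij in enumerate(row):
            e_ij = row_totals[i] * col_totals[j] / n  # expected under H0
            x2 += (n_ij - e_ij) ** 2 / e_ij
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return x2, df


# ASSUMED counts: rows are shifts 1-3, columns are defect types A-D.
observed = [
    [15, 21, 45, 13],
    [26, 31, 34, 5],
    [33, 17, 49, 20],
]
CRITICAL_VALUE = 12.592  # chi-squared, alpha = 0.05, 6 df (from the notes)

x2, df = chi_square_statistic(observed)
reject_null = x2 > CRITICAL_VALUE
print(f"X^2 = {x2:.2f}, df = {df}, reject H0: {reject_null}")
```

Hard-coding the critical value keeps the sketch free of third-party libraries; a statistics package would normally supply it along with a p-value.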

15 Nonparametric Statistics

• No parameters in the model

15.2 A general two-sample shift model

Population 1: mean $\mu_1$, ~ N

Population 2: mean $\mu_2$, ~ N

Note the only difference is the location shift.
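To make the shift idea concrete, here is a small simulation sketch. It assumes both populations are normal with the same spread, so that population 2 is just population 1 moved over by a fixed amount. All numeric values (mu, shift, sigma, the sample sizes, the seed) and the function name sample_shift_model are hypothetical choices for illustration, not values from the notes.

```python
import random
import statistics


def sample_shift_model(n1, n2, mu, shift, sigma, seed=0):
    """Draw two samples whose distributions differ only by a location shift."""
    rng = random.Random(seed)
    x = [rng.gauss(mu, sigma) for _ in range(n1)]          # population 1
    y = [rng.gauss(mu + shift, sigma) for _ in range(n2)]  # population 2
    return x, y


# Hypothetical parameter values, chosen only for illustration.
x, y = sample_shift_model(n1=5000, n2=5000, mu=10.0, shift=2.0, sigma=1.5)

mean_difference = statistics.mean(y) - statistics.mean(x)
print("difference in sample means:", round(mean_difference, 2))
print("sample sd, population 1:", round(statistics.stdev(x), 2))
print("sample sd, population 2:", round(statistics.stdev(y), 2))
```

The difference in sample means should land near the chosen shift, while the two sample standard deviations should be about the same, which is what "only the location differs" means.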

