Chapter , Problem 20

QUESTION:

This problem introduces a random effects model for the one-way layout. Consider a balanced one-way layout in which the \(I\) groups being compared are regarded as being a sample from some larger population. The random effects model is

\(Y_{i j}=\mu+A_i+\varepsilon_{i j}\)

where the \(A_i\) are random and independent of each other with \(E\left(A_i\right)=0\) and \(\operatorname{Var}\left(A_i\right)=\sigma_A^2\). The \(\varepsilon_{i j}\) are independent of the \(A_i\) and of each other, and \(E\left(\varepsilon_{i j}\right)=0\) and \(\operatorname{Var}\left(\varepsilon_{i j}\right)=\sigma_{\varepsilon}^2\).

To fix these ideas, we can consider an example from Davies (1960). The variation of the strength (coloring power) of a dyestuff from one manufactured batch to another was studied. Strength was measured by dyeing a square of cloth with a standard concentration of dyestuff under carefully controlled conditions and visually comparing the result with a standard. The result was numerically scored by a technician. Large samples were taken from six batches of a dyestuff; each sample was well mixed, and six subsamples were taken from each. These 36 subsamples were submitted to the laboratory in random order over a period of several weeks for testing as described. The percentage strengths of the dyestuff are given in the following table.

There are two sources of variability in these numbers: batch-to-batch variability and measurement variability. It is hoped that variability between subsamples has been eliminated by the mixing. We will consider the random effects model,

\(Y_{i j}=\mu+A_i+\varepsilon_{i j}\)

Here, \(\mu\) is the overall mean level, \(A_i\) is the random effect of the \(i\)th batch, and \(\varepsilon_{i j}\) is the measurement error on the \(j\)th subsample from the \(i\)th batch. We assume that the \(A_i\) are independent of each other and of the measurement errors, with \(E\left(A_i\right)=0\) and \(\operatorname{Var}\left(A_i\right)=\sigma_A^2\). The \(\varepsilon_{i j}\) are assumed to be independent of each other and to have mean 0 and variance \(\sigma_{\varepsilon}^2\). Thus,

\(\operatorname{Var}\left(Y_{i j}\right)=\sigma_A^2+\sigma_{\varepsilon}^2\)

Large variability in the \(Y_{i j}\) could be caused by large variability among batches, large measurement error, or both. The former could be decreased by changing the manufacturing process to make the batches more homogeneous, and the latter by controlling the scoring process more carefully.
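
The variance decomposition above can be checked numerically. The following is a minimal simulation sketch, not part of the original problem: it assumes normally distributed \(A_i\) and \(\varepsilon_{i j}\) (the model itself only specifies means and variances), and the parameter values and the function name `simulate_one_way` are hypothetical choices made for illustration.

```python
import random
import statistics


def simulate_one_way(I, J, mu, sigma_A, sigma_eps, seed=0):
    """Draw Y[i][j] = mu + A_i + eps_ij.

    Normality of A_i and eps_ij is an assumption of this sketch;
    the model only fixes their means and variances.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(I):
        a_i = rng.gauss(0.0, sigma_A)  # one batch effect shared by all J subsamples
        data.append([mu + a_i + rng.gauss(0.0, sigma_eps) for _ in range(J)])
    return data


# Hypothetical parameter values, chosen only for illustration.
Y = simulate_one_way(I=5000, J=6, mu=100.0, sigma_A=2.0, sigma_eps=1.0)

# Use one observation per batch so the values are independent draws of Y_ij.
first_obs = [row[0] for row in Y]
total_var = statistics.variance(first_obs)

# Should be close to sigma_A^2 + sigma_eps^2 = 4 + 1 = 5, up to sampling error.
print(total_var)
```

Taking a single observation from each batch avoids the within-batch correlation induced by the shared \(A_i\), so the ordinary sample variance is a fair estimate of \(\operatorname{Var}\left(Y_{i j}\right)\).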

a. Show that for this model

\(\begin{aligned}& E\left(M S_W\right)=\sigma_{\varepsilon}^2 \\& E\left(M S_B\right)=\sigma_{\varepsilon}^2+J \sigma_A^2\end{aligned}\)

and that therefore \(\sigma_{\varepsilon}^2\) and \(\sigma_A^2\) can be estimated from the data. Calculate these estimates.
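
Equating the mean squares to their expectations suggests the estimates \(\hat{\sigma}_{\varepsilon}^2=M S_W\) and \(\hat{\sigma}_A^2=\left(M S_B-M S_W\right) / J\), where \(J\) is the number of observations per group. A minimal sketch of that calculation follows. The function name `variance_components` is hypothetical, and the toy numbers are made up for illustration; they are not the dyestuff data.

```python
def variance_components(data):
    """Method-of-moments estimates for a balanced one-way random effects layout.

    data: list of I groups, each a list of J observations.
    Returns (ms_w, ms_b, sigma_eps2_hat, sigma_A2_hat).
    """
    I = len(data)
    J = len(data[0])
    if any(len(row) != J for row in data):
        raise ValueError("layout must be balanced")

    group_means = [sum(row) / J for row in data]
    grand_mean = sum(group_means) / I

    # Within-group and between-group sums of squares.
    ss_w = sum((y - m) ** 2 for row, m in zip(data, group_means) for y in row)
    ss_b = J * sum((m - grand_mean) ** 2 for m in group_means)

    ms_w = ss_w / (I * (J - 1))
    ms_b = ss_b / (I - 1)

    sigma_eps2_hat = ms_w
    # This estimate can come out negative when ms_b < ms_w.
    sigma_A2_hat = (ms_b - ms_w) / J
    return ms_w, ms_b, sigma_eps2_hat, sigma_A2_hat


# Made-up toy data (NOT the Davies dyestuff table): I = 3 groups, J = 2.
toy = [[1.0, 3.0], [4.0, 6.0], [8.0, 8.0]]
ms_w, ms_b, s_eps2, s_A2 = variance_components(toy)
print(ms_w, ms_b, s_eps2, s_A2)
```

For the toy data the group means are 2, 5, and 8, so \(M S_W=4 / 3\), \(M S_B=18\), and \(\hat{\sigma}_A^2=(18-4 / 3) / 2=25 / 3\). Applying the same function to the 6 by 6 table of percentage strengths would give the estimates requested in part (a).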

b. Suppose that the samples had not been mixed, but that duplicate measurements had been made on each subsample. Formulate a model that also incorporates variability between subsamples. How could the parameters of this model be estimated?
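
One possible formulation for part (b), offered as a sketch rather than as the intended solution, is a nested model \(Y_{i j k}=\mu+A_i+B_{i j}+\varepsilon_{i j k}\), where \(B_{i j}\) is the effect of the \(j\)th subsample within the \(i\)th batch and \(k\) indexes the duplicate measurements. The symbols \(B_{i j}\), \(\sigma_B^2\), and \(K\) (the number of duplicates) are notation introduced here and do not appear in the problem. In a balanced nested layout the three mean squares have expectations \(\sigma_{\varepsilon}^2\), \(\sigma_{\varepsilon}^2+K \sigma_B^2\), and \(\sigma_{\varepsilon}^2+K \sigma_B^2+J K \sigma_A^2\), which the code below inverts. The function name and the toy data are hypothetical.

```python
def nested_variance_components(data):
    """Method-of-moments estimates for a balanced two-stage nested layout.

    data[i][j][k]: k-th duplicate measurement on subsample j of batch i.
    Returns a dict with estimates of sigma_eps^2, sigma_B^2, and sigma_A^2.
    """
    I, J, K = len(data), len(data[0]), len(data[0][0])

    sub_means = [[sum(cell) / K for cell in batch] for batch in data]
    batch_means = [sum(row) / J for row in sub_means]
    grand = sum(batch_means) / I

    # Duplicates around their subsample mean.
    ss_err = sum((y - sub_means[i][j]) ** 2
                 for i in range(I) for j in range(J) for y in data[i][j])
    # Subsample means around their batch mean.
    ss_sub = K * sum((sub_means[i][j] - batch_means[i]) ** 2
                     for i in range(I) for j in range(J))
    # Batch means around the grand mean.
    ss_batch = J * K * sum((m - grand) ** 2 for m in batch_means)

    ms_err = ss_err / (I * J * (K - 1))
    ms_sub = ss_sub / (I * (J - 1))
    ms_batch = ss_batch / (I - 1)

    return {
        "sigma_eps2": ms_err,
        "sigma_B2": (ms_sub - ms_err) / K,            # may be negative
        "sigma_A2": (ms_batch - ms_sub) / (J * K),    # may be negative
    }


# Made-up toy data: I = 2 batches, J = 2 subsamples, K = 2 duplicates.
toy_nested = [[[1.0, 3.0], [5.0, 7.0]], [[9.0, 11.0], [13.0, 19.0]]]
est = nested_variance_components(toy_nested)
print(est)
```

For the toy data the mean squares are 6, 26, and 162, giving estimates of 6 for \(\sigma_{\varepsilon}^2\), 10 for \(\sigma_B^2\), and 34 for \(\sigma_A^2\).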