The disk file cancer contains values for breast cancer

Chapter , Problem 65

QUESTION:

The disk file cancer contains values for breast cancer mortality from 1950 to 1960 (y) and the adult white female population in 1960 (x) for 301 counties in North Carolina, South Carolina, and Georgia.

a. Make a histogram of the population values for cancer mortality.

b. What are the population mean and total cancer mortality? What are the population variance and standard deviation?

c. Simulate the sampling distribution of the mean of a sample of 25 observations of cancer mortality.

d. Draw a simple random sample of size 25 and use it to estimate the mean and total cancer mortality.

e. Estimate the population variance and standard deviation from the sample of part (d).

f. Form 95% confidence intervals for the population mean and total from the sample of part (d). Do the intervals cover the population values?

g. Repeat parts (d) through (f) for a sample of size 100.

h. Suppose that the size of the total population of each county is known and that this information is used to improve the cancer mortality estimates by forming a ratio estimator. Do you think this will be effective? Why or why not?

i. Simulate the sampling distribution of ratio estimators of mean cancer mortality based on a simple random sample of size 25. Compare this result to that of part (c).

j. Draw a simple random sample of size 25 and estimate the population mean and total cancer mortality by calculating ratio estimates. How do these estimates compare to those formed in the usual way in part (d) from the same data?

k. Form confidence intervals about the estimates obtained in part (j).

l. Stratify the counties into four strata by population size. Randomly sample six observations from each stratum and form estimates of the population mean and total mortality.

m. Stratify the counties into four strata by population size. What are the sampling fractions for proportional allocation and optimal allocation? Compare the variances of the estimates of the population mean obtained using simple random sampling, proportional allocation, and optimal allocation.

n. How much better than those in part (m) will the estimates of the population mean be if 8, 16, 32, or 64 strata are used instead?

ANSWER:

Step 1 of 14

a)

We decided to use 13 equally spaced intervals in drawing the histogram. The length of each interval is the difference between the maximum and minimum values divided by the number of intervals, rounded to the nearest integer.

The left boundary of the first interval is the minimum value (here, 0) minus 0.5. Because the data are integers and the interval length is an integer, this offset ensures that no observation falls exactly on a boundary. The left boundary of each subsequent interval is obtained by adding the interval length to the previous left boundary.

On each interval we draw a bar whose height is the relative frequency divided by the interval length: the number of observations that fall in the interval, divided by the total number of observations, and then divided by the length of the interval. With this scaling, the areas of the bars sum to 1.
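
The construction above can be sketched in code. The original solution was carried out in R; what follows is only an illustrative Python sketch, not the author's script. The function name `histogram_density` and the `toy` values are hypothetical (the values are made up and are not the cancer data), and the loop that adds extra intervals is an added safeguard for the case where rounding the length down leaves the maximum uncovered.

```python
import math

def histogram_density(values, n_intervals=13):
    """Return (edges, heights) for an equal-width, density-scaled histogram."""
    lo, hi = min(values), max(values)
    # Interval length: (max - min) / number of intervals, rounded to the
    # nearest integer (and kept at least 1).
    length = max(1, math.floor((hi - lo) / n_intervals + 0.5))
    # The first left boundary sits half a unit below the minimum, so integer
    # data never land exactly on a boundary.
    left = lo - 0.5
    edges = [left + i * length for i in range(n_intervals + 1)]
    # Safeguard (an assumption, not stated in the text): if rounding made the
    # intervals too short to reach the maximum, append more intervals.
    while edges[-1] <= hi:
        edges.append(edges[-1] + length)
    counts = [0] * (len(edges) - 1)
    for v in values:
        counts[int((v - left) // length)] += 1
    n = len(values)
    # Height = relative frequency divided by the interval length.
    heights = [c / n / length for c in counts]
    return edges, heights

# Made-up integer values for illustration only, NOT the cancer data set.
toy = [0, 1, 3, 4, 4, 7, 9, 12, 15, 20, 25, 25]
edges, heights = histogram_density(toy, n_intervals=13)
# Sum of bar areas; should be 1 up to floating-point error.
total_area = sum(h * (b - a) for h, a, b in zip(heights, edges, edges[1:]))
```

Dividing by the interval length as well as by the number of observations is what makes the bar areas, rather than the bar heights, add up to 1.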

We drew the histogram (and solved the remaining parts of this exercise) in the statistical software R.

Here is the histogram
