An article in Biometrics [“Integrative Analysis of Transcriptomic and Proteomic Data of Desulfovibrio Vulgaris: A Nonlinear Model to Predict Abundance of Undetected Proteins” (2009)] reported that protein abundance from an operon (a set of biologically related genes) was less dispersed than from randomly selected genes. In the research, 1000 sets of genes were randomly constructed, and of these sets, 75% were more dispersed than a specific operon. If the probability that a random set is more dispersed than this operon is truly 0.5, approximate the probability that 750 or more random sets exceed the operon. From this result, what do you conclude about the dispersion in the operon versus random genes?

Step 1 of 2:

Given: n = 1000, p = 0.5

Let X be a binomial random variable representing the number of random sets that are more dispersed than the operon, with n = 1000 and p = 0.5.

The normal approximation to the binomial is adequate when np > 5 and n(1-p) > 5.

Then,

np = 1000(0.5) = 500 > 5 and

n(1-p) = 1000(1-0.5) = 500 > 5.

Let Z be a standard normal random variable.
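As a rough check on the setup above, the following Python sketch standardizes X and evaluates the upper tail P(X >= 750) under the normal approximation. The function name `upper_tail_normal_approx` is hypothetical, and the 0.5 continuity correction is an assumption, since the text so far does not say whether one is applied.

```python
import math


def upper_tail_normal_approx(n, p, k):
    """Approximate P(X >= k) for X ~ Binomial(n, p) with a normal distribution.

    Assumption: a continuity correction of 0.5 is applied, so the
    approximation evaluates P(Z >= (k - 0.5 - np) / sqrt(np(1-p))).
    """
    # Rule of thumb from the text: both np and n(1-p) should exceed 5.
    if not (n * p > 5 and n * (1 - p) > 5):
        raise ValueError("normal approximation conditions are not met")

    mean = n * p                        # np
    sd = math.sqrt(n * p * (1 - p))     # sqrt(np(1-p))

    z = (k - 0.5 - mean) / sd           # standardized value with correction
    # Upper tail of the standard normal: P(Z >= z) = 0.5 * erfc(z / sqrt(2))
    tail = 0.5 * math.erfc(z / math.sqrt(2))
    return z, tail


z, tail = upper_tail_normal_approx(n=1000, p=0.5, k=750)
print(f"z = {z:.2f}, P(X >= 750) is approximately {tail:.3g}")
```

With n = 1000 and p = 0.5, the mean is 500 and the standard deviation is about 15.81, so 750 lies more than 15 standard deviations above the mean and the approximate tail probability is effectively zero.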