RESEARCH DESIGN SOCY 5031
This 4 page Class Notes was uploaded by Heloise Glover on Thursday October 29, 2015. The Class Notes belongs to SOCY 5031 at University of Colorado at Boulder taught by Terry Thornberry in Fall. Since its upload, it has received 19 views. For similar materials see /class/231814/socy-5031-university-of-colorado-at-boulder in Sociology at University of Colorado at Boulder.
Date Created: 10/29/15
Notes on Sample Size
Smith, University of Pennsylvania

How large does a sample need to be? There are many things which sample size depends on, but one of them is not the size of the population. The accuracy of a sample has little to do with the size of the population or, more accurately, the sampling fraction: the ratio of the sample size n to the size of the population N. Even when the sampling fraction is fairly large, say n/N = .20, the standard error is reduced by only about 10% (that is, multiplied by \( \sqrt{1 - n/N} \approx .90 \)) relative to the standard error computed without regard to the population size. Typically, however, our samples are much smaller fractions of their corresponding populations, and so the "savings" in sample size due to the finite population correction in formulae for standard errors has little to do with the calculation of requisite sample size.

Implicit in the foregoing discussion is the assumption that our choice of sample size has something to do with our desire for reduced standard errors. And this is true. The smaller our standard errors, the less sampling error there is in our estimated parameters and the more sure we are that the statistics we compute from our sample data mirror corresponding population parameters. Since the basic formula for the standard error of the mean is \( sd/\sqrt{n} \), it is clear that as n increases the standard error, and therefore the sampling error, decreases, which would appear to imply "the bigger the sample the better." To the extent that this is true, determination of sample size reduces to the problem of estimating the cost per interview and dividing this figure into the funds available. Although seemingly simpleminded, something like this is often done in practice.

There are problems with this approach, however. First, there tend to be large fixed costs associated with a survey. These fixed costs are, at least conceptually, spread across the number of sample elements, so the larger the sample the lower the cost per interview. This makes determination of sample size via cost per interview difficult because cost per
interview itself depends on sample size. This indeterminacy notwithstanding, most practicing survey researchers have a good handle on cost per interview, usually based on recent experience, and this figure tends not to vary too greatly within reasonable ranges of sample size.

A second problem with this approach is that it assumes that sample quantity is all important. And from the standpoint of reducing sampling error, it is. But sampling error is only one of the problems vexing survey research. Nonsampling errors and bias are functions of what might be thought of as sample quality. There are problems such as nonresponse, interviewer error, coding error, etc. These problems must be remedied in the same way that errors in production lines are remedied: quality control. A good survey involves lots of supervision, checking and double checking, and expensive followup of nonrespondents. The labor involved tends to be expensive in the sense that the cost per interview of a good survey is appreciably more than that for a bad survey. If the cost is, say, twice as much, then for fixed funds available the sample size will be half as great.

What does this mean? Does it mean that the cost of carrying out a good survey to get rid of nonsampling error is twice as much sampling error? No. Standard error is \( sd/\sqrt{n} \); if we must halve the sample size, then the new standard error is \( sd/\sqrt{n/2} \). The ratio of the new standard error to the old one is \( 1/\sqrt{1/2} \), which is approximately equal to 1.4. Thus when we shrink the sample by half, the standard error does not double but only goes up by 41%. If we believe that the quality of the sample increases by 100% when we double the expenditure per interview, then accepting a 41% increase in standard error is a good tradeoff. Even if we decide to maintain the same cost per interview, we can save half of the estimated total cost by reducing the sample by one half if we are willing to accept a 41% increase in the standard error.

Viewed from this perspective, the question becomes not "How big a
sample can I draw given the funds available to me?" but rather "How small a sample can I afford to draw?" The smaller the sample, the more attention (money, effort) that can be devoted to each interview, the lower the nonresponse error, the higher the quality of the survey, or, from a different perspective, the less the money required to do the survey.

The answer to the question "How small a sample can I afford to draw?" depends on what the survey is about; specifically, what are the key variables? If the survey is designed to measure the unemployment rate, and differences of as little as 0.2% are important, then sampling error must be minimized and large samples are necessary. On the other hand, if we can live with uncertainty of, say, ±3 to 5%, as is typical with many sociological variables (e.g., percentage of adults with cable television, percentage of adults believing that President Clinton is doing a good job, etc.), then much smaller samples are required.

The problem of calculating sample size is best approached from the standpoint of surveys designed to measure variables in percentage terms. Of course, there are other types of variables which we seek to measure (continuous variables), and typically there are several "key" variables in a survey, and often we are most interested in the accuracy of estimates of differences between groups and not in the proportions or means themselves. We will consider these problems later.

Let's fix ideas with a given problem: the sample size for a survey to determine the proportion or percentage of adults believing that President Clinton is doing a "good job." How accurate do we need to be? Within the confines of the basic problem, this is a subjective question. We do not want to be too far off: ten percentage points (the difference between 40% and 50%) is probably a serious difference. But there is no need to be too precise. It is probably unnecessary to be accurate within ±1 percentage point. As a compromise, let us accept ±3 percentage points as our tolerance for sampling error. This
may all seem very arbitrary, and to some extent it is, but it is no different than the problem of deciding how far out of line a car door can be on an auto assembly line. It is an issue of functionality and consumer tolerance.

Having accepted ±3% as our tolerance for error, we must remind ourselves that this interval is not absolute but is relative to some desired level of confidence. Again, this is a somewhat subjective choice: How sure do we wish to be that the parameter is within the bounds of our confidence interval? Typically we want to be very sure, and a 95% confidence interval (only one chance in twenty of being mistaken) is a typical standard. We can write our tolerance (confidence interval) as a function of our desired level of confidence and the standard error:

\[ \pm .03 = 2 \times SE \]

where 2 is the number of standard errors associated with a 95% confidence interval (1.96, technically). The standard error is \( SE = \sigma/\sqrt{n} \), so we can write

\[ .03 = 2 \times \sigma/\sqrt{n} \]

Solving for n, the sample size, takes 2 steps. First multiply both sides by \( \sqrt{n} \):

\[ .03\sqrt{n} = 2\sigma \]

and then divide both sides by .03, the percentage points we decided to allow in the confidence interval:

\[ \sqrt{n} = 2\sigma/.03 \]

Finally, we square both sides:

\[ n = (2\sigma/.03)^2 \]

In theory we could use this formula to solve for n, except that we need to know \( \sigma \), the standard deviation of the variable in the population. Typically we estimate \( \sigma \) as s, the standard deviation of the sample. But we do not have a sample; we are still trying to figure out how large a sample we need.

The fact that we are working with a proportion rather than, say, the mean of a continuous variable simplifies matters as follows. True, we do not know \( \sigma \) or even have an observed s (or, for that matter, the corresponding variances \( \sigma^2 \) or \( s^2 \)), but what we know for proportions is that \( \sigma^2 = PQ \), where P is the proportion in the population (\( 0 \le P \le 1 \)) and \( Q = 1 - P \). We do not know P (after all, this is what we plan to estimate once we draw our sample), but over the range of values of P (hence Q) we can calculate \( \sigma^2 = PQ \) and \( \sigma = \sqrt{\sigma^2} \):

P     Q     \(\sigma^2 = PQ\)     \(\sigma = \sqrt{\sigma^2}\)
.1    .9    .09                 .30
.2    .8    .16                 .40
.3    .7    .21                 .46
.4    .6    .24                 .49
.5    .5    .25                 .50
.6    .4    .24                 .49
.7    .3    .21                 .46
.8    .2    .16                 .40
.9    .1    .09                 .30

There are two very important things to notice in this table. First, over a very wide range of values of P, say .2 to .8, the standard deviation of a proportion (given in the \( \sigma \) column) varies very little: from .40 to .50. We do not know exactly where P, the proportion of the adult population who think President Clinton is doing a good job, is, but based on past evidence it seems safe to say that it is somewhere in this broad range. Second, the "worst case" (the biggest \( \sigma \)) occurs at P = .5, when the population is evenly divided on the subject. This means that if we know nothing at all, we can set \( \sigma = .5 \) and be sure that our sample size is a conservative estimate, i.e., if P is actually far from .5, either above or below, then a smaller sample size will suffice to achieve the desired confidence interval.

Substituting .5 for \( \sigma \) in our equation for sample size n:

\[ n = (2 \times .5/.03)^2 = 33.33^2 = 1{,}111 \]

If we had used the exact 1.96 instead of the approximation 2 for the number of standard errors associated with a 95% confidence interval, the sample size would be 1,067. If we had substituted other values (< .5) for \( \sigma \), then the required sample size would be smaller, but not that much smaller until we get to the sampling errors associated with, say, P = .06, which would imply \( \sigma = \sqrt{PQ} = .237 \). But if we thought the proportion were this small, we would likely have a much lower tolerance of error (±3% is not much around 50%, but it is a big difference when the true value is near 6%), and hence we need a larger sample in that case. For example, if we decided that for this relatively small proportion we needed a confidence interval of ±1.5%, and we thought we knew that P = .06, then the sample size would be

\[ n = (2 \times .237/.015)^2 = 999 \]

which is not much different from the earlier calculation.

Calculations like this give very precise answers to the sample size question, but these samples should probably be thought of as ballpark figures subject to adjustment. For example, if this number turns out to be a non-negligible fraction of the
population size N, then the sample size can be adjusted downward somewhat, i.e., multiply the estimated sample size by \( (1 - n/N) \). Similarly, if the sample will involve some form of stratification, then it can be further reduced by a few percent. Conversely, if it will be a cluster sample, then the sample size may need to be raised slightly in order to retain the same standard error. Also, even a well-executed survey has some nonresponse, so the calculated sample size should be inflated as well by dividing through by the response rate. For example, if the calculated sample is 1,100 and an 80% response rate is anticipated, then a sample of 1100/.80 = 1,375 must be drawn in order to obtain 1,100 cases. Remember, nonresponse bias is not captured by sampling error.

Samples may also have to be expanded if estimates are to be made for separate subsamples. The standard error for the proportion thinking President Clinton is doing a good job will be greater if we look at each sex separately, or at different age groups, etc. The size of these subsamples will be proportionate to their representation in the population. If we are interested in making comparisons between, say, whites and blacks, then we must recognize that a sample of 1,000 adults will have approximately 130 blacks and 870 whites. If this will result in insufficient precision for blacks, we can either (a) inflate the overall sample size or (b) oversample blacks, i.e., disproportionate stratified sampling. In either case, some sense of the appropriate sample size for each group can be obtained by considering the effects of different sample sizes on the standard error for the difference between proportions:

\[ \sqrt{\frac{P_A Q_A}{n_A} + \frac{P_B Q_B}{n_B}} \]

where A is the subscript for whites and B is the subscript for blacks.

Finally, if the key variables in the survey are continuous (e.g., income) and you have no idea what \( \sigma \), the standard deviation in the population, is, then conceptualize the variable as a proportion (e.g., the proportion earning more than $30,000) and estimate n accordingly. The sample size should
prove satisfactory for obtaining a comparatively good estimate of the corresponding sample mean.
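The arithmetic in these notes can be checked with a short script. What follows is a minimal sketch in Python, not part of the original notes: the function names and default arguments are hypothetical choices made for illustration. It covers the sample size for a proportion, n = (z × σ / tolerance)² with σ = sqrt(PQ), the effect of halving the sample on the standard error, the inflation for nonresponse, and the standard error of a difference between two proportions.

```python
import math


def sample_size_for_proportion(tolerance, p=0.5, z=2.0):
    """Sample size from n = (z * sigma / tolerance)^2, sigma = sqrt(P * Q).

    p defaults to .5, the worst case discussed in the notes; z defaults to
    the approximation 2 (use 1.96 for the exact 95% value).
    """
    sigma = math.sqrt(p * (1.0 - p))
    return (z * sigma / tolerance) ** 2


def se_ratio_after_resizing(factor):
    """Ratio of new to old standard error when n is multiplied by factor."""
    return 1.0 / math.sqrt(factor)


def inflate_for_nonresponse(n, response_rate):
    """Number of cases to draw so that about n interviews are completed."""
    return n / response_rate


def se_difference(p_a, n_a, p_b, n_b):
    """Standard error of the difference between two independent proportions."""
    return math.sqrt(p_a * (1.0 - p_a) / n_a + p_b * (1.0 - p_b) / n_b)


if __name__ == "__main__":
    # Worst case (P = .5), tolerance of 3 points, z = 2: about 1,111.
    print(round(sample_size_for_proportion(0.03)))
    # Same, with the exact 1.96: about 1,067.
    print(round(sample_size_for_proportion(0.03, z=1.96)))
    # P = .06, tolerance of 1.5 points: about 1,003 with unrounded sigma
    # (the notes report 999 because sigma is rounded to .237 first).
    print(round(sample_size_for_proportion(0.015, p=0.06)))
    # Halving the sample raises the standard error by about 41%.
    print(round(se_ratio_after_resizing(0.5), 2))
    # 1,100 completed cases at an 80% response rate: draw about 1,375.
    print(round(inflate_for_nonresponse(1100, 0.80)))
    # Difference between groups of 870 and 130, assuming P = .5 in both.
    print(round(se_difference(0.5, 870, 0.5, 130), 3))
```

The last call assumes P = .5 in both groups purely as a worst-case illustration; the notes do not give group-specific proportions. Trying larger values for the smaller group's n shows how quickly oversampling tightens the comparison.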