Quantitative Methods in Geography
Quantitative Methods in Geography GEOG 3190
These 120 pages of class notes were uploaded by Greta Purdy on Sunday, October 25, 2015. The notes belong to GEOG 3190 at the University of North Texas, taught by Steven Wolverton in Fall.
Date Created: 10/25/15
3190 Week 3: Graphing Data

Stem & Leaf Diagram
An easy way to convert frequency distributions to a chart.
[Figure: stem-and-leaf diagram of the distribution of 1996 US state unemployment rates; the stems are the groups and the leaves are the cases (state values)]

UNT Geog 3190 Wolverton

Histograms
Turns a frequency distribution into a chart.
Uses bars to represent the frequency of cases in groups.
Bars are immediately adjacent to each other.
Y axis: frequency. X axis: group number.
You can label the x-axis with group boundaries or the group midpoint.
All histograms need to have labeled axes & clear chart titles.

Histogram example
[Figure: histogram, "Lot Sales Over the Past Year"; y-axis: Number of Sales; x-axis: Lot Size Category (1 to 1.9, 2 to 2.9, 3 to 3.9, 4 to 4.9, 5 to 5.9, 6 to 6.9)]
Does this histogram show a raw data or a relative frequency distribution?

Frequency Polygon
Converts a histogram to a line graph.
Allows easy expression of multiple frequency distributions on the same graph, which is cumbersome with a histogram.

Frequency Polygon example
[Figure: "Percentage Polygons for Lot Size, Phase I v. Phase II"; y-axis: Percentage of Total; x-axis: Lot Area Category Midpoint (Acres)]
Does this polygon show a raw data or a relative frequency distribution?

Cumulative Frequency Polygon
A frequency polygon that graphs the frequency of cases in a group plus the frequency of cases in any lower groups.
It is additive. For each group you graph its frequency and add the frequency of all lower groups to it.

[Figure: histogram, distribution of 1996 US state unemployment rates; y-axis: Frequency]
[Figure: frequency polygon, distribution of 1996 US state unemployment rates]
[Figure: cumulative frequency polygon, cumulative distribution of 1996 US state unemployment rates; y-axis: Cumulative Frequency]
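As a rough illustration of how the raw, relative, and cumulative frequencies behind these charts relate to one another, here is a minimal Python sketch. The group midpoints and counts are hypothetical values invented for the example; they are not the lot-sales or unemployment data from the slides.

```python
from itertools import accumulate

# Hypothetical grouped data: one count of cases per group (made-up values).
group_midpoints = [1.5, 2.5, 3.5, 4.5, 5.5, 6.5]
frequencies = [4, 9, 12, 8, 5, 2]

total = sum(frequencies)

# Relative frequency: each group's share of all cases, as a percentage.
relative = [100 * f / total for f in frequencies]

# Cumulative frequency: each group's count plus the counts of all lower groups.
cumulative = list(accumulate(frequencies))

print("midpoint  freq  percent  cumulative")
for mid, f, r, c in zip(group_midpoints, frequencies, relative, cumulative):
    print(f"{mid:>8}  {f:>4}  {r:>6.1f}%  {c:>10}")
```

Plotting `frequencies` against the midpoints as bars gives a histogram, as a line gives a frequency polygon, and plotting `cumulative` as a line gives a cumulative frequency polygon. The last cumulative value always equals the total number of cases.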
Graphs in SPSS

Sorting & Grouping
The easiest way to group data is to sort data into an ordered array.
Then insert a new variable and use it as a "grouping variable."
This is important for graphs and some inferential tests later in the class.

Sorting
[Screenshot: Lab1.sav in the SPSS Data Editor]
Click & drag to the right to highlight each column.
This is an important step: you do not want to sort only 1 column, which mixes up your case data.

Sorting
[Screenshot: SPSS Data menu]
This will sort your cases by weight from lightest to heaviest, rearranging all of the other data (e.g., age, sex) appropriately.

Sorting
Now each deer is ordered from lowest to highest in terms of weight; notice many of these are does. The heaviest deer, if you scroll down, will be bucks.

Grouping
Highlight the weight variable. Then right click and select "Insert Variables."

Grouping
[Screenshot: Lab1.sav in the SPSS Data Editor with the new variable inserted]
Go to Variable View and name it "grouping."

Grouping
[Screenshot: Lab1Grp.sav in the SPSS Data Editor with columns Sex, Age, Grouping, Weight]
Here I have labeled any deer with weight group midpoints: 10 to 19.9 = 15, 20 to 29.9 = 25, and so forth.
Go through and label the rest in the grouping variable.
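The same sort-then-group idea can be sketched outside SPSS. This is only an illustration under stated assumptions: the deer records are invented, the field names are hypothetical, and the classes are assumed to be 10 units wide so that their midpoints are 15, 25, and so on.

```python
# Hypothetical deer records (sex code, age, dressed weight); all values are made up.
deer = [
    {"sex": "d", "age": 2.5, "weight": 61.0},
    {"sex": "b", "age": 4.5, "weight": 112.0},
    {"sex": "df", "age": 0.5, "weight": 24.0},
    {"sex": "d", "age": 1.5, "weight": 58.5},
    {"sex": "bb", "age": 0.5, "weight": 19.9},
]


def group_midpoint(weight, width=10):
    """Return the midpoint of the equal-width class containing weight.

    With width=10, weights from 10 up to (but not including) 20 map to 15.
    """
    lower = (weight // width) * width
    return lower + width / 2


# Sort whole records, not just the weight column, so that each weight
# stays attached to the sex and age of the same animal.
deer_sorted = sorted(deer, key=lambda case: case["weight"])

# Add the grouping variable to every case.
for case in deer_sorted:
    case["grouping"] = group_midpoint(case["weight"])

for case in deer_sorted:
    print(case["sex"], case["age"], case["weight"], case["grouping"])
```

Sorting the records as units is the code equivalent of highlighting every column before sorting: a sort applied to the weight values alone would detach them from their cases.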
Graphing
There are many ways to graph in SPSS.
I prefer the "Legacy Dialogs" method. It is simple and will provide basic graphs.
"Chart Builder" is more complicated but also more flexible; it is covered in Chapter 3 (Cronk).

Histogram
We are going to make a histogram based on our weight groups.
You should make a frequency distribution on your own to provide a complementary table.
Go to Graphs > Legacy Dialogs > Histogram.
[Screenshot: Histogram dialog with Grouping selected as the variable]

Frequency Histogram
[Figure: frequency histogram of the Grouping variable; the output also reports the mean, standard deviation, and variance]

Frequency Polygon
Go to Graphs > Legacy Dialogs > Line.
Select "Simple" (Data in Chart Are: Summaries for groups of cases) & push "Define."
[Screenshot: Line Charts dialog and the "Define Simple Line: Summaries for Groups of Cases" dialog; Line Represents: N of cases, % of cases, Cum. N, Cum. %, Other statistic (e.g., mean)]

Frequency Polygon
[Figure: frequency polygon; y-axis: Count; x-axis: Grouping]

Cumulative Frequency Polygon
Go to Graphs > Legacy Dialogs > Line.
Select "Simple" (Data in Chart Are: Summaries for groups of cases) & push "Define."
If you want relative frequency or cumulative frequency polygons, click those in the green area.
[Screenshot: "Define Simple Line: Summaries for Groups of Cases" dialog with the Line Represents options highlighted]

Cumulative Frequency Polygon
[Figure: cumulative frequency polygon; y-axis: Cumulative Frequency; x-axis: Grouping]

Frequency Histogram of our raw data
[Figure: frequency histogram of Weight with mean, standard deviation, and N]
Histograms with such small groups as derived from raw data (here weight instead of grouping) may not be useful with small samples.
But here we have such a large sample that it is quite meaningful.

3190 Week 1: Statistics, Science, Hypotheses, Measurement

Statistics
A quantitative observation of some characteristic of a sample. For example, a "count."
A sample is a set of observations on a subgroup of a population (e.g., 3190 students).
A population is the whole body of phenomena of interest (e.g., the UNT student body).
A quantitative observation of some characteristic of a population is not a statistic but a parameter.

Science
A sense-making system created through observation and generalization that is used to disconfirm hypotheses.
A hypothesis: a proposed explanation for an event's causes; a potential answer to a question.
Scientists use
multiple working hypotheses (MWH).
In sum, the process of science is to use general statements assumed to be true to disconfirm MWH.

Example
Question: why are basketball players seemingly taller on average than members of the general population?
H1: They are selected because they are better at putting basketballs through 10-foot hoops.
H2: They are more accommodating to coaches than "short people."
H0: The pattern is random (null hypothesis).
The null hypothesis is the one tested against the alternatives; it must be rejected to accept another.

Aka
The scientific method, which is counterintuitive. Why?
Because we typically rely on a different sense-making system other than science.

Common sense
This is our native sense-making system that we inherit from our culture.
It is largely confirmatory.
"Why did he break up with me?" "Because he is an a"
And then we go look for evidence to confirm our story.
We tend not to use MWH.

Truth
Can hypotheses be proven true?
No, they cannot, because any one of them can be rejected with new data.
We simply accept the best current answer to a question.
Science is, in a sense, a willingness to be wrong.

A role for stats
Hypotheses, science, math: perhaps a bit intimidating, but really quite simple.
In stats we have two hypotheses:
The one we think is correct (alternative, Ha).
That we could be wrong due to chance (null, H0).
We simply use stats to help us describe our data and infer whether or not we can reject the H0.
Hence descriptive statistics & inferential statistics.

Descriptive Statistics
Purpose is to describe a set of data.
Data: observations of phenomena, here usually quantitative observations.
Description allows communication.
E.g., UNT basketball players are "on average" taller than UNT professors.
We can calculate an average, and it is a description.

Inferential Statistics
Purpose is to analyze data from a sample(s) to
learn about a population and to answer questions (test hypotheses).
E.g., are two samples of people different enough in size that it is likely they come from different populations?
Inferential stats commonly answer two questions (among many other ones):
1. How similar or different are samples?
2. How closely related are two or more variables?
We will discuss elementary inferential stats later this semester.

Measurement
Four scales of measurement: Nominal, Ordinal, Interval, Ratio.

Nominal Scale
"Nominal" means name.
A scale that uses qualitative or categorical classes.
"Texan" is a nominal scale category; so is "New Yorker."
The categories convey difference.
They do not convey greater-than and less-than differences.

Ordinal Scale
Measurement that places phenomena in relative order.
Provides information on greater-than and less-than relationships, but not how much so.
A good example is stratigraphy in geology. Deeper tends to be older and shallower tends to be younger, but the position will not tell you how much so without independent information.
We might order several people by height, but without an independent measure we would not know the magnitude of difference.

Interval/Ratio Scales
Measurement of variables on scales that can determine magnitude of difference.
Measurement units are all equal (e.g., meters or grams).
Ratio scales have a true zero point. E.g., weight in lbs: 0 = no weight; 40 lbs is 2X as heavy as 20 lbs.
Interval scales have no true zero point. E.g., temperature in Celsius: 40 is not 2X as warm as 20.

Measurement Error
We use four concepts to gauge error: Precision, Accuracy, Validity, Reliability.

Precision
Level of exactness of measurement.
The finer the measurement, the more precise.
E.g., length measured to the millimeter is more precise than that to the centimeter.
A rain gauge that measures to the inch is less precise than one that measures to the 1/16 inch.
Precision is
irrelevant if measurement is inaccurate.

Accuracy
Extent of systematic bias in measurement.
E.g., shooting to the same spot, always to the left of the target, is precise but inaccurate.
E.g., a rain gauge with a wad of paper towel in the bottom of it may still be precise (e.g., to the 1/16 inch) but is inaccurate.
Deer astragalus/caliper example.

Validity
Consideration of whether or not the most appropriate variable is being examined to answer the research question at hand.
If we are interested in height, it does not matter how precisely and accurately we measure weight.
Deer weight vs. astragalus size.

Reliability
An ability to collect and recollect data in a repeatable, precise, accurate, and valid manner.
A big concern with long-term studies that collect data multiple times.
The caliper example is a good one here.
The only way to assess reliability would be to go back and remeasure the same specimens.

Grouping Data
Often we want to group data so that we can more easily comprehend its distribution in a table or chart.
Natural Breaks
Equal Intervals (based / not based on the range)
Quantile Breaks (here, quartiles)

Natural Breaks
Relies on gaps in the distribution of the values of data.
You can see many gaps in Table 2.5: 3.3, 3.9, 4.7.
So make these the boundaries of groups:
2.9 to 3.3 is group 1
3.4 to 3.9 is group 2
4.0 to 4.7 is group 3, and so forth
Advantage: easy to understand.
Disadvantage: very subjective; are the groups meaningful?

Equal Interval Based on the Range
Uses intervals of the data values to create groups.
Subtract the minimum value from the maximum (= range).
Divide the range into the desired # of categories.
Here max = 8.1, min = 2.9, range = 5.2.
If four groups are desired, what will the interval be?
What are the boundaries for groups 1 to 4?
You should have a question about group 4.

Equal Intervals not Based on Range
Here the interval is based on a set of equal interval
classes that encompass the range of values but are not determined from it.
Pick the whole number below the minimum.
Pick the whole number above the maximum.
Subtract the former from the latter.
Divide by the desired number of groups.
Create mutually exclusive groups.
So do this for Table 2.5.

Quantile Breaks
Divide the # of cases (states here) as equally as possible into a desired # of groups.
Quartiles = 4 groups. Quintiles = 5 groups. Could be any number.
For quartiles you want 4 groups, each with an equal number of states: the lowest quarter of states, the 2nd lowest, the 3rd lowest, and the highest.

Calculating Quartiles
For Q1: (n+1)/4 gives you the position of the break. Here it equals 51/4 = 12.75. Round up & states 1 to 13 are in Q1.
For Q2: 2(n+1)/4 = 102/4 = 25.5. Round up; states 14 to 26 are in Q2.
For Q3: 3(n+1)/4 = 153/4 = 38.25. Round down; states 27 to 38 are in Q3.
The remaining states, 39 to 50, are in Q4.

If DC & PR are included
For Q1: (n+1)/4 gives you the position of the break. Here it equals 53/4 = 13.25. Round down & states 1 to 13 are in Q1.
For Q2: 2(n+1)/4 = 106/4 = 26.5. Round up; states 14 to 27 are in Q2.
For Q3: 3(n+1)/4 = 159/4 = 39.75. Round up; states 28 to 40 are in Q3.
The remaining states, 41 to 52, are in Q4.
Practice with DC but not PR.

Rules shmooles
There are numerous ways to group quantiles.
Some recommend rounding.
You will find a rule and follow it.

Ordered arrays
The simplest organizational tool for working with data is to order it.
An ordered array is a list of numerical values associated with a variable in rank order from the smallest value to the largest value.
So are the unemployment data an ordered array?

So practice
Make a frequency distribution for the Table 2.5 data.
First with the equal interval groups not based on the range.
Then with quartile groups.

Introduction to SPSS

Data files (.sav)
Variable View
Data View

Variable View
[Screenshot: Lab1.sav in the SPSS Data Editor, Variable View]
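Returning to the grouping slides above, the quartile break positions and the equal-interval classes not based on the range can be checked with a short Python sketch. Everything here is an illustration under assumptions: the function names are hypothetical, the break position is taken to be k(n + 1)/4, rounding halves up is an assumed rule that happens to reproduce the breaks on the slides, and the minimum of 2.9 and maximum of 8.1 are assumed example values.

```python
import math


def quartile_break_positions(n):
    """Positions k(n + 1)/4 of the three quartile breaks among n ordered cases."""
    return [k * (n + 1) / 4 for k in (1, 2, 3)]


def round_half_up(position):
    """Round to the nearest whole position, with .5 rounded up (assumed rule)."""
    return math.floor(position + 0.5)


def quartile_groups(n):
    """Return (first, last) case positions for Q1 through Q4."""
    breaks = [round_half_up(p) for p in quartile_break_positions(n)]
    starts = [1] + [b + 1 for b in breaks]
    ends = breaks + [n]
    return list(zip(starts, ends))


def equal_interval_boundaries(minimum, maximum, groups):
    """Class boundaries built from the whole numbers bracketing the data."""
    low = math.floor(minimum)   # whole number at or below the minimum
    high = math.ceil(maximum)   # whole number at or above the maximum
    width = (high - low) / groups
    return [low + i * width for i in range(groups + 1)]


print(quartile_break_positions(50))   # 50 states
print(quartile_groups(50))
print(quartile_groups(52))            # 50 states plus DC and PR
print(equal_interval_boundaries(2.9, 8.1, 4))
```

Python's built-in `round` is avoided on purpose: it rounds halves to the nearest even number, so a position of 26.5 would become 26 rather than 27. Whichever rounding rule is chosen, the point from the slides stands: pick one rule and apply it consistently.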
[Screenshot: Variable View columns: Name, Type, Width, Decimals, Label, Values, Missing, Columns, Align, Measure]
[Screenshot: Variable Type dialog: Numeric, Comma, Dot, Scientific notation, Date, Dollar, Custom currency, String]

Enter Variable Labels
Labels must begin with a letter and can have no spaces.
VAR00001 = "Sex": Sex of the animal harvested
VAR00002 = "Age": Age of the animal harvested
VAR00003 = "Weight": Dressed weight of the animal
VAR00004 = "TOTpts": Total antler points for bucks
VAR00005 = "Spread": Spread of antlers for bucks
VAR00006 = "tAREA": Training area of harvest

Analysis
All of the descriptive & inferential statistics that we use in this course are in the "analysis" column.
[Screenshot: Analyze menu, including Descriptives, Explore, Crosstabs, Ratio, and Nonparametric Tests]

Frequency Distribution
Almost all of the descriptive statistics you will need can be found under "Descriptive Statistics" > "Frequencies."
One of the most important tools under "Frequencies" is the ability to derive a frequency distribution of grouped data.
In this dataset each deer is labeled by sex:
b, bb, bf, s = buck, button buck, buck fawn, or spike
d, df = doe, doe fawn
A frequency distribution is a table that portrays counts of individuals in groups in a
dataset.

Creating a Frequency Distribution
[Screenshot: Lab1.sav in the SPSS Data Editor with Analyze > Descriptive Statistics > Frequencies selected, and the Frequencies dialog]
Click on Frequencies.
Insert VAR00001 into the variable box by pushing the arrow button.
Push OK.

Your output will look like this
[Screenshot: SPSS Viewer output for FREQUENCIES VARIABLES=VAR00001 /ORDER=ANALYSIS, with a Statistics table listing N Valid and Missing]

3190 Week 4: Sampling & Normal Probability

Normality
A random sample from a population that is normally distributed will be normally distributed.
Asymmetry matters only for small samples from nonnormal populations.

Assuming normality
We would like to be able to assume normality.
Then we can use parametric statistics, which are more powerful.
For example, more likely to determine a difference or see a
relationship.
More powerful because we can use the normal probability distribution to make predictions.
If our sample is random, we can assume normality at samples of n ≥ 30. Why?

Sampling
Has to do with the nature of sampling and probability.
Before we learn about the magic number n ≥ 30, let's review basic probability & sampling.

Why is sampling important?
When we need data to answer a question we have three options: Censuses, Experiments, Samples.
As you know, statistical analyses use samples.
It is critical that those samples represent populations well (called representative sampling).

Classic example of poor sampling
1936 presidential election: Republican Alfred Landon was predicted by Literary Digest to win in a landslide over Franklin D. Roosevelt.
Based on a poll, a sample of the American population.
FDR won in a landslide. What happened?
Two biases in the sample:
1. The sample was obtained among people who owned a car or telephone (wealthy in 1936), who tended to vote Republican.
2. Only 25% of those polled responded; there was a nonresponse bias. Those who did not respond tended to vote for FDR.

Also important
All inferential tests rely on the assumption that samples are representative.
Especially so for parametric tests. Why? Because we are assuming normality, a characteristic of the population.
Larger samples tend to be more representative. Why? Because smaller samples do not capture enough variability to be representative.

Remember
The central goal of inferential statistics is to draw conclusions about a population based on a sample.
Before we discuss inferential tests we must ensure that we know how to produce representative samples.

Probability Sampling
A general category.
The easiest way to ensure representation is to choose one of several probability or "random" sampling techniques.
In all probability sampling techniques a random device is used to decide which members
of a population are included.
Replaces human judgment (subjective choice).

Essential Concepts
Target population: the complete set of individuals that a sample will represent.
Target area (a geographic twist): the entire region or set of locations that a sample will represent.
A sampling frame: the operational set that contains the entire set of cases from which a subset of cases will be drawn (the practical population); can be locations (area) or individuals (population).
It's the entire set of cases, whatever they might be, that you will draw a sample from.

Simple Random Sampling
A probability sampling technique in which each case (individual) in the sampling frame has an equal chance of being selected.
Each case in the sampling frame must be identifiable to facilitate its random selection, usually by a number (e.g., Case 202).
We use a random number table to choose simple random samples.

Simple Random Sampling example
Dr. Oppong knows that student evaluations can be misleading in terms of instructor performance.
Of the 728 students who have taken World Regional Geography during the last few years, he wants to conduct interviews.
He can interview only a small number of students.
He settles on 15 randomly selected students.

His sampling strategy
He sets up a sampling frame, numbering each student from 001 to 728.
He could just pick the first 15, or the last 15, or students he knows, but he wants to cover multiple semesters and to be unbiased.
So he decides to use a random number table to produce a simple random sample.

Picking the first number
Dr. Oppong closes his eyes and puts his finger on a number on the page.
Then he uses the table to help him pick the fifteen students he wishes to interview.
Here's how.
The number he picked is zoomed in on in the next slide so you can see it.

Table of Random Numbers
31871 60770 59235 1 87134 32839 17850 16314 81076 67486
05167 87246 47378 83967 84810 51612 49990 02051 64575 65332 16488 04433 42309 04063 55291 84715 41808 12085 63919 83977 72416 7 97595 78300 93502 39 17116 42649 89252 34037 84573 49914 7 7 08813 14453 70437 67115 41050 65453 14596 762802 33009 70258 26948 60863 83369 81179 32429 83811 49358 75171 14924 71607 74638 60102 56587 29842 33393 30109 42005 92592 78232 19328 27421 73356 53897 26528 22550 36692 07664 10752 95021 37954 72029 29624 66495 11333 81101 72506 28524 39595 09713 70270 28077 51852 70782 93498 31460 22222 18801 14328 05024 04333 84002 98073 52998 89541 28345 22887 50502 39890 81465 30862 61996 73216 36735 58841 35287 11561 81204 68175 41702 37359 42172 07819 98338 81501 70323 37990 72165 72525 55450 25847 61052 59688 49093 04510 74095 47666 34781 34768 01939 12031 47977 29645 26916 25262 17030 09119 69328 49356 15634 44669 00675 04135 05749 79269 00449 12554 51112 93037 89372 27221 46446 79918 40368 10440 07863 93517 96921 91171 47642 19520 78332 18584 69880 35518 34549 58512 00006 70070 77044 00794 26453 69836 52015 61419 76784 13444 84838 92733 36525 79647 57562 53143 45538 55620 09931 01200 47322 47967 28600 92409 09226 83949 02240 48553 59220 18395 53350 09779 01013 16896 15102 53498 99944 88843 76634 91404 65951 76550 18277 90638 15333 91169 26854 53986 86861 22645 76395 42951 91204 06321 97923 79207 26164 68269 12667 63234 81354 74085 30013 94778 96262 45605 72593 67919 01746 72848 34173 07223 17560 69282 47707 94905 40482 15801 64270 97357 40254 14252 68229 21862 45390 95180 42833 73898 12780 78345 35997 47774 48443 04020 45974 85863 68672 88765 30278 41277 51080 05905 18266 17902 77674 18915 52823 73678 94213 97025 39908 75577 54189 16917 28369 14914 04254 86163 67491 85710 71102 97378 09310 72154 89862 15046 64257 80237 44379 79876 07259 75462 50561 00111 75158 04962 97486 72464 63963 20477 72771 86471 65044 09467 70205 79458 79002 83149 82977 38894 11634 20934 73523 04194 60400 23261 62842 49913 83941 03414 60416 79500 63258 19880 70351 
45679 49423 71387 31261 37582 66254 64409 92394 24737 94918 89549 32341 11586 84192 71899 53653 47671 61045 86757 98137 54009 88190 47096 42384 46611 87145 92047 33681 25797 15908 58133 68089 46849 55154 56591 43296 97123 85064 80895 36953 94500 39440 32532 18424 75549 47451 69116 60636 05521 40144 63308 99419 52211 25266 05347 42108 18456

[Figure: zoomed-in portion of the Table of Random Numbers]
Begin with the starting point.
Your frame is from 001 to 728, so you need three-digit numbers.
Cross out the two numbers on the right side of 95646; this leaves 956.
956 is out of your frame.
Move down one number: 44085. Cross out 8 & 5, leaving 440.
440 is in your frame; it is the first of your 15 cases (14 left).
Move down one number: 83967. 839 is not in your frame; move down one more.
499 is, so pick it, and so forth.

The Sample
Dr. Oppong would interview students 440, 499, 653, 423, 639, 171, 340, 088, 671, 145, 702, 149, 601, 333, 274.
The result is a group that is randomly selected and thus more likely to be representative.
That is, there is no biasing choice mechanism in the sampling.

Uneven coverage & $$ (costs) problems
Because simple random sampling is completely random, there is no guarantee of even coverage of the sampling frame.
Additionally, it can be costly in geography to travel to sample.
There are a variety of sampling strategies to deal with these problems:
Systematic sampling: guarantees even coverage.
Stratified sampling: very useful for populations (areas) with different subsets to them.
Cluster sampling: minimizes costs and targets efforts; very important in geography.
Multistage sampling: may combine advantages of approaches.

Systematic Sampling
Sampling that starts with ordering the case labels from lowest to highest, then picks the first case randomly and selects at an equal interval for the rest of the cases.
For Dr. Oppong: number each student and order from 001 to 728.
Pick the first case
and following cases:
Determine the interval size, K.
Pick the first case randomly from the first interval.

Determining the Interval
Calculate the interval, K, based on the desired sample size.
We desired a sample of 15; to determine the interval take 728/15 = K.
K = 48.5; round to 48. Always round down in systematic sampling.
Pick the first case from 001 to 048 randomly using the random number table.
Then add 48 to that first case to get the next one, and so forth.

[Figure: zoomed-in portion of the Table of Random Numbers]
This time we are picking a random number from 01 to 48 (the first interval).
Close your eyes and put your finger on the table.
Let's say we land on 83; it is not in the interval.
But move down to the first number that is. It is 42, which is your first case.
Add 48 to 42 and 90 is your next.
42, 90, 138, 186, 234, 282, 330, 378, 426, 474, 522, 570, 618, 666, 714

Stratified Random Sampling
[Figure: study area divided into forest and prairie, with the edge of the study area marked]
A method of sampling that takes into account known differences in the underlying population.
Here the target population is separated into several groups (strata) to reflect that underlying structure. Called "target subdivision."
A random device is then used to sample strata.
This sample is stratified into forest and prairie.

Two kinds
Proportional stratified random: the same proportion of area or population is sampled in each stratum.
Let's say I wanted to sample plots to determine community vegetation in the prairie and forest. I need to sample both areas equally.
Disproportional stratified random: a higher proportion of a stratum is sampled than for other strata.
Let's say I wanted to learn about the abundance of a bird species that occurs most often in the forest but less so in prairie areas. I need to sample both strata, but I need to find out about forest more so.

Other examples
Proportional: voting preferences & residence types (10% sample).
I want to make sure I cover all types of residences and sample each randomly.
Stratify by type: apartments, houses, condominiums, mobile homes, etc.
Take a 10% sample from each.
Disproportional: let's say legislation to be voted upon is most important to house owners.
I would still want to sample each stratum (residence type).
But I might take a 20% sample from homeowners and less (e.g., 5%) from others.

Stratified random sampling
You decide on the appropriate subdivision based on the questions you ask.
The key is to sample within each stratum randomly.
Can be done with simple random sampling, or with systematic sampling.

Cluster Sampling
[Figure: map from a study of HIV transmission in Bangladesh]
A method of sampling in which cases are selected from groups within the sampling frame.
In this study of HIV transmission in Bangladesh, researchers studied rural and urban areas.
Within those areas simple random sampling would have been inefficient.
They chose clusters (neighborhoods, villages) and studied 30 clusters in each area.

To cluster sample
Divide the population into groups (clusters).
Randomly select a subset of those clusters.
Collect data within the selected clusters:
Either census within the cluster,
Or randomly sample within the cluster (2 stage).

Cluster sampling: another example
Let's say we want to sample parasites in horses in North Texas to determine risk for a new rancher.
We could randomly select USGS sections, then go look for horses in the sections we select. Inefficient. Why?
Or we could pick multiple areas (clusters) where horses are ranched, randomly select a subset of clusters, and then study ranches within each
cluster. Efficient; why?

Cluster sampling
Very efficient in geography, where sampling often requires travel.
For example, suppose the 728 students in Dr. Oppong's sampling frame were all over the world after they graduated. Wouldn't it be most efficient to randomly select a subset of large cities and then randomly sample alumni in those areas?
Depending on money and time, you may sample every case within a cluster or randomly sample within each cluster.

Multistage sampling
Complex sampling designs that combine two or more of the traditional approaches.
Cluster sampling can be multistage if you sample within clusters. If you census within clusters, then it is not.
Example: you might stratify an area into subsections, randomly select clusters within each stratum, and then systematically sample within each cluster.

Normal Probability

Inferential Statistics
Rely on probability theory.
Up until now, all descriptive.
But we would like methods with which to draw inferences about a population using a sample.
Because we use part of the population to draw inferences about the whole population, there is always uncertainty in the correctness of our conclusions (error).

Probability Theory
Is the science of uncertainty.
"Enables us to evaluate and control the likelihood that a statistical inference is correct" (Weiss 2002:146).
Probability: the chance that any particular outcome for an event will take place.

Properties of Probability
The probability of an outcome is always between 0 and 1.
The probability of an outcome that cannot occur is always 0 (an impossible outcome).
The probability of an outcome that must occur is 1 (a certain outcome).

Area & the Normal Curve
The total area under the curve = 1.
So if we asked the question, what is the probability of encountering a case at the mean or less? It would be 0.5 because the mean is the
middle of the curve.
That is, half of the area of the curve is below the mean (to the left).

What is the "normal curve"?
It is a model of the perfect symmetrical distribution.
It was derived mathematically.
Its purpose is to serve as an ideal example of a data distribution we tend to see often: symmetrical and unimodal.
The normal curve is a model, not reality.

What does it mean to assume normality?
The normal curve is an ideal model.
Area under the curve is used to make predictions.
In order to make predictions with our real data, we must assume that our data are normally distributed.
Variance and standard deviation are based on this.
Parametric statistics assume normality.

A normally distributed distribution
A model of a perfectly symmetrical distribution.
The basis of parametric description & inference.
[Figure: perfectly symmetrical real distribution of retail sales (frequency vs. retail sales) with very few low and high dollar sales; the median, mean, and mode coincide]
[Figure: a normally distributed real data distribution with a superimposed normal curve (frequency vs. retail sales; median, mean, and mode marked)]

Probability
We often speak of probability when using the normal curve.
Area under a portion of the curve is the probability of encountering a particular score.
There is a higher probability of encountering a score at A than B. It is near a part of the curve with more area under it.
[Figure: normal curve marking points B and A relative to the mean, median, and mode]

In a normal curve
There is less than a 5% chance of encountering a score more than ±2s from the mean.
There is less than a 1% chance of encountering a score more than ±3s from the mean.
If this distribution is height, then a score outside of ±3s is either extremely tall or extremely short, which is uncommon (improbable).
[Figure: normal curve showing 68.26% of cases within ±1s of the mean, 95.46% within ±2s, and 99.73% within ±3s, with cumulative percentage of observations from lowest to highest scores. Source: www.mathnstuff.com]

Standard scores
Aka z-scores, aka "standard deviation units."
Indicates how many standard deviations separate a particular score from the mean.
z = (Xi - mean) / s
Calculated as the score value minus the mean, divided by the standard deviation.

Pencils
Pencil   Length (inches)   Xi - mean   (Xi - mean)^2
1        10                 6.3        39.69
2        4                  0.3         0.09
3        2                 -1.7         2.89
4        1.5               -2.2         4.84
5        1                 -2.7         7.29
Mean = 3.7; sum of (Xi - mean) = 0; sum of (Xi - mean)^2 = 54.8
What is the variance? What is the standard deviation?
What is the z-score for pencil 1? Pencil 4?

[Figure: probability of a case with a score < mean]

Normal Curve Tables: Area
What is the probability of encountering a case that is between the mean and 1s?
Must find the area between the mean and 1s.
We can do this by knowing the z-score for 1s (z = 1) and using the table, which is a record of the area between the mean and any particular z-score.

[Table: area under the normal distribution by z-score]

Summary
4 levels of analysis here:
1. raw data scores
2. z-scores calculated from raw scores
3. area under the curve related to the z-score
4. area equals probability of encounter in a distribution

Precipitation Data
Calculate the z-scores for each score.
Calculate Pearson's skewness.
Use z-scores to answer:
What is the probability of encountering a year with ≤ 32 inches in rainfall?
What is the probability of encountering a year with ≥ 50 inches in rainfall?
What is the probability of encountering a year with rainfall between 27 and 53 inches?

Why do we care?
Most inferential tests provide a test statistic that falls in the normal distribution.
We base our conclusion on how close that test statistic is to the mean.
That is, how far is it (in standard z-scores) from the mean, AND how likely is it to represent the mean (using probability, i.e., area)?
The farther out it is (big z-score), the lower the probability that it belongs with the mean.

But
This only works when we can assume normality.
If samples are representative of the population, then when n ≥ 30 we can assume normality. A magic number we will explain next week.
If you know that a sample is from a normally distributed population, you
can always assume normality regardless of sample size. Why?
So it is critical that our samples are representative.

Normality Tests in SPSS

Accessing the tests
[Screenshot: SPSS Explore dialog (Descriptive Statistics) with BuckWT in the Dependent List, and the Explore: Plots options with "Normality plots with tests" selected]
You must choose these or you will not get test results.

Interpreting results
[SPSS output: Tests of Normality table for BuckWT, with Statistic, df, and Sig. columns for Kolmogorov-Smirnov (Lilliefors significance correction) and Shapiro-Wilk]
For tests on samples of n = 3 to 2000, use Shapiro-Wilk; for those of n > 2000, use Kolmogorov-Smirnov.
H0: normality. If you accept, then assume normality. If you reject, then do not assume normality.
Statistic is the test statistic (W for S-W, D for K-S).
Sig. is the significance for the test, aka the p-value.
If p < 0.05, then reject the H0 because the test is significant.

What do these tests do?
They compare the shape of your sample distribution to the shape of a normal curve.
The assumption is that if your sample is shaped like a normal curve, the population from which it came is normally distributed for that variable; then you can assume normality.
A significant test means the sample distribution is not shaped like a normal curve.

Significance etc.
Let's not worry about how we determine significance of a test at this point.
Simply learn the criteria and "trust it."
We will get back to significant results later in the class.

[SPSS output: Tests of Normality (Shapiro-Wilk, Lilliefors significance correction noted): reject H0]
[Figures: Normal Q-Q plot of BuckWT (expected normal vs. observed value) and histogram of BuckWT (frequency)]
Both charts show a departure from normality at 35 to 40 pounds.
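
As a recap of the standard-score and normal-curve material above, here is a minimal Python sketch of how the pencil z-scores and the areas under the normal curve could be computed. It is only an illustration under stated assumptions: it uses the sample formulas (dividing by n - 1), because the notes do not say which divisor the course expects, and it uses Python's standard-library statistics.NormalDist in place of a printed normal curve table. The names z_score, area_mean_to, and area_within are made up for this example.

```python
from math import sqrt
from statistics import NormalDist

# Pencil lengths (inches) from the worked table in these notes.
lengths = [10, 4, 2, 1.5, 1]

n = len(lengths)
mean = sum(lengths) / n
sum_sq = sum((x - mean) ** 2 for x in lengths)  # sum of squared deviations

# Assumption: sample variance (divide by n - 1). Use n instead if the
# course treats the five pencils as a whole population.
variance = sum_sq / (n - 1)
sd = sqrt(variance)


def z_score(x):
    """Number of standard deviations separating x from the mean."""
    return (x - mean) / sd


std_normal = NormalDist()  # standard normal curve: mean 0, SD 1


def area_mean_to(z):
    """Area under the normal curve between the mean and z."""
    return abs(std_normal.cdf(z) - 0.5)


def area_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


print(round(mean, 2), round(sum_sq, 2))   # 3.7 54.8
print(round(variance, 2))                 # 13.7
print(round(z_score(10), 2))              # z-score for pencil 1
print(round(z_score(1.5), 2))             # z-score for pencil 4
print(round(area_mean_to(1.0), 4))        # 0.3413, area between the mean and 1s
for k in (1, 2, 3):
    print(k, round(area_within(k), 4))    # 0.6827, 0.9545, 0.9973
```

The same two steps answer the rainfall questions: convert each threshold to a z-score, then read off the area (std_normal.cdf(z) for "at or below," 1 - std_normal.cdf(z) for "at or above"). Small differences in the last digit from the percentages quoted on the slides come from rounding in printed tables.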