Social Science Statistics
Social Science Statistics STA 2122
These 11 pages of class notes were uploaded by Daphnee Quitzon DVM on Wednesday, September 23, 2015. The notes belong to STA 2122 at University of South Florida, taught by George Lunsford in Fall.
Date Created: 09/23/15
Lesson 1: The Whats and Wherefores of Statistics

Objectives

After you have completed this lesson, you should be able to:

Define statistics, data, case, descriptive statistics, inferential statistics, and significance.
Define and identify the population and sample in a data collection situation.
Define the term variable and the differing levels of variables.
Explain the difference between qualitative and quantitative data.
Identify variables as nominal, ordinal, interval, or ratio.
Understand and differentiate between an independent variable and a dependent variable.
Explain and identify the difference between a correlational study and an experiment.
Make up an operational definition.
Determine and define the difference between research design and statistics.

Discussion

It is our opinion that most people already use everything in this course, but they don't realize they use it all the time. This course will be a great deal easier if you start out by recognizing that you already use statistics, but when you discuss what you do, you use different terms. So what you really need to do is learn to use the vocabulary. This first chapter is mainly about statistical terms, the vocabulary of statistics.

Before we dive right in and give you a bunch of terms to learn, we'd like you to spend a few moments considering why you are learning statistics. We all know that mathematicians must learn all about statistics, but why do social scientists have to learn it too? You may not realize that when the social sciences (e.g., sociology, psychology, criminology, communication) started out, predictions about causality were based on some theory put forward by some important person. If the person were important enough, people would merely accept the theory as being true. It wasn't until statistics came along that people began asking whether these theories were actually based in fact and whether the predictions were true. Today, instead of merely following blindly what has been said by some important person, educated people who 
have learned about statistics ask questions about what evidence there is to support theories that are put forward. They want to know whether the theories are based on data (observations written in words or numbers) collected from a large number of people or just a few, and how many studies led him or her to the theory. They also want to know whether the data that were collected were in the form of words (qualitative) or numbers (quantitative), whether the individual cases (participants) were asked biased questions (questions bound to get the desired answer), and whether the participants in the study were chosen randomly or from some special or biased group. Further, they want to know whether some outside influence was influencing the researcher to come up with the theory. There are many questions like these that today's well-educated researcher, buyer, voter, etc. wants answered.

In order to carry out research properly, it is necessary to understand how to run an experiment or a quasi-experiment (an experiment in which the researcher just observes but makes no changes). It is also necessary to explain very clearly what has been done in the experiment. You might argue that you might never carry out an experiment or get involved in any research. Even if this is true, you need to understand the explanations that others give about their research, and you need the knowledge necessary to discuss research findings with your colleagues.

Finally, it is important that you know statistics because many businesses, organizations, and government offices use the language of statistics in communicating results, plans, and objectives. By understanding the terms being used, you gain a sense of professional competence and self-assurance. These tools help you make a higher salary. Why? The reason is that statistics is mostly about prediction. If you are able to predict well, you are a valuable member of society. Often you are paid a higher salary because you are a good predictor.

Let's now get on with learning 
about how to talk the talk. The following terms are essential for you to understand right from the start. Learn them and you will find our work much easier.

Population

The term population refers to the overall group in which you are interested. When you talk about men or women in general, you're referring to the whole population of men or women. When you are trying to describe only college-aged men or women, you are referring to a different population. In other words, the term population refers to everyone or everything defined by your topic of interest. If we were only interested in the number of students in a classroom or in the number of rats in a laboratory, and we were not going to do anything more than describe this group, we would use the word population when describing this group.

Sample and Generalize

The term sample refers to a subset of the population. You are referring to a sample when you talk about men or women that you have met who represent men or women in general; this is called generalizing. If you count the number of cars that pass by a particular corner on one day and use that count as an estimate of how many cars pass that corner each day, the count you make is a sample. The number of cars that actually pass that corner each day is the population (this number will change every day).

Data and Case

When we observe one individual, we refer to him, her, or it as a case, the smallest unit of observation (e.g., one rat, one married couple, one city). We write down information about the observations that we make and collectively call this information data. As a general rule, when you show numbers that represent the characteristics of the cases in your sample, you would refer to them as data.

If data from one observation are of the type that can be infinitely broken down (e.g., weight, time, or any number, for example 12.134), they are called continuous. If a measurement can take on ANY value in a particular range, or if the range has an infinite number of possible values, it is continuous. For example, it might usually 
take 10 minutes to get to work, but on a particular day it might be 10 minutes and 30 seconds, or 10 minutes and 45 seconds. You can break minutes down into seconds, seconds down to milliseconds, etc., infinitely. If it cannot be broken down (e.g., number of participants), it is called discrete. If it can only be counted in whole numbers OR can only have a specified number of possible values, it is discrete.

Statistics and a Statistic

When analyzing data, you use statistics (procedures to organize, condense, and obtain answers about the cases represented by the data). The reason these procedures are called statistics is that most of the time when we analyze data, we are examining data from samples. Any number that summarizes something about a sample we call a statistic, whereas any number that summarizes something about a population is called a parameter. For example, if you took a random sample of 25 students and asked for their ages, the number of students (25) would be a statistic referred to as n, the age of the oldest student would be a statistic referred to as Max, the age of the youngest student would be a statistic referred to as Min, and so forth.

Descriptive Statistics, Inferential Statistics, and Significance

When you want to describe a sample, you would use descriptive statistics. If you want to imply that a larger population is represented by the statistics of a sample, you would use inferential statistics. For example, if you take a sample of students and want to show how many students were in your sample and what the maximum and minimum ages of your sample were, you would use descriptive statistics. If, however, you want to suggest that this sample represented the population of all students, you would use inferential statistics to do it. In order to show that this sample is similar or dissimilar to a population or another sample, you would use a statistical tool to see if your findings were significant. Although it is not always the case, researchers usually want to show significance. This means that what has been found is due 
to more than just chance and that this finding would be repeated if another sample were taken.

Variable

In statistics, when we collect one piece of information from a sample, we abbreviate the case (e.g., the student) with the letter X and a subscript number. The first student would be X1, the second student X2, the third student X3, and so forth. We can then make comparisons with the X1, X2, and X3 from another sample. We can write down an observation in words or numbers, which can vary in type or value, and refer to what we are trying to observe as a variable. In a group of students consisting of males and females, gender can be a variable. This is because the variable can take on more than one value (male, female). Political affiliation is a variable with the possible values "democrat", "republican", "libertarian", or "other". Among a group of boys, gender cannot be a variable. Instead it would be a constant, because gender could take on only one value in our group of boys (boys).

Measurement and Assessment

The terms measurement and assessment are synonyms in this course and mean using a rule to apply a number to a variable. Why would you want to substitute a number for a word? Because you may then easily use the number to calculate statistics in which you may be interested. For example, male = 1 and female = 2 is a simple rule. When you use this rule, you are measuring the variable of gender; then you can assess the percentage of women vs. men quite easily using a program like Excel by adding the 1's. A grade of 80 on an exam is another form of measurement; using this grade as a number, you can compare your standing in the class with your peers and also determine how you match up in comparison to what the teacher expects.

Qualitative and Quantitative Data: 4 Types (Nominal, Ordinal, Interval, and Ratio)

Many students of statistics do not realize the importance of knowing the difference between the various types of data, but this is vital to continue this course and to use 
statistics. Later in the course you will be required to make a decision as to the type of test to use in order to demonstrate the difference between two groups. In order to do this, you must first decide what type of data (nominal, ordinal, interval, or ratio) to use in your analysis. Without this knowledge, you cannot continue. Before you can use inferential statistics, you must specify what type of data is being analyzed.

There are four different types of data; each gives a different kind of information. We also call them scales of data or levels of data. Each level gives more numerical information than the last.

On the first level is the type of data called nominal, and it is collected by asking the name of something. For example, names of gender (M or F), religion (Catholic, Protestant, etc.), politics (Republican, Democrat, Independent, etc.), or race (Caucasian, African American, etc.) are types of nominal data. When you assess nominal data there is no set order; therefore the rule you make concerning which number to give to the answers is up to you (arbitrary). You could just as easily say male = 2 and female = 1. Because the numbers are just coded information, you can do very little with them arithmetically. The same would apply for religion, politics, or race, although these are not the only nominal variables. This kind of characteristic is called qualitative (qualities are identified, not quantities).

The next three types of data are called quantitative, because the data are collected utilizing numbers for use in mathematical formulae and calculations. For example, the number of times one coughs, one's grade in school, and the number of questions answered correctly provide information about the amount of an attribute that is collected for each case. In other words, quantitative data is numerical information that conveys how much of some attribute each case possesses.

One type of quantitative data is called ordinal and is similar to nominal data except that some order is evident, usually a ranked 
order. In ordinal data, one number is relative to another, and the data may be collected by asking the name or the number of something. For example, the participant in a study may be asked to identify his or her year of college. The reply may be freshman, sophomore, junior, or senior. Alternatively, the response could be 1, 2, 3, or 4. Generally, when you hear the word ranked or ranking, you may assume in this course that the data being referred to is ordinal.

Another type of quantitative data is called interval. Interval data bases its scores for each case on the number of fixed-sized units of the attribute possessed by that case. These data are collected in the form of a number that ranges from a low number to a higher number, in order, and the distance between each number is considered to be equal (tests of personality and intelligence use interval data). The distance between any two numbers that are next to each other is equal. These equal distances (intervals) give the scale its name. In interval data, a zero does not necessarily imply that none of the quality exists. For example, a reading of 0 degrees Fahrenheit for temperature does not mean that there is no temperature, and a zero on a statistics exam does not imply that a person has no knowledge of statistics. If you can get a negative value (a value less than 0), this would be a good marker that your data are interval.

A third type of quantitative data is called ratio and usually refers to some quantity such as time. This type of data is collected in the form of a number, and the distance between each number is equal, like the interval level data. For example, when someone takes two seconds to answer a question, they are taking twice as long as a person who takes one second. Often the procedures, methods, and calculations that are used with interval data are also used with ratio data. The added feature of ratio data is that when a score of zero is given to a case, it means complete absence of what is being measured. Examples of ratio data include measures of weight, time, 
and height. If you can get a measurement that is a negative number, you are NOT dealing with ratio data, as 0 is the lowest value these data can take on. The following table gives examples of each type of variable.

Table 1.1 Levels of Measurement

Notice that the answer on a form to any of the nominal variables is a word, the answer to any of the ordinal variables is a ranking (maybe in the form of a number), and the answer to any of the interval/ratio variables is a number. Because the statistical procedures are the same for interval and ratio level data, we can group them together as interval/ratio.

Experiment, Experimental Control, Independent vs. Dependent Variable, and Correlational Study

In this course we refer to three different types of variables: the variable of interest, the independent variable, and the dependent variable. When you are attempting to describe a sample or population with numbers that vary in size, you are referring to the variable of interest. A researcher who runs an experiment has at least two groups and tries to keep all influences on all participants the same (experimental control), except in the case of one influence (the independent variable), which is changed for one group only. The result (the dependent variable) is then assessed. An easy way to tell the difference between an independent variable and a dependent variable is to answer the question: Which came first? The independent variable (IV) comes first.

A researcher who is not running an experiment but who is considering mere relationships between variables is conducting a correlational study. In a correlational study, one is seeking to determine whether the scores on one variable rise or fall as the scores on another variable rise or fall. The independent variable in this case would refer to the variable that would be noticed first. For example, if we were looking for a correlation between the trait anxiety 
and illness, we might measure a person's tendency to be anxious and later find out how many sick days he has had in a year. In this case the trait anxiety would be noticed first and would be called the independent variable.

Operational Definition

When discussing a variable, it is important to define what counts as a possible value for that variable. As you have already learned, some variables have numerical values (like quantitative data measured on interval or ratio scales) and others have categorical values (like qualitative data measured on the nominal scale). When you specify exactly what kind of values will be used for your variable, you have created the operational definition. For example, if you operationally defined the variable smoking habits as the number of cigarettes smoked in a day, you would be using a ratio level of measurement (number of cigarettes per day). But this might not be specific enough to really capture how much someone smokes. To be more specific, you might use the number of seconds spent dragging on (inhaling) a cigarette in a day. Some people light up but let the cigarette burn in the ash tray, so you would not want to count that cigarette in the total. Using seconds spent dragging takes into account the possibility that somebody could light up 60 cigarettes a day but drag less than someone who lights up only 20. So seconds is a more accurate indication of smoking habits than the number of cigarettes lit.

You can measure just about any variable on each of the 4 levels of measurement, and you should practice coming up with different operational definitions of the same variable. In keeping with the example above, we could measure the variable smoking habits on each of the 4 levels of measurement:

nominal level: cigars, cloves, menthols, non-menthols. Smoking habits is measured in terms of what is smoked; the values are types of tobacco products.

ordinal level: does not smoke, smokes < one pack a day, smokes 1-2 packs a day, smokes > 2 packs a day. Smoking habits is measured in 
terms of how much is smoked; the values have a natural order from less to more, but the values are not the same size intervals (the difference between values is not consistent).

interval level: the number of seconds dragging greater than two seconds. Smoking habits is measured in terms of time spent dragging that's MORE THAN 2 seconds. This is the key for making it interval. What if I spent only 1 second dragging? My measurement would be a -1. The possibility of getting a negative answer makes this operational definition interval and not ratio.

ratio level: the number of seconds dragging. Smoking habits is measured simply as the number of seconds spent dragging. If you are counting from zero to any positive whole number, you have ratio data.

Research Design

In order to analyze data using statistical tools, you must first decide who or what to study, how they are operationally defined, who or what to eliminate from the study, etc. The research design is the plan of how you are going to conduct the research. Statistical tools are used to analyze the data once it is collected. Your research design is limited by your knowledge of statistical tools, and your statistical tools are limited by the research design you choose. If you do not know how to compare and show a significant difference between two groups, then you are not able to support any comments you might make about the difference between the two groups. If you do not know how to find the correlation between two variables, you are not able to back up your comments about a relationship that exists. So your statistics are bound to the type of research you are doing. You have to know about design and statistics in order to answer any research question.

Self-Help Exercises Lesson 1

The following exercises will help you apply the ideas from the lesson.

1.1 Identify the cases and the data.
a. A teacher has self-image test scores for each of her kindergarten pupils.
b. A school psychologist uses a 1-5 scale to rate the level 
of cooperation in each of several elementary school classrooms.
c. A researcher records the number of weapons violations reported during the last year in each of 30 school districts.

1.2
a. What do we learn from the data?
b. What do descriptive statistics tell us?
c. What do inferential statistics tell us?

1.3 In the statement "Late registrants who enrolled in classes in which the number of allowed absences was restricted were significantly more likely to drop the course or be dropped than were timely registrants," what does the word significantly mean?

1.4 Identify which of the following are nominal, which are ordinal, which are interval, and which are ratio.
a. Classification in school, scored 1 = freshman, 2 = sophomore, 3 = junior, and 4 = senior.
b. Religious preference, scored 1 = Protestant, 2 = Catholic, 3 = Jewish, and 4 = other.
c. Political affiliation, scored 1 = Republican, 2 = Democrat, 3 = independent.
d. Leadership skill, scored by rank-ordering the cases.
e. Cognitive speed, scored as the number of arithmetic problems each case solves in 60 seconds.
f. Educational attainment, scored 0 = no high school diploma, 1 = high school graduate, 2 = associate's degree, 3 = bachelor's degree, 4 = master's degree, 5 = doctorate.
g. Attitudes toward a proposed tax, scored 0 = oppose, 1 = favor.
h. Intelligence, measured by the number of IQ test items answered correctly.
i. Academic performance, measured as one's ranking in the class.
j. Obesity, measured as percentage of body fat.

1.5 In an investigation of pain sensitivity, a researcher asks subjects to immerse their arms in ice water for 60 seconds. Indicate the type of data he records as nominal, ordinal, interval, or ratio.
a. Number of seconds to the initial report of pain.
b. Ratings of pain following 20 seconds of immersion.
c. Rank orders of subjects based on the number of seconds to the first report of pain.
d. Whether subjects did or did not report pain by 10 seconds following immersion.

1.6 Identify the independent and dependent variable in the following experiments.
a. A teacher gives stickers for good behavior to students in one of her classes and uses verbal approval 
alone to reward good behavior in a second class. She later compares the conduct grades of students in her two classes.
b. A physiological psychologist injects one group of rats with a drug and gives rats in a second group a saline injection. Four hours later he compares the maze-learning performance of the rats in the two groups, measured by number of trials to perfect performance.

1.7 Identify the cases and data in each of the following examples.
a. A family therapist records the number of children in each of several families.
b. A psychophysiologist measures the blood pressure of each of 10 executives.
c. A sociologist records population densities for each of 15 cities.

1.8 Why is a variable called a variable?

1.9 Create operational definitions for the variable intelligence on each of the four levels of measurement.

1.10 Identify the scale of measurement (nominal, ordinal, interval, or ratio) used in measuring each of the following variables.
a. Family size, categorized as "small", "medium", or "large".
b. Classroom size, measured as number of pupils.
c. Psychopathology, scored 1 = depression, 2 = schizophrenia, and 3 = personality disorder.
d. Academic achievement, measured as number of items answered correctly on a test.

1.11 Explain the difference between a variable and a value.

1.12 Give an example of a descriptive statistic.

1.13 Give an example of a descriptive parameter.

1.14 A criminologist was interested in finding as many statistics as he could on the inmates of a prison. Use 3 checkmarks to indicate whether each variable is (a) nominal, ordinal, interval, or ratio; (b) qualitative or quantitative; and (c) discrete or continuous. Some of these are judgment calls; be prepared to explain your answers.

Quiz Questions Lesson 1

How can I tell the difference between a sample and a population?
What is meant when someone says that they are able to generalize to the population from a sample?
What is the difference between continuous and discrete data?
What is the difference between statistics and a statistic?
What is 
the difference between descriptive statistics and inferential statistics?
When a researcher reports that his findings are significant, what does he or she mean?
Name and give one example of EACH of the four different scales of measurement.
What is the difference between the independent and dependent variable?
What is a parameter?
Come up with an operational definition for income for each level of measurement.
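
As a small illustration of the measurement rule (male = 1, female = 2) and the descriptive statistics n, Min, and Max discussed in this lesson, here is a minimal Python sketch. The lesson itself mentions Excel, so the choice of Python is ours. The five students, their ages, and the names sample, GENDER_CODES, and describe are all made up for illustration and are not part of the course materials.

```python
# Hypothetical sample of five students (made-up values, for illustration only).
sample = [
    {"gender": "male", "age": 19},
    {"gender": "female", "age": 22},
    {"gender": "female", "age": 18},
    {"gender": "male", "age": 25},
    {"gender": "female", "age": 20},
]

# Measurement rule from the lesson: male = 1, female = 2.
# The codes are nominal, so the assignment is arbitrary.
GENDER_CODES = {"male": 1, "female": 2}


def describe(cases):
    """Return descriptive statistics for a sample: n, Min, Max, and percent female."""
    ages = [case["age"] for case in cases]
    codes = [GENDER_CODES[case["gender"]] for case in cases]
    n = len(cases)
    return {
        "n": n,                  # number of cases in the sample
        "Min": min(ages),        # age of the youngest student
        "Max": max(ages),        # age of the oldest student
        "percent_female": 100 * codes.count(2) / n,  # share of cases coded 2
    }


stats = describe(sample)
print(stats)
```

Each number returned here describes only the sample, so each is a statistic; the same summaries computed over every student in the population would be parameters.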