PY 211 Lectures 1-10
This 7 page Class Notes was uploaded by Jennifer Scheuer on Saturday February 6, 2016. The Class Notes belongs to PY 211 at University of Alabama - Tuscaloosa taught by Andre Souza in Spring 2016. Since its upload, it has received 76 views. For similar materials see Elem Statistical Methods in Psychlogy at University of Alabama - Tuscaloosa.
Date Created: 02/06/16
PY 211-Introduction to Statistical Modeling (Chapter 1)

What is Statistics?
The science of learning from data, and of measuring, controlling, and communicating UNCERTAINTY.
Gives methods for collecting and analyzing data, and for making predictions based on data.
Statistics quantifies variation.

Variation
Everything varies.
Statistically Significant: the variation observed is larger than would be expected by chance.
Non-Statistically Significant: the variation observed is what one would expect.
Significant variation suggests that something (an explanatory variable) is influencing (changing the variation of) what we are observing (the response variable).

Variables
Response variable: the variable whose variation you are trying to understand.
Explanatory variable: the variable that is influencing the variation of the response variable.
Variable: something that takes on different values. (The label, not the actual value.)

Key Terms in Statistics
Descriptive Statistics: the summary of the information in a collection of data.
Inferential Statistics: using sample statistics to estimate the values of population parameters.
Population: the total set of units.
Sample: a subset of the population.
Parameter: a number that describes the entire population (represented by Greek letters).
Statistic: a single number that summarizes a sample (represented by Roman letters).
Random Sampling: each member of the population has to have an equal chance of being selected as part of the sample. (Ex. of a violation: testing soccer players, but drawing the sample from a population of baseball players.)
Parameter Estimation: using a sample statistic to estimate a population parameter.

PY 211-Basic Concepts in Statistics (Chapter 2)

Types of Variables
Discrete Variables: variables that can only take on specific values. (Ex. Number of siblings)
Continuous Variables: variables that can take any real number value. (Ex. Reaction time)
Categorical Variables: qualitative variables, used to characterize a set of categories. They can be:
Nominal: two or more categories where order does not matter. (Ex.
Modes of transportation)
Dichotomous: only two categories where order does not matter. (Ex. True or false) These are also nominal.
Ordinal: two or more categories where order does matter. (Ex. Rankings)
Quantitative Variables: variables characterized by numerical values. They can be:
Interval: numerical values in which the intervals between adjacent values are assumed to be equal. (Ex. Temperature)
Ratio: numerical values with a meaningful zero point. (Ex. Height) Zero represents the absence of the quantity being measured.

Understanding Types of Variables
All categorical variables are discrete.
Quantitative variables can be either continuous or discrete.

PY 211-Displaying Data (Chapter 3)

Data Summary
Categorical Variables: list the categories and show the frequency for each category.
Frequency Distribution: the listing of possible values for a variable, with the number of observations at each value.
Relative Frequency: the proportion or percentage of observations that fall at a particular value.
Outliers: extreme observations that fall far from the rest of the data. (They can cause exaggerated estimates.)

PY 211-Measures of Central Tendency (Chapter 4)

Data Frames
An object with rows and columns.
Rows contain the different observations.
Columns contain the values of the different variables.
The values can be either quantitative or qualitative.

Central Tendency
The tendency of measurements to cluster around certain values.
Sample statistics often cluster around central values.
Find it by plotting the data.

Mathematical Notation
A variable is represented by a lowercase letter, such as $x$.
Individual values of a variable are represented by a subscript: $x_1, x_2, \ldots, x_n$.
To refer to a single value without specifying which one it is, use $x_i$.
$\sum$ means to sum everything that follows.

Arithmetic Mean
The most straightforward measure of central tendency.
Represented by $\bar{x}$.
Answers the question: if all the data points had the same value (while keeping the same sum), what would that value be?
Only works for quantitative variables.
It is very sensitive to outliers.
It is the only single number for which the residuals sum to zero: $\sum (x_i - \bar{x}) = 0$

Geometric Mean
Answers the question: if all the numbers in a dataset had the same value, what would that value be in order to achieve the same product?
Represented by $\hat{x}$.
Used when numbers are dependent on each other: $\hat{x} = \sqrt[n]{\prod x_i}$

Median
The middle value in a dataset, separating the higher half from the lower half.
Not sensitive to outliers.
Arrange the numbers from lowest to highest and find the term in the middle.
If there is an even number of values, the median is the average of the two middle values.
Appropriate for quantitative and ordinal variables.

Mode
Represents the most common outcome.
Mostly used for highly discrete variables, but applicable to all types.

PY 211-Measures of Variability (Chapter 5)

Variability
Measures how well an individual score represents an entire distribution.
The greater the variability, the greater the uncertainty about the parameter being estimated.

Range
The distance between the minimum and the maximum values.
Only two data points contribute to the range (the minimum and the maximum).

Residuals
The distance between each data point and the mean: $x_i - \bar{x}$
The longer the residual line, the more variability in the data.
The sum of all the residuals will be zero.
To get rid of the negative signs, take the absolute value or square the residuals.
Absolute Value: $\sum |x_i - \bar{x}|$
Square: $\sum (x_i - \bar{x})^2$

Sum of Squares
Square: $\sum (x_i - \bar{x})^2$ = SUM OF SQUARES
The sum of the squared deviations.
Computational formula: $\sum x_i^2 - \frac{(\sum x_i)^2}{n}$
Represents TOTAL VARIABILITY.
The more numbers (the bigger the n), the bigger the sum of squares.
Have to make the sum of squares NOT dependent on sample size.
Mean squared deviation: $\frac{\sum (x_i - \bar{x})^2}{n}$

Degrees of Freedom
The sample size minus the number of parameters estimated from the data.
We have (n-1) free choices if we estimate one parameter from a sample of size n.
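
The measures of central tendency and the sum-of-squares formulas above can be checked numerically. What follows is a minimal Python sketch, not part of the lecture: the dataset `scores` is made up for illustration, and the variable names are hypothetical. It uses only the standard library (`statistics.geometric_mean` needs Python 3.8 or later) and shows that the residuals sum to zero and that the conceptual and computational formulas give the same sum of squares.

```python
import math
import statistics

# Hypothetical data for illustration only (not from the lecture).
scores = [2, 4, 4, 5, 10]
n = len(scores)

# Measures of central tendency
mean = statistics.mean(scores)                # arithmetic mean
geo_mean = statistics.geometric_mean(scores)  # nth root of the product
median = statistics.median(scores)            # middle value after sorting
mode = statistics.mode(scores)                # most common value

# Range: only the minimum and maximum contribute
data_range = max(scores) - min(scores)

# Residuals: signed distance of each point from the mean
residuals = [x - mean for x in scores]
residual_sum = sum(residuals)                 # zero for the arithmetic mean

# Sum of squares, conceptual formula: sum of squared residuals
ss_conceptual = sum(r ** 2 for r in residuals)

# Sum of squares, computational formula: sum(x^2) - (sum(x))^2 / n
ss_computational = sum(x ** 2 for x in scores) - sum(scores) ** 2 / n

# Mean squared deviation: sum of squares divided by n
mean_squared_deviation = ss_conceptual / n

print("mean:", mean, "median:", median, "mode:", mode)
print("geometric mean:", geo_mean)
print("range:", data_range)
print("sum of residuals:", residual_sum)
print("sum of squares:", ss_conceptual, ss_computational)
print("mean squared deviation:", mean_squared_deviation)
```

With this dataset the single large value (10) pulls the mean above the median, which illustrates why the mean is described as sensitive to outliers while the median is not.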

Variance
Provides an unbiased estimate of population variability.
The sum of squares divided by the degrees of freedom.
Measures AVERAGE VARIABILITY.
Conceptual Formula: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1}$
Computational Formula: $s^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n-1}$
Measures the reliability of an estimate.

Standard Deviation
Conceptually the same as variance.
Variance is difficult to interpret because it is in squared units.
Resolve that problem by taking the square root, which is the standard deviation.
Conceptual Formula: $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}}$
Computational Formula: $s = \sqrt{\frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n-1}}$
The typical distance from the mean.
The greater the variation around the mean, the greater the standard deviation.
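
As a rough check on the variance and standard deviation formulas, here is a short Python sketch under stated assumptions: the dataset `scores` is made up for illustration and the variable names are hypothetical. It computes the variance with both the conceptual and the computational formula, takes the square root to get the standard deviation, and compares the results with the standard library's `statistics.variance` and `statistics.stdev`, which also divide by n-1.

```python
import math
import statistics

# Hypothetical data for illustration only (not from the lecture).
scores = [2, 4, 4, 5, 10]
n = len(scores)
mean = sum(scores) / n

# Degrees of freedom: sample size minus the one parameter (the mean) estimated
df = n - 1

# Conceptual formula: sum of squared deviations divided by n - 1
sum_of_squares = sum((x - mean) ** 2 for x in scores)
variance_conceptual = sum_of_squares / df

# Computational formula: (sum(x^2) - (sum(x))^2 / n) / (n - 1)
variance_computational = (sum(x ** 2 for x in scores) - sum(scores) ** 2 / n) / df

# Standard deviation: square root of the variance, back in the original units
std_dev = math.sqrt(variance_conceptual)

# Cross-check against the standard library (sample versions, n - 1 denominator)
library_variance = statistics.variance(scores)
library_std_dev = statistics.stdev(scores)

print("variance (conceptual):   ", variance_conceptual)
print("variance (computational):", variance_computational)
print("standard deviation:      ", std_dev)
print("library variance, stdev: ", library_variance, library_std_dev)
```

The two formulas agree because they are algebraic rearrangements of the same quantity; the computational form is simply easier to evaluate by hand since it avoids computing each residual.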