## Stat 206 Notes: Chapter 1

by: Brandon Gearhart

Chapter 1 notes for the first exam

These 4 pages of class notes were uploaded by Brandon Gearhart on Sunday, October 2, 2016. They belong to Stat 206 at the University of South Carolina, taught by Angela Ferguson in Fall 2016.

Date Created: 10/02/16
### 1.1 Defining Data & 1.2 Measurement Scales

Operational definition: a universally accepted meaning that is clear to all associated with the analysis.

Types of variables:

- Categorical
  - Nominal: name of a category. Ex: gender
  - Ordinal: natural ordering. Ex: economic status
- Numerical/quantitative
  - Discrete: distinct cutoffs between values (counting numbers, whole numbers). Ex: number of friends, children, courses
  - Continuous: on a continuum. Ex: height, weight, time

Numerical variables are measured on:

- Interval scale: ordered scale in which the difference between measurements is a meaningful quantity but does not involve a true zero point. Ex: temperature, test score
- Ratio scale: ordered scale in which the difference between measurements is a meaningful quantity and involves a true zero point. Ex: age (years or days), money

### 1.3 Collecting Data

1. Identify data sources
2. Population (or sample)
3. Cleaning your data
4. Recoding variables

Population: all items/individuals about which you want to reach conclusions (described by parameters).

Sample: the items/individuals selected from the population for analysis, i.e., the items/individuals about which you collect data (described by statistics).

Why sample?

- Less time consuming
- Less costly
- Less cumbersome and more practical

Data sources/formatting:

- Primary data: you collect your own data
- Secondary data: the data for analysis have been collected by someone else

Collect data from:

1. Data distributed by an organization/individual
2. Outcomes of a designed experiment
3. Responses from a survey
4. Results of an observational study
5. Data collected by ongoing business activities

Data can be formatted in more than one way:

- Structured data (tables, standard forms, data streams)
- Unstructured data (open questions, messages)

Data cleansing: checking the data for outliers and missing values.

- Outliers: values that seem excessively different from most of the data values
- Missing values: values that were not collected

When recoding variables into categories, the categories must be:

- Mutually exclusive: each data value is placed in one and only one category
- Collectively exhaustive: all data values must be recorded in the categories created

### 1.4 Types of Sampling

Sampling frame: listing of the items/individuals/units from the population, used to select the sample.

Probability sample: select items/individuals/units for the sample based on known probabilities (GOOD!). Results can be generalized to the population.

Non-probability sample: select items/individuals/units for the sample without knowing their probabilities of selection.

A sample should be representative of the population. That is, the sample should be "like" the population to the greatest degree possible.

Judgment: collect a sample that an expert thinks is representative of the population.

- Pros: hmmm...
- Cons:
  - Who/what is an expert? Who gets to make the decision?
  - E.g., pre-Columbus, experts believed the world...

Convenience: collect the sample that is easiest to access.

Volunteer: subjects choose to participate in the study.

Census: (attempt to) collect data from every individual in the population.

- Pros: "like" the population because it is the population
- Cons: time, money, can be destructive

Simple random sample: the sample is chosen in such a way that every subject is equally likely to be selected for the study.

- Pros:
  - Most basic form of random sampling
  - Random sampling is the most likely way to achieve a sample that is representative of the population
- Cons:
  - Not always feasible
  - Can require large samples
  - May require more complex sample designs

Data sources, in more detail:

- Data sets that cover large spatial areas and/or long time periods are usually distributed by an organization.
- In a designed experiment, the researcher subjects different groups to different conditions and observes the results.
- In a survey, people are asked questions about their beliefs, attitudes, behaviors, and other characteristics.
- In an observational study, the researcher collects data by directly observing a behavior, usually in a natural or neutral setting.
- Data collected by ongoing business activities can come from operational and transactional systems that exist in both physical and online settings, but can also be gathered from secondary sources such as third-party social media networks and online apps and website services that collect tracking and usage data.
- A census bureau oversees the collection of a variety of information regarding population, housing, and manufacturing and undertakes special studies on topics such as crime, travel, and health care.
- Obtaining data from ongoing business activities typically involves extracting information from databases or website services that contain or track information about a customer or client.
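
The difference between the interval and ratio scales in section 1.2 can be checked with a little arithmetic. The sketch below is only an illustration with made-up numbers, and the helper names `celsius_to_fahrenheit` and `years_to_days` are hypothetical. It shows that a ratio of temperatures changes when the unit changes (no true zero), while a ratio of ages does not.

```python
def celsius_to_fahrenheit(c):
    """Convert a Celsius temperature to Fahrenheit."""
    return c * 9 / 5 + 32


def years_to_days(years):
    """Convert an age in years to days (assumption: 365-day years)."""
    return years * 365


# Interval scale: temperature has no true zero, so ratios depend on the unit.
warm_c, cool_c = 20, 10  # made-up temperatures in Celsius
ratio_c = warm_c / cool_c
ratio_f = celsius_to_fahrenheit(warm_c) / celsius_to_fahrenheit(cool_c)
print(ratio_c)  # 2.0
print(ratio_f)  # 1.36

# Ratio scale: age has a true zero, so the ratio is the same in any unit.
older, younger = 20, 10  # made-up ages in years
age_ratio_years = older / younger
age_ratio_days = years_to_days(older) / years_to_days(younger)
print(age_ratio_years)  # 2.0
print(age_ratio_days)   # 2.0
```

Because the temperature ratio is 2.0 in one unit and 1.36 in another, "twice as hot" is not a meaningful statement, whereas "twice as old" is.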
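
The recoding rules from section 1.3 (mutually exclusive, collectively exhaustive) can be sketched as a small check. The age brackets, the `recode_age` and `check_categories` names, and the data are all assumptions made up for illustration; they are not from the course.

```python
# Hypothetical brackets: (label, lower bound inclusive, upper bound exclusive).
AGE_BRACKETS = [
    ("under 18", 0, 18),
    ("18 to 64", 18, 65),
    ("65 and over", 65, float("inf")),
]


def recode_age(age):
    """Return the labels of every bracket that contains `age`."""
    return [label for label, low, high in AGE_BRACKETS if low <= age < high]


def check_categories(values):
    """Recode each value, failing if it fits zero or several brackets."""
    recoded = []
    for value in values:
        matches = recode_age(value)
        if len(matches) == 0:
            raise ValueError(f"{value} fits no category (not collectively exhaustive)")
        if len(matches) > 1:
            raise ValueError(f"{value} fits several categories (not mutually exclusive)")
        recoded.append(matches[0])
    return recoded


ages = [17, 18, 42, 65, 90]  # made-up data
recoded = check_categories(ages)
print(recoded)
# ['under 18', '18 to 64', '18 to 64', '65 and over', '65 and over']
```

Using half-open intervals (lower bound included, upper bound excluded) is one simple way to keep boundary values such as 18 and 65 from landing in two categories.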
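
The data cleansing step from section 1.3 can be sketched as follows. The notes only describe outliers as values that seem excessively different from most of the data, so the 1.5 × IQR rule used here is an assumed rule of thumb, not the course's definition. The function names, the use of `None` for missing values, and the data are also made up for illustration.

```python
import statistics


def split_missing(values):
    """Separate recorded values from missing ones (represented here as None)."""
    recorded = [v for v in values if v is not None]
    n_missing = len(values) - len(recorded)
    return recorded, n_missing


def flag_outliers(recorded, k=1.5):
    """Flag values more than k * IQR outside the quartiles (assumed rule)."""
    q1, _, q3 = statistics.quantiles(recorded, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in recorded if v < low or v > high]


weights = [52, 48, None, 50, 51, 49, 250, None, 47, 53]  # made-up data
recorded, n_missing = split_missing(weights)
outliers = flag_outliers(recorded)
print(n_missing)  # 2
print(outliers)   # [250]
```

Flagging is deliberately separate from deleting: whether an outlier is an error or a genuine value is a judgment the analyst still has to make.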
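
A simple random sample from a sampling frame, as described in section 1.4, can be sketched with Python's standard `random` module. The frame of student IDs, the `simple_random_sample` helper, and the seed value are hypothetical and only there to make the example reproducible.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sampling frame, each equally likely."""
    if n > len(frame):
        raise ValueError("sample size cannot exceed the size of the sampling frame")
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(frame, n)  # sampling without replacement


# Hypothetical sampling frame: a listing of 200 students in the population.
frame = [f"student_{i:03d}" for i in range(1, 201)]
sample = simple_random_sample(frame, n=10, seed=206)
print(sample)
```

Note that the sample can only be as good as the frame: anyone missing from the listing has no chance of being selected.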

