Stat 206 Notes: Chapter 1
These 4 pages of class notes were uploaded by Brandon Gearhart on Sunday, October 2, 2016. The notes belong to Stat 206 (Business Statistics) at the University of South Carolina, taught by Angela Ferguson in Fall 2016.
Date Created: 10/02/16
1.1 Defining Data & 1.2 Measurement Scales

Operational Definition: universally accepted meaning that is clear to all associated with the analysis

Types of Variables:
  Categorical:
    Nominal: name of category
      Ex: Gender
    Ordinal: natural ordering
      Ex: Economic status
  Numerical/Quantitative:
    Discrete: distinct cutoffs between values (counting numbers, whole numbers)
      Ex: Number of friends, children, courses
    Continuous: on a continuum
      Ex: Height, weight, time

Numerical variables are measured on:
  Interval Scale: ordered scale in which the difference between measurements is a meaningful quantity but that does not involve a true zero point
    Ex: Temperature, test score
  Ratio Scale: ordered scale in which the difference between measurements is a meaningful quantity and that does involve a true zero point
    Ex: Age (years or days), money

1.3 Collecting Data

1. Identify data sources
2. Population (sample)
3. Cleaning your data
4. Recoding variables

Population: all items/individuals about which you want to reach conclusions (described by parameters)
Sample: items/individuals selected from the population for analysis, i.e., the items/individuals about which you collect data (described by statistics)

Why sample?
  Less time consuming
  Less costly
  Less cumbersome and more practical

Data Sources/Formatting
  Primary data: you collect your own data
  Secondary data: data for analysis have been collected by someone else

Collect data from:
1. Data distributed by an organization/individual
2. Outcomes of a designed experiment
3. Responses from a survey
4. Results of an observational study
5. Data collected by ongoing business activities

Data can be formatted in more than one way:
  Structured data (tables, standard forms, data streams)
  Unstructured data (open questions, messages)

Data Cleansing: checking for outliers and missing values
  Outliers: values that seem excessively different from most of the data values
  Missing values: values that were not collected

When recoding variables, the categories should be:
  Mutually exclusive: each data value is placed in one and only one category
  Collectively exhaustive: all data values must be recorded in the categories created

1.4 Types of Sampling

Sampling Frame: listing of items/individuals/units from the population, used to select the sample
Probability Sample: select items/individuals/units for the sample based on known probabilities (GOOD!). Results can be generalized to the population.
Non-probability Sample: select items/individuals/units for the sample without knowing their probabilities of selection

The sample should be representative of the population. That is, the sample should be "like" the population to the greatest degree possible.

Judgement: collect a sample that an expert thinks is representative of the population
  Pros: hmmm…
  Cons: Who/what is an expert? Who gets to make the decision?
  E.g., pre-Columbus, experts believed the world ...
Convenience: collect the sample that is easiest to access
Volunteer: subjects choose to participate in the study
Census: (attempt to) collect data from every individual in the population
  Pros: "like" the population because it is the population
  Cons: time, money, can be destructive
Simple Random Sample: sample is chosen in such a way that every subject is equally likely to be selected for the study
  Pros: most basic form of random sampling. Random sampling is the most likely way to achieve a sample that is representative of the population.
  Cons: not always feasible; can require large samples; may require more complex sample designs

Data sets that cover large spatial areas and/or long time periods are usually distributed by an organization.
As an example of data distributed by an organization, a census bureau oversees the collection of a variety of information regarding population, housing, and manufacturing and undertakes special studies on topics such as crime, travel, and health care.

In a designed experiment, the researcher subjects different groups to different conditions and observes the results.

In a survey, people are asked questions about their beliefs, attitudes, behaviors, and other characteristics.

In an observational study, the researcher collects data by directly observing a behavior, usually in a natural or neutral setting.

Data collected by ongoing business activities can come from operational and transactional systems that exist in both physical and online settings, but can also be gathered from secondary sources such as third-party social media networks and online apps and website services that collect tracking and usage data. Obtaining such data typically involves extracting information from databases or website services that contain or track information about a customer or client.
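
Worked example (not from lecture): the simple random sample from section 1.4, along with the parameter vs. statistic distinction from section 1.3, can be sketched in a few lines of Python. Everything here is made up for illustration: the sampling frame of 100 students, the `courses` variable, the helper name `simple_random_sample`, and the seed value. It is a minimal sketch, assuming the sampling frame lists every member of the population exactly once.

```python
import random
from statistics import mean


def simple_random_sample(frame, n, seed=None):
    """Draw n units from the sampling frame without replacement.

    Every unit in the frame is equally likely to be selected.
    The seed is optional and only makes the draw repeatable.
    """
    rng = random.Random(seed)
    return rng.sample(frame, n)


# Hypothetical sampling frame: 100 students with a made-up discrete
# variable (number of courses taken this semester).
population = [{"id": i, "courses": 3 + i % 4} for i in range(1, 101)]

# Select 10 students; each student has the same chance of being chosen.
sample = simple_random_sample(population, 10, seed=206)

# Parameter: computed from the whole population.
parameter = mean(p["courses"] for p in population)

# Statistic: computed from the sample only; used to estimate the parameter.
statistic = mean(s["courses"] for s in sample)

print("Population mean (parameter):", parameter)
print("Sample mean (statistic):", statistic)
```

The statistic will usually be close to, but not exactly equal to, the parameter. Changing the seed gives a different sample and therefore a different statistic, while the parameter stays fixed.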