# Week One 30002

KSU

These five pages of class notes were uploaded by Cole Wojdacz on Tuesday, February 2, 2016. The notes belong to course 30002 at Kent State University, taught by Dr. Eng in Summer 2015.

Date Created: 02/02/16

## Foundations of Biostatistics

### What is Biostatistics?

- Application of statistical principles to medical, public health, and biological applications
- It is the science that deals with development and application of the most appropriate methods for the:
  - Collection of data
  - Presentation of the collected data
  - Analysis and interpretation of the results
  - Making decisions on the basis of such analysis

### Role of Biostatisticians

- To guide the design of an experiment or survey prior to data collection
- To analyze data using proper statistical procedures and techniques
- To present and interpret the results to researchers and other decision makers

### Population vs Sample

- Popular population: a collection of persons or things in which we have an interest
  - A collection of persons living in Florida who test positive for Hepatitis C
  - The collection of deer in a Michigan county who carry the tick responsible for Lyme disease
- Statistical population: a collection of some characteristic of a set of persons or things in which we have an interest
  - The blood pressures of some set of students
  - An indicator as to whether each student had
- Sample: a subset of a population that may consist of persons/things or characteristics of persons/things, depending on whether the sample is from a popular or statistical population
  - A sample of 50 men over the age of 65 who suffer from hypertension
  - A sample of 50 blood pressures taken on men over age 65 who suffer from hypertension
- Statistic: a number derived from a sample
- Parameter: a number derived from a population

### Two Types of Statistics

- Descriptive: values that describe characteristics for a set of observations or data
  - Percentages
  - Means
  - Frequency distributions
- Inferential: values that represent generalizations about some characteristics of a population, based on information found in a sample

### Why Use Statistics?
- Descriptive statistics
  - Identify patterns
  - Summarize information
  - Lead to hypothesis generation
- Inferential statistics
  - Distinguish true differences from random variation
  - Allow for hypothesis testing

## Scales of Measurement

### Data

- Data: a characteristic that is measured and recorded
  - A series of blood pressures is recorded on a piece of paper or in a computer file

### Two Types of Characteristics

- Constant: characteristics that do not take different values under the conditions of the research
- Variable: characteristics that take different values under the conditions of the research
  - Dependent: outcome variables; a characteristic of interest, the variation of which is explained by variation in other variables
  - Independent: explanatory characteristic, the variation of which is used to understand the variation in the dependent variable

### Continuous vs Discrete

- Variables are classified according to whether they can take on any value within an interval
  - Yes - Continuous
  - No - Discrete
- Continuous variables: do not have gaps between values
  - Test scores
- Discrete variables: have gaps between values
  - Number of hospital admissions
  - Dichotomous variables: discrete variables that can take only one of two values
    - Male or female
    - Dead or alive

### Scales of Measurement

- Qualitative variables (always discrete)
  - Categorical/nominal: assigns a name or number to observations in an arbitrary manner
    - Least sophisticated (informative) of the four scales
    - Display with a bar graph
    - No information regarding quantity or amount
    - Characterizes persons or things as being similar or dissimilar
    - Has no ability to make "greater than" or "less than" distinctions
    - Dichotomous variables have only two responses (yes/no)
  - Ordinal: assigns names or numbers to observations in a sequence
    - Classifications incorporate the attributes of "greater than" and "less than"
    - Cannot provide information as to how much less or more one category represents than another
- Quantitative variables
  - Interval: assigns numbers that give both rank order and a constant distance between categories
    - Any values between a theoretical minimum and maximum
    - Provides information as to how much less or more one category represents than another
    - Scale points with equal differences represent equal differences in the characteristics being assessed
    - Has an arbitrary zero point
  - Ratio: identical to the interval scale except that the ratio scale has a true zero point

## Distributions and Graphs

### Descriptive Statistics

- Goal: take data from a sample and present it in a concise, understandable way
- Frequency distribution: the simplest way to present data

### Frequency Table

- Frequency distribution: shows the number of responses of each type
- Relative frequency distribution: shows the proportion of responses of each type
- Cumulative frequency distribution: shows the number of responses that are less than or equal to specified values
- Cumulative relative frequency distribution: shows the proportion of responses that are less than or equal to specified values

### Grouped Distribution

It is sometimes more informative to arrange data into groups or intervals of values rather than dealing with individual values. In constructing tables of this sort, two related questions must be answered:

- How many intervals should the values be grouped into?
- How wide should the intervals be?

There are no hard and fast answers to these questions, but a rule of thumb suggests that there be no fewer than six and no more than 15 intervals. Another helpful suggestion is that, when plausible, class interval widths of 5 units, 10 units, or some multiple of 10 should be used to make the summarization more comprehensible.

### Graphic Representation

It is often more informative to present distributions as graphs rather than in tabular form.
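For comparison with the graphs that follow, the tabular form itself can be sketched in a few lines of Python. This is a minimal illustration of the four distributions listed under Frequency Table, not part of the original notes: the function name `frequency_table` and the admissions data (ten hypothetical patients) are assumptions made for the example.

```python
from collections import Counter


def frequency_table(values):
    """Build one row per distinct value:
    (value, frequency, relative frequency,
     cumulative frequency, cumulative relative frequency).
    """
    counts = Counter(values)
    n = len(values)
    rows = []
    cumulative = 0
    for value in sorted(counts):
        freq = counts[value]
        cumulative += freq  # number of responses <= this value
        rows.append((value, freq, freq / n, cumulative, cumulative / n))
    return rows


# Hypothetical data: number of hospital admissions for ten patients.
admissions = [0, 1, 1, 2, 0, 3, 1, 0, 2, 1]

print("value  freq  rel   cum  cum_rel")
for value, f, rf, cf, crf in frequency_table(admissions):
    print(f"{value:>5} {f:>5} {rf:>5.2f} {cf:>5} {crf:>8.2f}")
```

The last row of any such table should show a cumulative frequency equal to the sample size and a cumulative relative frequency of 1, which is a quick check that no observations were dropped.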
Many graphical forms are available:

- Bar graphs: used for nominal data
  - Can be constructed for frequency, relative frequency, cumulative frequency, and cumulative relative frequency distributions
  - Response categories are depicted along the horizontal (x) axis
  - Frequencies, relative frequencies, cumulative frequencies, or cumulative relative frequencies are noted along the vertical (y) axis
  - The frequency, relative frequency, cumulative frequency, or cumulative relative frequency is read as the height, measured against the y axis, of a bar placed above the specified category
  - Normally used with discrete rather than continuous data
- Histograms: used for continuous data
  - Same general characteristics as bar graphs, but used with continuous data, while bar graphs are used with discrete data
  - Because histograms are used with continuous data, it is usually the upper and lower real limits of the intervals that are labeled on the x axis rather than the midpoints
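The grouping and histogram ideas above can be tied together with a short Python sketch. It is illustrative only and rests on several assumptions not stated in the notes: the data are hypothetical systolic blood pressures recorded to the nearest whole unit, the real limits of an interval are taken to extend half a unit beyond its stated limits (so 100-109 becomes 99.5-109.5), and the function name `grouped_frequencies` is invented for the example. The interval width of 10 and the count of six intervals follow the rule of thumb given earlier.

```python
def grouped_frequencies(values, start, width, n_intervals):
    """Count values in equal-width class intervals.

    Assumes values are recorded in whole units, so each stated
    interval (e.g. 100-109) has real limits half a unit wider
    on each side (99.5-109.5).
    """
    rows = []
    for i in range(n_intervals):
        lower = start + i * width          # stated lower limit
        upper = lower + width - 1          # stated upper limit
        real_lower = lower - 0.5
        real_upper = upper + 0.5
        freq = sum(1 for v in values if real_lower <= v < real_upper)
        rows.append({
            "stated": (lower, upper),
            "real_limits": (real_lower, real_upper),
            "midpoint": (lower + upper) / 2,
            "freq": freq,
        })
    return rows


# Hypothetical systolic blood pressures (mmHg) for 24 people.
systolic = [118, 124, 131, 109, 142, 127, 135, 121, 150, 138, 116, 129,
            133, 145, 122, 111, 126, 139, 157, 104, 128, 134, 119, 141]

# Six intervals of width 10, starting at 100.
table = grouped_frequencies(systolic, start=100, width=10, n_intervals=6)

# Text histogram: x-axis labels are the real limits, bar length is frequency.
for row in table:
    low, high = row["real_limits"]
    print(f"{low:>6.1f} to {high:<6.1f} {'#' * row['freq']}")
```

Labeling each bar with its real limits rather than its midpoint makes it visible that adjacent intervals share a boundary, which is what distinguishes a histogram from a bar graph of separate categories.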

