PSY 343 Ch. 6 part 2 Notes
These 7-page class notes were uploaded by Tatum Notetaker on Sunday, October 16, 2016. They belong to PSY 343 (Intro to Psychological Measurement) at DePaul University, taught by Douglas Cellar in Fall 2016.
Date Created: 10/16/16
Item Analysis
  o Item analysis: the statistical analysis of data obtained from an item tryout
  o Three closely related processes
      - Item tryout
      - Statistical analysis
      - Item selection

Item Tryout
  o Two stages of item tryout
      - Informal tryout: 5-10 individuals similar to those for whom the test is intended take the test and comment on the items and test directions; this may identify ambiguous wording, unexpected interpretations of an item, confusion about methods for responding, and other anomalies
      - Formal tryout: samples that are representative of the target population for the test take it; samples need to be large enough to yield stable data

Item Statistics
  o Traditional item analysis procedures depend on two concepts: item difficulty and item discrimination
  o Item difficulty: the percent of examinees answering the item correctly, for items scored correct/incorrect
      - p-value: the item difficulty level; the p stands for percentage (or proportion), so a high p-value means an easy item
  o Item discrimination: an item's ability to differentiate statistically, in a desired way, between groups of examinees
  o How to define groups with more or less of the trait
      - External method: the basis for identifying the groups is external to the test; depends on having two or more groups differentiated on the relevant trait
          - Example: test a group of people who are depressed and a group of people who are not, and expect the two groups' test scores to differ
      - Internal method: the basis for identifying the groups is internal to the test; determines the extent to which an individual item differentiates between high scorers and low scorers on the total test
          - Assumes that the entire test is a reasonably valid measure of the trait
          - High and low groups: the top and bottom halves, the top and bottom thirds, or the top and bottom quarters of the score distribution
  o Item Statistics in Item Response Theory
      - Item characteristic curve (ICC): relates performance on an item to status on the trait or ability underlying the scale
      - Performance on the item is defined as the probability of passing the item
      - Difficulty parameter: the point at which the ICC crosses the mark for 50% probability of passing the item
      - Slope: shows how sharply the item differentiates among persons of differing abilities
      - Guessing parameter: the chance that a person will be able to guess the correct answer on a multiple-choice item
      - The three parameters of the ICC are the slope, the difficulty parameter, and the guessing parameter
      - A one-parameter model takes into account only the difficulty parameter
          - Rasch model: the most popular one-parameter model
      - A two-parameter model takes into account difficulty and slope (discrimination)
  o Factor Analysis as an Item Analysis Technique
      - Helps select items that will yield relatively independent and meaningful scores

Item Selection
  o The final phase of the item analysis process
  o You have to select the items that will appear in the test to be standardized
  o Guidelines:
      - General rule: to increase a test's reliability, increase the number of items; there is a point of diminishing returns, however, where adding new items does not significantly increase reliability
      - The average difficulty of the test is a direct function of the item p-values; to get an easy test, use items with high p-values, and to get a hard test, use items with low p-values
      - An easy test will provide the best discrimination at the lower end of the distribution of scores
      - A hard test will provide the best discrimination at the upper end of the distribution
      - In general, we want items with high discrimination indexes; in practice, good indexes are often no higher than about .50, and an index of .30 is respectable
      - A collection of items with discrimination indexes of .30-.50 will make a very good test
      - There is an important relationship between an item's p-value and the maximum possible discrimination index (D): when p = .50, the maximum possible difference between the high and low groups can be obtained; as p approaches 0 or 1, the maximum possible D shrinks toward 0
      - Certain items may be included in an achievement test to satisfy the demands of the test's content specifications, that is, to ensure content validity

Standardization
  o Standardization program: yields the norms for the test
  o Analysis of test scores by gender, race, age, geographic region, and other demographics is often completed with the norming data
  o Studies of the test's validity may be conducted
  o Three types of equating programs may be conducted
      - Equating alternate forms of the test (if available)
      - Equating different levels of the test (if the test is multilevel)
      - Equating new and old editions of the test (if the test is a revision)

Preparation of final materials
  o The final step in the test development process is publication
  o A published test should have a technical manual
      - The key source of information about the test's purpose, rationale, and structure
      - Includes information about the test's reliability, validity, and norming procedure
  o Many tests have score reports
  o Publication may include a variety of supplementary materials

Fairness and Bias
  o Test fairness: a test is fair if it measures a trait, construct, or target with equivalent validity in different groups
  o Test bias: a test is biased if it does not measure the trait of interest in the same way across different groups
      - A difference in group averages indicates bias only if it does not correspond to a real difference in the underlying trait the test is attempting to measure
  o Methods of studying test fairness
      - Panel review: examination of the test items by representatives of a variety of groups
          - Two drawbacks: it is hard to know how many groups should be represented, and members of the panel rely entirely on their own opinions
      - Differential item functioning (DIF): addresses the question of whether individual test items function differently for different groups of examinees for reasons other than actual differences on the trait being measured; aims to detect bias by statistical analysis
          - Mantel-Haenszel procedure: divides the reference and focal groups into subgroups based on total test score, then compares the two groups' performance on the item within each subgroup
  o Differential prediction
      - An unbiased test should yield equally good predictions for various groups
      - Applies to criterion-related validity (predicting a criterion from test scores)
      - Intercept bias: the intercepts of the regression lines differ for two groups
      - Slope bias: the slopes of the regression lines differ for two groups

Accommodations and modifications
  o Accommodations: changes to the standardized procedures for a test
      - Should render the test's validity and norms equally applicable to disabled and nondisabled examinees
      - A person takes essentially the same test as other people but with slight changes in the testing conditions
  o Modification: an attempt to assess a given skill or trait but with essentially different methodology
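
Worked example for Item Statistics (item difficulty and item discrimination): a minimal Python sketch of how the p-value and the discrimination index D could be computed with the internal method. The function name `item_statistics`, the `fraction` parameter, and the response data are made up for illustration and do not come from the course or textbook. The sketch also assumes the total score includes the item being analyzed.

```python
def item_statistics(responses, item, fraction=0.5):
    """Return (p, D) for one item scored 1 = correct, 0 = incorrect.

    responses: one list of 0/1 item scores per examinee (hypothetical data).
    item:      index of the item to analyze.
    fraction:  share of examinees placed in each of the high and low groups
               (0.5 = halves, 1/3 = thirds, 0.25 = quarters).
    """
    n = len(responses)

    # Item difficulty (p-value): proportion of all examinees who got it right.
    p = sum(r[item] for r in responses) / n

    # Internal method: rank examinees by total test score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)
    k = int(n * fraction)
    high, low = ranked[:k], ranked[n - k:]

    # Discrimination index D: p in the high group minus p in the low group.
    p_high = sum(r[item] for r in high) / k
    p_low = sum(r[item] for r in low) / k
    return p, p_high - p_low


# Eight made-up examinees answering four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

print(item_statistics(responses, 0))  # (0.625, 0.75)
print(item_statistics(responses, 3))  # (0.375, 0.25)
```

With this made-up data, item 0 is fairly easy (p = .625) and separates high and low scorers well (D = .75), while item 3 is harder (p = .375) and discriminates less (D = .25).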
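
Worked example for Item Statistics in Item Response Theory: the notes describe the ICC only in words. One common way to write it down is the logistic form below, which is an assumption here because the notes do not give a formula (some presentations also multiply the slope by a scaling constant of 1.7). The function name `icc` and the parameter names `theta`, `a`, `b`, `c` are conventional choices, not taken from the notes.

```python
import math


def icc(theta, b, a=1.0, c=0.0):
    """Probability of passing an item at trait level theta (logistic ICC).

    b: difficulty parameter (location of the curve on the trait scale)
    a: slope / discrimination parameter
    c: guessing parameter (lower asymptote of the curve)

    With a = 1 and c = 0 this is a one-parameter model; with c = 0 it is a
    two-parameter model; with all three it is a three-parameter model.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# With no guessing, the curve crosses 50% exactly at the difficulty parameter.
print(icc(theta=0.0, b=0.0))  # 0.5

# A steeper slope separates two nearby ability levels more sharply.
flat_gap = icc(0.5, b=0.0, a=0.5) - icc(-0.5, b=0.0, a=0.5)
steep_gap = icc(0.5, b=0.0, a=2.0) - icc(-0.5, b=0.0, a=2.0)
print(flat_gap < steep_gap)  # True
```

One thing this form makes visible: once a guessing parameter above 0 is included, the probability of passing at theta = b is (1 + c) / 2 rather than exactly 50%, so the "50% mark" description of difficulty fits the one- and two-parameter cases most directly.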
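
Worked example for the Mantel-Haenszel procedure (Fairness and Bias): a rough sketch of the usual summary statistic, the common odds ratio pooled across total-score subgroups. The notes only describe the grouping step, so the formula used here is the standard Mantel-Haenszel estimator rather than something quoted from the notes. The function name and all counts are hypothetical.

```python
def mantel_haenszel_odds_ratio(strata):
    """Pooled odds ratio for one item across total-score subgroups.

    strata: one tuple per score subgroup, in the order
            (ref_correct, ref_incorrect, focal_correct, focal_incorrect).
    A value near 1 suggests no DIF; a value above 1 means the item is easier
    for the reference group than for focal-group examinees with similar
    total scores.
    """
    numerator = 0.0
    denominator = 0.0
    for ref_right, ref_wrong, focal_right, focal_wrong in strata:
        n = ref_right + ref_wrong + focal_right + focal_wrong
        if n == 0:
            continue  # skip empty score levels
        numerator += ref_right * focal_wrong / n
        denominator += ref_wrong * focal_right / n
    # Note: raises ZeroDivisionError if the denominator sums to zero.
    return numerator / denominator


# Made-up counts: the groups differ in overall ability (the focal group is
# smaller and lower scoring), but within each score level the odds match.
no_dif = [(10, 30, 5, 15), (30, 10, 15, 5)]
print(mantel_haenszel_odds_ratio(no_dif))  # 1.0

# Made-up counts where the reference group does better at every score level.
with_dif = [(20, 20, 5, 15), (30, 10, 10, 10)]
print(mantel_haenszel_odds_ratio(with_dif) > 1)  # True
```

The first data set illustrates the point in the notes that a group difference is bias only if it goes beyond real differences on the trait: once examinees are matched on total score, the item behaves the same way for both groups.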
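
Worked example for Differential prediction (intercept bias and slope bias): a small sketch that fits a separate least-squares regression line for each group and compares the intercepts and slopes. All scores are invented, and the helper `fit_line` is a hypothetical name. A real study would also test whether the differences are statistically significant, which this sketch does not do.

```python
def fit_line(test_scores, criterion):
    """Ordinary least-squares line predicting the criterion from test scores.

    Returns (intercept, slope).
    """
    n = len(test_scores)
    mean_x = sum(test_scores) / n
    mean_y = sum(criterion) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(test_scores, criterion))
    sxx = sum((x - mean_x) ** 2 for x in test_scores)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented data: the same test scores, with a criterion measure per group.
scores = [10, 20, 30, 40]
group_a = [2.0, 3.0, 4.0, 5.0]  # criterion = 1.0 + 0.10 * score
group_b = [1.5, 2.5, 3.5, 4.5]  # same slope, lower intercept
group_c = [1.5, 2.0, 2.5, 3.0]  # same intercept, flatter slope

for name, criterion in [("A", group_a), ("B", group_b), ("C", group_c)]:
    intercept, slope = fit_line(scores, criterion)
    print(name, round(intercept, 2), round(slope, 2))
```

Compared with group A, group B shows intercept bias (the lines are parallel but offset) and group C shows slope bias (the lines have different steepness), so one common regression line would mispredict the criterion for B and C.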