PSY 343 Ch. 6 Notes
These 7 pages of class notes were uploaded by Tatum Notetaker on Thursday, September 29, 2016. They belong to PSY 343, Intro to Psychological Measurement, at DePaul University, taught by Douglas Cellar in Fall 2016.
Date Created: 09/29/16
6 steps in test development
  o Defining the test’s purpose
  o Preliminary design issues
  o Item preparation
  o Item analysis
  o Standardization and ancillary research
  o Preparation of final materials and publication
Defining the Test’s Purpose
  o Statement of purpose: includes delineation of the trait(s) to be measured and the target audience for the test
      Must be clear
      After determining your purpose, see if an appropriate test already exists
Preliminary Design Issues
  o Mode of administration
      Will the test be administered to individuals or in a group setting?
  o Length
      About how long will the test be?
      Not simply a matter of the number of test items and testing time
      Also consider whether the test will be a global measure of the trait (a broad understanding) or will provide a basis for a sensitive diagnostic analysis of the trait
  o Item format
      Will the items be multiple-choice? True-false? Agree-disagree? Constructed-response?
  o Number of scores
      Necessarily related to the question of test length
      How many scores will the test yield?
  o Score reports
      Will there be a simple, handwritten record of the score, or an elaborate set of computer-generated reports?
      What will be reported? A total score for the test only, or also performance on clusters of items?
  o Administrator training
      How much training will be required for test administration and scoring?
  o Background research
      Include discussions with practitioners in the fields in which the test might be used
Origins of New Tests
  o Three principal sources of test development
      Most tests arose from a practical need
      Some tests are developed from a theoretical base
      A large amount of test development work is devoted to revising or adapting existing tests
Item Preparation
  o Includes both item writing and item review
  o A test item has four parts
      A stimulus to which the examinee responds
          Often called an item stem
          May be a question
      A response format or method
          Is it multiple-choice or constructed-response?
      A condition governing how the response is made to the stimulus
          Whether or not there is a time limit for responding
          Whether the test administrator can probe ambiguous responses
          How the response is recorded
      A procedure for scoring the response, or a scoring rubric
          Must be specified and understood when considering a test item
Types of Test Items
  o Selected-response items: there are at least two options for an examinee to choose from
      Also called multiple-choice or forced-choice
      Likert format: scale from Strongly Agree to Strongly Disagree
          Usually a 3-point, 5-point, 7-point, or 9-point scale
      Graphic rating scale: responses are marked along a continuum between two poles and converted to numerical form later
      Semantic differential: an object is rated on a series of scales bracketed by polar opposite adjectives
          Ex. hard-soft, hostile-friendly, etc.
      Scoring selected-response items
          Usually scored as correct/incorrect
          May give partial credit, or extra weight to items that are more important on a test; these more complicated systems seem to yield only slightly more reliable scores
          May assign varying numbers to the different responses for items in personality, interest, and attitude measures
  o Constructed-response items: present a stimulus but do not constrain the examinee to select from a fixed set of responses
      Also known as free-response
      A simple version is the fill-in-the-blank
      Essay test: presents a situation or topic, and the examinee writes a response that might range from a few sentences to several pages
      Performance assessment: the response involves solving a problem, completing an assignment, or producing something
      Scoring
          Requires some judgment
          Two key factors: ensuring inter-rater reliability and conceptualizing a scheme for scoring
          Holistic scoring: the person scoring the essay forms a single, overall judgment about the quality of the essay
          Analytic scoring: the essay is rated on several different dimensions that have been previously specified
          Point system: certain points are to be included in a “perfect” answer; the scorer determines the presence or absence of each point
          Automated scoring: a sophisticated computer program simulates the process of applying human judgment to free-response items
Pros of Selected-Response vs. Constructed-Response
  o Selected-response advantages
      Typically has higher scoring reliability
      Typically is more time efficient
          Reliability generally increases as a function of the number of items
      Typically scoring is more efficient
  o Constructed-response advantages
      Allows easier observation of test-taking behavior and processes
      Allows for exploring unusual areas that might never come to light in a selected-response format
      The type of test item used influences the development of students’ study habits
          Constructed-response items encourage a more holistic and meaningful approach to studying
Suggestions for Writing Selected-Response Items
  o Get the content right
  o Don’t give away the answer
  o Keep it simple and clear
Suggestions for Writing Constructed-Response Items
  o Make sure the task is clear
  o Avoid use of optional items
      Ex. answer 3 of 5 questions
  o Be specific about the scoring system at the time the item is prepared
  o Score anonymously
      The person scoring should not know the identity of the respondent
      Helps avoid the halo effect
  o When there are multiple items, score them one at a time
      Again helps avoid the halo effect
Practical Considerations in Writing Items
  o How many items should be written?
      Depends partly on making good decisions at the preliminary design stage: are you using the appropriate type of item, and did you thoroughly research the area to be tested?
      Depends on doing a reasonable job of informal tryout to make sure the prototypes of intended items will work
      Common rule of thumb is to prepare two or three times as many items as needed for the final test
  o Items should be reviewed for
      Clarity
      Grammatical correctness
      Conformity with the item-writing rules above
      Content correctness
      Possible gender, racial, or ethnic bias
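Worked example (not from the lecture): the notes say that reliability generally increases with the number of items, and that a common rule of thumb is to write two or three times as many items as the final test needs. The sketch below puts rough numbers on both points. It assumes the Spearman-Brown formula, a standard psychometric result that the notes do not name, and which itself assumes the added items are parallel to the existing ones. The function names `spearman_brown` and `item_pool_range` are made up for this illustration.

```python
from typing import Tuple


def spearman_brown(reliability: float, length_factor: float) -> float:
    """Project reliability after changing test length.

    Assumption: Spearman-Brown formula, which treats added items as
    parallel to the existing ones (not stated in the notes).
    reliability   -- reliability of the current test, between 0 and 1
    length_factor -- new length divided by current length (2 = doubled)
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


def item_pool_range(final_items: int) -> Tuple[int, int]:
    """Rule of thumb from the notes: write 2 to 3 times the final item count."""
    if final_items <= 0:
        raise ValueError("final_items must be positive")
    return 2 * final_items, 3 * final_items


if __name__ == "__main__":
    # Hypothetical numbers chosen only for illustration.
    projected = spearman_brown(reliability=0.70, length_factor=2)
    low, high = item_pool_range(final_items=40)
    print(f"Projected reliability after doubling: {projected:.3f}")
    print(f"Items to draft for a 40-item test: {low} to {high}")
```

With these made-up inputs, doubling a test whose reliability is 0.70 projects to roughly 0.82, and a 40-item final test calls for drafting about 80 to 120 items. The projection is only as good as the parallel-items assumption, so treat it as a planning estimate rather than a guarantee.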