ped4470.fall07.week4

 
Lecture on selection and administration of assessment instruments
By: ofurtado
1 years 0 months ago
Presentation Transcript
PED-4470 Measurement & Evaluation in Physical Education : PED-4470 Measurement & Evaluation in Physical Education. Evaluation of Assessment Instruments. Ovande Furtado, Jr., M.S.
Objectives : Objectives Students should be able to: Understand the implications of using different standards of comparison for making evaluative statements about student learning.
1. Setting the stage : 1. Setting the stage Norm-referenced standards: compare student to student. Criterion-referenced standards: compare student to an expectation. Self-referenced standards: student progress is observed, tracked, and compared with prior performance.
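The three standards of comparison can be illustrated with a small sketch. Everything below is hypothetical and not from the lecture: the curl-up scores, the cut score of 24, the prior score, and the function names are assumptions chosen only to show how one raw score reads differently under each standard.

```python
def percentile_rank(score, group_scores):
    """Norm-referenced: percent of the group scoring at or below `score`."""
    at_or_below = sum(1 for s in group_scores if s <= score)
    return 100.0 * at_or_below / len(group_scores)


def meets_criterion(score, cut_score):
    """Criterion-referenced: does the score reach the stated expectation?"""
    return score >= cut_score


def change_from_prior(score, prior_score):
    """Self-referenced: change relative to the student's own earlier score."""
    return score - prior_score


# Hypothetical curl-up counts for a class of ten students.
class_scores = [12, 15, 18, 20, 22, 25, 27, 30, 33, 35]
student_score = 25

print(percentile_rank(student_score, class_scores))      # 60.0
print(meets_criterion(student_score, cut_score=24))      # True
print(change_from_prior(student_score, prior_score=19))  # 6
```

The same score of 25 is middling against classmates, passing against the assumed cut score, and a clear improvement against the student's own history, which is why the choice of standard shapes the evaluative statement.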
1. Setting the stage : 1. Setting the stage What are the implications? For making judgments about what students know and are able to do (NASPE). Not guessing! Make accurate decisions. Make use of assessment. Not all assessments will do the job. The rule "one size fits all" does not apply here.
1. Setting the stage : 1. Setting the stage We have a problem: choosing the most appropriate assessment instrument based on our needs. Two approaches to take when evaluating an instrument: administrative feasibility and psychometric quality.
2. Administrative feasibility : 2. Administrative feasibility Testing population. Purpose. Age and sex appropriateness. Safety.
2. Administrative feasibility : 2. Administrative feasibility Assessment population: school grade, age group, special populations, gender. Using a test with a population it was not designed for is simply not ethical. Why? It will often lead to wrong decisions based on test results.
2. Administrative feasibility : 2. Administrative feasibility Purpose of assessment: the test title is not the same as the test purpose. Examples of purposes: physical fitness, health-related physical fitness, athletic performance.
Psychometric Qualities : Psychometric Qualities Validity, reliability, objectivity, and freedom from assessment bias.
What is Validity? : What is Validity? Veracity of an assessment instrument. Degree to which it assesses the attribute it claims to assess. Allows meaningful inferences to be made. Can an assessment reach 100% validity? Degree to which accumulated evidence supports the inferences to be made from scores.
Validity : Validity Gathering different types of evidence to support the different types of inferences to be made from scores. Three sources of evidence for norm-referenced assessments: content validity evidence, criterion validity evidence, and construct validity evidence.
Content Validity : Content Validity “Degree to which the sample of items, tasks, or questions on a test are representative of some defined universe or domain of content” (AERA, 1985). Ex. 1: Questions not taught in the course. Ex. 2: Motor skill assessment. Ex. 3: Health-related assessment. Established through judgments of content experts.
Criterion-related Validity : Criterion-related Validity Evidence that scores reflect one or more outcome criteria. Two types of criterion-related evidence: predictive evidence (future behaviors, e.g., the SAT) and concurrent evidence (behavior in the present; compare to already valid tests).
Construct Validity : Construct Validity Item analysis studies. If the outcome has real meaning, individuals who possess a lot of the attribute should receive a better score. Example: age. Validity is claimed when the assessment scores tend to agree with the expectations (TGMD).
Decision Validity : Decision Validity Evidence that the instrument correctly classifies masters and non-masters. Coefficient of .80 preferred. How can you tell a test is classifying accurately? Define the cut score (experts). Example: basketball free throw. Compare with an already validated test.
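The comparison with an already validated test can be sketched as a proportion of agreement between two master/non-master classifications. This is only one possible index; the slide does not say which coefficient the .80 guideline refers to. The free-throw data, the cut score of 7, the criterion classifications, and the function names are all hypothetical.

```python
def classify(scores, cut_score):
    """Label each score as master (True) or non-master (False)."""
    return [score >= cut_score for score in scores]


def proportion_agreement(labels_a, labels_b):
    """Share of students classified the same way by both sources."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both classifications must cover the same students")
    same = sum(1 for a, b in zip(labels_a, labels_b) if a == b)
    return same / len(labels_a)


# Hypothetical free throws made out of 10 attempts, one entry per student.
free_throws = [3, 8, 6, 9, 2, 7, 5, 10, 4, 8]
# Hypothetical classifications of the same students from a validated test.
criterion_masters = [False, True, True, True, False,
                     True, False, True, False, True]

test_masters = classify(free_throws, cut_score=7)
print(proportion_agreement(test_masters, criterion_masters))  # 0.9
```

Here the two sources disagree on one student out of ten, so the agreement is .90. Moving the cut score changes the classifications and therefore the agreement, which is why the cut score has to be defined first.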
The need for reliability : The need for reliability It is not enough for a test to assess accurately; it must do so consistently. Reliability is the consistency with which an assessment instrument assesses whatever it assesses, nearly every time it is used. A valid assessment is always reliable, but a reliable assessment is not necessarily valid. Can you reason why this is so? If a test assesses something other than what it claims to assess, but does so consistently, then it is still reliable.
Reliability : Reliability Test-retest, internal consistency, split half, parallel form, inter/intra-rater.
Reliability - Types : Reliability - Types Test-retest: consistency of scores over time. Same individuals taking the test twice. How much time apart? Problem? Time and resources.
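Test-retest consistency is often summarized by correlating the two sets of scores. The slide does not name a calculation for this type, so the Pearson correlation below is an assumption (an ANOVA-based intraclass correlation is another common choice). The scores and the function name `pearson_r` are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length, at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for the same five students on two testing days.
day_1 = [10, 12, 14, 16, 18]
day_2 = [11, 12, 15, 15, 19]

print(round(pearson_r(day_1, day_2), 2))
```

A value near 1 means students kept roughly the same rank order and spacing across the two days. A correlation alone does not detect a uniform shift, such as everyone improving by two points.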
Reliability / Types : Reliability / Types Split half: consistency between performances on the two halves of the test. Problem? Longer tests are more reliable, so each half-length test tends to understate the reliability of the full test. Calculation: ANOVA and Pearson correlation coefficient.
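A minimal sketch of the split-half idea, assuming an odd/even item split and the Spearman-Brown step-up formula, 2r / (1 + r), to estimate full-length reliability from the half-test correlation. The slide mentions only ANOVA and the Pearson coefficient, so the Spearman-Brown step is an addition here, and the item responses and function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_brown(r_half):
    """Estimate full-test reliability from the half-test correlation."""
    return 2 * r_half / (1 + r_half)


def split_half(item_scores):
    """Correlate odd-numbered and even-numbered item totals per student."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_totals, even_totals)
    return r_half, spearman_brown(r_half)


# Hypothetical right (1) / wrong (0) responses: six students, eight items.
responses = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

r_half, r_full = split_half(responses)
print(round(r_half, 2), round(r_full, 2))
```

The stepped-up estimate is higher than the raw half-test correlation whenever that correlation is between 0 and 1, which reflects the point that longer tests tend to be more reliable.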
Reliability - Types : Reliability - Types Parallel forms: degree of consistency in scores on two forms of the test. The forms are matched on items, levels of difficulty, directions, scoring, and interpretation. Calculation: ANOVA.
Reliability / Types : Reliability / Types Inter-rater: consistency of scoring across independent raters. Intra-rater: consistency of scoring for a single rater.
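Both kinds of rater consistency can be illustrated with a simple exact-agreement rate. This is one possible index among several and is not prescribed by the lecture. The rubric ratings (1 to 4) and the function name are hypothetical.

```python
def exact_agreement(ratings_a, ratings_b):
    """Proportion of performances given the identical rating both times."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("both rating lists must cover the same performances")
    matches = sum(1 for a, b in zip(ratings_a, ratings_b) if a == b)
    return matches / len(ratings_a)


# Hypothetical rubric ratings (1-4) for the same eight recorded performances.
rater_1_first_pass = [3, 4, 2, 4, 1, 3, 2, 4]
rater_2_first_pass = [3, 4, 2, 3, 1, 3, 2, 4]
rater_1_second_pass = [3, 4, 2, 4, 1, 2, 3, 4]  # same rater, scored again later

# Inter-rater: two independent raters scoring the same performances.
print(exact_agreement(rater_1_first_pass, rater_2_first_pass))   # 0.875
# Intra-rater: one rater scoring the same performances twice.
print(exact_agreement(rater_1_first_pass, rater_1_second_pass))  # 0.75
```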
Reliability - Interpretation : Reliability - Interpretation Different types of consistency. Cannot set a single standard value. Examples: muscle strength (.95), motor accuracy (.85). Longer tests are more reliable. What to do? Look at what others have used (.80).
Sources of measurement error : Sources of measurement error Lack of agreement among raters (i.e., objectivity); lack of consistent performance by the person; failure of the instrument to measure consistently; failure of the tester to follow standardized procedures.
Freedom from assessment bias : Freedom from assessment bias Ensuring the testing group does not differ from the population from which the test was created
Desired levels of reliability : Desired levels of reliability Multiple-choice achievement tests: .85. Open-ended paper-and-pencil: .65. Portfolio: .40. “Thus, you may tolerate moderate levels of reliability of .70 or higher for any one assessment results as long as several pieces of information are combined for classroom decisions”.
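One way to see why combining several pieces of information helps is the general Spearman-Brown formula, k·r / (1 + (k − 1)·r), which estimates the reliability of a composite of k comparable assessments. This is a sketch under a strong assumption that the lecture does not state: the assessments are treated as roughly parallel, each with reliability r. The function name is hypothetical.

```python
def composite_reliability(r_single, k):
    """Spearman-Brown estimate for a composite of k parallel assessments."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return k * r_single / (1 + (k - 1) * r_single)


# Reliability of a composite built from 1 to 4 assessments at .70 each.
for k in range(1, 5):
    print(k, round(composite_reliability(0.70, k), 3))
```

Under this assumption, three assessments at .70 each give a composite estimate of about .875. Real classroom assessments are rarely parallel, so the figure is illustrative rather than a guarantee.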