Tuesday, June 30, 2020
Problems With New PSAT, Part 2: Score Discrepancies
[Part 2: Score Discrepancies is the second of a three-part report on the new PSAT. See Overview, Part 1: Percentile Inflation, and Part 3: Lowered Benchmark. The entire report can also be downloaded or distributed as a PDF.]

Part 2: Score Discrepancies

A historically narrow gap between sophomore and junior performance does not seem credible and leads to questions about how scoring, scaling, and weighting were performed and reported.

Sophomore Versus Junior Score Discrepancies Call Scoring Methodologies into Question

Percentile inflation caused by redefinition and re-norming creates unfortunate misinterpretations, but the sources of that change can be readily identified: previous percentile tables can be restated based on the new definition, and the difference between Nationally Representative percentiles and User percentiles can be compared to gauge the difference added there. Without further information from College Board, however, it is impossible to know the accuracy of the 11th and 10th grade percentiles. Our analysis shows that there are significant problems in the way the numbers are being presented that mask the very thing the new test was meant to reveal: college readiness and academic progress. If score results between grades are suspect, it leads to questions about the pilot studies that were performed and how they inform the scoring for the PSAT and SAT.

Expected Versus Observed Score Differences Between Grades

Historically, juniors have outperformed sophomores on the PSAT/NMSQT by approximately 5 points per section [see table below]. Translated into SAT scores, the differences between 10th and 11th graders in 2014 were 48 points, 47 points, and 51 points in Critical Reading, Writing, and Math, respectively. On the new PSAT, however, the reported difference is only 12 points in Evidence-Based Reading and Writing (EBRW) and 19 points in Math. The average difference in 2014 is more than 3 times that seen in 2015. The 2014 grade differences were in line with those seen over the last decade, so they were not anomalous. The old and new PSAT are different tests, but student growth tends to show up similarly even on different college admission exams.

Are Low Score Discrepancies Due to Differing Testing Populations?

Not all sophomores and juniors take the PSAT. Some take the PSAT as mandatory testing; some take the PSAT in order to qualify for National Merit; some take the ACT Aspire instead of the PSAT. If College Board's calculation of a nationally representative sample is correct, though, this year's grade differences should be immune from differences in test-taker demographics. Previous PSATs lacked a nationally representative sample, so sophomore-to-junior comparisons may be distorted by test-taker patterns. One way of removing potential distortion is to look at results only for repeat testers: students who took the test in both school years. College Board has done research on the typical score change on the old PSAT by analyzing only students who took the test as sophomores and repeated it as juniors [see table below]. The average increase, expressed in SAT points, was 33 points in Critical Reading, 33 points in Writing, and 40 points in Math. Those figures are still twice what is being shown on PSAT reports as the 10th grade to 11th grade score differential.
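To make the comparisons above concrete, here is a minimal arithmetic sketch that simply restates the figures cited in this section. Note that the old PSAT had three sections and the new PSAT has two, so averaging per section is only a rough way to compare them; none of this is an official College Board calculation.

```python
# 10th-to-11th grade score gaps, expressed in SAT points (figures cited above).
old_psat_2014 = {"Critical Reading": 48, "Writing": 47, "Math": 51}  # three old sections
new_psat_2015 = {"EBRW": 12, "Math": 19}                             # two new sections

old_avg = sum(old_psat_2014.values()) / len(old_psat_2014)  # ~48.7 points per section
new_avg = sum(new_psat_2015.values()) / len(new_psat_2015)  # 15.5 points per section

print(f"2014 average gap per section: {old_avg:.1f}")
print(f"2015 average gap per section: {new_avg:.1f}")
print(f"Ratio (2014 / 2015): {old_avg / new_avg:.1f}x")  # roughly 3x, as stated above

# Repeat-tester growth on the old PSAT (figures cited above), for comparison.
repeat_growth = {"Critical Reading": 33, "Writing": 33, "Math": 40}
repeat_avg = sum(repeat_growth.values()) / len(repeat_growth)  # ~35.3 points per section
print(f"Repeat-tester average growth per section: {repeat_avg:.1f}")
print(f"Ratio (repeat testers / 2015 reported): {repeat_avg / new_avg:.1f}x")  # roughly 2x
```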
Do Content Differences Between Old and New PSATs Provide an Explanation?

A remaining problem is that the old PSAT is not the new PSAT. Although the new and old tests cover roughly the same score range and do not have radically different means or standard deviations, we cannot be certain that year-over-year growth is identical. A third set of data is College Board's own estimates of growth. Below are the College and Career Readiness Benchmarks. College Board assumes that students improve by roughly 30 points from sophomore year PSAT to junior year PSAT and another 20 points from junior year PSAT to SAT. The PSAT figures, which themselves seem conservative, are still twice those shown in the 2015 student data.

Percentile Data for Sophomores and Juniors May Prove the Existence of Errors in Presentation, Computation, or Norming

The low observed score differences between 10th and 11th graders do not fit the historical pattern, do not match studies of repeat testers, and do not align with College Board's own assumed benchmark progress. As improbable as the small point discrepancy is, it would seem impossible to go one step further and claim that sophomores outperform juniors. Yet this is exactly what the published percentile tables show [below]. As you move up the scale, the difference between 10th and 11th graders disappears and then turns in favor of the younger students. Read literally, the score tables say that more sophomores than juniors achieved top scores on the PSAT/NMSQT. There have always been talented sophomores who score highly on the PSAT, but as a group, these students should not do better on the PSAT in 10th grade than they do in 11th. These figures are for the Nationally Representative groups, so they cannot be explained away by saying that the test-taking populations are different. There is no logical statistical or content explanation for how sophomores could actually perform better than juniors. In fact, we should be seeing scores 30-50 points higher per section for juniors. The most likely explanation is that the surveying and weighting methods used for the PSAT did not properly measure the class year compositions. If we assume this to be the case, though, can we be assured that the studies did any better in measuring the intra-class composition? Will the SAT be immune from the same problems?

Can Anything Explain the Low Sophomore/Junior Score Differences and the Score Inversion?

One suspect in the mix is the PSAT 10. Although the content of the PSAT 10 is identical to that of the PSAT/NMSQT, it is positioned as a way for schools to measure how students perform near the end of the sophomore year rather than toward the outset of the year. The PSAT 10 will first be offered between February 22 and March 4, 2016. It is a safe assumption that spring sophomores, adjusted for differences in the testing pool, will score higher than fall sophomores. If College Board statistically accounted for PSAT 10 takers in its figures, the scores for sophomores would be inflated. It seems academically inappropriate to lump PSAT/NMSQT and PSAT 10 scores into the same bucket. The tests are taken at different phases of a student's high school progress. In fact, one reason the PSAT 10 exists is that spring performance differs from fall performance. The only clue that College Board may have made such a combination is reproduced from its Understanding Scores 2015. Highlighting has been added. It is likely that this reference is simply the result of a production error. The document never makes this reference again in its 32 pages.
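For readers who want to see the mechanism behind this concern, here is a small simulated sketch of how folding later-testing (or otherwise stronger-than-representative) sophomores into an October sophomore pool would inflate the sophomore figures and narrow the gap at the top of the scale. Every number here is invented purely for illustration, loosely on a PSAT section scale; none of it comes from College Board, and, as the next paragraph explains, this hypothesis does not ultimately hold up.

```python
import random

random.seed(0)

# Invented, illustrative score distributions (roughly PSAT-section scale).
def october_sophomores(n):
    return [random.gauss(470, 100) for _ in range(n)]

def october_juniors(n):
    return [random.gauss(510, 100) for _ in range(n)]

def spring_sophomores(n):
    # Assume (purely for illustration) ~20 points of growth by spring.
    return [random.gauss(490, 100) for _ in range(n)]

def mean(xs):
    return sum(xs) / len(xs)

def share_above(xs, cut):
    return sum(1 for x in xs if x >= cut) / len(xs)

fall_soph = october_sophomores(50_000)
juniors = october_juniors(50_000)
# A sophomore pool "contaminated" with later-testing students.
mixed_soph = fall_soph + spring_sophomores(25_000)

print(f"Junior mean:               {mean(juniors):.0f}")
print(f"October sophomore mean:    {mean(fall_soph):.0f}")
print(f"Mixed sophomore pool mean: {mean(mixed_soph):.0f}")  # inflated vs. October-only

# Share of each pool at or above a high cut score: the mixed pool narrows the
# junior-sophomore gap at the top of the scale, which is the direction of
# distortion that could contribute to an apparent sophomore advantage.
cut = 650
print(f"Juniors at/above {cut}:            {share_above(juniors, cut):.2%}")
print(f"October sophomores at/above {cut}: {share_above(fall_soph, cut):.2%}")
print(f"Mixed sophomores at/above {cut}:   {share_above(mixed_soph, cut):.2%}")
```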
In short, all figures likely measure October performance for sophomores and juniors, and this final attempt to explain the anomalous supremacy of sophomores comes up short. Even had a PSAT 10 explanation proved successful, it would have raised more questions than it answered.

Tables surrounding the PSAT are all marked as "Preliminary." College Board has made clear that final scaling for the redesigned SAT (and the PSAT is on the same scale) will not be completed until May 2016. Final concordance tables between old and new tests will replace any preliminary work. If the explanation for the statistical anomalies is that the paint is not yet dry, it raises the question of what 3 million students and their educators are to do with the scores they have been presented. The new PSAT reports are the most detailed that have ever existed. They include total scores, section scores, test scores, cross-test scores, sub-scores, Nationally Representative percentiles, User percentiles, SAT score projections, sophomore and junior year benchmarks, and more. Which parts of the reports are reliable, and which parts remain under construction? Should educators simply push these reports aside and wait until next year? Should students make test-taking and college choice decisions based on these scores?

[Continue to Part 3: Lowered Benchmark]