88 Comments
The legacy SAT primarily evaluated students based on their scores and completion time, with the College Board responsible for crafting sets of questions that ensured a fair assessment of students' abilities. This involved analyzing both the questions themselves and the overall exam structure.

However, with the introduction of variable difficulty levels, a third evaluation criterion emerges. While this presents challenges in terms of management complexity, the key question is: how will this information be communicated to schools and students? Moreover, if this information is disclosed, how will the College Board handle situations where the test is perceived as too difficult, leading to failure, or too easy, resulting in success?

Recognizing the need for change, one potential solution could be the implementation of infinite testing, where questions continue to be presented until the end of the allotted time. Results could then be provided in two components: a core test score and an additional test score, offering a more comprehensive evaluation of students' abilities.

Ultimately, we need an assessment tool that allows us to rank students on their core performance.

Wait what? They adapt the difficulty of the questions to how well you’re answering them? So you get softballs if you’re a dolt? How on earth do you “standardize” that?

This again seems like an example of providing the answer to a question no one was asking. What was wrong with the old standardized testing that provided the impetus for changing it?

The announcement of elite schools returning to "SAT-mandatory" just as the College Board also announces the most sweeping changes in decades is more than a little curious.

Easier questions if you answer incorrectly and harder questions if you are doing well? There's equity for you. It's really just punishment for the hard-working and successful.

Indeed there are many cases where computerized "assistance" ends up a total mess, and a few where an ideologically-motivated programmer skews results. But the possibility that an internal Bad Actor cheats isn't likely, and I kinda doubt that the new "improved" system would be more vulnerable to cheating than the old one. Honestly I don't know.

One way to check for monkey business would be to look at results, broken down by demographic of your choice, after the change vs before.

I'd say the company that runs it has a good track record, and has solid incentives to maintain its reputation. That doesn't guarantee anything, but ...

All these changes sound great.

GMAT's been using adaptive testing for years, and using it successfully. It's more efficient, so it needs less testing time to make an evaluation.

None of these changes means the test is easier. We can judge that by comparing new vs. old score distributions.

Let's presume innocence, and let's not be grandpas that shout "Back in my day we had to sharpen PENCILS!"

Does anyone know how this works when one test taker gets hard questions and another gets easier questions, but they get the same number right? Will they be graded differently and show different scores?

1. The RN Licensing exam (NCLEX) has been using computerized "Adaptive Questions" since 1995:

The "Adaptive Questions" system is actually quite good and efficient, and I have both stochastic comfort with it (being an engineer), as well as first-hand experience:

2. After 20 years working as a Chemical Engineer I changed careers in my 40s and obtained a second bachelor's degree in Nursing in 1995 (BSN). That also happened to be the first year that the Nursing Licensing exam in New Jersey (NCLEX) went both digital and used "Adaptive Questions". It did NOT 'lower the bar'. To wit:

3. As I recall, back then there were a total of 255 potential questions of which a minimum of 75 had to be answered. The test would stop as soon as the algorithm was able to determine your knowledge level, with 75 questions being the minimum needed to reach such a determination.

4. All of the top students in my class (myself included) received the minimum 75 questions and passed, implying high consistency in answer quality. In fact all of the students in my class who had to answer fewer than 100 questions also passed. Above 100 questions, some classmates started failing, with the numbers who failed increasing almost linearly with the number of questions required, implying inconsistency in answers (AKA 'guessing').

5. So the "Adaptive Questions" method makes it harder to get lucky by guessing the correct answer, which is easier to do with the 'total number of correct answers' method.

Thank you so much for your comprehensive response.

Just to let you know, a friend was a college placement counselor at a private high school. The school had kept copies of SATs going back to the early 60s. About 15 years ago he told me that several teachers had taken the SATs from the 60s and 70s and compared them to those from the 90s and early 2000s. He told me there was a huge difference in difficulty between those earlier, challenging tests and the newer, less rigorous tests.

So, if your questions are dumbed down but you get them all right, are you going to end up with the same score as someone whose questions got harder and who got them all right? If so, this test is complete trash.

That sounds like a pretty good system. I hope they are using the same sort of criteria here.

Unless I'm mistaken, licensing exams are pass/fail. Would such a system be applicable to the SAT, especially for selective schools, where under this new scheme a 1450 achieved by one student may not necessarily equal a 1450 achieved by another student? I'm assuming that two students could have the same score but one score has been adapted because they tanked some early questions. I'll be curious to see how this works in practice.

Same. Because if it's what I'm thinking, I am a big fat hell no on this.

Mar 25·edited Mar 25

"Adaptive questions."

So. Two test-takers. One got the easy questions and the other got the harder questions. Both ended up with the same score. Without some indication as to the level of difficulty that each test-taker faced, how is there any meaning to attach to their on-its-face equal scores?

This smacks of "equity," meaning equality of outcome. Amazing how our institutions are so dedicated to the DEI framework.

That's NOT the way it works:

1. The RN Licensing exam (NCLEX) has been using computerized "Adaptive Questions" since 1995:

The "Adaptive Questions" system is actually quite good and efficient, and I have both stochastic comfort with it (being an engineer), as well as first-hand experience:

2. After 20 years working as a Chemical Engineer I changed careers in my 40s and obtained a second bachelor's degree in Nursing in 1995 (BSN). That also happened to be the first year that the Nursing Licensing exam in New Jersey (NCLEX) went both digital and used "Adaptive Questions". It did NOT 'lower the bar'. To wit:

3. As I recall, back then there were a total of 255 potential questions of which a minimum of 75 had to be answered.

The test would stop as soon as the algorithm was able to determine your knowledge level, with 75 questions being the minimum needed to reach such a determination. If you answered a question correctly the next question would be harder. If you answered incorrectly the next question would be easier. This pattern would repeat until you were consistently getting 50% of the questions correct and 50% wrong. The level of difficulty of the questions at which the 50-50 point was reached determined whether you passed or not. I don't know what your math background is, but I have a degree in Chemical Engineering so I'm quite at ease with Probabilities & Statistics.

4. All of the top students in my class (myself included) received the minimum 75 questions and passed, implying high consistency in answer quality. In fact all of the students in my class who had to answer fewer than 100 questions also passed. Above 100 questions, some classmates started failing, with the numbers who failed increasing almost linearly with the number of questions required, implying inconsistency in answers (AKA 'guessing').

5. So the "Adaptive Questions" method makes it harder to get lucky by guessing the correct answer, which is easier to do with the 'total number of correct answers' method.
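
The harder-after-correct, easier-after-wrong procedure described in point 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the NCLEX's actual algorithm: the logistic response model in `p_correct`, the 1 to 100 difficulty range, the fixed 75-question length, the pass mark `PASS_LEVEL`, and all function names are hypothetical choices made for the example.

```python
import math
import random

PASS_LEVEL = 50  # hypothetical pass mark on a 1-100 difficulty scale


def p_correct(ability: float, difficulty: float, scale: float = 5.0) -> float:
    """Assumed response model: the chance of a correct answer is exactly
    0.5 when the question's difficulty equals the taker's ability."""
    return 1.0 / (1.0 + math.exp((difficulty - ability) / scale))


def estimate_level(ability: float, n_questions: int = 75, start: int = 50,
                   lo: int = 1, hi: int = 100, tail: int = 25,
                   seed: int = 0) -> float:
    """Run the staircase: one level harder after a correct answer, one level
    easier after a wrong one. Returns the average difficulty of the last
    `tail` questions, which approximates the 50-50 point."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    level = start
    asked = []
    for _ in range(n_questions):
        asked.append(level)
        correct = rng.random() < p_correct(ability, level)
        level = min(hi, level + 1) if correct else max(lo, level - 1)
    recent = asked[-tail:]
    return sum(recent) / len(recent)


def passes(estimated_level: float) -> bool:
    """Pass/fail depends on where the staircase settled, not on a raw count."""
    return estimated_level >= PASS_LEVEL


if __name__ == "__main__":
    for ability in (30, 70):
        level = estimate_level(ability, seed=1)
        print(ability, round(level, 1), passes(level))
```

In this toy model both test takers end up answering roughly half of their final questions correctly, so the count of right answers does not separate them; the difficulty level where each one settles does.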

TY for the history lesson. I wonder if the 'adaptive question' scheme being applied to today's SAT exams is the same as was used nearly 30 years ago for an RN licensing exam. If anyone here has inside knowledge of this new change to the SAT format, input would be appreciated.

Excellent question, and the answer is nobody really knows because the SAT officials have not been transparent about exactly how the new system is being implemented.

Mar 25·edited Mar 27

The biggest fix required for academic integrity - offer the test as a choice for "timed" or "extra time". And report that choice to the colleges.

The most obvious form of academic dishonesty concerning standardized tests is unreported extra time. And the colleges are equally guilty. Colleges are incentivized not to distinguish between tests taken with standard time vs 50% extra time. These inflated scores will then improve the overall scores that colleges report as their testing averages. Currently, the admissions office receives no information regarding test time as part of the college application, in order to protect the applicants' medical privacy. When the number of untimed test takers is so high, there is hardly a stigma. And therefore hardly a reason to protect the medical privacy.

Extra time is offered to students who are diagnosed with anxiety, ADHD, or other disabilities. Unsurprisingly, the wealthier the student, the more access to a doctor who will provide that diagnosis. Some reasons for extra time have merit: my daughter's intelligent classmate with cerebral palsy certainly needed the extra time. But for the most part, these reasons for extra time are acquired through subjective testing from psychiatrists who are also incentivized to positively present a disability diagnosis. There is no bloodwork or CT scan or any definitive medical test to diagnose so many disabilities, and so it's a system that can easily be manipulated. Extra time is the cheater's loophole.

My daughter attends a high school with students along a wide socio-economic range. The wealthier the student, the more likely that their tests are arranged with extra time (school tests, standardized tests and even AP tests). Exactly half the students in her AP Calculus class receive extra time.

An alarming number of honors students (I'd love the statistic) take standardized tests with extra time at the public and private high schools nearby. At my daughter's school, more than one classmate is a National Merit Finalist who receives the extra-time advantage. These classmates are nationally recognized for academic achievements as elite students. In the classroom, though, it's a different story. One of her classmates relies on her testing advantage to complete tests after taking a break, sometimes finishing the following day. How is that even a test?? Needless to say, this student studies far less than others who must complete their work before the bell rings.

My daughter scored in the mid-1400s on the SAT, running out of time and guessing on the last problems. She took a few practice tests to prepare, and on the untimed tests, she scored in the mid-1500s. If we were cheaters and her classmates played it fair, my daughter could have been the National Merit Finalist instead.

That's a great score. My son scored 1380 doing it without the extra time. He also said he ran out of time and just filled in the circles on some questions. That is part of the test design. Processing speed is an important aspect. Never heard of kids going home and coming back the next day to finish, but the whole extra time scam throws off the whole test. Totally unfair. Despite that, I am sure your daughter will do fine as she is prepared to do college work and hopefully she found a great place for her to go to school.

I would love an exposé on accommodations. Rich kids are never average or dumb; they just have different "learning styles" or "test anxiety."

Their parents are the academic equivalent of air travelers who take their emotional support peacocks on the plane.

THANK YOU for sharing that!!! I can't believe the timing of that article - especially after writing my response.

There are only two solutions for this cheating.

1. Report all scores as timed or extra time.

2. Offer the test a. timed or b. extra timed to everyone AND report all scores as such

I'll be very unpopular writing this. I don't believe that ADHD is real. I don't believe that having anxiety should permit an advantage taking the test. And most of all, I hate the idea of kids being medicated for these so-called disabilities. Aside from dyslexia, it's all BS.

The worst is that these kids who get a big boost of 100-200 points are getting into schools and taking a spot from other students who are more deserving. I've been watching it happen in real time to my daughter. She's watching her weaker classmates (with extra time) succeed at this admissions game: they have been accepted to schools while my daughter has been waitlisted. Just terrible.

The schools are definitely playing a game too. As test-optional schools, they need to make sure their admitted students who submit scores will fit in the ranges that they want to publish (e.g., 1520 at the 75th percentile to 1450 at the 25th percentile). They make it clear not to submit scores if you don't fit in this range. Given those criteria, only half of applicants now submit test scores at many highly competitive schools.

I absolutely agree with you. Anyone who truly needs extra time struggles in ways that probably preclude going to the most competitive colleges.

Given that these cheating parents will resist any efforts to report who gets extra time, one potential "fix" is to give all students the maximum time permitted for those with accommodations.

Your daughter has something none of her peers has - integrity. They may be winning the admissions battle, but they are losing the life skills war. Their parents have basically told them that there is something wrong with them when there isn't or that they don't feel confident that they can succeed on their own merits. Eventually, most of them will reach a point when mommy or daddy can't run interference for them anymore and it won't be pretty.

Wishing you and your daughter the best of luck.

For context, I'm a 76-year-old Boomer who took the SAT in 1965 and went on to earn a degree in Chemical Engineering, followed by an MBA and then, much later in mid-career, by a BSN.

1. The SAT is the best, and fairest, predictor of college performance because it is OBJECTIVE, and it should be made even more objective by eliminating any 'essay' component from the Verbal section: Verbal REASONING skills (which are largely innate) need to be evaluated, not WRITING skills (which can largely be taught). So eliminate any essays entirely.

2. The "Adaptive Questions" system is actually quite good and efficient, and I have both stochastic comfort (being an engineer), as well as first-hand experience with it:

After 20 years working as a Chemical Engineer I changed careers in my 40s and obtained a second bachelor's degree in Nursing in 1995 (BSN). That also happened to be the first year that the Nursing Licensing exam in New Jersey (NCLEX) went both digital and used "Adaptive Questions". It did NOT 'lower the bar'. To wit:

3. As I recall, back then there were a total of 255 potential questions of which a minimum of 75 had to be answered. The test would stop as soon as the algorithm was able to determine your knowledge level, with 75 questions being the minimum needed to reach such a determination.

4. All of the top students in my class (myself included) received the minimum 75 questions and passed, implying high consistency in answer quality. In fact all of the students in my class who had to answer fewer than 100 questions also passed. Above 100 questions, some classmates started failing, with the numbers who failed increasing almost linearly with the number of questions required, implying inconsistency in answers (AKA 'guessing').

5. So the "Adaptive Questions" method makes it harder to get lucky by guessing the correct answer, which is easier to do with the 'total number of correct answers' method.

What you describe is great as a pass-fail test. But with the black box of adaptive scoring, it isn't clear how it would produce 400-800 scores.

I agree that they need to be transparent with how the adaptive method generates a specific numerical score, but it can be done. For example:

With the NCLEX test that I took back in 1995, if you answered a question correctly the next question would be harder. If you answered incorrectly the next question would be easier. This pattern would repeat until you were consistently getting 50% of the questions correct and 50% wrong. The level of difficulty of the questions at which the 50-50 point was reached determined whether you passed or not. So to generate a specific numerical SAT score one might use the level of difficulty at which the 50-50 point was reached. With the present setup you would have to have 800 levels of difficulty, which is a bit too fine a gradation (not impossible, but not easy). However, you could certainly do it with 100 levels of difficulty, and then the SAT scores would be 1 to 100.
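
Reporting the difficulty level at the 50-50 point as the score amounts to a simple rescaling. Here is a minimal sketch, assuming 100 difficulty levels as suggested above; the function name `scaled_score`, the 200 to 800 reporting range, and the 10-point step are hypothetical choices for illustration, not anything the College Board has published.

```python
def scaled_score(level: int, n_levels: int = 100, score_min: int = 200,
                 score_max: int = 800, step: int = 10) -> int:
    """Map the difficulty level where a test taker settles (1..n_levels)
    linearly onto a reporting range, rounded to the nearest `step`."""
    if not 1 <= level <= n_levels:
        raise ValueError("level must be between 1 and n_levels")
    fraction = (level - 1) / (n_levels - 1)  # 0.0 at the easiest level, 1.0 at the hardest
    raw = score_min + fraction * (score_max - score_min)
    return int(round(raw / step) * step)


if __name__ == "__main__":
    for level in (1, 25, 50, 75, 100):
        print(level, scaled_score(level))
```

With these example parameters, 100 levels collapse onto 61 reportable scores, so some neighboring levels share a score; a finer reporting step or more levels would change that trade-off.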

I’ve been in the assessment industry for 20+ years and have seen the benefits and drawbacks of incorporating technology into testing. I can’t speak to the specific changes to the SAT, but I can say that adaptive testing has been highly successful in accurately assessing abilities in less time. As someone else mentioned the GRE has been adaptive for over 20 years with strong validity evidence. We use adaptive testing in employment settings as well. The best assessments are highly predictive and create a positive tester experience. It’s good to see the SAT catching up in this regard.

I agree. Mine was the first Nursing Class in New Jersey to take the adaptive Licensing Exam (NCLEX) way back in 1995. And it was my second career: Prior to that I was a Chemical Engineer for 20 years and thus very comfortable with stochastic approaches.

I sympathize with those appalled by elimination of the SAT but I’m surprised by those whinging about the new digital format. Psychometrically speaking it should provide a more accurate measurement. I guess the opposition relates to those who simply dislike change, along with those who distrust authority.

Any computer program has the issue of garbage in, garbage out. AI has already made it plain that the ideology of the programmers has a huge effect on the end product.

I can't blame people for questioning what is being put INTO the programming of the digital SAT. Seems like there was a scandal a few months back where POCs were given a code word to use in their applications/resumes to ensure that they wouldn't be screened out. Is there any way to guarantee that there are no "cheat codes" in the digital SAT?

The College Board, which makes the SAT, really fooled conservatives who believe in merit for college admissions. Now, they will continue to rake in billions in taxpayer funding with a new test that has little to do with merit, while indoctrinating kids with leftist ideology via their taxpayer-funded AP curriculum that's taught in most schools around the country.

Big Academia, including the College Board, has lost the faith of a growing number of Americans through its deceptive and insidious radical politicization of education. At every step they have obfuscated their actions with benign-sounding terminology and nomenclature that disguised the reality of their objectives. Consequently, it is reasonable to view the new SAT with suspicion and skepticism. Is it purely an improved assessment tool or the vehicle for an unstated agenda? Unfortunately, our ability to assess its true impact is handicapped out of the gate. What academic psychometrician wants his name on a paper that will evaluate this new test if the results of that assessment are politically incorrect?

Bring back the old SAT with no essay section. 1800 is perfect and it works. If you do not do well, you know what extra classes should be taken before paying college expenses.

It’s good for all students to know if they are prepared for college and where their weaknesses lie. It saves parents and students money!

Every time the teacher union changes things, it’s always bad. We need phonics and rote math. Everyone can learn this way. Also, stop catering to the stupid: bring students up to and beyond expectations. I taught, and if students believe, they can do amazing things.

I took the SAT way back in 1965 and I completely agree with eliminating the essay section: Verbal REASONING skills (which are largely innate) need to be evaluated, not WRITING skills (which can largely be taught). So eliminate any essays entirely. I don't believe there was an essay section back then, but as it's been 59 years since I took the SAT, my memory could be off.
