Commonalities/Differences across tests

Posted by: hamburgerman

Commonalities/Differences across tests - 02/14/22 01:42 PM

Just after turning 5, my son took the WPPSI-IV and then the SAGES-3 Non-Verbal Reasoning within a few weeks of one another. The WPPSI came first, as part of an evaluation for a medical condition typically associated with cognitive difficulties; the SAGES was for entry into our district's G/T pre-program.

His WPPSI scores were:


I was able to observe this test, unfortunately taken in the afternoon after a full day of school, and saw the loss of focus over the 2 hours, with the zoo locations among the last, and he was mostly just being silly and uncooperative.

Based on the FSIQ, I had thought the SAGES-3 95% cut-off would be met with certainty, so I was surprised when that came back with a 91%.

I have tried googling to understand what the SAGES-3 questions are like in comparison to the WPPSI-IV and can't find any good examples.

I am pretty new to all this, and if not for the WPPSI from this other program I would have just moved on. I am concerned because he misbehaves when bored, and that is not a road I want to go down. My slightly older son scored a 99% on the SAGES-3 for school the year prior, and I consider them evenly matched albeit with some different strengths.

So among my many questions are:

1) Am I wrong in my expectations based on the WPPSI score? Where does this FSIQ fall in percentile terms?
2) How do we end up with an FSIQ of 138 when only the Visual Spatial Composite is that high?
3) Does one of the WPPSI subtests mimic the SAGES-3 NVR?
4) If these results are "inconsistent," is one more valid than the other? Is this type of inconsistency common?

Thanks for the insight. It's been tough to gain info: the school administration has a very laissez-faire attitude, and the developmental pediatrician was satisfied with his result in the context of the medical predisposition. (And I didn’t have the SAGES result to ask about yet.)
Posted by: aeh

Re: Commonalities/Differences across tests - 02/16/22 06:41 PM


Some context: different tests are not expected to generate identical scores, both because they are different tests with different specific items on them, and because of the concept of regression to the mean, which finds that if the first score is far off the mean, the second measure is likely to be closer to the population mean (in somewhat oversimplified terms).
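
To make the regression-to-the-mean point concrete, here is a minimal sketch in Python. It assumes both tests report scores on a scale with mean 100 and standard deviation 15, and it uses a correlation of 0.7 between the two tests purely as an illustrative guess; that figure is not a published value for the WPPSI-IV and SAGES-3. The function name is hypothetical.

```python
def expected_second_score(first_score, correlation, mean=100.0):
    """Predicted score on a second test, given the score on a first test.

    Simple linear regression model: the predicted distance from the mean
    shrinks by the correlation between the two measures.
    Assumes both tests share the same mean and standard deviation.
    """
    return mean + correlation * (first_score - mean)


# Illustrative only: 0.7 is an assumed correlation, not a published one.
predicted = expected_second_score(138, correlation=0.7)
print(round(predicted, 1))
```

With those assumed numbers, a first score of 138 predicts a second score of about 127: still well above average, but noticeably closer to 100. A lower correlation would pull the prediction in further.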

Also, score stability in very young children, such as your DC, is not particularly high--not nearly as high as a few years later, closer to eight or nine years old. This is for many reasons, including the wide range of normal variation in when young children make certain developmental leaps, and the challenges (which you witnessed) of obtaining optimal performance from them, for perfectly normal developmental reasons.

To your specific questions:
1. Your expectation from the WPPSI is not exactly incorrect. 91 is not really that far away from 99, if you include confidence intervals. This is where regression to the mean and differences in tests come in.

2. The FSIQ is not derived from a simple average of the subtest scores. It is much more unusual to have multiple areas quite high than to have only one. Also, not all of the subtests are included in the FSIQ. Only the two VC, two FR, and one each from VS, WM and PS are included. As it happens, the two scores not used are his two lowest subtests (including ZL, which you questioned the validity of).

3. The SAGES-3 NVR is probably most like the WPPSI-IV FRI--but not exactly.

4. Neither is more valid, when used as they are designed to be. Children test differently on different occasions, and with different measures, even when the tests are intended to assess the same underlying constructs of cognitive ability.
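
For the percentile part of question 1, the arithmetic can be sketched as follows, assuming scores follow a normal distribution with mean 100 and standard deviation 15 (the usual Wechsler convention). Published norm tables can differ slightly from this idealized curve, and treating the SAGES-3 percentiles as points on the same scale is an assumption made only for comparison. The helper names are hypothetical.

```python
from statistics import NormalDist

# Idealized standard-score scale: mean 100, SD 15 (an assumption;
# real norm tables are built from standardization samples).
IQ_SCALE = NormalDist(mu=100, sigma=15)


def percentile_of(standard_score):
    """Percentile rank (0-100) of a standard score on the idealized scale."""
    return 100 * IQ_SCALE.cdf(standard_score)


def standard_score_at(percentile):
    """Standard score corresponding to a percentile rank (0-100, exclusive)."""
    return IQ_SCALE.inv_cdf(percentile / 100)


print(round(percentile_of(138), 1))      # FSIQ of 138 as a percentile
print(round(standard_score_at(95), 1))   # score at a 95th percentile cut-off
print(round(standard_score_at(91), 1))   # score at the 91st percentile
```

Under these assumptions, an FSIQ of 138 lands above the 99th percentile, a 95th percentile cut-off corresponds to a standard score of roughly 125, and the 91st percentile to roughly 120.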

SAGES is also more of a screener than a comprehensive measure. If I were going to pick one to lean on more heavily as a straight-up cognitive measure, it would be the WPPSI, as it's designed as a comprehensive measure and also had almost twice as many individuals in its standardization sample. That doesn't mean the SAGES-3 result is not real as far as it goes, for its purpose (in the context of the original complete measure) of identifying appropriate placements in certain types of programming for advanced learners. That original context consisted of two cognitive measures (verbal and nonverbal reasoning) and two academic measures (language/social studies and math/science). I should also note that your DC was comfortably in the middle of the age range on the WPPSI, but at the very bottom of the age range on SAGES, which may or may not have affected the SAGES score more than the WPPSI scores.
Posted by: hamburgerman

Re: Commonalities/Differences across tests - 03/04/22 06:50 AM

Thanks for your response. It's been insightful going through your other posts as well.

I have noticed in googling that the SAGES-3 is offered as a 'K-3' version and then a version for older children. Does that mean kindergarten students answer the same NVR questions as the third graders? My understanding is the test was 20 minutes for my son, which I interpret as meaning it couldn't have had very many questions to provide differentiation across such a wide age range. The older grades seem to have longer testing windows. Does that mean it’s just the same questions, but more of them?

Posted by: aeh

Re: Commonalities/Differences across tests - 03/10/22 04:23 PM

I'm glad they were helpful to you!

In answer to your question: not necessarily. Most tests of this nature have basal and ceiling rules, often with start points based on age, so younger students typically will see easier items than older students, even if they are working off of the same form. For example (and these numbers are not taken from this test in particular), say a subtest has 30 questions. They would be arranged in order of difficulty, with the easiest questions at the beginning, and increasing progressively in difficulty until the last item. The kindy students might start on item 1 and continue until they get three in a row wrong, at which point the discontinue rule would be triggered and the examiner would stop testing. Grade 3 students might start on item 8 and work until they triggered the discontinue rule. So the 20 minutes might not cover the same items for any two students.

If there is a single start point, but a ceiling/discontinue rule of some kind, then one would expect the older students to take longer, since they would probably go further before getting that many wrong in a row.
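
The start-point and discontinue logic described above can be sketched in Python as below. As with the example numbers, nothing here is taken from the SAGES-3 itself: the 30 items, the start points of 1 and 8, the three-in-a-row rule, and the function name are all illustrative assumptions. The sketch also ignores how real tests credit items below the start point.

```python
def administer(responses, start_item=1, discontinue_after=3):
    """Simulate one sitting of a subtest with a discontinue rule.

    responses: list of booleans, ordered easiest to hardest;
               responses[0] is item 1, True means a correct answer.
    Returns (first_item, last_item, correct_count) for the items
    actually administered.
    """
    wrong_in_a_row = 0
    correct_count = 0
    last_item = start_item - 1  # nothing administered yet
    for item in range(start_item, len(responses) + 1):
        last_item = item
        if responses[item - 1]:
            correct_count += 1
            wrong_in_a_row = 0
        else:
            wrong_in_a_row += 1
            if wrong_in_a_row == discontinue_after:
                break  # discontinue rule triggered; examiner stops
    return start_item, last_item, correct_count


# Hypothetical children on a 30-item subtest:
younger = [True] * 6 + [False] * 24   # can answer items 1-6
older = [True] * 15 + [False] * 15    # can answer items 1-15

print(administer(younger, start_item=1))  # age-based start at item 1
print(administer(older, start_item=8))    # age-based start at item 8
print(administer(older, start_item=1))    # single start point for everyone
```

In this toy setup the younger child sees items 1 through 9, while the older child sees items 8 through 18 with an age-based start, or items 1 through 18 with a single start point, which is why the older child's session would tend to run longer in that case.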

When the raw scores are converted to percentiles/scaled scores, each student is compared to their own age group, which compensates for the difference between what a typical older student and younger student might have learned.
Posted by: Eagle Mum

Re: Commonalities/Differences across tests - 03/11/22 04:41 PM

Originally Posted By: hamburgerman

I was able to observe this test, unfortunately taken in the afternoon after a full day of school, and saw the loss of focus over the 2 hours, with the zoo locations among the last, and he was mostly just being silly and uncooperative.

A tangential note for casual readers: since IQ testing is quite costly, it is worth considering how fatigue can affect concentration when arranging an appointment.

My eldest’s first test was administered when she was 3.5 yrs old, to determine her eligibility for early entry to school. It was a morning appointment, but the assessor administered a ‘school readiness test’ before the IQ test and nearly three hours later, DD had had enough and tried to hold the assessor to a ‘promise’ that testing could be completed at a second session. The assessor, insisting that they were already so close to finishing, pushed on and DD became passively uncooperative, not answering questions which I knew she could. In the IQ report, she was in the 98th & 99th percentiles for all sections except the last in which she was in the 63rd percentile. Her reported FSIQ was still very comfortably within the range for early entry (so we’d achieved our main objective), but well below what was reported when she was retested at age 9.
Posted by: hamburgerman

Re: Commonalities/Differences across tests - 04/12/22 04:02 PM

Happy to report that my younger son's second sitting (now at 5.5yo) of the SAGES-3, for 1st grade entry rather than Kindergarten, came back at 99%.
Posted by: aeh

Re: Commonalities/Differences across tests - 04/27/22 05:28 PM

Great to hear that! Sometimes it's just a matter of being in the right frame of mind and developmental state for testing, and that extra half year of maturity might have been all he needed.