Culturally Responsive Assessment Archive 2015-16
James Hiramoto, Ph.D.
James earned his MA and PhD in Educational Psychology from UC Berkeley's School Psychology Program. He has 17 years of experience as a school psychologist. He advises and provides training for superintendents, school administrators, teachers, and special education staff. He has over 8 years of experience as a university professor and director, training school psychologists at the master's and doctoral levels. His areas of expertise align with the subjects he teaches and presents at state and international conferences. These areas include: cognitive ability, neuropsychological, alternative, and culturally responsive assessment; crisis planning, management, and intervention; educational research methodology and statistics; program evaluation; consultation; and special education law.
Click a topic below to expand the full question and answer.
Standard Scores, Age and Grade Equivalents
I have a question about the difference between Standard Scores and Grade Equivalents and Age Equivalents on standardized achievement tests. Why can these be so divergent from the Standard Score? For instance, a student who is eleven and a half can have a Standard Score of 98, but an age equivalent of 9 years 4 months and a grade equivalent of 3.9. That's about two years behind, but he is right in the middle of the average range! How is this possible?
I am not sure who thought age and grade equivalents were a good idea, but they often create more confusion than any other reported test information. Here is what you need to know: age and grade equivalents do not provide the same information that Standard Scores do.
A "standard score," given a student's age, usually has a mean of 100 and a standard deviation of 15. Nearly all standardized tests use this scoring convention (though a few use T-scores), and at the subtest level they often use a "scaled score," where the mean is 10 and the standard deviation is 3. This makes them easily interpretable between and across different types of tests; hence the term standard. They all have that famous bell-curve distribution, where 100 is always the 50th percentile, an 85 is always the 16th percentile, and so on.
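The relationship between standard scores and percentiles falls directly out of the normal distribution. Here is a quick sketch in Python, using the standard library's NormalDist, showing how the conversion works on the mean-100/SD-15 scale (the scores chosen are the examples from this answer):

```python
from statistics import NormalDist

# Standard-score scale: mean 100, SD 15 (subtest scaled scores use mean 10, SD 3)
score_scale = NormalDist(mu=100, sigma=15)

def percentile(standard_score):
    """Percentile rank of a standard score on the mean-100/SD-15 scale."""
    return round(score_scale.cdf(standard_score) * 100)

print(percentile(100))  # 50 -- the mean is always the 50th percentile
print(percentile(85))   # 16 -- one SD below the mean is always the 16th
print(percentile(98))   # 45 -- still squarely in the average range
```

The same function with NormalDist(mu=10, sigma=3) gives the identical percentiles for scaled scores, which is exactly why these conventions make scores comparable across tests.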
However, age and grade equivalents are based on raw scores: the point at which 50% of students, age-wise or grade-wise, are able to solve that many problems. For example, on a given subtest, the point at which 50% of students can get 14 items correct corresponds to an age equivalent of 9 years 4 months and a grade equivalent of 3.9; the point at which 50% of students can get 15 items correct corresponds to an age equivalent of 11 years 7 months and a grade equivalent of 5.7. It is a limitation of the number of test items. Yes, sometimes one point can make that much of a difference in age and grade equivalent scores, but not so much in Standard Scores. So you can be 11.6 years old with a Standard Score of 98, because the normal distribution of students your age determines your score (think bell curve), not the specific location, with respect to age and grade, at which 50% of students get 14 items correct.
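To make the one-point jump concrete, here is a hypothetical lookup table in Python mirroring the example above. The raw scores and equivalents are illustrative only, not from any actual test manual:

```python
# Hypothetical norms table for a single subtest:
# raw score -> (age equivalent "years-months", grade equivalent)
equivalents = {
    14: ("9-4", 3.9),   # age/grade at which 50% of students get 14 items right
    15: ("11-7", 5.7),  # one more correct item jumps the equivalents ~2 years
}

for raw, (age_eq, grade_eq) in equivalents.items():
    print(f"raw score {raw}: age equivalent {age_eq}, grade equivalent {grade_eq}")

# One raw-score point moves the grade equivalent by 1.8 grade levels,
# while the corresponding standard score barely moves at all.
jump = round(equivalents[15][1] - equivalents[14][1], 1)
print(jump)  # 1.8
```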
This is a good thing. If we had tests that could discriminate at that granular a level for age or grade equivalent, it would take days to administer and exhaust the student, which in itself could invalidate the score(s). The reality is even if we did that amount of testing, there is little information that it would provide other than what can already be obtained from looking at a student’s classroom performance.
So while age and grade equivalents may seem like good information to provide, make sure that you explain to parents, teachers, and students what they really mean. I often have to correct parents' misconception that their child is ready for high school because their 9-year-old has a grade equivalent of 9.0 on a few subtests. While it means their child is doing very well for their age, it also means that 50 percent of beginning ninth graders (9.0) are doing better than their child. How far is the drop if they got one item less correct... a grade equivalent of 7.6, maybe? What if the one or two items they got correct were blind guesses? This is why actual classroom performance is critical, as it is a far better indicator of what a student is capable of in real-world situations.
I hope that helps and makes you less “Perplexed”.
Our district is considering the purchase of the Woodcock Johnson IV ECAD for standardized testing with young children (as opposed to using components of the WJ-IV battery). We are trialing it right now at some elementary school sites. Do you or any colleagues have any experience with this test and/or have any input regarding the usefulness of this assessment?
The DCN cannot endorse a test kit one way or the other. However, I can describe why a test like the ECAD would be beneficial. A test designed specifically for a younger age range will have a lower, more discriminating floor for that age group (in the case of the ECAD, that range is 2 years 6 months to 7 years 11 months, and it can be used with those with cognitive delays up to 9 years 11 months). A specific feature of the ECAD is that the subtests focus on the CHC factors associated more with early academic success (those processing areas that are more closely tied to language and reading development) than the standard WJ-IV battery can provide. Even though the standard Woodcock Johnson IV cognitive battery goes down to age 2, the number of items that discriminate functioning skills is sparse, which can lead to over- and underestimations of ability. When a test has more items to discriminate more subtle differences, you as the assessor are better able to uncover whether a pattern of cultural/environmental/language exposure is impacting performance.
I hope this helps.
Regarding Larry P.
I have a student from Trinidad whom I need to assess, whose primary language is English. Do the same issues apply regarding IQ tests and African American students per the Larry P. case?
Regarding this particular case, it really comes down to how the family marked his school registration card. If the family identified themselves as African American, Larry P. applies. If it was left blank, the clerk of the school makes the determination according to federal regulations. Regardless of which assessment tool you use, there may be cultural/linguistic differences even with English as the primary language. Therefore, going into the assessment, making note of irregular responses and talking with the parents to find out whether these are culturally acceptable will be important.
California Code of Regulations Section 3030 (b) (10) Specific Learning Disability Eligibility
During your training you showed us where, in the Federal and California Code of Regulations, "reading fluency skills" was one of the academic areas that could be significantly low and could qualify a student under Specific Learning Disability (SLD). Our district's new eligibility forms do not have this academic area listed. I've asked why it was left off, but no one seems to have an answer. Do you have any idea why this might be?
Inquiring minds want to know,
The problem has to do with cutting and pasting.
In the updated California Code of Regulations (CCRs), in 2014, California chose to adopt (copy word for word) the language found in the latest version of the Code of Federal Regulations (CFRs), which were reauthorized and updated in 2004. However, California also decided to keep the discrepancy model, and this is where the problem lies.
First, this latest version of the CFRs eliminated the discrepancy model. The only mention of a discrepancy model can be found in Sec. 300.307: "(a)(1) Must not require the use of a severe discrepancy between intellectual ability and achievement for determining whether a child has a specific learning disability, as defined in Sec. 300.8(c)(10)."
Because the latest CFRs define no discrepancy model for a specific learning disability, California kept its old discrepancy-model language and combined it with the latest CFR language; it is the latest CFR language that includes "reading fluency skills." See below:
3030 (b) (10) (B) "In determining whether a pupil has a specific learning disability, the public agency may consider whether a pupil has a severe discrepancy between intellectual ability and achievement in oral expression, listening comprehension, written expression, basic reading skill, reading comprehension, mathematical calculation, or mathematical reasoning. The decision as to whether or not a severe discrepancy exists shall take into account all relevant material which is available on the pupil. No single score or product of scores, test or procedure shall be used as the sole criterion for the decisions of the IEP team as to the pupil's eligibility for special education. In determining the existence of a severe discrepancy, the IEP team shall use the following procedures:"
As you can see, "reading fluency skills" is not mentioned in this section on the discrepancy model, because the discrepancy model refers back to the old CFRs, which had only seven academic areas. It is the latest CFRs (2004) that include the eighth academic area, reading fluency skills. The new California CCRs include reading fluency skills, under a separate subsection copied directly from the newer CFRs, as an academic area to consider:
3030 (b) (10) (C) (1) "The pupil does not achieve adequately for the pupil's age or to meet State-approved grade-level standards in one or more of the following areas, when provided with learning experiences and instruction appropriate for the pupil's age or State-approved grade-level standards: (i) Oral expression. (ii) Listening comprehension. (iii) Written expression. (iv) Basic reading skill. (v) Reading fluency skills. (vi) Reading comprehension. (vii) Mathematics calculation. (viii) Mathematics problem solving; and (2) (i) The pupil does not make sufficient progress to meet age or State-approved grade-level standards in one or more of the areas identified in subdivision (b)(10)(C)(1) of this section when using a process based on the pupil's response to scientific, research-based intervention; or (ii) The pupil exhibits a pattern of strengths and weaknesses in performance, achievement, or both, relative to age, State-approved grade-level standards, or intellectual development, that is determined by the group to be relevant to the identification of a specific learning disability, using appropriate assessments, consistent with 34 C.F.R. sections 300.304 and 300.305;"
Some have argued that, from the way it is written, reading fluency is only required for "response to scientific, research-based intervention" (RtI) or a "pattern of strengths and weaknesses" (PSW). That is not the case. Remember that this is copied from the latest version of the CFRs. So even though LEAs have a choice among a discrepancy model, RtI, or PSW, there are now eight academic areas that must be looked at.
Standardized Tests of Achievement and their Usefulness in the Identification of a Specific Learning Disability
I’ve been a school psychologist for over 20 years and it seems to me that it is getting harder and harder to qualify students using the discrepancy model here in California. Is there a reason for this or is it just my imagination?
Hi Inquiring Mind,
The discrepancy model was predicated on 1960s and '70s research that showed a correlation of .7 between IQ and achievement scores. By squaring the correlation we get the amount of variance in achievement accounted for by an IQ score, which was 49%. That is nearly half the variance, from just one number.
However, today's IQ and achievement scores are not correlating as highly. The WISC-V with the WIAT-III drops to a correlation of .53, accounting for only 28% of the variance. The WJ-IV Cog with the WJ-IV Ach correlates at .54, accounting for only 29% of the variance. These tests are co-normed, which reduces error because the same individuals are given both sets of tests. These correlations should be viewed as best case, because those of us in the field know there is a mix-and-match process that goes on, where the WISC is used for IQ and the WJ is used for achievement (which increases error, as the norm samples are completely different groups). Even a .5 correlation would be optimistic in these circumstances. What does that mean? At best, 25% of the variability in achievement is accounted for by these IQ tests. Another way to say this is that 75% of the variability in achievement is left unknown if all you do is rely on a general ability score (which is what the discrepancy model rests on).
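The arithmetic behind these percentages is simply the squared correlation (the coefficient of determination). A short Python sketch, using the correlations cited above:

```python
# Variance in achievement explained by IQ is the squared correlation, r^2.
correlations = {
    "1960s-70s research":       0.70,
    "WISC-V with WIAT-III":     0.53,
    "WJ-IV Cog with WJ-IV Ach": 0.54,
}

for pairing, r in correlations.items():
    explained = r ** 2
    print(f"{pairing}: r = {r:.2f}, "
          f"variance explained = {explained:.0%}, "
          f"unexplained = {1 - explained:.0%}")
# 1960s-70s research: r = 0.70, variance explained = 49%, unexplained = 51%
# WISC-V with WIAT-III: r = 0.53, variance explained = 28%, unexplained = 72%
# WJ-IV Cog with WJ-IV Ach: r = 0.54, variance explained = 29%, unexplained = 71%
```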
Why there is a reduction in prediction has to do in part with the latent variable nature of the overall IQ score (g), but that discussion is better saved for another time and when we talk about test construction and factor analysis.
Since 2001, NCLB has required every state to establish standards that all students would meet by the year 2014. This created what became known as a race to the bottom, as many states maintained low standards or lowered their standards to meet this objective. It was much easier to meet the 2014 deadline this way. Not everyone can be above average, right? California was not one of these states. California had some pretty high standards and chose to maintain them.
How does it impact the discrepancy model?
All tests must have norms to generalize to, and every 10 years or so tests must be re-normed to remain valid. Because these standardized achievement tests (the 3rd editions of the Wechsler Individual Achievement Test and the Kaufman Test of Educational Achievement, or the 4th edition of the Woodcock Johnson Tests of Achievement) are nationally normed across states with wide-ranging standards, the variability in what makes up the scores increased. Therefore, if you happen to be from a low-standard state, meeting the discrepancy becomes easier, as your students may not, at that grade level, have been exposed to certain material. Conversely, if you are from a state with higher standards, meeting the discrepancy is far more difficult, as lower-standard items inflate standard scores. This is why, in California, tests of achievement based on national norms bear little resemblance to actual classroom performance, which sometimes leads to contentious IEP meetings, e.g., "How can you say my child is achieving in the average range, but is failing in school?!"
Things might get better with Common Core, but you have to remember that each state can tweak these standards by 15%; in addition, at last count only 40 states were participating in Common Core. It will also take time to see the effects of Common Core at all grade levels, and testing companies have only recently normed many of their achievement tests, so be prepared for at least 7-10 more years of difficulty if your district is still using the discrepancy model.
Final and most important point:
The discrepancy model also requires you to take into consideration classroom performance. Classroom performance, or real-world evidence, trumps standardized test scores, for all the above reasons and more. You must exercise professional judgment. CCR 3030 (b) (10), with respect to the discrepancy model, requires a 1.5 standard deviation difference (22.5 points when the mean is 100 and the standard deviation is 15), adjusted by a standard error of measurement of plus or minus 4 points. Everyone always seems to reduce the discrepancy to 18.5. Remember, in some instances the discrepancy may need to be more than 26.5, depending on how all your other data looks.
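The point-value arithmetic can be laid out in a few lines of Python, assuming the mean-100/SD-15 scale and the plus-or-minus 4-point standard error of measurement described above:

```python
# Discrepancy-model arithmetic under CCR 3030 (b) (10):
# a "severe discrepancy" is 1.5 SDs between ability and achievement,
# adjusted by a +/- 4-point standard error of measurement (SEM).
SD = 15
SEM = 4

severe_discrepancy = 1.5 * SD            # 22.5 points
lower_bound = severe_discrepancy - SEM   # 18.5 -- the figure everyone defaults to
upper_bound = severe_discrepancy + SEM   # 26.5 -- sometimes the bar the data demands

print(severe_discrepancy, lower_bound, upper_bound)  # 22.5 18.5 26.5
```

Treating 18.5 as the only threshold quietly applies the SEM in one direction every time; the regulation's range runs from 18.5 up to 26.5, and professional judgment determines where a given case falls.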
I hope this is helpful.