A Brief History of Psychiatric Rating Scales
By Monika Vance
Today’s best rating scales are scientifically and mathematically conceptualized. They are designed to be…something like multi-dimensional “rulers” that, used individually or together, also offer the sensitivity and utility of a compass.
Rating scales are tested on a very specific group of people to confirm that they measure what they’re supposed to measure and that they can detect change in what they measure over a pre-defined period of time. Their development has paralleled the progression of neuroscience, psychology, psychiatry and medical anthropology.
Now…don’t worry, I’m not going to bore you with a textbook chronology of how psychiatry evolved from Ancient Greece, through the Middle Ages and into the Enlightenment Era. I’d fall asleep while writing that. Edward Shorter already did a great job of it in the mid-’90s.
Rather, let me walk you down a brief memory lane of how the use of psychiatric rating scales evolved, and how that shaped the role psychometric measurement plays today in Western clinical practice and research.
In future posts, I will build on this to demonstrate how cultural differences, translation and migration from paper to digital formats influence scale selection for research, ratings for diagnostic and treatment monitoring purposes, and interpretability of your multi-cultural data. So stay tuned…
“If there is one central intellectual reality at the end of the twentieth century, it is that the biological approach to psychiatry – treating mental illness as a genetically influenced disorder of brain chemistry – has been a smashing success. Freud’s ideas, which dominated the history of psychiatry for the past half century, are now vanishing like the last snows of winter.”
― Edward Shorter, A History of Psychiatry: From the Era of the Asylum to the Age of Prozac
In a previous post, I introduced how rating scales impact the quality of data in clinical trials and listed just a few terms that are used to refer to rating scales across healthcare and pharmaceutical research.
A rating scale is just one of several types of psychological test. I’m assuming that most of you who find yourselves reading this right now are more familiar with the term rating scale, or PRO, ClinRO, PerfRO, PROM, COA, eCOA, psychiatric measure, or some variation of these. So, I’ll continue using rating scale, but keep in mind that this post applies to all psychological tests.
Psychiatric Rating Scales in Context
Psychiatric rating scales came to us from the field of Psychology. In Psychology, they are used to systematically and empirically measure a sample of function or behavior.
What you always need to keep in mind is that rating scales are scientifically constructed instruments that are used to objectively measure states and/or traits within a specifically defined conceptual framework (e.g. attitudes, aptitudes, competencies, cognition, mood, physical function, and so on). Such conceptual frameworks – also called constructs – can be generalized to quality of life, or become slightly more targeted to health-related quality of life, or be highly specific, such as measuring types and severity of hallucinations in schizophrenia.
Rating scales were initially created to aid in making decisions about people in the contexts of educational competencies and occupational fit, and were later adopted for clinical differentiation between “normal” and “abnormal” behavior and functioning. For lack of better alternatives, at least for now, well-designed rating scales remain particularly useful for evaluating interrelationships of cognitive, affective and behavioral traits.
Evolution in Testing: From Education to Clinical Practice and Research
Before they became common in clinical settings, rating scales were used most widely in the field of education. They were developed for measuring cognitive intelligence, function and behavior.
The challenge of empirically differentiating abnormal individuals from normal individuals within the context of behavioral, intellectual and emotional functioning carried over from the field of education into psychiatry in the late 19th century.
Many of the early versions of rating scales were developed in England, Germany, France and the United States. These scales were progressively adopted into clinical settings and frequently used for assessing cognitive functioning in individuals with brain damage and mental retardation, by way of rating patients’ understanding of proverbs, oral memory tasks, and comparing differences and similarities in series of words. However, emic stigma toward mental illness, religious biases, superstitions, and limited clinical knowledge of psychopathology made progress in the treatment of mental disorders slow and difficult.
Most notably at the time, in 1879, Wilhelm Wundt, a German physician, founded the Institute for Experimental Psychology at the University of Leipzig, where he and his students contributed a monumental amount of intellectual work to the fields of psychology, psychiatry and neurology. The Institute for Experimental Psychology laid the foundation for standardized procedures in accurate measurement of psychological constructs under controlled laboratory conditions, and for various methodologies used in data collection and data analysis.
Just as language evolves within cultural groups, the lingo also changed over time. As rating scales became more frequently used for clinical and research purposes within the psychiatric assessment process, the psychiatric community began to refer to the tests as psychiatric rating scales, psychiatric tests and psychiatric measures.
Today, rating scales are used in clinical practice as part of a psychological or psychiatric assessment, and in clinical research across various settings, including academia, healthcare reimbursement policy, pharmaco-economics, and drug development.
Psychiatric Rating Scale Adoption: Clinical Setting, Culture, Purpose and “Fit”
Most commercially available psychiatric rating scales are developed in the course of routine clinical practice in busy private clinics and psychiatric institutions. Then they are either published in trade journals, or assigned by their developers to copyright collective agencies, or to test publishers for marketing, global distribution and copyright management.
Many more remain hidden in thousands of international scientific journals, doctoral dissertations, obscure (and sometimes insanely expensive) databases owned by academic institutions and professional associations, and in an array of compendia of measures commonly used within a specific psychiatric specialty.
Most of the instruments used in clinical practice and research today were developed in, or after, the 1970s with the intent to improve empirical data in clinical research and the quality of longitudinal care for certain groups of patients, and later also to justify the importance and effectiveness of treatment programs for managed care reimbursement purposes. Successes with methods of treatment monitoring and outcome reporting from these initiatives have been, and continue to be, publicized in journals and sometimes also through marketing efforts by test publishers and their international affiliates.
In clinical research, this leads to multiple (questionable) versions of published rating scales being passed around from service providers to sponsors and vice versa. These multiple versions are usually the result of the test developer not being aware of the responsibilities for test developers set forth in Standards for Educational and Psychological Testing (American Psychological Association et al., 1999). As adaptations to items, test-retest time frames, rating instructions, and translations into new languages occur, and new versions of the same rating scales continue to appear, the integrity of the rating scale is compromised, which in turn affects the quality of clinical trial data. On the other hand, new scales are developed to fill the gaps in current measurement capabilities. Having access to new scales also provides new possibilities for measuring the effect of experimental drugs with greater precision and/or with greater focus on the mechanism of action.
In healthcare, new scales publicized in scientific journals lead clinicians practicing in other settings, or in other countries, or treating different types of patient groups (e.g. immigrants who do not yet speak the language), to adopt such instruments in their local settings.
Remember that the rating scale is a sensitive hybrid of a ruler and a compass designed for a very specific purpose, and it may not be a good “fit” for the new setting. So, while having access to new instruments is generally viewed as an advancement in mental health treatment methods, adopting an instrument for use within a new culture, with groups of people speaking different languages and harboring different behavioral norms, can become exceptionally challenging and sometimes impossible. If the instrument is used in a context for which it was not designed, a perceived and well-intentioned “advancement in mental health care” can actually result in unintended harm to patients and, due to stigma, potentially to their caregivers as well.
Clinical psychologists and social workers have led the research in cross-cultural assessment, and have generated an enormous body of literature on the subject.
Cross-Cultural Assessment is Where We Are
With today’s globalization, ethnic homogeneity in any country is, or soon will be, a thing of the past. Cross-cultural assessment is where we are. Rating scales have come a long way, but they continue to need thoughtful attention to their design and psychometric properties, so they can perform the way we expect them to in the variety of settings in which we decide to use them, and in the languages into which we choose to translate them.