Note: This blog post contains spoilers for the book Moneyball. I have no idea if it contains spoilers for the movie; I haven’t watched it yet.
Time to come clean: I’ve always thought that every aspect of our lives is data-driven. If we had the right data and knew how to use it, we’d always be able to make the best possible choice. That’s why, when I heard other SLPs in my graduate school statistics class talking about how much they hated math, statistics, and data, my reaction was, “If you don’t like data or statistics, you’re in the wrong field.” Increasingly, the field of speech-language pathology has relied on evidence to guide practice and policy, and what recently clarified this for me was reading Moneyball by Michael Lewis in preparation for watching the movie. The book (short version) is about how Oakland A’s general manager Billy Beane used good data well to put together a baseball team that could compete against teams with significantly higher payrolls (like the Yankees, whom, as a Detroit Tigers fan, I am currently loving to hate). So how does this relate to evidence-based practice in speech pathology? I picked up five major themes in the book that relate perfectly.
1. Data is always better than no data. In Moneyball, one of the first things Billy Beane does after becoming general manager of the Oakland A’s is try to wrest drafting decisions away from the A’s scouting department. Why? Because the scouts, after accumulating decades of experience, claim to be able to tell which baseball prospects will have good careers by how they look. Not much real data, just whether these prospects, most of whom are high school students, look like they could one day be major league stars. Beane knows this is crap, because when he was in high school, he was the kid who looked like a major league star, and he washed out. What Beane wanted to do instead was draft players based on their stats rather than the scouts’ opinions. As speech pathologists, we have the same responsibility to make our decisions based on actual data rather than how someone “looks”. I’ll give an example from the experience of one of my wife’s professors: the professor was referred an 18-month-old child with congenital deafness and blindness. This child, by the (largely observation-based) standards of the time, was not a good candidate for speech therapy. The professor, willing to do the evaluation to make sure, asked the parents about the child’s communication skills. The parents indicated that the child was learning sign through physical prompting and knew some signs. By “some”, the SLP thought the parents meant somewhere between five and ten. The parents meant 50 – at eighteen months old. Because this SLP sought data rather than relying on how the child “looked”, she was able to help the parents improve the child’s communication, and he ended up in a gifted and talented program that couldn’t order Braille books quickly enough to keep up with his reading pace.
2. Contextual data is better than data out of context. To be fair to baseball scouts, many didn’t work off of no data. They had gathered plenty – they knew the prospects’ high school batting averages, home run counts, and ERAs, and had timed their running speed, fastball speed, and swing power at scouting camps. Unfortunately, this data didn’t contain all the information needed to make a good decision because it was, for the most part, data out of context (especially the data from the scouting camps). Sure, a prospect can hit a 90 mile-per-hour fastball into fair play 40% of the time, but that doesn’t necessarily translate into performance during a game, where the pitcher won’t always throw a fastball and tens of thousands of fans are screaming at you. What Beane did was focus on prospects who had better contextual data – he targeted college prospects, whose stats had historically been shown to more closely predict professional performance, and he ignored scouting camp data entirely. Similarly, speech pathologists need to be wary of relying only on noncontextual data. Standardized tests are a great way to get good baseline data, but (at least in the case of language testing) they’re not terribly reliable as predictors of speech and language performance in the classroom. To make the testing data count, it’s important to also gather teacher feedback and classroom observations, and to consider potential social influences on language performance (for example, whether a kindergarten student has ever been away from home for a full day at a time, or how comfortable a student may be in a new environment). Because that data is qualitative rather than quantitative, it shouldn’t be considered on its own either (see Lesson 1 above), but integrating both types of data, especially when it comes from multiple sources, can provide a far better perspective on a student’s performance.
A recent example from my caseload is a student, aged 5 years, 9 months, who earned a Core Language score in the low 50s on the CELF-4. The student got a raw score of 1 on Concepts and Following Directions (for those of you unfamiliar with the CELF-4, this means the student could not “point to the ball, then the shoe” in a field of 6), and when asked to say the alphabet, the student said “A, C, X” and then started crying. When this data was discussed with the parent and teacher at the IEP meeting, the teacher reported that the child was able to say the alphabet and was copying letters well, and the parent reported that the child was spelling and had never before been in a full-day educational program. Both individuals who could see the child’s actual performance saw something that I, with my standardized, out-of-context data, did not, and using all that data together when developing the IEP led to a treatment plan with more flexibility in the event that the student’s performance improved as she got more comfortable with being in school all day and with me as an SLP.
3. Some data are more equal than others. While Beane was general manager of the A’s, there was already a revolution underway in which statistics were being considered – on-base percentage and slugging percentage, rather than batting average, were being used to determine which players had the best offensive performance. However, on-base percentage (the percentage of the time a player gets on base) was given the same weight as slugging percentage (the average number of bases a player reaches per at-bat) when determining effectiveness. Beane’s statisticians determined that this weighting was not accurate – a player with a 1.000 on-base percentage (reaches base every time at bat) will generate infinite runs (because there will never be an out), whereas a player with a 1.000 slugging percentage (an average of one base per at-bat) will generate one out per double, two outs per triple, and three outs per home run, and so is not nearly as valuable a player. In the same way (bringing this back to speech so that people who don’t understand baseball can catch up), we need to constantly review the conclusions we reach based on our evidence-based practice and RTI data to make sure that we’re making the right recommendations. Our field is still relatively new, and fields with far more experience than ours are still trying to figure everything out – after all, people were tracking baseball stats while Lionel Logue was still trying to make it as an actor, far more people analyze baseball stats than analyze ours, baseball stats are clearer on their face, and statisticians are still reviewing what conclusions should be drawn from their data.
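For readers who want to see the arithmetic behind that comparison, here is a minimal sketch in Python. The numbers are made up for illustration, and the two functions are simple definitions written for this example, not from any real statistics library:

```python
def on_base_pct(times_on_base, plate_appearances):
    """Fraction of plate appearances in which the batter reaches base."""
    return times_on_base / plate_appearances

def slugging_pct(total_bases, at_bats):
    """Average number of bases per at-bat."""
    return total_bases / at_bats

# Player A reaches base all 4 times: a 1.000 OBP means no outs, ever,
# so an inning with a lineup of Player A's never ends.
print(on_base_pct(4, 4))   # 1.0

# Player B hits one home run (4 bases) and then makes three outs:
# 4 total bases over 4 at-bats is still a 1.000 slugging percentage,
# even though three of those at-bats produced outs.
print(slugging_pct(4, 4))  # 1.0
```

Both players post a “1.000”, but only one of those numbers rules out ever making an out – which is exactly why the two statistics shouldn’t be weighted equally.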
4. Look at the story the data tells. Beane’s perspective on using data in recruiting and drafting prospects began with a publication called Baseball Abstract by Bill James. James’s trick to understanding baseball statistics more deeply was that he looked at the story told by a statistic rather than simply its number. As such, he recommended placing higher value on some statistics than others. For instance, on-base percentage, the rate at which a player reaches base, relies on the hitter’s ability to either get a base hit, draw a walk, or force a fielder to make an error. It’s chiefly reliant on one player, and doesn’t vary much if a player changes teams or gets placed elsewhere in the batting order. Runs batted in (RBI), on the other hand, is the number of runners who score as a result of a player’s at-bat. It is extremely reliant on other players, since (home runs aside) an RBI cannot be earned if no other players on your team get on base, and being placed in the order after players with a low on-base percentage significantly reduces a player’s ability to collect RBIs. As such, James argued that the RBI is a less meaningful statistic than on-base percentage. In the same way, as speech therapists, we can look at our data and see that some data is dependent on other data or circumstances, and we need to consider the full narrative of what the data reflects before we can determine the impact of some language deficits or other factors on treatment plans, eligibility for treatment, or goals. For instance, a student who gets a Core Language score of 75 on the CELF-4 is a far better candidate for treatment if that student has an IQ score of 100 than if the student has an IQ score of 60 (acknowledging that IQ is, in and of itself, a flawed means of acquiring data). My example student from Lesson 2, in addition to her low score on Concepts and Following Directions, also scored very poorly on Recalling Sentences.
However, analyzing the full range of data from that subtest (what the student said, rather than just whether her answers were correct or incorrect) revealed that the test was entirely inconclusive with regard to the student’s actual ability to recall sentences – the student did not understand the directions of the assessment, and treated each statement as something to respond to rather than something to repeat. The full story of what the data showed, among other things, allowed the IEP team to focus on the prerequisite skill – understanding and following directions – rather than on the concepts assessed by her low-scoring subtests, all of which relied on following directions.
5. No matter how objective we try to be, we are subjective. Beane’s greatest failure was that he could not get his managers or players to actually execute strategies based on his data, no matter how much they believed in it. Sacrifice bunting and stolen bases were shown by the data, again and again, to be ineffective ways of generating runs and avoiding outs (especially sacrifice bunting, which by definition creates an out). Even the players who most bought into Beane’s philosophy still tried to steal a base when they saw an opening, or laid down a sacrifice bunt to move two baserunners ahead – despite knowing, based on data generated by their own performance, that the behavior was unlikely to produce runs or otherwise increase the team’s chances of winning. Likewise, we need to recognize that as human beings, we view our data through a subjective lens, and keeping track of the objective data should always be at the forefront of evidence-based practice. While I discussed in Lesson 2 the importance of using subjective data, it’s important to remember that subjective data (classroom observations, teacher input, and the like) gathered during assessments is primarily useful for supporting, corroborating, and confirming what the objective data shows. If we abandon the objective data because it isn’t the “full picture” of what the child can do, then we are abandoning evidence-based practice, and that takes our discipline backward.
In short, data is data. Reading about and discussing how other fields (even seemingly unrelated fields like baseball) manage and use their data can teach us important lessons about how SLPs can better manage, use, and improve upon their own data. I hope this perspective was useful, at least in helping you think about how you use EBP in your practice – even if you determine that you are in fact already following the Moneyball model of data use.