Tag Archives: COHA

The Dudes Do ATIA 2013: Day 3 – Of Dining and Data

Today was a day of meetings. Fortunately, the first was at a delightful restaurant, the Thai Thani on International Drive in Orlando. Being an Indian curry lover, I opted for the Curry Fried Rice with chicken, and wasn’t disappointed. One of the house specialities is a pineapple yellow fried rice curry with a choice of beef, chicken or pork, stir fried with raisins, cashews, and onions, but I wanted something less fruity so I’ll save this special for another visit.

Thai Thani restaurant

Thai Thani Orlando

Following two more meetings, I did the first of my two joint presentations. I usually fly solo – then there’s only me to blame if things go wrong – but this year I tried sharing. And this one was on one of my favorite topics: automated data collection and analysis with AAC devices. The content was similar to the presentation I gave at ASHA 2012, which has already been documented in The Dudes Do ASHA 2012: Day 4, so feel free to click and read that.

What wasn’t discussed in that older post was the way in which the word data itself can tell us something about language change over time. So try this quick test – and don’t spend too long thinking about the answer:

Which of these statements is correct?

(a) The data is good.

(b) The data are good.

If you answered (b), then you are in the company of the good people at the Oxford English Dictionary (and that’s not bad company to be in) and in the hearts of die-hard grammatical prescriptivists [1].

But if you answered (a), then you are not that different from the population of the English-speaking world as a whole, because “is” and “are” seem to be in free variation! If you take a look at the Corpus of Historical American English, you’ll see that in terms of frequency of use, the two haven’t differed that much since the 1930’s, and you can make a case, I suppose, for arguing that the is-form has edged ahead of the are-form.

Take a look at these charts that track use since 1830.

The word data and the verb is

“The data is…”

Notice that “data is…” was being used at the turn of the century and peaked in the 1990’s. Compare that with the “data are…” instances:

The word Data and the word Are

The data are…

There are hardly any examples prior to the 1930’s and from the 1960’s onward, both is and are appear to be neck and neck in terms of usage.

So why does this happen? What is it that makes data such a tough word for folks to decide whether it should be used with is or are? The answer – or at least part of it – is related to our understanding of whether a noun is a count noun or a mass noun.

For those saner readers who are less obsessed with language than this Dude, count nouns are – unsurprisingly! – those that can be counted. So dog, cat, shoe, table, boat, and cup, are all count nouns because we can talk about “three cups” or “five shoes” or “a room full of dogs.” With a count noun, you’re usually able to turn it into its plural form by adding an “s.”

On the other hand, a mass noun cannot be counted. Pork, education, furniture, and weather, cannot be used with a number or pluralized by adding an “s.” You don’t have “*three weathers” or “*a room full of furnitures.”

Data is one of those words that has become a mass noun, even though it was originally a count noun. And by “originally,” I mean going back to Latin, where the singular was datum and the plural was data. What often happens with foreign words that are imported into English is that we apply regular English rules to them. On that basis, it wouldn’t have been surprising to see datums – but it didn’t happen ;)

What appears to have happened is that the word data has become a synonym for information, and folks feel that if “the information is good” sounds OK, then so does “the data is good.”

Incidentally, there is a way to turn a countable noun into a mass noun by using a rather gruesome linguistic device called a “universal grinder [2].” Suppose that in a frantic effort to catch a bird that has found its way into your house, your cat leaps up into the air and accidentally hits a rapidly rotating heavy fan. Saddened by its untimely demise, you might, through your tragic sobs, explain to someone over the phone that, “There is cat all over the room.” In this situation, a regular count noun has suddenly transformed into a mass noun.

Kitten playing with a fan

Careful, Mr. Tibbles!

Equally, in certain circumstances, some mass nouns can take on the appearance of a count noun. Although water is typically a mass noun, you might be in a restaurant and remark that, “there are four or five waters already on the table.” Needless to say, folks learning English have a bit of a struggle trying to learn the difference between them, as the only rule seems to be that liquids and powders (amorphous items) tend to be mass nouns, and the rest are count.

The learning point from all this – and we’re trying to be recognized as an educational blog as well as providing entertainment – is that when we are evaluating someone’s ability to use language, it’s critical to be aware of the fact that sometimes the prescribed way of speaking may actually be in free variation with the popular way, and this is actually one of the ways in which language changes over time [3].

For the sake of completeness, the day ended with wine, pizza, beer (mass noun), and a cocktail before bed. Needless to say I fell asleep quickly.

Notes
[1] In the world of language mavens, there are constant arguments between prescriptivists, who take the line that there are “correct” ways to say things, and descriptivists, who say that so long as you can be understood, there ain’t no right and wrong. Although I’m more often in the prescriptivist boat, I’m happy to jump ship depending on my mood – and whether I want to get into a bit of a row with someone just for the hell of it.

[2] The Universal Grinder is a linguistic thought experiment first written about by Francis Pelletier, who used it in a paper talking about the nature of count versus mass nouns. Pelletier didn’t use household pets and rotating blades as his examples but the Dudes feel more at home with Edgar Allan Poe as a role model than, say, Noam Chomsky or Steven Pinker.

Pelletier, F. J. 1975. Non-Singular Reference: Some Preliminaries. Philosophia 5.

[3] A pretty comprehensive coverage of how and why languages change over time can be found in Larry Trask’s 2010 book Why Do Languages Change? For those who want the Dude notes, you can click on the following Dude Link to get the 38-page summary. Link to book summary

Efficacy or Effectiveness? How To Be A Word Detective

Late last week I was in a meeting with a chappie from the International Organization for Standardization, talking about the role of the research group I belong to and explaining how we measure our performance. This sort of thing is typical of any company that needs to maintain its ISO status [1] and having lists of procedures, processes, and parametrics is de rigueur for the whole shebang.

In the course of the discussion, I happened to talk about the challenge of measuring the efficacy of a department whose purpose is to generate speculative ideas, 80% of which are likely to be unfeasible. The examiner stopped me and asked me to repeat the word, which I did, and my colleague also offered a “translation” by saying “effectiveness.” That did the trick, and the confusion was chalked up to my being an Englishman who is still struggling to learn American. [2]

But being me, I jotted the words down in my ever-present notebook with a view to investigating whether the efficacy/effectiveness split was, indeed, a transatlantic difference.

Of course, in this age of Evidence-Based Practice, the call for measures of how much effect therapy has on a client means that it’s common to talk about the “efficacy of treatment” or the “effectiveness of an approach.” Or is it? Do we say “efficacy” or “effectiveness?” Is there, in fact, a difference?

Well, the first thing I often do with questions like this is to use the Google search engine and get a Ghit measure. “Ghit” is short for “Google Hit” and appears in a search as a number under the search bar. [3] Here’s what comes up for efficacy and effectiveness:

Efficacy: 17,100,000 ghits
Effectiveness: 179,000,000 ghits

Whoa! Quite a difference there, by a factor of ten. Just to corroborate the difference, I did a Bhit count and a Yhit count (Bing Hits and Yahoo Hits, if you weren’t sure).

Efficacy: 52,400,000 Bhits and 52,600,000 Yhits
Effectiveness: 143,000,000 Bhits and 139,000,000 Yhits

So not ten times larger for effectiveness but still significantly more popular. But what about the notion that it’s a UK/US thing? It is possible that the high ghit count is masking it – after all, the percentages will always skew in favor of the US when it comes to number of speakers.
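For anyone who wants to check the arithmetic behind those comparisons, here is a minimal Python sketch that works out how many times more hits effectiveness gets than efficacy on each search engine. The counts are the ones quoted in this post, with the Yahoo count for efficacy taken to be 52,600,000; the names `HITS` and `hit_ratio` are made up purely for illustration.

```python
# Hit counts as quoted in the post (a snapshot from the time of writing).
# Assumption: the Yahoo count for "efficacy" is taken to be 52,600,000.
HITS = {
    "Google": {"efficacy": 17_100_000, "effectiveness": 179_000_000},
    "Bing": {"efficacy": 52_400_000, "effectiveness": 143_000_000},
    "Yahoo": {"efficacy": 52_600_000, "effectiveness": 139_000_000},
}


def hit_ratio(counts):
    """Return how many times more hits 'effectiveness' has than 'efficacy'."""
    return counts["effectiveness"] / counts["efficacy"]


# Print one ratio per search engine, rounded to one decimal place.
for engine, counts in HITS.items():
    print(f"{engine}: effectiveness has {hit_ratio(counts):.1f}x the hits of efficacy")
```

On these figures the Google ratio comes out at roughly 10.5, while Bing and Yahoo come out at roughly 2.7 and 2.6, which is why the factor of ten shrinks once the other engines are consulted.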

This is when I turn to my trusty friend, the BYU-Corpus site, where we can play with the Corpus of Contemporary American English to check on how a word is used in the US, and also the British National Corpus to get a UK perspective. I did this for my previous post on the use of have versus take in relation to bathing – and that turned out to be most definitely a US/UK distinction. Here’s what we see:

Oh bugger! It doesn’t look like a BrE versus AmE difference after all. There is a 10% variation between the two but I’m pretty sure it’s not statistically significant. My choice to use efficacy puts me in the minority in both the States and the Isles.

Desperate for some validation, I dug a little deeper by looking at some historical data. Maybe I’m just old and the incidence of the words has changed since I was a lad. The British National Corpus isn’t much help as it only covers the period from the 1980’s through to 1993, and I want to see older data than that.

The Oxford English Dictionary is a good source for historical information on word meaning, so I went to the bookshelf and did a little more research.

Efficacy as a noun dates from 1527 and is defined as the “(p)ower or capacity to produce effects.” It’s derived from the earlier Latin efficere meaning “to accomplish.” Its meaning hasn’t really changed since then and so we can call it a 16th century word – old enough.

Effectiveness as a noun is a little younger, with the OED identifying a first appearance in 1607, eighty years after efficacy. It has a similar definition of, “(t)he quality of being effective.” Not surprisingly, it, too, can be traced back to the same Latin root as efficacy, efficere. However, it is a 17th century word so I can take some comfort (perhaps) in arguing that my use of efficacy is more “traditional.”

However, we can see something much more interesting if we take a peek at the Corpus of Historical American English, which covers the period 1810-2009, and that certainly goes back further than my birth!

Here’s the chart of the behavior of the word efficacy since 1810:

The history of the word efficacy

efficacy 1810-2009

Even before you click on the image to enlarge it, it’s clear that efficacy has been in a slow decline for decades. There’s been a modest upswing since the 1950’s but it’s nowhere near its glory days. So the inevitable question is, what has pushed it aside?

History of the word effectiveness

effectiveness 1810-2009

Well, well, well, what a surprise! The usurper turns out to have been no more than the Pretender to the Throne, effectiveness! From out of the shadows, the word has slowly increased its popularity to the point that it now hogs the limelight and commands center stage. Alas, poor efficacy, I knew it, Horatio.

The story might end there, with my claiming to be simply the sort of dude who uses older words, and who is also a victim of the invisible hand of lexical change that can overturn the fortunes of synonyms. But there is something else: although for most of the world efficacy and effectiveness are synonymous (and dictionaries typically say so), there is a field in which they are not: the Clinical World.

Ah, but that’s a story for another day…

Notes
[1] For some time, I took pleasure in pointing out that the “International Organization for Standardization” was clearly guilty of failing to notice that the acronym should be IOS and not ISO. Alas, my mistake was to assume that ISO was an acronym, when in fact, it allegedly isn’t! The organization says that it’s derived from the Greek word isos, meaning “equal,” and that they did this so they wouldn’t have to use different acronyms in different countries based on the languages. For example, in France it would be Organisation internationale de normalisation (OIN), so ISO is international.

[2] When folks ask me if I speak more than one language, I say I’m bilingual and can speak both English AND American. One of the delights of being an Englishman Abroad is that not only have I had the chance to be immersed in the UK’s melange of dialects and accents for the first 30-something years of life but now I get to go through it all over again with the different flavors and recipes of American English. I’m comfortable with Fall, happy to spell tyres as tires, and say “to-MAY-toe” and not “to-MAH-toe.”

[3] The accuracy of using ghits as a measure of word use is always open to question but as a quick and dirty metric it’s used by linguists who want to get a feel for how the world of words is playing out. Arnold Zwicky used them in a recent blog about the prefix “telephon-” and Geoff Pullum has them in a post on “Assholocracy,” so I think I’m in pretty good company.