All I Needed to Know About Adjectives I Learned at Starbucks

Language is an example of a moving target par excellence. Only today, I received a tweet that outlined a number of reasons why you should instantly wife your girlfriend. Wife her, I thought? Since when did wife switch teams and become a verb? Well, truth be told, it turns out that it became a verb in 1387, as evidenced by a quote from that popular 14th-century pot-boiler Polychronicon written by Ranulf Higden:

Þey..kepeþ besiliche here children, and suffreth hem nouȝt to wyfe wiþ ynne foure and twenty ȝere.

But for reasons unknown – as is often the case in etymology – the use of wife as a verb disappeared sometime during the early 18th century, leaving only the noun usage in common use [1]. After a brief dalliance with verbiness, the word settled back into its original home.

Let’s now go back to just last week during the 2014 Closing the Gap conference in Minneapolis. After standing in line for almost 15 minutes to get a Starbucks latte from the hotel’s coffee bar, I asked for a “tall skinny” and was then quizzed with, “Is that the short tall?”

A “short tall?” Dear Lord, how much more torture do we want to subject the English language to? Prescriptivists everywhere would be wailing in anguish and putting red pens to paper – or maybe tweeting their disgust in 140 characters or less!

However, it’s pretty clear what’s happening here. Just like wife in the 14th century, the word tall is getting bored with being a simple adjective and deciding that being a rambunctious noun is much better; “Noun Envy” as the psychoanalinguists might say [2].

Starbucks, for purposes of marketing and not linguistics, decided to ignore the more semantically accurate method of labeling coffee sizes by “small,” “medium,” “large,” and “freakin’ huge,” in favor of “tall,” “grande,” “venti” and “trenta.” But they created an element of cognitive dissonance in consumers’ minds by linking a word like tall, which is semantically typically opposed to short, with the word small, which is more likely to be balanced against large. So using a word like tall to describe something that is cognitively small just doesn’t jibe.

What our consciously unaware but unconsciously linguistic barista has done here is to overcome that dissonance by treating the word tall as a noun and using short as an attributive adjective. Pretty damn cool, eh? [3] I can easily imagine that at some point, various baristas [4] have uttered not only “Is that the small tall?” but also “Do you mean a medium grande?” or “Is that a large venti?”

So while I’m hanging out here with you all in our virtual Starbucks, something else you might be curious about is the whole “How do I order my coffee?” issue. Does one ask for “a skinny grande cappuccino” or “a grande skinny cappuccino?” And when you start adding caramel or extra shots, where on earth do you hang them?

Well, having castigated my good friends at Starbucks in relation to their idiosyncratic naming of drink sizes, I’ll offer them points for actually providing a “syntax” for budding baristas in order to make ordering easier. In a 2003 manual distributed to employees, the following generic ordering structure was recommended:

1. CUP: That’s hot, cold, iced, or “for here.”
2. SHOT and SIZE: No stipulation for which should be first.
3. SYRUP: For your caramel, raspberry, cinnamon etc.
4. MILK: Skimmed, 2%, soy, or whatever.
5. DRINK: Coffee, tea, mocha, or any other name.

My personal common order is for a “grande, non-fat latte,” which fits the rules of 2>4>5. During summer, I might order an “iced, grande, non-fat latte,” which again conforms with 1>2>4>5. My wife has a “grande non-fat, caramel macchiato” that follows the rules, and sometimes goes for the “iced, grande non-fat caramel macchiato,” which illustrates the full-blown 1>2>3>4>5 ordering.
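For anyone who would rather let a computer do the checking, here is a minimal sketch of the five-slot structure above as a conformance test. The mini-lexicon and the function names are my own illustrative assumptions, not anything from the Starbucks manual, and a real study would need a far longer word list.

```python
# Slot numbers follow the five-part ordering structure listed above.
SLOTS = {"CUP": 1, "SHOT": 2, "SIZE": 2, "SYRUP": 3, "MILK": 4, "DRINK": 5}

# Hypothetical mini-lexicon, for illustration only.
LEXICON = {
    "iced": "CUP", "hot": "CUP",
    "double": "SHOT",
    "tall": "SIZE", "grande": "SIZE", "venti": "SIZE",
    "caramel": "SYRUP", "raspberry": "SYRUP",
    "non-fat": "MILK", "soy": "MILK",
    "latte": "DRINK", "mocha": "DRINK", "macchiato": "DRINK",
}


def slot_sequence(order):
    """Return the slot number of each recognized word in a spoken order."""
    words = order.lower().replace(",", " ").split()
    return [SLOTS[LEXICON[w]] for w in words if w in LEXICON]


def conforms(order):
    """True if the slot numbers never decrease from left to right."""
    seq = slot_sequence(order)
    return all(a <= b for a, b in zip(seq, seq[1:]))


print(slot_sequence("iced, grande, non-fat latte"))  # [1, 2, 4, 5]
print(conforms("iced, grande, non-fat latte"))       # True
print(conforms("macchiato iced grande caramel"))     # False
```

The verdict depends entirely on how words are tagged. With caramel tagged as SYRUP, an order like “non-fat caramel macchiato” puts milk before syrup; treat “caramel macchiato” as a single drink name and the result changes.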

Budding researchers [5] might want to spend an afternoon at their local Starbies armed with a pen and a notebook, jotting down as many orders as they can overhear – what researchers like to call “taking a sample.” After an hour of sampling both orders and coffee, they should be able to do some analysis to see how many people actually conform to the ordering paradigm. Remember, this is what research is all about: setting up a hypothesis about how we think folks will order coffee, and then testing it against observations of how they really order it!

Outside the world of Starbucks, adjective ordering in English also has some rules. One of the most common ordering paradigms is as follows:

Order of adjectives

If we compare this with the Starbucks recommendations, we can see that the sequence CUP-SHOT/SIZE-SYRUP-MILK-DRINK corresponds to the generic OPINION-SIZE-MATERIAL-QUALIFIER-NOUN. So they’re pretty much on the syntactic ball here!

Doubtless our hundreds of “proxy Dudes” collecting real data at coffee bars across the world will find exceptions to the ordering rules, but language performance has always been variable. On the other hand, we’re unlikely to hear “macchiato iced grande caramel” or “caramel latte venti soy.”

Or are we?

[1] I suppose as a proponent of using evidence and data to support propositions, I did take a look at the Corpus of Contemporary American English and found no instances of wife as a verb in the 450 million word sample. Same for the British National Corpus (100 million word sample) and the Canadian Strathy Corpus (50 million words). Of course, absence of evidence is not evidence of absence, but I think I’m pretty confident in asserting that using wife as a verb is extremely rare and unlikely.

[2] Don’t rush out to your dictionary – even if YOUR dictionary is the Urban Dictionary – to find the word psychoanalinguist. It doesn’t exist. It’s only a “real word” in the sense that (a) I have just used it and (b) it can be understood within the context of this article.

[3] I suppose I need to appreciate that not everyone gets as excited about language change as I do. But this type of living example of how new meanings come about helps us all understand how important it is to be aware of the simple fact that languages are not, and never have been, static. I’m not suggesting that we allow some form of lexical anarchy where you can simply stick any old word anywhere but knowing that words can, and do, change meaning and category can, I believe, make us more aware clinicians.

[4] The word barista is, as you might know, Italian, so you might be tempted to point out to me that I should really be using the word baristi to mark the plural. However, the word baristas is perfectly acceptable because it’s an example of a word that’s been Anglicized i.e. taken into the English language, and the normal rule for making a plural word is simply to add an “s.” Hence baristas. I think I’ve talked about this before in relation to octopuses as being a wonderful plural, with octopi being fake Latin (octopus comes from Greek, not Latin, and if you wanted a Greek plural, it would really be octopodes!)

[5] It strikes me that a generous supervisor might be totally OK with letting a grad student work on a study such as, “Syntactic adjectival variability in coffee ordering.” And should that student be the recipient of a grant from Starbucks itself, it seems a bit of a no-brainer, don’t yah think?

The Dudes Dissect “Closing the Gap” 2013: Day 2 – Of Speech and Sessions

Having looked at the vocabulary used in the Closing the Gap 2013 preconference sessions, it’s time to cast a lexical eye on the over 200 regular presentations that took place over two-and-a-half days. For most attendees, these are the “bread and butter” of the conference and choosing which to attend is a skill in and of itself. It’s not uncommon [1] to have over ten sessions run concurrently, which means you’re only getting to attend a tenth of the conference!

So let’s take a look at the vocabulary used in the titles of all these presentations to get a flavor of the topics on offer.

Conference Presentations: Titles

The total number of different words used in the session titles was 629 after adjusting for the top 50 words used in English [2]. As a minor deviation, kudos to all who used the word use correctly instead of the irritatingly misused utilize. Only one title included utilizes – and it was used incorrectly; the rest got it right! For those who are unsure about use versus utilize, the simple rule is to use use and forget about utilize. The less simple rule is to remember that utilize means “to use something in a way in which it was never intended.” So, you use a pencil for drawing while you utilize it for removing wax from your ear; you use an iPad to run an application while you utilize it as a chopping board for vegetables; and you use a hammer to pound nails but utilize it to remove teeth. Diversion over.

Top 20 Most Frequent Words in Titles

No prizes for guessing that the hot topic is using iPad technology in AAC. Your best bet for a 10-word title for next year’s conference is:

How your students use/access iPad AAC apps as assistive technology

This includes the top 10 of those top 20 words so your chances of getting accepted are high.

Conference Presentations: Content Words

The total word count for the session descriptions text is 2,532 different words (excluding the Stop List), which is a sizable number to play with. And when I say “different words,” I mean that I am basically counting any text string that is different from another as a “word.” So I count use, uses, used, and using as four words, and iPad and iPads as two. A more structured analysis would take such groups and count them as one “item” – or what we call a LEMMA. We’d then have a lemma of <USE> to represent all the different forms of use, which lets us treat use/used/uses/using as one “word” that changes its form depending on the environment in which it is sitting. [3]
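To make the string-versus-lemma difference concrete, here is a small sketch that counts the same sentence both ways. The form-to-lemma table and the sample sentence are invented for illustration; a real analysis would rely on a full lemma list, and this toy version ignores the part-of-speech complications entirely.

```python
from collections import Counter

# Hypothetical form-to-lemma table; a real analysis would use a full lemma list.
LEMMAS = {
    "use": "USE", "uses": "USE", "used": "USE", "using": "USE",
    "ipad": "IPAD", "ipads": "IPAD",
}


def count_strings(text):
    """Count every distinct text string as its own 'word'."""
    return Counter(text.lower().split())


def count_lemmas(text):
    """Count lemmas, falling back to the upper-cased string for unknown forms."""
    return Counter(LEMMAS.get(w, w.upper()) for w in text.lower().split())


sample = "students use iPads and using an iPad helps students who used apps"
strings = count_strings(sample)
lemmas = count_lemmas(sample)

print(len(strings), len(lemmas))      # 11 8
print(lemmas["USE"], lemmas["IPAD"])  # 3 2
```

Eleven different strings shrink to eight lemmas, which is the same squeeze that would happen, on a much bigger scale, to the session descriptions.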

Top 50 Words By Frequency in Session Content

A 2,300-word graphic would be rather large so I opted to illustrate the top 50 most frequently used words. As you can see, the top words seem to be the same as those in the titles, which suggests that on balance, presenters have done a good job overall in summarizing their presentation contents when creating their titles – something that is actually the strategy you should use.

Keywords in Content

Finally, let’s take a look at the keywords in the session content descriptions. Remember, the keywords are those that appear in a piece of text with a frequency much higher than you would expect in relation to the norm.

Top 20 words by Keyness score

Top of our list here are apps with the iPad coming in at three. Fortunately this fetish for technology is tempered by the inclusion in our top 20 of words like strategies, learn, how, and skills, all critical parts of developing success in AAC that are extra to the machinery. It’s good to think that folks are remembering that how we teach the use of tools is far, far more important than obsessing over the tools themselves.

[1] WordPress’s spell and grammar checker flagged the phrase “it’s not uncommon” as a double negative and told me that I should change it because, “Two negatives in a sentence cancel each other out. Sadly, this fact is not always obvious to your reader. Try rewriting your sentence to emphasize the positive.” Well, although I generally agree that you shouldn’t use no double negatives, the phrase “not uncommon” felt to me to be perfectly OK and not at all unusual. I therefore took a look at the Corpus of Contemporary American English and found that “it’s not uncommon” occurs 313 times while “it’s common” scores 392. This is as near to 50/50 as you get so I suggest to the nice people at WordPress that “it’s not uncommon” is actually quite common and thus quite acceptable – despite it being a technical double negative.

[2] For the curious among you, here are the contents of the Stop List I have been using, which is based on the top 50 most frequently used words in the British National Corpus (BNC): THE, OF, AND, TO, A, IN, THAT, IS, IT, FOR, WAS, ON, I, WITH, AS, BE, HE, YOU, AT, BY, ARE, THIS, HAVE, BUT, NOT, FROM, HAD, HIS, THEY, OR, WHICH, AN, SHE, WERE, HER, ONE, WE, THERE, ALL, BEEN, THEIR, IF, HAS, WILL, SO, NO, WOULD, WHAT, UP, CAN. This is pretty much the same as the top 50 for the Corpus of Contemporary American English, except that the latter includes the words about, do, and said instead of the BNC’s one, so, and their. Statistically, this isn’t significant so I suggest you don’t go losing any sleep over it.

[3] When you create and use lemmas, you also have to take into account that words can have multiple meanings and cross boundaries. In the example of use/used/uses/using, clearly we’re talking about a verb. But when we talk about a user and several users, we are now talking about nouns. So, we don’t have one lemma <USE> for use/used/user/users/uses/using but two lemmas <use(v)> and <use(n)> to mark this difference. It gets even more complicated when you have strings such as lights, which can be a verb in “He lights candles at Christmas” but a noun in “He turns on the lights when it’s dark.” When you do a corpus analysis of text strings, these sorts of things are a bugger!

The Dudes Dissect “Closing the Gap” 2013: Day 1 – Of Words and Workshops

Regular readers of the Speech Dudes will know that when the “Dudes Do…” a conference, Day 1 is typically all about the travel experience, usually including some unfavorable comments about taxi cabs and hotel coffee, but this time I’m feeling charitable and, although not yet ready to “Hug a Cabbie,” I’ve decided to provide an overview of the preconference sessions, which I didn’t attend.

Now, you may think that not having attended a workshop might put me at a bit of a disadvantage with regard to reporting on content and offering a critique – and you would be right. On the other hand, what I can comment on is the contents of the preconference brochure that everyone can have access to prior to the actual event and which they use to decide the workshops and sessions they want to attend.

So what you’re going to see is an example of corpus linguistics in action, dissecting the very words used to influence YOUR choices. In short, you’re about to learn about what words presenters and marketers use to make up your mind for you. Grab your coffee, hold on to your hats, and prepare to be amazed at what you didn’t know!


The Dudes are big believers in the scientific method and the application of evidence-based practice. We strive for some objectivity where possible, although we acknowledge that our occasional rants may be just a tad subjective. We don’t expect our readers to take everything we say as gospel, so sharing the methodology of how we analyzed our data seems fair.

The raw data came straight from the official conference brochure, available for anyone to check at http://www.closingthegap.com/media/pdfs/conference_brochure.pdf. From that I extracted all the text in the following categories:

  • Preconference Workshop Titles
  • Preconference Workshop Course Descriptions
  • Conference Session Titles
  • Conference Session Descriptions
  • Exhibitor Descriptions

Technically, I simply cut and pasted the text from the PDF and then converted everything to TXT format because that’s the format preferred by the analysis software I use.

WordSmith 6 is a wonderful piece of software that lets you chop up large collections of text and make comparisons against other pieces of text. These comparisons can then show you interesting and fascinating details about how those words are being used. I’ve talked in more detail about WordSmith in our post, The Dudes Do ISAAC 2012 – Of Corpora and Concordances, so take a look at that if you want more details.

Once I have the TXT files, I can create a Word List that gives me frequency data, but I also use a Stop List to filter out common words. If you simply take any large sample of text and count how often words are used, you’ll find that the top 200 end up being the same – that’s what we call Core Vocabulary. And when you’re looking for “interesting” words, you really want to get rid of core because it’s… well… uninteresting! Hence a Stop List to “stop” those words appearing.[1]
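If you want to try the Word List plus Stop List idea without any special software, a rough sketch looks like this. It is not how WordSmith works internally; the tiny stop list, the sample sentence, and the function name are all assumptions made purely for illustration.

```python
import re
from collections import Counter

# A handful of very common English words, standing in for a full Stop List.
STOP_LIST = {"the", "of", "and", "to", "a", "in", "for", "with"}


def word_list(text, stop_list=STOP_LIST):
    """Count lower-cased words, skipping anything on the stop list."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stop_list)


# Invented sample text, not taken from the conference brochure.
sample = "The use of the iPad in the classroom and the use of apps for students"
freq = word_list(sample)

print(freq.most_common(1))  # [('use', 2)]
print("the" in freq)        # False
```

Without the stop list, the most frequent word in that sample would be the, which tells you nothing about what the text is actually about.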

Preconference Workshop Titles

The first opportunity you have to encourage folks to come to your session is to have a title that makes a reader want to find out more about what you have to offer. The title is, in fact, the door to your following content description. Of course, you have to find some balance between “catchy” and “accurate.” For example, a paper I presented at a RESNA (Rehabilitation Engineering and Assistive Technology Society of North America) conference entitled Semantic Compaction in the Dynamic Environment: Iconic Algebra as an Explanatory Model for the Underlying Process was, in all fairness, technically accurate, but from a marketing perspective it had all the appeal of a dog turd on crepe. [2]

Let’s therefore take a look at what seem to be the best words to use if you want to attract a crowd.

Pre-conference Sessions: Keyword in Titles

High frequency words in Pre-conference titles

The Word Cloud here counts only words that appeared twice or more, and the size of the words is directly proportional to frequency, so it’s clear that students is a critical word to use, followed closely by iPad, technology, learning, and communication. On that basis, if you’re planning to submit a paper for 2014, here’s your best “10-word-title” bet for getting (a) accepted and (b) a crowd:

The implementation of iPad technology for learning and communication

In the event that the CTG review committee find themselves looking at multiple courses submitted with the same title, you’re going to have to consider how you describe your actual course contents – and luckily, we can help there, too!

Preconference Sessions: Keywords in Course Content

The actual highest frequency words were workshop and participants, which is something of an artificial construct because most people include phrases such as “in this workshop, participants will…” and so I removed these from my keyword analysis.

Frequent words in preconference sessions content

So to further enhance the pulling power of your course, you need to be talking a lot about students, how they use iPads and communication, along with using apps to learn, enhance learning, and any strategies that help meet needs. In fact, you need to include any of these Top Ten words:

Top Ten keywords in Pre-conference session content

But wait, wait… there’s more

I’ve been using the word keywords to refer to those words that appear within a piece of text more frequently than you would expect based on comparing them to a large normative sample. If you perform a keyword analysis on the preconference contents sample, you find that the top five keywords that appear are iPad, iPads, AAC, apps, and students. This suggests that we do an awful lot of talking about one, very specific brand name device – which is good news for the marketing department at Apple!

Top 15 words by Keyness score

The relevant score is the keyness value. The higher the keyness, the more “key” the word is i.e. its frequency in the sample is significantly higher than you would expect to see in the normal population. So when you look at the table above, you’re not just seeing frequency scores but how significantly important words are. [3] As an example, the word iPads is used less frequently than the word communication (10 times as against 16) but iPads is almost twice as “key” as communication i.e. it is significantly more important.
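For the curious, here is a sketch of one widely used way to compute a log-likelihood keyness score from four numbers: how often a word occurs in your sample, how big the sample is, and the same two figures for the reference corpus. I am not claiming this is exactly what WordSmith calculates, and the frequencies in the example are made up to show the effect rather than taken from the brochure or any real corpus.

```python
import math


def log_likelihood(freq_sample, size_sample, freq_ref, size_ref):
    """Log-likelihood keyness of a word: sample corpus versus reference corpus."""
    total = size_sample + size_ref
    combined = freq_sample + freq_ref
    # Frequencies we would expect if the word were equally common in both.
    expected_sample = size_sample * combined / total
    expected_ref = size_ref * combined / total
    ll = 0.0
    if freq_sample > 0:
        ll += freq_sample * math.log(freq_sample / expected_sample)
    if freq_ref > 0:
        ll += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * ll


# Made-up numbers: a 2,000-word sample against a 1,000,000-word reference.
rare_elsewhere = log_likelihood(10, 2_000, 50, 1_000_000)
common_elsewhere = log_likelihood(16, 2_000, 5_000, 1_000_000)

print(rare_elsewhere > common_elsewhere)  # True
```

The word used only 10 times outscores the one used 16 times because it is so much rarer in the reference corpus, which is the same pattern as iPads versus communication. Note that the score on its own does not say whether a word is overused or underused; you have to compare the relative frequencies for that.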

Now, as a final thought for folks who are working in the field of AAC (augmentative and alternative communication), I suggest that if you are developing vocabulary sets for client groups, using frequency studies is certainly a good start (and more scientific than the tragically common practice of picking the words “someone” thinks are needed) but if you then introduce a keyness analysis, you can improve the effectiveness of your vocabulary selection.

[1] In truth, there is more I could say about the methodology, and were this intended to be a peer-reviewed article for a prestigious journal, rest assured I’d go into much more detail about some of the finer points. However, this is simply a blog post designed to educate and entertain, so I ask you to allow me some leeway with regard to precision. I’m happy to share the raw data with folks who want to see it but all I ask is you don’t toss it around willy-nilly.

[2] Not only did it have a title that included the word “algebra” but it was scheduled for 8:00 am on the final day (a Saturday, no less) of the conference. Surprisingly, people showed up – which says more about the sort of folks who attend RESNA conferences rather than anything about my “pulling power” as a presenter.

[3] There is a mathematical formula for the calculation of keyness values. One way is to use the Chi-Square statistic; the other is to use a Log-likelihood score, which is something like a Chi-Square on steroids. As I’ve often said, I didn’t become an SLP because of my ability to handle math and statistics, so I admit to finding these things a strain on my brain. However, for the non-statistically inclined among us, the point is that both these measures simply compare the frequency value of a word from an experimental sample against the frequency value it has in a very large comparative sample (such as the British National Corpus or the Corpus of Contemporary American English), and then show you how similar or dissimilar they are. If their frequencies are very, very dissimilar, the word from the experimental sample is a keyword – like iPad and AAC in the examples above. Now feel free to pour yourself a drink and let your brain relax.