
“‘Scuse me while I kiss this guy!”

Watching the French Open Men’s final last week, I was primarily focused on the action on court between Novak Djokovic and Stefanos Tsitsipas and only half-listening to the commentary. As we all know, “half-listening” can be a very dangerous thing – or on the upside, pretty funny. As the players headed off for a quick rub down with a cold towel, I swear I heard one of the commentators say, “Djokovic was definitely hit in the balls.”

New balls please

This comment instantly turned my semi-attention to full-blown concentration, and I wondered if I’d missed some action on the screen. After all, being smacked in the scrotum by a ball traveling at somewhere in the region of 85-95 miles-an-hour isn’t something you’d fail to notice. At that sort of speed, your meat-and-two-veg are likely to swell up to the size of a melon and a vigorous massage from your physio isn’t going to offer any help whatsoever. So did I really miss one humdinger of a nut nobbling?

It took me maybe a minute to work out that what was actually said was, “Djokovic was definitely hitting the balls.” Clearly I had fallen victim to a spectacular mishearing, or what is commonly called a mondegreen. Given that in spoken language, many of the individual sounds at the beginning and end of words will blend and overlap with others, the phrases “hitting the balls” and “hit in the balls” can be identical.

The process at work here is called phonetic assimilation and it happens all the time. And I mean, all the time. What actually reaches our ears is often different from what we think we hear. In rolling speech, “hit in the balls” will be /’hɪtɪnðəˌbɔlz/ and “hitting the balls” will also be /’hɪtɪnðəˌbɔlz/. That /ŋ/ sound in “hitting” (/’hɪtɪŋ/) becomes an /n/ because it assimilates with the preceding /t/ sound, which is made with the tongue against the ridge just behind the upper teeth (what we call an alveolar sound), and also with the following /ð/ sound, which is made with the tongue between the teeth (an interdental). The /ŋ/ sound is made at the back of the mouth up towards the soft palate (a velar location) but the process of assimilation in connected speech pulls it forward to become a regular old /n/ sound.

Assimilation is the basis for mondegreens, phrases that are misheard, often to humorous effect but sometimes to the detriment of the listener. The word itself comes from a mishearing of a poem called The Bonny Earl of Murray, which contains the line “They have slain the Earl o’Moray and layd him on the green,” and that in turn was interpreted as, “They have slain the Earl o’Moray and Lady Mondegreen.” It’s a plausible mistake because it’s not impossible that a double murder could have occurred with the Earl and his Lady being the hapless victims. But if we consider the phonetic forms, we can easily see how “laid him on the green” and “Lady Mondegreen” can sound alike:

  • Laid him on the green: /’laɪdɪm’ɔnðəˌgrin/
  • Lady Mondegreen: /’laɪdɪ’mɔndəˌgrin/

The stress in the second syllable changes and the /ð/ becomes a /d/ in the second instance, but what we might consider to be clear, phonetic differences when written down can be much less clear when spoken and heard. Also, the words in spoken language are not separated by spaces like they are in text – they simply all run together into one long stream of sound. It’s the listener that has to decide where the “spaces” are, and it’s not uncommon to get that wrong [1].
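For the programmers in the audience, the "where do the spaces go?" problem can be sketched in a few lines of Python. This is only a toy illustration: the tiny pronunciation lexicon is made up for the example (it borrows the rolling-speech transcription from above, with the stress marks stripped), and the function name `segmentations` is my own invention.

```python
# Toy lexicon (an assumption for illustration): sound string -> written word.
LEXICON = {
    "hɪt": "hit",
    "ɪn": "in",
    "hɪtɪn": "hitting",  # "hitting" once the final /ŋ/ has become /n/
    "ðə": "the",
    "bɔlz": "balls",
}


def segmentations(sounds, lexicon=LEXICON):
    """Return every way to split an unbroken sound stream into lexicon words."""
    if not sounds:
        return [[]]  # exactly one way to segment nothing: no words at all
    results = []
    for end in range(1, len(sounds) + 1):
        chunk = sounds[:end]
        if chunk in lexicon:
            # Keep this word and segment whatever is left of the stream.
            for rest in segmentations(sounds[end:], lexicon):
                results.append([lexicon[chunk]] + rest)
    return results


for words in segmentations("hɪtɪnðəbɔlz"):
    print(" ".join(words))
```

With this lexicon the single stream of sound comes back with two equally valid readings, "hit in the balls" and "hitting the balls," and nothing in the sounds themselves says which one the speaker meant.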

The internet is full of examples of misheard lyrics for songs that have become almost classic examples of mondegreens. In Jimi Hendrix’s Purple Haze, “‘Scuse me while I kiss the sky” gets heard as “‘Scuse me while I kiss this guy”; George Michael’s I Want Your Sex includes “Sex is natural, sex is good, not everybody does it, but everybody should,” which turns into, “Sex is natural, sex is good, not everybody does it, not everybody should”; Bon Jovi’s Livin’ on a Prayer has “Doesn’t make a difference if we make it or not” that sounds like “Doesn’t make a difference if we’re naked or not”; and even I thought for years that when Dobie Gray sang Drift Away, the chorus was “Gimme the Beach Boys and free my soul…” instead of “Gimme the beat boys and free my soul…”

In his classic book The Psychopathology of Everyday Life, Freud called such errors “Verhören,” which is German for mishearing [2]. His focus wasn’t on the underlying linguistic errors based on the misinterpretation of assimilation processes but on the words actually misheard and what that might say about the individual’s underlying psyche. What he’d make of a patient hearing “Scuse me while I kiss this guy” hardly takes six years of Psychoanalytical training to work out but if you’re hearing Abba sing “See that girl, watch her scream, kicking the dancing queen” or Queen sing, “You got mud on your face, big disgrace, kicking your cat all over the place,” then maybe Dr. Freud needs paging after all!

Notes

[1] Although I’m no fan of Donald Trump and take pleasure in hearing and seeing his gaffes, it would be uncharitable of me not to come to his defense in relation to his much ridiculed use of the word “bigly.” What really happened is that he said the phrase “big league” with a weaker stress on the second element “league,” which was in turn misheard by reporters as “bigly.” If you’re not convinced, just try yourself to say “big league” but let that final /g/ sound weaken slightly and you’ll hear that it easily becomes “bigly.” From /’bɪglig/ to /’bɪgli/ is a dropped final consonant away, a process that is stunningly common in connected speech. And as a bonus, I can tell you that “bigly” is a real word that dates from the 15th century and, as an adverb, means “With great force, firmly, violently; or loudly, boastfully, proudly, haughtily, pompously.” As an adjective, it means “habitable, fit to live in,” and was used in Scotland until it became obsolete in the early 19th century.

[2] Freud introduced a number of such errors, or what he labeled parapraxes (German Fehlleistungen), that begin with “ver-”: Versprechen (slips of the tongue); Verschreiben (slips of the pen); Verlesen (misreading); Verhören (mishearing); Vergessen (forgetting of names or intentions); and Verlegen (the misplacing of things). When folks talk about someone making a “Freudian Slip” it’s usually a spoken error – a Versprechen – but as you can see, Freud had a much more extensive classification set of errors. Many of these errors are nowadays analyzed as linguistic phenomena rather than psychoanalytical – but it’s fun to try!

As a side note to the side note, if you intend reading just one of Freud’s books in your life, although The Interpretation of Dreams is typically considered his magnum opus, I highly recommend The Psychopathology of Everyday Life as an alternative. Along with Jokes and Their Relation to the Unconscious, it builds on many of the themes that he wrote about in The Interpretation of Dreams and applies them to non-pathological areas of life.

Peppa Pig: Go Ahead and Let Your Kids Watch!

One of the special things about having grandchildren is that when you’ve had enough of them, you can give ’em back to their parents. There’s a certain amount of schadenfreude to be reveled in with this, particularly if you had some challenges bringing up your kids in the first place. Although I don’t actually gloat, I can’t help but feel a frisson of pleasure when my darling daughter tells me she’s had a sleepless night because her 3-year-old got up at 3:00 AM and began running round the house, and her 7-year-old had a tantrum before going to school. I simply nod sagely and say, “Yes, it’s rough, isn’t it.” Bad Daddy!

So while she and her husband get all the pain and anguish of living and working with two young kids (and we all know it doesn’t get any easier as they age!) I get to have fun time with them because (a) they only get me in small doses and (b) I can spoil them rotten [1].

Of course, this doesn’t give you free rein to allow total anarchy and hedonistic behavior so you have to at least rationalize your choices when it comes to letting your offspring decide what they want to do. Which brings me to Peppa Pig.

For those unfamiliar with this delightful British cartoon character, Peppa lives with her mummy and daddy and little brother George, who apparently has an expressive language disorder that no-one is in the least bit worried about. His only two utterances appear to be “dinosaur” and “Rrraarrrgggghhhh!” neither of which is core vocabulary, and which together represent only two grammatical classes; noun and interjection. Sure he’s only a toddler pig but come on, his motor skills suggest he’s at least 24 to 30 months, so I’d expect him to have a much larger lexicon!

Language disorder aside, Peppa has an extended family in the form of Grandpa and Granny Pig, who appear to be pretty well off considering they have a boat, which is not as common in the UK as in the US [2]. Then she has an extensive network of imaginatively named friends such as Suzy Sheep, Rebecca Rabbit, Zoe Zebra, Emily Elephant, and Delphine Donkey. It seems that initial consonant alliteration is a critical feature of animal nomenclature [3]! But it’s actually a very good way to develop phonological awareness skills. According to Reese, Robertson, Divers, and Schaughency (2015):

…parents who play rhyming or alliteration games with their children, who sing rhyming songs more often with their children, or who engage in other types of wordplay (e.g., tongue twisters), may be fostering their children’s phonological awareness. (p.57).

Wittingly or unwittingly, the writers for Peppa Pig have built in some cute, subtle ways of providing viewers with phonemic cues that can help in speech sound development. And as Reese et al. also point out, “Children’s phonological awareness develops rapidly in the preschool years and is an important contributor to later reading skill” (p.54). Clinicians and educators are usually much more aware of this. Thatcher (2010) points out that:

Children gain important information about rhyme and alliteration from learning poems and rhymes in which the prosodic features of the poem stress the shared sounds in the word. The profession of speech pathology must take possession of this area of early intervention… (p.476).

But wait, wait – there’s more! The didactic properties of Peppa Pig don’t just end with phonology. For the purpose of analyzing the vocabulary content of the show, I obtained a written set of transcripts from the complete first season [4] and ran the data through WordSmith 7, my trusty corpus linguistics software tool of choice. With this, I’m able to compare the frequency of use of words from the Peppa Pig sample with any other list that I choose. What I wanted to do was get an idea of how “core” the vocabulary in Peppa Pig is, and by “core” I mean how much of the entire vocabulary used is made up of high frequency words used by many people of many ages across different situations [5].

Being the author of Unity 84, a language program available in Prentke Romich devices, I chose the vocabulary associated with that as my core comparison. This is simply because it’s a set based on data from a number of core vocabulary studies and includes hundreds of low frequency nouns, which offer a little balance to a pure core list that would be weak in such words. But so long as I use the same core to make comparisons against other samples, the resulting “Core Scores” will be comparable [6].

So here’s how Peppa Pig fares in the “Core Score” arena.

[Image: Core Score for Peppa Pig]

What this means is that I counted ALL the instances of where core words were used in Season One, then counted all the instances of fringe words, and generated a simple percentage. So if someone is watching Peppa Pig, almost 83% of all the words they hear will be core words. I therefore give Peppa Pig a “Core Score” rating of 83.
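For anyone who wants to try the arithmetic at home, here is a minimal sketch of that calculation in Python. It is not what WordSmith does under the hood; the function name `core_score`, the crude tokenizer, and the tiny stand-in core list are all assumptions made purely for illustration.

```python
import re


def core_score(text, core_words):
    """Percentage of word tokens in `text` that appear in `core_words`."""
    # Crude tokenizer (an assumption): runs of letters/apostrophes are "words".
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    # Count every instance (token) of a core word, not just distinct words.
    core_hits = sum(1 for token in tokens if token in core_words)
    return 100.0 * core_hits / len(tokens)


# Hypothetical stand-ins: a tiny "core" list and one short sample line.
CORE = {"i", "you", "it", "is", "a", "the", "like", "want", "to", "in", "my"}
sample = "I like to jump in muddy puddles"
print(round(core_score(sample, CORE)))
```

In the sample line, four of the seven tokens (I, like, to, in) are on the stand-in core list and the other three (jump, muddy, puddles) count as fringe, so the score comes out at 57. Swap in a real core list and a full transcript and the same sum gives you a "Core Score."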

It’s great to be able to toss out a number and say “Hey, this TV show is an 83” but that’s not tremendously useful unless there are comparisons. So I found a transcript for an episode of another of my favorite cartoon shows; SpongeBob SquarePants. And here’s how he did:

[Image: SpongeBob SquarePants Core Score]

As you can see, SpongeBob gets a “Core Score” of 75, which tells me that my clients would be better off watching Peppa than SpongeBob if I want them to hear more core words. And in general, I would. After all, if I want to encourage clients to use more core words, putting them in situations where they hear lots of models of how those words are used is a solid goal.

Just out of curiosity, I applied the same analysis to three common, popular children’s books; Where the Wild Things Are, Goodnight Moon, and The Very Hungry Caterpillar. Here’s what I found:

[Image: Books Core Scores]

All of the preceding is not peer-reviewed research. It’s not even close. In fact, I’d even be hesitant to call it a “pilot study.” In the world of Business, it’s what we call a “Proof of Concept” – where you test out a few ideas so as to demonstrate that what you’re thinking about is something on which someone would be prepared to spend money [7]. But if you were to use it to argue the merits of suggesting that watching Peppa Pig is not a bad thing, then I think the data supports your decision!

References

Reese, E., Robertson, S.-J., Divers, S., & Schaughency, E. (2015). Does the brown banana have a beak? Preschool children’s phonological awareness as a function of parents’ talk about speech sounds. First Language, 35(1), 54-67.

Thatcher, K. L. (2010). The development of phonological awareness with specific language-impaired and typical children. Psychology in the Schools, 47(5), 467-480.

Notes
[1] It’s right there as number one in the Grandparent Commandments; “Thou shalt bestow upon thy grand offspring anything and everything they desire, and in the event that this is not possible, thou shalt feel perfectly OK with saying, ‘Oh sweetheart, that’s something to ask mommy and daddy.’”

[2] My older daughter and her husband have a boat on which my wife and I have spent some happy hours letting them do all the work of dragging it to a lake, dropping it in the water, steering it to the nearest lakeside bar, and paying the cost of repairs, maintenance, and storage required so that we can enjoy those 5 days in summer when the nautical life is the thing to embrace. Like having grandkids, having another family member own a boat means you can have all the pleasure but none of the responsibility.

[3] As further evidence that Peppa’s younger brother has a problem, note that he is one of the only characters who does NOT have an alliterative name – he is “George Pig” as opposed to, say, Peter Pig or Paul Pig, or even Patrick Pig. So not only has he a more complex name structure to deal with than all the other animals, but he also has that initial “djuh” sound /d͡ʒ/ to struggle against. Poor George!

[4] My source is at “Glamour and Discourse”: Peppa Pig transcripts Season One. In the spirit of transparency, you’re free to use the same data and run your own analyses to see if they match with mine. I think they will but in a world driven by President Donald Trump’s “alternative facts” who’s to know?

[5] New visitors to this blog who are unfamiliar with the notion of what we refer to as a “core” vocabulary set in the field of augmentative and alternative communication (AAC) might like to check out the following posts:
Of Puck and Patois
Of Corpora and Concordances
The Monteverde Invincia Stylus Fountain Pen – and Keyword Vocabulary

[6] At a more technical level, the Unity core list is an unlemmatized list that consists of “words” that are defined as “a string of letters terminating in a space or punctuation mark.” So the words eat, eats, and eating are counted as three distinct words, even though they are really just variations of the one lemma, <EAT>. A critical question in deciding on what constitutes a “core” list is whether it should include only root words such as eat and drink but not eating and drinking, or whether it should have all forms of a word in there. If you use a core that has eat but not eats, then any TV show or book that uses the word eats would not have that token counted towards a “core score” – but shouldn’t it? I’m open to suggestions, folks!

[7] I intend to test out a few more core lists in order to play with the Core Score idea a little more.

The Contronymic Properties of Shit

Some years ago I posted a piece called Shitosophy: A Philosophy for the Existentially Lost, which relied heavily on the use of the word shit and its synonyms to make a point. This time around, I’m using shit again to introduce – or reacquaint – readers to the concept of the contronym. You may not have heard the word contronym before but you will have come across examples of it.


A contronym is a word that can be used in two ways, each meaning exactly the opposite of the other. The classic example is cleave. On the one hand, it’s used to mean “to join or stick together” as in “I was so dry my tongue cleaved to the roof of my mouth,” and on the other it’s used to mean “to split apart” as in “The hatchet cleaved his head in two.” [1]

Other contronyms include sanction (meaning both “to permit” and “to ban”), strike (meaning both “to hit” and, particularly in baseball, “to miss”), fast (meaning “moving quickly” and “stuck immovably”), and peruse (both “to look over quickly” and “to look at in great detail”).

So where does shit come into this? Well, below is an image shared with me last week from Facebook that is ostensibly one of those “kids, look at what they come up with” pieces:

[Image: “We have shit”]

To be fair to the kid, it’s not wrong! And what’s more interesting is that there are two meanings to the sentence that can only be disambiguated by changing which word is being stressed.

If the stress comes down on the verb have, as in “We HAVE shit,” then that means “we have something.” In this case, shit is used as a mass noun meaning “stuff” or “something.” However, if the stress comes down on the noun, as in “We have SHIT,” this means “we have nothing” or even “we ain’t got diddly squat.” Here the word shit means “nothing” or an absence of something.

What we’re seeing here is the word shit being used contronymically as it can mean both something and nothing. Of course, shit has many other meanings and so isn’t solely a contronym but the example above demonstrates its contronymic aspect. The Oxford English Dictionary has multiple entries for shit as a noun, adjective, verb, and interjection, along with a list of phrases that includes shit as an essential component. It’s clearly a very flexible word (as is the case with a number of profanities) and very, very old.

There’s another contronymic example of shit that depends on whether it is used along with the indefinite or definite article [2]. Consider the sentences below:

  1. You are the shit.
  2. You are a shit.

In the first instance, shit means something that is good and desirable but in the second it means something bad and undesirable. You’d be happy if you were THE shit but not if you were A shit.

Both cases serve to illustrate how a word’s meaning can be changed dramatically by minimal effort. In the first, it’s stress that determines meaning, and in the second it’s the definite/indefinite article that does it.

So now you know some new shit!

Notes
[1] Fans of the tremendously entertaining Game of Thrones on HBO can now go back and re-watch the series to count the number of examples of cleaving that take place on a regular basis. From the cleaving of Cersei and Jaime Lannister in an incestuous rendition of “the beast with two backs” to the cleaving of Gregor Clegane’s horse’s head from its body. And as a final piece of cleaving trivia, when Cersei Lannister, played by Lena Headey, did her naked “Walk of Shame,” she actually used a body double actress by the name of Rebecca Van Cleave, which involved the photo-shopped cleaving of Lena’s head onto Rebecca’s body. And who said linguistics was boring!

[2] In the field of Augmentative and Alternative Communication (AAC) words like the, a, and an are often lumped together with words such as at, in, be, and is and called “Little Words.” As I’ve whined about before (Stop with the Little Words Grab-bag) this category is non-linguistic and based purely on the number of letters used in a word. But in the case of “the shit” versus “a shit,” we can clearly see that teaching a versus the is essential because the wrong choice can significantly change the intended meaning of a phrase or sentence. So the a/an/the distinction has to be treated as much more than just “little words.”

My Hovercraft Is Full of Eels and My AAC Device Full of N-grams

[Image: A hovercraft with or without eels]

Back in my college days – that’s 1977 to 1982 for those who like historical perspective – a friend of mine was taking East European studies with a view, I think, to improving his chances of joining the Socialist Workers Party. Although it wasn’t actually obligatory to speak any of the languages from Communist Europe, he clearly felt it might help. And come the day of the Glorious Revolution, when the Working Class of England would cast off their Capitalist shackles and take control of the means of production to become part of the global Socialist world, he’d be one of the intellectual elite who would help the under-educated proletariat rise to power. Sadly for him, the down-trodden workers decided to vote for Margaret Thatcher and usher in a new age of Capitalism where owning the means of production meant buying shares in British Telecom, British Aerospace, British Gas, and a host of companies that they already owned as tax payers! This was Mrs. T’s version of Clause IV socialism [1].

And that is why I happened upon a Czechoslovakian phrasebook.

I have to admit that my brief flirtation with radical socialism was fueled at that time by the fact that the local Labour club served subsidized beer, and another of my friends who worked behind the campus bar would serve Russian vodka as doubles or triples while still charging for a single. Hardly a rock upon which to build a firmly held political perspective but unlike my socialist buddy, I wasn’t at college to change the world – I was there to get a degree in Psychology and Linguistics so I could become a professor with a job for life [2].

Like many foreign language phrasebooks, it contained many “useful sentences” that one could simply trot out in the appropriate situation. Although I no longer have the book itself – and can’t for the life of me remember the title – I did keep the following short list of examples:

Can this be invisibly mended?
I have broken this denture.
How high is that mountain?
The clutch engages too quickly.
To whom does this concrete sports pavilion belong?

The last of these, if memory serves me correctly, had an example answer along the lines of “It belongs to the people of the glorious Czech Republic.”

There’s actually a name to describe these types of sentences; postilion sentences. This was coined by the UK linguist David Crystal in a 1995 article [3] where he talked about sentences used in teaching English as a Second Language:

A postilion sentence is one which has little or no chance of ever being useful in real-life. It could be used, obviously, because it is grammatically well-formed; but the contexts in which it would be natural to use it are either so restricted or so adult that the chances of a child encountering it, or finding it necessary to use it, are remote. In short, it is uncommunicative. It conveys a structural meaning, and a lexical content, but that is all.

Crystal refers to a sentence from an early 20th century Hungarian-English phrasebook that went “The postilion has been struck by lightning.” It’s not perhaps a coincidence that the British Monty Python’s Flying Circus comedy group came up with a skit called “The Dirty Hungarian Phrasebook,” where Hungarian phrases were translated into obscene, or simply ridiculous, English phrases, one of which has taken on a life of its own; “My hovercraft is full of eels.” It has become such a popular example of a postilion sentence that the linguists at the Omniglot website have devoted a page to providing translations in over 130 languages from Afrikaans (“My skeertuig is vol palings”) to Zulu (“Umkhumbi wami ugcwele ngenyoka zemanzini”). So should you ever find yourself needing to explain the fishy condition of your water-skimming vehicle while vacationing in Iceland (“Svifnökkvinn minn er fullur af álum”) remember to bookmark that page!

[Image: The royal postilion (a postilion on the Queen's carriage)]

In fairness to phrasebook creators, creating lists and lists of sentences can appear to be a reasonable goal. After all, should you find yourself in the middle of a crowded street in some foreign land with the need to scream “That organ grinder’s monkey has stolen my wallet,” having it written down at the tip of your fingers would clearly be of benefit [4]. Similarly if you’re out on a dark and stormy night in Transylvania and your postilion does indeed suffer a lightning-related injury, you’d also be covered (“A légpárnás hajóm tele van angolnákkal”).

The limitation – and it’s a pretty big one – is that it is impossible to predict all the sentences that a traveler could potentially need. The best you can do is create a selection of fairly generic sentences that can be used across situations, such as “I like that” or “That’s not what I wanted” or “Excuse me but I need some help.” Now, if you put your lexical and statistical hats on, ask yourself why “That’s not what I wanted” seems like a better choice than “My hovercraft is full of eels.” If you said it’s because the former seems to be a more probable sentence than the latter, then you’re definitely on the right track. When you consider strings of words, one way of analyzing them is in terms of frequency of use, and words like that, not, want, what, my, and is, are far more frequently used than hovercraft, eels, monkey, and postilion [5].

If you wanted to perform a simple test at a bar, a much underrated and underused experimental venue, write down the following cloze sentence [6] and ask as many folks as possible to fill in the blanks:

My <blank> is full of <blank>

In truth, I have no idea what you’ll get in response, although glass and beer may well score higher than most nouns, but the chances you’ll get hovercraft and eels are very low. What I can predict is that the missing words will be nouns because when you look at sub-strings of words, the inherent rules of how the English language behaves start to bias our choices. In computational and corpus linguistics, folks talk about such strings as n-grams, where n is the number of words in the string.

The n-gram [my <blank> is] is a trigram, and the words my and is limit the words that could fit into the blank. In fact, if we look at the bigram of [my <blank>], even that excludes certain choices. This is because when we use a possessive adjective such as my, the probability is that the word to follow will be a noun. If it’s the trigram of [my <blank> is], that probability actually goes up. For example, we can find examples of the bigram [my <blank>] as follows:

[My dog]. (my + NOUN)
[My old] dog. (my + ADJECTIVE)
[My very] old dog. (my + ADVERB)

As you see, the blank in the bigram [my <blank>] doesn’t have to be a noun if it’s part of a longer string. But if we contrast that with the trigram [my <blank> is] then we are much more limited:

[My dog is] hungry. (my + NOUN + is)
[My *old is] hungry.
[My *very is] hungry.
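If you'd rather see the same contrast fall out of some data, here is a small Python sketch. The mini corpus is made up purely for illustration, and `ngrams` is a hypothetical helper rather than a function from any particular corpus tool.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Made-up mini corpus (an assumption, not real data).
text = ("my dog is hungry . my old dog is asleep . "
        "my very old dog is here . my cat is out")
tokens = text.split()

# Words that fill the blank in the bigram [my <blank>].
after_my = Counter(b for a, b in ngrams(tokens, 2) if a == "my")

# Words that fill the blank in the trigram [my <blank> is].
between = Counter(b for a, b, c in ngrams(tokens, 3) if a == "my" and c == "is")

print(sorted(after_my))  # a noun, an adjective, and an adverb all show up
print(sorted(between))   # only the nouns survive
```

In this toy corpus the bigram slot is filled by cat, dog, old, and very, while the trigram slot is filled only by cat and dog. Adding one more word of context is enough to squeeze out the adjective and the adverb.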

For those of us who work in the field of augmentative and alternative communication (AAC) we’re actually more familiar with the science of n-grams than we might have realized because this is essentially how word prediction works. Outside of AAC, anyone who uses a mobile phone will have seen next-word prediction and not necessarily worked out that it’s based on algorithms that use n-grams to estimate the most likely next word.
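To make that a little more concrete, here is a bare-bones next-word predictor in Python. It is a sketch of the general bigram idea only, not the algorithm used in any particular phone or AAC device; the training text is made up and the names `train_bigrams` and `predict_next` are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigrams(tokens):
    """Map each word to a Counter of the words seen immediately after it."""
    following = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        following[current][nxt] += 1
    return following


def predict_next(model, word, k=3):
    """Return up to k of the most frequent followers of `word`."""
    if word not in model:
        return []  # never seen this word, so there is nothing to suggest
    return [w for w, _ in model[word].most_common(k)]


# Made-up training text (an assumption for illustration).
corpus = ("i want juice . i want to go . i like it . "
          "do you want to play . i want to eat").split()
model = train_bigrams(corpus)
print(predict_next(model, "want"))
```

Here "want" was followed by "to" three times and by "juice" once, so the suggestions come back in that order. Real systems use far more data and usually longer n-grams, but the principle is the same: offer the words that have most often come next.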

Of course, probabilities are simply that; probabilities. Given a word or n-gram as a starting point, we can make good guesses as to what word or words may come next but you can never be 100% sure. A number of AAC vocabulary sets have a feature whereby if you bring up the n-gram [SUBJECT PRONOUN + TO BE] a selection of verbs appear that are all in the progressive form i.e. VERB+ING. This is based on the thinking that whenever you say something like “I am…” or “he is…” or “we are…” any following verb is likely to be along the lines of eating, drinking, running, finishing etc. But that’s a probability only – I might want to say “I am finished” or “He is done” or even “I am really thinking about…” or “we are certainly not wanting…” where the verb is actually in the ED form or there are other words (typically adverbial) before the following verb. If I want to say “I am doing something” then having doing appear automatically after “I am” can save keystrokes; but if I want to say “I am done,” I have to delete the word doing then find done as a word on its own, which adds keystrokes and takes more time.

Designing AAC systems to take advantage of n-grams is not a bad idea. Back in the 1990s when I was working with the team that developed the Unity symbol-based language program for devices built by the Prentke Romich Company, we included a number of bigrams and trigrams based on the thinking that phrases such as “I like” and “do you want” or “she doesn’t feel” have frequencies that are comparable to individual words and actually much higher than the vast majority of nouns. At the time, we didn’t have the resources to check the figures but nowadays it’s pretty easy to do that with online corpora. A phrase such as “do you want” has a frequency score of 11126 in the Corpus of Contemporary American English (COCA), which is way above words like postilion (9), hovercraft (109), eels (464), and even lightning (6724). Another example is “I don’t like,” which comes in at 5282 but when you look for “I don’t like <blank>,” the frequencies drop dramatically:

I don’t like (5282)
I don’t like it (682)
I don’t like this (211)
I don’t like being (128)

What you see is that in general, as the length of the string increases, the frequency drops, to the point that “I don’t like eels” and “I don’t like hovercrafts” score a big fat zero. It’s only those bigrams and trigrams that seem to have frequencies that make them practical within an AAC vocabulary set.

You can now probably work out why sentence-based AAC systems are not only impossible to design but unlikely to be of use. Sentences are in effect simply n-grams with a large n value. “My hovercraft is full of eels” and “My postilion has been struck by lightning” are a 6-gram and a 7-gram respectively, and because probabilities multiply (the probability of the whole string is, roughly, the product of the probabilities of each word) you can imagine how stunningly low the frequencies can be for sentences. Word-based systems, supplemented with high-frequency bigrams and trigrams, provide access to vocabulary sets that are flexible and practical. Having the individual words eels, full, hovercraft, is, my, of as building blocks from which to construct novel sentences turns out to be much better than having thousands upon thousands of prefab sentences stored “just in case.”
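Strictly speaking, the probabilities of the words in a string multiply rather than add, which is exactly why long strings get rare so quickly. The Python sketch below shows the effect under a deliberately naive assumption that each word is chosen independently; the per-word probabilities are invented for the example and are not corpus figures.

```python
import math

# Invented per-word probabilities (assumptions, not COCA data).
WORD_PROB = {
    "my": 0.01,
    "is": 0.02,
    "full": 0.001,
    "of": 0.03,
    "hovercraft": 0.000001,
    "eels": 0.000005,
}


def sentence_probability(words, word_prob=WORD_PROB):
    """Naive estimate: multiply the probabilities of the individual words."""
    return math.prod(word_prob[w] for w in words)


sentence = ["my", "hovercraft", "is", "full", "of", "eels"]

# Watch the estimate shrink as the string grows one word at a time.
for length in range(1, len(sentence) + 1):
    prefix = sentence[:length]
    print(" ".join(prefix), sentence_probability(prefix))
```

Every extra word multiplies the running total by a number smaller than one, so the estimate can only go down as the string gets longer, and a rare word like hovercraft drags it down by several orders of magnitude in a single step.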

Notes
[1] The phrase “Clause Four Socialism” came from the fourth clause in the UK Labour Party constitution of 1918, which read; “To secure for the workers by hand or by brain the full fruits of their industry and the most equitable distribution thereof that may be possible upon the basis of the common ownership of the means of production, distribution and exchange, and the best obtainable system of popular administration and control of each industry or service.” Although it sounds like it was written by a lawyer and has more embedded clauses than a convention of Santas, it formed the basis for the Socialist ticket of Britain in the 1970s, where the country came as close to being a satellite of the USSR as it had ever been.

[2] That turned out to be yet another dream unfulfilled with my life taking a very different path that kept me well out of the world of academia. But you’re not here to read about me so go back to the article and keep reading 😉

[3] Crystal, D. (1995). Postilion sentences. Child Language Teaching and Therapy, 11(1), 79-90.

[4] Technophiles will point out that the better way to do this is to shout “That organ grinder’s monkey has stolen my wallet” into their smart phone with translation facilities. That may be true but even machine translation can get a little iffy at times, and there’s a good chance that if the aforementioned simian is smart enough to target your wallet, it’s probably going to snatch your iPhone too. No-one reads books any more – not even monkeys – so your pocket phrasebook would be safe.

[5] I suppose now is a good time to add a little bit about postilions for those who are curious. On horse-drawn carriages, the postilion is a person who sits on the leading left-hand horse and who can guide the carriage if there isn’t an actual coachman on the carriage itself. The word derives from the French postillon meaning “the person who rides the post horse,” and the post horse was the one reserved for a mail carrier who would use it to take letters from one location to another. The earlier Middle French noun poste referred to “Any of a series of men stationed at suitable places along appointed post-roads, the duty of each being to ride with, or forward speedily to the next stage, the monarch’s (and later also other) letters and dispatches, and to provide fresh horses for express messengers riding through.” (OED).

[6] A cloze sentence is one where words are purposely left out so that readers can add appropriate choices. It’s a standard tool for research and education, especially when teaching literacy. The word is simply a shortened version of the word closure, hence the pronunciation of /kləʊz/ and not /kləʊs/. It’s not a “close” sentence but one that needs “closure!” It was first noted in 1953 so is relatively new.

Retronyms: We Only Get ‘Em When We Need ‘Em

I learned a new word this weekend. Retronym. For me, learning new words is simultaneously exciting and depressing; exciting because it’s something new, but depressing because it serves to remind me of how much I don’t know. If my vocabulary were to be measurable and I turned out to have a 60,000 word lexicon, you can bet your life I’d be miserable that it wasn’t 60,001. And if I learned a new word, I’d be equally bummed that I didn’t have 60,002.

My psychological issues aside, the word retronym is also fascinating to me because it serves to describe a phenomenon that we all know and use but without actually knowing the word to describe it!

Back in the 1970s, when phones were not smart and coffee was not decaffeinated, clever inventors at the Hamilton Watch Company designed a new timepiece that eschewed such primitive things as “hands” and “winders” in favor of using something called light-emitting diodes that would light up and show numbers. Imagine that – actual light-up numbers! So instead of learning that “the little hand is between the 2 and the 3, and the big hand is on the 30, so it must be two-thirty,” you just saw a 2 and a 30 and said, “Two-thirty.” Brilliant!

This became the first ever digital watch, and it was called that to distinguish it from the original watch. But the next thing to happen was the use of the new compound analog watch as a way of being more specific about the difference between the two timepieces. Analog watch thus becomes an example of a retronym: a word that the Oxford English Dictionary defines as, “a neologism created for an existing object or concept because the exact meaning of the original term used for it has become ambiguous.” Clearly there was no need for this word prior to its coinage because all we had was a watch [1]. Digital watches became cheaper and cooler to the point that it was pretty naff [2] not to have one.

Digital LED watch

Is this cool or what?

Of course, like all such fashion accessories, they eventually became so ubiquitous that folks began to stop wearing them in favor of analog watches – what we used to call watches but can now also be called analog watches to distinguish them from digital. I for one love my Accurist MS832Y Chronograph and always recommend that a dude should have at least one real watch in his collection of fashion accessories.

Accurist watch

Real watch

But now we have the smart watch. Here’s another retronym we now need because it contrasts with the previous stupid watches; you know, the ones that only tell you the time – duh!

In general, technological advances are a spur to the creation of retronyms. I have a wired headset and a bluetooth headset (I used to just have a headset) to listen to music from my wireless radio or my satellite radio (we used to have radios); I see both American football on TV and European football (because we used to just have football until the Americans decided to use it for their version of rugby with padding), and get calls on my landline phone as well as my cell phone (all phones were landlines 40 years ago); and I prefer to read paper books (though more people now read e-books) and avoid non-alcoholic beer (because we all used to drink just a beer). Fortunately we don’t yet have a retronym for non-alcoholic beer as there seems to be no ambiguity about it.

As you can see, a retronym is typically a compound noun where the original noun is preceded by an adjective or noun that modifies it. The word e-book is a step ahead of other retronyms in that the full form, electronic book, has quickly been shortened to the e– prefix [3], as have many other electronic devices such as the e-cigarette, e-mail, and perhaps e-learning. However, only e-mail seems to have gained any real traction as a “real” word, with hopeful monsters such as e-zine, e-banking, and e-reader still left struggling for acceptance.

Just for completeness, the original word from which a retronym is derived can be called a protonym. So e-book is the new word (or neologism), paper book is the retronym, and book is the protonym. Similarly ballpoint pen was a new word, with the retronym being fountain pen, and pen the protonym.

Learning a retronym is also another lesson in aging. Most frequently, the retronym represents something that is on the way out or outmoded. I guess that’s why I cling so dearly to my paper books.

Notes
[1] The first watches were designed to be carried on a chain and kept in a pocket. Then when a watch was designed to be worn on the wrist, we suddenly found we had a wristwatch and a pocket watch. But in this specific case, eventually the word wrist was dropped from the new word, leaving us with a watch and a pocket watch. The word watch was originally a protonym for a timepiece you kept in your pocket, but it became a protonym for a timepiece you have on your wrist. Essentially, it changed its meaning. So we didn’t see the new word *digital wristwatch versus *analog wristwatch but digital watch and analog watch. Well, at least I find that interesting 😉

[2] The British English word naff is relatively recent (1960s) but of uncertain origin. It means, “unfashionable, vulgar; lacking in style, inept; worthless, faulty.” The phrases “Naff off” or “Naff all” are euphemisms for “Fuck off” and “Fuck all” and may be a nod toward one suggested origin of naff as being Polari slang for “Normal As Fuck,” but this is hard to substantiate. And Polari is “A form of slang incorporating Italianate words, rhyming slang, cant terms, and other elements of vocabulary, which originated in England in the 18th and 19th centuries as a kind of secret language within various groups, including sailors, vagrants, circus people, entertainers, etc.” It was used extensively by the gay community of London in the 1950s and 1960s but has pretty much faded out now.

[3] The modern e-prefix is a shortened form of the word electronic. The older e-prefix (as in eject, egress, eviscerate) comes from Latin and means “out of,” “from,” “without,” or “former.”

Language on the Move: The Case of the Flat Adverb

In a not-so-long-ago ad, Apple asked us all to “think different.” Even longer ago, Elvis Presley asked us to “love me tender.” And when I was a wee bairn, my mum used to tuck me in at bedtime with the phrase, “Night night, sleep tight, don’t let the bed bugs bite.”

I wasn’t a particularly precocious or bright toddler, so my response to mum was simply to smile and stick my head under the covers to check for insects, rather than, “But mum, surely it should be sleep tightly because you’re using the word as an adverb and therefore the correct formation of the word is to take the adjective as the base and use the –ly ending as an adverbial morpheme?” I suppose if I had said that I’d have been called a “clever clogs” [1] and told to “just go to sleep.”

drive slow sign

Adverbs, by definition, are used to describe verbs, adjectives, and other adverbs. With Elvis, if the question to him was “How do you want me to love you,” he should reply “tenderly”; with Apple, if the question was “How would you like us to think,” the reply would be “differently”; and with mum, she should be telling me to sleep “tightly.” We might also find we’re “talking loudly,” “laughing heartily,” “arguing vehemently,” “working quickly,” and “complaining bitterly” whenever the occasion demands it.

So why isn’t Apple thinking differently, Elvis being loved tenderly, and I sleeping tightly? Well, it’s all to do with something called flat adverbs and the appeal of the –ly ending.

The commonest way to create an adverb is to take an adjective and add an –ly to the end of it. You have a “hungry cat” and a “thirsty dog” but the former will “eat hungrily” and the latter “drink thirstily.” Similarly your “perfect day” should “end perfectly” and a “generous patron” will always “give generously.” It’s regularity like this that should make the lives of teachers of English easier, and the possibility for artificial intelligence more likely. Alas, consistency and continuity seem to be in short supply when it comes to language. In fact, just when you think you’ve got it all worked out, the lexical world starts to wobble on its axis and, like tectonic plates on a bed of molten rocks, words slide around and rearrange themselves in all sorts of non-standard ways.
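For the programmers in the audience, the regular pattern is simple enough to sketch in a few lines. This is only a rough illustration of the "add –ly" rule as described above, with the y-to-i spelling change thrown in; the function name regular_adverb is made up, and it knows nothing about exceptions.

```python
def regular_adverb(adjective):
    """Form an -ly adverb: swap a final y for i first, otherwise just add -ly."""
    if adjective.endswith("y"):
        return adjective[:-1] + "ily"
    return adjective + "ly"

for adj in ["hungry", "thirsty", "perfect", "generous"]:
    print(adj, "->", regular_adverb(adj))
# hungry -> hungrily
# thirsty -> thirstily
# perfect -> perfectly
# generous -> generously
```

Feed it fast, though, and it will cheerfully hand back *fastly, which is exactly the sort of wobble the rule can't cope with on its own.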

Flat adverbs are an example of these slippery words that want to have it both ways – adjective and adverb. It’s like Bruce Jenner wanting not to become just Caitlyn but both Bruce and Caitlyn at the same time! They skip and jump around like frogs on a hot plate, not pausing long enough for anyone to get a grip on which is right or wrong – or perhaps more accurately which is better at any particular time.

One situation where you can take a stab at which to choose is when you’re writing songs or poems and meter is important. When mum told me “Night night, sleep tight,” she was simply adhering to the underlying stress pattern of the phrase, along with the rhyme for night and tight. The form “Night night, sleep tightly” would be judged grammatically correct but poetically wrong. Similarly when Johnny Cash sang about how “the sun shines bright on my old Kentucky home,” using brightly would have buggered up the timing [2], forcing the Man in Black to slip in an extra syllable that really doesn’t want to be there. And try singing “Love me tender-ly, love me do…” to get a feel for why Elvis flattened his adverb.

Our confusion over flat adverbs comes primarily from those that are identical to an adjective. If you consider the pair fast and slow, the former presents less of a problem because it doesn’t have an –ly form. I can “run fast” (adverb) or drive a “fast car” (adjective) and not worry about whether it’s an adverb or an adjective because there simply is no *fastly. However, although I can drive a “slow car” (adjective) it’s less obvious whether to “drive slow” or “drive slowly.”

As you might suspect, the bastard nature of English also plays a part in spreading confusion [3]. Way back when Old English was the current flavor of the language, changing an adjective into an adverb was done by the addition of a final -e; fairly simple, eh? So if we had the word glaed (OE for our modern glad) then you could add an e to make glaede meaning gladly. So far so good.

If you wanted to turn a noun into an adjective, you could add the ending -lic; again, not too tricky. The word craeft (meaning skill) became craeftlic, an adjective meaning skillful. So guess what you did to say skillfully? Yup, you added the e-ending to get craeftlice.

This meant we had some adverbs ending in a very weak-sounding –e and others with a more pronounced-sounding –lice. Gradually over the years, the weak –e disappeared and the stronger –lice became the slightly weaker –ly. Equally, those adjectives ending in –lic also wore down to take on the sound of –ly. By the 14th century, we had adjectives and adverbs ending in –ly but this ending became more commonly used to mark adverbs. Folks then started adding it willy-nilly to adjectives and this is pretty much how we do things in Modern English.

It’s not surprising that folks have some trouble working out whether adverbs should have an ly at the end or not, and those fossilized flat adverbs don’t make it any easier. Strang (1970) [4] expressed a sentiment that is as true today as it was in the 20th century:

…the sense of unease about adverbs homophonous with an adjective […] has been felt at all periods, and there has been a steady progress from plain to –ly forms (p.273).

Apart from my earlier suggestion that you can use poetic meter to decide which word to use, another guideline you might want to consider is that flat adverbs are more likely to sound right in short imperatives. So “sleep tight” and “drive slow” are fair enough. As is “think different.” As always, if you’re unsure, use a dictionary or better still an online corpus. But don’t get too wound up about whether to use an ly form or not; if it’s taken a thousand years to get to this point where no-one is sure, you’re not going to find the definitive answer from reading this one blog post!

Notes
[1] I’m something of a fan of the UK cartoon series Peppa Pig, and in an upcoming post I’ll explain in some detail precisely why but for now, just take this as a snippet of information that gives you a peek into what makes me tick. In several episodes, the phrase “clever clogs” is used, and although I had to explain this to my American family, folks over in the UK have no difficulty with it. And why not, seeing as it appears to have been around since 1866 at least! Joseph Wright’s 1898 English Dialect Dictionary also includes the phrases “clever-breeches,” “clever-clumsy,” “clever-dick,” “clever-head,” and “clever-shanks.”

[2] When I was a kid in the 1960s, the word bugger was a swear word that would get me a clip round the ear for using. In the hierarchy of swear words, bugger was about as profane as bloody, with bloody hell being a tad more shocking. In the more liberal 21st century, bugger and bloody are now little more than quaint Britishisms, especially to the American ear because they never crossed the Atlantic as curse words. It’s a little known fact – but allow the Dudes to enlighten you! – that the word bugger comes from the Latin Bulgarus, which means Bulgarian, and was used to refer to a group of 11th century heretics who came from Bulgaria. As often happens when people talk about any group with which they disagree, the orthodoxy ascribed certain “practices” to the Buggers, one of which was sodomy. By the 16th century, the word was being used to describe anyone who committed the crime of buggery (engaging in sodomy), and by the 19th century it was being used as a general term of abuse or insult. By the end of the 20th century, it had become less profane and could also be used in a more affectionate “blokish” way, such as “He’s really quite a decent bugger when all’s said and done.”

[3] An interesting article on the development of the ly-ending in English and its parallels in other languages is:

Hummel, M. (2014). The adjective-adverb interface in Romance and English. In P. Sleeman, F. V. d. Velde & H. Perridon (Eds.), Adjectives in Germanic and Romance (pp. 35-72). Amsterdam and Philadelphia, PA: John Benjamins.

There’s also some information in the highly entertaining book:

Burridge, K. (2005). Weeds in the Garden of Words: Further Observations on the Tangled History of English Language. Cambridge: Cambridge University Press.

[4] Strang, Barbara M.H. (1970). A History of English. London: Methuen.

But Why Are Irregulars “Irregular?”

The English language is a glorious bastard child. Like the English themselves, its words and grammar are the result of the promiscuous and incestuous interbreeding that has been going on since the Angles, Saxons, and Jutes decided that they’d like an island vacation rather than sprawl out topless on the beaches of 5th century Europe – just as their current descendants do. Add to the mix the vocabulary of the Picts and Scots, along with a smattering of the ancient Welsh and Irish, and you’ve got yourself a language that turns out to be more wanton and debauched than a Roman orgy hosted by Caligula in a particularly creative mood.

As a result of this linguistic licentiousness, speech pathologists and English Language teachers find themselves having to teach a host of irregular, eccentric, and downright capricious words and grammatical structures. And there’s no finer example of this than something we call “the irregular verbs.” In fact, the very name “irregular verbs” tells us all we need to know; that here is a bunch of words so odd that we’ve just given up on them and tossed them into a huge bucket labeled “irregular.”

Irregular verbs cartoon

Occasionally, you might hear the uncomfortable question…

“But Miss, Miss, Miss, why is it went and not goed? Why is it saw not seed? And why can’t I say taked instead of took?”

As pragmatists, 99% of us will just say, “Because it is” and then focus on the job at hand – teaching the exception to the rule. But 1% of us – and I count me as “one of us” – really does wonder “But why IS it went and saw and took rather than goed and seed and taked?” After all, when we invent a new verb such as to google or to tweet, it only takes a few weeks until folks have googled and tweeted or maybe even Facebooked. We know the rules; we apply the rules; we’re done!

Well, much as we all like to think we are hip, modern, trendy, and capable of being innovative game-changers who think outside of the box and shake up current thinking, as far as language goes, we’re tied to our undeniable linguistic history – the ghosts of the philological past are still haunting our etymological present. And like prehistoric flies trapped in amber, some of the words we use are really just fossils from an earlier age.

Back in the mid 1990s, Eva Grabowski and Dieter Mindt published a paper [1] that listed the most frequently used irregular verbs. They didn’t just sit in an office and google “most frequently used irregular verbs” but went back to basics and used the data from two pretty big (at the time) corpora: The BROWN corpus of American English [2], and the LOB corpus of British English [3]. Using real data rather than the “best guesses” of lexicographers was a huge step forward. For those of you who like FREE STUFF, you can click below to get a PDF copy of the top 100 irregular verbs by frequency. And why would you want it? Well, if you’re going to teach irregulars, starting with those used most makes a lot of sense.

100 most frequently used irregular verbs

So let’s take the top of the list item, the verb to say, and crack open the amber to extract its etymological DNA.

Old English, and its Germanic predecessors, had more verb forms than modern English. Today, if you invent a new verb, such as to twerk, you only need to add three different endings to make it sound right: +s, +ing, or +ed.

“Miley can twerk. She twerks too much. Yesterday she twerked. I think she’s twerking too much.”
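Those three endings are regular enough that you can sketch them in code. What follows is a deliberately simple-minded illustration, not a proper morphology engine: the function name regular_forms is invented, and the only spelling adjustment it makes is dropping a final e before –ing and –ed. It ignores doubled consonants (stop, stopped) and –es endings (watch, watches).

```python
def regular_forms(verb):
    """Return the +s, +ing and +ed forms for a verb that follows the plain pattern."""
    # Drop a final e before -ing/-ed so that google gives googling, not googleing.
    stem = verb[:-1] if verb.endswith("e") else verb
    return {
        "s": verb + "s",
        "ing": stem + "ing",
        "ed": stem + "ed",
    }

for verb in ["twerk", "tweet", "google"]:
    print(verb, regular_forms(verb))
# twerk {'s': 'twerks', 'ing': 'twerking', 'ed': 'twerked'}
# tweet {'s': 'tweets', 'ing': 'tweeting', 'ed': 'tweeted'}
# google {'s': 'googles', 'ing': 'googling', 'ed': 'googled'}
```

Give it go and you get *goed, which is precisely what a rule-following machine (or toddler) would be expected to say.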

But Old English was much tougher, with most verbs having around 14 different forms. And some verbs were strong while others were weak. It wasn’t that the strong ones would bully the weaker ones but the strong verbs would change their forms in a much more dramatic fashion than the punier weak ones. A strong verb would change its base form by muscling in new vowels. A commonly cited example of a strong verb is to sing, where you get sing, sang, and sung, with each form differing by the vowel [4]. Similarly ring, rang, and rung, or swim, swam, and swum. In contrast, the reason weak verbs are so-called is because they merely add an ending to their base form rather than man-up and ram those new vowels between the consonants.

I’m over-simplifying a little. There’s something of a sliding scale from “very strong” to “milquetoast weak,” and Old English scholars talk about 7 classes of strong verbs and 3 classes of weak ones. You have to think that with such a complex system, being a grammar teacher back in the 5th century CE must have paid more than it does today.

Having just explained the distinction between strong and weak verbs [5], take a look again at the verb to say. Is it strong or weak? Well, it’s so weak I’m surprised it hasn’t locked itself in a bathroom for fear of being hassled by to begin and to go! All that happens is that a /d/ sound gets added to the base form of /seɪ/, and the vowel changes ever-so-slightly by getting a tad shorter to leave /sɛd/. It’s technically from an Old English Class 3 weak verb that began life as secgan, meaning “to say,” and now has the pitiful pair of say/said left.

Number two on the list of irregulars, to make, is really pretty similar to to say, and so we should skip hastily on to the much more interesting to go, which has the disarmingly bizarre went as one of its forms. Why not, indeed, *goed?

Well, Old English did, in fact, have a *goed: eode. But there was also another verb around in the 5th century that meant “to wander around or go slowly,” and that was wendan. You still hear people talk about “wending their way around” but other than that, the word wend is pretty rare. So between Old English and Middle English (that’s between the 5th and 15th centuries) the word eode got pushed out by wend, the past tense of wendan, and the devoicing of that final /d/ sound to a /t/ gave us the now-familiar went. For those who are geekily curious, this is called suppletion in the world of historical linguistics, and it’s where one word is used as the inflected form of another, but where both words come from different origins. Ever wonder why things go from bad to worse – or worst? Suppletion. Or why things go from good to better and best? Suppletion. Hey, it’s not just a verb thing!

Before I wind up this work and wend my weary way to bed, there’s one other question that might still be nagging at you; why is it these particular irregulars that are irregular and not others? Why say, and make, and go, and come, and take, and see…? It’s because of their frequency! When we started shifting from using those many different types of strong/weak verbs in Old English to the more relaxed syntax of “+s,” “+ing,” and “+ed,” the words that were used most often had a built-in inertia – a resistance to change. We very easily – and perhaps it’s better to say unconsciously – take new verbs like tweet and twerk and add those three endings to them, but if we wanted to change went to goed [6] or saw to seed, we’d have a harder time because it just sounds so wrong! So although we know that many new words are coined and used every day, there’s a core of thousands of other words that are protected from change by a lexical inertia that anchors them firmly into our language and presents a formidable resistance to change.

So next time you’re focusing on teaching the irregulars, just remember that you’re also providing a small but fascinating lesson on the history of the English language!

Notes
[1] Grabowski, Eva & Mindt, Dieter (1995). “A corpus-based learning list of irregular verbs in English.” ICAME Journal 19, 5-22.

[2] Francis, W. Nelson, Kucera, Henry, & Mackie, Andrew W. (1982). Frequency analysis of English usage : lexicon and grammar. Boston: Houghton Mifflin.

[3] Hofland, Knut, & Johansson, Stig (1982). Word frequencies in British and American English. The Norwegian Computing Center for the Humanities, Bergen, Norway: Longman.

[4] Using vowel variation as a form of morphology is called ablaut. It’s from the German prefix ab- meaning “out of” or “away from” and laut meaning “sound.” So it refers to that notion of taking a sound away and replacing it with another.

[5] In today’s era of political correctness with the insistence on not hurting anyone’s feelings – ever, I can see the day coming when there will be pressure to re-define strong and weak verbs as robust versus relaxed. In that way, verbs like to chant and to hum no longer have to feel threatened by to sing.

[6] As every parent knows, kids will, in fact, quite happily “regularize” irregular forms when they are learning to talk. It is not unusual for kids to actually use irregular forms like went before they use regular, but erroneous, forms like goed. This overregularization is, in a sense, a good thing because it shows that a kiddo is learning to apply the more common rules of morphology – even if the words are technically wrong.

Errata
Thanks to eagle-eyed reader Mark Durham, we made a couple of corrections to the original text on 5/14/15; in two instances, we originally published tweak and here instead of twerk and hear. Both of these illustrate that relying solely on the built-in WordPress spell checker has some risks. It is, of course, better than not using it at all, but because both tweak and here are “good” words, the spell checker happily leaves them alone. So the teachable moment is “treat your spell checker as a friend who offers suggestions but not necessarily all the answers.”

The Dudes Do ATIA 2015: Day 2 – Of Powwows and Portmanteaus

The day before the Dudes left for the Assistive Technology Industry Association (ATIA) conference happened to be Lewis Carroll’s birthday. Folks who know me well – and maybe some who just happened to have heard me in presentations – will be painfully aware that I recommend Carroll’s Alice’s Adventures in Wonderland and Through the Looking Glass to anyone with the slightest interest in language. In fact, both books should be on the required reading list for all Educators and Speech and Language Therapists/Pathologists – seriously. Read the following single sentence as spoken by the Duchess in Wonderland and savor the complexity:

Never imagine yourself not to be otherwise than what it might appear to others that what you were or might have been was not otherwise than what you had been would have appeared to them to be otherwise.

Now parse it. There’s glory for you [1]. The books are just overflowing with words, phrases, and sentences that can provide enough material for several seminars on morphology, syntax, semantics, and pragmatics.

Time for a Powwow

Coincidentally, or perhaps serendipitously, on the same day a Twitter colleague, @TactusTherapy, posted that she was about to take part in an appathon, which is clearly a blend of the words application and marathon. This is commonly referred to as a portmanteau word, a term first used by Carroll in Through the Looking Glass, when Humpty Dumpty is explaining what the words in the poem Jabberwocky mean:

“Well, slithy means ‘lithe and slimy.’ Lithe is the same as active. You see it’s like a portmanteau — there are two meanings packed up into one word.”

He then gives another example of a portmanteau with mimsy, which is a jamming together of miserable and flimsy. Linguists call these blends, or perhaps more specifically lexical blends – as distinct from, say, phonological blends where two or more sounds run together to end up as one. Other examples include positron (1933: positive + electron); guesstimate (1936: guess + estimate); skort (1951: skirt + shorts); modem (1958: modulator + demodulator); metrosexual (1994: metropolitan + hetero/homosexual); and hacktivist (1995: hacker + activist). My @TactusTherapy colleague also pointed out that she’d just come across a new portmanteau, listicle, to refer to one of those “5 Ways to Drive Your Lover Wild” or “10 Words Guaranteed to Get You a New Job” articles, where it’s basically a list modified into prose. Hence it’s a portmanteau of list and article.

ATIA15 Powwow 1

Moving ahead to Day 2 of the conference, I spent some time over lunch with a group of AAC/AT folks who had at some time attended one of the Pittsburgh AAC Language Seminar Series, or PALSS [2]. It’s a good excuse to get together with a group of like-minded folks for an informal powwow. Curiously enough, the word powwow (or pow-wow) may be another example of a portmanteau except from a non-English source. It can be traced back to the Narragansett language and pawwaw meaning a priest, shaman, or healer. It’s suggested that this in turn came from an earlier language, Proto-Algonquian [3], and the phrase *pawe-wa, which means “he who dreams.” The two words were blended into one by the elision of the middle syllable, and became the portmanteau, powwow.

During this powwow, yet another new portmanteau made its way into the discussion: the spamference. It’s clearly derived from spam and conference, and represents a relatively new concept in the field of academia – the junk conference. Basically, it’s a conference created not for the “free exchange of ideas and research from leaders in the field” but “a way of generating revenue for conference organizers by way of inviting folks to exotic and faraway places for a good time.” The typical invite goes along the lines of:

“Dear Speech Dude

As a recognized leader/expert/authority in the field of AAC/Linguistics/Toad Husbandry, our panel of professionals invite you to chair a session at our upcoming prestigious conference in Maui/Maldives/Vegas/Fiji (insert name of any place in which you’d love to spend a week).

As a conference chair, your registration fees will be discounted by 75% and hotel rooms by 25%. You will also be acknowledged as an Editor/Reviewer in the conference proceedings.”

And so on, and so on. The first hint of bogosity is the unsolicited nature of the invitation from someone who you’ve probably never heard of, and also that slightly hard-to-avoid-but-it’s-probably-true realization that you are maybe not quite the leader/expert/authority that you’d like to think you are!

Of course, if you want to beef up your resume and can get someone to fund you for your trip to Hawaii for “the conference,” then there’s nothing actually illegal going on here. Nothing. Like the whole “Open Access Journals” discussion – where you can get published so long as you stump up some cash – it’s a fundamentally grey area with advocates both for and against.

But spamference is definitely a portmanteau.

Notes
[1] This comes from a discussion between Alice and Humpty Dumpty in Through the Looking Glass about unbirthday presents. It ends with a classic definition of “the word” that’s beloved by linguists around the globe:

`There’s glory for you!’

`I don’t know what you mean by “glory,”‘ Alice said.

Humpty Dumpty smiled contemptuously. `Of course you don’t — till I tell you. I meant “there’s a nice knock-down argument for you!”‘

`But “glory” doesn’t mean “a nice knock-down argument,”‘ Alice objected.

`When I use a word,’ Humpty Dumpty said in rather a scornful tone, `it means just what I choose it to mean — neither more nor less.’

`The question is,’ said Alice, `whether you can make words mean so many different things.’

`The question is,’ said Humpty Dumpty, `which is to be master – – that’s all.’

See what I mean about great seminar material?

[2] The Pittsburgh AAC Language Seminar Series is a 2-and-a-half day event run by Semantic Compaction Systems in, no surprise, Pittsburgh. Its focus is on implementing the Unity/Minspeak language system, with each seminar having a nationally recognized guest speaker. The seminars are monthly and registration is free but there are limited numbers – only 24 folks per seminar. It’s pretty cool because food and lodging are free AND you can get $150 towards your flight or mileage. Oh, and you get to meet me on Thursday morning – and that’s gotta be worth the trip! If you’re curious, here’s the link:
http://www.minspeak.com/PittsburghAACLanguageSeminarSeries.php

[3] A proto-language is one for which there is no direct evidence but which can be (re)constructed, hypothesized or inferred on the basis of the structure and behavior of words that are verifiable. Algonquian is a genus of languages spoken primarily by Native Americans in north-eastern regions of North America, and Proto-Algonquian is thought to be the version spoken around 3,000 years ago. Here’s a link to a map of the family of Algonquian should you be curious – and if you’re still reading, you are 😉 THE ALGONQUIAN FAMILY

Valentine’s? President’s? Whose Day IS It?

On a singularly dull day in Hell, when the screams of tortured souls no longer gave Lucifer a thrill, he came up with a new form of torture: the apostrophe [1]. It’s a brilliant piece of evil engineering because it takes up less than the merest dab of ink to pop it onto a piece of parchment, yet placing it in the wrong place can wreak maximum havoc on the sensibilities of gentle readers. And over-worked copy editors. It’s possibly one of Satan’s most wickedly powerful dividers of nations ever invented.

Evil apostrophe

Within the space of one week, we’re about to experience the full force of an apostrophe debate that will also generate more examples of that malevolent little mark all over the internet. February 14 and 16 are all set to become a grammatical confluence of biblical proportions. Perhaps.

Let’s start with the easier one: the case of St. Valentine and a celebration of card sales love. According to one version of the legend, St. Valentine was a priest who was martyred by the Roman emperor Claudius II for being a Christian, and for performing marriage rites. In one of the more lurid descriptions of his death, he was first stoned and clubbed but when that failed to kill him he was beheaded. I’m not sure that’s ever been part of a Valentine card illustration – though in the interest of accuracy, I think Hallmark need to consider it.

His performing of marriages seems to fit in with the idea of love, but oddly St. Valentine is also the patron saint of epilepsy, fainting, plague, and bee keepers. Again, potential new avenues of exploration for the folks at American Greetings.

St Valentine

Can you look after these bees for me, Val?

When we celebrate St. Valentine, we do so on St. Valentine’s Day, where the apostrophe comes before that final “s.” Why? Well, it’s because one of the accepted norms for using an apostrophe is that you use it before a final “s” to indicate the notion of possession; the idea that the succeeding noun belongs to the apostrophized previous thing. In this instance, this is a special day that “belongs” to St. Valentine. So you can have “the cat’s whiskers” because the whiskers belong to the cat; “the man’s coat,” because the coat belongs to the man; or “my brother’s wife,” because the wife belongs to my brother [2].

A second rule says that if you have more than one possessor, and the plural form ends with an “s,” you still put the apostrophe after the word but you can ignore a following “s.” Hence we can have “the dogs’ bone,” which is a bone shared by multiple canines; “the bishops’ fund,” which is a fund administered or used by a bench of bishops [3]; or “my brothers’ wives,” which is a clumsy way of referring to the collection of women owned by my brothers.

Valentine’s Day is, therefore, a pretty easy one. There is only one Valentine; it’s a day that is in some sense “owned” by him; so the apostrophe can happily nestle itself between the “e” and the “s” and copy editors can sleep at night. Sanity 1 – Satan 0.
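For the programmers in the audience, the two rules above are simple enough to sketch in a few lines of Python. This is only an illustration: the function name possessive and its plural flag are made up for this example, and it deliberately ignores the cases the style guides argue about (such as singular names that already end in “s”).

```python
def possessive(noun: str, plural: bool = False) -> str:
    """Form a possessive using the two rules described above."""
    if plural and noun.endswith("s"):
        # Rule 2: the plural already ends in "s", so add only the apostrophe.
        return noun + "'"
    # Rule 1: otherwise add an apostrophe followed by "s".
    return noun + "'s"


print(possessive("Valentine"))              # one saint, one day
print(possessive("dogs", plural=True))      # several dogs sharing a bone
print(possessive("brothers", plural=True))  # several brothers
```

Note that a plural that doesn’t end in “s,” such as “men,” falls through to the first rule and gets the full apostrophe-plus-“s” treatment.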

But the Prince of Darkness is not yet done with us. He’s fully aware that although some folks will have trouble with Valentine’s Day, those who find it relatively easy have been lulled into a false sense of security. Lurking in the wings – or in this case, two days later – there is the day that even such luminaries as the Chicago Manual of Style (CMS) and the Associated Press Stylebook (AP) disagree on: Presidents Day or Presidents’ Day. Sanity 1 – Satan 1.

I know that our readers don’t come here to be subjected to stress, pain, or irritation (other than the mild form suffered when we say something outrageous or wrong) so let me take away any worries you’re having about which form to use here and now. The Associated Press Stylebook says “Presidents Day” with no apostrophe; the Chicago Manual of Style says “Presidents’ Day” with the apostrophe right at the very end. So the Dudes say: go with the one you prefer!

Different ways of spelling Presidents Day (image; the three versions shown are marked Yes, Yes, and No)

So why the confusion – apart from Beelzebub’s delight in watching us all squabble and bicker? It’s really because of the way that nouns can, in some circumstances, behave as if they were adjectives. Specifically, it’s a type of noun called an attributive noun, which sounds like another Mephistophelian invention. For the most part, nouns are pretty solid, stalwart parts-of-speech, happy to be just what they are – low-frequency, limited meanings. A dog’s a dog, a cat’s a cat, and that’s about it. However, sometimes a noun will have the urge to buddy up to another noun to make a compound, and the one that goes first can change its behavior and act, temporarily, like an adjective.

Here are some examples of attributive nouns, where the first noun is being used to enhance the meaning of the second:

football player: Just using the noun player on its own may not be sufficient, so adding the noun football helps specify the type of player. Similarly we could have a baseball player, hockey player, and so on.

business lunch: Again, lunch on its own is OK in a generic sense but if you’re having lunch for the purpose of discussing business-related issues, then adding business as an attributive tightens up the meaning.

apple tree: Fairly obvious and by now needs no explanation.

If you want to do a quick check as to whether you’re seeing an attributive noun or an attributive adjective, try the following test:

Change <WORD1> <WORD2> to “The <WORD2> is <WORD1>”: does it make sense?

“The player is football,” “The lunch is business” and “The tree is apple” sound wrong. But if we had “aggressive player,” “free lunch,” and “tall tree,” applying the test would result in sensible sentences, therefore they are attributive adjectives, not attributive nouns.
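If you have a list of word pairs to check, the rewrite step of that test can be automated with a tiny Python helper. This is just a sketch: the name predicate_test is invented for this example, and the code only builds the test sentence. Deciding whether the result “makes sense” is still a job for a human ear.

```python
def predicate_test(word1: str, word2: str) -> str:
    """Rewrite '<WORD1> <WORD2>' as 'The <WORD2> is <WORD1>.'"""
    return f"The {word2} is {word1}."


pairs = [
    ("football", "player"),    # attributive noun: result sounds wrong
    ("business", "lunch"),     # attributive noun: result sounds wrong
    ("aggressive", "player"),  # adjective: result sounds fine
    ("tall", "tree"),          # adjective: result sounds fine
]

for word1, word2 in pairs:
    print(f"{word1} {word2} -> {predicate_test(word1, word2)}")
```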

All of this brings us back to why Presidents/Presidents’ Day is a challenge. If it is a day that “belongs” to Presidents, then the apostrophe should be used to indicate possession and therefore needs to be included at the end of the word. But if it’s a day “about” or “for” Presidents [4], then it’s being used as an attributive noun descriptor to enhance the meaning of “day,” and so needs no apostrophe.

The distinction is fine, and so is the interpretation – hence the disagreement between CMS and AP. But it is an instructive example of how words can shift not only their meaning but function, and even a humble noun can aspire to adjectivehood!

Notes
[1] Apostrophe comes from the Greek ἡ ἀπόστροϕος meaning “of turning away, or elision.” Often the apostrophe is used to mark where something is missing (elided) such as in can’t for cannot, the poetic o’er for over, or singin’ as a colloquialism for singing. It’s this sense of “missing something” that gave rise to its name as a punctuation mark.

[2] You’re right to guess that I put that one in on purpose, knowing full well that it’s somewhat un-PC. I could, of course, have used “My sister’s husband” and explained it as “because the husband belongs to my sister,” but that wouldn’t be as forceful in showing how grammar and punctuation rules regarding “possession” don’t care for social norms. Doubtless there are folks out there who would be all for having us change the language so as to avoid that notion of “owning” someone but that’s not going to happen. Grammatical possession is a little different from social possession.

[3] The most frequently cited collective noun for bishops is, indeed, a bench. Others include a sea of bishops and a psalter of bishops.

[4] The Presidents in question are primarily George Washington and Abraham Lincoln, whose birthdays are Feb 22 and Feb 12 respectively. I say “primarily” because there is also the notion that it is a celebration of all US Presidents, and that this extended meaning is accepted by many people.

Erratum
1. Eagle-eyed reader, Trish, pointed out I used preceding rather than succeeding in the original sentence. Whoops!

28 Words to Boost Your Client’s Vocabulary – Maximum Bang for Buck

When developing a vocabulary set for an augmentative and alternative communication (AAC) system – or indeed when deciding on what vocabulary to teach anyone – one of the most fundamental measures you can use is frequency count: how often is a word used in a language? No-one can predict with 100% accuracy which words will be “best” for an individual, but if you’re going to take bets, you’re pretty safe to assume that words such as that, want, stop, and what are going to be used by everyone from ages 2 to 200. By the same token, you’d not be missing much if you didn’t spend too much time on words like ambidextrous, decalogue, and postilion [1].
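A frequency count is mechanically very simple, as the following minimal Python sketch shows. The sample text is invented purely for illustration and the function name word_frequencies is hypothetical; a real count would be run over a large corpus or a set of language samples, not a single sentence or two.

```python
import re
from collections import Counter


def word_frequencies(text: str) -> Counter:
    """Count how often each word occurs, ignoring case and punctuation."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


# Made-up sample, not real language data.
sample = "I want that. What do you want? Stop that, I want to stop."
freq = word_frequencies(sample)

# List the words from most to least frequent.
for word, count in freq.most_common():
    print(word, count)
```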

In the field of AAC, this type of high frequency vocabulary that is used (a) across populations and (b) across situations is referred to as core vocabulary and it’s often contrasted with the phrase fringe vocabulary, which refers to words that are typically (a) low in frequency and (b) specific to isolated activities or situations. For a refresher on core and fringe – and an introduction to keyword vocabulary – check out my article entitled Small Object of Desire: The Monteverde Invincia Stylus fountain pen – and Keyword Vocabulary from two years ago.

The core/fringe distinction is now so embedded in the world of augmentative communication that it is rare to see any new app appear on the market that doesn’t use the phrase “core vocabulary” somewhere in its marketing blurb – even if it isn’t actually making good use of the core! And as core vocabulary is, by definition, common across ages, activities, situations, and pathologies, it’s not surprising that many AAC software offerings look the same, particularly with regard to the words being encoded [2].

But it’s worth taking a look at another level of frequency measurement, and that’s at the phrase level. Specifically, one area of research that seems to me to offer some value to Speech and Language pathologists and Educators working in vocabulary development is in the study of how phrasal verbs (PVs) are distributed.

So what’s a phrasal verb? Well, simply put, it’s a phrase of two to three words that are yoked together, which include a verb and a preposition and/or adverb. Examples include, “I ran into Gretchen at the ATIA conference,” “I backed up my hard drive,” and “I came across an interesting article on phrasal verbs.” The English language is stuffed to the gills with these types of verbs, and a feature of them is that they tend to have multiple meanings.

To find out how polysemous a phrase can be, you can use the excellent WordNet online tool, a huge database of words and phrases that lets you check out noun, verb, adjective, and adverb meanings. For example, would you believe that the simple phrase “give up” has 12 different meanings? Or that “put down” has 8 variations? It’s not surprising that learners of English find phrasal verbs quite challenging.

The other fascinating feature of phrasal verbs is summarized in a 2007 paper by Gardner and Davies, who point out that if you look at the 100 million word British National Corpus you find that:

…a small subset of 20 lexical verbs combines with eight adverbial particles (160 combinations) to account for more than one half of the 518,923 phrasal verb occurrences identified in the megacorpus. A more specific analysis indicates that only 25 phrasal verbs account for nearly one-third of all phrasal-verb occurrences in the British National Corpus, and 100 phrasal verbs account for more than one half of all such items. Subsequent semantic analyses show that these 100 high-frequency phrasal verb forms have potentially 559 variant meaning senses.

Read that again and see if you get the same tingle I did seeing those numbers. Over half of all the phrasal verb occurrences found in the corpus can be accounted for by combining 20 verbs with 8 particles. In short, if you learn just 28 words, you’ve learned the building blocks for over 50% of all the phrasal verbs you’ll need to use.
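The arithmetic behind those numbers is easy to check, and the pairing itself is just a cross product. Here is a hedged Python sketch: the counts (20 verbs, 8 particles) come from the quotation above, but the short sample lists are only a handful of items that crop up in this article, not the paper’s actual Top 20 and Top 8. Bear in mind, too, that not every verb-plus-particle pairing is a genuine phrasal verb.

```python
from itertools import product

N_VERBS, N_PARTICLES = 20, 8

print(N_VERBS * N_PARTICLES)  # number of possible verb + particle pairings
print(N_VERBS + N_PARTICLES)  # number of individual words to learn

# Illustrative subset only; these are NOT the full lists from the paper.
sample_verbs = ["give", "put", "break", "point"]
sample_particles = ["up", "down", "out"]

# Pair every sample verb with every sample particle.
candidates = [f"{verb} {particle}"
              for verb, particle in product(sample_verbs, sample_particles)]
print(candidates)
```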

Let’s take a look at those Top 20 verbs first:

20 most frequent verbs in phrasal verbs

Table 1: Top 20 Verbs in PVs

And now the Top 8 particles:

Eight most frequently used particles in phrasal verbs

Table 2: Top 8 particles in PVs

All the verbs and prepositions as individual items are already high frequency, with the exception of perhaps the verbs point and set, which wouldn’t be on my list of “first words to teach.” However, the real bonus here is that not only do you get the benefit of teaching your client 28 high frequency words in isolation but if you then use them as phrasal verbs, your “bang for buck” is significant!

Here’s a link to a PDF of those 28 words: https://app.box.com/s/vng5hr2tctp87ufdjoyjvyv2ln8300yb

This frequency analysis of phrasal verbs by Gardner and Davies has recently been supported and extended by Dilin Liu (2011) and by Mélodie Garnier and Norbert Schmitt [3] (2014). In their paper, The PHaVE List: A pedagogical list of phrasal verbs and their most frequent meaning senses, Garnier and Schmitt point out that a limitation in Gardner and Davies’ analysis is that they failed to take into account the polysemy inherent in the phrases – like the 12 meanings of “give up.” In fairness to Gardner and Davies, they did, in fact, talk about the polysemous nature of PVs but didn’t offer any measure of the different frequencies with which the various meanings are used. They wrote that:

For instance, the list-high 19 senses of the PV break up … could be arranged from highest to lowest semantic frequency, thus prioritizing them for language learning. We acknowledge, however, that corpora of this nature are much easier talked about than constructed. (p.353).

Garnier and Schmitt are interested not just in identifying the frequency with which a phrasal verb occurs but also the most common senses of those PVs. They refer to:

…our main purpose for creating the PHaVE List, which is to reduce the total number of meaning senses to be acquired to a manageable number based on frequency criteria.

On a pragmatic level, they want a learner not to have to learn every meaning of each PV but just focus on the most frequent, and therefore most useful meanings. Using the original list from Gardner and Davies, along with additions by Liu (2011), and including data from the Corpus of Contemporary American English (Davies, 2008), the duo created the PHaVE List: a list of the 150 most frequently used phrasal verbs, and 280 of the most frequently used meanings. So of the 12 potential meanings for “give up,” they use the following:

16. GIVE UP
Stop doing or having something; abandon (activity, belief, possession) (80.5%)
Example: She had to give up smoking when she got pregnant.

The general entry starts with a rank (in this case, 16th out of 150); the basic phrasal verb; a definition; a percentage frequency; and a specific example use. The complete list is made available as a download from the Sage journals website [4]. If you can get access to it, it is well worth the read and the download. And all the articles referenced in this article are good examples of how we can use corpus linguistics to help guide our practice of developing the vocabulary of our clients with language challenges.
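If you wanted to hold PHaVE-style entries in software (say, for building teaching materials), the five-part layout described above maps neatly onto a simple record. The Python sketch below is my own illustration, not anything published with the list: the class name PhaveEntry, its field names, and the render method are all hypothetical, and the only data is the single “give up” entry quoted above.

```python
from dataclasses import dataclass


@dataclass
class PhaveEntry:
    rank: int           # position in the list (1 to 150)
    phrasal_verb: str   # the basic phrasal verb
    definition: str     # the meaning sense
    percentage: float   # how often this sense is the one used
    example: str        # a specific example sentence

    def render(self) -> str:
        """Lay the entry out in the three-line format shown above."""
        return (f"{self.rank}. {self.phrasal_verb}\n"
                f"{self.definition} ({self.percentage}%)\n"
                f"Example: {self.example}")


give_up = PhaveEntry(
    rank=16,
    phrasal_verb="GIVE UP",
    definition="Stop doing or having something; abandon "
               "(activity, belief, possession)",
    percentage=80.5,
    example="She had to give up smoking when she got pregnant.",
)

print(give_up.render())
```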

References
Davies, M. (2008-). The Corpus of Contemporary American English: 425 million words, 1990-present. Available from Brigham Young University: http://corpus.byu.edu/coca

Gardner, D., & Davies, M. (2007). Pointing Out Frequent Phrasal Verbs: A Corpus-Based Analysis. TESOL Quarterly, 41(2), 339-359.

Garnier, M., & Schmitt, N. (2014). The PHaVE List: A pedagogical list of phrasal verbs and their most frequent meaning senses. Language Teaching Research, 1-22. Published online before print: http://ltr.sagepub.com/content/early/2014/12/08/1362168814559798.abstract

Liu, D. (2011). The Most Frequently Used English Phrasal Verbs in American and British English: A Multicorpus Examination. TESOL Quarterly, 45(4), 661-688.

Notes
[1] A postilion is the driver of a horse-drawn carriage who rides one of the horses rather than sitting on the carriage itself. The sentence “The postilion has been struck by lightning” is the basis of a wonderful little paper by the linguist David Crystal, published in 1995 in the journal Child Language Teaching & Therapy. Simply titled “Postilion Sentences,” Crystal defines a postilion sentence as “one which has little or no chance of ever being useful in real life. It could be used, obviously, because it is grammatically well-formed; but the contexts in which it would be natural to use it are either so restricted or so adult that the chances of a child encountering it, or finding it necessary to use it, are remote.” In the design of AAC systems, using pre-stored sentences may have some limited value but many “pragmatic utterances” turn out to be nothing more than postilions; unlikely to be used. This is why teaching sentences is neither language nor therapy.

Download Postilion sentences article

[2] The now-common practice of using core vocabulary also makes it much harder to prove plagiarism – or as we Lancastrians would say, “nicking someone else’s ideas.” People, of course, don’t “steal” ideas – they are “inspired” by the work of others. But such inspiration inevitably leads to systems appearing almost clone-like in their structure. It’s only when you get to the fine details of how words are organized and encoded that you can separate the wheat from the chaff. And there’s a lot of chaff out there.

[3] If I haven’t mentioned it before, Norbert is the author of an excellent book on vocabulary research methods. Here’s the full reference: Schmitt, N. (2010). Researching vocabulary : a vocabulary research manual. Houndmills, Basingstoke, Hampshire ; New York, NY: Palgrave Macmillan. It’s full of useful information and lots of web links worth exploring, and worth the $30 you’ll spend on Amazon US – or the £20.99 in the UK.

[4] Just a reminder to all members of the Royal College of Speech and Language Therapists that your membership benefits include access to a number of Sage journals online, and Language Teaching Research is one of those. In fact, you have access to over 700 (yes, count ’em!) titles, including my personal favorites Child Language Teaching and Therapy, Clinical Linguistics & Phonetics, English Today, and the riveting Scandinavian Journal of Occupational Therapy. OK, so I lied about the last one being a “favorite” 🙂