But Why Are Irregulars “Irregular?”

The English language is a glorious bastard child. Like the English themselves, its words and grammar are the result of the promiscuous and incestuous interbreeding that has been going on since the Angles, Saxons, and Jutes decided that they’d like an island vacation rather than sprawl out topless on the beaches of 5th century Europe – just as their current descendants do. Add to the mix the vocabulary of the Picts and Scots, along with a smattering of the ancient Welsh and Irish, and you’ve got yourself a language that turns out to be more wanton and debauched than a Roman orgy hosted by Caligula in a particularly creative mood.

As a result of this linguistic licentiousness, speech pathologists and English Language teachers find themselves having to teach a host of irregular, eccentric, and downright capricious words and grammatical structures. And there’s no finer example of this than something we call “the irregular verbs.” In fact, the very name “irregular verbs” tells us all we need to know; that here is a bunch of words so odd that we’ve just given up on them and tossed them into a huge bucket labeled “irregular.”

Occasionally, you might hear the uncomfortable question…

“But Miss, Miss, Miss, why is it went and not goed? Why is it saw not seed? And why can’t I say taked instead of took?”

As pragmatists, 99% of us will just say, “Because it is” and then focus on the job at hand – teaching the exception to the rule. But 1% of us – and I count me as “one of us” – really does wonder “But why IS it went and saw and took rather than goed and seed and taked?” After all, when we invent a new verb such as to google or to tweet, it only takes a few weeks until folks have googled and tweeted or maybe even Facebooked. We know the rules; we apply the rules; we’re done!

Well, much as we all like to think we are hip, modern, trendy, and capable of being innovative game-changers who think outside of the box and shake up current thinking, as far as language goes, we’re tied to our undeniable linguistic history – the ghosts of the philological past are still haunting our etymological present. And like prehistoric flies trapped in amber, some of the words we use are really just fossils from an earlier age.

Back in the mid 1990s, Eva Grabowski and Dieter Mindt published a paper [1] that listed the most frequently used irregular verbs. They didn’t just sit in an office and google “most frequently used irregular verbs” but went back to basics and used the data from two pretty big (at the time) corpora: The BROWN corpus of American English [2], and the LOB corpus of British English [3]. Using real data rather than the “best guesses” of lexicographers was a huge step forward. For those of you who like FREE STUFF, you can click below to get a PDF copy of the top 100 irregular verbs by frequency. And why would you want it? Well, if you’re going to teach irregulars, starting with those used most makes a lot of sense.

100 most frequently used irregular verbs

So let’s take the top of the list item, the verb to say, and crack open the amber to extract its etymological DNA.

Old English, and its Germanic predecessors, had more verb forms than modern English. Today, if you invent a new verb, such as to twerk, you only need to add three different endings to make it sound right: +s, +ing, or +ed.

“Miley can twerk, She twerks too much. Yesterday she twerked, I think she’s twerking too much.”

But Old English was much tougher, with most verbs having around 14 different forms. And some verbs were strong while others were weak. It wasn’t that the strong ones would bully the weaker ones but the strong verbs would change their forms in a much more dramatic fashion than the punier weak ones. A strong verb would change its base form by muscling in new vowels. A commonly cited example of a strong verb is to sing, where you get sing, sang, and sung, with each form differing by the vowel [4]. Similarly ring, rang, and rung, or swim, swam, and swum. In contrast, the reason weak verbs are so-called is because they merely add an ending to their base form rather than man-up and ram those new vowels between the consonants.

I’m over-simplifying a little. There’s something of a sliding scale from “very strong” to “milquetoast weak,” and Old English scholars talk about 7 classes of strong verbs and 3 classes of weak ones. You have to think that with such a complex system, being a grammar teacher back in the 5th century CE must have paid more than it does today.

Having just explained the distinction between strong and weak verbs [5], take a look again at the verb to say. Is it strong or weak? Well, it’s so weak I’m surprised it hasn’t locked itself in a bathroom for fear of being hassled by to begin and to go! All that happens is that a /d/ sound gets added to the base form of /seɪ/, and the vowel changes ever-so-slightly by getting a tad shorter to leave /sɛd/. It’s technically from an Old English Class 3 weak verb that began life as secgan, meaning “to say,” and now has the pitiful pair of say/said left.

Number two on the list of irregulars, to make, is really pretty similar to to say, and so we should skip hastily on to the much more interesting to go, which has the disarmingly bizarre went as one of its forms. Why not, indeed, *goed?

Well, Old English did, in fact, have its own version of *goed: eode. But there was also another verb around in the 5th century that meant “to wander around or go slowly,” and that was wendan. You still hear people talk about “wending their way around” but other than that, the word wend is pretty rare. So between Old English and Middle English (that’s between the 5th and 15th centuries) the word eode got pushed out by wend, the past tense of wendan, and the devoicing of that final /d/ sound to a /t/ gave us the now-familiar went. For those who are geekily curious, this is called suppletion in the world of historical linguistics, and it’s where one word is used as the inflected form of another, but where both words come from different origins. Ever wonder why things go from bad to worse – or worst? Suppletion. Or why things go from good to better and best? Suppletion. Hey, it’s not just a verb thing!

Before I wind up this work and wend my weary way to bed, there’s one other question that might still be nagging at you; why is it these particular irregulars that are irregular and not others? Why say, and make, and go, and come, and take, and see…? It’s because of their frequency! When we started shifting from using those many different types of strong/weak verbs in Old English to the more relaxed syntax of “+s,” “+ing,” and “+ed,” the words that were used most often had a built-in inertia – a resistance to change. We very easily – and perhaps it’s better to say unconsciously – take new verbs like tweet and twerk and add those three endings to them, but if we wanted to change went to goed [6] or saw to seed, we’d have a harder time because it just sounds so wrong! So although we know that many new words are coined and used every day, there’s a core of thousands of other words that are protected from change by a lexical inertia that anchors them firmly into our language and presents a formidable resistance to change.
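For the programmers in the audience, one way to picture this "rule plus protected exceptions" idea is as a lookup table with a fallback. The sketch below is only an illustration under my own assumptions: the names IRREGULAR_PAST, past_tense, and overregularize are made up for this example, the table holds just the six verbs mentioned above, and the spelling rule is deliberately naive (it ignores consonant doubling and the like).

```python
# Irregular past-tense forms are stored as exceptions. These six are the
# verbs named in the post; a real list would be much longer.
IRREGULAR_PAST = {
    "say": "said",
    "make": "made",
    "go": "went",
    "come": "came",
    "take": "took",
    "see": "saw",
}


def regular_past(verb):
    """Naive regular rule: add "d" after a final "e", otherwise add "ed"."""
    return verb + "d" if verb.endswith("e") else verb + "ed"


def past_tense(verb):
    """Look up the stored exception first; fall back to the regular rule."""
    return IRREGULAR_PAST.get(verb, regular_past(verb))


def overregularize(verb):
    """Apply the regular rule to everything, as young children often do."""
    return regular_past(verb)


for verb in ["tweet", "twerk", "google", "go", "see", "take"]:
    print(verb, past_tense(verb), overregularize(verb))
```

New verbs such as tweet and twerk never make it into the table, so they pick up the regular ending automatically, while the high-frequency verbs keep whatever form is stored for them.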

So next time you’re focusing on teaching the irregulars, just remember that you’re also providing a small but fascinating lesson on the history of the English language!

Notes
[1] Grabowski, Eva & Mindt, Dieter (1995). “A corpus-based learning list of irregular verbs in English.” ICAME Journal 19, 5-22.

[2] Francis, W. Nelson, Kucera, Henry, & Mackie, Andrew W. (1982). Frequency analysis of English usage : lexicon and grammar. Boston: Houghton Mifflin.

[3] Hofland, Knut, & Johansson, Stig (1982). Word frequencies in British and American English. The Norwegian Computing Center for the Humanities, Bergen, Norway: Longman.

[4] Using vowel variation as a form of morphology is called ablaut. It’s from the German prefix ab- meaning “out of” or “away from” and laut meaning “sound.” So it refers to that notion of taking a sound away and replacing it with another.

[5] In today’s era of political correctness with the insistence on not hurting anyone’s feelings – ever, I can see the day coming when there will be pressure to re-define strong and weak verbs as robust versus relaxed. In that way, verbs like to chant and to hum no longer have to feel threatened by to sing.

[6] As every parent knows, kids will, in fact, quite happily “regularize” irregular forms when they are learning to talk. It is not unusual for kids to actually use irregular forms like went before they use regular, but erroneous, forms like goed. This overregularization is, in a sense, a good thing because it shows that a kiddo is learning to apply the more common rules of morphology – even if the words are technically wrong.

Errata
Thanks to eagle-eyed reader Mark Durham, we made a couple of corrections to the original text on 5/14/15; in two instances, we originally published tweak and here instead of twerk and hear. Both of these illustrate that relying solely on the built-in WordPress spell checker has some risks. It is, of course, better than not using it at all, but because both tweak and here are “good” words, the spell checker happily leaves them alone. So the teachable moment is “treat your spell checker as a friend who offers suggestions but not necessarily all the answers.”

ColorBrewer: Utilizing cartography software for color coding

It seems that I am getting a reputation for being a teensy-weensy bit doryphoric [1] and that may have some truth in it insofar as I hate – with a passion – the tendency for people to use the word “utilize” rather than “use” simply because the former sounds more erudite. It’s not, in fact, erudite; it’s just plain wrong. As I’ve said in previous posts, “utilize” means “to use something in a manner for which it was not intended.” So I can “use” a paper clip to hold a set of pages together; but I can “utilize” it to scoop wax out of my ears or stab a cocktail olive in my vodka martini (shaken, not stirred).

Colorado beetle

Doryphoric

So when I titled this post with “utilizing cartography software” I really do mean that and I’m not trying to sound clever by using a four-syllable word (utilizing) over the simpler two-syllable using. No siree, I say what I mean and I mean what I say: utilize. The software in question is online at ColorBrewer: Color Advice for Cartography and its original purpose was to help map makers choose colors that provide maximum contrast. Let’s create an example. Suppose you have a map of the US and you want to use colors to show the average temperatures as three data sets; below 50F, 51F-65F, and above 65F. You can use three colors in one of three different ways:

  • (a) Sequential: Three shades of a chosen color from light to dark to indicate low to high values.
  • (b) Diverging: Three colors in which the two extremes contrast strongly with each other and the mid-range color sits halfway between them, so the middle value reads as a midpoint.
  • (c) Qualitative: Three colors that split the data into three distinct groups, such as apples/oranges/bananas or trains/boats/planes – or for the statisticians out there, any nominal level data.

For a map of temperature averages, you’d choose the sequential coding so as to show the degree of change. Here’s what such a map might look like:

Three data point colors

Compare this with a version whereby we chose to have six data points rather than three, i.e. less than 45F; 46F-50F; 51F-55F; 56F-60F; 61F-65F; above 65F.

Six data points colors

What the software does that is interesting is that it automatically generates the colors such that they are split into “chromatic chunks” that are equally different. The lowest and highest color values for each map are the same but the shades of the intermediate colors are changed. If you were to choose a set of 10 data points, the software would split those up equally.
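If you want a feel for what “splitting equally” between a fixed lowest and highest color could look like in code, here is a minimal sketch. It is not how ColorBrewer itself works, as far as I know (its palettes are designed by hand with perception in mind), and equal steps in RGB are not the same as equal steps to the eye. The function names and the white-to-dark-green endpoints are my own assumptions for illustration.

```python
def hex_to_rgb(code):
    """Convert a hex code such as "#7FC97F" into an (R, G, B) tuple of ints."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Convert an (R, G, B) tuple of 0-255 ints into a hex code."""
    return "#{:02X}{:02X}{:02X}".format(*rgb)


def sequential_ramp(low, high, n_classes):
    """Return n_classes colors evenly spaced in RGB between low and high."""
    if n_classes < 2:
        raise ValueError("need at least two classes")
    lo, hi = hex_to_rgb(low), hex_to_rgb(high)
    ramp = []
    for step in range(n_classes):
        t = step / (n_classes - 1)  # 0.0 at the low end, 1.0 at the high end
        ramp.append(rgb_to_hex(tuple(round(a + (b - a) * t) for a, b in zip(lo, hi))))
    return ramp


# Hypothetical endpoints: white for the coolest class, dark green for the warmest.
print(sequential_ramp("#FFFFFF", "#006600", 3))
print(sequential_ramp("#FFFFFF", "#006600", 6))
```

Both ramps start and end on the same two colors; only the number and spacing of the shades in between changes.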

Of course, as the number of data points increases, the perceptual difference between them decreases i.e. it becomes harder to see a difference. This is one of the limitations of any color-coding system; the more data differentials you want to show, the less useful colors become. You then have to introduce another way of differentiating – such as shapes. So if you had 20 shades of gray, it’s hard to see the difference, but if you spread those 20 data points across squares, triangles, rectangles and circles, you now need only 5 shades of gray for each shape.
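The arithmetic here is just 4 shapes × 5 shades = 20 distinct codes. A small sketch of that pairing follows; the shade labels and the function name shape_shade_codes are placeholders I made up for the example.

```python
from itertools import product

SHAPES = ["square", "triangle", "rectangle", "circle"]
# Placeholder labels for five shades of gray, lightest to darkest.
SHADES = ["gray 1", "gray 2", "gray 3", "gray 4", "gray 5"]


def shape_shade_codes(n_classes):
    """Return n_classes distinct (shape, shade) pairs."""
    combos = list(product(SHAPES, SHADES))  # 4 shapes x 5 shades = 20 pairs
    if n_classes > len(combos):
        raise ValueError("not enough shape/shade combinations")
    return combos[:n_classes]


codes = shape_shade_codes(20)
print(len(codes))
print(codes[0], codes[5])
```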

One of the areas where color coding is used in Speech and Language Pathology is AAC and symbols. In the system of which I am an author [2] color coding is used to mark parts of speech. But suppose you were going to invent a new AAC system and wanted to work out a color coding scheme, how might you utilize the ColorBrewer website?

If you’re going to design your system using a syntactic approach (and I highly recommend you do that because that’s how language works!) you could first identify a color set for the traditional parts of speech; VERB, NOUN, ADJECTIVE, ADVERB, PRONOUN, CONJUNCTION, PREPOSITION, and INTERJECTION [3]. This looks suspiciously like a nominal data set, which corresponds to the Qualitative coding method mentioned at (c) above. So you go to the ColorBrewer site and take a look at the panel in the top left:

ColorBrewer Panel

You can set the Number of data classes to 8, the Nature of your data to qualitative, and then pick yourself a color scheme. If you chose the one in the graphic above, you see the following set recommended:

Eight color data set

For the sake of completeness, here are all the other options:

You can now choose one of these sets knowing that the individual colors have been generated to optimize chromatic differences.

So let’s assume we go for that very first one that starts with the green with the HTML color code #7FC97F [4]. I’m going to suggest that we then use this for the VERB group and that any graphics related to verbs will be green. Now I can move to step 2 in the process.
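If you were handing this scheme to a programmer, the color key might be written down as a simple lookup table. The sketch below makes some assumptions: only the green #7FC97F for VERB comes from this post; the other seven hex codes are what I believe ColorBrewer lists for its 8-class “Accent” qualitative scheme (do check the site), and pairing them with the remaining parts of speech in this order is purely arbitrary.

```python
PARTS_OF_SPEECH = [
    "VERB", "NOUN", "ADJECTIVE", "ADVERB",
    "PRONOUN", "CONJUNCTION", "PREPOSITION", "INTERJECTION",
]

# Assumed to be ColorBrewer's 8-class "Accent" qualitative scheme.
# Only the first value is taken from the post itself.
QUALITATIVE_PALETTE = [
    "#7FC97F", "#BEAED4", "#FDC086", "#FFFF99",
    "#386CB0", "#F0027F", "#BF5B17", "#666666",
]


def build_color_key(categories, palette):
    """Pair each category with one palette color, in order."""
    if len(palette) < len(categories):
        raise ValueError("palette has fewer colors than categories")
    return dict(zip(categories, palette))


color_key = build_color_key(PARTS_OF_SPEECH, QUALITATIVE_PALETTE)
for part_of_speech, hex_code in color_key.items():
    print(part_of_speech, hex_code)
```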

Verbs can actually be graded in relation to morphological inflection. There are a limited number of endings; -s, -ing, -ed, and -en. Knowing this, I can go back to the ColorBrewer site and use the sequential setting to get a selection of possible greens. This time I changed the Number of data classes to 5 and the Nature of your data to Sequential. Here’s what I then see as a suggested set of equally chromatically spaced greens:


This now gives me the option to code not just verbs but verb inflections, while chromatically signaling “verbiness” by green. Here’s a symbol set for walk and write that uses the sequential – or graded – color coding:
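As a rough sketch of how the graded coding could be stored, here is one possible table. The five greens are what I believe ColorBrewer gives for its 5-class sequential “Greens” scheme, which may not be the exact set shown in my screenshot, and the decision about which inflection gets which shade is an assumption made for the example.

```python
# Assumed 5-class sequential greens, lightest to darkest.
INFLECTION_GREENS = {
    "base": "#EDF8E9",
    "-s": "#BAE4B3",
    "-ing": "#74C476",
    "-ed": "#31A354",
    "-en": "#006D2C",
}

# The inflected forms for the two example verbs.
VERB_FORMS = {
    "walk": {"base": "walk", "-s": "walks", "-ing": "walking", "-ed": "walked", "-en": "walked"},
    "write": {"base": "write", "-s": "writes", "-ing": "writing", "-ed": "wrote", "-en": "written"},
}


def color_coded_forms(verb):
    """Return (form, hex color) pairs for a verb, from lightest to darkest green."""
    forms = VERB_FORMS[verb]
    return [(forms[label], color) for label, color in INFLECTION_GREENS.items()]


for verb in ("walk", "write"):
    print(color_coded_forms(verb))
```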

Color-coded symbols

If you want an exercise in AAC system design, knowing that ADJECTIVES also inflect like verbs using two inflectional suffixes, -er and -est, you can try using the ColorBrewer to create color codes [5].

There are probably many other ways to utilize the site for generating color codes. For example, you might want to create colors for Place of Articulation when using pictures for artic/phonology work, and seeing as there are a discrete number of places, it should be easy enough. Why not grab yourself a coffee and hop on over to ColorBrewer now and play. But only use it if you’re creating a map. Please!

Notes
[1] A doryphore is defined by the OED as “A person who draws attention to the minor errors made by others, esp. in a pestering manner; a pedantic gadfly.” It comes from the Greek δορυϕόρος, which means “spear carrier,” and it was originally used in the US as a name for the Colorado beetle – a notable pest. This beetle was known as “the ten-striped spearman,” hence the allusion to a spear carrier.  To then take the noun and turn it into an adjective by adding the -ic suffix meaning “to have the nature of” was a piece of cake – and a great example of using affixation to change a word’s part of speech. As always, you leave a Speech Dudes’ post far smarter than you entered it!

[2] Way back in 1993 I was invited to join the Prentke Romich Company’s R&D department as one of a team of six who were tasked with developing what became the Unity language program. The same basic program is still used in PRC devices and the language structure has been maintained such that anyone who used it in 1996 could still use it in 2014 on the latest, greatest hardware. The vocabulary also uses color coding to mark out Parts-of-Speech but not exactly like I have suggested in this article. Maybe next time…?

[3] The notion of 8 Parts-of-Speech (POS) is common in language teaching but as with many aspects of English, it’s not 100% perfect. For example, words like the, a, and an can be categorized under Adjectives or added to a class of their own called Articles or – by adding a few more – Determiners. So you might see some sources talking about 9 Parts-of-Speech, and I like to treat these as separate from adjectives if only because they seem to behave significantly differently from a “typical” adjective. Another confounding factor is that some words can skip happily between the POS and create minor havoc; light is a great example of this. The take-away from this is that sometimes, words don’t always fit into neat little slots and you need to think about where best to put them and how best to teach them.

[4] In the world of web sites, colors are handled in code by giving them a value in hexadecimal numbers – that’s numbers using base 16 rather than the familiar base 10 of regular numbers. Black is #000000 and white is #FFFFFF. When you’re working on designing web pages, it’s sometimes useful to be able to tell a programmer that you want a specific color, and if you can give them the precise hex code – such as #FF0000 for red and #0000FF for blue – then it makes their job easier and you get exactly what you need. You can also use something called RGB codes to describe colors, based on the way in which the colors (R)ed, (G)reen and (B)lue are mixed on a screen. Purple, for example, is (128,0,128) and yellow is (255, 255, 0). Take a look at this Color Codes page for more details and the chance to play with a color picker.
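The two notations are really the same three numbers written in different bases: each pair of hex digits is one of the R, G, and B values. A minimal sketch of converting between them follows; the function names are my own.

```python
def hex_to_rgb(code):
    """Convert a hex color code such as "#FF0000" into an (R, G, B) tuple."""
    code = code.lstrip("#")
    # Each pair of hex digits is one channel: red, then green, then blue.
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(red, green, blue):
    """Convert three 0-255 channel values into a hex color code."""
    return "#{:02X}{:02X}{:02X}".format(red, green, blue)


print(hex_to_rgb("#FF0000"))     # red
print(hex_to_rgb("#7FC97F"))     # the green chosen for verbs
print(rgb_to_hex(128, 0, 128))   # purple
print(rgb_to_hex(255, 255, 0))   # yellow
```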

[5] I suppose I should toss in a disclaimer here that I’m not suggesting that creating an AAC system is “simply” a matter of collecting a lot of pictures with colored outlines and then dropping them into a piece of technology. There is much more to it than that (ask me about navigation next time you see me at a conference) so consider this article just one slice of a huge pie.