Why the verb “to be” is so irregular
The answer is six thousand years old
There’s something spiritually edifying about spending time in cemeteries. If you’ve ever walked through an older cemetery, you may have come across a headstone that addresses you directly:
As you are, I was. As I am, you will be.
…or some variation.
This is a memento mori, a philosophical reminder of the fleeting nature of life and the inevitability of death. Painters used to achieve the same edifying effect by slipping incongruous skulls onto their canvases.
The saying has been around for a surprisingly long time. Romans were carving versions of it on their tombs two thousand years ago: one of the earliest examples is the haunting viator, quod tu es, ego fui; quod nunc sum, et tu eris ‘traveller, what you are, I was; what I am now, you too will be.’
The English version I quoted above is a translation of the most memorable later compression: eram quod es, eris quod sum. You’ll often find it inscribed in earlier Protestant cemeteries.1 If you live in New England or in rural parts of Britain, you probably have a mossy, crumbling example close to home.
But the headstone version is edifying in another way: one that has nothing to do with mortality and everything to do with grammar.
The phrase has four forms of the single verb to be: are, was, am, and of course be itself. All four are forms of the same word, and yet they seem utterly unrelated, as if they had come from different words entirely. How did a single verb end up looking like this?
To answer this question, we need to take a stroll in a linguistic graveyard and pay our respects to the earliest ancestor of the English language. It has lain dead and mute for thousands of years, but you can still hear it echo every time you say to be.
You’re reading The Dead Language Society, where 45,000+ readers explore the hidden history of the English language. I’m Colin Gorrie: PhD linguist and your guide through 1,500 years of linguistic history.
This is the first instalment of A Deep History of English, a series that traces the story of the English language through the mysteries that remain in Modern English.
As you are, I was
Half of the world are cousins, linguistically speaking. Close to four billion people speak a language descended from this ancestor of English.2
These numbers are bolstered by languages such as English (with 1.5 billion speakers), Hindi-Urdu (800 million, usually counted together), and Spanish (600 million).
But the family contains hundreds of less widely spoken languages as well: Latvian (1.5 million speakers), Welsh (840,000), Icelandic (350,000), alongside other, more obscure relatives. The living languages descended from this single ancestor number around 450, with nearly half of the total located in South Asia.
While some linguistic family resemblances are obvious — such as the close relationships between English and German, or French and Spanish — this larger family is harder to spot with the naked eye.
For example, it surprises many people to learn that German is more closely related to Bengali than it is to its neighbour Hungarian. On a superficial level, the languages of this family seem very different. But the similarities exist, just on a deeper, more structural level.
This family is called Indo-European, named for the fact that its hundreds of languages were traditionally spoken across an enormous swath of Eurasia: from India to Europe. But all of these languages had their origins in a single source.
Linguists call that source the Proto-Indo-European language, or PIE for short. This language was never written down, but we know it must have existed because its descendants — including English — still bear its distinctive features, even if they’ve sometimes been weathered by the passage of time.
That PIE existed, and roughly what it looked like, are not nearly as controversial as the entangled questions of when it was spoken, where, and by whom. Linguists, archaeologists, and, most recently, geneticists have spent decades trying to follow the trail back in time to the PIE homeland. After decades of debate the trail seems to lead — for the moment, at least — back to the grasslands north of the Black Sea around 4000 BC.3
The speakers of PIE lived in what is today called the Pontic Steppe. It makes up the southern part of Ukraine and the neighbouring part of Russia. It is hot in summer, blisteringly cold in winter, and without shelter except along the courses of the rivers which cross it, running south to the sea.
The people who lived there six thousand years ago grew no grain. Instead, they moved with the seasons, following the water in the summer, and in the winter, looking for ground where the snow was shallow enough for sheep to graze through it. They had cattle too: oxen to pull their wagons and, possibly, cows for milk. They drank mead, which they made out of honey they got through trade.
When one of them died, they carved no sayings on gravestones. Instead, they laid the body on its back with the knees raised, on a mat woven from the grasses of the steppe.
They rested the head on a pillow stuffed with aromatic herbs and sprinkled the body with red ochre. Beside the body were laid pots and some knucklebones, the latter perhaps to be used as dice. Occasionally a bronze knife might be laid under the head. Then they raised a mound of earth over the chamber, just as Beowulf asked to be buried.
You can still see these mounds in the steppe today, and far beyond it. They’re called kurgans by archaeologists, although if you see one in the English countryside (or in Middle-Earth) you’ll probably call it a barrow.
They raided each other’s cattle, but a stranger at the threshold wasn’t necessarily an enemy: the peace between guest and host was for them a sacred bond. At a feast, the guests drank mead while a poet sang of glory and praised the generosity of their host.
Through the words of a poet, the glory of their great men spread, and they might hope to attain a measure of immortality. A bard could build a man’s name or break it. It’s fitting that it’s through words that we remember them too. They founded no cities, built no pyramids. The monument they left us was their language.
Sometime around the year 3300 BC, they carried it out of their home in the grasslands: west, east, south. Within a thousand years, their descendants, and their languages, had swept across a large part of Central and Eastern Europe and into Central Asia.
One of these migrations out of the steppe brought to Europe a branch of the Indo-European family — the Germanic languages — which would, much later, give rise to English.
But there were many more branches, which diverged into sub-branches, and, eventually, hundreds of individual languages stretching out across Eurasia: Sanskrit, Greek, Latin, Persian, Russian, Welsh, Armenian all descend from that single language spoken on the steppe 6000 years ago. All cousins.
And all bear traces of the linguistic signature of their ancient ancestor. One of them is the strangeness of the verb that English has inherited as to be.
Being, becoming, and staying the night
The irregularity of the English verb to be, with its various forms are, was, and been, is so extreme that the verb seems to have been cobbled together from entirely unrelated words.
It has. And we know exactly which ones.
Not because the speakers of PIE left us any record. They had no writing. But their language left traces in every one of its descendants. By comparing the many daughter languages, it’s possible to reconstruct what the ancestor language might have sounded like, and how its grammar worked.
When linguists show a reconstructed word, they are careful to precede it with an asterisk so that its hypothetical status is clear: this leads to forms like *h₁esmi, a reconstructed PIE word meaning ‘I am.’ This is a form of the first word that makes up our patchwork word to be.
About that *h₁. Reconstruction has its limits. It is not possible to reach back 6000 years and reconstruct every sound with perfect fidelity. Some sounds are simply beyond our ability to reconstruct with certainty. In the case of PIE, there is a small group of sounds which vanished too early to leave traces in most of PIE’s daughter languages. We know they were there, but we don’t know exactly what they sounded like.
Linguists call them the laryngeals, and write them with the symbols *h₁, *h₂, *h₃, which is a way of saying: I know there were three of them, I know they were made somewhere in the back of the throat, and that’s all I know.
We see the laryngeal *h₁ at the beginning of our first PIE word *h₁esmi ‘I am.’ If you remove that *h₁, not to mention the s and the i, you’re left with em, not far from the English word am. This is no accident. You’re hearing, in a word you say hundreds of times a day without a second thought, a word barely changed from its ancestor spoken on the Pontic Steppe six thousand years ago.
The form *h₁esmi ‘I am’ can be broken down into two parts: *h₁es- and -mi. The PIE language worked like most of its descendants, in that words were composed of roots and endings. Just as the English verb work becomes works when it’s a he, she, or it doing the working, PIE verbs changed their endings depending on the subject of the sentence. The ending corresponding to I is -mi.
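For readers who think in code, the root-plus-ending idea can be sketched in a few lines of Python. This is only an illustration, not a model of PIE grammar: the function name and the table are made up for the example, the laryngeal is spelled in plain ASCII as h1, and the ending -ti is simply read off the form *h1esti ‘he/she/it is,’ the only other form of this verb quoted in this piece.

```python
# Illustration only: "h1" is an ASCII spelling of the first laryngeal.
ROOT = "*h1es-"  # the root meaning 'be'

# Hypothetical lookup table. Only -mi is named in the text;
# -ti is inferred from the reconstructed form *h1esti.
ENDINGS = {"I": "-mi", "he/she/it": "-ti"}

def conjugate(root: str, subject: str) -> str:
    """Join a root and a personal ending, dropping the boundary hyphens."""
    return root.rstrip("-") + ENDINGS[subject].lstrip("-")

print(conjugate(ROOT, "I"))          # *h1esmi
print(conjugate(ROOT, "he/she/it"))  # *h1esti
```

Real PIE verbs involved more than gluing two strings together, so treat this as a picture of the principle and nothing more.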
Since PIE grammar works mostly by adding endings to roots, scholars tend to talk about roots rather than words. So our first root is *h₁es-, which is the proper equivalent of to be in PIE. It’s the source of the Modern English forms am, is (from *h₁esti), and probably are.4 You might recognize a descendant of the form *h₁esti if you’ve ever come across the Latin word est ‘he/she/it is.’5 The *h₁es- root is also the source of the Latin word essentia ‘being,’ which gives us the English words essence and essential.
But *h₁es- is not the source of the word be itself, or of its derivatives been or being. For that, we need to look at another PIE root: *bhuh-, which meant ‘become; grow.’6
The initial consonant *bh, which probably sounded like a b with a breathy release, was prone to change in PIE’s daughter languages. It lost the breathiness in English be. In Greek, *bh changed into another sound which we spell ph.
We can see a descendant of *bhuh- in the Ancient Greek word phýsis, which gives us the English word physics. The Greek phýsis originally meant something like ‘nature’: the way things are. In Latin, the *bh came out as an f-sound. The root *bhuh- came out as the initial component of futurus ‘what is to be.’
The footprint of a single PIE root, in other words, is still visible in be, physics, and future, three words you would never think to connect.
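The three outcomes of *bh described above amount to a small lookup table, and it can be written down as one. The Python sketch below is just that: the dictionary names are invented for the example, the spellings are simplified (ASCII bh for the breathy b), and the three descendant words are the ones already mentioned in the text.

```python
# What PIE *bh turned into at the start of a word, per the text above.
BH_OUTCOMES = {"English": "b", "Greek": "ph", "Latin": "f"}

# One descendant of *bhuh- in each language, as cited in the text.
DESCENDANTS = {"English": "be", "Greek": "phýsis", "Latin": "futurus"}

def starts_as_expected(language: str) -> bool:
    """Check that the descendant begins with the expected outcome of *bh."""
    return DESCENDANTS[language].startswith(BH_OUTCOMES[language])

for language, word in DESCENDANTS.items():
    print(language, word, starts_as_expected(language))
```

Regular correspondences like these, repeated across many words, are what allow linguists to connect forms that look unrelated on the surface.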
But neither *h₁es- nor *bhuh- can give us was or were. For the past-tense forms of to be, we need to turn to one more root: *h₂wes-, which meant ‘dwell; spend the night.’
This root wasn’t as prolific as the other two, but it may be the source of two names for goddesses: the Greek Hestia, goddess of the hearth and household, and her Roman equivalent Vesta. In each case, the word seems to have originally meant ‘dwelling’ or ‘hearth,’ and was likely later applied to the goddess who presided over the hearth and home.7
One root for being, one for becoming, and one for dwelling. These were, in the days of the speakers of PIE, entirely separate words. Yet as one branch of PIE gradually developed into English, they fused together.
And the reason lies in a quirk of PIE grammar which I find genuinely strange, even after years of studying it.
Half a verb
We’re used to the idea that verbs have a present and a past tense. This is how it works in English: the present I am corresponds to the past I was.
It’s conceptually the same relationship as in the pairs sing/sang, teach/taught, and work/worked, even if the way it’s expressed in each of these pairs is different. In fact, it’s close to a law: every verb in the English language has both a present and a past tense.8
The very idea of a verb without a past tense seems strange. But there is at least one verb in English which truly has no past tense: beware. You can’t say that he bewared of the dog, and for no good reason. It’s not as if bewaring isn’t something you can do in the past.
There’s also a verb with a past tense and no present: quoth, meaning ‘said,’ as in Quoth the Raven “Nevermore.” Again, this isn’t for any reason: the synonymous verb say is happy to be used in the present tense.
PIE was a whole language full of bewares and quoths. A given verb root could only be used in certain tenses. Sometimes there were workarounds — you could build a new verb from the root by adding a suffix — but for some roots there was nothing you could do. The verb simply had no way to express that tense.
The root *h₁es- ‘be’ was one of the restrictive types. It could be used in the present tense, such as *h₁esmi ‘I am’ and *h₁esti ‘he/she/it is.’ But it had no way to make the past tense.
The root *bhuh- ‘become; grow’ was the opposite: it could make the past tense but it had no present-tense forms. The root *h₂wes- ‘dwell; stay the night’, on the other hand, could do it all.9
So PIE’s daughter languages had a problem: how do you say ‘I was’ when your verb for ‘to be’ has no past tense? The branch that would become English solved it by pairing up two separate roots: present-tense forms from *h₁es-, past-tense forms from *h₂wes-.
A third verb, from *bhuh-, carried on alongside them for a while. In the earliest forms of English, it was used for future states and general, proverbial truths. It wasn’t until the later Middle Ages that all three finally merged into one.
Other branches made different choices. Latin fused *h₁es- with *bhuh- instead, which is why its present est and its perfect fuit ‘was’ look nothing alike.
But the result, in English, is the chaos of to be: am from one root, was from another, be from a third, all fused into a single verb that still carries the mark of its distant origins on the steppe.
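The patchwork can be laid out as a simple table from English form to PIE root. The Python sketch below does this for the forms discussed in this piece and then runs the headstone inscription through it. The names are invented for the example, the roots are spelled in plain ASCII, and are is filed under *h1es- only because the text calls that its probable source.

```python
# Which PIE root each English form goes back to, per the account above.
# ASCII spellings: h1 and h2 stand for the first and second laryngeals.
ROOT_OF_FORM = {
    "am": "*h1es-",
    "is": "*h1es-",
    "are": "*h1es-",  # only "probably": the origin of "are" is disputed
    "be": "*bhuh-",
    "been": "*bhuh-",
    "being": "*bhuh-",
    "was": "*h2wes-",
    "were": "*h2wes-",
}

def forms_of_to_be(sentence: str) -> list[tuple[str, str]]:
    """Return (form, root) pairs for each form of 'to be' in the sentence."""
    words = sentence.lower().replace(",", " ").replace(".", " ").split()
    return [(word, ROOT_OF_FORM[word]) for word in words if word in ROOT_OF_FORM]

epitaph = "As you are, I was. As I am, you will be."
pairs = forms_of_to_be(epitaph)
for form, root in pairs:
    print(f"{form:>4} <- {root}")
print(len(pairs), "forms from", len({root for _, root in pairs}), "roots")
```

Run on the epitaph, the last line prints "4 forms from 3 roots": are and am share one root, while was and be each bring their own.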
As I am, you will be
To be is the most frequent verb in the English language. It’s also the most irregular. And it’s the most irregular because it’s the most frequent.
Every generation of English speakers exerts a pressure to smooth out the irregularities they inherit, which is why we say helped and climbed rather than holp and clomb. But the most common words resist. They are heard so often that even the most irregular verb ends up getting transmitted perfectly from one generation to the next.
It’s in words like to be, which we’re never more than a sentence or two away from saying, that we retain the closest connection with our most distant linguistic ancestors. The strange features of their grammar are reflected in the strange features of our own.
In the patchwork of am, was, and be, we’re hearing echoes of words spoken 6000 years ago in the grasslands of the Pontic Steppe, by people long gone, who once lived, feuded, got drunk off honey mead, and told epic poems.
We have nothing to remember them by but their words, which are now our words.
As you are, I was. As I am, you will be.
Four forms from three ancient roots, carried by the mouths of the living for six thousand years, and carved into stone to give voice to the dead.
Further reading
Anthony, David (2007). The Horse, the Wheel, and Language: How Bronze-Age Riders from the Eurasian Steppes Shaped the Modern World.
Fortson, Benjamin (2009). Indo-European Language and Culture: An Introduction.
Kroonen, Guus (2013). Etymological Dictionary of Proto-Germanic.
Lass, Roger (1999). The Cambridge History of the English Language, Vol. 3: 1476–1776.
Lazaridis, Iosif et al. (2025). “The Genetic Origin of the Indo-Europeans.” Nature 639.
Ringe, Don (2006). From Proto-Indo-European to Proto-Germanic.
Spinney, Laura (2025). Proto: How One Ancient Language Went Global.
1. Catholic headstones tended towards the requiescat in pace ‘rest in peace’ genre; they channeled their memento mori energies elsewhere.
2. Ethnologue counts 3.39 billion speakers of Indo-European languages, so “half” is rounding up a bit.
3. The very latest research leads back even farther, pushing an ancestor of PIE to the North Caucasus and Lower Volga around 4400 BC (Lazaridis et al. 2025). If this line of research is correct, there is a prequel to be told about the language family before it came to the steppe.
4. How the form are came into English is a mystery. Some scholars suspect Norse influence, while others think it descends from yet a fourth verb root hiding within to be.
5. Other descendants of *h₁es- in Latin, although harder to recognize, show up in the epitaph itself: eram ‘I was’ quod es ‘you are,’ eris ‘you will be’ quod sum ‘I am.’
6. The first h in *bhuh- indicates that the b had a breathy release. The second h, on the other hand, is a laryngeal. Due to the shape of this root, we don’t know which of the three it was, so it’s written without a subscript number. This is the kind of thing that keeps some historical linguists up at night while normal people are sleeping, blissfully unaware that there’s even a problem.
7. Some scholars debate the connection of *h₂wes- to Hestia. Their reasons are technical: they expect the root to come out differently in Greek. The alternative is to say that we don’t know, which many linguists (like the rest of humanity) find hard to do.
8. Depending on your perspective, the modal verbs will, can, shall, etc. can be understood as having no past tense, although you could argue that would, should, could, etc. fill that role.
9. More precisely, *h₁es- had no way to form the PIE tenses which merged to create the English past tense, that is, the aorist and perfect tenses. It did have a way of expressing being in the past, using a tense called the imperfect. The root *bhuh- had an aorist and perfect but no present or imperfect. The way tenses worked in PIE was very different from how they work in English, and the nomenclature is complicated. The branch of PIE that became English simplified the tense system of PIE down to only two: present and past.



That's a great explanation. I've always wondered about how strangely "to be" is conjugated in all the languages I know. I'm currently relearning Scottish Gaelic for a hiking trip in the Hebrides this summer. (I'm just hoping to find someone to say ANYTHING to. Bless my heart.)
Anyway, their forms of "to be" are strange in similar ways. They have 2 separate verbs to choose from: one for predicate objects, like "John is a teacher", and another for adjectives, like "I'm tired." Then in past and present, you have a basic form for all subjects, a negated form, an affirmative and a negative question.
One of those basic present forms is "Is", and its affirmative question is "An"; the other verb is "Tha" (pronounced ha), and its question is "à bheil?" These look on the surface to fit right into the PIE pattern you laid out here.
Scottish Gaelic, however, is VSO, which must be pretty rare among Indo-European languages. I wonder how THAT would happen.
Enjoyed this, as always!
Apparently “bewared” was used in the 19th century: https://en.wiktionary.org/wiki/bewared. I realise that it will have been a neologism (now not very neo), but that reinforces your point really that English is resilient enough to happily create neologisms along standard lines for any word it wants to. It also provides a counter-point to your point about common words being irregular: if we more often needed to refer to people bewaring in the past, we would no doubt adopt bewared. Perhaps the tendency to retain antique irregular forms is sharply bimodal: the most common words stay antique because they are so naturalised; the least common words stay antique because no one needs or cares to modernise them. (This also applies to quoth.)