What came before English?

The story of Proto-Indo-European

Colin Gorrie

Jun 04, 2025

If you're interested in the history of the English language, which I’m assuming you are if you subscribe to this Substack, you've probably spent at least a little bit of time pondering its very deepest roots.

Over the last couple of months, we’ve been exploring various periods in the history of English, and some of the nearer branches of its family tree, such as Gothic and Frankish. But we’ve never quite answered the question: Where did the English language actually come from, right at the very beginning — or, at least, as far back as we can fathom?

Following this trail leads us back beyond Old English, beyond the earliest written texts, and across thousands of years of unwritten linguistic history — a history which connects the English language with its relatives spoken from Ireland to India and beyond. It’s a trail that ultimately leads us back to a language called Proto-Indo-European.

Understanding the Proto-Indo-European language is like having a key that unlocks almost every language historically spoken in Europe, including Latin, Ancient Greek, and Old English. Almost everything strange about these languages starts to make sense when you understand where they came from.

And the same is true of culture: the fascinating alien world we see glimpses of in ancient poems like the Iliad and Beowulf, where heroes slay dragons, guest and host are bound by a sacred bond, and the gods themselves take part in great wars — this world descends from the cosmos imagined by the speakers of Proto-Indo-European.

It’s an incredibly rich subject, but until now, you’ve needed to have a grasp of linguistics, archaeology, and genetics to understand the full picture. But that’s all changed with the release of the book Proto: How One Ancient Language Went Global, the first accessible introduction to Indo-European studies in years.

Today, we're incredibly fortunate to be able to explore this topic with the author, Laura Spinney, who will be our guide through this world of ancient languages and cutting-edge science.

You can watch my interview with Laura by clicking on the video above. Below is a transcript of our conversation, lightly edited for clarity.

You're reading The Dead Language Society. I'm Colin Gorrie, linguist, ancient language teacher, and your guide through the history of the English language and its relatives.

Subscribe for a free issue every Wednesday, or upgrade to support my mission of bringing historical linguistics out of the ivory tower and receive two extra Saturday deep-dives per month.

This weekend, we’ll explore how linguists figure out what a language spoken thousands of years ago sounded like. We’ll get into the nuts and bolts of linguistic reconstruction and the comparative method, which is like one of the biggest and most interesting puzzles you’ll ever get to play. And I won’t just tell you. I’ll show you how to do it yourself, with plenty of examples for you to puzzle out along the way.

Colin Gorrie: Well, today we have here with us Laura Spinney, the author of Proto: How One Ancient Language Went Global. Laura, thank you so much for joining us today.

Laura Spinney: Thank you for inviting me.

CG: Let's start off: one ancient language that “went global.” What's the language?

LS: The language is Proto-Indo-European, the name that we give to the hypothetical common ancestor of the entire family of Indo-European languages, which is the largest language family in the world today, whether you measure it by number of speakers or geographical range. It’s spoken by nearly half of humanity as a first language.

CG: Do we speak one of these?

LS: We do. English is in the Germanic branch. There are twelve main branches, of which one is Germanic.

CG: From the perspective of someone who knows a little bit about the history of the English language, how does English fit into this whole picture? One of the Germanic languages, as you say. But how does that all fit together?

LS: I mean, you’re the expert in Old English. English is one of the descendants of Proto-Germanic, which is the common ancestor of the Germanic branch, one of the twelve branches that we think descended from Proto-Indo-European. Two of those branches, Anatolian and Tocharian, have completely died out, leaving no living [descendants].

CG: So this brings us over a huge, huge part of the world, doesn't it?

LS: Enormous.

CG: How far did these Indo-European languages extend?

LS: Ok, so I think we have to make a distinction between their spread before 1492 and their spread after it, which I don't go into in the book. That's essentially when they leave the “Old World” and spring into the New, in the age of empire, in the age of colonialism, when there were ocean-going ships to carry them across the seas.

So now, the Indo-European languages are spoken on every inhabited continent. The map is quite impressive. But let’s say that before that time, they were spoken throughout Eurasia, with some large pockets of other things.

But that was their domain, they didn't go further than that. But, within that domain, they were spoken from the Atlantic coast of Ireland to the northwest of China. That's Tocharian, which has died out. And, of course, including the subcontinent: all the Indic and Iranic languages of what's now India, Pakistan, and Afghanistan.

CG: I remember I had a fun experience one time with two friends of mine, one from the Netherlands, one from Iran. And they were asking me some things about what I did, and I said, “Well, to explain, why don't you each tell me the word for daughter in your language?”

[Editor’s note: Their answers, by the way, were Dutch dochter and Persian doxtar, pronounced remarkably similarly for languages so far removed in space.]

And then they realised, “Ok, this is something a bit interesting.”

LS: And it's always a rather sweet moment, this sort of eureka moment with that kind of “aha!” of discovery. And “Wow, there's a mystery there that I need to understand,” and I guess that little moment is one of the things that drew me into this as a complete novice to begin with.

CG: So what brought you into this wild world of Indo-European studies?

LS: Well, from my very personal point of view, I'm a science journalist. And if you were to look back at my writings, you'd see I had a fairly eclectic path. But I would say that is the privilege of a journalist. So I have been interested in language before, within the very broad realm of the biological sciences.

But I was always interested, I suppose, from a more neuroscientific and psychological point of view. And then I was in Santa Fe, in New Mexico, in the summer of 2022, just as we were coming out of the worst of the pandemic. And I was lucky enough to have been included on their summer course for teaching complexity science as one of two journalism fellows.

And it was really quite an extraordinary experience because I was surrounded by incredibly clever people, people much cleverer than me, from all kinds of domains, who were there to learn about complexity and the maths and the physics of complexity. And so I was milling around with them, and they were from the social sciences, biological sciences, physical sciences. And there were linguists in the mix.

And I realized that there had been this huge watershed in the field of historical linguistics generally, but also particularly the Indo-European story, because of what had been contributed to it by paleogenetics, by ancient DNA, this revolution in biology of the last ten years.

Now the obvious question is: how can genetics change a story about language? But I think the link, at its most basic, is really simple, which is that prior to history, prior to the written part of our past, the only way we have of tracing people's movements is through archaeology and genetics. And migration has always been a major, if not the main, motor of language change and dispersal.

Because when people move, they, at least to begin with, carry their languages with them. And that was probably truer in prehistory than it is now, when we have schools and the internet and so on.

So, if you can understand the paths taken by prehistoric migrants, then you've got a very big window opened onto your language dispersal, and there are several examples where we know that the branching of linguistic family trees maps really quite well onto the paths of those migrations through the prehistoric world.

And the same has turned out to be true of the Indo-European languages as well.

CG: Interesting.

LS: Not always. There are exceptions, but it's a very useful rule of thumb.

CG: It's a new source of data that we may not have had access to before. And even if it doesn't tell the whole story, it tells us something important.

LS: Exactly. To continue with how I got to the point of writing a book: When I started looking into this fascinating, still fascinating to me, subject, I realized that things were moving so fast that even the veterans in the field would say nobody's got an overview of it, that nobody can map it all together from linguistics, archaeology and genetics in a very cohesive way that's up to date at the same time. And certainly not for a general reader.

And yet, to me, as a general reader, it seemed to have massive intrinsic interest. So I thought, well, maybe this is the type of situation where a science journalist can actually contribute something value-added, because I can talk to the people in the field, I can read enormously, I can go to their meetings, and I can try to weave a narrative out of what's coming out that will at least capture what's happening. To present a kind of State of the Union for people at this moment in history.

Now, surely the field will move on and my book will be out of date soon-ish. Except, I think that most linguists would probably agree that the broad contours of the story are in place now, so there are not going to be major changes. Although there certainly will be in smaller parts of the story.

CG: I want to go down maybe a side path here. What you're saying about linguists agreeing that the major broad strokes are in place, I think this is perhaps something that linguists have felt for a while, and that the archaeological community may have had slightly different ideas.

How have these two different groups — or three different groups now with genetics — all talked to each other? Do they agree? How do they inform each other?

LS: The first thing to say is they do talk to each other; they don't always understand each other. The way I put it in my book is that they're barbarians to each other, where the original meaning of barbarian is ‘people who don't speak the same language.’ What they're saying sounds like “bar, bar,” or “blah, blah” to each other. And I think there's a bit of that.

Their definitions don't necessarily accord, and they're very much aware of that, but they also see the great value in having, as you touched upon, a new source of information. So, for each discipline, two other independent sources of data against which they can check or test their theories. So they realize, I think, the value of this situation is in its ensemble, even if they don't always understand each other.

The most obvious way I can think of expressing how they don't necessarily speak the same language is in what they mean by identifying a human group. Archaeologists will identify a group or some kind of community that itself identifies as one, through material culture, through the objects they pull out of the ground, recurring assemblages of objects.

So that's their broad definition of identity. But archaeological cultures in that sense can rise and fall. They can vanish from the archaeological record one day, whereas the genes carried by those people will flow on. They might be diluted, they might be mixed with other genes, but they're still there in some sense.

So geneticists have a different concept of identity of a group. And linguists a different concept again, because languages can change by descent vertically, but also by contact, horizontal contact between languages, as you know very well, borrowing and so on. So, there are two ways there in which languages can change.

And yes, three very different ways of talking about human identity, but three very complementary ways and essential ways, I would say, because we do have different ways of identifying ourselves. And language is intrinsic to the way we identify ourselves, but it's not all. There's also our parents, and where we take our ethnic or genetic stock from. And then there's the culture we belong to, and the religion we belong to, and so on. So identity is multifarious, and somehow we have to accommodate all of these definitions. And that's what they're struggling to do, I think.

CG: I couldn't have put it better myself: there’s three views on the same thing. And our challenge with something so distant in the past is how to figure out how to translate between them.

LS: There's this rule of thumb that migrants carry their languages with them, but they didn't always. Sometimes they abandoned them and took up the language of the place they were in.

There’s many, many exceptions to this rule. There again, you see this dynamic — the three can inform each other: genes, culture and language, but they don't necessarily travel together. And that's essential to understand, and why I laid it out at the beginning of my book.

CG: To focus in on the linguistic side of that triangle for a second, when we talk about the Proto-Indo-European language, you mentioned that it's reconstructed. What does that mean for us? Why do we see all these little asterisks before all the words?

[Editor’s note: Typically, Proto-Indo-European words are written in a daunting, almost algebraic notation, along with an asterisk in front of them, such as *h₃rḗǵs — this word is usually glossed ‘king, ruler’, although there will be more to say about that, and the asterisk, soon.]

LS: Proto-Indo-European, as I said, is hypothetical: linguists assume that it existed as a cluster of dialects that belonged to one linguistic identity (of course, there are complications over the definitions of dialects and languages), the group that we call Proto-Indo-European. But that language was spoken long before writing, we assume, because there is no such language written down.

The consensus would be that it has been dead for at least 4000 years. And, although writing was invented roughly 5000 years ago, that was not in the part of the world where Proto-Indo-European was spoken. Writing got there eventually, and it didn't take too long, but Proto-Indo-European was not a language that was written down.

So how do we understand anything about it at all? Well, the historical linguists have been at that job for a very long time now. And they essentially do it by what they call the comparative method, which is where they compare related words, expressions, grammatical structures, and so on across languages from the same family. And, because they know broad principles of how branches diverge from a common ancestor, the sounds that differ between them, they can compare across those branches and try to reconstruct what the original form sounded like.

That's it in a nutshell. And, doing this, they have elucidated some aspects of how that language worked, what it sounded like, and a sizable, although still skeletal, part of its vocabulary.
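As a rough illustration of the first step of the comparative method described above, the sketch below lines up a few standard textbook cognates and counts which word-initial letters recur together across languages. It is a toy under stated assumptions, not how historical linguists actually work: the word list, the simplified spellings (diacritics dropped, stems used for Latin and Sanskrit 'foot'), and the function name `initial_correspondences` are all choices made for this example, and real reconstruction compares sounds in every position rather than first letters.

```python
from collections import Counter

# Textbook cognate sets in simplified spelling (diacritics dropped).
# Latin "ped" and Sanskrit "pad" are stems, not full dictionary forms.
COGNATE_SETS = {
    "father": {"English": "father", "Latin": "pater", "Sanskrit": "pitar"},
    "foot": {"English": "foot", "Latin": "ped", "Sanskrit": "pad"},
    "full": {"English": "full", "Latin": "plenus", "Sanskrit": "purna"},
}


def initial_correspondences(cognate_sets, languages):
    """Count how often each combination of word-initial letters recurs."""
    counts = Counter()
    for forms in cognate_sets.values():
        # One tuple per cognate set: the first letter in each language.
        pattern = tuple(forms[lang][0] for lang in languages)
        counts[pattern] += 1
    return counts


if __name__ == "__main__":
    langs = ("English", "Latin", "Sanskrit")
    table = initial_correspondences(COGNATE_SETS, langs)
    for pattern, n in table.most_common():
        print(dict(zip(langs, pattern)), "recurs in", n, "cognate sets")
```

In this tiny sample the same pattern (English f, Latin p, Sanskrit p) turns up in every set. A regular correspondence of that kind, rather than a one-off resemblance, is what lets linguists propose a single ancestral sound behind the related forms.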

CG: So you aren't going to find a course on conversational Proto-Indo-European?

[Editor’s note: I have, in fact, heard that this has been attempted, although I couldn’t find any links to active courses, alas.]

LS: No. And to come to a point about the asterisk, because maybe we’d better explain, the convention is simply to put an asterisk before a reconstructed word to indicate that it has never been documented, that it is hypothetical.

CG: How hypothetical do you see these hypotheses? We don't have direct evidence. These are ideas, these are theories about what may have been. But are they wild, speculative theories, or are they theories that have quite a lot of grounding to them?

LS: You mean about the reconstructions themselves?

CG: Yes.

LS: There's a spectrum of certainty, I suppose. I mean, the idea is that the more cognates or related forms of a given word you find in different branches, the older it is, the more certain you can be that it belonged to the proto-language, the parental language.

But how many branches you need to be sure of that, and what body of evidence you need to be sure of that, is contested, and always has been. People argue over it. So there are plenty of disputes, and some aspects are relatively intangible and harder to get at than others.
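The idea of counting branches can be sketched in a few lines of code. The example below is only an illustration: the data structure, the branch labels, and the function name `branch_support` are assumptions made here. Dutch dochter and Persian doxtar come from the conversation above, and the remaining forms are standard textbook cognates in simplified spelling. The sketch deliberately sets no threshold for "enough" branches, since that is exactly what is contested.

```python
# (language, branch, form) triples for each meaning.
# "dochter" and "doxtar" are the forms quoted earlier in the conversation;
# the other entries are standard textbook cognates, spelled without diacritics.
ATTESTATIONS = {
    "daughter": [
        ("English", "Germanic", "daughter"),
        ("Dutch", "Germanic", "dochter"),
        ("Persian", "Iranic", "doxtar"),
    ],
    "father": [
        ("English", "Germanic", "father"),
        ("Latin", "Italic", "pater"),
        ("Sanskrit", "Indic", "pitar"),
    ],
}


def branch_support(attestations):
    """Return, for each meaning, the number of distinct branches with a cognate."""
    return {
        word: len({branch for _language, branch, _form in forms})
        for word, forms in attestations.items()
    }


if __name__ == "__main__":
    for word, n in branch_support(ATTESTATIONS).items():
        print(word, "is attested in", n, "branches")
```

Note that English and Dutch count only once for 'daughter', because both belong to the Germanic branch. Agreement between two closely related languages says little about the proto-language, whereas agreement between distant branches says much more.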

So, for example, the meanings of words can shrink and expand over time, and words can take on second meanings. The example I give in my book, which is often cited, is mouse, which English takes from Latin mūs, and which, you know, originally referred to a small furry rodent, but now also refers to the device we use to move around our computer screen.

When the Romans used it, they certainly didn't mean that. So we have to be very careful in mapping back onto ancient times the meaning of words from what they mean now.

[Editor’s note: Technically, English mouse descends from the Old English word mūs rather than the identical Latin word mūs, but the point stands.]

CG: When we talk about ancient times, you mentioned that we are at least 4000 years removed from whenever this language may have been spoken. What's the latest thinking on roughly where and when Proto-Indo-European may have been spoken?

LS: You mentioned that linguists and archaeologists thought differently about this, and one of the great impacts of the ancient DNA revolution has been to tip the balance of evidence very much in favour of the linguists' theory over the archaeologists' theory. So, just to go back, there have been many theories of where this language was spoken and when and by whom. By the turn of the 21st century, I suppose there were two remaining on the table, two main ones.

The first was that the language was spoken in Anatolia, in what is now Turkey, 8000 or 9000 years ago, and that it spread out from there with the first farmers after the Neolithic Revolution, who moved east and west as their populations grew and as they needed more land, and so took their languages with them.

The rival theory is one that's known as the steppe hypothesis, because it places the homeland, the birthplace of the Indo-European languages, in the steppe north of the Black and Caspian seas, in what is now Ukraine and Russia, 5000 years ago, so several millennia later. And the idea is that there, too, was a group of people who invented a very successful way of life and economy, and that this led to a population explosion. Again, they radiated east and west, and took their languages with them, but at a later date.

Now, when the ancient DNA community showed in 2015 that there had been a very important turnover in the European gene pool around 5000 years ago, that was taken as corroboration for the steppe hypothesis: that there was a major influx of people into Europe at that time, and also eastwards into parts of Asia, and that this influx of people was a sort of genetic marker of the arrival of the Indo-European languages. So now the steppe hypothesis I’d say is definitely leading the game.

The Anatolian hypothesis is not dead. But the person who came up with it, Colin Renfrew, the Cambridge archaeologist who died only last year, himself retracted it. He said that Marija Gimbutas, the most prominent proponent of the steppe hypothesis in the 20th century (and his friend, by the way), had been broadly right all along.

Now, there are some parts of the story that are still not solved and that need to be ironed out, but she was broadly right. And I think that that is also the consensus today.

CG: So, according to the steppe hypothesis, we have the Indo-European languages radiating out from this point north of the Black and Caspian Seas. And they radiate both west and east. What brings them, what brings these languages? Is it just people wandering? Is it a search for resources? How did these spread?

LS: Yes. So, there is good archaeological evidence of a population explosion amongst those people, but they were nomadic, so it probably wasn't on the scale of the farmers earlier, who discovered this way of producing food that meant that a given piece of land could support many more of them. But it was a population explosion nevertheless.

And so they did move east and west. We now have genetic evidence for that. But the conundrum, when that was shown ten years ago by two studies in Nature, was: How did they pull it off? Precisely because there had already been this influx of farmers several millennia earlier. Nobody denies that. That's definitely true. The genetics also attest to that. There were farmers all over Europe.

If we just focus on the western end of this for the moment, it's difficult to talk about what happened on both sides at the same time.

But the population of Europe is estimated roughly to have been about 7 million 5000 years ago. So that's 7 million farmers, fairly settled in villages and settlements and speaking completely unrelated languages, because nobody disputes that the farmers took their languages into Europe when they went, 8000 years ago, or starting earlier.

But the steppe hypothesis holds that those languages were not Indo-European. So the Indo-European languages come in later with the steppe nomads, and they imposed them on those farmers. But how? They were numerically far inferior. And the numbers are very, very vague, but the thinking is that they didn't number more than several tens of thousands at the peak of the migration from the steppe.

And by the way, the archaeologists are fairly convinced that the Yamnaya, which is the name of the archaeological culture to which those nomads belonged, probably didn't go further west than Hungary, the extreme western end of the steppe. So yes, the languages and their genes were carried further, but probably by their descendants.

So how did they do it?

Ten years ago, if you'd asked, probably the leading theory would have been violence. Somehow these people were violent and they imposed their genes through rape, pillage, even genocide, ethnic cleansing, and sort of tore across Europe in this orgy of violence. But the thinking has changed on that really quite radically over the last ten years. And this is one of the things that was so interesting to me. It's changed on several fronts.

I think now the thinking is that the way they managed to do it was much more multifactorial. And just to show how extraordinary it was that they pulled it off, I give the parallel in my book of the city of New York, which at the beginning of the 20th century became the most populous city in the world, with a population roughly the same as Europe's at the end of the Neolithic, about 7 million people.

So there was a massive wave of Italian immigration into New York around the turn of the 20th century, and it's as if those people coming in, Italian speakers, had managed to switch the whole of New York to speaking Italian, rather than what really happened, which is that they assimilated and learned to speak English. Of course, they probably kept their language at home, but they assimilated to the official national language of the town.

And yet this is not what happened in Europe 5000 years ago. Why not? So the multifactorial explanation implicates plague. We now know with quite a good deal of certainty that there were plagues roaring across Europe at that time, probably just before the arrival of the nomads, although it's possible that they brought the disease or diseases with them.

One of them was likely a form of plague, although there's a discussion also about how closely that disease resembled plague as we knew it in the Middle Ages, for instance, whether it was as deadly, how it was transmitted and things like that. But anyway, it's likely that this plague or plagues affected the more sedentary farmers to a greater degree than the nomads coming in, who, because they lived with their animals and may even have brought at least some of the diseases, had probably had time to develop some kind of immunity.

So that's one part playing into this, because obviously it's easier to impose your language if you're moving into empty, vacated lands and just settling them. And no violence is necessarily involved. The other thing I think that people have realized is that turnover, while dramatic in the gene pool, took longer, at least in some parts of Europe, than had initially been supposed.

So, in Britain, it probably took in excess of 12 generations. Now, if you imagine that length of time (we're talking, let's say, 250–300 years), you could easily imagine people trickling in, trading politely with the locals, not marrying for some time, then marrying and having children, and a much more gradual imposition of the languages, again without violence.

Now, there are many other explanations that have been put forward, and all might be relevant, all might have had their role to play. I really like the idea that soft power had its role to play, that these newcomers coming from the steppe had something about them in their way of life that gave them a certain prestige that appealed to the locals and that persuaded them, in some way or other, to adopt the incoming language.

And one of the things that we know about those people, in part from archaeological evidence, in part from linguistic evidence, from the reconstruction of their languages, their vocabularies, and also from their mythologies and the stories they told is that they put quite a price on hospitality, and they probably did so because they came from the steppe. They came from a way of life where they were moving around in this vast space, these vast grasslands, all year round. They never stood still.

So they were separated from their related groups and clans for long periods of time over vast spaces. And they had to develop institutions for maintaining social cohesion, or at least reducing the risk of conflict. And so hospitality was a big part of that. And we can tell from reconstructing some of their words that there was an expectation of reciprocity, for example, that probably, or at least possibly, feasting was involved, that there were bards involved, that there was storytelling involved.

It is possible these people brought all those traditions to Europe that are now preserved in the written cultures of later Indo-European traditions, the Norse traditions, the Iranic traditions, the Celtic traditions — feasting, storytelling, bards, drinking — but probably later on, probably not the first nomads. That's another thing we can explore, if you're interested.

So there's this idea that they were good at building and maintaining alliances. And if they married into and formed sort of military alliances with the local people, which the genetic evidence suggests they did, then perhaps they also drew them into their hospitality, into their feasting, into their storytelling. And somehow through these happy meetings, the languages spread.

CG: There's certainly something beguiling about the literature that they handed down. Things like Homer, or close to my heart, Beowulf. You see some of these traditions as they've been developed and elaborated and perhaps changed over the years, but there are still some very interesting archaic remnants that are very appealing: the idea of the duty of the guest to the host and the host to the guest, the kinds of storytelling, the bard in the mead hall in Beowulf, all of these things are so interesting.

LS: Yeah. And the idea in Beowulf of the bad guest. You know, that is one of the worst sins you can commit, to be a bad guest and not to follow the rules of hospitality.

CG: Some of the best literature is about bad guests. The Iliad, the classic bad guest. Grendel [in Beowulf], as you say, another awful guest.

LS: Yeah. And this is one of the worst things you can imagine in the Indo-European tradition, one of the worst insults that you can make.

CG: And so even today, with our remove of thousands of years, these stories are interesting. And you can sort of see why the Yamnaya or their descendants may have had some sort of cachet. They might have had a “cool” factor.

LS: I think you can see something much more basic as well, and possibly that's the first time we've illustrated it: how language and myth are themselves an archive of what happened in the past, even in the unwritten past.

CG: So shall we talk about “undying fame”?

LS: Absolutely. What would you like to talk about?

CG: Well, I'll introduce this because this is a really fun concept. This phrase that translates to ‘undying fame’ or ‘imperishable fame’ shows up in all sorts of different Indo-European descendant traditions in the poetry. You see it in Vedic, you see it in Homer, you see it in the Germanic cultures as well. You can even reconstruct the phrase ‘undying fame,’ although it sometimes changes in the descendant languages.

And this is a real concern of heroes in these archaic or ancient poetic traditions.

LS: Yes. There is this idea, both from linguistics and archaeology, that these people had an expansionist mindset and that they had a culture of raiding, of taking other people's land. Not that there was a real sense of land ownership in the modern sense, but taking land which was frequented by others, taking their animals, taking their women, but that this was all regulated.

So there are words that have been reconstructed for loot and raiding, but also for blood price, words related to restitution. So some sense in which they were always looking outward. And, in fact, we can talk about the tradition of the war bands, that young men were raised to look outwards and to expand the dominion of their people.

But it was tightly regulated so that everything didn't degenerate into violence. And the hospitality was also part of that patching-up side of things, so that this society actually worked and functioned, and that you could cross other people's territory, which you needed to do if you were nomadic with your herds without provoking a war each time. So, there had to be mechanisms for both. But, yes, you can find plentiful evidence of this, both in the language and in the archaeology.

The sticking point in this story is just how violent those people were. I mean, one interesting window on this comes from ethnography. So, you know, comparisons have to be very carefully done because, again, it’s difficult to project back onto the ancient world from the modern one.

But you can compare processes, conditions that give rise to certain ways of behaving in the modern and ancient world, and say that if those conditions are there, it's more likely that certain things came out of it. So, modern ethnographers tell us that nomadic pastoralist peoples tend to be, in general, on average, slightly more violent than sedentary farming types simply because their wealth is mobile.

So they may have been violent in that sense, in that there was this constant give and take and raiding and restitution and so on, all of which is coded for, all of which has mechanisms for resolving it. But they weren't violent in the sense that I said was discussed earlier, that they're going to come into Europe and commit genocide.

So there's that distinction there, which is one that has fueled a lot of discussion. But I think that you can make that distinction, that they were perhaps more violent than farmers of the same era, but not necessarily violent enough to commit a genocide.

CG: And their violence may have been contained within social institutions and rules.

LS: To some extent, it was expected.

CG: So maybe we can touch a little bit more on some of these social institutions. You mentioned this institution of the raiding band. How did that work?

LS: The idea from piecing it together from later Indo-European traditions is that young men were organized socially into age sets, and that this created a horizontal link. It may have been just an elite mechanism; it may not have been the whole population. But anyway, they grew up together. They went through the various stages of life together, and, in their youth, in their mid- to late teens, they went through a rite of passage and formed these war bands for seven or so years, and they were the ones who in a way drove the expansion.

This was the making of them as men. In fact, the linguists and the comparative mythologists tell us that there was a sense in which they died in the eyes of society to be reborn later on as fully fledged warriors and men. But they had to go and prove themselves, and they proved themselves by going out, pushing out the boundaries of the territory, but also the wealth in terms of cattle or animals, mobile wealth again, also women. They are associated with sexual promiscuity in the mythologies of the Indo-European traditions that come down to us.

And there's also, interestingly, an association with dogs and wolves. So when we talk about dogs of war, that's probably the ultimate origin of that concept. And there's even archaeological evidence of rituals that involved the sacrifice of dogs and wolves that may have marked the initiation of young boys into this phase of their lives.

CG: I'm thinking, in connection with this, of the many appearances of ‘wolf’ elements in names in different descendant languages. Beowulf is a good example. For those who haven't read it, Beowulf is a great hero. And his name either means something like ‘bee-wolf’ or ‘barley-wolf.’ The first part's debated, but the second part is really clear: it’s ‘wolf.’

And you see it in ancient Greek as well, this wolf association. And what's really interesting from a Germanic perspective is this word in Old English wearg. The equivalent word in Old Norse means ‘wolf,’ but in Old English, it means ‘outlaw.’ So you can still see these connections.

LS: Excellent example.

CG: So we have this raiding age-set, which is kind of like people who all graduated in the same class today, if I can make a comparison. And that was a horizontal mechanism of binding people together. What was the vertical dimension of society like? Was it a strongly hierarchical society that we can reconstruct?

LS: Yeah. So there's this concept of the *h₃rḗǵs, which later turns into rēx, and words for ‘king,’ but in Proto-Indo-European, I think the consensus would be that it might have meant something closer to a sort of priest, someone who regulated or determined what was right, who may have been the one who mediated in any kind of conflict over property or whatever.

So, yes, hierarchical, the concept of dominant and subordinate clan chiefs. Words that have to do with oaths, and the sort of serments [ed. oaths] of allegiance that one might make as a more lowly clan chief to a higher one. Also, protection conferred by the higher lord to the lower one.

So, yes, very much a hierarchy, very patriarchal. And not to say that women couldn't or didn't have positions of power, but perhaps different forms of power exerted differently. But we have pretty strong evidence, let’s say, that it was a society where women moved out of their families’ households into their husbands’ (I don't like talking about marriage: I was told off for this the other day, marriage being a Christian concept), or rather into their partners’ or mates’ households. For example, one piece of linguistic evidence for that comes from the fact that there are many words for a woman's in-laws and none for a man's. So that's an example of how linguistic evidence can help us to piece together the structure of that society and how it functioned.

CG: I really love the example you gave, talking about this *h₃rḗǵs or however you want to pronounce that. I’ll put it in the captions!

LS: With the asterisk!

CG: Right, *h₃rḗǵs. It's intense. And exactly how you pronounce it, even practitioners will debate or they will just say, “the rēx word.” It comes out to mean ‘king,’ but the words you used in describing the functions of this office were really appropriate because you said regulate and to tell what's right.

And these two words are also descended from that exact same root. It’s this ‘straight, direct, rule.’ All of these things are in that semantic cloud: the ‘regulator,’ the ‘aligner,’ is the *h₃rḗǵs.

LS: Maybe we should say that the pronunciation of these words is very dubious. I always give the example, I had a friend, sadly deceased, who was a professor of medieval French history. And she said to me, how can we possibly know what this language sounded like when we don't know what Occitan sounded like in the 14th century, from village to village in the Midi, in the south of France?

And she had a point: we're talking about a time again before writing, we’re probably looking at dialect chains, so there was variation within Proto-Indo-European. And then, of course, we're looking back over at least 4000 years. So there's plenty of room for disagreement over how it sounded.

CG: And I think it's good to always have that humility when you're approaching these things. This is why I almost hesitate to even utter the *h₃.

LS: Maybe I can, as a non-linguist, I can wander in and give my amateur pronunciation. But yes, just to give an estimate of what it sounded like sometimes can help.

CG: It’s hard with this very algebraic notation, with these *h₃'s and the diacritics on all of the consonants.

LS: It can be quite daunting.

CG: It can.

We talked a little bit about the European side, but there's one story that I think is really interesting and not very well known, which is the other side: the expansion of Indo-European languages into Asia and, in particular, you mentioned, into what is now western China. So, what is going on there?

LS: I know. It's an amazing story. It's one of the earliest branchings, not the earliest, and the first branchings of the family away from Proto-Indo-European are probably one of the most contested parts of the story. But the kind of consensus, at least until very recently, has been that the first language branch to split off from that root was Anatolian. The second branch was Tocharian, which is one of the branches I mentioned earlier on that has completely died out.

But Tocharian was probably two languages, imaginatively named Tocharian A and Tocharian B, for which we have a corpus of textual evidence from the 5th to the 10th centuries AD/CE. So it's not very much to go on. It's a very fragmentary corpus, and it comes down to us from a time just before those languages were extinguished and long after the supposed Proto-Tocharian language was spoken, the common ancestor of those languages, which somehow reached northwest China.

But the comparison of the inscriptions, the texts preserving Tocharian A and Tocharian B, allows linguists to say quite a lot about Proto-Tocharian and also just to say with very good confidence that it's definitely Indo-European. You can see that in the core vocabulary, the words for ‘cow,’ for ‘ox,’ for ‘milk.’ The core vocabulary, as I refer to it in the book, the words for members of the nuclear family and things like that, is clearly Indo-European.

So what does that tell us? That tells us that at some point, somebody carried an Indo-European language from the place that we think it was born, in the region of the Black Sea, over the Urals and thousands of kilometers to the east, to what is now the northwest of China.

Already, there must be an extraordinary story behind that. It's, again, a very patchy story. Lots of it has yet to be filled in. But I suppose if I summarize the thinking on that: there was a bunch of Yamnaya, speaking some eastern dialect, or dialects, of this dialect chain that we call Proto-Indo-European — in the eastern part of the range, probably closer to the Caspian Sea, let's say, but still west of the Ural Mountains — who carried it east a very long way.

And, by the way, they are thought to have done so quite a long time, probably at least a thousand years, before the next departure of Indo-European languages to the east, which would be the Indo-Iranian branch, because there are very significant sound differences between Tocharian and Indo-Iranian. So it's thought to have left quite a lot beforehand. So it's a different story, it's not the same story. And what I love about this story is the question: why would people who were probably mainly on foot, with wagons drawn by oxen, set off on this vast journey thousands of kilometres into the unknown?

There's some evidence that they did it quite quickly, that there was sort of one group of people who left and perhaps completed the journey. Estimates vary because radiocarbon dating can't be terribly precise in this context. But some people say, as quickly as two years, up to a hundred years. Maybe around ten years would gather the most acceptance at this point in time.

But we do know that there are cousins — not first cousins, but let's say perhaps second or third cousins — buried at either end of that expansion, so in the Don Valley and in the Altai Mountains, 3000km apart. So they must have undertaken that journey in a relatively short period of time.

Now, why would they do that? We don't know, because they can't tell us and they never wrote it down. But that doesn't mean we can't say anything about it. And this is where it gets really fascinating, I think. You might suppose, as very often with prehistoric migrations, that climate change played a role, which is quite possible.

But climate changes vary enormously across the steppe, then as now. So let's say they were suffering a drought to the west of the Urals. They would probably only have had to cross the mountains to solve their problems and to find enough grass to feed their herds. So it probably wasn't just that.

Now we know, for example from grave goods that have been found under their burial mounds, their kurgans, that the Yamnaya were expert metalworkers. The Altai has an unusual combination of copper and tin, which are the two ingredients of bronze. So they may have had scouts who told them, “Here are untapped seams of the metals that you covet over here, if you can just get here.”

But another idea, which is perhaps the most intriguing, is that they left for ideological reasons, that there was some kind of religious or political or ideological schism with the main body of their community of the Yamnaya, and they left to seed a new community and live a purer life, or to return to some older form of the religion, perhaps. And you can actually see that archaeologically in slight differences between the burial rites back west and those of their immediate descendants in the Altai Mountains.

So, that's the idea that I loved, by no means proven. And there are certainly other theories in the mix, but that some sort of small splinter group set out with their wagons, with their oxen, with their elderly, with their children, and took their lives in their hands, because the steppe is an extremely hostile place, and started a new community in the Altai Mountains thousands of kilometres away.

CG: Wow. And then their descendants even journeyed farther than that.

LS: Yes.

CG: And lived a very different life.

LS: We have to get them from the Altai in southern Siberia, down to the Tarim Basin, which is this natural basin in the northwest of China, where the textual evidence was found, because basically the Tocharians were people who lived very much later. So the descendants of anybody who brought that language in, the distant descendants, they were people who built towns in the oases surrounding the Taklamakan desert, which is in the middle of the Tarim Basin.

And these cities were like pearls along the silk routes, which bifurcated to go around the desert on their way from China to Europe.

So that's who the Tocharians were in the Chinese equivalent of the Middle Ages. But who were they millennia earlier? They may have been nomads who came first to the Altai Mountains, and then whose descendants migrated south towards the Tarim Basin.

The full evidence trail is still lacking in parts. We haven't traced that migration all the way archaeologically or genetically, but parts of it are in place, tantalizingly.

CG: That was, in my mind, one of the most interesting stories in the book, because you can imagine so many human dramas, and yet there's so much mystery as to how these languages got there and how their lifestyle changed so much, and the language changed so much as well.

LS: It also illustrates how extraordinarily much you can say about these people without writing, while also, of course, being very uncertain about it.

CG: So, we're reaching somewhat the end of our time. I want to take stock and ask you, where do you think this field is going now that you've immersed yourself in it for so long? What are the big unsolved questions that you think are going to be worked on over the next five to ten years?

LS: There's a lot of infilling to do within the main contours of the story, that’s for sure. And what do I mean by infilling? The infilling is already gigantic questions, as you will recognize as an expert in Old English. So, for example, exactly how the Indo-European languages that came into Europe then split into the Baltic, Slavic, Germanic, Celtic, Italic branches still has to be worked out in detail.

For example, some say that there was an Italo-Celtic language that separated first from a Germanic language and then the Italo-Celtic language split into two branches itself, and so on.

But all of that is slightly up in the air still at the moment. In fact, there are papers coming out as we speak that are going to help shed light on those parts of the story. One on Germanic has just come out, I think — has it just been published or is it still a preprint? I'm not sure — one coming on Celtic. So still exciting times. But, again, not violating the steppe hypothesis, really, just adding more detail.

One of the big questions, I think, is where Anatolian fits in this whole story. And what came before Proto-Indo-European? Because no language comes out of nothing.

And there's a sort of jockeying around at the moment with different people using different tools, trying to work out what happened there at the beginning of the story, a bit like physicists trying to work out what happened at the birth of the universe before the Big Bang. But it’s a little bit — how do I put this? — semantic or terminological.

But there are some people who say Anatolian wasn't the eldest daughter of Proto-Indo-European; it was the sister. But basically, whether it was the eldest daughter, or the sister, it doesn't really matter. There's a very close relationship there at the beginning of the tree. The details will be worked out, but I don't think they'll change the story enormously.

In more human terms, one of the questions that intrigues me is what prompted one community to set out, because we know that the Yamnaya right at the beginning, as early as we can see them genetically, were very closely related on their father's side, the men were. So the Y chromosomes form a very tight cluster.

So the idea is that there might have been some kind of brotherhood that left their ancestral river valley somewhere around the Dnieper, or the Don, or perhaps a bit further east, and adopted, embraced this revolutionary new way of life, becoming completely nomadic. Because previously, their ancestors had taken their herds out of those valleys when they needed to, because the grass was grazed down. But they always returned, so the valleys remained settled all year round.

The Yamnaya took this transhumance to the next level and were nomadic all year round. So what pushed them to do that? And to set in train the most extraordinary story, really, at least linguistically speaking, which, of course, shapes our world today to an enormous extent?

What prompted them to do that? And there are a number of theories about that: again, maybe some kind of ideological schism, maybe some kind of institutional convention, known to those people, for what happened in times of, say, famine, or overpopulation, or difficult circumstances, where the safety valve was that the *h₃rḗǵs, the rēx word, says, “Right, you young men, you go off, you find new lands, you settle and you make a new life,” relieving the pressure back home.

And there is evidence for this vēr sacrum [ed. sacred spring], as it was called in Latin, this kind of convention that comes down in the daughter traditions. So there's some evidence that something like that might have happened, or it could have been an epidemic. And this bunch, this brotherhood, were the only survivors, and perhaps even they had some kind of mutation that distinguished them, which conferred on them a kind of immunity, which is why they survived. And then their descendants carry this immunity into Europe, which gives them the upper hand relative to the more vulnerable farmers who were there.

So, there are all sorts of theories, again, that speak to that very, very early phase of the story. Whether we'll ever resolve it is anybody's guess. It's not impossible, but a lot of the evidence that might help us to resolve it is probably in the east of Ukraine, which is currently a war zone. So, there's no more evidence coming out of there for now, that’s for sure. That doesn't mean it never will. But that's where it is stuck for now, in a very tragic sort of historical irony.

CG: And then what's next for you?

LS: I am a journalist, so I am busying myself making my living as a journalist. I have to tell you, though, I just so loved this story. I still feel very attached to it and very fascinated by it. And I can't quite bring myself to let it go, so I don't know: there are plenty of other parts of our linguistic history beyond Indo-European that this toolkit of archaeology, and genetics, and linguistics is helping to elucidate now.

So maybe there's more for me to write about there. But “I don't know yet,” is the short answer to your question.

CG: Well, if people want to check back later and catch up with you, where can they find you?

LS: I'm not going anywhere. I have a website, and anyone can reach me through that.

CG: Well, thank you, Laura, it's been a pleasure.

LS: It’s been a great pleasure talking to you. Great questions.