In late 2012 I was sent to a language seminar. The speaker was the English linguist, academic, and author David Crystal, and I played photographer during his talk. It was during that event that I first heard that, at the current rate of language attrition (one language was disappearing every two or three weeks, by Crystal’s account), the world would lose more than half of its linguistic diversity in the next few decades. Thousands of languages, gone, forever.

Fast forward 4 years and much of the same message is heard when the media picks up the story. It’s true that at the time I didn’t really go into much detail about the information Crystal shared; linguistics wasn’t and still isn’t my strong suit. But his point was, I must admit, sort of powerful. That evening I felt that, in a way, I was living the highlight of what language would ever be on planet Earth: the top of the bell curve, just developed enough as a species to realise how many linguistic systems we have to communicate amongst ourselves but also developed enough to start killing off less adaptable languages.

But if I – a guy with little interest in linguistics and language conservation – was shocked into reality on that late autumn evening in 2012, then why hasn’t the alarm been pulled all the way by the greater part of academia? Given such bleak prospects, haven’t enough languages died out for the part of the press that covers world heritage issues to start ramming the message home? The situation surely predates 2012, after all. So that made me think that either the message wasn’t (isn’t) getting across for some reason, or the message itself has something wrong with it.

I now know the true picture is far more nuanced. Researching this post I’ve come to understand that even UNESCO doesn’t have an exact tally, but estimates that around 2500 languages around the world are at risk of dying out. With around ¼ of all the languages existing today spoken by communities smaller than 1000 people, the thinking is that the smaller the community, the greater the odds of its language going extinct.


But that’s just one stance. The Catalogue of Endangered Languages (ElCat) has recently published results that show not only that it isn’t accurate to say languages are dying out at a rate of one every two weeks, but also that counting in weeks would be unrepresentative. By their account, around 3.5 languages become extinct per year – a much lower rate than the one I had heard. So we’re not in that dire a situation, and yet a loss for humanity is looming on the horizon all the same.
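To put the two estimates side by side, it helps to convert them into the same unit. The snippet below is only a back-of-the-envelope sketch: the function names are my own, a year is rounded to 52 weeks, and the inputs are simply the figures quoted above (one language every two or three weeks versus roughly 3.5 per year).

```python
WEEKS_PER_YEAR = 52  # rounded; an assumption made for simplicity


def losses_per_year(weeks_between_losses: float) -> float:
    """Turn 'one language every N weeks' into languages lost per year."""
    return WEEKS_PER_YEAR / weeks_between_losses


def weeks_between_losses(yearly_losses: float) -> float:
    """Turn 'N languages per year' into weeks between two losses."""
    return WEEKS_PER_YEAR / yearly_losses


# Crystal's figure as quoted above: one language every two or three weeks.
crystal_high = losses_per_year(2)
crystal_low = losses_per_year(3)

# ElCat's figure as quoted above: about 3.5 languages per year.
elcat_yearly = 3.5
elcat_gap_weeks = weeks_between_losses(elcat_yearly)

print(f"Crystal: {crystal_low:.1f} to {crystal_high:.1f} languages per year")
print(f"ElCat:   {elcat_yearly} languages per year, "
      f"one every {elcat_gap_weeks:.1f} weeks")
print(f"Ratio (upper bound): {crystal_high / elcat_yearly:.1f}x")
```

Seen this way, the two accounts differ by several multiples rather than by a rounding error, which is why the choice of unit (weeks or years) matters so much for how alarming the headline sounds.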

Still, ElCat’s results are just another way of simplifying something dynamic and ultimately unpredictable. Since diffuse costs are notoriously undervalued in the short term, maybe this would explain the lack of hype around the topic: we lose languages, but does it really hurt? Add to this how nobody knows precisely which and how many languages are out there and you get just enough mist around the conclusion for people to ignore the topic.

But I thought that surely there must be some value in the available data. Some hint at what gives certain languages more strength than others.

The birth, spread, and classifications of languages around the world

I thought it best to start at the very beginning. Although we can never know for sure which was the first language ever, research on modern languages’ origins seems quite comprehensive. Linguists have identified three factors which influence the process of language birth:

  1. time (as languages change in time)
  2. separation (a physical distance which creates cultural specificity)
  3. contact (as mutual influence leads to blending and borrowing of everything from words to grammar) (a very nice language evolution simulator that’s based on contact can be found here).

Based on these three variables, it’s likely that we can piece together accurate images of how languages appear, evolve, and die out. That’s because languages act in many ways similar to how we would expect living organisms to act. As a result, we sometimes end up with new languages appearing naturally and involuntarily over long periods of time. Sanskrit evolved into Hindi, Latin into the Romance languages, and some language of tomorrow will most certainly stand on the shoulders of the English you and I speak today.

Some researchers even go so far as to claim that by using statistical data, we might get a pretty good guess at how much time it would take for all of the world’s languages to have evolved from just one: about 100,000 years, which would fit in with how humanity has developed, albeit very roughly. Interesting as this thought is, humanity wasn’t and hasn’t been a tabula rasa whilst languages slowly evolved from one to another; it seems many languages appeared independently of each other.


By making use of data from Ethnologue (whose comprehensive database stands as the backbone of this post) I’ve found that much of the world’s linguistic landscape can be simplified into 6 major families (along with 142 others, a few of which I’ll mention here), most of them distinct enough in terms of their origins that it’s believed that they have little or no common points of genesis.

Arguably the most widespread group of languages today is the Indo-European family. Spanning almost all of Asia and Europe, this linguistic branch is believed to have descended from a nomadic people that lived sometime around the year 3000 BC in what is nowadays Ukraine. Also worth mentioning are the Semitic languages (which also date back as far as 3000 BC), which have their point of genesis in the Arabian peninsula.

Apart from Indo-European and Ural-Altaic from central Eurasia (both of which are spread out and fragmented) the rest of the world’s linguistic families keep to specific plots of land.


Sino-Tibetan is the largest Asian linguistic family, with China being the country most famous for harboring such languages. Southern India hosts the Dravidian languages, while mainland Southeast Asia (where we find Cambodia, Laos, and Vietnam) is home to other families, Austroasiatic among them. Malayo-Polynesian languages can be found a bit more to the southeast, and also reach as far west as Madagascar, which is technically in Africa.

In mainland Africa, we see a very diverse language map. In the northern part of the continent, most languages are classed into a group called Afro-Asiatic (also known as Hamito-Semitic). Arabic stands out as the most dominant member of this language family. Farther south we find the Bantu languages group – with Swahili as its flagship.

Other groups around the world include Koreanic and Japonic (themselves representatives of the Altaic branch) – which tend to stand out as single language groups in themselves due to local diversity, and of course the Finno-Ugric languages of northern (Finland and Estonia) and central Europe (Hungary).

With all this diversity, it’s worth reiterating that, in large part due to colonialism (in the past) and globalization (in present times) Indo-European languages have spread out and are dominant in most places around the world.

The expansion of certain languages in days past has severely harmed indigenous speech all around the world, with natives in such places being either punished or killed for using their mother tongues. And nowhere is this more visible than in Australia, where a massive 184 native languages have disappeared in the wake of colonists populating the island-continent.


Language is almost always related in some way to power. Two of the oldest languages still lucky to be around today are Hebrew and Basque – both of them spread out in pockets where there was enough stability, or enough effort put into keeping them alive. Where such conditions were not present, other old languages either died out or evolved to face new circumstances.

According to research published by Charles Ferguson, languages ‘live’ based on three main conditions:

  1. graphization: having a writing system 
  2. standardization: rules that override local particularities
  3. modernization: adapting and being able to translate and sustain discourse about contemporary topics

What this means is that for a language to survive, it must have a written form with rules clear enough that people with no connection to each other can use it in the same way. It must also be able to live in the present, so that it will not be replaced by other languages which might be more adept at explaining the changing landscape of each generation’s zeitgeist.

Defining language is thus not a matter of particularisation, of saying what it actually is – but more a matter of clarifying what it is not.


The range of definitions is vast and complex, and as such, an effort to classify languages might come to points where taxonomies overlap. But one truth keeps coming up: not all languages serve utilitarian purposes. That is to say, not all forms of speaking are useful. Either because of high complexity or lack of culture compatibility, some languages just fail to adapt. And in such situations, it’s common to see the less adaptable language die out (hence Crystal and UNESCO’s shared perspective). But this isn’t unavoidable, at least not always.

Simplified versions of languages can appear for many reasons. Trade – which often requires a more basic and straightforward vocabulary – is probably the most influential factor that has, across the millennia, forced the emergence of pidgin (i.e. simplified) languages.

A prime example of this is pidgin English in New Guinea, which was at first an overly simplified version of English devised for facilitating communication between traders. But don’t expect to be able to haggle with someone if you’re not versed in the local speak, since I’ve read it has evolved in such a way that it’s now incredibly difficult to understand, with much of it taken from native context.

And this serves to support the evolutionary perspective of language all the more. When new generations of children are actually raised within a pidgin language, it’s no longer called that – it becomes an all-purpose language that starts to re-complexify itself so as to serve in all walks of life. This new iteration is, in turn, called a creole language.

Creole languages don’t just have one parent (like French has Latin) but necessarily a multitude of them. Such linguistic organisms take from multiple host languages elements which the people use as best they can.

English has itself given birth to such communication systems. Here are a few of them: Tok Pisin which is spoken in Papua New Guinea, Pitkern from Pitcairn Island in the South Pacific, Gullah in the United States (sometimes referred to as Geechee), Sranan or Sranan Tongo from Suriname, Singlish from Singapore, and many more.


And when there are natural and cultural barriers which create many micro-communities in small spaces, the results can be astounding: Tok Pisin stands alongside another 836 documented languages in the same country – which makes Papua New Guinea the most fertile country on the planet with respect to harboring language.

How can this be? Well, studies show that 330 people is the minimum viable population size for human languages to be sustainable; this comes with three risk components: range size, speaker population size, and speaker growth rate. With such relatively low requirements, languages have sprouted across the whole planet in incredibly diverse ways – and the peppered geography of Papua New Guinea has created spots of human development just isolated enough to sustain hundreds of languages whose speakers might not even know of their neighbors.


Yet as previously mentioned, not all languages are created equal. Sure, there are things you can say in creole languages that you can’t get across in the mother language. It’s easier to have simple discussions in creoles than in stand-alone languages since they tend to have easier rules.

But in most situations, it’s the natives that had to adapt to the language of the visitors, both in face-to-face interactions, as well as in subsequent economic relationships. In 2014 a team of Cambridge researchers detailed how the success of a place’s economy can be a death sentence for less developed / minority languages. This is because, as Tatsuya Amano worded it, “people are forced to adopt the dominant language or risk being left out in the cold – economically and politically”.


What’s more, says Amano, “…as economies develop, there is increasing advantage in learning international languages such as English, but people can still speak their historically traditional languages. Encouraging those bilingualisms will be critical to preserving linguistic diversity.”

Heartwarming as his words are, I can’t help but think of this piece of expert advice as being a bit counter to how we should go about this.

Columbia University professor of linguistics John McWhorter estimates that in one century (99 years, adjusting for publication dates) it’s possible that only 600 of today’s languages will be still around. Since modernity understands languages as used in writing, with formal rules which make them – in a sense – real, literacy can and does threaten linguistic diversity.

What this means is that education with bilingualism as the main goal might just kill off native languages faster, since most don’t have formalized grammar or documented vocabularies, or even practical uses – all prerequisites that make a language teachable.

Since our ability to stretch our cognitive attention spans is ultimately limited, it becomes, in the words of McWhorter, a choice between a “larger language which should offer opportunity and smaller languages with (perceived) backwardness”. In such cases, it’s all too common for languages which can’t perform to die out in a generation or two.

Neither does economic mobility help maintain language diversity. The United States stands as an example of this – with over 200 languages spoken within its borders by people from all across the world, it’s still the case that simpler and more worldview-prone languages (English and Spanish) dominate.


In fact, these worldview-prone languages have a name: international.


Modern humans have always sought out universal languages – for whatever reasons, though most of the time because it doesn’t take much to link misunderstanding to violence.

Around 200 artificial languages have been created since the 17th century. One of the first was created by a Bavarian in the 1880s, and named Volapük: a combination of German, French, and English. It didn’t catch on. Esperanto is another, much more famous example – and a much more successful one. Esperanto is spoken by about 2 million people today – that’s more than some national languages. But all of these were overtaken by a truly universalising language that came into its own around that time as a global linguistic power: English.


Although most of what we know as languages today are vigorous (i.e. languages that are spoken by everybody in a certain place because they can get by with them in just about any situation), most people today speak national languages – which, again, tend to overlap with international languages – more specifically, English.


So while it’s true that Chinese – by which I mean all dialects counted as one super language – is spoken by more people than any other language in the world today, it is not, and will never be, as popular as English. That’s due to its cultural particularity, lack of spread (China is big, but it’s still a single country), little use in science (not being a medium of knowledge is bad for language dispersion) and the relatively high difficulty of reading and writing it.


Proof of this is the fact that information around the world circulates primarily in English. It might be shortsighted to make a prediction out of this since China is steadily becoming ever more technologized, but it seems that Mandarin is too insular to be used by a world that’s accustomed to easy conversations in English. 

Think about it: even though English ranks third worldwide in speaker count, it’s first by all possible measurements when it comes to spread, being present in over 100 countries today.

What’s more, after a person already knows English, which is in itself a very simple language to master, it becomes much more difficult for them to attain a second language which comes from a different linguistic group. Arabic is the hardest, and Cantonese, Mandarin, Japanese and Korean also top the list – English speakers rarely, if ever, master them with ease.


The numbers from Ethnologue seem to prove this right, at least for the most part: places that already have simple (or very particular) national languages tend to lack openness to others. After mapping out their language diversity index, a rule of thumb would be that the harder the single language you and your friends know best, the better your community is at learning or at least being open to new ones.
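For context, the diversity index Ethnologue reports is, as far as I understand it, Greenberg’s index: the probability that two people picked at random from a country have different mother tongues. Here is a minimal sketch, assuming that definition. The function name is my own and the speaker counts are invented purely for illustration; they are not Ethnologue data.

```python
from typing import Mapping


def diversity_index(speakers: Mapping[str, int]) -> float:
    """Greenberg-style diversity index: 1 minus the sum of squared shares.

    0.0 means everyone shares one mother tongue; values close to 1.0 mean
    two randomly chosen people almost never share one.
    """
    total = sum(speakers.values())
    if total <= 0:
        raise ValueError("need at least one speaker")
    return 1.0 - sum((count / total) ** 2 for count in speakers.values())


# Hypothetical countries (made-up numbers, for illustration only).
monolingual = {"language A": 1_000_000}
dominated = {"language A": 900_000, "language B": 100_000}
fragmented = {f"language {i}": 1_000 for i in range(800)}

for name, country in [("monolingual", monolingual),
                      ("dominated", dominated),
                      ("fragmented", fragmented)]:
    print(f"{name}: {diversity_index(country):.3f}")
```

A country with hundreds of small, evenly sized language communities lands close to 1, while a country where one language dominates stays close to 0 no matter how many tiny languages it also hosts.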

A modern perspective on linguistic diversity, and some sad thoughts

But, hey, there are many versions of English. What makes them all part of a single unit is the intelligibility of their members: even given geographic particularities and borrowed words, if another speaker can understand you, then you’re speaking English.

This makes the language I’m writing in right now possibly the one universal language humanity has always dreamed of since the myth of Babel came about: the long lost speak that would make us all understand each other. After all, it has everything specialists say it needs: it’s adaptable, spread out, able to facilitate everything from romance and trade to scientific research – and it’s easy to learn.

Yet as world demographic figures shift, economic ties strengthen and globalization plays out, it becomes ever more evident that we will indeed lose most of the languages that are spoken nowadays. Since no two language systems are alike, with infinite nuances differentiating even the most common of discussions waged using similar terms in separate languages, we risk losing an infinity of thoughts if these languages die out. That’s the fear David Crystal seeded in me at that seminar in 2012.

So maybe simplicity shouldn’t be something to strive for. While reading up on John McWhorter during my research, I came across a thought I wanted to use as an end-note for this post:

“(…) what’s peculiar about the Babel tale is the idea of linguistic diversity as a curse, not the idea of universal comprehension as a blessing.”

Much of the information I’ve used is ready-cleaned and free to access online. You can find everything in my sources. But if you still want to get in touch about this, reach out on Twitter.



