Languages or Dialects?

When I tell people that my mother tongue is Malayalam, first they look at me like I am playing a tongue-twister game, and then a good percentage of them follow up with “Oh, so that is an Indian dialect”. And I ever so patiently try to explain that Malayalam is not a “dialect”; it is a “language” in its own right. Regardless of whether they nod in agreement with or without further discussion, a nagging thought always lingers in my mind: do they really agree that my beloved Malayalam is a language and not just a dialect?

For people who are more familiar with Chinese dialects (they are usually the ones who have a problem accepting that India has several languages), I often present the argument that, unlike Chinese dialects, Indian languages don’t share the same script. This used to close the argument. But yesterday I had a similar conversation with a colleague, who insisted that this is not a good enough reason. So I decided to do some “research” on this, and after some primitive googling came up with the following.

Webster defines language as “a systematic means of communicating ideas or feelings by the use of conventionalized signs, sounds, gestures, or marks having understood meanings”. Dialect is defined as “a regional variety of language distinguished by features of vocabulary, grammar, and pronunciation from other regional varieties and constituting together with them a single language”.

The Linguistics department at the University of Delaware teaches that “A criterion of mutual intelligibility is often applied as a test of whether a pair of speakers is speaking two different languages or one or two dialects of the same language: if the two speakers can understand one another, then they must be speaking the same language.”

While I can prove that Malayalam and, say, Hindi are two distinct languages based on the definition above, it does not explain why North German, South German and Bavarian German count as just dialects and not distinct languages. Further reading revealed that “A language is a dialect with an army and a navy” (a saying popularized by the linguist Max Weinreich). Aha! I guess an attempt to define languages and dialects through logic is not going to take me too far. And not just that: languages can be languages and dialects at the same time. So, while French and Italian are widely accepted as mutually unintelligible languages, they are both also “Romance” dialects (Webster).

It also got me thinking, if we apply the rule of intelligibility, C and C+ and C++ (hmm..and many others) must be programming dialects and not programming languages.

If the rule of mutual intelligibility, or different scripts, or geographical boundaries does not work, what does? And who decides? And if the distinction is not so clear, is being labeled a dialect as much of a stigma as we often make it out to be?

I am left with more questions than I started with.

Any thoughts/opinions/hearsay?

Posted in Culture & Languages, Musings on February 18, 2005

15 Responses

  1. Umesh P Nair says


    1) Dialects are normally distinguished in the spoken language only. Indeed, the vocabulary is different, but the main difference is how the same word is spoken differently. People from different parts of China cannot speak with one another, but they can all understand a Chinese movie if and only if there are subtitles in Chinese! (True, Chinese movies normally have Chinese subtitles.)

    If you can understand a book in the other “thing”, it is another dialect, otherwise it is another language.

    2) Sometimes, there can be a third relationship. Something like inheritance. Like Tamil and Malayalam. Malayalam was formed from Tamil, so there was a time when all Malayalees could understand Tamil, but not vice versa. So, my definition breaks here, but that is because we failed to recognize this third type.

    At present, Tamil and Malayalam are different languages, because the present-day Tamil also is inherited from the old Tamil which was the mother of Malayalam.

    3) What is C+? I have heard of C and C++, but never C+. Did you mean C#? (C# is an illegitimate offspring of C++ and Java, by the way.)

    The name C++ was chosen to indicate that it is ‘C incremented’, or ‘the next C’. ++ is the increment operator (which gives the next value in the sequence) in C.

    4) C and C++ have a Tamil-Malayalam-like relation. Initially C++ was inherited from C. Anybody who knew C++ could understand C. Later they diverged. The latest C and C++ standards are growing independently.

    5) “Different languages have different scripts. Different dialects have the same script”? If this is true, Hindi is a dialect of Sanskrit, and all computer languages are dialects of one another, because all of them are written in English!

    Thanks for the post.

  2. Srijith says

    I am not going into the linguistic domain (I know nothing about it!), but isn’t the fact that a national government recognizes Malayalam as a language reason enough to credit it for being a language? A lame reasoning if argued from a pure linguistic perspective, but good enough to make an argument.

    One of the definitions of a dialect is:

    – A language considered as part of a larger family of languages or a linguistic branch. Not in scientific use: Spanish and French are Romance dialects.

    Now that was all encompassing!

  3. Surya says

    1) The idea of the book test is great and seems to me to be generic enough for a lot of cases. I will definitely stress-test this out with other languages if and when I come across people who speak other languages.

    2) If there is a new third relationship for inheritance, a lot of languages are going to fall under this category. For example, English, German, French and several other European languages have been inherited from Latin or Roman. And in fact, I have read somewhere that most languages in the world have their beginnings in 3 or 4 main language families. So, that might create some confusion. For example, if category 1 is languages, 2 is dialects and 3 is “inheritance”, then I could classify English under both 1 and 3. And since all Chinese dialects have the same origin, Cantonese or Hokkien could fall under 2 and 3.

    So, maybe inheritance is a process. In the beginning, there is the language of Adam and Eve or sign language, and then it splits into a few main families (languages). From each language family sprout a few offshoots, which start off as different accents and then proceed to become dialects and eventually languages in their own right and perhaps, eventually, superlanguages. (Superlanguages are what I would call languages that have other distinct languages which had origins in them.)

    3) Thanks for pointing out the mistake. I meant C#. # and + are next to each other on the Deutsch keyboard :P

    4) You are right. C and C++ could have a Tamil-Mal relationship. But how abt someone who knows VB? They can probably understand C to a large extent. So, would that mean those that are mostly mutually understandable (and to what extent of comprehension?) are programming dialects? The simple explanation is probably that when computer languages were first started, they already knew about the mess linguists were in and said ‘Screw it! We will call all of them languages’.

    5) Hmm..Yes. point accepted. So people have been accepting my wrong defence for a long time now.

    Thanks for the comments. They were very insightful.

  4. Surya says


    >>” I am not going into the linguistic domain (I know nothing about it!), but isn’t the fact that a national government recognizes Malayalam as a language reason enough to credit it for being a language? A lame reasoning if argued from a pure linguistic perspective, but good enough to make an argument.”

    That’s precisely what Weinreich meant when he said “A language is a dialect with an army and a navy”. The clear boundaries between languages and dialects, if there ever were any, have been compromised by political clout over the ages. Sure, if you want to end an argument with the shotgun method, this reasoning works. But if you want to have a constructive conversation and walk away feeling satisfied, I wouldn’t take that argument.

  5. Reghu says

    The way I see it, language and dialect differ more or less the same way as genus and species in evolution/taxonomy… these are just conventions to perceive a distinction…

    But unlike life-forms, languages don’t have a common evolutionary path or pattern… the way Mandarin and Cantonese evolved might not be the way Malayalam broke away from Tamil… and languages also have the issue of re-combination…

    But coming back to answering the original question… we’d have to go case by case, and also the comparison needs to be done by someone familiar with the “languages” in question.

    In other words, being familiar with Malayalam and Tamil, I can say that they are separate languages… and I’ll leave it to the Chinese to decide whether theirs are dialects!

  6. Reghu says

    And now that I read the other comments, mebbe I shud add in something more…

    I hate it wen I see ppl (including me) use and OVERuse analogies… like I started with the “evolution” analogy for languages… the problem being that now we are trying to use “evolution” (me) and “computer languages” (umesh?) to prove points about “languages”… if we are trying to figure something out, lets go purely on the subject at hand, else we are just unnecessarily introducing ambiguities (which prolly were not in the original issue)…

    Some things that come to my mind…

    1. Distinctions can be geographical (Malayalam vs. Tamil)… cultural (Urdu vs. Hindi)… political (can’t think of a good e.g. :P maybe something like German vs. French) and many others…

    2. Though languages “evolve”, all languages need not be traced back to ONE common mother-of-all simply bcos languages can spring up out of thin air (like esperanto) and language more or less started After humans spread out across the globe.

    Since this is a very interesting topic, mebbe more thots on this later… in the meanwhile, check out which was a really interesting read about “written” language…

  7. Umesh P Nair says


    I didn’t give the analogy of computer languages. Surya did that, and I was just commenting on that.

    However, sometimes, analogies do help in understanding the issue, if it is a good analogy.


    – Umesh

  8. Surya says


    >> we’d have to go case by case, and also the comparison needs to be done by someone familiar with the “languages” in question.

    ..would we accept a small variant as a language just because the people there insisted it is one? So, if you apply for an MBA at, say, INSEAD and you see a three-language requirement, can you tell them, hey, I know Trichur Malayalam and TVM Malayalam and they are two different languages..

    While your answer is probably the best one in an ideal age, people are not often rational when it comes to matters close to their heart (and mother tongue is often one of them).

    …About computer languages, it is not really an analogy. I was talking more about using the term “language” loosely. We call all programming languages languages. My question was whether we are abusing the linguistic meaning of the word “language”, or whether they are languages even by the linguistic definition of the word.

    ..ur opinions abt using the concept of evolution are very apt..i absolutely agree..

    Thanks for the comment.

  9. Anonymous says

    Hi Surya,

    Interesting post…..i might have one or two small comments. This is stuff i’ve been learning recently, since i’ve been taking some linguistics courses at the University of Washington.

    1) Interestingly, China has a number of LANGUAGES….though they are officially classified as dialects by the Chinese government. People in Eastern China can barely understand the Chinese of Beijing. There are a number of scripts, and of course, Tibetan is quite far away from Chinese :-)

    2) Languages are sufficiently different from dialects in grammar, construction and vocabulary. Like u said, the mutual intelligibility test is one factor. Another factor is the language root, and the distance from that root. E.g. Malayalam (though influenced by Sanskrit) is Dravidian, while Hindi is Indo-European.

    3) Script and language are not to be mixed up. Scripts evolve long after a language evolves, and often scripts are just borrowed.

    nice post.

  10. venkat says

    Dude, consider yourself lucky…I live in Brussels and have a tough time explaining how there is no language called “Hindu” which is spoken by “everyone in India”…


  11. Surya says


    Thanks for the comments. Good to hear from ppl who study linguistics..!
    But I have never yet heard anyone refer to different Chinese languages, only dialects. All my Chinese friends tell me that even if people from different parts of China cannot understand each other, they can understand each other’s books, or movie subtitles for that matter, perfectly well. Perhaps that is why the Government calls them dialects.

  12. Surya says

    Venkat, I totally empathise. I have often been asked if my religion is Hindi. =)

    And I thought at least Brussels had a big enough Indian population for people to be aware..ah well.

  13. Samuel says

    “All my Chinese friends tell me that even if people from different parts of China cannot understand each other, they can understand each other’s books, or movie subtitles for that matter, perfectly well. Perhaps that is why the Government calls them dialects.”

    The Chinese school system imposes this belief upon them as a nationalistic maneuver. In a linguistic sense, the Chinese “dialects” are in every sense different languages. Their speakers cannot understand each other. They all use the same script, which, because it is logographic, preserves much of the meaning across languages. Then again, even the Mongolian language, which is from a different family of languages, once used the Chinese script.

    I have no information about Malayalam but I know there are many different languages in India. Not all of them are really different languages though, like Hindi and Urdu (same language). Governments are interested in politics, not linguistics, and can hardly be counted on when presenting the languages of their region.

  14. lol 7 years later says

    ^Great explanation, Samuel!

    I’m of Chinese descent and the Chinese government brainwashes the masses to believe that we all speak “dialects”. I speak Cantonese at home, and if you place me within a group of people speaking Hokkien or Mandarin (other Chinese languages) I would have no idea what people are talking about. The special thing about the written Chinese language is that it is ideographic, meaning that a character conveys an idea, so no matter what language we speak, at least we can understand the written text (and other non-Chinese languages have used Chinese characters to represent their language too!). The languages under the Chinese family are mutually unintelligible. Technically all spoken languages are dialects. The way languages are labelled also has political and nationalistic reasons.

    India is a huge country with a massive amount of diversity in ethnicity and languages. Most people aren’t deep into linguistics and label languages the way they were taught to, whether or not they know it’s right or wrong (like calling Malayalam a dialect).

  15. Punjabi says

    Lol, it’s funny how you try to explain and they don’t get it. Maybe next time you can tell them Malayalam is a language with several dialects of its own. I am Punjabi, and when non-Indians argue with me like that I ask them to Google “dialects of Punjabi language”. Problem solved.

