How to identify any language at a glance

A handy guide for amateur linguists

The key to identifying languages is in the distinctive letters and symbols.

Thanks to globalization, it's very likely that at some point you've found yourself faced with a line of text written in a language you couldn't quite identify. Maybe in the international section of a grocery store, or on Facebook, for example. "What the heck is this language?" you ask yourself.

To get the answer, often all it takes is a little character. One or two little characters, to be precise. Many languages written using the Latin alphabet have characters or combinations of characters that are unique to that language. If you spot them, they can give you just the tip-off you need. Here are your lucky flags for some of the languages you're most likely to encounter in print:

Ã, ã: When you see this sign of a nasalized A (as in São Paulo), you're almost certainly looking at Portuguese, especially if the language looks a lot like Spanish.

Ă, ă: This A with a cup on the top is your surest way of knowing you're looking at Romanian (unless you're looking at Vietnamese, but read on for more about that). For further confirmation, look for Ț/ț and Ș/ș (that's T and S with a comma beneath).

Ģ, ģ; Ķ, ķ; Ļ, ļ; Ņ, ņ: In case Romanian was feeling special about having that T and S with a comma, the Latvians have four letters with commas that no one else has.

Ő, ő; Ű, ű: These vowels that look like their hair is standing on end are the most unambiguous signs of Hungarian. The clever Hungarians just combined ó and ö to make a letter that means "long ö," and did the same with ú and ü to make ű.

Ř, ř: This is the classic distinctive Czech letter. It stands for a sound so difficult to learn, even Czech kids take years to get it right. Another character that only Czech has is Ů/ů. (Does that have a familiar ring? Don't confuse it with å — see below.)

Ł, ł: If you see this letter (as in Łódź, and standing for a sound like "w"), you are very likely looking at Polish. For further confirmation, look for Ż/ż. (Polish also has many other accented characters, including ź — not the same as ż — but some other languages have these, too).

I, ı; İ, i: Of course, I and i are used everywhere, but in Turkish they're not the same letter. I is capital ı (no dot) and i is small İ (dot), which is why Istanbul is really İstanbul in Turkish. In case you're wondering, ı is said like the vowel in tick but much further back in the mouth. Only Turkic languages have this orthographic distinction, and Turkish is the only one of those you're likely to see. Another good flag for Turkish is ğ, which is silent (as in Erdoğan).

Å, å: This å looks like a seal balancing a ball on its head. It stands for a sound like the "o" in "or" and is the classic stereotypical Scandinavian letter, even though it normally appears only in Norwegian, Danish, and Swedish. How do you tell those languages apart? If it has å plus ø and æ it's Norwegian or Danish (more information below); if it has ö and ä (with crowns, like Swedish royalty) it's Swedish. To get from København (Copenhagen) in Denmark to Malmö in Sweden, you cross the Øresund if you're Danish or the Öresund if you're Swedish. If you thought Swedish uses ø, sorry — you get a zero.

ø, aa: Norwegian and Danish use ø, but so does Faroese, and all of them — along with Icelandic — also use æ. You can sometimes tell Danish from Norwegian because Danish sometimes uses aa (as in Kierkegaard) instead of å. You can identify Faroese and Icelandic with our next key letters.

Ð, ð; Þ, þ: These letters, which English also had a thousand years ago, stand for sounds we now spell with th (as in "this" and "thin"). Together they are the distinctive sign of Icelandic. Faroese shares ð but not þ, and, frankly, you're unlikely to see Faroese. If you do, you'll know it by the fact that it uses ø. Icelandic uses ö instead (as in jökull, which means "glacier").

And if you see a language made up of short words, with most of its vowels having one or two accents each, so the cumulative effect is like looking at someone with a lot of piercings, you're looking at Vietnamese. Here's an example taken from Wikipedia: Hà Nội là thủ đô của nước Cộng hoà Xã hội chủ nghĩa Việt Nam và cũng là kinh đô của rất nhiều vương triều Việt cổ.
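For readers who like to tinker, the flag letters above can be turned into a small lookup table. What follows is only a rough sketch in Python: the table holds a subset of the letters from this guide, and the names FLAGS and guess_by_flag are made up for illustration. It treats each letter as a hint, not as proof.

```python
import unicodedata

# Flag letters taken from the guide above (a subset, lowercase only).
# The table layout and the name guess_by_flag are illustrative assumptions.
FLAGS = {
    "ã": "Portuguese",
    "ă": "Romanian or Vietnamese",
    "ț": "Romanian", "ș": "Romanian",
    "ģ": "Latvian", "ķ": "Latvian", "ļ": "Latvian", "ņ": "Latvian",
    "ő": "Hungarian", "ű": "Hungarian",
    "ř": "Czech", "ů": "Czech",
    "ł": "Polish",
    "ı": "Turkish", "ğ": "Turkish",  # ı also occurs in other Turkic languages
    "ð": "Icelandic or Faroese",
}

def guess_by_flag(text):
    """Return the set of language hints whose flag letters appear in text."""
    # NFC joins a base letter and a separate combining accent into the
    # single precomposed character used as a key in FLAGS.
    normalized = unicodedata.normalize("NFC", text).lower()
    return {FLAGS[ch] for ch in normalized if ch in FLAGS}
```

Calling guess_by_flag("Łódź") returns {"Polish"}, and a text with none of these letters returns an empty set. The sketch is easy to fool: the Vietnamese sample above contains ã, so it would also be tagged as Portuguese. A single letter narrows the field, and the surrounding text has to do the rest.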

There are also many languages you'll see using the Latin alphabet that don't have a single character to give them away. Here are some ways to tell some of them apart:

French, Spanish, and Italian: Spanish is the only one of these three that has ñ (though other unrelated languages also use it). Italian has common words è ("is") and e ("and"), which in French are est and et and in Spanish are es and y.

Dutch, German, and Afrikaans: Of these three close relatives of English, only German uses Ä/ä, Ö/ö, and Ü/ü. Only Dutch frequently uses ij; Afrikaans uses y in the same places (e.g., Dutch mij and Afrikaans my mean "me"). The German words for "is" and "and" are ist and und, while in Dutch and Afrikaans they're is and en.

Irish, Scots Gaelic, and Welsh: Welsh is actually quite different from the other two. It uses lots of ll and ff and it uses w as a vowel (e.g., cwm). The two Gaelics (Irish and Scots) are easily identified because both have lots of bh, ch, dh, fh, gh, mh, ph, sh, and th (none of which are pronounced like you probably think they are), and both use accents on vowels, but only Scots Gaelic uses grave (left-pointing) accents, like on à in Gàidhlig.

Finnish and Estonian: Finnish has long words and lots of double letters (as in moottoripyöräonnettomuus, which means "motorcycle accident"), making it look (and sound) like it's speeding past you. Almost none of it looks related to words you could recognize. If you see a language that looks a lot like Finnish but has words ending in b or g and has the character õ, it's Estonian.

Albanian and Xhosa: These two languages are completely unrelated, sound nothing alike, and are from different continents. But both use xh (as well as c and q) and if you don't know either of them you may be stumped by looking at them. But Albanian uses a lot of ë (as in Tiranë, the capital of Albania). A lot. Xhosa does not. (On the other hand, Xhosa and Zulu look very similar, and if you're not sure which you're looking at, maybe just ask someone.)

Slavic languages: I've told you about Czech and Polish, which are the easy ones. Other Slavic languages that use the Latin alphabet include Croatian and Slovak, and really, you might as well give up and ask. The ones that use the Cyrillic alphabet (like русский) include Serbian, Bulgarian, Belarusian, Ukrainian, and of course Russian. Some Central Asian languages do, as well. Remember: Just because it uses the Cyrillic alphabet doesn't mean it's Russian. (Also, just because a language uses Arabic script doesn't mean it's Arabic. It could be Farsi or Urdu, for instance, neither of which is actually related to Arabic.)

Bonus: How can you tell Chinese and Japanese apart? Japanese uses three writing systems, only one of which it shares with Chinese, but unless you know them, that won't help you. Fortunately, there's one special character that will give away Japanese every time, and it's only fair to tell you. Japanese makes frequent use of the character の, which is a grammatical particle and does not exist in Chinese (Chinese characters are never round).
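The little-word test from the French, Spanish, and Italian entry and the Dutch, German, and Afrikaans entry also lends itself to a quick sketch. The word pairs below come straight from this guide. The scoring scheme (count how often each pair shows up, keep the top scorers) and the names COMMON_WORDS and guess_by_words are assumptions made for illustration.

```python
import re
import unicodedata

# The words for "is" and "and" as given in the guide above.
# The scoring scheme and the name guess_by_words are illustrative assumptions.
COMMON_WORDS = {
    "Italian": {"è", "e"},
    "French": {"est", "et"},
    "Spanish": {"es", "y"},
    "German": {"ist", "und"},
    "Dutch or Afrikaans": {"is", "en"},
}

def guess_by_words(text):
    """Return the languages whose 'is'/'and' words occur most often in text."""
    # NFC keeps accented letters such as è as single characters so that
    # the word pattern below does not split them apart.
    normalized = unicodedata.normalize("NFC", text).lower()
    words = re.findall(r"\w+", normalized)
    scores = {
        lang: sum(word in pair for word in words)
        for lang, pair in COMMON_WORDS.items()
    }
    best = max(scores.values())
    # A top score of zero means no hint at all, so return nothing.
    return [lang for lang, score in scores.items() if best > 0 and score == best]
```

For example, guess_by_words("Der Hund ist groß und alt") returns ["German"]. Short words overlap across languages (English "is" would count toward Dutch or Afrikaans), so this works best on a full sentence or two and alongside the letter clues.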

James Harbeck

James Harbeck is a professional word taster and sentence sommelier (an editor trained in linguistics). He is the author of the blog Sesquiotica and the book Songs of Love and Grammar.