Recently, I wrote about how to use special characters to identify languages that use Latin alphabets. But what about all those other alphabets? You can't be expected to memorize them all, of course. And you don't have to. Just keep an eye out for their signature looks — and maybe use one or two mental tricks to help you remember. Here are the languages you're most likely to see, with some tips on how to identify them.
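If you'd rather let a computer do the first round of squinting, here is a minimal Python sketch. It assumes that the first word of each character's official Unicode name is a good enough label for its script. The function name `dominant_script` is made up for this illustration.

```python
import unicodedata
from collections import Counter


def dominant_script(text: str) -> str:
    """Return the most common script keyword among the letters in text."""
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # skip spaces, digits, punctuation, and combining vowel marks
        name = unicodedata.name(ch, "")  # e.g. "GURMUKHI LETTER LA"
        if name:
            counts[name.split()[0]] += 1  # keep only the leading script keyword
    return counts.most_common(1)[0][0] if counts else "UNKNOWN"


print(dominant_script("ਗੁਰਮੁਖੀ ਲਿਪੀ"))
```

The labels are Unicode's, not everyday names: Odia shows up as ORIYA, Burmese as MYANMAR, and Chinese characters as CJK. The sketch also names only the script, not the language, which is where the visual tips come in.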
South and Southeast Asia
There are many languages in India and Southeast Asia that have distinctive writing systems, but they're all descended from the Brahmi script of some 2,300 years ago, and they all have the same general principle of a character standing for a consonant plus a vowel, with extra marks to specify what vowel. We can divide these languages into three completely unscientific sets based on how they look: those with a line across the top, those that use a lot of circles, and those with hooks or rings on their ends.
Devanagari is the name of the alphabet; it is used for Hindi, Sanskrit, and quite a few other languages. Devanagari is pronounced like "Dave a nuggery." This is by far the most common and best-known script with a line across its top. Its letters are generally rounded or square rather than triangular.
Gurmukhi: ਗੁਰਮੁਖੀ ਲਿਪੀ
This is the alphabet used by Punjabi. It looks rather like the Devanagari but it has a couple of letters that look like mechanical claws reaching down to grab something: ਲ (sounds like "l") and ਨ (sounds like "n"). Need a pun to remember it by? Think Pun-grabby!
The Tibetan script uses a lot of curving downward strokes that can look like knife edges. It is also used to write a few other languages such as Sikkimese and Ladakhi, but if you see it, you're probably looking at Tibetan.
Assamese and Bengali: বাংলা লিপি
Both these languages have letters pointing to the left like ব (think Bengal tiger teeth), and both have squigglies like ল . But only Assamese has the triangle with a second line inside ৰ (standing for "r" like in English "rare"), and it uses it a fair bit. Think of that letter as resembling the A in Assamese.
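The ৰ test above is simple enough to write down as a short Python sketch. It assumes the text is already known to be in this script family and relies only on the one letter mentioned here (U+09F0 in Unicode). The function name is invented for the example.

```python
ASSAMESE_RA = "\u09F0"  # ৰ, the triangle with a second line inside


def bengali_or_assamese(text: str) -> str:
    """Rough guess based only on whether the letter ৰ appears."""
    # U+0980..U+09FF is the Unicode block shared by Bengali and Assamese.
    if not any("\u0980" <= ch <= "\u09FF" for ch in text):
        return "neither"
    return "Assamese" if ASSAMESE_RA in text else "Bengali"


print(bengali_or_assamese("বাংলা লিপি"))
```

A short Assamese phrase that happens to lack ৰ would be misread as Bengali, so treat the answer as a hint rather than a verdict.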
Gujarati: ગુજરાતી લિપિ
This script looks rather like a free-flowing version of the Devanagari without the line across the top. Note the hooks flowing over the top like the letter G and an upside-down letter J.
Now we're getting into the languages that feature a lot of circles. Malayalam, a language of South India, has a name that, in the Latin alphabet, is spelled the same way in either direction, and many of the letters in its own alphabet are symmetrical sets of loops and bumps, such as ത and ഋ.
Kannada: ಕನ್ನಡ ಅಕ್ಷರಮಾಲೆ
You probably use letters from this South Indian alphabet without realizing it. "ಠ ಠ what?" you might ask. There, you did it. The ಠ symbol is the Kannada letter for "t." Many of the other letters in this alphabet have the same "eyebrow" on the top. If you think of eyebrows as expressing a candid opinion, think of it as a Kannada opinion.
Odia: ଉଇକିପିଡ଼ିଆ ରୁ
The script for this language of East India includes many letters that look like balloons, some containing shapes like m, others with an extra curved line, which makes them look like they're moving.
Sinhala: සිංහල අකුරු
This language of Sri Lanka has a syllabic script with curvy, curly letters, several of which look like variants of w with swinging loops.
Tamil: தமிழ் எழுத்து முறை
This language of Sri Lanka, South India, and elsewhere stands out for having some boxy letters alongside the loops. This includes several letters that, by coincidence, look like they contain T for Tamil.
Telugu: తెలుగు లిపి
A key distinguishing mark of this South Indian language is that sometimes there are one or two telling echoes on a letter — curved lines to the right of a curved letter: రు ము.
The language of Myanmar (formerly Burma) has a script made mostly of open circles like a fancy jewelry chain, but the real giveaway is the big round-cornered boxes on some of them, which might look like they indicate capitals but actually indicate an added "y" sound.
Once we're east of Myanmar, we move from the alphabets with lots of circles to the ones with hooks or rings on their ends. The script for Khmer, the national language of Cambodia, has a lot of angled hooks on its top, like little hands gesturing "Goodbye!" or "C'mere!"
Lao is the language of Laos. Its letters have lots of lovely little loops, and some have bigger flatter loops on top too.
The Thai script is very similar to the Lao script — if you can read one, the other will be very easy to learn — but its lines are straighter. Most of its letters have one tight little loop at the end, as though they're tied (Thai'd) to something.
The Chinese writing system is the most widespread writing system in East Asia. Vietnamese used to use it; Korean still does in some contexts; and Japanese uses it too, but augments it with two other writing systems. Chinese characters were designed to be written with straight or gently curving brush strokes, which means even something that depicts a circle is more like a box — here's the sun: 日. There are actually two versions of the Chinese characters now; in the People's Republic of China, some characters have been simplified, while in Taiwan and some other places, the traditional forms remain. Here's a traditional horse: 馬. Here's a simplified one: 马.
The dead giveaway for Japanese is this character: の. It shows up often because it's an important grammatical particle (pronounced "no"). It's round, which Chinese characters never are. It's part of the hiragana character set, which is unique to Japanese. Japanese also has another character set, called katakana, which includes a character you have probably seen or even used without knowing where it came from: ツ. (It stands for the sound "tsu.")
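The same giveaway works for software: if any hiragana or katakana turns up among the Chinese characters, the text is Japanese. Here is a rough Python sketch of that check. The helper names are hypothetical, and the ranges cover only the basic Unicode blocks for kana and for the most common Chinese characters.

```python
def is_kana(ch: str) -> bool:
    # Hiragana is U+3040..U+309F and katakana is U+30A0..U+30FF.
    return "\u3040" <= ch <= "\u30FF"


def is_chinese_character(ch: str) -> bool:
    # The main block of Chinese characters (CJK Unified Ideographs).
    return "\u4E00" <= ch <= "\u9FFF"


def japanese_or_chinese(text: str) -> str:
    if any(is_kana(ch) for ch in text):
        return "Japanese"
    if any(is_chinese_character(ch) for ch in text):
        return "Chinese characters, no kana"
    return "neither"


print(japanese_or_chinese("日本の馬"))
print(japanese_or_chinese("马"))
```

A text with Chinese characters and no kana is most likely Chinese, but a very short Japanese phrase can look the same, so the second answer is deliberately cautious.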
The Korean script is a very well-thought-out syllabic script called hangul that was invented in the 1440s by King Sejong. It is designed to indicate the shape and position of the lips and tongue for the different sounds that combine into a syllable. Look for characters made up of combinations of circles, straight lines, and upward points.
Mongolian tends to use the Cyrillic alphabet now, but you might still see the Mongolian script on signs, books, CDs, and similar display items. It is extremely distinctive, because it is written vertically, top to bottom, and with a line down the right side. It's vaguely reminiscent of very sharp Arabic writing turned on its side, or the mane of a wild horse.
Eastern Europe and the Caucasus
The Cyrillic alphabet, which is named after Saint Cyril, is used to write Russian, many (but not all) other Slavic languages, and quite a few non-Slavic languages as well. You probably know what it looks like, and may think of it as having "backwards N" and "backwards R," but you should know that the letter и stands for an "ee" sound and the letter я stands for a "ya" sound. Other misleading shapes for English speakers include с, which is "s"; н, which is "n"; and р, which is "r" (Cyrillic was based on the Greek alphabet, not the Latin one). The Cyrillic alphabet has variations that are clues for specific languages, and Robert Romanchuk, Associate Professor of Slavic at Florida State University, has helpfully let me in on a few: If it has comparatively many ascenders and descenders, in particular the letters ћ, ђ, and ј, it's probably Serbian; if it has comparatively many ascenders and dots (in particular ґ, і, and ї) but not descenders, it's probably Ukrainian; if it has ў, it's Belarusian; and while Russian has the letter ъ, it's not too common, so if you see it more than once or twice, you're probably looking at Bulgarian.
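Those letter clues translate fairly directly into code. The Python sketch below is only an illustration: the function name, the order of the checks, and reading "more than once or twice" as "more than two" are all assumptions made for the example. The letters are written as Unicode escapes because several of them are easy to confuse with Latin lookalikes.

```python
SERBIAN = {"\u045B", "\u0452", "\u0458"}    # ћ, ђ, ј
UKRAINIAN = {"\u0491", "\u0456", "\u0457"}  # ґ, і, ї
BELARUSIAN_U = "\u045E"                     # ў
HARD_SIGN = "\u044A"                        # ъ


def guess_cyrillic_language(text: str) -> str:
    """Guess a language from the telltale Cyrillic letters listed above."""
    text = text.lower()
    # U+0400..U+04FF is the basic Cyrillic block.
    if not any("\u0400" <= ch <= "\u04FF" for ch in text):
        return "not Cyrillic"
    letters = set(text)
    # Assumption: check ў first, since Belarusian also uses і.
    if BELARUSIAN_U in letters:
        return "Belarusian"
    if letters & SERBIAN:
        return "Serbian"
    if letters & UKRAINIAN:
        return "Ukrainian"
    # Assumption: "more than once or twice" means three or more.
    if text.count(HARD_SIGN) > 2:
        return "Bulgarian"
    return "Russian or another Cyrillic language"


print(guess_cyrillic_language("Привет, мир"))
```

The fallback answer is intentionally vague, since plenty of languages that use Cyrillic have none of these telltale letters.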
You can probably recognize Greek, especially since some of its letters are used in math and in fraternity names (α, β, θ, π, ψ…).
Georgian: ქართული დამწერლობა
This language of the Republic of Georgia (or "Sakartvelo" in their own language) has an alphabet that looks like it could be from South India. Telltale letters are those that look like m, n, and o with an attached blade ლ ღ დ (standing for "l", "gh", and "d") and one that looks like a carrot წ (standing for an ejective "ts").
Armenian: Հայերենի այբուբեն
The Armenian alphabet will probably look to you like it's made mainly of variations on upside-down and right-side-up versions of squared-off r, m, and n, with ascenders and descenders (so think of Armenian).
The Middle East (and beyond)
Arabic: الحروف العربية
You probably recognize this flowing script, which is written from right to left. What you may not know is that it is used by many other languages, most of which are not even related to Arabic. Farsi (Persian) and Urdu are Indo-European languages, for instance (Urdu is mutually intelligible with Hindi), and Kazakh and Uyghur are Turkic. (Turkish used to be written with the Arabic script too, but now it uses a version of Latin.) Since these languages all have different sound systems to Arabic, they have added and modified characters that can tell you right away what language you're looking at — that is, if you're familiar with the Arabic script. If you're not, it might be too much to get a grip on, especially since each character has four different forms depending on where it is in a word. But some recurring combinations of strokes will help you tell apart the three languages you're most likely to see in the Arabic script: If it has lots of parallel vertical lines, especially at the beginnings of words, it's probably Arabic; if it has lots of words ending in a stroke that swings out and then back below the line, especially کے, it's very likely Urdu; and if it has neither of those characteristics, but you often see a vertical line to the right of a cup with a dot over it (آن), it's probably Farsi (Persian).
Hebrew: אלפבית עברי
You also likely recognize Hebrew's right-to-left writing, which looks like carefully brush-painted letters (just look for the upward point on the top left of each one). You might take a few moments longer to recognize it in some of the modern type faces, though. Also, while it formally requires small dots and lines to indicate vowels, they're often left off in common contexts.
This syllabic alphabet is used for Amharic (Ethiopian), Tigrinya, and a few other languages of the area. It always looks as if it has been drawn freehand, even in computer typefaces, and as such has many round-ish but not exactly round shapes connected to straight-ish but not exactly straight lines. Many characters look like they have heads, eyes, arms, or legs; you get a sense of it from this quote from Wikipedia: አማርኛ ፡ የኢትዮጵያ ፡ መደበኛ ፡ ቋንቋ ፡ ነው።
This alphabet, Tifinagh, is used for writing Berber languages, which are spoken in the Sahara region. It looks like a mixture of Latin and Greek letters, geometric symbols, and shapes from spatial intelligence tests.