this post was submitted on 22 Jun 2026

Why do two languages get mistaken for one another? (piefed.social)

submitted 1 day ago* (last edited 1 day ago) by SilentStriker@piefed.social to c/nostupidquestions@lemmy.world

This is evident when I show people what handwritten Japanese (kanji only, without any kana) looks like: they still mistake it for Mandarin, because both are logographic. The same applies to Google searches. When I type a Japanese word in kanji (despite having the UI and browser set to Japanese or English), I still get results in Mandarin from websites with the TLD .cn or .tw, when I am looking for Japanese websites ending in .jp.

If a person can't distinguish between languages (especially ones that look similar when written even though they're different, somewhat like French and English), then they fall into the trap of asking "Is that French?" when the text is in fact English, or vice versa. Do 日本語 and 中文 look the "same" to you, or can you tell the difference?

You get the point. I still get comments equivalent to "is that Chinese?" even when there's kana present in the sentence (which Mandarin does not have, as it is written entirely in Hanzi). Some words are written the same, but the pronunciation is very different as they're unrelated languages. Does the same thing happen to, let's say, Norwegian and Danish (or any other pair of European languages), since both use similar alphabets and the same writing system?

Between Japanese and Mandarin, there are words that are written the same but pronounced completely differently, like:

Word	日本語 (reading)	中文 (pinyin)
擲弾兵	てきだんへい	Zhì dàn bīng
艦隊	かんたい	Jiànduì
陸軍	りくぐん	Lùjūn
神社	じんじゃ	Shénshè
地獄	じごく	Dìyù

[–] QuizzaciousOtter@lemmy.dbzer0.com 1 points 18 hours ago

Isn't it kinda obvious that people aren't particularly good at recognizing languages they don't know? Obviously, this affects all languages.

[–] CombatWombat@feddit.online 16 points 1 day ago

I don't have a concrete answer to your question, but it does sound an awful lot like you're getting xkcd 2501'd.

[–] EponymousBosh@awful.systems 4 points 1 day ago (1 children)

I'm a native English speaker with auditory processing problems. I have occasionally mistaken spoken German for "English that my brain didn't want to cooperate with."

[–] dzsimbo@sopuli.xyz 2 points 1 day ago

When I hear drunk Finnish as a Hungarian, my brain goes into overdrive to try to decode it. I swear that if I were drunk enough, I could understand it. I do not understand one squeak of sober Finnish though.

[–] mech@feddit.org 11 points 1 day ago

Yes, people often confuse one language for another if they don't know either language.

[–] Infrapink@thebrainbin.org 7 points 1 day ago

Yes, languages get mistaken for each other all the time when one is not familiar with the writing system, and sometimes even when one is. I have struggled to understand posts in Spanish before realising they're actually in Portuguese, which I don't speak. (Also I'm pretty sure Norwegian and Danish are actually the same language).

Can you tell the difference between Telugu and Kannada just by looking at them? How about Arabic and Persian? How about Arabic and Ottoman Turkish? Persian and Kurdish? Yoruba and Xhosa? Ukrainian and Kazakh? Sumerian and Akkadian? Actually, could you tell the difference between Akkadian and Old Persian? They are both written with cuneiform characters, but the characters themselves are apparently as different as hiragana and hangul.

If your sentence is written entirely in Chinese characters, there is no way for somebody unfamiliar with them to determine whether it's Japanese, Mandarin, or Hokkien. And if somebody hasn't seen enough Japanese text to figure out the difference between kana and Chinese characters, they still won't be able to tell the difference.

As to why Google can't tell, Google doesn't actually understand anything. It's based on a massive database of which characters and combinations of characters come next to each other (and there's also some Markov stuff to account for common spelling mistakes). If your search string is made entirely of Chinese characters, it's going to get hits on websites written with Chinese characters, many of which will be in Mandarin. Google.com probably isn't able to detect your UI or browser language settings. To ensure you get results from Japan, try using google.co.jp instead, as it will prioritise Japanese results.
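Another way to steer a search toward Japanese sites, whichever Google domain you start from, is to put a `site:.jp` restriction into the query itself. Here is a minimal sketch of building such a search URL. The helper name `japanese_site_search_url` is made up for illustration, and treating `hl` as an interface-language hint is an assumption about Google's search URL, not something confirmed in this thread.

```python
from urllib.parse import urlencode


def japanese_site_search_url(term: str) -> str:
    """Build a search URL restricted to the .jp TLD (hypothetical helper)."""
    params = {
        "q": f"{term} site:.jp",  # the site: operator limits hits to .jp domains
        "hl": "ja",               # assumption: hint for the interface language
    }
    # urlencode percent-encodes the kanji as UTF-8 and the space as "+"
    return "https://www.google.com/search?" + urlencode(params)


url = japanese_site_search_url("神社")
print(url)
```

This only filters by domain, so Japanese-language pages hosted on .com or other TLDs would be missed.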

[–] TrippinMallard@lemmy.ml 2 points 1 day ago

Historically it's because languages borrow from each other. Japan borrowed characters from China and vice versa later on.

[–] DokPsy@lemmy.world 11 points 1 day ago (1 children)

As an English speaker who's dabbled in other Germanic and Latin languages, absolutely. I dislike Dutch specifically because it will either start to read like English or German then fall off the deep end real quick.

Portuguese (at least Brazilian) looks and sounds like a mashup between Spanish and French.

This is kind of to be expected when you look at the history of how these languages evolved. The reason Japanese and Mandarin would be so easily confused is that the writing system was imported from China and there are a lot of words that either still look like the parent words or very similar. The same for the Latin and Germanic languages as well as their related offshoots. The history of invasion, language mixing, adaptations, and standardization has produced languages that are in various levels of mutual understanding.

[–] SilentStriker@piefed.social 3 points 1 day ago* (last edited 1 day ago) (2 children)

There are words in both Japanese and Mandarin that are written similarly or exactly the same, but keep in mind that the pronunciation is different: 警察 is けいさつ in Japanese but jǐngchá in Mandarin. Dutch and German are still different languages, but are there also pronunciation differences for words that are spelled the same?

[–] DokPsy@lemmy.world 1 points 1 day ago

Not on the same level considering the difference in how the Eastern and Western languages are formed.

I'm generalizing a bit here into Western being Germanic or Latin based languages and Eastern being primarily Chinese.

Western languages typically use symbols that represent the component sounds of a language (phonemes) where Eastern languages use symbols for whole words or concepts (morphemes). So you would have a single symbol for house or tree instead of a series of symbols for the words.

This means that the word spelling would change in Western languages as the spoken language changes more rapidly and show a large difference between two closely related languages in their spelling for the same word.

Conversely, in Eastern languages, the spoken language is not as closely tied to the words used. (To translate into English as best I can: the character for house could be pronounced like house, home, building, cave, lean-to, castle, shed, etc., depending on where it's being said.) So, after a few generations, you'd have the same character pronounced two very different ways with two different meanings.

[–] HobbitFoot@thelemmy.club 1 points 1 day ago

are there also pronunciation differences for words that are spelled the same?

Through through tough thought, I can't think of any.

[–] Iunnrais@piefed.social 1 points 23 hours ago* (last edited 23 hours ago)

Others have answered the question regarding other languages being mistaken for each other by non-speakers (of course this happens). Just wanted to add that Google has had a problem discerning Japanese and Chinese for the longest time and it drives me nuts. This is something a computer should easily be able to distinguish— we’re not talking about human recognition, we’re using an entirely different block of Unicode!

The most infuriating was when Google Maps’s text-to-speech insisted on using the Mandarin pronunciation for kanji when navigating IN JAPAN. I’m glad it no longer does that, but at the expense of still not using Japanese… if you have your phone set to English, now it’ll use only the English that appears on road signs, and pronounce the words according to English phonics rules. Not as bad, but still… why? Why not just allow Japanese pronunciation of place names while in Japan? Why must my desire to hear “turn right” also come with having Kinkakuji pronounced “kihnkeighkuhji”?

[–] just_another_person@lemmy.world 10 points 1 day ago* (last edited 1 day ago)

Kanji are derived from older Chinese character sets. That's why they look alike. Modern kanji have of course diverged, but the glyphs still look similar, so unless you know one or the other deeply, I wouldn't find it odd for people unfamiliar with either to be unable to tell them apart.

The example characters you've provided do look fairly similar to me, though I'm somewhat familiar with Kanji and can normally tell the difference.

[–] Rentlar@lemmy.ca 6 points 1 day ago

In current day tech, East Asian characters are taken from a combined set called "CJK unified ideographs". When regional variants exist, the language it renders as depends on the font of the user's device.

There was a recent example that came up: 骨(bone) has the little square on the left in Simplified Chinese and on the right in Japanese. With Hiragana it's more obvious because of all the curvier letters, but with kanji only phrases even smartphones tend to mix it up.
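To make the "combined set" point concrete, here is a small sketch showing that a character like 骨 is a single code point no matter which language the surrounding text is in, so the regional shape comes from the font and not from the text data. The helper `is_cjk_unified` is a hypothetical name, and it only checks the basic CJK Unified Ideographs block (U+4E00 to U+9FFF), not the extension blocks.

```python
import unicodedata


def is_cjk_unified(c: str) -> bool:
    """True if c is in the basic CJK Unified Ideographs block (U+4E00..U+9FFF)."""
    return 0x4E00 <= ord(c) <= 0x9FFF


ch = "骨"
# One code point, shared by Japanese and Chinese text alike
print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# Every character of a kanji-only word falls in the same shared block
print(all(is_cjk_unified(c) for c in "艦隊"))

# Kana live in their own blocks, outside the unified ideographs
print(is_cjk_unified("か"))
```

Because the code points are shared, a kanji-only string carries no language tag of its own; software has to rely on metadata or surrounding context to pick the right glyph variant.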

I can't tell apart Hungarian, Romanian, Bosnian, Albanian, Czech or Slovak, because I haven't really studied any of them and don't know any words. In the Cyrillic field, with Belarusian, Russian, Ukrainian (except for the existence of ї), Bulgarian and Serbian, I probably couldn't tell you which is which when faced with a random paragraph of text.

[–] dan1101@lemmy.world 3 points 1 day ago (2 children)

Spanish and Portuguese

German and Russian

German and Dutch a bit

[–] trashcroissant@lemmy.blahaj.zone 4 points 1 day ago

Spanish is my first language and one of the most embarrassing moments of my life was asking two people if they were speaking Portuguese and they said no, Spanish.

Which would be okay because we all have different accents... But we were from the same country.

[–] Yaky@slrpnk.net 4 points 1 day ago (1 children)

German and Russian?! Maybe to a person not at all familiar with either Germanic or Slavic languages, but they are not as closely related as the other 2 pairs.

[–] dan1101@lemmy.world 3 points 1 day ago (1 children)

Yeah I almost didn't put this one because I don't get it, but several people I know have confused them.

[–] phr@discuss.tchncs.de 3 points 1 day ago

most people go by sound. there is a phenomenon that (european) portuguese gets viewed as 'some slavic language' by laypeeps. it is just arbitrary what people will think your language is.

[–] False@lemmy.world 4 points 1 day ago

Is your question whether this happens to other languages? If so that's an easy "yes".

[–] Zwuzelmaus@feddit.org 3 points 1 day ago

Of course.

[–] GreenBeard@lemmy.ca 2 points 1 day ago

It's the logographic nature of the text that makes it more common in Eastern languages. Modern Western written languages are, for the most part, strictly alphabetic: the symbols don't represent concepts, only phonemes, which represent a concept when strung together. So it's really hard to misinterpret the symbols, and it really doesn't matter which language you're representing with them. This gets a little fuzzy in Celtic languages because they have a lot of sound combinations that don't exist in other languages. There's also some confusion at times between Latin script and Cyrillic, because while both are ultimately rooted in the Greek alphabet, they evolved separate ways of handling various sounds. But since similar words that share a common meaning (and often a common root word) in a lot of languages do in fact sound different, they are also often spelled differently enough that it quickly becomes obvious which language you're looking at. There are of course some words and short phrases that get written identically in multiple closely related languages, and that can confuse machines and people, but it's far less common than with Eastern languages.

[–] Yaky@slrpnk.net 1 points 1 day ago* (last edited 1 day ago)

Curiously enough, spoken Farsi sounds like Russian to me (I cannot understand it, but it must have the same sounds, phonemes, or pauses, not sure), and I am fluent in Russian.

[–] phr@discuss.tchncs.de 1 points 1 day ago

are you asking why google can not distinguish? that idk. it should be possible to discern if a text input is japanese or chinese by just looking at the character set, as long as it is not exclusively using the unified unicode characters.

for me, as a writing stan, it is possible to guess what language is written. the presence of kana makes it, indeed, trivially easy. but i have learned both chinese and japanese a bit (only in writing though, lol), so that i might be lucky enough to find a character to say for sure "that's written differently in chinese". people who don't know anything just see complex, chinese-ish characters and say chinese (even with kana present). the same ignorance is at work when western people call a farsi or urdu text arabic, or anything written in cyrillic russian.
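the kana check described here can be sketched in a few lines. this is a rough heuristic and not how any real search engine is known to work: the function names are made up, only the hiragana (U+3040 to U+309F) and katakana (U+30A0 to U+30FF) blocks are checked, and a kanji-only string is reported as ambiguous rather than guessed.

```python
def has_kana(text: str) -> bool:
    """True if text contains a character from the hiragana or katakana blocks."""
    # The two blocks are adjacent: hiragana U+3040..U+309F, katakana U+30A0..U+30FF
    return any(0x3040 <= ord(c) <= 0x30FF for c in text)


def guess_script(text: str) -> str:
    """Very rough guess (hypothetical helper): kana means Japanese, Han-only is ambiguous."""
    if has_kana(text):
        return "japanese"
    if any(0x4E00 <= ord(c) <= 0x9FFF for c in text):
        return "ambiguous (Han characters only)"
    return "unknown"


for sample in ["日本語を話します", "陸軍", "hello"]:
    print(sample, "->", guess_script(sample))
```

the katakana block also holds a few punctuation marks, such as the middle dot, so a real detector would need more care than this.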

for languages written in latin, i e.g. usually have to look twice to see if a text is danish, swedish, or norwegian, since i never learned any of these properly, and need to find the distinctive features.