Isn't it kinda obvious that people aren't particularly good at recognizing languages they don't know? Obviously, this affects all languages.
I don't have a concrete answer to your question, but it does sound an awful lot like you're getting xkcd 2501'd:

I'm a native English speaker with auditory processing problems. I have occasionally mistaken spoken German for "English that my brain didn't want to cooperate with."
When I hear drunk Finnish as a Hungarian, my brain goes into overdrive to try to decode it. I swear that if I were drunk enough, I could understand it. I do not understand one squeak of sober Finnish though.
Yes, people often confuse one language for another if they don't know either language.
Yes, languages get mistaken for each other all the time when one is not familiar with the writing system, and sometimes even when one is. I have struggled to understand posts in Spanish before realising they're actually in Portuguese, which I don't speak. (Also I'm pretty sure Norwegian and Danish are actually the same language).
Can you tell the difference between Telugu and Kannada just by looking at them? How about Arabic and Persian? How about Arabic and Ottoman Turkish? Persian and Kurdish? Yoruba and Xhosa? Ukrainian and Kazakh? Sumerian and Akkadian? Actually, could you tell the difference between Akkadian and Old Persian? They are both written with cuneiform characters, but the characters themselves are apparently as different as hiragana and hangul.
If your sentence is written entirely in Chinese characters, there is no way for somebody unfamiliar with them to determine whether it's Japanese, Mandarin, or Hokkien. And if somebody hasn't seen enough Japanese text to figure out the difference between kana and Chinese characters, they still won't be able to tell the difference.
As to why Google can't tell, Google doesn't actually understand anything. It's based on a massive database of which characters and combinations of characters come next to each other (and there's also some Markov stuff to account for common spelling mistakes). If your search string is made entirely of Chinese characters, it's going to get hits on websites written with Chinese characters, many of which will be in Mandarin. Google.com probably isn't able to detect your UI or browser language settings. To improve your chances of getting results from Japan, try using google.co.jp instead, as it will prioritise Japanese results.
Historically it's because languages borrow from each other. Japan borrowed characters from China and vice versa later on.
As an English speaker who's dabbled in other Germanic and Latin languages, absolutely. I dislike Dutch specifically because it will either start to read like English or German then fall off the deep end real quick.
Portuguese (at least Brazilian) looks and sounds like a mashup between Spanish and French.
This is kind of to be expected when you look at the history of how these languages evolved. The reason Japanese and Mandarin would be so easily confused is that the writing system was imported from China, and a lot of words still look the same as, or very similar to, the parent words. The same goes for the Latin and Germanic languages as well as their related offshoots. The history of invasion, language mixing, adaptation, and standardization has produced languages with varying levels of mutual intelligibility.
Some words in Japanese and Mandarin are written similarly or exactly the same, but keep in mind the pronunciation differs: 警察 is けいさつ in Japanese but jǐngchá in Mandarin. Dutch and German are different languages too, but do they also have pronunciation differences for words that are spelled the same?
Not on the same level considering the difference in how the Eastern and Western languages are formed.
I'm generalizing a bit here into Western being Germanic or Latin based languages and Eastern being primarily Chinese.
Western languages typically use symbols that represent the component sounds of a language (phonemes) where Eastern languages use symbols for whole words or concepts (morphemes). So you would have a single symbol for house or tree instead of a series of symbols for the words.
This means that the word spelling would change in Western languages as the spoken language changes more rapidly and show a large difference between two closely related languages in their spelling for the same word.
Conversely, in Eastern languages, the spoken language is not as closely tied to the written characters. (To translate into English as best I can: the character for house could be pronounced like house, home, building, cave, lean-to, castle, shed, etc., depending on where it's being said.) So, after a few generations, you'd have the same character pronounced two very different ways with two different meanings.
are there also pronunciation differences for words that are spelled the same?
Through through tough thought, I can't think of any.
Others have answered the question regarding other languages being mistaken for each other by non-speakers (of course this happens). Just wanted to add that Google has had a problem discerning Japanese and Chinese for the longest time and it drives me nuts. This is something a computer should easily be able to distinguish— we’re not talking about human recognition, we’re using an entirely different block of Unicode!
The most infuriating was when Google Maps’s text-to-speech insisted on using the Mandarin pronunciation for kanji when navigating IN JAPAN. I’m glad it no longer does that, but at the expense of still not using Japanese… if you have your phone set to English, now it’ll use only the English that appears on road signs, and pronounce the words according to English phonics rules. Not as bad, but still… why? Why not just allow Japanese pronunciation of place names while in Japan? Why must my desire to hear “turn right” also come with having Kinkakuji pronounced “kihnkeighkuhji”?
Kanji are derived from older Chinese character sets. That's why they look alike. Modern kanji have of course diverged, but the glyphs still look similar, so unless you know one or the other deeply, I wouldn't find it odd for people unfamiliar with either to be unable to tell them apart.
The example characters you've provided do look fairly similar to me, though I'm somewhat familiar with Kanji and can normally tell the difference.
In current-day tech, East Asian characters are taken from a combined set called "CJK Unified Ideographs". When regional variants exist, the variant that gets rendered depends on the font on the user's device.
There was a recent example that came up: 骨 (bone) has the little square on the left in Simplified Chinese and on the right in Japanese. With hiragana it's more obvious because of all the curvier letters, but with kanji-only phrases even smartphones tend to mix it up.
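To make the unified-set point concrete, here is a minimal Python sketch using only the standard library's `unicodedata` module. The names `BONE` and `describe` are just illustrative labels, not anything from a real tool. The point is that the text stores a single code point for 骨, with nothing in it that says which regional shape to draw.

```python
import unicodedata

# 骨 written as an escape so the code point is explicit.
BONE = "\u9aa8"


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# The same string is used for Chinese and Japanese text alike;
# which glyph shape appears is decided later by the font.
print(describe(BONE))
```

Since the string is identical either way, software has to rely on outside hints (font choice, a language tag, surrounding text) to pick the Japanese or the Chinese shape.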
I can't tell between Hungarian, Romanian, Bosnian, Albanian, Czech or Slovak, because I haven't really studied any of them and don't know any words. In the Cyrillic field, with Belarusian, Russian, Ukrainian (except for the existence of ї), Bulgarian and Serbian, I probably couldn't tell you which is which when faced with a random paragraph of text.
Spanish and Portuguese
German and Russian
German and Dutch a bit
Spanish is my first language and one of the most embarrassing moments of my life was asking two people if they were speaking Portuguese and they said no, Spanish.
Which would be okay because we all have different accents... But we were from the same country.
German and Russian?! Maybe to a person not at all familiar with either Germanic or Slavic languages, but they are not as closely related as the other two pairs.
Yeah I almost didn't put this one because I don't get it, but several people I know have confused them.
most people go by sound. there is a phenomenon where (european) portuguese gets viewed as 'some slavic language' by laypeeps. it is just arbitrary what people will think your language is.
Is your question whether this happens to other languages? If so that's an easy "yes".
Of course.
It's the logographic nature of the text that makes it more common in eastern languages. Modern Western written languages, for the most part, are strictly alphabetic, so the symbols don't represent concepts, only phonemes, which represent a concept when strung together. That makes it really hard to misinterpret the symbols, and it really doesn't matter which language you're representing with them. This gets a little fuzzy in Celtic languages because they have a lot of sound combinations that don't exist in other languages. There's also some confusion at times when looking at Latin script and Cyrillic: both are ultimately rooted in the Greek alphabet, but they evolved separate ways of handling various sounds. Still, since similar words that share a common meaning (and often a common root) in a lot of languages do in fact sound different, they are also often spelled differently enough that it quickly becomes obvious which language you're using. There are of course some words and short phrases that are written identically in multiple closely related languages, and that can confuse machines and people, but it's far less common than with eastern languages.
Curiously enough, spoken Farsi sounds like Russian to me (I cannot understand it, but it must have the same sounds, phonemes, or pauses, not sure), and I am fluent in Russian.
are you asking why google can not distinguish? that i don't know. it should be possible to discern if a text input is japanese or chinese by just looking at the character set, as long as it is not exclusively using the unified unicode characters.
for me, as a writing stan, it is possible to guess what language is written. the presence of kana makes it, indeed, trivially easy. but i have learned both chinese and japanese a bit (only in writing though, lol), so that i might be lucky enough to find a character to say for sure "that's written differently in chinese". people who don't know anything just see complex, chinese-ish characters and say chinese (even with kana present). the same ignorance is at work when western people call a farsi or urdu text arabic, or anything written in cyrillic russian.
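The character-set idea can be sketched in a few lines of Python. This is a rough illustration, not how Google or anyone else actually does it. The function names are made up for the example, and it only checks the basic Hiragana (U+3040 to U+309F), Katakana (U+30A0 to U+30FF) and CJK Unified Ideographs (U+4E00 to U+9FFF) blocks, ignoring extension blocks and half-width forms.

```python
def has_kana(text: str) -> bool:
    """True if any character falls in the Hiragana or Katakana block."""
    return any(
        "\u3040" <= ch <= "\u309f"      # Hiragana block
        or "\u30a0" <= ch <= "\u30ff"   # Katakana block
        for ch in text
    )


def has_han(text: str) -> bool:
    """True if any character falls in the basic CJK Unified Ideographs block."""
    return any("\u4e00" <= ch <= "\u9fff" for ch in text)


def guess_script(text: str) -> str:
    """Very rough guess: kana implies Japanese; Han alone is ambiguous."""
    if has_kana(text):
        return "japanese"
    if has_han(text):
        return "ambiguous"  # could be Chinese, or kanji-only Japanese
    return "other"


for sample in ["警察です", "警察", "police"]:
    print(sample, "->", guess_script(sample))
```

A kanji-only input like 警察 comes back as "ambiguous", which is exactly the case where a search engine would need extra context to decide.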
for languages written in latin script, i usually have to look twice to see if a text is danish, swedish, or norwegian, since i never learned any of these properly, and need to find the distinctive features.
