Difference between revisions of "Onna"
Revision as of 20:21, 26 May 2021
Onna (おんな) is a sound that in Japanese may denote a woman, a human female. [1]
Unicode
At least three Unicode characters are related to the sound Onna
and to the picture shown in Figure 1.
These characters are
⼥ X2F25,
女 X5973,
女 XF981.
The PHP program below shows the UTF-8 encoding of each of these three characters:
<?php
// Encode a Unicode code point (up to U+FFFF) as a UTF-8 string.
function unichr($dec) {
    if ($dec < 128) {
        $utf = chr($dec);
    } else if ($dec < 2048) {
        $utf = chr(192 + (($dec - ($dec % 64)) / 64));
        $utf .= chr(128 + ($dec % 64));
    } else {
        $utf = chr(224 + (($dec - ($dec % 4096)) / 4096));
        $utf .= chr(128 + ((($dec % 4096) - ($dec % 64)) / 64));
        $utf .= chr(128 + ($dec % 64));
    }
    return $utf;
}

// PHP 7.4 and later provide mb_str_split() in the mbstring extension;
// define the portable fallback only when it is missing, otherwise PHP
// stops with a "cannot redeclare" error.
if (!function_exists('mb_str_split')) {
    function mb_str_split($str) {
        // split multibyte string in characters
        // at all positions except the start: ^
        // and the end: $
        $pattern = '/(?<!^)(?!$)/u';
        return preg_split($pattern, $str);
    }
}

// Decode a single UTF-8 character (1 to 3 bytes) into its code point.
function uniord($a) {
    $M = strlen($a);
    $p = ord($a[0]);
    if ($M == 1) return $p;
    $p -= 194;
    $p *= 64;
    $p += ord($a[1]);
    if ($M == 2) return $p;
    $p -= 2050;
    $p *= 64;
    $p += ord($a[2]);
    return $p;
}

// Build a string of the three characters and dump its bytes.
$a = unichr(0x2f25);
$a .= unichr(0x5973);
$a .= unichr(0xF981);
echo "$a\n";
$N = strlen($a);
echo "The array has $N bytes; here is its splitting:\n";
for ($n = 0; $n < $N; $n++) {
    printf("%02x ", ord($a[$n]));
}
echo "\n";

// Split into characters and report each one separately.
$b = mb_str_split($a);
var_dump($b);
$M = count($b);
#mb_internal_encoding("UTF-8");
for ($m = 0; $m < $M; $m++) {
    printf("\n");
    $c = $b[$m];
    $u = uniord($c);
    printf("Unicode character number %05d id est, x%04X\n", $u, $u);
    $d = strlen($c);
    echo "Picture: $c uses $d bytes. These bytes are:\n";
    for ($n = 0; $n < $d; $n++) printf("x%2X ", ord($c[$n]));
    printf("in the hexadecimal representation and\n");
    for ($n = 0; $n < $d; $n++) printf("%3d ", ord($c[$n]));
    printf("in the decimal representation\n");
}
?>
This program uses the portable PHP functions unichr.t, mb_str_split.t and uniord.t. The output is:
⼥女女
The array has 9 bytes; here is its splitting:
e2 bc a5 e5 a5 b3 ef a6 81
array(3) {
  [0]=>
  string(3) "⼥"
  [1]=>
  string(3) "女"
  [2]=>
  string(3) "女"
}

Unicode character number 12069 id est, x2F25
Picture: ⼥ uses 3 bytes. These bytes are:
xE2 xBC xA5 in the hexadecimal representation and
226 188 165 in the decimal representation

Unicode character number 22899 id est, x5973
Picture: 女 uses 3 bytes. These bytes are:
xE5 xA5 xB3 in the hexadecimal representation and
229 165 179 in the decimal representation

Unicode character number 63873 id est, xF981
Picture: 女 uses 3 bytes. These bytes are:
xEF xA6 x81 in the hexadecimal representation and
239 166 129 in the decimal representation
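The byte values listed above can be cross-checked with PHP built-ins alone. The sketch below assumes PHP 7.0 or later, where the "\u{...}" escape produces the UTF-8 string of a code point; the helper name utf8_hex is hypothetical and is not part of the program above.

```php
<?php
// Sketch under assumptions: PHP 7.0+ for the "\u{...}" string escape.
// utf8_hex() is a hypothetical helper, not one of the functions above.
function utf8_hex(string $char): string {
    // bin2hex() renders each byte as two lowercase hexadecimal digits;
    // str_split() groups them per byte so they can be joined with spaces.
    return implode(' ', str_split(bin2hex($char), 2));
}

// The characters are written as escapes, so the source file itself
// contains none of the visually identical glyphs.
$samples = [
    'X2F25' => "\u{2F25}",
    'X5973' => "\u{5973}",
    'XF981' => "\u{F981}",
];

foreach ($samples as $label => $char) {
    printf("%s: %s (%d bytes)\n", $label, utf8_hex($char), strlen($char));
}
```

Writing the characters as escapes rather than as literal glyphs keeps the source unambiguous even if an editor or a wiki silently replaces one glyph with another.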
Confusion
A human, even a native Japanese speaker, looking at the characters ⼥, 女, 女 cannot tell which of them is X2F25, which is X5973, and which is XF981.
Some software also confuses these characters; for example, the default MediaWiki replaces character XF981 with character X5973 without any warning. This may cause problems in automatic data processing: in some cases the two objects are treated as the same, and in others they are not. A similar confusion arises from the careless use of the term "equality" applied to triangles in geometry.
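One plausible mechanism for such a silent replacement is Unicode normalization. This is an assumption, since the text above does not name the mechanism. In the Unicode data, XF981 has a canonical decomposition to X5973, so normalization form NFC replaces it, while X2F25 has only a compatibility decomposition and is replaced by NFKC but not by NFC. The sketch below assumes the PHP intl extension is loaded (it provides the Normalizer class); the helper name normalization_table is hypothetical.

```php
<?php
// Sketch only: the Normalizer class comes from PHP's intl extension,
// which may be absent. normalization_table() is a hypothetical helper.
function normalization_table(): array {
    $samples = [
        'X2F25' => "\u{2F25}",
        'X5973' => "\u{5973}",
        'XF981' => "\u{F981}",
    ];
    $table = [];
    foreach ($samples as $label => $char) {
        // Record the UTF-8 bytes before and after each normalization form.
        $table[$label] = [
            'raw'  => bin2hex($char),
            'NFC'  => bin2hex(Normalizer::normalize($char, Normalizer::FORM_C)),
            'NFKC' => bin2hex(Normalizer::normalize($char, Normalizer::FORM_KC)),
        ];
    }
    return $table;
}

if (class_exists('Normalizer')) {
    foreach (normalization_table() as $label => $row) {
        printf("%s: raw %s, NFC %s, NFKC %s\n",
            $label, $row['raw'], $row['NFC'], $row['NFKC']);
    }
} else {
    echo "The intl extension is not loaded; nothing to show.\n";
}
```

With intl available, the NFC column for XF981 should read e5a5b3, the bytes of X5973, while the NFC column for X2F25 should keep its own bytes e2bca5. A byte-wise comparison of two strings can therefore give different answers before and after normalization.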
In a text that is meant to be cited, for example copy-pasted into a browser field or a search engine, the characters corresponding to the sound onna should be specified as X2F25, X5973, XF981 rather than as ⼥, 女, 女, because software confuses the last two characters.
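The recommendation above can be automated. The following is a minimal sketch, assuming valid UTF-8 input, that rewrites a string as a list of labels in the X2F25 notation used in this article. The function name codepoint_labels is hypothetical; only PCRE and core PHP functions are used, so neither mbstring nor intl is required.

```php
<?php
// Hypothetical helper (not part of the program above): turn a UTF-8 string
// into code-point labels such as X2F25.
function codepoint_labels(string $text): array {
    // The /u flag makes PCRE treat $text as UTF-8, so the empty pattern
    // splits between characters rather than between bytes.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    $labels = [];
    foreach ($chars as $char) {
        $bytes = array_values(unpack('C*', $char));
        $n = count($bytes);
        if ($n === 1) {
            $cp = $bytes[0]; // ASCII: the byte is the code point
        } else {
            // The lead byte carries 5, 4 or 3 payload bits for 2-, 3- and
            // 4-byte sequences; each continuation byte carries 6 more.
            $cp = $bytes[0] & (0xFF >> ($n + 1));
            for ($i = 1; $i < $n; $i++) {
                $cp = ($cp << 6) | ($bytes[$i] & 0x3F);
            }
        }
        $labels[] = sprintf('X%04X', $cp);
    }
    return $labels;
}

// The input is written with escapes (PHP 7.0+) to keep the source unambiguous.
echo implode(', ', codepoint_labels("\u{2F25}\u{5973}\u{F981}")), "\n";
```

The labels survive copy-pasting and normalization unchanged, because they consist of ASCII characters only.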
As of 2021, this confusion is recognized and described [2][3][4].
Examples
The dictionary Jisho suggests examples with the sound onna. [5]
References
- ↑ 1.0 1.1 https://jisho.org/search/%23kanji%20%E5%A5%B3 https://jisho.org/search/%23kanji_女 woman, female Kun: おんな、 め On: ジョ、 ニョ、 ニョウ Jōyō kanji, taught in grade 1 JLPT level N5 151 of 2500 most used kanji in newspapers On reading compounds 女 【ジョ】 woman, girl, daughter, Chinese "Girl" constellation (one of the 28 mansions) 女王 【ジョオウ】 queen, female champion 処女 【ショジョ】 virgin, maiden 一女 【イチジョ】 one daughter, eldest daughter, first-born daughter 女王 【ジョオウ】 queen, female champion 女房 【ニョウボウ】 wife (esp. one's own wife), court lady, female court attache, woman who served at the imperial palace, woman (esp. as a love interest) 老若男女 【ロウニャクナンニョ】 men and women of all ages 天女 【テンニョ】 heavenly nymph, celestial maiden, beautiful and kind woman 女房 【ニョウボウ】 wife (esp. one's own wife), court lady, female court attache, woman who served at the imperial palace, woman (esp. as a love interest) 女官 【ジョカン】 court lady, lady-in-waiting Kun reading compounds 女 【おんな】 female, woman, female sex, female lover, girlfriend, mistress, (someone's) woman 女形 【おんながた】 onnagata, male actor in female kabuki roles, female partner (in a relationship) 醜女 【しゅうじょ】 homely woman, plain-looking woman, female demon 囲い女 【かこいおんな】 mistress 雌 【め】 female, smaller (of the two), weaker, woman, wife 女神 【めがみ】 goddess, female deity 早乙女 【さおとめ】 young female rice planter, young girl 醜女 【しゅうじょ】 homely woman, plain-looking woman, female demon
- ↑ https://util.unicode.org/UnicodeJsps/character.jsp?a=2F25 ⼥ 2F25 KANGXI RADICAL WOMAN Han Script id: allowed confuse: 女 , 女
- ↑ https://util.unicode.org/UnicodeJsps/character.jsp?a=5973 女 5973 CJK UNIFIED IDEOGRAPH-5973 Han Script id: restricted confuse: 女 , ⼥
- ↑ https://util.unicode.org/UnicodeJsps/character.jsp?a=F981 女 F981 CJK COMPATIBILITY IDEOGRAPH-F981 Han Script id: allowed confuse: 女 , ⼥
- ↑ https://jisho.org/search/%E5%A5%B3%20%E3%81%8A%E3%82%93%E3%81%AA%20%23words?page=2 女 おんな #words .. Words — 107 found おんなおや 女親 Links Noun 1. mother; female parent Details ▸ おんなざか 女坂 Links Noun 1. the easier of two slopes Details ▸ おんなきょうだい 女兄弟 Links Noun 1. sisters; female siblings Other forms 女姉妹 【おんなきょうだい】 Details ▸ おんなかぶき 女歌舞伎 Links Noun 1. girls' kabuki Details ▸ ..