Onna

From TORI
Revision as of 08:36, 26 May 2021 by T

Figure 1. Drawing of X2F25, X5973 or XF981 [1] (image file: OnnaDeaw.png)

Onna (おんな) is a sound that in Japanese may denote a woman, a human female. [1]

Unicode

At least three Unicode characters are related to the sound Onna and to the picture shown in Figure 1.
These characters are X2F25, X5973 and XF981.

The PHP program below shows the UTF-8 encoding of each of these three characters:

<?php 
function unichr($dec) {
  if ($dec < 128) {
    $utf = chr($dec);
  } else if ($dec < 2048) {
    $utf = chr(192 + (($dec - ($dec % 64)) / 64));
    $utf .= chr(128 + ($dec % 64));
  } else {
    $utf = chr(224 + (($dec - ($dec % 4096)) / 4096));
    $utf .= chr(128 + ((($dec % 4096) - ($dec % 64)) / 64));
    $utf .= chr(128 + ($dec % 64));
  }
  return $utf;
} 

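// PHP 7.4+ with the mbstring extension already provides mb_str_split;
// define the portable version only if it is missing, to avoid a redeclaration error
if (!function_exists('mb_str_split')) {
  function mb_str_split($str) {
     // split multibyte string in characters
     // at all positions except the start: ^
     // and the end: $
     $pattern = '/(?<!^)(?!$)/u';
     return preg_split($pattern,$str);
  }
}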

function uniord($a) 
{
  $M=strlen($a);
  $p=ord($a[0]);                    if($M==1) return $p;
  $p-=194;  $p*=64; $p+=ord($a[1]); if($M==2) return $p;
  $p-=2050; $p*=64; $p+=ord($a[2]);           return $p;
}

$a=unichr(0x2f25);
$a.=unichr(0x5973);
$a.=unichr(0xF981);
echo "$a\n";
$N=strlen($a);
echo "The array has $N bytes; here is its splitting:\n";

for($n=0;$n<$N;$n++)
{
printf("%02x ",ord($a[$n]) );
}
echo "\n";

$b = mb_str_split($a);

var_dump($b);
$M=count($b);

#mb_internal_encoding("UTF-8");

for($m=0;$m<$M;$m++)
{
printf("\n");
$c=$b[$m];
$u=uniord($c);
printf("Unicode character number %05d id est, x%04X\n",$u,$u);
$d=strlen($c);
echo "Picture: $c uses $d bytes. These bytes are:\n";
for($n=0;$n<$d;$n++) printf("x%2X ",ord($c[$n]));
printf("in the hexadecimal representation and\n");
for($n=0;$n<$d;$n++) printf("%3d ",ord($c[$n]));
printf("in the decimal representation\n");
}
?>

This program uses the portable PHP functions unichr.t, mb_str_split.t and uniord.t. Its output is:

⼥女女
The array has 9 bytes; here is its splitting:
e2 bc a5 e5 a5 b3 ef a6 81 
array(3) {
  [0]=>
  string(3) "⼥"
  [1]=>
  string(3) "女"
  [2]=>
  string(3) "女"
}

Unicode character number 12069 id est, x2F25
Picture: ⼥ uses 3 bytes. These bytes are:
xE2 xBC xA5 in the hexadecimal representation and
226 188 165 in the decimal representation

Unicode character number 22899 id est, x5973
Picture: 女 uses 3 bytes. These bytes are:
xE5 xA5 xB3 in the hexadecimal representation and
229 165 179 in the decimal representation

Unicode character number 63873 id est, xF981
Picture: 女 uses 3 bytes. These bytes are:
xEF xA6 x81 in the hexadecimal representation and
239 166 129 in the decimal representation
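
The byte values above can be cross-checked against the three-byte UTF-8 layout 1110xxxx 10xxxxxx 10xxxxxx. The following is a minimal sketch, assuming the code point lies in the range 0x0800..0xFFFF; the function name utf8_three_bytes is hypothetical and is not part of the program above.

```php
<?php
// Hypothetical helper (not part of the original program): encode one code point
// from the range 0x0800..0xFFFF as three UTF-8 bytes using bit operations.
// Surrogate code points are not excluded in this sketch.
function utf8_three_bytes(int $cp): array {
    if ($cp < 0x0800 || $cp > 0xFFFF) {
        throw new InvalidArgumentException("code point outside the three-byte range");
    }
    return [
        0xE0 | ($cp >> 12),          // 1110xxxx: top 4 bits
        0x80 | (($cp >> 6) & 0x3F),  // 10xxxxxx: middle 6 bits
        0x80 | ($cp & 0x3F),         // 10xxxxxx: low 6 bits
    ];
}

// Print the bytes of the three characters discussed in this article.
foreach ([0x2F25, 0x5973, 0xF981] as $cp) {
    $bytes = utf8_three_bytes($cp);
    printf("X%04X: %s\n", $cp, implode(' ', array_map('dechex', $bytes)));
}
```

The sketch prints e2 bc a5, e5 a5 b3 and ef a6 81 for the three characters, in agreement with the output of the program above.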

Confusion

A human, even a native Japanese speaker, looking at ⼥, 女, 女 cannot tell which of them is X2F25, which is X5973 and which is XF981.

Some software also confuses these characters; for example, the default MediaWiki replaces character XF981 with character X5973 without any warning. This may cause problems in the automatic processing of data: in some cases the two objects are treated as the same, and in others they are not. A similar confusion arises from the careless use of the term "equality" applied to triangles in geometry.
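
One mechanism that produces exactly this replacement is Unicode normalization to form NFC, which maps XF981 to X5973 and leaves X2F25 unchanged. Whether this is what the wiki software does in this particular case is an assumption here. The sketch below also assumes that the PHP intl extension (class Normalizer) is installed; the function name nfc_hex is hypothetical.

```php
<?php
// Sketch: return the UTF-8 bytes (as upper-case hex) of the NFC form of a string.
// Assumes the intl extension; returns null when Normalizer is unavailable
// or when normalization fails.
function nfc_hex(string $s): ?string {
    if (!class_exists('Normalizer')) {
        return null;
    }
    $n = Normalizer::normalize($s, Normalizer::FORM_C);
    return $n === false ? null : strtoupper(bin2hex($n));
}

// Compare the bytes before and after normalization for the three characters.
foreach (["\u{2F25}", "\u{5973}", "\u{F981}"] as $c) {
    $hex = nfc_hex($c);
    printf("%s -> %s\n", strtoupper(bin2hex($c)), $hex ?? "(intl extension not available)");
}
```

With the intl extension present, the bytes EF A6 81 of XF981 become E5 A5 B3, the bytes of X5973, while the other two characters keep their bytes.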

In a text that is meant to be cited, for example by copy-pasting into a browser frame or a search engine, the characters corresponding to the sound onna should be specified as X2F25, X5973, XF981 rather than ⼥, 女, 女: software confuses the last two characters.
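
The notation X2F25, X5973, XF981 can be produced mechanically from a string. The following is a minimal sketch, assuming the input is valid UTF-8; the function name to_x_notation is hypothetical and is not one of the TORI functions used above.

```php
<?php
// Hypothetical helper: rewrite every character of a UTF-8 string in the
// X-notation used in this article (e.g. X5973). Handles 1- to 4-byte sequences;
// assumes the input is valid UTF-8.
function to_x_notation(string $s): string {
    $chars = preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY);
    $out = [];
    foreach ($chars as $c) {
        $len = strlen($c);
        $b0 = ord($c[0]);
        // Payload bits of the lead byte: 7, 5, 4 or 3 depending on the length.
        $cp = ($len === 1) ? $b0 : ($b0 & (0xFF >> ($len + 1)));
        for ($i = 1; $i < $len; $i++) {
            // Each continuation byte contributes 6 payload bits.
            $cp = ($cp << 6) | (ord($c[$i]) & 0x3F);
        }
        $out[] = sprintf("X%04X", $cp);
    }
    return implode(', ', $out);
}

echo to_x_notation("\u{2F25}\u{5973}\u{F981}"), "\n";  // X2F25, X5973, XF981
```

The string literal uses \u{...} escapes rather than the glyphs themselves, so that the source of the sketch is not affected by the confusion described in this section.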

As of 2021, this confusion is recognized and described [2][3][4].

Examples

The dictionary Jisho suggests examples with the sound onna [5].

References

  1. https://jisho.org/search/%23kanji%20%E5%A5%B3 https://jisho.org/search/%23kanji_女 woman, female Kun: おんな、 め On: ジョ、 ニョ、 ニョウ Jōyō kanji, taught in grade 1 JLPT level N5 151 of 2500 most used kanji in newspapers On reading compounds 女 【ジョ】 woman, girl, daughter, Chinese "Girl" constellation (one of the 28 mansions) 女王 【ジョオウ】 queen, female champion 処女 【ショジョ】 virgin, maiden 一女 【イチジョ】 one daughter, eldest daughter, first-born daughter 女王 【ジョオウ】 queen, female champion 女房 【ニョウボウ】 wife (esp. one's own wife), court lady, female court attache, woman who served at the imperial palace, woman (esp. as a love interest) 老若男女 【ロウニャクナンニョ】 men and women of all ages 天女 【テンニョ】 heavenly nymph, celestial maiden, beautiful and kind woman 女房 【ニョウボウ】 wife (esp. one's own wife), court lady, female court attache, woman who served at the imperial palace, woman (esp. as a love interest) 女官 【ジョカン】 court lady, lady-in-waiting Kun reading compounds 女 【おんな】 female, woman, female sex, female lover, girlfriend, mistress, (someone's) woman 女形 【おんながた】 onnagata, male actor in female kabuki roles, female partner (in a relationship) 醜女 【しゅうじょ】 homely woman, plain-looking woman, female demon 囲い女 【かこいおんな】 mistress 雌 【め】 female, smaller (of the two), weaker, woman, wife 女神 【めがみ】 goddess, female deity 早乙女 【さおとめ】 young female rice planter, young girl 醜女 【しゅうじょ】 homely woman, plain-looking woman, female demon
  2. https://util.unicode.org/UnicodeJsps/character.jsp?a=2F25 ⼥ 2F25 KANGXI RADICAL WOMAN Han Script id: allowed confuse: 女 , 女
  3. https://util.unicode.org/UnicodeJsps/character.jsp?a=5973 女 5973 CJK UNIFIED IDEOGRAPH-5973 Han Script id: restricted confuse: 女 , ⼥
  4. https://util.unicode.org/UnicodeJsps/character.jsp?a=F981 女 F981 CJK COMPATIBILITY IDEOGRAPH-F981 Han Script id: allowed confuse: 女 , ⼥
  5. https://jisho.org/search/%E5%A5%B3%20%E3%81%8A%E3%82%93%E3%81%AA%20%23words?page=2 女 おんな #words .. Words: 107 found. おんなおや 女親, noun: mother; female parent. おんなざか 女坂, noun: the easier of two slopes. おんなきょうだい 女兄弟 (other form: 女姉妹 【おんなきょうだい】), noun: sisters; female siblings. おんなかぶき 女歌舞伎, noun: girls' kabuki. ..

Keywords

Chinese, Confusion, Female, Japanese, Onna, Utf8, Utf8table, UtfH, Woman, X2F25, X5973, XF981, ,