From TORI
Revision as of 19:37, 26 May 2021 by T (talk | contribs)
Jump to: navigation, search

Both (& # X 2 F 2 6 ;) and (& # X 5 B 5 0 ;) seem to redirect here.

(X2F26) is Unicode character number 12070.

(X2F26) is interpreted as Kanji radical in order to distinguish it from similar character (& # X 5 B 5 0 ;) that is interpreted as Kanji liberal or Kanji compound (although it has only one component).[1]

X5B50.png

Drawing of ,X2F26 and/or ,X5B50 [2]

Encoding

In Utf8, character (X5B50) is encoded with 3 bytes. These bytes are:
xE2 xBC xA6 in the hexadecimal representation and
226 188 166 in the decimal representation.

Encoding of X5B50 and that of the similar character X2F26 can be revealed with the PHP program below:

<?php 
function unichr($dec) {
  if ($dec < 128) {
    $utf = chr($dec);
  } else if ($dec < 2048) {
    $utf = chr(192 + (($dec - ($dec % 64)) / 64));
    $utf .= chr(128 + ($dec % 64));
  } else {
    $utf = chr(224 + (($dec - ($dec % 4096)) / 4096));
    $utf .= chr(128 + ((($dec % 4096) - ($dec % 64)) / 64));
    $utf .= chr(128 + ($dec % 64));
  }
  return $utf;
} 

function mb_str_split($str) {
   // split multibyte string in characters
   // at all positions except the start: ^
   // and the end: $
   $pattern = '/(?<!^)(?!$)/u';
   return preg_split($pattern,$str);
}

function uniord($a) 
{
  $M=strlen($a);
  $p=ord($a[0]);                    if($M==1) return $p;
  $p-=194;  $p*=64; $p+=ord($a[1]); if($M==2) return $p;
  $p-=2050; $p*=64; $p+=ord($a[2]);           return $p;
}

//$a=unichr(0x2f25);
//$a.=unichr(0x5973);
//$a.=unichr(0xF981);

$a="⼦子";
echo "$a\n";
$N=strlen($a);
echo "The array has $N bytes; here is its splitting:\n";

for($n=0;$n<$N;$n++)
{
printf("%02x ",ord($a[$n]) );
}
echo "\n";

$b = mb_str_split($a);

var_dump($b);
$M=count($b);

#mb_internal_encoding("UTF-8");

for($m=0;$m<$M;$m++)
{
printf("\n");
$c=$b[$m];
$u=uniord($c);
printf("Unicode character number %05d id est, [[X%04X]]\n",$u,$u);
$d=strlen($c);
echo "Picture: [[$c]] ; uses $d bytes. These bytes are:\n";
for($n=0;$n<$d;$n++) printf("x%2X ",ord($c[$n]));
printf("in the hexadecimal representation and\n");
for($n=0;$n<$d;$n++) printf("%3d ",ord($c[$n]));
printf("in the decimal representation\n");
}
?>

This program uses the portable PHP functions unichr.t, mb_str_split.t, uniord.t.

The ouptput is

⼦子
The array has 6 bytes; here is its splitting:
e2 bc a6 e5 ad 90 
array(2) {
  [0]=>
  string(3) "⼦"
  [1]=>
  string(3) "子"
}

Unicode character number 12070 id est, [[X2F26]]
Picture: [[⼦]] ; uses 3 bytes. These bytes are:
xE2 xBC xA6 in the hexadecimal representation and
226 188 166 in the decimal representation

Unicode character number 23376 id est, [[X5B50]]
Picture: [[子]] ; uses 3 bytes. These bytes are:
xE5 xAD x90 in the hexadecimal representation and
229 173 144 in the decimal representation

Semantic

BabyGirlRed02.jpg
, X2F26 or , X5B50

In Chinese and Japanese languages, character X5B50 (, , & # X 5 B 5 0;) has sense "child", "baby", Human of age under 17 years (See figure at left).

Character (X2F26) has similar meaning.

Phonetic

In Japanese language, character X5B50 (, , & # X 5 B 5 0;) is pronounced as "ko", こ.

This sound has also other meanings. It may indicate a lake, natural or artificial water reservoir.

Confusion

Characters (X2F26), X5B50 look similar. This may cause confusions. There are alerts about this congusion [3][4][5].

Usually, a Human, even the native Japanese speaker, looking at , cannot guess, which of them is X2F26 and which is X5B50. For citing, the alphanumerical names X2F26 and X5B50 seem to be better, than and . For this reason, the four similar articles X2F26, X5B50, , are loaded.

References

  1. Some software use different fonts for (X2F26) and (X5B50) to make them distinguishable by a Human.
  2. https://jisho.org/search/%23kanji%20%E5%AD%90 子 3 strokes Radical: child, seed 子 Parts: 子 child, sign of the rat, 11PM-1AM, first sign of Chinese zodiac Kun: こ、 -こ、 ね On: シ、 ス、 ツ Jōyō kanji, taught in grade 1 JLPT level N5 72 of 2500 most used kanji in newspapers Words starting with 子 Words ending with 子 Words containing 子 External links Stroke order ..
  3. http://www.unicode.org/Public/security/revision-03/confusablesSummary.txt
    1. Summary: Recommended confusable mapping for IDN
    2. File: confusablesSummary.txt
    3. Version: 2.1-draft
    4. Generated: 2010-04-13, 01:33:25 GMT
    5. Checkin: $Revision: 1.29 $
    6. For documentation and usage, see http://www.unicode.org/reports/tr39/
    .. 子 ⼦ // (‎ ⼦ ‎) 2F26 KANGXI RADICAL CHILD .. // (‎ 子 ‎) 5B50 CJK UNIFIED IDEOGRAPH-5B50 ..
  4. https://util.unicode.org/UnicodeJsps/character.jsp?a=2F26 ⼦ 2F26 KANGXI RADICAL CHILD Han Script id: allowed confuse: ..
  5. https://util.unicode.org/UnicodeJsps/character.jsp?a=5B50 子 5B50 CJK UNIFIED IDEOGRAPH-5B50 Han Script id: restricted confuse: ..

Keywords

Baby, Child, Chinese, Japanese, Kanji, PHP, SomeU, Utf8table, UtfH, X2f26, X5B50,