X5B50

From TORI

X5B50 (子) is Unicode character number 23376.

X5B50 (子) is interpreted as a Kanji compound (although it has only one component) or "Kanji liberal", in order to distinguish it from the similar character X2F26 (⼦), which is interpreted as a Kanji radical. Some software uses different fonts for these characters, to allow a human to distinguish them.

X5B50.png

Drawing of X2F26 (⼦) and/or X5B50 (子) [1]

Encoding

In UTF-8, character X5B50 (子, &#x5B50;) is encoded with 3 bytes. These bytes are:
xE5 xAD x90 in the hexadecimal representation and
229 173 144 in the decimal representation.
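The three bytes can be derived directly from the character number with bit operations. The sketch below is a minimal illustration, assuming a character number in the range 0x0800 to 0xFFFF (the three-byte UTF-8 form); the function name utf8_bytes_3 is hypothetical and is not one of the TORI functions used elsewhere in this article.

```php
<?php
// Hypothetical helper: split a code point in 0x0800..0xFFFF into its
// three UTF-8 bytes, following the pattern 1110xxxx 10xxxxxx 10xxxxxx.
function utf8_bytes_3(int $cp): array {
    if ($cp < 0x800 || $cp > 0xFFFF) {
        throw new InvalidArgumentException("code point outside the 3-byte range");
    }
    return [
        0xE0 | ($cp >> 12),          // lead byte carries the top 4 bits
        0x80 | (($cp >> 6) & 0x3F),  // continuation byte carries the middle 6 bits
        0x80 | ($cp & 0x3F),         // continuation byte carries the low 6 bits
    ];
}

foreach ([0x5B50, 0x2F26] as $cp) {
    $b = utf8_bytes_3($cp);
    printf("X%04X: x%02X x%02X x%02X\n", $cp, $b[0], $b[1], $b[2]);
}
// prints:
// X5B50: xE5 xAD x90
// X2F26: xE2 xBC xA6
```

The two characters share the three-byte form but differ in every byte, so a byte-level comparison always separates them even when the glyphs look identical.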

Encoding of X5B50 and that of the similar character X2F26 can be revealed with the PHP program below:

<?php 
function unichr($dec) {
  if ($dec < 128) {
    $utf = chr($dec);
  } else if ($dec < 2048) {
    $utf = chr(192 + (($dec - ($dec % 64)) / 64));
    $utf .= chr(128 + ($dec % 64));
  } else {
    $utf = chr(224 + (($dec - ($dec % 4096)) / 4096));
    $utf .= chr(128 + ((($dec % 4096) - ($dec % 64)) / 64));
    $utf .= chr(128 + ($dec % 64));
  }
  return $utf;
} 

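if (!function_exists('mb_str_split')) {
  // PHP 7.4 and later provide mb_str_split() natively; redeclaring it there
  // is a fatal error, so this fallback is defined only when it is missing.
  function mb_str_split($str) {
     // split multibyte string in characters
     // at all positions except the start: ^
     // and the end: $
     $pattern = '/(?<!^)(?!$)/u';
     return preg_split($pattern,$str);
  }
}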

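function uniord($a) 
{
  // Decode one UTF-8 character of 1 to 3 bytes into its character number.
  // The constants 194 and 2050 fold the prefix bits of the lead and
  // continuation bytes into single subtractions (194*64 = 192*64 + 128).
  $M=strlen($a);
  $p=ord($a[0]);                    if($M==1) return $p;
  $p-=194;  $p*=64; $p+=ord($a[1]); if($M==2) return $p;
  $p-=2050; $p*=64; $p+=ord($a[2]);           return $p;
}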

//$a=unichr(0x2f25);
//$a.=unichr(0x5973);
//$a.=unichr(0xF981);

$a="⼦子";
echo "$a\n";
$N=strlen($a);
echo "The array has $N bytes; here is its splitting:\n";

for($n=0;$n<$N;$n++)
{
printf("%02x ",ord($a[$n]) );
}
echo "\n";

$b = mb_str_split($a);

var_dump($b);
$M=count($b);

#mb_internal_encoding("UTF-8");

for($m=0;$m<$M;$m++)
{
printf("\n");
$c=$b[$m];
$u=uniord($c);
printf("Unicode character number %05d id est, [[X%04X]]\n",$u,$u);
$d=strlen($c);
echo "Picture: [[$c]] ; uses $d bytes. These bytes are:\n";
for($n=0;$n<$d;$n++) printf("x%2X ",ord($c[$n]));
printf("in the hexadecimal representation and\n");
for($n=0;$n<$d;$n++) printf("%3d ",ord($c[$n]));
printf("in the decimal representation\n");
}
?>

This program uses the portable PHP functions unichr.t, mb_str_split.t, uniord.t.

The output is

⼦子
The array has 6 bytes; here is its splitting:
e2 bc a6 e5 ad 90 
array(2) {
  [0]=>
  string(3) "⼦"
  [1]=>
  string(3) "子"
}

Unicode character number 12070 id est, [[X2F26]]
Picture: [[⼦]] ; uses 3 bytes. These bytes are:
xE2 xBC xA6 in the hexadecimal representation and
226 188 166 in the decimal representation

Unicode character number 23376 id est, [[X5B50]]
Picture: [[子]] ; uses 3 bytes. These bytes are:
xE5 xAD x90 in the hexadecimal representation and
229 173 144 in the decimal representation
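The byte sequences above can be cross-checked without the helper functions. The sketch below is only an illustration, assuming PHP 7.0 or later, where the "\u{...}" escape in double-quoted strings produces the UTF-8 encoding of a character number; the function name hex_bytes is hypothetical.

```php
<?php
// Hypothetical helper: show the bytes of a string as space-separated hex pairs.
function hex_bytes(string $s): string {
    // bin2hex() returns one continuous hex string; chunk_split() appends
    // a space after every two hex digits, and trim() drops the last one.
    return trim(chunk_split(bin2hex($s), 2, ' '));
}

echo hex_bytes("\u{2F26}"), "\n"; // X2F26; prints: e2 bc a6
echo hex_bytes("\u{5B50}"), "\n"; // X5B50; prints: e5 ad 90
```

Writing the characters as escapes rather than as literal glyphs also keeps the source code unambiguous, since the two glyphs are hard to tell apart in an editor.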

Semantic

VietnameseGirl1301846424fragme.jpg
X2F26 (⼦), X5B50 (子)

In Chinese and Japanese, character X5B50 (子, &#x5B50;) means "child" or "baby": a human under 17 years of age.

Character X2F26 (⼦) has a similar meaning.

Phonetic

In Japanese, character X5B50 (子) is pronounced "ko" (こ).

Character X2F26 (⼦) has a similar pronunciation.

This sound also has other meanings. For example, it may denote a lake or a natural or artificial water reservoir.

Confusion

Characters ⼦ (X2F26) and 子 (X5B50) look similar. This may cause confusion. There are alerts about this confusion [2][3][4].

Usually a human, even a native Japanese speaker, looking at characters ⼦ and 子 cannot guess which of them is X2F26 and which is X5B50.
For citing, the alphanumerical names X2F26 and X5B50 seem to be better than ⼦ and 子.
For this reason, four similar articles, X2F26, X5B50, ⼦ and 子, are loaded.
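Since the eye cannot separate the two characters, software has to compare bytes. The sketch below is a minimal illustration, assuming PHP 7.0 or later and UTF-8 input; the constant and function names are hypothetical. Replacing X2F26 by X5B50 follows the direction of the compatibility decomposition that Unicode assigns to X2F26, but whether such folding is appropriate depends on the application.

```php
<?php
// Hypothetical names for the two look-alike characters.
const KANGXI_RADICAL_CHILD = "\u{2F26}"; // bytes e2 bc a6
const CJK_IDEOGRAPH_5B50   = "\u{5B50}"; // bytes e5 ad 90

// Report which of the two characters a one-character string holds.
function which_child_character(string $c): string {
    if ($c === KANGXI_RADICAL_CHILD) return 'X2F26';
    if ($c === CJK_IDEOGRAPH_5B50)   return 'X5B50';
    return 'other';
}

// Replace every X2F26 in a string by X5B50, so that both spellings
// compare equal afterwards. A plain byte replacement is sufficient
// because one valid UTF-8 sequence cannot begin inside another.
function fold_radical_child(string $s): string {
    return str_replace(KANGXI_RADICAL_CHILD, CJK_IDEOGRAPH_5B50, $s);
}

echo which_child_character("\u{2F26}"), "\n";               // prints: X2F26
echo which_child_character("\u{5B50}"), "\n";               // prints: X5B50
echo bin2hex(fold_radical_child("\u{2F26}\u{5B50}")), "\n"; // prints: e5ad90e5ad90
```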

References

  1. https://jisho.org/search/%23kanji%20%E5%AD%90 子: 3 strokes. Radical: child, seed 子. Parts: 子. Meanings: child, sign of the rat, 11PM-1AM, first sign of Chinese zodiac. Kun: こ, -こ, ね. On: シ, ス, ツ. Jōyō kanji, taught in grade 1. JLPT level N5. 72 of 2500 most used kanji in newspapers.
  2. http://www.unicode.org/Public/security/revision-03/confusablesSummary.txt
     # Summary: Recommended confusable mapping for IDN
     # File: confusablesSummary.txt
     # Version: 2.1-draft
     # Generated: 2010-04-13, 01:33:25 GMT
     # Checkin: $Revision: 1.29 $
     # For documentation and usage, see http://www.unicode.org/reports/tr39/
     .. 子 ⼦ // ( ⼦ ) 2F26 KANGXI RADICAL CHILD .. // ( 子 ) 5B50 CJK UNIFIED IDEOGRAPH-5B50 ..
  3. https://util.unicode.org/UnicodeJsps/character.jsp?a=2F26 ⼦ 2F26 KANGXI RADICAL CHILD Han Script id: allowed confuse: ..
  4. https://util.unicode.org/UnicodeJsps/character.jsp?a=5B50 子 5B50 CJK UNIFIED IDEOGRAPH-5B50 Han Script id: restricted confuse: ..

Keywords

Baby, Child, Chinese, Japanese, Kanji, PHP, SomeU, Utf8table, UtfH, X2f26, X5B50