Difference between revisions of "子"

Revision as of 20:11, 26 May 2021

子 (X5B50) is Unicode character number 23376.

子 (X5B50) is interpreted as Kanji liberal or Kanji compound, in order to distinguish it from ⼦ (X2F26), which is interpreted as Kanji radical.^[1]

Drawing of X2F26 and X5B50 ^[2]

Encoding

In Utf8, character 子 (X5B50) is encoded with 3 bytes. These bytes are:
xE5 xAD x90 in the hexadecimal representation and
229 173 144 in the decimal representation.
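
The three bytes quoted above can be checked by hand. The following is only a minimal sketch, assuming PHP 7 or later; the helper name utf8_three_bytes is hypothetical and is not one of the functions used elsewhere in this article. It splits the 16-bit code point into groups of 4, 6 and 6 bits and adds the UTF-8 marker bits to each group.

```php
<?php
// Minimal sketch: encode a code point in the range U+0800..U+FFFF as 3 UTF-8 bytes.
// utf8_three_bytes is a hypothetical helper name chosen for this illustration.
// Surrogate code points (U+D800..U+DFFF) are not rejected here.
function utf8_three_bytes(int $cp): array {
    if ($cp < 0x800 || $cp > 0xFFFF) {
        throw new InvalidArgumentException("code point outside the 3-byte range");
    }
    return [
        0xE0 | ($cp >> 12),          // leading byte: 1110xxxx (top 4 bits)
        0x80 | (($cp >> 6) & 0x3F),  // continuation byte: 10xxxxxx (middle 6 bits)
        0x80 | ($cp & 0x3F),         // continuation byte: 10xxxxxx (low 6 bits)
    ];
}

$bytes = utf8_three_bytes(0x5B50);
printf("x%02X x%02X x%02X\n", ...$bytes);  // prints: xE5 xAD x90
printf("%d %d %d\n", ...$bytes);           // prints: 229 173 144
```

Called with 0x2F26 instead, the same helper gives xE2 xBC xA6, the bytes of the similar character ⼦.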

In HTML, this character can be generated with
& # X 5 B 5 0 ; and also with
& # 2 3 3 7 6 ;
For activation, the spaces should be removed from the two lines above; then, each of them generates the same character X5B50 (子).
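
The equivalence of the hexadecimal and decimal references can be tested outside a browser. This is a hedged sketch, assuming the standard PHP function html_entity_decode with UTF-8 output; the variable names are illustrative only, and the hexadecimal reference is written here with a lowercase x.

```php
<?php
// Minimal sketch: decode the hexadecimal and the decimal character reference
// for X5B50 and check that both give the same UTF-8 string.
$hex = html_entity_decode('&#x5B50;', ENT_QUOTES | ENT_HTML5, 'UTF-8');
$dec = html_entity_decode('&#23376;', ENT_QUOTES | ENT_HTML5, 'UTF-8');

var_dump($hex === $dec);   // bool(true): both references yield the same character
var_dump(bin2hex($hex));   // string(6) "e5ad90": the UTF-8 bytes listed above
```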

Encoding of ⼦ and that of the similar character 子 can be revealed with the PHP program below:

<?php 
function unichr($dec) {
  if ($dec < 128) {
    $utf = chr($dec);
  } else if ($dec < 2048) {
    $utf = chr(192 + (($dec - ($dec % 64)) / 64));
    $utf .= chr(128 + ($dec % 64));
  } else {
    $utf = chr(224 + (($dec - ($dec % 4096)) / 4096));
    $utf .= chr(128 + ((($dec % 4096) - ($dec % 64)) / 64));
    $utf .= chr(128 + ($dec % 64));
  }
  return $utf;
} 

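// PHP 7.4 and later may already provide mb_str_split (mbstring extension);
// declare this fallback only when the function is missing, to avoid a redeclaration error.
if (!function_exists('mb_str_split')) {
  function mb_str_split($str) {
     // split multibyte string in characters
     // at all positions except the start: ^
     // and the end: $
     $pattern = '/(?<!^)(?!$)/u';
     return preg_split($pattern,$str);
  }
}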

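function uniord($a)
{
  // Decode one UTF-8 character of 1, 2 or 3 bytes to its code point.
  // The constants 194 and 2050 remove the marker bits of the lead and continuation bytes.
  $M=strlen($a);
  $p=ord($a[0]);                    if($M==1) return $p;
  $p-=194;  $p*=64; $p+=ord($a[1]); if($M==2) return $p;
  $p-=2050; $p*=64; $p+=ord($a[2]);           return $p;
}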

//$a=unichr(0x2f25);
//$a.=unichr(0x5973);
//$a.=unichr(0xF981);

$a="⼦子";
echo "$a\n";
$N=strlen($a);
echo "The array has $N bytes; here is its splitting:\n";

for($n=0;$n<$N;$n++)
{
printf("%02x ",ord($a[$n]) );
}
echo "\n";

$b = mb_str_split($a);

var_dump($b);
$M=count($b);

#mb_internal_encoding("UTF-8");

for($m=0;$m<$M;$m++)
{
printf("\n");
$c=$b[$m];
$u=uniord($c);
printf("Unicode character number %05d id est, [[X%04X]]\n",$u,$u);
$d=strlen($c);
echo "Picture: [[$c]] ; uses $d bytes. These bytes are:\n";
for($n=0;$n<$d;$n++) printf("x%2X ",ord($c[$n]));
printf("in the hexadecimal representation and\n");
for($n=0;$n<$d;$n++) printf("%3d ",ord($c[$n]));
printf("in the decimal representation\n");
}
?>

This program uses the portable PHP functions unichr.t, mb_str_split.t, uniord.t.

The output is

⼦子
The array has 6 bytes; here is its splitting:
e2 bc a6 e5 ad 90 
array(2) {
  [0]=>
  string(3) "⼦"
  [1]=>
  string(3) "子"
}

Unicode character number 12070 id est, [[X2F26]]
Picture: [[⼦]] ; uses 3 bytes. These bytes are:
xE2 xBC xA6 in the hexadecimal representation and
226 188 166 in the decimal representation

Unicode character number 23376 id est, [[X5B50]]
Picture: [[子]] ; uses 3 bytes. These bytes are:
xE5 xAD x90 in the hexadecimal representation and
229 173 144 in the decimal representation

Semantic

⼦ (X2F26) or 子 (X5B50)

In Chinese and Japanese, character ⼦ (X2F26) has the sense "child" or "baby": a human under 17 years of age.

A similar meaning is also associated with character 子 (X5B50).

Phonetic

In Japanese, character ⼦ (X2F26) is pronounced "ko" (こ), the same as character 子 (X5B50).

The sound "ko" also has other meanings: it may indicate a lake, that is, a natural or artificial water reservoir.

Confusion

Characters X2F26 (⼦) and X5B50 (子) look similar, which may cause confusion. There are alerts about this confusion ^[3]^[4]^[5].

Usually a human, even a native Japanese speaker, looking at the characters ⼦子 cannot tell which of them is X2F26 and which is X5B50. For citing, the alphanumeric names X2F26 and X5B50 seem to be better than ⼦ and 子. For this reason, four similar articles, X2F26, X5B50, ⼦ and 子, are loaded.
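
Since the eye cannot separate the two characters, a program can do it by comparing bytes. The sketch below is only an illustration limited to this one pair; the constant names and the functions count_child_lookalikes and fold_child_radical are hypothetical. Unicode compatibility normalization also maps X2F26 to X5B50, but this sketch does not call any normalization library and uses plain string functions instead.

```php
<?php
// Minimal sketch, limited to this one pair of look-alike characters.
// The constants and both function names are hypothetical, chosen for illustration.
const KANGXI_RADICAL_CHILD = "\xE2\xBC\xA6"; // X2F26 in UTF-8
const CJK_IDEOGRAPH_5B50   = "\xE5\xAD\x90"; // X5B50 in UTF-8

// Report how many times each of the two characters occurs in a UTF-8 string.
function count_child_lookalikes(string $s): array {
    return [
        'X2F26' => substr_count($s, KANGXI_RADICAL_CHILD),
        'X5B50' => substr_count($s, CJK_IDEOGRAPH_5B50),
    ];
}

// Replace the radical by the ideograph so that string comparisons see one form.
function fold_child_radical(string $s): string {
    return str_replace(KANGXI_RADICAL_CHILD, CJK_IDEOGRAPH_5B50, $s);
}

$text = KANGXI_RADICAL_CHILD . CJK_IDEOGRAPH_5B50;
print_r(count_child_lookalikes($text));  // one occurrence of each character
var_dump(fold_child_radical($text) === CJK_IDEOGRAPH_5B50 . CJK_IDEOGRAPH_5B50);  // bool(true)
```

Byte-wise search is safe for valid UTF-8 input, because the byte sequence of one character cannot begin in the middle of another character.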

References

[1] Some software uses different fonts for ⼦ (X2F26) and 子 (X5B50) in order to make these kanji distinguishable by humans without copy-pasting them into a program that identifies the Unicode characters.
[2] https://jisho.org/search/%23kanji%20%E5%AD%90 子 3 strokes. Radical: child, seed 子. Parts: 子. Meaning: child, sign of the rat, 11PM-1AM, first sign of Chinese zodiac. Kun: こ、 -こ、ね. On: シ、ス、ツ. Jōyō kanji, taught in grade 1. JLPT level N5. 72 of 2500 most used kanji in newspapers.
[3] http://www.unicode.org/Public/security/revision-03/confusablesSummary.txt Summary: Recommended confusable mapping for IDN. File: confusablesSummary.txt. Version: 2.1-draft. Generated: 2010-04-13, 01:33:25 GMT. Checkin: $Revision: 1.29 $. For documentation and usage, see http://www.unicode.org/reports/tr39/ .. 子⼦ // ( ⼦ ) 2F26 KANGXI RADICAL CHILD .. // ( 子 ) 5B50 CJK UNIFIED IDEOGRAPH-5B50 ..
[4] https://util.unicode.org/UnicodeJsps/character.jsp?a=2F26 ⼦ 2F26 KANGXI RADICAL CHILD. Han Script. id: allowed. confuse: 子 ..
[5] https://util.unicode.org/UnicodeJsps/character.jsp?a=5B50 子 5B50 CJK UNIFIED IDEOGRAPH-5B50. Han Script. id: restricted. confuse: ⼦ ..

Keywords

Baby, Child, Chinese, Japanese, Kanji, PHP, SomeU, Utf8table, UtfH, X2f26, X5B50,

@@ Line 159: / Line 159: @@
 Character [[&#X2F26;]] ([[X2F26]])
 has sense "child", "baby", Human of age under 17 years.
 The similar meaning is related also with character [[&#X5B50;]] ([[X5B50]]).