Difference between revisions of "子"

From TORI

Revision as of 20:11, 26 May 2021

子 (X5B50) is Unicode character number 23376.

子 (X5B50) is interpreted as a Kanji liberal or Kanji compound, to distinguish it from ⼦ (X2F26), which is interpreted as a Kanji radical.[1]

X5B50.png

Drawing of X2F26 and X5B50 [2]

Encoding

In UTF-8, character 子 (X5B50) is encoded with 3 bytes. These bytes are:
xE5 xAD x90 in the hexadecimal representation and
229 173 144 in the decimal representation.
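
The three bytes can be derived from the code point 0x5B50 with bit operations. The following is a minimal sketch, not part of the original program; the helper name utf8_three_bytes is hypothetical, and the sketch assumes a code point in the range U+0800..U+FFFF (it does not reject surrogates).

```php
<?php
// Sketch: build the 3-byte UTF-8 sequence for a code point in U+0800..U+FFFF.
// utf8_three_bytes is a hypothetical helper name used only for illustration.
function utf8_three_bytes(int $cp): array {
    if ($cp < 0x800 || $cp > 0xFFFF) {
        throw new InvalidArgumentException("code point outside the 3-byte range");
    }
    return [
        0xE0 | ($cp >> 12),          // 1110xxxx: top 4 bits of the code point
        0x80 | (($cp >> 6) & 0x3F),  // 10xxxxxx: middle 6 bits
        0x80 | ($cp & 0x3F),         // 10xxxxxx: low 6 bits
    ];
}

$bytes = utf8_three_bytes(0x5B50);
printf("x%02X x%02X x%02X\n", ...$bytes);  // prints: xE5 xAD x90
printf("%d %d %d\n", ...$bytes);           // prints: 229 173 144
```

Calling utf8_three_bytes(0x2F26) in the same way gives the bytes of the similar character X2F26.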

In HTML, this character can be generated with
& # X 5 B 5 0 ; and also with
& # 2 3 3 7 6 ;
For activation, the spaces should be removed from the two lines above; then, each of them generates the same character X5B50 (子).
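
That the hexadecimal and the decimal references denote the same character can be checked in PHP with the built-in html_entity_decode. This is only a sketch: it uses the decimal value 23376, which equals 0x5B50, and the variable names are chosen for illustration.

```php
<?php
// Decode the hexadecimal and the decimal numeric character reference of X5B50.
$hex = html_entity_decode('&#x5B50;', ENT_QUOTES | ENT_HTML5, 'UTF-8');
$dec = html_entity_decode('&#23376;', ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo bin2hex($hex), "\n";   // e5ad90
echo bin2hex($dec), "\n";   // e5ad90
var_dump($hex === $dec);    // bool(true)
```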

Encoding of 子 and that of the similar character ⼦ can be revealed with the PHP program below:

<?php 
function unichr($dec) {
  if ($dec < 128) {
    $utf = chr($dec);
  } else if ($dec < 2048) {
    $utf = chr(192 + (($dec - ($dec % 64)) / 64));
    $utf .= chr(128 + ($dec % 64));
  } else {
    $utf = chr(224 + (($dec - ($dec % 4096)) / 4096));
    $utf .= chr(128 + ((($dec % 4096) - ($dec % 64)) / 64));
    $utf .= chr(128 + ($dec % 64));
  }
  return $utf;
} 

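if (!function_exists('mb_str_split')) {
  // PHP 7.4+ with the mbstring extension already provides mb_str_split;
  // define the portable version only when it is missing,
  // otherwise PHP stops with a "cannot redeclare" error.
  function mb_str_split($str) {
     // split multibyte string in characters
     // at all positions except the start: ^
     // and the end: $
     $pattern = '/(?<!^)(?!$)/u';
     return preg_split($pattern,$str);
  }
}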

function uniord($a) 
{
  $M=strlen($a);
  $p=ord($a[0]);                    if($M==1) return $p;
  $p-=194;  $p*=64; $p+=ord($a[1]); if($M==2) return $p;
  $p-=2050; $p*=64; $p+=ord($a[2]);           return $p;
}

//$a=unichr(0x2f25);
//$a.=unichr(0x5973);
//$a.=unichr(0xF981);

$a="⼦子";
echo "$a\n";
$N=strlen($a);
echo "The array has $N bytes; here is its splitting:\n";

for($n=0;$n<$N;$n++)
{
printf("%02x ",ord($a[$n]) );
}
echo "\n";

$b = mb_str_split($a);

var_dump($b);
$M=count($b);

#mb_internal_encoding("UTF-8");

for($m=0;$m<$M;$m++)
{
printf("\n");
$c=$b[$m];
$u=uniord($c);
printf("Unicode character number %05d id est, [[X%04X]]\n",$u,$u);
$d=strlen($c);
echo "Picture: [[$c]] ; uses $d bytes. These bytes are:\n";
for($n=0;$n<$d;$n++) printf("x%2X ",ord($c[$n]));
printf("in the hexadecimal representation and\n");
for($n=0;$n<$d;$n++) printf("%3d ",ord($c[$n]));
printf("in the decimal representation\n");
}
?>

This program uses the portable PHP functions unichr.t, mb_str_split.t, uniord.t.

The output is

⼦子
The array has 6 bytes; here is its splitting:
e2 bc a6 e5 ad 90 
array(2) {
  [0]=>
  string(3) "⼦"
  [1]=>
  string(3) "子"
}

Unicode character number 12070 id est, [[X2F26]]
Picture: [[⼦]] ; uses 3 bytes. These bytes are:
xE2 xBC xA6 in the hexadecimal representation and
226 188 166 in the decimal representation

Unicode character number 23376 id est, [[X5B50]]
Picture: [[子]] ; uses 3 bytes. These bytes are:
xE5 xAD x90 in the hexadecimal representation and
229 173 144 in the decimal representation
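
The constants 194 and 2050 in uniord fold the UTF-8 prefix bits into a running subtraction. For comparison, here is a sketch of the same decoding with explicit bit masks. The name utf8_to_codepoint is hypothetical; like uniord, the sketch handles only sequences of 1 to 3 bytes and does not validate its input.

```php
<?php
// Sketch: decode a single UTF-8 sequence of 1 to 3 bytes to its code point.
// utf8_to_codepoint is a hypothetical name; no validation is performed.
function utf8_to_codepoint(string $c): int {
    $n  = strlen($c);
    $b0 = ord($c[0]);
    if ($n === 1) return $b0;                         // 0xxxxxxx
    $b1 = ord($c[1]) & 0x3F;                          // strip the 10 prefix
    if ($n === 2) return (($b0 & 0x1F) << 6) | $b1;   // 110xxxxx 10xxxxxx
    $b2 = ord($c[2]) & 0x3F;
    return (($b0 & 0x0F) << 12) | ($b1 << 6) | $b2;   // 1110xxxx 10xxxxxx 10xxxxxx
}

foreach (["\xE2\xBC\xA6", "\xE5\xAD\x90"] as $c) {
    $u = utf8_to_codepoint($c);
    printf("%05d X%04X\n", $u, $u);
}
// prints:
// 12070 X2F26
// 23376 X5B50
```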

Semantic

BabyGirlRed02.jpg
⼦ (X2F26) or 子 (X5B50)

In the Chinese and Japanese languages, character ⼦ (X2F26) has the sense "child", "baby": a human under 17 years of age.

A similar meaning is also associated with character 子 (X5B50).

Phonetic

In Japanese, character ⼦ (X2F26) is pronounced "ko", こ, the same as character 子 (X5B50).

This sound also has other meanings. For example, it may denote a lake, or a natural or artificial water reservoir.

Confusion

Characters X2F26 (⼦) and X5B50 (子) look similar. This may cause confusion. There are alerts about this confusion [3][4][5].

Usually, a human, even a native Japanese speaker, looking at the characters ⼦ and 子, cannot guess which of them is X2F26 and which is X5B50. For citing, the alphanumeric names X2F26 and X5B50 seem to be better than ⼦ and 子. For this reason, four similar articles, X2F26, X5B50, ⼦ and 子, are loaded.
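
Although the two glyphs look alike, a program can tell them apart by their bytes. The sketch below is an illustration only: the helper names identify_ko and fold_radical are hypothetical, and replacing the radical with the ideograph (rather than the reverse) is an assumption made for the example.

```php
<?php
// Sketch: tell the two look-alike characters apart by their UTF-8 bytes.
// identify_ko and fold_radical are hypothetical helper names.
function identify_ko(string $ch): string {
    switch (bin2hex($ch)) {
        case 'e2bca6': return 'X2F26';   // KANGXI RADICAL CHILD
        case 'e5ad90': return 'X5B50';   // CJK UNIFIED IDEOGRAPH-5B50
        default:       return 'other';
    }
}

// Assumption for illustration: replace every X2F26 in a string with X5B50,
// so that later comparisons see only one of the two code points.
function fold_radical(string $s): string {
    return str_replace("\u{2F26}", "\u{5B50}", $s);
}

echo identify_ko("\u{2F26}"), "\n";                 // X2F26
echo identify_ko("\u{5B50}"), "\n";                 // X5B50
echo identify_ko(fold_radical("\u{2F26}")), "\n";   // X5B50
```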

References

  1. Some software uses different fonts for ⼦ (X2F26) and 子 (X5B50) to make these kanji distinguishable by humans without copy-pasting them into a program that identifies the Unicode characters.
  2. https://jisho.org/search/%23kanji%20%E5%AD%90 子 3 strokes. Radical: child, seed 子. Parts: 子. Child, sign of the rat, 11PM-1AM, first sign of Chinese zodiac. Kun: こ、 -こ、 ね. On: シ、 ス、 ツ. Jōyō kanji, taught in grade 1. JLPT level N5. 72 of 2500 most used kanji in newspapers ..
  3. http://www.unicode.org/Public/security/revision-03/confusablesSummary.txt
    1. Summary: Recommended confusable mapping for IDN
    2. File: confusablesSummary.txt
    3. Version: 2.1-draft
    4. Generated: 2010-04-13, 01:33:25 GMT
    5. Checkin: $Revision: 1.29 $
    6. For documentation and usage, see http://www.unicode.org/reports/tr39/
    .. 子 ⼦ // (‎ ⼦ ‎) 2F26 KANGXI RADICAL CHILD .. // (‎ 子 ‎) 5B50 CJK UNIFIED IDEOGRAPH-5B50 ..
  4. https://util.unicode.org/UnicodeJsps/character.jsp?a=2F26 ⼦ 2F26 KANGXI RADICAL CHILD Han Script id: allowed confuse: ..
  5. https://util.unicode.org/UnicodeJsps/character.jsp?a=5B50 子 5B50 CJK UNIFIED IDEOGRAPH-5B50 Han Script id: restricted confuse: ..

Keywords

Baby, Child, Chinese, Japanese, Kanji, PHP, SomeU, Utf8table, UtfH, X2f26, X5B50