子
子 (X5B50) is Unicode character number 23376.
子 (X5B50) is interpreted as a Kanji liberal or Kanji compound, which distinguishes it from ⼦ (X2F26), interpreted as a Kanji radical.[1]
Encoding
In UTF-8, the character 子 (X5B50) is encoded with 3 bytes. These bytes are:
xE5 xAD x90 in the hexadecimal representation and
229 173 144 in the decimal representation.
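The three bytes quoted above follow from the standard UTF-8 bit layout for code points between X0800 and XFFFF. The sketch below is only an illustration of that arithmetic; the helper name utf8_three_bytes is hypothetical and not part of PHP, and the function assumes its argument lies in the three-byte range.

```php
<?php
// Sketch: encode a code point in the range 0x0800..0xFFFF as three UTF-8 bytes.
// utf8_three_bytes is a hypothetical helper name, not a PHP built-in.
function utf8_three_bytes(int $cp): array {
    return [
        0xE0 | ($cp >> 12),          // 1110xxxx: top 4 bits of the code point
        0x80 | (($cp >> 6) & 0x3F),  // 10xxxxxx: middle 6 bits
        0x80 | ($cp & 0x3F),         // 10xxxxxx: low 6 bits
    ];
}

$bytes = utf8_three_bytes(0x5B50);
printf("x%02X x%02X x%02X\n", ...$bytes);  // xE5 xAD x90
printf("%d %d %d\n", ...$bytes);           // 229 173 144
```

Calling the same helper with 0x2F26 gives the bytes of the look-alike radical ⼦ instead.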
In HTML, this character can be generated with
& # X 5 B 5 0 ; and also with
& # 2 3 3 7 6 ;
For activation, the spaces should be removed from the two lines above; then each of them generates the same character
X5B50 (子 子).
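Whether the hexadecimal and the decimal numeric character references for X5B50 really denote the same character can be checked with PHP's html_entity_decode. This is a minimal sketch, assuming a PHP installation whose html_entity_decode is called with the UTF-8 charset; the hexadecimal reference is written here with a lowercase x.

```php
<?php
// Decode both numeric character references and compare the results.
$hex = html_entity_decode('&#x5B50;', ENT_QUOTES | ENT_HTML5, 'UTF-8');
$dec = html_entity_decode('&#23376;', ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $hex, ' ', $dec, "\n";   // both references should display the same glyph
var_dump($hex === $dec);      // true when the two references agree
echo bin2hex($hex), "\n";     // UTF-8 bytes of the decoded character, in hex
```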
Encoding of ⼦ and that of the similar character 子 can be revealed with the PHP program below:
<?php
// Encode a Unicode code point (below 0x10000) as a UTF-8 byte string.
function unichr($dec) {
    if ($dec < 128) {
        $utf = chr($dec);
    } else if ($dec < 2048) {
        $utf = chr(192 + (($dec - ($dec % 64)) / 64));
        $utf .= chr(128 + ($dec % 64));
    } else {
        $utf = chr(224 + (($dec - ($dec % 4096)) / 4096));
        $utf .= chr(128 + ((($dec % 4096) - ($dec % 64)) / 64));
        $utf .= chr(128 + ($dec % 64));
    }
    return $utf;
}
// PHP 7.4 and later provide mb_str_split() natively;
// define the portable version only if it is missing.
if (!function_exists('mb_str_split')) {
    function mb_str_split($str) {
        // split multibyte string in characters
        // at all positions except the start: ^
        // and the end: $
        $pattern = '/(?<!^)(?!$)/u';
        return preg_split($pattern, $str);
    }
}
// Decode one UTF-8 character of up to 3 bytes into its Unicode number.
function uniord($a)
{
    $M = strlen($a);
    $p = ord($a[0]);
    if ($M == 1) return $p;
    $p -= 194; $p *= 64; $p += ord($a[1]);
    if ($M == 2) return $p;
    $p -= 2050; $p *= 64; $p += ord($a[2]);
    return $p;
}
//$a=unichr(0x2f25);
//$a.=unichr(0x5973);
//$a.=unichr(0xF981);
$a = "⼦子";
echo "$a\n";
// Dump the whole string byte by byte.
$N = strlen($a);
echo "The array has $N bytes; here is its splitting:\n";
for ($n = 0; $n < $N; $n++) {
    printf("%02x ", ord($a[$n]));
}
echo "\n";
// Split the string into characters and report on each of them.
$b = mb_str_split($a);
var_dump($b);
$M = count($b);
#mb_internal_encoding("UTF-8");
for ($m = 0; $m < $M; $m++) {
    printf("\n");
    $c = $b[$m];
    $u = uniord($c);
    printf("Unicode character number %05d id est, [[X%04X]]\n", $u, $u);
    $d = strlen($c);
    echo "Picture: [[$c]] ; uses $d bytes. These bytes are:\n";
    for ($n = 0; $n < $d; $n++) printf("x%02X ", ord($c[$n]));
    printf("in the hexadecimal representation and\n");
    for ($n = 0; $n < $d; $n++) printf("%3d ", ord($c[$n]));
    printf("in the decimal representation\n");
}
?>
This program uses the portable PHP functions unichr.t, mb_str_split.t, uniord.t.
The output is
⼦子
The array has 6 bytes; here is its splitting:
e2 bc a6 e5 ad 90
array(2) {
[0]=>
string(3) "⼦"
[1]=>
string(3) "子"
}
Unicode character number 12070 id est, [[X2F26]]
Picture: [[⼦]] ; uses 3 bytes. These bytes are:
xE2 xBC xA6 in the hexadecimal representation and
226 188 166 in the decimal representation
Unicode character number 23376 id est, [[X5B50]]
Picture: [[子]] ; uses 3 bytes. These bytes are:
xE5 xAD x90 in the hexadecimal representation and
229 173 144 in the decimal representation
Semantic
In Chinese and Japanese, the character ⼦ (X2F26) means "child" or "baby", that is, a human under 17 years of age. The character 子 (X5B50) carries a similar meaning.
Phonetic
In Japanese, the character ⼦ (X2F26) is pronounced "ko" (こ), the same as the character 子 (X5B50).
This sound also has other meanings; for example, it may denote a lake or a natural or artificial water reservoir.
Confusion
The characters X2F26 (⼦) and X5B50 (子) look similar, which may cause confusion. There are alerts about this confusion [3][4][5].
Usually a human, even a native Japanese speaker, looking at the characters ⼦ 子 cannot guess which of them is X2F26 and which is X5B50. For citing, the alphanumeric names X2F26 and X5B50 seem to be better than ⼦ and 子. For this reason, four similar articles, X2F26, X5B50, ⼦ and 子, are loaded.
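Since the eye cannot tell the two glyphs apart, a program can do so by looking at the UTF-8 bytes. The sketch below maps the byte sequences reported in the Encoding section to the alphanumeric names; the function name which_ko and its lookup table are illustrative assumptions, not an existing library.

```php
<?php
// Sketch: tell the two look-alike characters apart by their UTF-8 bytes.
// which_ko and its table are hypothetical, introduced only for illustration.
function which_ko(string $char): string {
    $names = [
        'e2bca6' => 'X2F26',  // Kangxi radical
        'e5ad90' => 'X5B50',  // CJK unified ideograph
    ];
    $hex = bin2hex($char);    // bytes of the input as lowercase hex
    return $names[$hex] ?? 'other';
}

echo which_ko("\u{2F26}"), "\n";  // X2F26
echo which_ko("\u{5B50}"), "\n";  // X5B50
```

The escape form "\u{...}" needs PHP 7 or later; pasting the characters directly into the string works as well, provided the source file is saved as UTF-8.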
References
- ↑ Some software uses different fonts for ⼦ (X2F26) and 子 (X5B50) in order to make these kanji distinguishable by humans without copy-pasting them into a program that identifies the Unicode characters.
- ↑ https://jisho.org/search/%23kanji%20%E5%AD%90 子: 3 strokes. Radical: child, seed 子. Parts: 子. Child, sign of the rat, 11PM-1AM, first sign of Chinese zodiac. Kun: こ、 -こ、 ね. On: シ、 ス、 ツ. Jōyō kanji, taught in grade 1. JLPT level N5. 72 of 2500 most used kanji in newspapers.
- ↑ http://www.unicode.org/Public/security/revision-03/confusablesSummary.txt
- Summary: Recommended confusable mapping for IDN
- File: confusablesSummary.txt
- Version: 2.1-draft
- Generated: 2010-04-13, 01:33:25 GMT
- Checkin: $Revision: 1.29 $
- For documentation and usage, see http://www.unicode.org/reports/tr39/
- ↑ https://util.unicode.org/UnicodeJsps/character.jsp?a=2F26 ⼦ 2F26 KANGXI RADICAL CHILD Han Script id: allowed confuse: 子 ..
- ↑ https://util.unicode.org/UnicodeJsps/character.jsp?a=5B50 子 5B50 CJK UNIFIED IDEOGRAPH-5B50 Han Script id: restricted confuse: ⼦ ..
Keywords
Baby, Child, Chinese, Japanese, Kanji, PHP, SomeU, Utf8table, UtfH, X2f26, X5B50