Difference between revisions of "⼤"

Latest revision as of 21:01, 21 May 2021

岩の間の⼤男 (a big man among the rocks)

⼤ is Unicode character number 12068 (see Utf8table).

Html input:
⼤ (& # 1 2 0 6 8 ;)
⼤ (& # x 2 F 2 4 ;)
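The two inputs above are the decimal and the hexadecimal form of the same character number. As a minimal sketch (the helper name numeric_refs is hypothetical; html_entity_decode is the standard PHP function), both forms can be built from the number 12068, and both decode to the same three bytes:

```php
<?php
// Hypothetical helper: build the decimal and hexadecimal numeric
// character references for a Unicode character number.
function numeric_refs(int $codepoint): array
{
    return [
        'decimal' => sprintf('&#%d;', $codepoint),
        'hex'     => sprintf('&#x%X;', $codepoint),
    ];
}

$refs = numeric_refs(12068);
echo $refs['decimal'], "\n"; // &#12068;
echo $refs['hex'], "\n";     // &#x2F24;

// Both references decode to the same UTF-8 string.
$flags = ENT_QUOTES | ENT_HTML5;
$fromDecimal = html_entity_decode($refs['decimal'], $flags, 'UTF-8');
$fromHex     = html_entity_decode($refs['hex'], $flags, 'UTF-8');

var_dump($fromDecimal === $fromHex); // bool(true)
echo bin2hex($fromDecimal), "\n";     // e2bca4
```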

Phonetic

⼤ may be pronounced as ダイ, "dai".

Semantic

⼤ may have the sense "big" or "large", especially in the combination 大きい.

大学 means "big school", that is, college or university; it is pronounced だいがく ("daigaku"). ^[1]

Antonyms

森の間の⼩男 (a small man in the forest)

Unicode character 12073, ⼩, may have the opposite meaning: "little", "small", "petite", "pequeno", "klein". Html input:
⼩ (& # 1 2 0 7 3 ;)
⼩ (& # x 2 F 2 9 ;)

An example is shown in the figure at right.

In Japanese, ⼩ is often followed by two hiragana characters: ⼩さい.

Unicode character 23567, 小, can also be considered an antonym of ⼤ ^[2]. Html input:
小 (& # 2 3 5 6 7 ;)
小 (& # x 5 C 0 F ;)

The characters ⼩ (& # 1 2 0 7 3 ;) and 小 (& # 2 3 5 6 7 ;) are easy to confuse.

Encoding

In UTF-8, the character ⼤ is encoded with 3 bytes (decimal values):
226 188 164
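The three bytes follow from the character number by the usual UTF-8 rule for three-byte sequences: the 16 bits of the number are split into groups of 4, 6 and 6 bits and placed into the pattern 1110xxxx 10xxxxxx 10xxxxxx. The sketch below illustrates this arithmetic; the function name utf8_three_bytes is hypothetical, and the sketch does not exclude the surrogate range.

```php
<?php
// Hypothetical helper: encode a character number in the range
// 0x0800..0xFFFF as three UTF-8 bytes (1110xxxx 10xxxxxx 10xxxxxx).
function utf8_three_bytes(int $codepoint): array
{
    if ($codepoint < 0x800 || $codepoint > 0xFFFF) {
        throw new InvalidArgumentException('not a three-byte character number');
    }
    return [
        0xE0 | ($codepoint >> 12),         // top 4 bits after the marker 1110
        0x80 | (($codepoint >> 6) & 0x3F), // middle 6 bits after the marker 10
        0x80 | ($codepoint & 0x3F),        // low 6 bits after the marker 10
    ];
}

$bytes = utf8_three_bytes(12068);
echo implode(' ', $bytes), "\n"; // 226 188 164
```

For 12068 (hexadecimal 2F24) the bit groups are 0010, 111100 and 100100, which give the bytes 226, 188 and 164.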

The encoding of ⼤ and related characters can be seen with the PHP code below:

<?php
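// PHP 7.4 and later provide a built-in mb_str_split() in the mbstring extension;
// declare this fallback only when it is missing, to avoid a redeclaration error.
if (!function_exists('mb_str_split')) {
   function mb_str_split($str) {
      // split a multibyte string into characters
      // Split at all positions, except just after the start: ^
      // and just before the end: $
      $pattern = '/(?<!^)(?!$)/u';
      return preg_split($pattern,$str);
   }
}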

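function uniord($a)
 {
   // Decode one UTF-8 character of 1 to 3 bytes into its Unicode number.
   // The constants remove the marker bits of the lead and continuation bytes:
   // 194 = 192 + 128/64 for two bytes, 2050 = 2048 + 128/64 for three bytes.
   // Four-byte sequences (numbers above 65535) are not handled.
   $M=strlen($a);
   $p=ord($a[0]);                    if($M==1) return $p;
   $p-=194;  $p*=64; $p+=ord($a[1]); if($M==2) return $p;
   $p-=2050; $p*=64; $p+=ord($a[2]);           return $p;
 }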

$a='⼤ 大 ⼩ 小'; /* two pairs of different unicode characters separated by spaces */

$N=strlen($a);
echo "The array has $N bytes; here is its splitting:\n";

for($n=0;$n<$N;$n++)
{
printf("%02x ",ord($a[$n]) );
}
echo "\n";

$b = mb_str_split($a);

var_dump($b);
$M=count($b);

#mb_internal_encoding("UTF-8");

for($m=0;$m<$M;$m++)
{
printf("\n");
$c=$b[$m];
$u=uniord($c);
printf("Unicode character number %05d id est, x%04x\n",$u,$u);
$d=strlen($c);
echo "Picture: $c uses $d bytes. These bytes are:\n";
for($n=0;$n<$d;$n++) printf("x%2x ",ord($c[$n]));
printf("in the hexadecimal representation and\n");
for($n=0;$n<$d;$n++) printf("%3d ",ord($c[$n]));
printf("in the decimal representation\n");
}
?>

The output is:


The array has 15 bytes; here is its splitting:
e2 bc a4 20 e5 a4 a7 20 e2 bc a9 20 e5 b0 8f 
array(7) {
  [0]=>
  string(3) "⼤"
  [1]=>
  string(1) " "
  [2]=>
  string(3) "大"
  [3]=>
  string(1) " "
  [4]=>
  string(3) "⼩"
  [5]=>
  string(1) " "
  [6]=>
  string(3) "小"
}

Unicode character number 12068 id est, x2f24
Picture: ⼤ uses 3 bytes. These bytes are:
xe2 xbc xa4 in the hexadecimal representation and
226 188 164 in the decimal representation

Unicode character number 00032 id est, x0020
Picture:   uses 1 bytes. These bytes are:
x20 in the hexadecimal representation and
 32 in the decimal representation

Unicode character number 22823 id est, x5927
Picture: 大 uses 3 bytes. These bytes are:
xe5 xa4 xa7 in the hexadecimal representation and
229 164 167 in the decimal representation

Unicode character number 00032 id est, x0020
Picture:   uses 1 bytes. These bytes are:
x20 in the hexadecimal representation and
 32 in the decimal representation

Unicode character number 12073 id est, x2f29
Picture: ⼩ uses 3 bytes. These bytes are:
xe2 xbc xa9 in the hexadecimal representation and
226 188 169 in the decimal representation

Unicode character number 00032 id est, x0020
Picture:   uses 1 bytes. These bytes are:
x20 in the hexadecimal representation and
 32 in the decimal representation

Unicode character number 23567 id est, x5c0f
Picture: 小 uses 3 bytes. These bytes are:
xe5 xb0 x8f in the hexadecimal representation and
229 176 143 in the decimal representation
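The function uniord above handles sequences of up to three bytes, which is sufficient for the four characters tested. As a sketch of a more general variant, the decoder below uses bit masks and also accepts four-byte sequences. It assumes that its argument is exactly one well-formed UTF-8 character; the name utf8_codepoint is hypothetical.

```php
<?php
// Hypothetical helper: decode one well-formed UTF-8 character
// (1 to 4 bytes) into its Unicode number. No validation of the
// continuation bytes is performed in this sketch.
function utf8_codepoint(string $char): int
{
    $len = strlen($char);
    if ($len < 1 || $len > 4) {
        throw new InvalidArgumentException('expected 1 to 4 bytes');
    }
    $lead = ord($char[0]);
    if ($len === 1) {
        return $lead; // 0xxxxxxx: the byte is the number itself
    }
    // Payload bits kept from the lead byte for 2, 3 and 4 byte sequences.
    $masks = [2 => 0x1F, 3 => 0x0F, 4 => 0x07];
    $cp = $lead & $masks[$len];
    for ($i = 1; $i < $len; $i++) {
        // Each continuation byte 10xxxxxx contributes 6 bits.
        $cp = ($cp << 6) | (ord($char[$i]) & 0x3F);
    }
    return $cp;
}

// The four characters of the test string, given by their bytes.
$samples = ["\xE2\xBC\xA4", "\xE5\xA4\xA7", "\xE2\xBC\xA9", "\xE5\xB0\x8F"];
foreach ($samples as $c) {
    printf("%s %d x%04x\n", $c, utf8_codepoint($c), utf8_codepoint($c));
}
```

For the four sample characters this gives the same numbers as the output above: 12068, 22823, 12073 and 23567.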

Confusion

In some software, character number 12068 (⼤) looks similar to character number 22823 (大) ^[3]^[4].
Html input:
大 (& # 2 2 8 2 3 ;)
大 (& # x 5 9 2 7 ;)

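The similarity in the graphical representations of the characters ⼤ and 大 may cause confusion.

The main difference is that 大 (& # 2 2 8 2 3 ;) is a CJK Unified Ideograph, the ordinary character used in both Chinese and Japanese text, while ⼤ (& # 1 2 0 6 8 ;) belongs to the Kangxi Radicals block (its Unicode name is KANGXI RADICAL BIG) and denotes the radical rather than the ordinary kanji.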
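Because the two characters consist of different bytes, a string comparison or search treats them as different even when they look the same on screen. One possible precaution is to replace characters 12068 and 12073 with characters 22823 and 23567 before comparing. The sketch below does this with a two-entry table; the function name fold_radicals is hypothetical, and the table covers only the two pairs discussed in this article.

```php
<?php
// Hypothetical helper: replace the look-alike characters 12068 and 12073
// with the characters 22823 and 23567. Only these two pairs are covered.
function fold_radicals(string $text): string
{
    return strtr($text, [
        "\xE2\xBC\xA4" => "\xE5\xA4\xA7", // 12068 -> 22823
        "\xE2\xBC\xA9" => "\xE5\xB0\x8F", // 12073 -> 23567
    ]);
}

$a = "\xE2\xBC\xA4\xE5\xAD\xA6"; // "big school" written with character 12068
$b = "\xE5\xA4\xA7\xE5\xAD\xA6"; // "big school" written with character 22823

var_dump($a === $b);                               // bool(false)
var_dump(fold_radicals($a) === fold_radicals($b)); // bool(true)
```

Where the PHP intl extension is available, Unicode compatibility normalization (Normalizer::normalize with the form Normalizer::FORM_KC) is a more general alternative, since it maps these radical characters to the corresponding ideographs without a hand-written table.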

References

This article can be cited as https://mizugadro.mydns.jp/t/index.php/%E2%BC%A4

[1] https://ja.wikipedia.org/wiki/大学 大学 (だいがく; English: college, university) ..
[2] https://en.wiktionary.org/wiki/%E5%B0%8F
[3] https://en.wikipedia.org/wiki/List_of_j%C5%8Dy%C5%8D_kanji 大 3 1 large ダイ、タイ、おお、おお-きい、おお-いに dai, tai, oo, oo-kii, oo-ini ..
[4] https://0g0.org/unicode/5927/ U+5927 Unicode character. Category: CJK Unified Ideographs. Numeric character references: & # x 5 9 2 7 ; and & # 2 2 8 2 3 ;. URL encoding (UTF-8): %E5%A4%A7. URL encoding (EUC-JP): %C2%E7. URL encoding (SHIFT_JIS): %91%E5. Unicode name: CJK UNIFIED IDEOGRAPH-5927. General category: Letter, Other. Possible garbled renderings: UTF-16 : ꓥ� Shift_JIS : 螟ｧ CP932 : 螟ｧ EUC-JP : 紊�. Base64 encoding: 5aSn

https://en.wikipedia.org/wiki/List_of_jōyō_kanji The jōyō kanji system of representing written Japanese consists of 2,136 characters.

Keywords

Japanese, Kanji, SomeU, Unicode, UtfH, Utf8table,

⼤ (& # 1 2 0 6 8 ;), 大 (& # 2 2 8 2 3 ;), ⼩ (& # 1 2 0 7 3 ;), 小 (& # 2 3 5 6 7 ;)

