WalkU
Revision as of 20:38, 5 September 2021
WalkU is the set of the following five Unicode characters:
X2ECC ⻌ [1], KanjiRadical, Walk0, SIMPLIFIED WALK
X2ECD ⻍ [2], KanjiRadical, Walk1, WALK ONE
X2ECE ⻎ [3], KanjiRadical, Walk2, WALK TWO
X8FB6 辶 [4], KanjiLiberal, Walk3
XFA66 辶 [5], KanjiConfudal, Walk4
The system of names (Walk0, Walk1, Walk2, Walk3, Walk4) is based on the descriptions [2][3] of two of these characters by the Unicode Utilities. The similar kanji are numbered in order of increasing Unicode code point.
The characters of WalkU may share the pronunciation しんにょう and have similar meanings: walk, move, advance.
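The numbering rule (increasing code point gives Walk0 through Walk4) can be sketched in Python. This is a minimal illustration, not part of the original tooling; the names `WALKU_CODE_POINTS` and `walk_names` are hypothetical.

```python
# Code points of the five WalkU characters, as listed above.
# The list is deliberately unsorted to show that the order comes from sorting.
WALKU_CODE_POINTS = [0xFA66, 0x2ECE, 0x8FB6, 0x2ECC, 0x2ECD]


def walk_names(code_points):
    """Map each code point to Walk0, Walk1, ... in order of increasing code point."""
    return {cp: f"Walk{i}" for i, cp in enumerate(sorted(code_points))}


if __name__ == "__main__":
    for cp, name in walk_names(WALKU_CODE_POINTS).items():
        print(f"X{cp:04X} {name}")
```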
Confusion
In the 21st century, computer support for Japanese characters remains underdeveloped.
As of 2021, there is no unified standard for a default font that would make each character look the same on different computers, yet different from other characters.
Apparently, the opposite happens: a character looks different in the default fonts of different operating systems, but similar to other characters on the same computer. This causes confusion.
In particular, such confusion arises with the characters of the set WalkU. Sometimes they are considered equivalent, and sometimes not. [6]
The characters X2ECC ⻌, X2ECD ⻍, X2ECE ⻎, X8FB6 辶, XFA66 辶 look similar.
Even a native Japanese speaker, looking at the characters
⻌,
⻍,
⻎,
辶,
辶,
is unlikely to be able to answer:
Which of them is X2ECC?
Which of them is X2ECD?
Which of them is X2ECE?
Which of them is X8FB6?
Which of them is XFA66?
In such a case, the correct specification of the character is
"X2ECC ⻌ or X2ECD ⻍ or X2ECE ⻎ or X8FB6 辶 or XFA66 辶"
The term WalkU is defined to replace this long and complicated construction with a single word.
The characters of WalkU and their encodings can be identified with the PHP programs du1.t and ud.t. Example of a call:
php ud.t 2ECC 2ECD 2ECE 8FB6 FA66
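The source of ud.t is not shown here. As a rough Python analogue (an assumption about what such a tool might report, not a port of the author's PHP program), the sketch below takes hexadecimal code points and returns each character with its UTF-8 bytes and its name from the standard `unicodedata` module. The function name `describe` is hypothetical.

```python
import sys
import unicodedata


def describe(hex_code):
    """Return (character, UTF-8 bytes as hex, Unicode name) for a hex code point."""
    ch = chr(int(hex_code, 16))
    utf8_hex = ch.encode("utf-8").hex(" ").upper()
    name = unicodedata.name(ch, "<unnamed>")
    return ch, utf8_hex, name


if __name__ == "__main__":
    # Fall back to the five WalkU code points when no arguments are given.
    args = sys.argv[1:] or ["2ECC", "2ECD", "2ECE", "8FB6", "FA66"]
    for code in args:
        ch, utf8_hex, name = describe(code)
        print(f"X{code.upper()} {ch} [{utf8_hex}] {name}")
```

Although the glyphs may be indistinguishable on screen, the byte sequences and names differ, which is what makes the characters identifiable.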
In most cases, the difference between these characters is not important, since their phonetic and semantic properties are similar. However, mistakes appear when one needs to search for the character, to replace it throughout a text, or to perform more complicated operations on the document.
To a first semantic and phonetic approximation, all characters of WalkU are considered equivalent. They can be pronounced as しんにょう, Sjinniou, or Sjinnyou, meaning to go, to walk, to advance (for example, after being stuck in a traffic jam). After all characters of WalkU are replaced with one of these phonetic representations, the mistakes mentioned above do not occur.
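The search-and-replace problem described above can be sketched in Python. The example treats all five WalkU characters as one equivalence class by matching them with a single regular-expression character class. The names `WALKU_PATTERN`, `find_walku` and `replace_walku` are hypothetical, and the sample string is made up for illustration.

```python
import re

# One character class covering all five WalkU code points.
WALKU_PATTERN = re.compile("[\u2ECC\u2ECD\u2ECE\u8FB6\uFA66]")


def find_walku(text):
    """Return (index, 'Xhhhh') for every WalkU character found in the text."""
    return [(m.start(), f"X{ord(m.group()):04X}") for m in WALKU_PATTERN.finditer(text)]


def replace_walku(text, replacement="しんにょう"):
    """Replace every WalkU character with one chosen representation."""
    return WALKU_PATTERN.sub(replacement, text)


if __name__ == "__main__":
    # Three look-alike characters that are three different code points.
    sample = "a\u2ECCb\u8FB6c\uFA66"
    print(find_walku(sample))
    print(replace_walku(sample, "Sjinniou"))
```

A plain search for a single code point would miss the other four; matching the whole class avoids that.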
Tarja
To avoid the confusion described above, the special system of notation Tarja is suggested.
Tarja is a kind of Japanese-English pidgin. This language is similar to Japanese, but the ambiguous, confusing characters are excluded and replaced with Romaji and/or Hiragana words, based on graphic (JapanPi, JapanPsi), phonetic (Tori, Sakana, Chikara) or semantic (for example, Vecher) similarity.
Tarja is oriented toward the reading and analysis of Japanese texts.
First, the characters are translated from Japanese to Tarja.
This convention reduces the number of options to be considered
when looking for a meaningful combination of interpretations of each Katakana or Kanji:
some Katakana look like Kanji, but in Tarja they appear as ASCII or Hiragana.
Then the text in Tarja is analyzed and/or (if necessary) translated into other language(s).
Tarja is expected to be useful in the analysis of ambiguous Japanese texts: the goal is to reveal the ambiguities, to describe them, and to avoid them in meaningful texts.
Characters of WalkU can be translated from Japanese to Tarja in the following ways:
1. しんにょう. This word is allowed in Japanese.
2. Shinniou. This word is almost half as long (8 bytes instead of 15), but is absent from standard Japanese.
3. WalkU; the character U may be replaced with 0, 1, 2, 3 or 4.
4. The hexadecimal number of a WalkU character, that is, X2ECC or X2ECD or X2ECE or X8FB6 or XFA66.
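The four options above can be compared side by side. The following Python sketch is only an illustration; the function `tarja_forms` and its dictionary keys are hypothetical. It also checks the byte counts quoted in option 2, assuming UTF-8 encoding.

```python
# The five WalkU code points in increasing order, so the list index is the Walk number.
WALKU = [0x2ECC, 0x2ECD, 0x2ECE, 0x8FB6, 0xFA66]


def tarja_forms(ch):
    """Return the four suggested Tarja notations for one WalkU character."""
    cp = ord(ch)
    if cp not in WALKU:
        raise ValueError(f"not a WalkU character: X{cp:04X}")
    return {
        "hiragana": "しんにょう",             # option 1
        "romaji": "Shinniou",                 # option 2
        "walk": f"Walk{WALKU.index(cp)}",     # option 3 (or the generic "WalkU")
        "hex": f"X{cp:04X}",                  # option 4
    }


if __name__ == "__main__":
    print(tarja_forms("\u8FB6"))
    # In UTF-8, each Hiragana character takes 3 bytes and each ASCII letter 1 byte.
    print(len("しんにょう".encode("utf-8")), len("Shinniou".encode("utf-8")))  # 15 8
```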
Use in Japanese
In the written Japanese of the 21st century, the characters of WalkU are not popular by themselves.
The images of these characters are used by font designers to draw other, more complicated and more common, kanji.
For some reason, more complicated kanji are often more common than simple ones.
This observation is counter-intuitive.
From common sense (ease of writing and reading, saving space by using a smaller font),
one could expect simple kanji to be more popular.
Revealing and analyzing these reasons is a good topic for scientific research.
References
1. https://util.unicode.org/UnicodeJsps/character.jsp?a=2ECC ⻌ 2ECC CJK RADICAL SIMPLIFIED WALK. Han Script; id: allowed; confuse: 辶, 辶, ⻍.
2. https://util.unicode.org/UnicodeJsps/character.jsp?a=2ECD ⻍ 2ECD CJK RADICAL WALK ONE. Han Script; id: allowed; confuse: 辶, 辶, ⻌.
3. https://util.unicode.org/UnicodeJsps/character.jsp?a=2ECE ⻎ 2ECE CJK RADICAL WALK TWO. Han Script; id: allowed; confuse: none.
4. https://util.unicode.org/UnicodeJsps/character.jsp?a=8FB6 辶 8FB6 CJK UNIFIED IDEOGRAPH-8FB6. Han Script; id: restricted; confuse: 辶, ⻌, ⻍.
5. https://util.unicode.org/UnicodeJsps/character.jsp?a=FA66 辶 FA66 CJK COMPATIBILITY IDEOGRAPH-FA66. Han Script; id: allowed; confuse: 辶, ⻌, ⻍.
6. In a similar way, the term Trillion causes confusion. For example, an offee takes a trillion loan from the government while the trillion is defined as 10^18, and then repays the debt assuming that the trillion is only 10^12. The results of such business are described in the article денег нет (in Russian): the offees have collected tons of cash and/or gold in the treasuries of their castles, but there is no money in the budget for social programs.