Difference between revisions of "WalkU"
Revision as of 01:05, 5 September 2021
WalkJ is the set of the following five Unicode characters:
X2ECC ⻌ [1], KanjiRadical, WalkZero (Walk0)
X2ECD ⻍ [2], KanjiRadical, WalkOne (Walk1)
X2ECE ⻎ [3], KanjiRadical, WalkTwo (Walk2)
X8FB6 辶 [4], KanjiLiberal, WalkThree (Walk3)
XFA66 辶 [5], KanjiConfudal, WalkFour (Walk4)
The system of names (Walk0, Walk1, Walk2, Walk3, Walk4) is based on the descriptions [2][3] of two of these characters by the Unicode Utilities. The similar Kanji are numbered in order of increasing Unicode number.
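The numbering can be checked with a short script. The following is a minimal Python sketch, not part of the original page: the names WALKJ_CODE_POINTS and walk_names are hypothetical, and the official character names are taken from Python's standard unicodedata module.

```python
import unicodedata

# Code points of the WalkJ set in increasing order; the position in this
# tuple is the parameter J of the name WalkJ (Walk0 ... Walk4).
# The tuple name and the function name are illustrative assumptions.
WALKJ_CODE_POINTS = (0x2ECC, 0x2ECD, 0x2ECE, 0x8FB6, 0xFA66)


def walk_names():
    """Return (WalkJ name, X-number, official Unicode name) triples."""
    rows = []
    for j, cp in enumerate(WALKJ_CODE_POINTS):
        rows.append((f"Walk{j}", f"X{cp:04X}", unicodedata.name(chr(cp))))
    return rows


if __name__ == "__main__":
    for row in walk_names():
        print(*row)
```

The official names of X2ECD and X2ECE contain the words ONE and TWO, which is where the Walk1 and Walk2 labels come from; the other three labels follow from the ordering alone.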
Confusion
In the 21st century, computer support for Japanese characters remains underdeveloped.
As of 2021, there is no unified standard for a default font that would make each character look the same on different computers, yet distinct from other characters.
In practice, the opposite happens: a character looks different in the default fonts of different operating systems, yet looks similar to other characters on the same computer. This causes confusion.
In particular, such confusion arises with the characters of the set WalkJ. Sometimes they are treated as equivalent, and sometimes not. [6]
Characters X2ECC ⻌ , X2ECD ⻍ , X2ECE ⻎ , X8FB6 辶 , XFA66 辶 look similar.
Even a native Japanese speaker, looking at the characters
⻌,
⻍,
⻎,
辶,
辶,
is unlikely to be able to answer:
Which of them is X2ECC?
Which of them is X2ECD?
Which of them is X2ECE?
Which of them is X8FB6?
Which of them is XFA66?
In such a case, the correct specification of the character is
"X2ECC ⻌ or X2ECD ⻍ or X2ECE ⻎ or X8FB6 辶 or XFA66 辶"
The term WalkJ is defined to replace this long and complicated construction with a single word.
The characters of WalkJ and their encodings can be identified with the PHP programs du1.t and ud.t. An example call:
php ud.t 2ECC 2ECD 2ECE 8FB6 FA66
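The source of ud.t is not shown here, so its exact output is unknown. As a rough illustration of the kind of lookup such a tool performs, the following Python sketch (an assumption, not a port of ud.t) takes hexadecimal code points and prints each character with its UTF-8 bytes. The function name describe and the output format are hypothetical.

```python
import sys


def describe(hex_code):
    """Return the X-number, the character and its UTF-8 bytes as one line.

    Hypothetical helper; the real ud.t may print something different.
    """
    cp = int(hex_code, 16)
    ch = chr(cp)
    utf8 = " ".join(f"{b:02X}" for b in ch.encode("utf-8"))
    return f"X{cp:04X} {ch} UTF-8: {utf8}"


if __name__ == "__main__":
    # Example: python describe.py 2ECC 2ECD 2ECE 8FB6 FA66
    for arg in sys.argv[1:]:
        print(describe(arg))
```

Because the five glyphs look alike on screen, the byte sequences are the reliable way to tell them apart.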
In most cases, the difference between these characters is not important, since their phonetic and semantic properties are similar. However, mistakes appear when one needs to search for the character, to replace it throughout a text, or to perform more complicated operations on the document.
To a first semantic and phonetic approximation, all characters of WalkJ are considered equivalent. They can be pronounced as しんにょう, Sjinniou, or Sjinnyou, meaning to go, to walk, to advance (for example, after being stuck in a traffic jam). After all characters of WalkJ are replaced with one of these phonetic representations, the mistakes mentioned above do not occur.
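The replacement described above can be sketched in Python as follows. This is an illustrative assumption rather than the author's tool: the names WALKJ and replace_walkj are hypothetical, and the characters are written as escapes because they are hard to tell apart in most fonts.

```python
# The five WalkJ characters, written as escapes to avoid ambiguity.
WALKJ = "\u2ecc\u2ecd\u2ece\u8fb6\ufa66"


def replace_walkj(text, replacement="しんにょう"):
    """Replace every WalkJ character in text with one phonetic form."""
    # str.translate accepts a dict that maps code points to strings.
    table = {ord(ch): replacement for ch in WALKJ}
    return text.translate(table)


if __name__ == "__main__":
    sample = "a\u2eccb\ufa66c"
    print(replace_walkj(sample, "Shinniou"))
```

After this step, a plain text search for the chosen phonetic form finds every occurrence, regardless of which of the five code points the original text used.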
Tarja
To avoid the confusion described above, a special system of notation, Tarja, is suggested.
Tarja is a kind of Japanese-English pidgin. It is similar to Japanese, but the ambiguous, confusing characters are excluded and substituted with Romaji and/or Hiragana words, based on graphic (JapanPi, JapanPsi), phonetic (Tori, Sakana, Chikara) or semantic (e.g., Vecher) similarity.
Tarja is oriented toward the reading and analysis of Japanese texts.
First, the characters are translated from Japanese to Tarja.
This convention reduces the number of options to consider
when looking for a meaningful combination of interpretations of each Katakana or Kanji:
some Katakana look like Kanji, but in Tarja they appear as ASCII or Hiragana.
Then, the text in Tarja is analyzed and/or (if necessary) translated to other language(s).
Tarja is expected to be useful in the analysis of ambiguous Japanese texts: the goal is to reveal the ambiguities, to describe them, and to avoid them in meaningful texts.
Characters of WalkJ can be translated from Japanese to Tarja in the following ways:
1. しんにょう. This word is allowed in Japanese.
2. Shinniou. This word is almost half as long (8 bytes instead of 15 in UTF-8), but it is absent from standard Japanese.
3. WalkJ. The last letter of this word can be interpreted as a parameter J that takes integer values from 0 to 4.
4. The hexadecimal number of a WalkJ character, that is, X2ECC or X2ECD or X2ECE or X8FB6 or XFA66.
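The four options above can be generated mechanically. The following Python sketch is an illustration under stated assumptions: the function names tarja_forms and utf8_length are hypothetical, and the byte counts assume UTF-8 encoding.

```python
# Code points of the WalkJ set; the index in this tuple is the parameter J.
WALKJ_CODE_POINTS = (0x2ECC, 0x2ECD, 0x2ECE, 0x8FB6, 0xFA66)


def tarja_forms(ch):
    """Return the four Tarja notations for one WalkJ character.

    Raises ValueError if ch does not belong to the WalkJ set.
    """
    j = WALKJ_CODE_POINTS.index(ord(ch))
    return ["しんにょう", "Shinniou", f"Walk{j}", f"X{ord(ch):04X}"]


def utf8_length(s):
    """Number of bytes the string occupies in UTF-8."""
    return len(s.encode("utf-8"))


if __name__ == "__main__":
    for form in tarja_forms("\u8fb6"):
        print(form, utf8_length(form))
```

Each Hiragana letter takes 3 bytes in UTF-8, so the five-letter word takes 15 bytes, while the eight ASCII letters of Shinniou take 8 bytes.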
Use in Japanese
In written Japanese of the 21st century, the characters of WalkJ, by themselves, are not common.
Font designers use the images of these characters to build the pictures of other, more complicated and more common, Kanji.
For some reason, more complicated Kanji are often more common than simple ones.
This observation is counter-intuitive.
From common sense (ease of writing and reading, saving space with a smaller font),
one could expect simple Kanji to be more popular.
Revealing and analyzing these reasons is a good topic for scientific research.
References
1. https://util.unicode.org/UnicodeJsps/character.jsp?a=2ECC ⻌ 2ECC CJK RADICAL SIMPLIFIED WALK Han Script id: allowed confuse: 辶 , 辶 , ⻍ ..
2. https://util.unicode.org/UnicodeJsps/character.jsp?a=2ECD ⻍ 2ECD CJK RADICAL WALK ONE Han Script id: allowed confuse: 辶 , 辶 , ⻌ ..
3. https://util.unicode.org/UnicodeJsps/character.jsp?a=2ECE ⻎ 2ECE CJK RADICAL WALK TWO Han Script id: allowed confuse: none ..
4. https://util.unicode.org/UnicodeJsps/character.jsp?a=8FB6 辶 8FB6 CJK UNIFIED IDEOGRAPH-8FB6 Han Script id: restricted confuse: 辶 , ⻌ , ⻍ ..
5. https://util.unicode.org/UnicodeJsps/character.jsp?a=FA66 辶 FA66 CJK COMPATIBILITY IDEOGRAPH-FA66 Han Script id: allowed confuse: 辶 , ⻌ , ⻍ ..
6. In a similar way, the term Trillion causes confusion. For example, an offee takes a trillion loan from the government while the trillion is defined as 10^18, and then repays the debt assuming that a trillion is only 10^12. The result of such business is described in the article денег нет ("there is no money", in Russian): the offees have collected tons of cash and/or gold in the treasuries of their castles, but there is no money in the Budget for social programs.