Difference between revisions of "Japanese"
Latest revision as of 13:15, 22 February 2026
Japanese is the main and official language of Japan.
Japanese has 4 writing systems:
Hiragana (characters X3041 - X3096, phonetic alphabet used to indicate pronunciation of native Japanese words)
Katakana (characters X30A1 - X30F6, used for words borrowed from other languages)
Kanji (logographic characters of Chinese origin, found between X2F00 and XFA6D; most of them are in the range X4E00 - X9FFF)
Romaji (characters from X0020 (space) to X007E (tilde); practically, the printable Ascii characters).
Some Japanese characters are collected in the article SomeU; most of them are encoded with 3 bytes in UTF-8.
There is no one-to-one mapping between words written in Kanji and their equivalents in Hiragana.
In this sense, there are two Japanese languages, ideographic and phonetic. Translation from one to the other causes problems for foreigners and may confuse even native Japanese speakers.
Multitud
Actually, the term Japanese may refer to any of 3 languages: one oral (spoken) and two written (printable).
1. Romaji, which can be written with Latin characters (an equivalent of Ascii)
2. Hiragana, which represents sounds with a special phonetic alphabet
3. Kanji.
4. In addition, there is a special phonetic alphabet, Katakana, for representing foreign words with the sounds that are allowed in phonetic Japanese.
Hiragana and Katakana
Here is the phonetic table of Hiragana and Katakana characters:
| w | r | y | m | h | んン | t | s | k | |
| わワ | らラ | やヤ | まマ | はハ | なナ | たタ | さサ | かカ | あア |
| - | りリ | - | みミ | ひヒ | にニ | ちチ | しシ | きキ | いイ |
| - | るル | ゆユ | むム | ふフ | ぬヌ | つツ | すス | くク | うウ |
| - | れレ | - | めメ | へヘ | ねネ | てテ | せセ | けケ | えエ |
| をヲ | ろロ | よヨ | もモ | ほホ | のノ | とト | そソ | こコ | おオ |
Latex
By default, Japanese characters are not supported in Latex, even on computers made in Japan.
Special effort may be required in order to type in Japanese; there seems to be no standard default way to typeset Japanese characters in Latex.
Two options are mentioned below.
CJK
Many Japanese characters can be printed with the CJK package. An example (with Chinese text) is below:
https://tex.stackexchange.com/questions/223237/packages-cjk-versus-cjkutf8
<pre>
% !TEX encoding = UTF-8
% !TEX program = pdflatex
\documentclass{article}
\usepackage{CJKutf8}
\usepackage[utf8]{inputenc} % optional
\usepackage[T1]{fontenc}

\begin{document}
% We always use CJK package globally to prevent some bugs.
\begin{CJK}{UTF8}{gbsn}
Without \texttt{CJKutf8} package, the result will be wrong.

Café: 咖啡厅

Gödel: 哥德尔

© 版权所有

\clearpage\end{CJK}
\end{document}
</pre>
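With the font family gbsn used in this example, some Kanji are not printed. Here are 4 examples:
結 過 長 論
With the CJK package and this font, these characters are silently ignored. The most likely reason is that gbsn is a Simplified Chinese font that has no glyphs for them, so a Japanese font family has to be selected to print them.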
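The following is a minimal sketch of the same approach with a Japanese font family, not a tested recipe for every installation. It assumes that the family `min` (the Japanese Mincho fonts usually distributed with the Japanese language collection of a TeX distribution) is installed; the sample word 今日は is only for illustration.

```latex
% !TEX encoding = UTF-8
% !TEX program = pdflatex
\documentclass{article}
\usepackage{CJKutf8}

\begin{document}
% Assumption: the Japanese font family "min" is installed.
\begin{CJK}{UTF8}{min}
% The four characters that are dropped with the gbsn font:
結 過 長 論

% A short Japanese word for comparison:
今日は
\end{CJK}
\end{document}
```

If the characters are still missing from the output, the compilation log normally names the font files that could not be found.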
XeLaTeX
Most current TeX distributions (not only on Macintosh) include the XeLaTeX engine.
XeLaTeX does not seem to be compatible with the CJK package, but it reads Unicode input directly and can typeset any character for which the selected font has a glyph.
The default example of a XeLaTeX document (its comments refer to TeXShop and Mac OS X) is copied below:
<pre>
% XeLaTeX can use any Mac OS X font. See the setromanfont command below.
% Input to XeLaTeX is full Unicode, so Unicode characters can be typed directly into the source.

% The next lines tell TeXShop to typeset with xelatex, and to open and save the source with Unicode encoding.

%!TEX TS-program = xelatex
%!TEX encoding = UTF-8 Unicode

\documentclass[12pt]{article}
\usepackage{geometry} % See geometry.pdf to learn the layout options. There are lots.
\geometry{letterpaper} % ... or a4paper or a5paper or ...
%\geometry{landscape} % Activate for for rotated page geometry
%\usepackage[parfill]{parskip} % Activate to begin paragraphs with an empty line rather than an indent
\usepackage{graphicx}
\usepackage{amssymb}

% Will Robertson's fontspec.sty can be used to simplify font choices.
% To experiment, open /Applications/Font Book to examine the fonts provided on Mac OS X,
% and change "Hoefler Text" to any of these choices.

\usepackage{fontspec,xltxtra,xunicode}
\defaultfontfeatures{Mapping=tex-text}
\setromanfont[Mapping=tex-text]{Hoefler Text}
\setsansfont[Scale=MatchLowercase,Mapping=tex-text]{Gill Sans}
\setmonofont[Scale=MatchLowercase]{Andale Mono}

\title{Brief Article}
\author{The Author}
%\date{} % Activate to display a given date or no date

\begin{document}
\maketitle

% For many users, the previous commands will be enough.
% If you want to directly input Unicode, add an Input Menu or Keyboard to the menu bar
% using the International Panel in System Preferences.
% Unicode must be typeset using a font containing the appropriate characters.
% Remove the comment signs below for examples.

% \newfontfamily{\A}{Geeza Pro}
% \newfontfamily{\H}[Scale=0.9]{Lucida Grande}
% \newfontfamily{\J}[Scale=0.85]{Osaka}

% Here are some multilingual Unicode fonts: this is Arabic text: {\A السلام عليكم}, this is Hebrew: {\H שלום},
% and here's some Japanese: {\J 今日は}.
\end{document}
</pre>
Some of the comments (but not all) can be omitted from the document above, and the example still seems to compile well.
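For a document that is mostly Japanese, a shorter XeLaTeX source may be enough. The sketch below uses the xeCJK package, which is not mentioned in the template above; the font name IPAexMincho is an assumption and should be replaced with any installed font that contains Japanese glyphs (for example, Osaka from the template).

```latex
%!TEX TS-program = xelatex
%!TEX encoding = UTF-8 Unicode
\documentclass{article}
\usepackage{xeCJK}             % loads fontspec and handles CJK line breaking
% Assumption: a font with this name is installed; replace it if needed.
\setCJKmainfont{IPAexMincho}

\begin{document}
今日は。

% The four characters discussed in the CJK section:
結 過 長 論
\end{document}
```

Latin text in such a document keeps the default Latin font; only the CJK characters are set in the font chosen with \setCJKmainfont.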
Unicode and confusions
Many Japanese Kanji have no unique glyph. As of the 21st century, in various software, several distinct characters often have the same picture, the same meaning and the same pronunciation.
The ambiguous characters are classified as KanjiRadical, KanjiLiberal (almost the same as CJK characters) or KanjiConfudal.
Some software (mainly on Macintosh) uses the same pictures for KanjiRadical and KanjiLiberal characters, causing confusion.
Some software automatically and silently (without any warning) replaces KanjiConfudal with KanjiLiberal, making the confusion even worse.
The PHP code du.t allows one to identify characters, returning their Unicode numbers and their encoding (assuming the UTF-8 encoding of Unicode). Typically, each Japanese character is encoded with 3 bytes; so a text in Japanese may take somewhat more bytes than its English version (which uses a single byte per character).
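The 3-byte figure follows from the general UTF-8 rules, which can be summarized as below. The worked example encodes 結 (X7D50), a character chosen here only for illustration; it is not taken from du.t.

```latex
\[
\begin{array}{ll}
\text{code point range} & \text{UTF-8 length} \\ \hline
\text{X0000 -- X007F (Ascii)} & 1 \text{ byte} \\
\text{X0080 -- X07FF (includes Cyrillic)} & 2 \text{ bytes} \\
\text{X0800 -- XFFFF (includes Hiragana, Katakana, most Kanji)} & 3 \text{ bytes}
\end{array}
\]
% Worked example: split the 16 bits of X7D50 into groups of 4, 6 and 6 bits,
% then prefix them with 1110, 10 and 10.
\begin{align*}
\mathrm{X7D50} &= 0111\;110101\;010000_2 \\
\text{UTF-8 bytes} &= \underbrace{1110\,0111}_{\mathrm{E7}}\;
                     \underbrace{10\,110101}_{\mathrm{B5}}\;
                     \underbrace{10\,010000}_{\mathrm{90}}
\end{align*}
```

So this character occupies the three bytes E7 B5 90, while any Ascii character occupies one byte.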
Ambiguity and Tarja
Many Japanese Kanji have no unique encoding.
In TORI, the technical language Tarja is under development, with the goal of collecting Japanese characters that have a unique 3-byte encoding.
Characters that have no unique encoding are replaced with Hiragana or Romaji: either a transliteration into Ascii or a translation of the whole word into English. The grammar is kept as close as possible to that of Japanese.
The ambiguity and confusion of the Japanese Kanji have analogies in other languages.
Ascii characters may also be confused in a similar way;
for example, most humans looking at
word (1) PABEHCTBO cannot distinguish it from
word (2) РАВЕНСТВО,
although
word (1) is written in Ascii characters and counts 9 bytes, while
word (2) is written in Russian (Cyrillic) characters and counts 18 bytes in UTF-8.
Warning
The interpretation of Japanese in terms of Tarja is an attempt to simplify the use of Japanese by English-speaking foreigners.
It is not an attempt to substitute Japanese with any surrogate
nor a suggestion to modify the current version of Japanese.
Keywords
«Du.t», «Hiragana», «Japan», «Japanese», «Kanji», «KanjiConfudal», «KanjiLiberal», «KanjiRadical», «Katakana», «Romaji», «Tarja», «Unicode»