Onna
Onna is a set of three visually identical Unicode characters representing the kanji 女 (“woman”) that originate from different parts of the Unicode standard.
The term Onna (おんな), or "onna", may refer to any one of the following three Unicode characters:
12069 (X2F25, ⼥)[2], KanjiRadical
22899 (X5973, 女)[3], KanjiLiberal
63873 (XF981, 女)[4], KanjiConfudal
Onna may also refer to the set of these three characters.
Often, these characters are pronounced "onna" and refer to a woman, a Human female [1]; see the picture below.
Characters of the set Onna cause confusion.
Confusion
As of 2026, no unique Unicode number is assigned to the glyph ⼥. Unicode encodes abstract characters rather than graphical glyphs, so different characters may share an identical visual representation in many fonts. The result is confusion: different characters look the same.
Both teachers of Japanese and Japanese textbooks ignore this confusion.
A newcomer who encounters the problem first attributes it to his or her own mistake. Then the novice notices that the error appears again and again and finally recognizes that it is not his or her mistake but a bug in the software.
The problem is not limited to learning Japanese.
Even specialists, even native Japanese speakers looking at characters
⼥
女,
女 are unlikely to guess:
Which of them is character X2F25 [2]?
Which of them is character X5973 [3]?
Which of them is character XF981 [4]?
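The three questions above cannot be answered by eye, but they can be answered by asking for the number and the standard name of each character. Below is a minimal Python sketch (not the author's PHP program); the function name describe is hypothetical, and the characters are written as escapes so that no editor can silently swap one for another.

```python
import unicodedata

def describe(text):
    """Return one 'Xhhhh NAME' line per character of text."""
    lines = []
    for ch in text:
        # unicodedata.name gives the official Unicode character name.
        name = unicodedata.name(ch, "<unnamed>")
        lines.append(f"X{ord(ch):04X} {name}")
    return lines

# The three Onna characters, given by number rather than by glyph.
onna = "\u2f25\u5973\uf981"
for line in describe(onna):
    print(line)
```

Pasting a copied character into describe in place of the escapes shows which of the three it actually is.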
The term Onna is a synonym for the construction «⼥ or 女 or 女», for cases when only the appearance of the kanji is available and the character is difficult to identify.
Not only humans but also some software confuses characters with similar or identical glyphs.
The default MediaWiki software replaces character XF981 with character X5973 without any warning.
This causes problems in automatic data processing: in some cases the two objects are treated as the same, and in others they are not. (A similar confusion arises from careless use of the term «equality» applied to triangles in geometry.)
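The text does not say why the replacement happens. One likely explanation, stated here as an assumption, is Unicode normalization: in the Unicode character data, XF981 has a canonical decomposition to X5973, so any software that normalizes text to the form NFC performs exactly this substitution. The Python sketch below (not the author's code; the function name fold is hypothetical) shows the effect of two standard normalization forms on each of the three characters.

```python
import unicodedata

RADICAL, UNIFIED, COMPAT = "\u2f25", "\u5973", "\uf981"

def fold(ch, form):
    """Normalize ch with the given form; return the result as 'Xhhhh' numbers."""
    result = unicodedata.normalize(form, ch)
    return " ".join(f"X{ord(c):04X}" for c in result)

for form in ("NFC", "NFKC"):
    for ch in (RADICAL, UNIFIED, COMPAT):
        print(f"{form}: X{ord(ch):04X} -> {fold(ch, form)}")
```

Under NFC only XF981 changes (to X5973); under NFKC the radical X2F25 is folded to X5973 as well, so after NFKC all three characters become indistinguishable even for software.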
In a text that may be cited or copied, for example pasted into the address bar of a browser or into a search engine, the characters corresponding to the sound "onna" should be specified as X2F25, X5973, XF981 rather than ⼥, 女, 女: the software confuses the last two characters.
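The recommendation above can be automated. The following Python sketch is an illustration under stated assumptions, not an existing tool: the names ONNA and spell_out are hypothetical. It replaces each of the three characters in a text with its explicit number and leaves everything else untouched.

```python
# The three Onna characters, given by number to avoid any glyph confusion.
ONNA = {"\u2f25", "\u5973", "\uf981"}

def spell_out(text):
    """Replace each Onna character in text with its explicit 'Xhhhh' number."""
    return "".join(f"X{ord(ch):04X}" if ch in ONNA else ch for ch in text)

# The input is written with escapes; the output contains only ASCII numbers.
print(spell_out("\u2f25, \u5973, \uf981"))
```

The output survives copy-pasting and normalization, because it no longer contains any of the ambiguous characters.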
This confusion has been recognized and described since at least 2021 [2][3][4].
Confusions related to apparent bugs in the graphical representation of Unicode characters are described in the articles «Chikara», «Miru», «Onna», «Sakana», «StickPi», «TsukiGatsu».
Unicode
At least three Unicode characters qualify as «Onna».
These characters are
⼥ X2F25,
女 X5973,
女 XF981.
The UTF-8 encoding can be revealed with the PHP program onna.t, copied below.
The file uni.t must also be loaded; the program includes it, apparently for the helper functions unichr and uniord, which are not PHP built-ins.
<?php
// uni.t is expected to supply the helpers unichr() and uniord().
include "uni.t";
// Concatenate the three Onna characters into one UTF-8 string.
$a=unichr(0x2f25);
$a.=unichr(0x5973);
$a.=unichr(0xF981);
echo "$a\n";
// Dump every byte of the whole string in hexadecimal.
$N=strlen($a);
echo "The array has $N bytes; here is its splitting:\n";
for($n=0;$n<$N;$n++){ printf("%02x ",ord($a[$n]) ); }
echo "\n";
// Split the string into characters (mb_str_split needs PHP 7.4 or later).
$b = mb_str_split($a);
var_dump($b);
// For each character, print its number and its bytes in hex and in decimal.
$M=count($b);
for($m=0;$m<$M;$m++) {
    printf("\n");
    $c=$b[$m];
    $u=uniord($c);
    printf("Unicode character number %05d id est, x%04X\n",$u,$u);
    $d=strlen($c);
    echo "Picture: $c uses $d bytes. These bytes are:\n";
    for($n=0;$n<$d;$n++) printf("x%2X ",ord($c[$n]));
    printf("in the hexadecimal representation and\n");
    for($n=0;$n<$d;$n++) printf("%3d ",ord($c[$n]));
    printf("in the decimal representation\n");
}
?>
With the file uni.t in place, the command
php onna.t
produces the output below:
⼥女女
The array has 9 bytes; here is its splitting:
e2 bc a5 e5 a5 b3 ef a6 81
array(3) {
[0]=>
string(3) "⼥"
[1]=>
string(3) "女"
[2]=>
string(3) "女"
}
Unicode character number 12069 id est, x2F25
Picture: ⼥ uses 3 bytes. These bytes are:
xE2 xBC xA5 in the hexadecimal representation and
226 188 165 in the decimal representation
Unicode character number 22899 id est, x5973
Picture: 女 uses 3 bytes. These bytes are:
xE5 xA5 xB3 in the hexadecimal representation and
229 165 179 in the decimal representation
Unicode character number 63873 id est, xF981
Picture: 女 uses 3 bytes. These bytes are:
xEF xA6 x81 in the hexadecimal representation and
239 166 129 in the decimal representation
A similar analysis can be performed with the more universal dumping routine du.t; here is an example of its use:
php du.t "⼥女女"
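For readers without PHP or the files uni.t and du.t, the same byte analysis can be sketched with the Python standard library alone. This is an independent illustration, not a port of the author's routines; the function name utf8_dump is hypothetical.

```python
def utf8_dump(text):
    """Return (code point, hex bytes, decimal bytes) for each character."""
    rows = []
    for ch in text:
        data = ch.encode("utf-8")          # the UTF-8 bytes of one character
        rows.append((ord(ch), data.hex(" "), list(data)))
    return rows

# The three Onna characters, given by number rather than by glyph.
for code, hex_bytes, dec_bytes in utf8_dump("\u2f25\u5973\uf981"):
    print(f"X{code:04X} ({code:05d}): {hex_bytes} = {dec_bytes}")
```

Each of the three characters takes 3 bytes in UTF-8, and the byte values agree with the PHP output shown above.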
In this way, historical encoding decisions in Unicode and legacy character sets, together with the ignorance of teachers and authors of manuals, force pupils to learn Unicode, UTF-8 and programming in order to distinguish the characters used in the Japanese language.
Uniglif and Tarja
In the sci-fi utopia «Tartaria», the advanced font Uniglif is mentioned. Each glyph in that font is assigned a unique Unicode number, providing a bijective relation between the set of glyphs and the set of characters, at least for characters with numbers not exceeding XFFFF (ascii, TwoByteCharacters, ThreeByteCharacters). With the font Uniglif, no confusion of the kind described above arises.
In real life, while no analogue of Uniglif is available, the technical language Tarja can be used to avoid confusion. It is a Japanese-based technical slang that avoids ambiguous glyphs, that is, characters not yet supplied with exclusive default glyphs.
When translating from Japanese to Tarja, the characters ⼥, 女 and 女 can be replaced with their explicit numbers X2F25, X5973 and XF981.
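A text written with explicit numbers can be turned back into characters when the exact character is needed. The Python sketch below is an illustration only: the function name restore is hypothetical, and limiting the replacement to the three Onna numbers is an assumption made so that other words beginning with X stay untouched.

```python
import re

# Match only the three explicit Onna numbers, not every 'X' plus hex digits.
ONNA_NUMBER = re.compile(r"X(2F25|5973|F981)")

def restore(text):
    """Replace each explicit Onna number in text with its character."""
    return ONNA_NUMBER.sub(lambda m: chr(int(m.group(1), 16)), text)

restored = restore("X2F25 X5973 XF981")
# Print the code points, since the three glyphs look identical on screen.
print([f"X{ord(ch):04X}" for ch in restored if ch != " "])
```

Printing code points instead of the characters themselves avoids reintroducing the confusion in the program output.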
Alternatively, the transliteration «おんな» or «Onna» of «onna» can be used.
In addition, words borrowed from other languages («Female», «Mujer», ..) can be used in Tarja when this causes no confusion.
Examples
The dictionary Jisho suggests examples with the sound onna [6]:
おんなおや 女親 mother; female parent
おんなざか 女坂 the easier of two slopes
おんなきょうだい 女兄弟 (other form: 女姉妹) sisters; female siblings
おんなかぶき 女歌舞伎 girls' kabuki
Censorship and Vestism
Objects and subjects denoted with the term onna often become targets of aggression and/or censorship.
Vestists insist that the body should be hidden and punish those who do not obey [7].
Such a practice is also described in the sci-fi novel «Meganesia.Deportation».
The 5 pictures at right are designed to measure the hatred/tolerance of a religion.
Counting how many of the dressing styles shown are allowed by a religion gives a rating of its tolerance with respect to onna on a 5-grade scale.
Reuse the glyph
女 appears as a component in various glyphs. A few examples are given in this section.
36B2 㚲 [8] セン, ショウ, テン, yan, ten, small, weak
597B 奻 [9] ダン, ナン, dan, nan, quarrel, dispute
597D 好 [10] コウ, このむ, すく, よい, よし, yoi, good
5999 妙 [11] ミョウ, ビョウ, たえ, myou, mysterious, strange
59B9 妹 [12] マイ, バイ, メ, いもうと, いも, imouto, younger sister
59C9 姉 [13] シ, あね, ねえさん, neesan, elder sister
59E6 姦 [14] カン, ケン, かしましい, みだら, midara, adultery, debauchery; the glyph shows 3 women at once.
5B89 安 [15] アン, やすい, いずくに, いずくにか, いずくんぞ, やすんじる, yasui, cheap; peaceful.
Historic context
Unicode and many default fonts were designed in the 20th century, while computing was still underdeveloped. Printing techniques, in contrast, had already existed for centuries. This predetermined the designers' attitude to encoding and to fonts. The goal was to reproduce the required glyph on the screen or in print; it was assumed that nobody cares how it is encoded.
In the 21st century, the roles of the glyph and the character swap. The character becomes the principal part of textual information; glyphs are still needed for Human perception of characters.
Then the lack of a unique encoding for a glyph becomes a problem; one needs some programming (see the example above) to patch the defects of the historic combination of Unicode with existing fonts.
ChatGPT indicates that there was no ill will on the part of the designers of Unicode, nor on the part of teachers of Japanese and authors of manuals. They assumed that their students would never begin to write (or analyze) characters in Japanese and would never meet errors related to the confusing graphical representation of the characters.
Warning
Publications about characters of the Onna set are collected and analyzed in TORI with scientific goals.
The analysis and the interpretation above should not be interpreted as an appeal for the extrajudicial execution of the font/Unicode designers who did not supply some popular glyphs with unique Unicode numbers.
A more civilized solution would be to convince them to develop a realistic default analogue of the fantastic Uniglif, a font with a bijective relation between glyphs and characters.
The description above may require correction(s) by a native Japanese speaker.
References
- [1] https://jisho.org/search/%23kanji%20%E5%A5%B3 woman, female. Kun: おんな、 め. On: ジョ、 ニョ、 ニョウ. Jōyō kanji, taught in grade 1. JLPT level N5. 151 of 2500 most used kanji in newspapers. On reading compounds: 女 【ジョ】 woman, girl, daughter, Chinese "Girl" constellation (one of the 28 mansions); 女王 【ジョオウ】 queen, female champion; 処女 【ショジョ】 virgin, maiden; 一女 【イチジョ】 one daughter, eldest daughter, first-born daughter; 女房 【ニョウボウ】 wife (esp. one's own wife), court lady, female court attache, woman who served at the imperial palace, woman (esp. as a love interest); 老若男女 【ロウニャクナンニョ】 men and women of all ages; 天女 【テンニョ】 heavenly nymph, celestial maiden, beautiful and kind woman; 女官 【ジョカン】 court lady, lady-in-waiting. Kun reading compounds: 女 【おんな】 female, woman, female sex, female lover, girlfriend, mistress, (someone's) woman; 女形 【おんながた】 onnagata, male actor in female kabuki roles, female partner (in a relationship); 醜女 【しゅうじょ】 homely woman, plain-looking woman, female demon; 囲い女 【かこいおんな】 mistress; 雌 【め】 female, smaller (of the two), weaker, woman, wife; 女神 【めがみ】 goddess, female deity; 早乙女 【さおとめ】 young female rice planter, young girl
- [2] https://util.unicode.org/UnicodeJsps/character.jsp?a=2F25 ⼥ 2F25 KANGXI RADICAL WOMAN Han Script id: allowed confuse: 女 , 女
- [3] https://util.unicode.org/UnicodeJsps/character.jsp?a=5973 女 5973 CJK UNIFIED IDEOGRAPH-5973 Han Script id: restricted confuse: 女 , ⼥
- [4] https://util.unicode.org/UnicodeJsps/character.jsp?a=F981 女 F981 CJK COMPATIBILITY IDEOGRAPH-F981 Han Script id: allowed confuse: 女 , ⼥
- [5] https://commons.wikimedia.org/wiki/File:Human_female.jpg English: Naked female human body. Русский: Обнаженная женщина. English: Model name: (preferred not to be stated) At time of photograph: Age: 40 Height: 166 cm Weight: 47 kg BMI: 17.1 Ornaments: Ear piercing, ring on left ring finger (not in retouched images), nail polish on toe nails. There is some tilting of the upper trunk towards the left of the body, which may be positional or anatomical. Date 29 September 2011 Source Own work Author Taken at City Studios in Stockholm (www.stockholmsfotografen.se), September 29, 2011, with assistance from KYO (The organisation of life models) in Stockholm. ..
- [6] https://jisho.org/search/%E5%A5%B3%20%E3%81%8A%E3%82%93%E3%81%AA%20%23words?page=2 女 おんな #words .. Words: 107 found. おんなおや 女親 Noun 1. mother; female parent. おんなざか 女坂 Noun 1. the easier of two slopes. おんなきょうだい 女兄弟 Noun 1. sisters; female siblings. Other forms 女姉妹 【おんなきょうだい】. おんなかぶき 女歌舞伎 Noun 1. girls' kabuki ..
- [7] https://edition.cnn.com/2023/09/21/middleeast/iran-hijab-law-parliament-jail-intl-hnk Iranian women face 10 years in jail for inappropriate dress after ‘hijab bill’ approved By Tara Subramaniam, Adam Pourahmadi and Mostafa Salem, CNN. Published 12:34 PM EDT, Thu September 21, 2023
- [8] https://util.unicode.org/UnicodeJsps/character.jsp?a=36B2 㚲 36B2 CJK UNIFIED IDEOGRAPH-36B2 Han Script confuse: none .. (kDefinition) small and weak, used in girl's name, a woman's feature; lady's face .. (kJapanese) セン|ショウ|テン ..
- [9] 奻 597B CJK UNIFIED IDEOGRAPH-597B Han Script confuse: none .. (kJapanese) ダン|ナン ..
- [10] 好 597D CJK UNIFIED IDEOGRAPH-597D Han Script confuse: none .. (kDefinition) good, excellent, fine; well .. (kJapanese) コウ|このむ|すく|よい|よし ..
- [11] https://util.unicode.org/UnicodeJsps/character.jsp?a=5999 妙 5999 CJK UNIFIED IDEOGRAPH-5999 Han Script confuse: none .. (kDefinition) mysterious, subtle; exquisite (kJapanese) ミョウ|ビョウ|たえ
- [12] https://util.unicode.org/UnicodeJsps/character.jsp?a=59B9 妹 59B9 CJK UNIFIED IDEOGRAPH-59B9 Han Script confuse: none .. (kDefinition) younger sister .. (kJapanese) マイ|バイ|メ|いもうと|いも ..
- [13] https://util.unicode.org/UnicodeJsps/character.jsp?a=59C9 姉 59C9 CJK UNIFIED IDEOGRAPH-59C9 Han Script confuse: none .. (kDefinition) elder sister .. (kJapanese) シ|あね|ねえさん ..
- [14] https://util.unicode.org/UnicodeJsps/character.jsp?a=59E6 姦 59E6 CJK UNIFIED IDEOGRAPH-59E6 Han Script confuse: none .. (kDefinition) adultery, debauchery; debauch .. (kJapanese) カン|ケン|かしましい|みだら
- [15] https://util.unicode.org/UnicodeJsps/character.jsp?a=5B89 安 5B89 CJK UNIFIED IDEOGRAPH-5B89 Han Script confuse: none .. (kDefinition) peaceful, tranquil, quiet .. (kJapanese) アン|やすい|いずくに|いずくにか|いずくんぞ|やすんじる ..