
What Your Japanese Teacher Never Taught You about Kanji

Trying to learn Japanese but getting stuck on the characters? Find out how kanji actually work with this breakdown of their components and those components' functions.

By John Renfroe, Dec 14, 2017

In the beginning, there was spoken language. Certain combinations of sounds represented certain meanings and these were called “words.” Communication was without physical form; there was sound and meaning, and it was good. Then one day, someone realized they could make marks on a surface (a rock, an animal shell, some bamboo) that represented those words. A physical form could now correspond to a word (a sound/meaning combination). The earliest form of kanji was born.

So we now have three attributes:

Sound: the pronunciation of a word
Meaning: the definition of a word
Form: the physical representation of a word (aka a sound/meaning combination)

In China, where kanji originated, these forms were pictographic. The mark for “mountain” looked like a mountain. The mark for “sheep” looked like a sheep. And that worked for a lot of words, but since some words are harder to represent with a simple picture, people had to get creative. They did this by using kanji characters as parts in other kanji, with each component being used for one or more of its attributes.

Kanji form components

Sometimes they would juxtapose two or more of these forms to represent another word:

廾（キョウ） (two hands) playing with a piece of 玉（ギョク） (jade) represents “to fiddle with; tamper,” giving us 弄（ロウ） (note that the dot in 玉 was dropped for simplicity).
A 又（ユウ） (right hand) taking an 耳（ジ） (ear) from a victim (a common practice in ancient Chinese warfare) gives us 取（シュ） (to take).

And so on.

These pictographic parts are called form components. Their form, or what they depict, is what’s important, rather than their sound or meaning.

Kanji meaning components

Sometimes a kanji depicts one thing, but means another. For instance, 大（ダイ） (big) depicts a person (that is, its form is “a person”), but it means “big.” So, 小（ショウ） (small) over 大 makes 尖（セン） (sharp, pointed) because pointy things go from big to small (think of a triangle). Since 大 represents “big” here rather than “person” (that is, its meaning is used rather than its form), it’s a meaning component.

Kanji sound components

Sometimes they would choose a kanji for its sound and add a form or meaning component to disambiguate. For example, 悟（ゴ） (enlightenment, understanding) and 語（ゴ） (speech, language) are pronounced the same, so both are written with the sound component 吾（ゴ） (I).

To disambiguate, 語 contains 言（ゲン） (speech) to indicate that the word being written has to do with speech, while 悟 contains 忄 (a component form of 心（シン）, heart/mind) to indicate that the word being written has to do with the mind.

Keep in mind that sound components represent sound relationships in old Chinese. Pronunciation changed over time, and then the kanji were borrowed into Japanese and the pronunciations changed even more. So if you see a sound component that seems weird, that’s probably what’s going on there.

So as we’ve seen, there are three main kinds of components:

Form components
Meaning components
Sound components

Those sound familiar, right? The three main categories of kanji components correspond to the three attributes of writing: form, meaning, and sound. But there’s another category called empty components.

Empty components

Empty components are those that don’t express sound or meaning. There are two basic types of empty components: distinguishing marks and corruptions.

Distinguishing marks were used to disambiguate words which were pronounced similarly. A good example is 百（ヒャク/ハク） (100). It’s simply 白（シロ/ハク） (white) with a horizontal line used as a distinguishing mark. That line is unrelated to 一（イチ） (one); it’s just a mark. Notice that the two kanji are also pronounced similarly. Adding a distinguishing mark is like saying “this isn’t 白, but that other word that’s pronounced similarly.”

Corrupted components are components that changed over time into what they are today, losing their ability to represent sound or meaning. A good example is the four dots at the bottom of 黒（クロ/コク） (black). The character 黒 originally depicted a person with a tattooed face, with extra dots emphasizing the tattoo. The person’s legs and the dots separated from the rest of the kanji, and in the modern script they look like 灬, the component form of 火（ヒ/カ） (fire). Since they have nothing to do with the sound or meaning of 黒, they’re called “empty components.”
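
For readers who like to see things as data, here is a minimal Python sketch of the four component categories, using only the examples from this article. The names `Role`, `Component`, `KANJI`, `components_with_role` and `kanji_sharing_sound_component` are made up for illustration and are not taken from any real dictionary or library. The breakdowns are deliberately partial: a component is listed only where the article states its category.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    """The four component categories described in the article."""
    FORM = "form"
    MEANING = "meaning"
    SOUND = "sound"
    EMPTY = "empty"


@dataclass(frozen=True)
class Component:
    glyph: str      # the component as written
    role: Role      # which attribute it contributes (or EMPTY for none)
    note: str = ""  # short gloss taken from the article


# Hypothetical, partial data: only components whose role the article states.
KANJI = {
    "弄": [Component("廾", Role.FORM, "two hands"),
           Component("玉", Role.FORM, "jade")],
    "取": [Component("又", Role.FORM, "right hand"),
           Component("耳", Role.FORM, "ear")],
    "尖": [Component("大", Role.MEANING, "big")],
    "悟": [Component("吾", Role.SOUND, "ゴ")],
    "語": [Component("吾", Role.SOUND, "ゴ")],
    # The line in 百 only looks like 一 (one); it is a distinguishing mark.
    "百": [Component("一", Role.EMPTY, "distinguishing mark")],
    "黒": [Component("灬", Role.EMPTY, "corrupted legs and dots")],
}


def components_with_role(kanji: str, role: Role) -> list[str]:
    """Return the listed components of `kanji` that play the given role."""
    return [c.glyph for c in KANJI.get(kanji, []) if c.role is role]


def kanji_sharing_sound_component(glyph: str) -> list[str]:
    """Return the kanji in KANJI that use `glyph` as a sound component."""
    return [
        kanji
        for kanji, parts in KANJI.items()
        if any(c.glyph == glyph and c.role is Role.SOUND for c in parts)
    ]
```

For instance, `components_with_role("取", Role.FORM)` returns the hand and the ear, and `kanji_sharing_sound_component("吾")` returns 悟 and 語. Note that the same glyph can play different roles in different kanji, which is why the role is stored on each use of a component rather than on the glyph itself.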

So there you have it: three attributes (form, meaning and sound) and four component categories (form, meaning, sound and empty). So why is it that your textbooks and teachers don’t talk about these component categories, but about “radicals” instead? Well, we’ll talk about that next week!

John’s company Outlier Linguistics is developing a new mobile kanji dictionary which explains how kanji actually work, based on the latest academic research on the writing system. For more information, check out their Kickstarter page.