Learning Japanese through input
There are two key milestones on the path to acquiring a second language:
- Comprehension milestone: you are able to consume interesting content (text, audio, video, etc.) in the language with a reasonable degree of understanding and fluidity.
- Speaking milestone: you have sufficient listening skills and vocabulary to attempt non-trivial spoken communication.
The speaking milestone comes second because speaking rests upon a foundation of listening comprehension.
How long it takes to reach these milestones depends upon the distance between your own language and the target language. For a Spanish speaker learning Italian or vice versa, this might be as short as a few months. For an English speaker learning Japanese or vice versa, this usually takes multiple years.
NOTE: For the special case of living in country with daily immersion, you can reach basic functional spoken Japanese in several months, but there will be large holes in your skills, especially with the written language. In any case, this pathway is not really relevant to the discussion because it’s an unrealistic life circumstance for the vast majority of learners.
Once you reach these two milestones, the path forward is clear: consume tons of content and do lots of speaking practice. The hard question, though, is how to reach these milestones in the first place, especially for languages like Japanese that take a long time to acquire.
Here’s my advice on how to reach those milestones via listening and reading practice.
Levels of listening and reading practice
Listening and reading practice can be broken into levels based on how much you comprehend. In order of least understanding to most understanding, the levels are:
1. Scanning for words and phrases
At this level, you catch isolated words, phrases, and perhaps occasional whole clauses or sentences, and you may often recognize the ‘outline’ of many sentences (verb auxiliaries, conjunctions, and other connective words or phrases). However, you’re not able to follow what’s happening sentence-to-sentence nor consistently discern the topic. For example:
Something about a cow and magic beans. A giant is involved for some reason?
2. Following the topic
You can follow the topic, but you’re missing too many things to follow sentence-by-sentence:
Someone buys magic beans. There’s a big plant, lots of climbing, and a castle. Also some clouds and a giant? Someone dies at the end?
3. Partial understanding of the sentences
You can now understand the essence of many sentences, but details still get lost:
A kid gets magic beans. His mother is upset about the cow for some reason. Jack climbs a big plant that somehow appears. Jack finds a castle in the clouds and robs a giant, who dies when he falls.
4. Full understanding of the sentences
You now catch basically all the information:
Oh I see, Jack traded the cow for the beans, and the plant grew from the beans which he tossed out the window. Jack stole the giant’s goose, the giant chased him, and Jack killed the giant by chopping down the beanstalk.
5. Full understanding of the nuance
You not only get all the concrete information conveyed, you understand the nuances like a native speaker. In Japanese, this would include things like the choice of sentence-ending particles, pronouns, or honorifics. (Japanese has a lot of ‘coloring’ options, meaning ways of changing the feel of what’s being said without changing the concrete information conveyed.)
Jack is a disrespectful trickster comic hero who speaks in Kansai dialect. The giant talks like a tough yakuza guy. The story is narrated from an authoritative, impartial, 3rd-person perspective.
Reading while listening
Reading along with the transcript as you listen can help you understand more and stay engaged with the content. However, keep in mind that fluent reading relies upon engaging the listening comprehension pathways in the brain. This implies:
- Your reading fluency cannot exceed your listening comprehension.
- When listening to something you struggle to understand, reading along with a transcript is an unnatural act.
So it’s critical to practice pure listening without text because the mental overhead of reading will distract your focus from the spoken language. Without listening-only practice, your listening comprehension development may be hindered or even stalled.
Using AI translations and summaries
When tackling content that is above your level (mainly indicated by the content having a significant number of new or unfamiliar words), try first reading through an AI chat translation or English summary. Particularly for a piece of content which you have difficulty understanding, this trick helps you stay engaged as you listen: your memory of the summary will often fill in the gaps of understanding without having to stop and look up words or translate sentences.
On the other hand, this practice may actually diminish your engagement because it takes away the surprise of what comes next. So for some content, particularly fiction, you might instead only use AI summaries to review what you’ve read and listened to.
Word acquisition
Acquisition of a new word broadly follows this sequence:
1. Phonetic memory of a word
Human brains are remarkably good at absorbing the unique sound signatures of words. Often it takes hearing a word just a handful of times before you can fairly reliably recognize the sound of a word when you hear it, even in sentences that otherwise consist of totally unknown words.
However, recognizing the sound of a word does not mean you necessarily understand it. That comes later.
Nor does recognizing the sound of a word translate automatically to recognizing the written form or being able to reproduce the sound reliably. For example, even if you reliably recognize 面白い when you hear it, you may not reliably be able to remember the exact sequence of syllables when you try to speak the word. So even though speaking is built on a foundation of listening, reliably speaking a word correctly generally takes some actual speaking practice.
2. Map meaning of a word to the phonetic memory
Although conscious recall of a word’s meaning has some value in the learning process, ultimately, fluent listening comprehension requires direct, automatic mapping from the sound of a word to its meaning.
The only way to form these direct connections in your brain is to experience the meaning of a word when you hear it. The experiences do not have to be sensory (though that can help): rather the experiences can simply be occurrences where you are engaged with the meaning as you hear the word.
Artificial encounters with words, such as in drills, generally lack this engagement: you can consciously note that アライグマ means “raccoon”, and with enough repetitions you can recall this fact when prompted, yet the direct connection between the sound and meaning will only be formed by experiencing アライグマ in a meaningful context. These contexts could be:
- Seeing an アライグマ and hearing someone say, “アライグマ!”
- Being advised to keep your garbage bins sealed to ward off アライグマ.
- Hearing an engrossing fable about an アライグマ.
- Etc.
The critical part is that the meaning of the word has immediate value to you in these situations, such that you are really paying attention and actually care in the moment what this arbitrary sound means.
3. Nuances of a word
Beyond the “simple” meanings, words often have many nuances: shades of meaning, connotations, and collocations. The only practical way to acquire these nuances is simply by listening to and reading a ton of the language.
4. Active use of a word
Finally, a word will eventually move from your passive vocabulary into your active vocabulary. Again, the only practical way to induce this is through lots of speaking practice, and no artificial exercise is better than conversation and other scenarios where you genuinely need to communicate.
(Note that this last step often overlaps the prior: at the same time as you’re integrating a word into your active vocabulary, you’re often also still picking up its nuances.)
Translating in your head
A common lament of language learners is that they feel stuck “translating in their head” as they read, listen, or speak rather than being able to use the language automatically and without effort, as they do their native language. Unfortunately, there’s no mental technique that will help you consciously stop yourself from doing this mental translation. The only fix is just more practice: more listening, more reading, and more speaking. Eventually you’ll stop translating in your head because you will no longer need to.
For individual words and phrases, there is often an in-between phase, where the meaning of a word you hear does not fully register automatically, so you consciously repeat the word to yourself but don’t actually translate it.
Again, word acquisition starts with just phonetic recognition without understanding of the meaning:
おんがく? I’ve heard that word
Then comes conscious recognition with translation:
おんがく? That means “music”
Then comes conscious recognition:
おんがく? Oh yeah, おんがく
Finally, once a word is fully in your passive vocabulary, the meaning will register automatically when you hear it without any conscious thought:
おんがく
It’s common to have a large set of words that seem stuck in the earlier stages, but understand that this is normal. Don’t get frustrated when you catch yourself consciously thinking about a word when you hear it even if it’s a word you’ve heard a thousand times already. Some words will just take longer than others.
In fact, until you can listen to interesting content with a high degree of comprehension, many words will probably hover at the border of this final threshold. Only once the language is really useful to you such that you can actually consume content for the content’s own sake rather than language practice will the vocabulary and language patterns become truly ingrained.
Repeated listening practice
One effective form of practice is to repeatedly listen to a piece of content until you understand it without aid of a transcript or translation. The content for such practice should generally meet these criteria:
- The content should be monologues or dialogues by native speakers (not text-to-speech).
- The content should have an accurate transcript so that you can check if you heard the words correctly and so that you can easily look up any words or get a full translation.
- Especially in the beginner and intermediate stages, the content should generally be “comprehensible input”, meaning content targeted for non-native speakers.
- The content should be short, say 3 to 10 minutes long. The shorter, the easier it is to repeat.
- The content should be sufficiently within your level that you understand at least 70-80% on your first listen.
After each repetition, you can look at the transcript if you need to verify your understanding and fill in gaps: look up any unknown words, analyze the grammar, use machine translation, or whatever else it takes to fully understand the text.
You generally should repeat a piece of content no more than 3 or 4 times, and generally these repetitions should be spaced out by at least a few days.
If, after a few repetitions, you still have trouble following a piece of content, it was probably too difficult for listening practice at your current level. Don’t worry—the exercise still had value—but consider picking simpler content for future practice.
NOTE: Depending on the original speaking speed, you may be able to slow the playback rate down to 80% or even 70% speed without significantly distorting the speech. Don’t feel guilty about doing this if it helps! Better to understand as much as you can than to let the words pass over you as a stream of noise.
Intensive reading practice
When doing intensive reading, you analyze a text to get contextual exposure to new words, kanji, points of grammar, and language patterns. As you read, you look up every unknown word and puzzle out how all the words fit together, using grammatical analysis and computer translation or any other means necessary. A few things to note:
- Because beginners and intermediate learners should prioritize listening comprehension, I recommend using transcripts of audio or video content so that you can also listen to the content (preferably spoken by a native speaker).
- Intensive reading is an opportunity to reach above your current natural comprehension level, so the selected content should contain a good number of words and language patterns that you’ve never encountered before. Roughly, for every 5 minutes of audio or video, you want about 20 to 40 new or unfamiliar words.
- Because you are studying the text rather than naturally reading it, intensive reading is a slow process. For a 5 minute piece of audio or video, it often may take more than 30 minutes to study the first time you go through it.
- As with all language practice, feel free to abandon a piece of intensive reading content at any time because it’s too hard, too easy, too boring, or you just feel like doing something else. Always keep in mind that your attention and interest are critical in language acquisition, so you should follow your own impulses.
Random practice
In addition to systematic listening and reading practice, it’s also good to consume content non-systematically with the goal of branching out and familiarizing yourself with parts of the language not usually found in comprehensible input.
So for random practice, you should consume audio, video, or text of various kinds and a variety of levels. Depending on the level, you may understand anywhere from 10% to 100% of what you consume. Reaching above your level exposes you to new vocabulary and grammar, while revisiting lower levels helps develop the ease and fluidity with which you process the language.
Unlike with systematic practice, these random pieces of content will be consumed only once, and for the most part, you should let anything you don’t understand pass you by rather than stop to look things up or translate.
TV and movies
In theory, TV shows and movies are a very appealing way to learn a language, but in practice, most TV and movies are inhospitable for beginner and intermediate learners due to several factors:
- Use of expansive vocabulary
- Use of advanced grammar
- Use of slang and colloquial speech
- Accents and speech quirks
- Archaic and formal speech
- Shouting, whispering, murmuring, mumbling
- Crosstalk
- Loud music and sound effects over dialogue
Furthermore, subtitles are problematic:
- A beginner or intermediate learner often won’t know enough vocabulary or kanji to get even the gist of the story from Japanese subtitles.
- Just turning off subtitles isn’t a great option either because a beginner or intermediate learner watching without subtitles would usually miss most of what is being said and wouldn’t be able to follow the story.
- It’s hard to concentrate on the Japanese audio while reading English subtitles (in fact, processing two languages at once is extremely difficult even if you already have mastered both languages). Furthermore, to the extent that parts of a Japanese and English sentence correspond, they tend to do so in reverse order, so for longer sentences, the part of the English translation currently displayed on screen often doesn’t match what is currently being said in the Japanese audio. On top of this, translators often take large liberties, such that the English subtitles do not accurately convey the original meaning or the way in which the original meaning was expressed.
- Most video sources don’t provide the option to display English and Japanese subtitles simultaneously, making it impossible to check the English subtitles for understanding while also checking the Japanese subtitles for words you missed. (Again though, exercising this option where available isn’t optimal because processing two written languages adds even more mental overhead that distracts from listening.)
Sadly then, beginner and intermediate learners can’t just watch TV and movies as normal for entertainment to get much language value from them.
What can work, however, is using TV and movie excerpts for repeated listening and intensive reading practice. The chosen excerpts should generally have these qualities:
- The shorter the excerpt, the easier it is to repeat and the less liable you are to lose focus. Excerpts as short as 1 or 2 minutes may be appropriate.
- Avoid the adverse factors listed above: heavy accents, use of slang, shouting, dialogue drowned out in the sound mix, etc. Look for dialogue scenes where the characters calmly take turns saying their lines.
- Accurate subtitles should be available in both Japanese and English, preferably in a form where both can be displayed simultaneously and copy-pasted (useful when you want to get a machine translation).
- It helps if the dialogue scenes have a modicum of visual interest. The images don’t actually have to relate directly to what’s being said: rather, they just need to provide visual cues that help you track where you are in the dialogue. (Even just regularly-paced camera cuts between the characters can help a great deal.)
Unfortunately, the TV and movies that fit these criteria may not be your favorites or even in your preferred genres. Particularly in anime, there’s almost an inverse relationship between quality and appropriateness for learners: the better the anime, often the harder it is to understand due to diverse, complex dialogue, extended action set pieces with shouting and loud sounds and music, and other learner-hostile qualities. So until you reach an advanced level, you may have to venture outside of your normal zone of interest to find appropriate content.
Because TV and movies are “native” content rather than “comprehensible input” designed explicitly for learners, they do not fit exactly in the listening and reading practice dichotomy. Even the easiest TV and movie content will generally have many new words and phrases for a beginner or intermediate learner, so you probably will need to fully analyze the transcript a few times to fully understand it. However, the whole point of using this content is for listening, so you should also watch each excerpted story multiple times without subtitles.
Like for listening practice, an excerpt should be repeated no more than three or four times, and the repetitions should be spread out over a few days or weeks. Even if you don’t fully understand the excerpt by the last repetition, don’t worry: the exercise still has great value even when you fall short of full understanding.
Vocabulary tracking and drilling
The key feature of this program is that it tracks the words you encounter in your listening and reading content. When a word becomes reasonably well-known to you (but not necessarily mastered), you can “archive” the word such that it is no longer highlighted in the subtitles and will be filtered by default from vocab drills. This vocab tracking provides two key benefits:
- You can track your progress in the language by the number of words you have encountered and archived.
- You can assess the difficulty of a piece of content based on the number and proportion of unarchived words it contains.
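As a rough illustration of the tracking idea, here is a minimal Python sketch. It is not the program's actual implementation: the class name VocabTracker and its methods are hypothetical, and it assumes the content has already been split into words (real Japanese text would first need a tokenizer, since words are not separated by spaces).

```python
from dataclasses import dataclass, field


@dataclass
class VocabTracker:
    """Hypothetical tracker for encountered and archived words."""

    encountered: set[str] = field(default_factory=set)
    archived: set[str] = field(default_factory=set)

    def record(self, words: list[str]) -> None:
        """Remember every word seen in a piece of content."""
        self.encountered.update(words)

    def archive(self, word: str) -> None:
        """Mark a word as reasonably well known (not necessarily mastered)."""
        self.encountered.add(word)  # archiving implies the word was encountered
        self.archived.add(word)

    def progress(self) -> tuple[int, int]:
        """Return (number of words encountered, number of words archived)."""
        return len(self.encountered), len(self.archived)

    def difficulty(self, words: list[str]) -> tuple[int, float]:
        """Return (count, proportion) of unique unarchived words in the content."""
        unique = set(words)
        if not unique:
            return 0, 0.0
        unarchived = unique - self.archived
        return len(unarchived), len(unarchived) / len(unique)


# Usage: one archived word, then a story containing four unique words.
tracker = VocabTracker()
tracker.record(["猫", "犬", "音楽"])
tracker.archive("猫")
print(tracker.progress())                            # (3, 1)
print(tracker.difficulty(["猫", "音楽", "魔法", "豆"]))  # (3, 0.75)
```

Counting unique words rather than every occurrence is a design choice made for this sketch; counting occurrences instead would weight frequently repeated unknown words more heavily.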
For drilling, follow four rules:
- Drill only the words that you have encountered recently in listening or reading.
- For words and kanji with multiple meanings and pronunciations, focus only on the meanings and pronunciations used in the stories.
- Do not ‘test’ yourself on the words. Drilling is an opportunity to get systematic repeated exposure, so just seeing and hearing words with their definitions is what provides the value.
- Drill an individual word no more than 10-20 times total in your lifetime. You most likely won’t master a word after just 10 to 20 drills, but the goal of vocab drilling should not be to master words but rather to make the words more familiar.
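The selection side of these rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: WordRecord and select_for_drill are hypothetical names, the 14-day window is my own stand-in for "recently" (the rules above do not define it), and the cap of 20 is simply the upper end of the 10-20 range.

```python
from dataclasses import dataclass
from datetime import date, timedelta

MAX_LIFETIME_DRILLS = 20  # upper end of the 10-20 range given above
RECENT_DAYS = 14          # assumption: "recently" is not defined in the text


@dataclass
class WordRecord:
    """Hypothetical per-word bookkeeping."""

    word: str
    last_encountered: date  # last time the word appeared in listening or reading
    drill_count: int = 0    # lifetime number of drills so far
    archived: bool = False  # archived words are filtered from drills by default


def select_for_drill(
    records: list[WordRecord],
    today: date,
    recent_days: int = RECENT_DAYS,
    max_drills: int = MAX_LIFETIME_DRILLS,
) -> list[WordRecord]:
    """Keep unarchived, recently encountered words that are under the lifetime cap."""
    cutoff = today - timedelta(days=recent_days)
    return [
        r
        for r in records
        if not r.archived
        and r.last_encountered >= cutoff
        and r.drill_count < max_drills
    ]
```

The sketch covers only which words to drill. The "do not test yourself" rule concerns how a drill is presented (showing and playing the word together with its definition rather than quizzing), which is a matter of interface rather than selection.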
Real mastery of words only comes through encounters in meaningful context, but by drilling words to a certain level of familiarity, each subsequent individual contextual encounter becomes much more impactful. In other words, drilling a word can ‘prime’ you for when you later encounter it in your listening or reading practice or in the wild.
Kana and Kanji
For how to approach the writing system, see the writing system.
Grammar study
The value of explicit grammar knowledge in acquiring a language is highly debated. The obvious demerits are:
- Conscious thinking about grammar while speaking or listening requires too much mental overhead and in fact only really steals your focus.
- The grammar of every natural human language has parts that are messy and murky.
- Accurate, complete grammar information is hard to find.
On the plus side:
- When reading, grammar can help you break down sentences to understand things above your current “natural” comprehension level.
- When writing, grammar can help you construct valid sentences above your “natural” speaking level.
- Knowledge of grammar can give you comfort or even confidence by putting rational bounds on the language. Without any knowledge of grammar, a language seems chaotic and untamable.
- Many of the most common words in a language have functions more than meanings. For example in English, the articles “a” and “the” do not refer to any person, place, or thing like a noun, nor any action like a verb, nor even any manner or context of action like an adverb, and many other languages have no equivalents or near equivalents of articles. So good luck explaining these words without describing how they are used! To describe the functions of words is to explain grammar, so in some sense, at least basic grammar knowledge is inescapable.
My recommendation then is:
- DO learn grammatical concepts
- do NOT memorize grammatical facts and tables
- do NOT practice points of grammar
In other words, it’s useful to understand grammar as best you can, but attempting to memorize and practice grammar, such as verb conjugations, is more frustrating than helpful. Instead, you want to internalize grammar patterns through lots of repeated exposure to real language, i.e. lots of input. Only this kind of internalization will help you actually put grammar into practice, both for input and output.
Language partners
An increasingly popular practice in recent years is pairing up with a “language partner”, a native speaker of your target language who is also learning your native language. The usual idea is that language partners will meet (perhaps in person but more commonly online) on a regular basis and take turns holding a conversation in each other’s languages.
This is surely a great way to practice speaking, but as discussed at the start, beginning and intermediate learners generally lack the required vocabulary and listening comprehension skills to even attempt holding a basic conversation.
An interesting question then is what else beginner and intermediate learners could do with a language partner that might be effective. One thing I’d like to try (but haven’t yet had the opportunity) is for partners, rather than holding conversations, to do some kind of exercise or game that prompts communication, but with both people speaking only their own native language. So rather than practicing speaking, the goal would be to just practice listening.
There are many possible exercises or games that might work well for this, but probably the simplest would be if both parties brought a story for the other person to read aloud. For example, if I bring my partner a story in Japanese, they would read it aloud to me sentence-by-sentence, and after each sentence, I would attempt to give them my translation in English. Both my partner and I could ask each other questions and prompt each other for clarification, but I would stick to speaking English while they speak Japanese (except perhaps when we prompt each other about specific words, e.g. ‘What does X mean?’). Before moving on to the next sentence, we would check a proper translation to make sure we both understood correctly.
This listening-only approach would have at least a few key advantages over conversation practice:
- Removing the requirement to speak the target language makes the activity more accessible at lower levels, and for many people it would induce less anxiety.
- One reason finding a language partner is quite difficult is that it requires both people to be ready to start speaking, something which takes several months or more of study. In contrast, basic listening is something that learners can try in the first month or even on the first day.
- Both participants get practice throughout the whole session, not just when the conversation switches to their target language. Even when my partner reads my story in Japanese, they are getting English input from me when I give my translation and ask them questions. Hopefully then they’re less liable to feel impatient while waiting for me to read their story in English.
Summary of listening and reading practice advice
- Prioritize listening practice over reading practice and especially speaking practice.
- Track the vocabulary you encounter in listening and reading.
- Limit vocabulary drills to words you have recently encountered in listening and reading, and limit lifetime drills of any individual word to 10-20 repetitions.
- If you repeat a piece of content, do so no more than a handful of times, even if you don’t fully “master” it by the last repetition.
- For a long time, you won’t understand most of what you encounter, and that which you do understand will often require conscious effort. “Natural”, automatic understanding comes only with massive consumption of the language, and even then only slowly and in pieces.
- As long as the language itself distracts you from focusing on the content of what you read and hear, your input consumption will be intensive rather than extensive. This is not a matter of learning style or choice: the beginner simply cannot understand what they read or hear without looking up many words and consciously pondering how the words fit together, a slow process which consumes much mental energy and greatly limits how much language content can be encountered per day.
- Comfort with a particular topic and style of content doesn’t necessarily translate to comfort with other topics or styles. This can make it seem like your comprehension level has suddenly regressed when you’ve actually just stumbled out of your zone.
- Accept that it will typically require many exposures before a word or kanji is fully memorized. Instead of consciously trying to force memorization, you should lean heavily on tools that make it quick and easy to look up words and kanji. Depend upon reminders, not memorization.
- A single aspect of a language cannot be truly mastered in isolation, and attempting to do so is generally inefficient. Don’t try to master one aspect or ‘level’ of the language before moving on.
Sources of comprehensible input
These are some of the best sources I’ve found of comprehensible input material with transcripts:
- Comprehensible Japanese (transcripts through Patreon)
- Nihongo Picnic (free transcripts)
- Japanese with Shun (transcripts through Patreon)
- Sakura Tips (free transcripts)
- Japanese with Noriko (free transcripts; the podcasts are also posted on YouTube)
- Sakura Podcast
These sources lack human-written transcripts in most cases, so you must rely upon YouTube captions or other auto-generated transcription: