# Datagengo
Datagengo (データ言語) is a new (experimental) method for learning Japanese kanji.
Datagengo is an algorithmically generated list of lessons, each containing a batch of 20 kanji and just enough example sentences to learn those kanji in context.
The core of the method consists of memorizing the example sentences and writing them down repeatedly on paper.
Lessons are ordered by increasing difficulty, following the JLPT levels and school grades indicated in [KANJIDIC2](http://www.edrdg.org/wiki/index.php/KANJIDIC_Project).
Example sentences are sourced from the [Tanaka corpus](http://www.edrdg.org/wiki/index.php/Tanaka_Corpus).
## Why this method?
I'm putting this section first to make it visible, but you might want to skim
[how to use Datagengo](#how_to_use_datagengo) first.
I had previous experience learning kanji with books, starting with
Heisig's *Remembering the Kanji* (RTK), and with spaced repetition systems (SRS)
such as Anki or WaniKani. RTK was truly useful to me, as it gave me a sense of
how kanji are constructed and therefore how to approach them, but I never
completed the book. I then moved on to an Anki deck that contained all the RTK
kanji, which I didn't use for very long. A bit later I started WaniKani and
went through a good part of it, but I still stopped at some point, partly from
lack of motivation and partly from frustration that this was going nowhere.
Here are my main frustrations with SRS methods:
- The methods I used were based on individual kanji or words, and did not
provide real-world context in which those kanji or words might be used.
- Items (kanji or words) are taken in isolation and not grouped by level or by
lesson, which means that the full effort of memorization has to be made for
each item independently.
- As a consequence of the lack of context and logical grouping, I found it very
easy to confuse different kanji or different words.
- I was always doubtful of "spaced repetition with increasing intervals": it
feels much less efficient to me than consistently repeating something for some
time and then assuming that it is definitively learnt. In my head, "assuming
something is learnt", after having spent a definite amount of effort on it and
having decided that I was done with it, helps create a category of things I'm
supposed to know, which in turn helps cement that knowledge.

During my multi-year break from Japanese (something like 2016-2023), I pondered
this question occasionally and started thinking about a new method that
eventually became Datagengo.

The basic idea behind Datagengo is to attach as much context as possible
(explicit or implicit) to the learning of each kanji, so that all of these
contextual clues can be used when recalling it.
Here is what Datagengo does, and why I expect it to work much better, at least for me:
- Datagengo exploits the Tanaka corpus to provide in-context example sentences
for all of the learnt kanji.
- Datagengo tries to help you learn all the kanji using as few example
sentences as possible, to be efficient.
- Datagengo requires the learner to write by hand, which is in my experience the
best way to learn difficult things.
- Lessons, or "batches", are logical units that somewhat "work together" (even
if the sentences or kanji have nothing to do with one another), helping to cement
that entire unit of knowledge in one go instead of lots of independent
efforts on tiny things.
- Once a lesson has been studied around a dozen times, it becomes very easy to
recite, and it is therefore natural to declare it as "acquired knowledge".
Even if you forget the details of the individual sentences, memory of the
kanji will stay, embedded within the context of the lesson and therefore
easier to pick up when called upon.
- The period during which you studied each kanji also becomes part of its
learning context, making it again easier to recall. If, like me, you
have a visual understanding of the passage of time (I personally see the
year as a big loop, with months of the year each having a relatively
precise position on it), then this effect can be even stronger.
## How to use Datagengo
### How to study a lesson
#### High-level overview
1. Write down the 20 kanji of the lesson.
2. Write down all of the example sentences in the lesson from memory.
3. Check what you wrote against the lesson page.
4. Repeat every day for about 10 days.
#### Detailed explanation
1. Write down the number of the lesson, the current date and time, and how many times you have studied this lesson.
2. Write down the 20 kanji composing the lesson.
If possible, do this from memory, otherwise it's fine to look at the lesson page.
Make sure that you are not mixing up different kanji and that you know the correct stroke order for each kanji.
3. If this is your first encounter with this lesson, just read the sentences, familiarize yourself with them and the new words they contain, and stop there.
Otherwise, proceed to the next step.
4. If this is one of the first few times you are studying this lesson, re-read all of the sentences to ensure you have them in mind.
5. Close the lesson's web page, and write down all of the sentences in the lesson from memory.
You can use the list of 20 kanji you just wrote to help you remember the sentences.
6. Return to the lesson's web page and check that you wrote all sentences correctly.
If you have any doubts about stroke order or pronunciation, take the time to check.
7. Write down the time at which you finished studying the lesson.
8. If you have the patience, have a look at the "extra vocabulary" section at the bottom of the lesson's page.
You don't need to actively memorize it; the point is simply to know that these words can be written with the kanji of this lesson when you encounter them elsewhere.

The overall process takes 10 to 15 minutes per lesson, sometimes less, sometimes more. Each lesson will take less and less time as you repeat it.
### Planning your study
I recommend studying each lesson for 10 to 12 days, every day, and adding a new
lesson about every three or four days. This means that you will have 3 or 4
lessons to study each day, which takes me between 30 and 45 minutes total
(that's why I'm suggesting that you write down the time when you
start/finish!).
If you are having a harder time memorizing the kanji and the sentences, you can
adapt the schedule to your learning speed. Here are some examples:
- __Slowest:__ study each lesson for 12 days, add a new lesson every 6 days (average 2 lessons to study every day).
- __Slow:__ study each lesson for 10 to 12 days, add a new lesson every 5 days (average 2 lessons to study every day).
- __Medium:__ study each lesson for 10 to 12 days, add a new lesson every 4 days (average 2-3 lessons to study every day).
- __Fast:__ study each lesson for 10 to 12 days, add a new lesson every 3 days (average 3-4 lessons to study every day).
- __Extra-fast:__ study each lesson for 8 to 10 days, add a new lesson every 2 days (average 4-5 lessons to study every day).
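
The average daily workload quoted for each schedule follows from simple arithmetic: in steady state, the number of lessons under study on a given day is roughly the number of days each lesson is studied divided by the number of days between new lessons. Here is a minimal Python sketch of that estimate; the function name and the `SCHEDULES` table are only illustrative and are not part of Datagengo itself.

```python
def average_daily_lessons(study_days: int, new_lesson_every: int) -> float:
    """Steady-state estimate of how many lessons are under study each day."""
    if study_days <= 0 or new_lesson_every <= 0:
        raise ValueError("both parameters must be positive")
    return study_days / new_lesson_every


# Hypothetical table mirroring the schedules listed above:
# (shortest study period, longest study period, days between new lessons)
SCHEDULES = {
    "Slowest": (12, 12, 6),
    "Slow": (10, 12, 5),
    "Medium": (10, 12, 4),
    "Fast": (10, 12, 3),
    "Extra-fast": (8, 10, 2),
}

if __name__ == "__main__":
    for name, (shortest, longest, every) in SCHEDULES.items():
        low = average_daily_lessons(shortest, every)
        high = average_daily_lessons(longest, every)
        print(f"{name}: {low:.1f} to {high:.1f} lessons per day")
```

For the Fast schedule, for instance, this gives between about 3.3 and 4 lessons per day, in line with the "3-4" figure in the list.
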

The two parameters can be tuned separately according to your needs:
- __How many days you will keep studying each lesson:__ you can reduce this
if you feel that the last repetitions are becoming boring/useless, but those
last repetitions will also become very fast and it's always good to do them
as practice.
- __How frequently you add a new lesson:__ being consistent with this will help you
plan long-term. For instance, if you add one lesson every 6 days on average,
you will know all JLPT N2 kanji within a year, and if you consistently
add a new lesson every 4 days, you will know all JLPT N1 kanji in slightly
over one year.
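
These long-term estimates can be checked with a rough calculation. The sketch below assumes that new lessons are started at a perfectly regular interval, that lesson 000 starts on day 0, and that the upper bound of each range in the lesson list (051 for N2, 098 for N1) is the last lesson needed for that level. The function name is hypothetical.

```python
def days_to_complete(last_lesson: int, new_lesson_every: int, study_days: int) -> int:
    """Days from starting lesson 000 until `last_lesson` has been fully studied."""
    # Assumption: lesson n starts on day n * new_lesson_every.
    start_of_last = last_lesson * new_lesson_every
    # The last lesson is then studied every day for `study_days` days.
    return start_of_last + study_days


if __name__ == "__main__":
    # JLPT N2 (through lesson 051), one new lesson every 6 days, 12 days each
    print(days_to_complete(51, 6, 12))  # 318
    # JLPT N1 (through lesson 098), one new lesson every 4 days, 12 days each
    print(days_to_complete(98, 4, 12))  # 404
```

Under these assumptions, N2 takes about 318 days (within a year) and N1 about 404 days (slightly over one year).
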

__Note that 65 JLPT N1 *jōyō* kanji, as well as 186 *jinmeiyō* kanji also marked for N1 in KANJIDIC2, did not have an example sentence in the Tanaka corpus and are therefore not included in Datagengo.__
The list can be found below the batch list in the level list, in the "missing
kanji" column, in rows N1a and N1b for the *jōyō* kanji and N1-9 for the
*jinmeiyō* kanji. You might want to study at least the 65 *jōyō* kanji
separately before attempting to pass JLPT N1.
### Which lessons should I learn?
Here is how the lessons are currently organized:
- __000 to 005:__ JLPT N5
- __005 to 014:__ JLPT N4
- __014 to 031:__ JLPT N3
- __031 to 051:__ JLPT N2
- __051 to 098:__ JLPT N1, further split into three parts for easier processing:
  - Lessons 051 to 058 (marked N1a) contain kanji learnt in Japanese elementary school.
  - Lessons 058 to 095 (marked N1b) contain kanji learnt in Japanese secondary school.
  - Lessons 095 to 098 (marked N1-9) contain *jinmeiyō* kanji (for use in names).
- __098 to 105:__ extra *jōyō* kanji not part of JLPT but learnt in Japanese elementary or secondary school (marked N0a and N0b respectively)
- __105 to 114:__ extra *jinmeiyō* kanji (marked N0-9)
- __114 to 126:__ even more kanji, not part of JLPT, *jōyō* or *jinmeiyō* (marked N0+)
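
If you want to look up which level a given lesson belongs to, the ranges above can be written as a small table. The following Python sketch is only an illustration: the labels and the function name are made up for this example, and it treats both ends of each range as inclusive, so a boundary lesson such as 005 is reported under both neighbouring levels. That reading of the shared boundaries is an assumption.

```python
# (label, first lesson, last lesson), copied from the list above.
LEVEL_RANGES = [
    ("JLPT N5", 0, 5),
    ("JLPT N4", 5, 14),
    ("JLPT N3", 14, 31),
    ("JLPT N2", 31, 51),
    ("JLPT N1 (N1a)", 51, 58),
    ("JLPT N1 (N1b)", 58, 95),
    ("JLPT N1 (N1-9)", 95, 98),
    ("extra jōyō (N0a, N0b)", 98, 105),
    ("extra jinmeiyō (N0-9)", 105, 114),
    ("other kanji (N0+)", 114, 126),
]


def levels_for_lesson(lesson: int) -> list[str]:
    """Return every level whose range contains the given lesson number."""
    # Assumption: both bounds are inclusive, so boundary lessons match twice.
    return [label for label, first, last in LEVEL_RANGES if first <= lesson <= last]


if __name__ == "__main__":
    print(levels_for_lesson(20))  # ['JLPT N3']
    print(levels_for_lesson(5))   # ['JLPT N5', 'JLPT N4']
```
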

If you are studying for advanced levels, make sure to check the character table
below the lesson list, and in particular the "missing kanji" column, to see
which characters have no example sentences in the Tanaka corpus
and are therefore not included in Datagengo.