Implementing the repetition spacing neural network

Bartosz Dreger, Piotr Wozniak
May 29, 1998
 
See Neural Network SuperMemo for a brief introduction to the repetition spacing neural network.
Basic assumption

The state of memory will be described with only two variables: retrievability (R) and stability (S) (Wozniak, Gorzelanczyk, Murakowski, 1995). The following equation relates R and S:

(1) R = e^(-k*t/S)

where:
R - retrievability (probability of recalling an item at time t)
S - stability of memory
t - time
k - a constant
For simplicity, we will set k=1 to define stability unambiguously.
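As a minimal illustration of Eqn (1) with k=1 (Python is used here for readability only; the function name is ours, not part of NN.DLL):

    import math

    def retrievability(t, S):
        # Eqn (1) with k=1: probability of recall after t days
        # for a memory trace of stability S (also in days).
        return math.exp(-t / S)

    # Example: with S = -3/ln(0.9) ~= 28.5 days, recall probability
    # after 3 days is retrievability(3, 28.5) ~= 0.90, i.e. the
    # requested 10% forgetting index.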

Input and output

The following functions are to be determined by the network:

(2) S[i+1] = fs(R, S[i], D, G)

(3) D[i+1] = fd(R, S, D[i], G)

The neural network is supposed to generate stability (S) and item difficulty (D) on the output given R, S, D and G on the input:

(4) (R[i], S[i], D[i], G[i]) => (D[i+1], S[i+1])

where:
R[i], S[i], D[i] - retrievability, stability and item difficulty at the i-th repetition
G[i] - grade scored at the i-th repetition
i - number of the repetition
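The input-output contract of Eqn (4) can be sketched as follows (the record names below are hypothetical; the article does not specify the internal representation used by NN.DLL):

    from typing import NamedTuple

    class NetInput(NamedTuple):
        # (R[i], S[i], D[i], G[i]) presented on the input layer
        R: float  # retrievability at the moment of repetition
        S: float  # current stability
        D: float  # current item difficulty
        G: int    # grade scored at the repetition (0-5)

    class NetOutput(NamedTuple):
        # (D[i+1], S[i+1]) generated on the output layer
        D_next: float
        S_next: float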

Error correction for difficulty D

Target difficulty will be defined, as in Algorithm SM-8, as the ratio between the second and the first optimum interval. The neural network plug-in (NN.DLL) will record this value for all individual items and use it in training the network:

(5) Do = I2/I1

where:
Do - target difficulty
I1 - the first optimum interval used for the item
I2 - the second optimum interval used for the item

Important! The optimum intervals I1 and I2 are not the ones proposed by the network before verification, but the ones used in error correction after the proposed interval has already been executed and verified (see error correction for stability S)!
Note that the higher the value of Do, the lower the difficulty.

The initial value of difficulty will be set to 3.5, i.e. D1=3.5. This is only for similarity with Algorithm SM-8. As the initial difficulty is not known, it cannot be used to determine the first interval. After the first grade is scored, error correction is still impossible, because the second optimum interval is not yet known. Once it is known, Do can be used for error correction of D on the output.
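A sketch of the bookkeeping involved (the article only states that NN.DLL records this value per item; the record layout below is an assumption):

    def target_difficulty(I1, I2):
        # Eqn (5): target difficulty, computable only once both
        # optimum intervals have been executed and verified.
        return I2 / I1

    # Hypothetical per-item record kept until I2 becomes known:
    # {'item': 17, 'I1': 3.0, 'I2': 10.5}  ->  target_difficulty = 3.5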
To avoid convergence problems in the network, the following formula will be used to determine the correct output on D:

(6) Dopt = 0.9*D[i] + 0.1*Do

where:
Dopt - difficulty used as the correct output in training the network
D[i] - difficulty currently generated by the network
Do - target difficulty defined in Eqn (5)

The convergence factor of 0.9 in Eqn (6) is arbitrary and may change depending on the network performance.
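A direct transcription of Eqn (6), with the convergence factor exposed as a parameter since the text notes it may change:

    def corrected_difficulty(D_i, D_o, convergence=0.9):
        # Eqn (6): pull the network's difficulty output D_i only
        # part of the way toward the target Do to avoid oscillation.
        return convergence * D_i + (1.0 - convergence) * D_o

    # Example: network output D = 3.0, measured target Do = 4.0:
    # corrected_difficulty(3.0, 4.0) -> 3.1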

Error correction for stability S

The following formula, derived from Eqn (1) for a forgetting index of 10% and k=1, makes it easy to convert between stability and the optimum interval: I = -ln(0.9)*S

In the optimum case, the network should generate the requested forgetting index for each repetition. A variable forgetting index can easily be used once the stability S is known (see Eqn (1)). For simplicity, we will use a forgetting index of 10% in further analysis.
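Both directions of the conversion, assuming the 10% forgetting index and k=1 used throughout this section:

    import math

    LN_09 = math.log(0.9)  # ~= -0.10536

    def interval_from_stability(S):
        # I = -ln(0.9)*S: optimum interval at a 10% forgetting index
        return -LN_09 * S

    def stability_from_interval(I):
        # inverse conversion: S = -I/ln(0.9)
        return -I / LN_09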

To accelerate the convergence, the network will measure the forgetting index for 25 classes of repetitions. These classes are defined by (1) five difficulty categories: 1-1.5, 1.5-2.5, 2.5-3.5, 3.5-5, and over 5, and (2) five interval categories: 1-5, 5-20, 20-100, 100-500, and over 500 days. We will denote the forgetting index measured for these categories as FI(D[m],I[n]). Additionally, the overall forgetting index FItot will be measured and used in stability error correction.
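A possible binning of repetitions into the 25 classes (how values falling exactly on a category boundary are classified is our assumption; the article does not say):

    DIFFICULTY_BOUNDS = (1.5, 2.5, 3.5, 5.0)       # categories m = 0..4
    INTERVAL_BOUNDS = (5.0, 20.0, 100.0, 500.0)    # days, categories n = 0..4

    def _category(value, bounds):
        # index of the first upper bound the value falls below,
        # or the last ("over ...") category
        for i, bound in enumerate(bounds):
            if value < bound:
                return i
        return len(bounds)

    def fi_class(D, I):
        # (m, n) pair indexing FI(D[m], I[n]) and Cases(m, n)
        return _category(D, DIFFICULTY_BOUNDS), _category(I, INTERVAL_BOUNDS)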

The ultimate goal is to reach a forgetting index of 10% in all categories. The following formula will be used in error correction for stability:

(7) FIopt(m,n) = (10*FItot + Cases(m,n)*FI(m,n)) / (10 + Cases(m,n))

where:
FIopt(m,n) - forgetting index used in error correction for difficulty category m and interval category n
FItot - overall forgetting index measured across all repetitions
FI(m,n) - forgetting index measured in category (m,n)
Cases(m,n) - number of repetition cases recorded in category (m,n)

The formula in Eqn (7) is supposed to shift the weight in error correction from the overall forgetting index to the forgetting index recorded in a given category as the number of cases in that category increases. Obviously, for Cases(m,n)=0 we have FIopt(m,n)=FItot. For Cases(m,n)=10 the weights of the overall and the category forgetting index balance, and for a large number of cases FIopt(m,n) approaches FI(m,n).
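Eqn (7) transcribed directly, with the limiting behaviors described above noted in comments:

    def fi_opt(fi_tot, fi_mn, cases_mn):
        # Eqn (7): blend the overall forgetting index with the category
        # measurement as evidence in the category accumulates.
        return (10.0 * fi_tot + cases_mn * fi_mn) / (10.0 + cases_mn)

    # cases_mn = 0   -> fi_opt == fi_tot       (no category data yet)
    # cases_mn = 10  -> equal weights for both measurements
    # cases_mn >> 10 -> fi_opt approaches fi_mn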

The following table illustrates the assumed relationship between FIopt(m,n), grades and the interval correction applied: 
                    Grade 0   Grade 1   Grade 2   Grade 3   Grade 4   Grade 5
FIopt(m,n) > 10%      40%       60%       80%        -         -         -
FIopt(m,n) = 10%       -         -         -         -         -         -
FIopt(m,n) < 10%       -         -         -       110%      120%      130%

(- = no correction)
In SuperMemo, grades less than 3 are interpreted as forgetting, while grades of 3 or more are understood as sufficient recall. That is why no correction is used for passing grades when FI is at or above the requested level, and no correction is used for failing grades when FI does not exceed the requested level.
An example correction for an excessive forgetting rate and grade=2 with an applied interval of 10 days would be 80%. Consequently, the network will be instructed to assume Interval=8 as correct. The correct stability would then be derived from S=-8/ln(0.9) and used in error correction.
The values of the interval corrections are arbitrary but must not undermine the convergence of the network. In the unlikely case of convergence problems, the corrections might be reduced (note that the environmental noise in the learning process will dramatically exceed the impact of a suboptimal choice of the correction factors!). Similar corrections used to be applied in successive SuperMemo algorithms with encouraging results.
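The table and the worked example can be combined into a single correction routine (a sketch; treating FIopt(m,n) exactly at 10% as the no-correction row is our reading of the table):

    import math

    def interval_correction(fi_opt_mn, grade):
        # correction factor from the table above; 1.0 = no correction
        if fi_opt_mn > 10 and grade <= 2:
            return {0: 0.4, 1: 0.6, 2: 0.8}[grade]
        if fi_opt_mn < 10 and grade >= 3:
            return {3: 1.1, 4: 1.2, 5: 1.3}[grade]
        return 1.0

    # Worked example from the text: excessive forgetting (FIopt > 10%),
    # grade=2, applied interval of 10 days:
    #   corrected interval  = 10 * interval_correction(12, 2) = 8 days
    #   corrected stability = -8 / math.log(0.9) ~= 75.9 days
    # The latter value is used as the correct output on S in training.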

Border conditions

The following additional constraints will be imposed on the neural network to accelerate the convergence:

These conditions will not prejudice the network, as they have been proven true beyond reasonable doubt in the practice of using SuperMemo and its implementations over the last ten years.

Pretraining

In the pretraining stage, the following form of Eqns (2) and (3) will be used:

(8) D[i+1] := D[i] + (0.1 - (5-G)*(0.08 + (5-G)*0.02))

(9) S[i+1] := S[i] * D[i] * (0.5 + 1/i)

With D1=3.5 and S1=-3/ln(0.9).

Eqn (8) has been derived from Algorithm SM-2 (see the E-Factor equation).
Eqn (9) has been roughly derived from the Matrix OF in Algorithm SM-8.
D1=3.5 corresponds to the same setting in Algorithm SM-8.
S1=-3/ln(0.9) corresponds to a first interval of 3 days and a forgetting index of 10%. The value of 3 days is close to the average across a wide spectrum of students and levels of difficulty of the learning material.
Pretraining will also use the border conditions mentioned in the previous section.
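A sketch of one pretraining update using Eqns (8) and (9) and the initial values above:

    import math

    def pretrain_step(D_i, S_i, G, i):
        # Eqn (8): difficulty update derived from the SM-2 E-Factor formula
        D_next = D_i + (0.1 - (5 - G) * (0.08 + (5 - G) * 0.02))
        # Eqn (9): stability update roughly derived from Matrix OF in SM-8
        S_next = S_i * D_i * (0.5 + 1.0 / i)
        return D_next, S_next

    D1 = 3.5                 # initial difficulty, as in Algorithm SM-8
    S1 = -3 / math.log(0.9)  # ~= 28.5 days: first interval of 3 days at FI=10%

    # Example: grade G=4 at the first repetition (i=1):
    # pretrain_step(D1, S1, 4, 1) -> (3.5, S1 * 3.5 * 1.5)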