Making recurrent neural net weights decipherable – new ideas

One problem with neural nets is that after training, their inner workings are hard to interpret.
The problem is even worse with recurrent neural networks, where the hidden layer feeds its outputs back into itself, alongside the regular inputs, on the next time step.

Before I talk about how the problem has been tackled, I should mention an improvement to standard recurrent nets, which its authors (Jürgen Schmidhuber and Sepp Hochreiter) called LSTM (Long Short-Term Memory). The inventors of this net realized that backpropagation isn’t limited to training a relation between two patterns; it can also be used to train gates that control the learning done by other parts of the net. One such gate is a ‘forget gate’. It applies a ‘sigmoid function’ to the weighted sum of its inputs. Sigmoid functions are shaped like a slanted letter ‘S’, and the bottom and top of the ‘S’ are at zero and 1 respectively. This means that if you multiply a signal by the output of a sigmoid function, at one extreme you are multiplying by (nearly) zero, so almost no signal gets through the gate. At the other extreme, you are multiplying by (nearly) 1, so that almost the entire signal gets through. Since sigmoid gates are differentiable, backpropagation can be used on them. In an LSTM, you have a cell state that holds a memory value, as well as one or more outputs. In addition to the standard training, you also train a gate to decide how much of the past ‘memory’ to forget on each time step as a sequence of inputs is presented to the net. A good explanation of LSTMs is at:, but the point to remember is that you can train gates to control the learning process of other gates.
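
To make the forget gate concrete, here is a minimal sketch in plain Python. It is a simplification, not a full LSTM: a real LSTM has further gates, and the weights, bias and memory value below are hypothetical numbers picked only to show the two extremes of the sigmoid.

```python
import math

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def forget_gate(weights, bias, inputs):
    """Sigmoid of the weighted sum of the gate's inputs."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

def gated_memory(old_memory, gate_value):
    """Multiply the stored value by the gate: near 0 erases it, near 1 keeps it."""
    return old_memory * gate_value

# Hypothetical numbers, chosen only to show the two extremes.
memory = 0.8
inputs = [1.0, 1.0]
closed = forget_gate([-10.0, -10.0], 0.0, inputs)  # weighted sum = -20
opened = forget_gate([10.0, 10.0], 0.0, inputs)    # weighted sum = +20

print(gated_memory(memory, closed))  # very close to 0: the memory is forgotten
print(gated_memory(memory, opened))  # very close to 0.8: the memory is kept
```

In training, backpropagation would adjust `weights` and `bias` so that the gate learns when to close and when to stay open.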

So back to making sense of the weights of recurrent nets. One approach is the IndRNN (Independently Recurrent Neural Network). If you recall, a recurrent net with 5 hidden nodes would not only feed forward 5 signals into each neuron of its output layer, but would also send 5 branches carrying the signals from the 5 hidden nodes as 5 extra ‘inputs’ to join the normal inputs in the next time step. If you had 8 inputs, then in total you would have 13 signals feeding into every hidden node. Once a net like this is trained, the intuitive meaning of the weights is hard to unravel, so the authors asked: why not just feed each hidden node into itself, thus keeping the hidden nodes independent of each other? Each node still gets all the signals from the inputs that it would normally get, but in the above example, instead of also getting 5 signals from the hidden layer’s previous time step, it gets just one extra signal: its own value on the previous time step. This may seem to reduce the power of the net since there are fewer connections, but according to the authors it actually makes the net more powerful. One plus is that with this connectivity, the net can be trained with many layers stacked at each time step. Another plus is that the neurons don’t have to use ‘S’ shaped functions; they can work with non-saturated activation functions such as ReLU (rectified linear unit, which is a diagonal line when the weighted sum of neural inputs is zero and above, and otherwise is a horizontal line with value zero).


It is easier to understand what a net like this is doing than a traditional recurrent net.
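
The difference in wiring can be sketched in a few lines of Python. This is my own illustration using the 8-input, 5-hidden-node example above. The weight values are arbitrary, biases are left out, and the names `W_in`, `W_rec` and `u` are mine rather than the paper's.

```python
def relu(z):
    """Rectified linear unit: zero below zero, the identity above."""
    return z if z > 0.0 else 0.0

def standard_rnn_step(W_in, W_rec, x, h_prev):
    """Every hidden node sees all inputs AND all previous hidden values."""
    h_new = []
    for i in range(len(h_prev)):
        total = sum(W_in[i][j] * x[j] for j in range(len(x)))
        total += sum(W_rec[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_new.append(relu(total))
    return h_new

def indrnn_step(W_in, u, x, h_prev):
    """Every hidden node sees all inputs but only ITS OWN previous value."""
    h_new = []
    for i in range(len(h_prev)):
        total = sum(W_in[i][j] * x[j] for j in range(len(x)))
        total += u[i] * h_prev[i]  # one recurrent weight per node
        h_new.append(relu(total))
    return h_new

n_inputs, n_hidden = 8, 5
W_in = [[0.1] * n_inputs for _ in range(n_hidden)]
W_rec = [[0.05] * n_hidden for _ in range(n_hidden)]  # 5 x 5 recurrent weights
u = [0.5] * n_hidden                                   # only 5 recurrent weights
x = [1.0] * n_inputs
h_prev = [0.0, 1.0, 2.0, 3.0, 4.0]

signals_standard = n_inputs + n_hidden  # 13 signals into each hidden node
signals_indrnn = n_inputs + 1           # 9 signals into each hidden node
h_std = standard_rnn_step(W_in, W_rec, x, h_prev)
h_ind = indrnn_step(W_in, u, x, h_prev)
print(h_std)
print(h_ind)
```

In the standard step every node's new value depends on all five old values, so with these identical weights all five nodes come out the same. In the IndRNN step each node's new value depends only on its own history, which is what makes a single recurrent weight `u[i]` readable on its own.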

Another ingenious idea came from a paper titled Opening the Black Box: Low-dimensional dynamics in high-dimensional recurrent neural networks, by David Sussillo of Stanford and Omri Barak of the Technion.

A recurrent network is a non-linear dynamical system, in that at any time step, the output of a computation is used as input for the next time step, where the same computation is made. Once the weights are learned, you can write the computation of the net as one large equation. In the equation below, the J matrix holds the weights from the context (hidden units feeding back), the B matrix holds the weights for the regular inputs, and h is a function such as the hyperbolic tangent. The symbol x is the union of u and r, where u are the signals from the input neurons and r are the signals fed back from the hidden units.


The systems described by these equations can have attractors, such as fixed points. You can think of an attracting fixed point as sitting at the bottom of a basin in a landscape. If you roll a marble anywhere into the basin, it will roll to the bottom. In the space of patterns, all patterns that are in the basin will evolve over time to the pattern at the bottom. Attractors do not have to be fixed points: they can be lines, or they can be a repeating sequence of points (the sequence repeats as time goes by), or they can never repeat but still be confined to a finite region; those trajectories in pattern space are called ‘strange attractors’. A fixed point can be an attractor, where all neighboring patterns eventually end up, or it can be a repeller, so that all patterns in its neighborhood evolve away from it. Another interesting type of fixed point is a saddle. Here patterns in some directions evolve toward the point, but patterns in other directions evolve away from it. Think of a saddle on a horse. You can fall off sideways (that would be the ‘repeller’ direction), but if you were jolted forward and upward in the saddle, you would slide back to the center (the ‘attractor’ direction).
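
For a small linear system the three kinds of fixed point can be told apart by the eigenvalues of its matrix. The sketch below is my own illustration, not taken from the paper. It assumes a continuous-time system dx/dt = A x with a 2x2 matrix A and a fixed point at the origin: negative real parts mean the state shrinks along that direction, positive real parts mean it grows.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from its trace and determinant."""
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0

def classify_fixed_point(a, b, c, d):
    """Label the fixed point at the origin of dx/dt = A x."""
    reals = [ev.real for ev in eigenvalues_2x2(a, b, c, d)]
    if all(r < 0 for r in reals):
        return "attractor"   # every direction pulls inward
    if all(r > 0 for r in reals):
        return "repeller"    # every direction pushes outward
    if min(reals) < 0 < max(reals):
        return "saddle"      # one direction in, one direction out
    return "marginal"        # a zero real part: the linear test is inconclusive

print(classify_fixed_point(-1.0, 0.0, 0.0, -2.0))  # attractor
print(classify_fixed_point(1.0, 0.0, 0.0, 2.0))    # repeller
print(classify_fixed_point(-1.0, 0.0, 0.0, 1.0))   # saddle
```

For a discrete-time network that is updated in steps, the analogous test would compare the magnitude of each eigenvalue with 1 instead of comparing its real part with 0.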


So Sussillo and Barak looked for fixed points in recurrent networks. They also looked for ‘slow points’, that is, points that attract but from which the state eventually drifts away. I should mention here that, just like in a basin, the dynamics in the area around a fixed point are approximately linear (if you are looking at a small enough area). As patterns approach an attractor, they usually start off quickly, but the progress slows down the closer they get to it.

The authors write:

Finding stable fixed points is often as easy as running the system dynamics until it converges (ignoring limit cycles and strange attractors). Finding repellers is similarly done by running the dynamics backwards. Neither of these methods, however, will find saddles. The technique we introduce allows these saddle points to be found, along with both attractors and repellers. As we will demonstrate, saddle points that have mostly stable directions, with only a handful of unstable directions, appear to be of high significance when studying how RNNs accomplish their tasks.

Why is finding saddles valuable?

A saddle point with one unstable mode can funnel a large volume of phase space through its many stable modes, and then send them to two different attractors depending on which direction of the unstable mode is taken.

Consider a system of first-order differential equations

where x is an N-dimensional state vector and F is a vector function that defines the update rules (equations of motion) of the system. We wish to find values round which the system is approximately linear. Using a Taylor series expansion, we expand F(x) around a candidate point in phase space:

(A Taylor expansion uses the idea that if you know the value of a function at a point x, you can approximate the value of the function at a nearby point (x + delta-x) using its first-order derivatives, second-order derivatives, and so on up to n’th-order derivatives at x.)
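
Written out under the definitions above, the system and its expansion have the standard form below. The names x* for the candidate point and δx for the small displacement are ones I am supplying, F' stands for the matrix of first derivatives (the Jacobian), and the second-order term is written schematically.

```latex
\dot{x} = F(x)

F(x^{*} + \delta x) = F(x^{*}) + F'(x^{*})\,\delta x
    + \tfrac{1}{2}\,\delta x\, F''(x^{*})\,\delta x + \cdots
```

At an exact fixed point the first term F(x*) is zero, so for small δx the behaviour is governed by the linear term F'(x*) δx.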

The authors say that “Because we are interested in the linear regime, we want the first derivative term of the right hand side to dominate the other terms, so that


They say that this observation “motivated us to look for regions where the norm of the dynamics, |F(x)|, is either zero or small. To this end, we define an auxiliary scalar function.” In the caption of the equation, they explain that there is an intuitive correspondence to speed in the real physical world:
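
Here is a minimal sketch of the idea of hunting for points where the speed is zero or small. Everything specific in it is my assumption: I take the auxiliary function to be half the squared speed, q(x) = ½|F(x)|², I use a toy two-dimensional system instead of a trained network, and I use plain gradient descent, which is not necessarily the optimiser the authors used. The toy system has a saddle at (0, 0) sitting between attractors at (-1, 0) and (1, 0).

```python
def F(state):
    """Toy dynamics (an assumption): dx/dt = x - x^3, dy/dt = -y."""
    x, y = state
    return [x - x**3, -y]

def q(state):
    """Half the squared speed, q = 0.5 * |F(x)|^2."""
    return 0.5 * sum(f * f for f in F(state))

def grad_q(state):
    """Gradient of q (Jacobian-transpose times F), worked out by hand for this F."""
    x, y = state
    fx, fy = F(state)
    return [(1.0 - 3.0 * x**2) * fx, -1.0 * fy]

def minimise_q(start, step=0.1, iterations=500):
    """Plain gradient descent on q: slides toward a point where the speed vanishes."""
    state = list(start)
    for _ in range(iterations):
        g = grad_q(state)
        state = [s - step * gi for s, gi in zip(state, g)]
    return state

def run_dynamics(start, dt=0.01, iterations=5000):
    """Forward Euler simulation of the system itself."""
    state = list(start)
    for _ in range(iterations):
        f = F(state)
        state = [s + dt * fi for s, fi in zip(state, f)]
    return state

start = [0.3, 0.5]
found = minimise_q(start)      # ends up at the saddle near (0, 0)
settled = run_dynamics(start)  # ends up at the attractor near (1, 0)
print(found, q(found))
print(settled)
```

From the same starting point, running the dynamics only ever reaches the attractor, while minimising q lands on the saddle. Which zero of q you reach depends on where you start, so in practice one would try many starting points.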

A picture that shows a saddle with attractors on either side follows:


The authors trained recurrent nets on several problems, and found saddles between attractors, which allowed them to understand how the net was solving problems and representing data. One of the more difficult problems they tried was to train a recurrent net to produce a sine wave given an input that represented the desired frequency. They would present an input whose amplitude represented the desired frequency (the higher the amplitude of the input signal, the higher the frequency they wanted the net output to oscillate at), and they trained the output neuron to oscillate at a frequency proportional to that input. When they analyzed the dynamics, they found that, even though fixed points were not reached,

For the sine wave generator the oscillations could be explained by the slightly unstable oscillatory linear dynamics around each input-dependent saddle point.

I’m not clear on what the above means, but it is known that you can have limit cycles around certain types of fixed points (unstable ones). In the sine wave example, the locations of attractors and saddle points differ depending on what input is presented to the network. In the other problems they trained the net on, the saddle point(s) stayed in the same place no matter what inputs were presented, because the analysis was done in the absence of input, maybe because the input was transient (applied for a short time), whereas in the sine wave task it was always there. So in the sine wave example, if you change the input, you change the whole attractor landscape.
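
One possible reading of “slightly unstable oscillatory linear dynamics” is that the linearised system near the saddle has a pair of complex eigenvalues with a small positive real part. This is my interpretation, not a claim from the paper. The imaginary part makes the state circle around the point, and the small positive real part makes the circle grow slowly. The sketch below uses the exact solution of such a two-dimensional linear system. The growth rate and frequency are hypothetical values.

```python
import math

def spiral_state(t, growth=0.05, omega=2.0 * math.pi, x0=1.0, y0=0.0):
    """Exact state at time t of dx/dt = A x with A = [[growth, -omega], [omega, growth]].

    The eigenvalues of A are growth +/- i*omega: rotation at angular
    frequency omega, with the radius scaled by exp(growth * t).
    """
    scale = math.exp(growth * t)
    c, s = math.cos(omega * t), math.sin(omega * t)
    return scale * (x0 * c - y0 * s), scale * (x0 * s + y0 * c)

def radius(t):
    """Distance from the fixed point at time t."""
    x, y = spiral_state(t)
    return math.hypot(x, y)

for t in [0.0, 1.0, 2.0, 10.0]:
    print(t, spiral_state(t), radius(t))
```

With these numbers the state returns to the same angle once per time unit, a little further out each time. In a full network the nonlinearity would presumably stop the growth, leaving a sustained oscillation whose frequency is set by the imaginary part of the eigenvalues.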

They also say that studying slow points, as opposed to just fixed points, is valuable, since

funneling network dynamics can be achieved by a slow point, and not a fixed point

(as shown in the next figure):


A mathematician who I’ve corresponded with told me his opinion of attractors.  He wrote:

I think that:
• a memory is an activated attractor.
• when a person gets distracted, the current attractor is destroyed and gets replaced with another.
• the thought process is the process of one attractor triggering another, then another.
• memories are plastic and can be altered through suggestion, hypnosis, etc.  Eye witness accounts can be easily changed, simply by asking the right sequence of questions.
• some memories, once thought to be long forgotten, can be resurrected by odors, or a musical song.

One can speculate that emotions are a type of attractor. When you are depressed, the types of thoughts you have are sad ones, and when you are angry at a friend, you dredge up the memories of the annoying things they did in the past.

In the next post, I’ll discuss a different approach to understanding a recurrent network. It’s called a “Neural Turing Machine”. I’ll explain a bit about it here.

Kurt Gödel had shown that, for any consistent set of axioms rich enough to describe arithmetic, there are statements that can be neither proved nor disproved from those axioms.

Before Gödel came along, there had been a half-century of attempts to find a set of axioms sufficient for all of mathematics, but that ended when he proved the “incompleteness theorem”.

In hindsight, the basic idea at the heart of the incompleteness theorem is rather simple. Gödel essentially constructed a formula that claims that it is unprovable in a given formal system. If it were provable, it would be false. Thus there will always be at least one true but unprovable statement. That is, for any computably enumerable set of axioms for arithmetic (that is, a set that can in principle be printed out by an idealized computer with unlimited resources), there is a formula that is true of arithmetic, but which is not provable in that system.

In a paper published in 1936 Alan Turing reformulated Kurt Gödel’s 1931 results on the limits of proof and computation, replacing Gödel’s universal arithmetic-based formal language with hypothetical devices that became known as Turing machines. These devices wrote on a tape and then moved the tape, but they could compute anything (in theory) that any modern computer can compute. They needed a list of rules to know what to write on the tape in different conditions, and when and where to move it.
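
To give a feel for how little machinery such a device needs, here is a toy sketch of a Turing machine in Python. The rule table is a hypothetical example of mine that simply inverts every bit on the tape and halts at the first blank. For convenience the code moves the head rather than the tape, which comes to the same thing.

```python
def run_turing_machine(rules, tape, state="start", blank="_", max_steps=1000):
    """Run a one-tape machine; rules map (state, symbol) -> (write, move, next_state)."""
    cells = dict(enumerate(tape))  # tape position -> symbol
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(head, blank)        # unwritten cells read as blank
        write, move, state = rules[(state, symbol)]
        cells[head] = write
        head += 1 if move == "R" else -1
    return "".join(cells[i] for i in sorted(cells)).strip(blank)

# Hypothetical rule table: invert each bit, halt at the first blank.
flip_rules = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

result = run_turing_machine(flip_rules, "10110")
print(result)  # prints 01001
```

The whole ‘program’ is the rule table, and the tape is the memory.
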
So Alex Graves, Greg Wayne and Ivo Danihelka of Google DeepMind in London came up with the idea to make a recurrent neural net with a separated memory section that could be looked at as a Turing machine with its tape. You can see their paper here: I’ve corresponded with one author, and hopefully can explain their project in my next post.

Opening the Black Box: Low-dimensional dynamics in high-dimensional recurrent neural networks, by David Sussillo and Omri Barak
Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN, by Shuai Li, Wanqing Li, Chris Cook, Ce Zhu, Yanbo Gao
