Stephen Grossberg, an early pioneer in the field of neural nets and the brain, recently published a book called “Conscious Mind Resonant Brain” that describes the theories he’s developed since the 1970s. In chapter five he discusses one theory he named ‘Adaptive Resonance Theory’ (ART) that led to a product that has been used in many fields, from medical imaging to manufacturing.
ART can be thought of as a clustering algorithm. Imagine a semantic space where every dimension is a feature. For instance, think of a coordinate system with ‘x’, ‘y’ and ‘z’ axes where each axis represents a feature. ‘x’ might represent ‘has fur’, ‘y’ might represent ‘purrs’, and ‘z’ might represent ‘barks’. Each axis might go from zero to one, where zero means ‘totally lacks the feature’ and one means ‘has the feature.’ If you draw a point where ‘x’ = 1 and ‘y’ = 1, but ‘z’ = 0, you could label the point ‘cat’. Obviously, you would want to use more dimensions for more features, and you might want to classify various relatives of the cat (‘lion’, ‘tiger’, ‘leopard’), which would be a set of points that cluster near each other, while the relatives of the dog (‘wolf’, ‘dog’, ‘coyote’) would be another cluster of points. Even though the ‘dog’ cluster would not overlap the ‘cat’ cluster, it would be closer to the ‘cat’ cluster than to the ‘dolphin’ cluster. The goal in this example is to have ART learn clusters as it is presented with features of various animals. This raises a couple of issues. One is, how narrow should a cluster be? For instance, dogs bark for many reasons, while wolves mainly bark as an alert signal. Dogs are domesticated, but wolves are not. So should wolves be left out of the ‘dog’ cluster?
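To make the idea of a feature space concrete, here is a minimal sketch in Python. The feature values are made up for illustration, and I have added a fourth, hypothetical feature (‘lives in water’) because with only the three features above a dolphin would sit exactly as far from ‘dog’ as ‘cat’ does.

```python
import math

# Hypothetical feature vectors: (has fur, purrs, barks, lives in water).
# Every value is illustrative, not measured data.
animals = {
    "cat": (1.0, 1.0, 0.0, 0.0),
    "dog": (1.0, 0.0, 1.0, 0.0),
    "wolf": (1.0, 0.0, 0.5, 0.0),  # barks less than a dog (assumed value)
    "dolphin": (0.0, 0.0, 0.0, 1.0),
}

def distance(a, b):
    """Euclidean distance between two animals in feature space."""
    return math.dist(animals[a], animals[b])

for other in ("wolf", "cat", "dolphin"):
    print(f"dog -> {other}: {distance('dog', other):.3f}")
```

With these assumed values, ‘wolf’ lands closest to ‘dog’, ‘cat’ is farther away, and ‘dolphin’ is farthest, which is the kind of geometry the clusters are meant to capture.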
A related practical question is, if you simply train a neural net on data, and you alter the weights each time, then could you have the weights that originally stood for ‘dog’ mistakenly approach a representation for ‘cat’?
To show how ART solves this, let’s start off with the following diagram, which shows a feature pattern of inputs at a lower layer, feeding into a few nodes in an upper layer.
Each node in the upper layer is a category. Each category gets the same inputs from below, but the categories compete by sending inhibitory links to each other. This way, the category that is the most active suppresses competing categories. A category that wins is one whose weights match the inputs better than those of the other categories. There are two sets of weights involved: weights going up from the inputs to the categories, and weights going down from the categories to the inputs. When a category is learned, both its bottom-up weights and its top-down weights reflect the pattern it learned. Suppose the three features in the example below are: 1. has fur, 2. purrs, 3. barks. Then an input of true,true,false (or 1,1,0) might indicate a cat was encountered, and 1,0,1 might indicate a dog was encountered. Conversely, if a category stands for ‘cat’, it has weights going downward that would produce true,true,false in the inputs, and if a category stands for ‘dog’, it has weights going downward that would produce true,false,true in the inputs.
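The competition between categories can be sketched in a few lines of Python. This is a simplification under my own assumptions: the mutual inhibition is replaced by simply picking the most active category, and the weight values and the names `bottom_up`, `top_down` and `choose_category` are hypothetical.

```python
# Features, in order: (has fur, purrs, barks).
# Hypothetical bottom-up weights (inputs -> category).
bottom_up = {
    "cat": (0.5, 0.5, 0.0),
    "dog": (0.5, 0.0, 0.5),
}
# Hypothetical top-down weights (category -> expected input pattern).
top_down = {
    "cat": (1, 1, 0),
    "dog": (1, 0, 1),
}

def choose_category(inputs):
    """Winner-take-all stand-in for the inhibitory competition in the upper layer."""
    activations = {
        name: sum(i * w for i, w in zip(inputs, weights))
        for name, weights in bottom_up.items()
    }
    winner = max(activations, key=activations.get)
    return winner, activations

winner, activations = choose_category((1, 1, 0))
print(winner, activations)  # the 'cat' node is the most active
print(top_down[winner])     # the pattern the winner expects to see below
```

The point of keeping two weight sets is visible here: the bottom-up weights decide who wins, and the top-down weights say what the winner expects.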
In the lower half of the diagram, you can see the effect of a top-down signal on an input pattern. The yellow pattern on the left gets modified by encountering a category that has expectations that are somewhat different. The category sends its own expectation of a signal pattern down, which doesn’t quite match the input pattern, and modifies it somewhat.
In the above image, a category is shown sending its signal via weights down to the features, but it is also indirectly sending an inhibitory signal to the features. The balance of opposing signals means that by itself, the category will not fire the feature cells. But the category can fire cells in the feature layer if inputs are also coming into them from below. While a category is active, a neuron in the feature layer that doesn’t get excited both from above and from below will not fire. A neuron that gets excitatory signals from both above and below will fire.
At the bottom of the above image, you can see inputs ‘I’ feeding into the feature layer. A single arrow is shown for simplicity, but in reality, many arrows, one for each input, would be feeding into the box labeled F1 (feature layer). The inputs ‘I’ also send a branch to ‘A’ (the triangle with the ‘rho’ symbol in it). That branch computes a sum of all the inputs and multiplies it by a constant (rho) which is between zero and one. The constant is also known as ‘vigilance’. At this point, a pattern ‘X’ is expressed over the cells in F1, and they also send a sum to ‘A’, but on an inhibitory link. The signal from ‘X’ to ‘A’ has as many neurons firing as ‘I’, while the excitatory signal from ‘I’ has been scaled down by rho, so the inhibition wins and the net effect is that ‘A’ does not fire.
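The arithmetic at ‘A’ is simple enough to write down. Below is a small sketch for binary patterns; the function names are mine, and treating the signals as plain sums is an assumption that follows the description above rather than a full model of the neurons.

```python
def arousal_at_A(inputs, f1_pattern, rho):
    """Net signal at 'A': excitation rho * |I| minus inhibition |X|."""
    excitation = rho * sum(inputs)
    inhibition = sum(f1_pattern)
    return excitation - inhibition

def a_fires(inputs, f1_pattern, rho):
    """'A' fires only when the scaled input outweighs the inhibition from F1."""
    return arousal_at_A(inputs, f1_pattern, rho) > 0

I = (1, 1, 0)  # has fur, purrs, does not bark
X = I          # before any top-down signal arrives, F1 simply mirrors the input

print(a_fires(I, X, rho=0.9))  # False: 0.9 * 2 is less than 2
```

Because rho is below one, ‘A’ can never fire while F1 mirrors the input in full. It can only fire once some of the F1 cells have been switched off, for example `a_fires(I, (1, 0, 0), rho=0.9)` is true.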
F1 sends a vector of signals up to F2, and that vector is multiplied by the weights leading to the category nodes in F2. The various category nodes in F2 compete, and one (or a few) cells win the competition. Now that the winning category nodes are firing, they send signals back down to F1, along their top-down vector of weights.
Now look at the above figure. Suppose the category doesn’t match the inputs well. In that case, when it sends signals back to F1, they don’t reinforce the inputs all that well. In the above image, the gray triangle in the F1 box represents the neurons that are reinforced. The others do not fire. If the ‘vigilance’ is low, then inhibition might still win out at ‘A’. The inhibition is coming from F1 toward ‘A’ (sideways in the figure) and the inputs aren’t powerful enough to cancel the inhibition. In this case, the category, even though it isn’t a great fit for the inputs, will ‘resonate’ with the inputs. It will reinforce some of them, and some of them will reinforce it, and learning will occur. The weight vectors will approach the input vector.
Suppose the vigilance is high, or the mismatch is high. Then ‘A’ will not be inhibited enough to counteract the excitation coming from the inputs ‘I’. So ‘A’ will send a nonspecific arousal signal to F2. When that happens, all the category cells in F2 get a signal. The category that has been active so far now will be suppressed (I’ll explain why below but think of it as exhaustion at the connections leading from F1 to the category in F2). This gives other categories a relative boost, and one of them wins the competition and has a chance to learn the pattern ‘I’.
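Here is a rough sketch of this match-then-reset search for binary patterns. It rests on a few assumptions of mine: the pattern left in F1 is the cell-by-cell AND of the input and the category’s expectation, the order in which categories win is handed in as a list instead of being computed, and the names `matched_pattern` and `search` are hypothetical.

```python
# Hypothetical top-down expectations over (has fur, purrs, barks).
top_down = {
    "cat": (1, 1, 0),
    "dog": (1, 0, 1),
}

def matched_pattern(inputs, expectation):
    """F1 cells that stay on: excited from below AND from above."""
    return tuple(i & e for i, e in zip(inputs, expectation))

def search(inputs, order, rho):
    """Try categories in the order they win; reset any that fail the vigilance test."""
    for name in order:
        x = matched_pattern(inputs, top_down[name])
        if rho * sum(inputs) > sum(x):
            continue  # 'A' fires: this category is reset and the next one gets a turn
        return name, x  # resonance: this category may now learn
    return None, inputs  # nothing resonated; a fresh category would be needed

dog_input = (1, 0, 1)
print(search(dog_input, ["cat", "dog"], rho=0.8))  # ('dog', (1, 0, 1))
print(search(dog_input, ["cat", "dog"], rho=0.4))  # ('cat', (1, 0, 0))
```

With high vigilance the poorly matching ‘cat’ node is reset and ‘dog’ resonates. With low vigilance the same mismatch is tolerated, and ‘cat’ would go on to learn from a dog, which is the low-vigilance case described earlier.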
In the above diagram, a new category (Y) has won, and input pattern I has reinstated pattern ‘X’ in the F1 layer, since no top-down signals are combining with X yet. So the new category has a chance to learn the new pattern. Y might come from an existing category cell that had already learned a pattern, or it could come from a cell that simply has large weights but hasn’t learned to categorize anything yet.
I mentioned that the arousal signal from ‘A’ suppresses the category that was already active. The reason is that over time, the signal from F1 has less effect in exciting its category, perhaps due to the axons from F1 getting fatigued from constant firing. This is known as ‘habituation’. F1 is not firing other categories at this point, so the axons feeding those categories don’t get habituated. In addition, each category cell is paired off with an ‘OFF’ cell, by a link of mutual inhibition. The ‘OFF’ cells get fed by the same nonspecific arousal signal from ‘A’ that the category cells get. When an ‘ON’ category cell gets fatigued, the ‘OFF’ cells rebound in intensity and suppress the fatigued ‘ON’ cell.
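To get a feel for habituation, here is a toy simulation. It is only a sketch under my own assumptions: each pathway has a ‘transmitter’ level z that recovers slowly toward 1 and is depleted in proportion to the signal passing through it, and every constant below is hypothetical. It covers the fatigue part only and leaves out the ON/OFF cell pairs.

```python
def habituate(signal, steps, z=1.0, recovery=0.05, depletion=0.5, dt=0.1):
    """Euler-integrate a simple gate: dz/dt = recovery * (1 - z) - depletion * signal * z."""
    for _ in range(steps):
        z += dt * (recovery * (1.0 - z) - depletion * signal * z)
    return z

z_active = habituate(signal=1.0, steps=200)  # pathway into the category that kept winning
z_idle = habituate(signal=0.0, steps=200)    # pathway into a category that stayed silent

arousal = 1.0  # the same nonspecific signal reaches both pathways
print(f"fatigued pathway passes {arousal * z_active:.3f}")
print(f"rested pathway passes   {arousal * z_idle:.3f}")
```

The same arousal signal gets through the rested pathway at full strength but is heavily attenuated on the fatigued one, which is what tips the competition toward a different category.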
The category node learns critical features for a prototype ‘dog’ or ‘cat’. A coarse category node, ‘mammal’, might have fewer features because mammals have fewer features in common than cats do. A node for a very narrow category might stand for your pet cat, with its coloration and so forth. A high vigilance leads to narrow categories. Professor Grossberg theorizes that autistic people might have a high vigilance.
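The link between vigilance and category width shows up even in a toy version of the algorithm. The sketch below is a stripped-down, ART1-style clusterer for binary patterns. Its details are assumptions drawn from the usual textbook description of ART1 rather than from the chapter: candidates are ranked by overlap divided by (0.5 + prototype size), and a resonating prototype is replaced by its AND with the input. The fourth feature, ‘striped’, and the three sample animals are made up.

```python
def overlap(a, b):
    """Cell-by-cell AND of two binary patterns."""
    return tuple(x & y for x, y in zip(a, b))

def art1_cluster(patterns, rho):
    """Return the prototypes learned from binary patterns at vigilance rho."""
    prototypes = []
    for p in patterns:
        # Rank existing categories by an assumed choice function (best first).
        ranked = sorted(
            range(len(prototypes)),
            key=lambda j: -sum(overlap(p, prototypes[j])) / (0.5 + sum(prototypes[j])),
        )
        for j in ranked:
            x = overlap(p, prototypes[j])
            if sum(x) >= rho * sum(p):  # vigilance test passed: resonance
                prototypes[j] = x       # keep only the shared (critical) features
                break
        else:
            prototypes.append(tuple(p))  # every candidate was reset: new category
    return prototypes

# Features: (has fur, purrs, barks, striped). Illustrative data only.
patterns = [(1, 1, 0, 0), (1, 1, 0, 1), (1, 0, 1, 0)]  # plain cat, striped cat, dog

for rho in (0.3, 0.6, 0.9):
    print(rho, art1_cluster(patterns, rho))
```

At vigilance 0.3 everything collapses into one coarse ‘has fur’ category, at 0.6 the two cats share a category and the dog gets its own, and at 0.9 even the striped cat is split off from the plain one.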
He also has some ideas on how hallucinations come about:
Normally the category node doesn’t fire the feature nodes by itself – rather it modulates them. Its signal is below threshold to fire them. However, if for some reason the category node signal is boosted, the feature cells could fire, and you could experience entities that do not exist.
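The difference between modulating and firing can be sketched with a single feature cell. Everything numeric here is a hypothetical choice of mine: an active category adds excitation to the features it expects and a blanket inhibition of 1.0 to all features, the cell fires when its net input exceeds 0.5, and `top_down_gain` stands in for whatever boosts the category signal.

```python
def feature_fires(bottom_up, expected, category_active=True, top_down_gain=1.0):
    """Return True if one feature cell fires, given its inputs from below and above."""
    excitation = bottom_up
    inhibition = 0.0
    if category_active:
        excitation += top_down_gain * expected  # specific top-down excitation
        inhibition = 1.0                        # nonspecific top-down inhibition
    return excitation - inhibition > 0.5        # assumed firing threshold

print(feature_fires(1, 1))                     # True: input and expectation agree
print(feature_fires(1, 0))                     # False: input is not expected
print(feature_fires(0, 1))                     # False: expectation alone only modulates
print(feature_fires(0, 1, top_down_gain=2.0))  # True: boosted expectation fires with no input
```

The last line is the interesting one: once the top-down signal is boosted past the balance point, the feature cell fires with nothing coming in from below.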
Grossberg thinks that the basal ganglia can send a signal that converts a modulating category into a firing category. The results might not be hallucinations; they might be imagination, as when you imagine a plan of action. It is only when the basal ganglia get out of control that you could have experiences that are hallucinations. ART also clarifies why unstructured auditory inputs, such as white noise (a noise that has equal power at all frequencies), can increase the severity of hallucinations, while listening to speech or music helps to reduce them. The top-down expectation would be a sound made up of only certain frequencies, but the white noise supplies some bottom-up counterpart at those frequencies (as well as at every other frequency), and so those frequencies are enhanced, and the patient has a stronger hallucination. In the case of music, however, the bottom-up signal would not match the top-down expectation, and so a reset happens, and the music is heard rather than the hallucination.
Grossberg writes: “This explanation also clarifies why auditory hallucinations may be more frequent than visual hallucinations. We actively receive bottom-up visual inputs whenever our eyes are open in a room that is not totally dark. Thus, there is a continual stream of visual inputs to reset top-down visual expectations that might otherwise cause hallucinations in a schizophrenic. In contrast, we can often find ourselves in a quiet place.”
One advantage of ART over deep learning is that, in the case of ‘fuzzy ART’, you can read out ‘fuzzy logic rules’ to understand how it arrives at its answers. You can’t readily do that with ‘deep learning’.
The book “Conscious Mind Resonant Brain” has more theories and suggestions. (I do think you sometimes have to read the references to really understand the chapters.) It’s a very worthwhile book for people interested in how the brain works.