November 19, 2004
Jeff Hawkins's book, On Intelligence, published by Henry Holt and Company, New York (2004), provides a new, and perhaps the first, comprehensive theory of intelligence based on brain-function research and on the failures of artificial intelligence efforts. The book summarizes significant research of the past few decades. Hawkins calls his theory of intelligence, as implemented in human brains, the memory-prediction framework model. As the inventor of the Palm Pilot and the developer of its cursive input language, Graffiti, he has spent the last twenty-five years deeply interested in the two areas that contribute to his ground-breaking new theory: how the human brain works, and mobile computing. As artificial intelligence (AI) efforts to process handwriting proved more and more elusive, Hawkins's studies of human brain function and intelligence led him to develop Graffiti as a stop-gap solution. It is extremely easy for people to learn cursive strokes that represent letters, but it has proven nearly impossible for AI to process handwriting in any sequential input manner. The memory-prediction framework model of intelligence is a direct result of the cross-fertilization of his brain research and his computing experience. The theory is elegant, comprehensive, and surprisingly simple. Its basic principles should be relatively easy for the intelligent layperson to grasp.
Intelligence, in this theory, is a function of comprehensive, adaptable memory. Information travels very slowly in neurons - from 1.4 to 200 miles per hour. Even massive parallel processing cannot perform the computations necessary to "think out" solutions quickly enough with such slow organic computing hardware. The memory-prediction framework model holds that intelligent responses to our environment are mainly remembered experiences, quickly recalled to predict what's coming and modified for the immediate situation. We respond intelligently to our environment because we are using our experiences to predict what's coming - not what's coming next week, but the next word, the next note in the song, the next thing we will see. Our brain is constantly and continually using our memory of past experiences to anticipate what we will experience next. Minimal effort and awareness are needed to handle anything that is close to our previous experiences; only sharp deviations from those experiences require our full attention. We are well adapted to our environment because our brains build a comprehensive model of it, a model that enables us to accurately predict what we will experience in most situations - from the immediate few seconds (what word, note, or touch is coming next) to whole scenarios of common experiences. With a familiar song, we know the next note coming. When we go to a restaurant, we can predict the sequence of events that will happen. The details may vary somewhat, but prediction using the memory of experiences is what enables a slow organic computer to be "on top of the situation" when it arrives. The theory accounts for a great deal of the research results, as well as for why so many AI efforts have been fraught with problems and disappointment. In education alone, the memory-prediction framework model will institute a paradigm shift not seen since Piaget's developmental approach.
Construction area: notes for relating to general semantics
Don't count on the following remaining unchanged or even here. Do not cite or quote without consulting with me first.
Included in this summary are findings that need to be incorporated into general semantics as an ongoing, self-correcting, and continually updating system.
Pages 96-97 state:
"Science is an exercise in prediction. We advance our knowledge of the world through a process of hypothesis and testing."
That we do this on an individual neurological and neurolinguistic basis has been the claim and subject of Korzybski's general semantics since 1933 and before. Hawkins presents a theory of brain function that directly explains how abstracting and consciousness of abstracting can work.
This book presents significant information about brain function that supplements and sheds light on general semantics. Its main thesis is essentially consistent with abstracting; however, the theory shows that both abstracting and identification are essential to intelligent brain function. The main theory is based on research showing that the brain is organized hierarchically - that is, in levels of abstracting.
Memory-prediction framework | general semantics
"names" | abstraction and identification
hierarchical organization | levels of abstracting
prediction-anticipation | non-verbal "assumptions"
same internal representation from different senses | non-elementalisms
recalling past experiences | semantic reactions
consciousness as memory | consciousness of abstracting
prediction | projection
It's a very readable book that does not require a technical background to understand the main thrust.
It should be required reading for any would-be general semanticist, as well as for anyone interested in intelligence.
It is also the philosophical approach to knowledge more generally known today as evolutionary epistemology.
On Intelligence (top of page 20 - Searle's Chinese room): This analogy is "seduced" by the Cartesian homunculus metaphor. The "CPU" (and the man inside the Chinese room) corresponds to Descartes' "homunculus". It is the orientation of looking for the "essence" as a single entity that embodies "understanding". The idea is, put simply, ludicrous, because it leads to infinite regress. We are also seduced by the notion that "intelligence" is not something that can be understood, so that any process we do understand must not be "it". If you argue that the Chinese room does not understand the story, then you must conclude that neither does a person: the person simply responds to inputs, using his or her experiences, to produce a new output. This attitude is an example of extreme reductionism.
On page 125, "One of the most important concepts in this book is that the cortex's hierarchical structure stores a model of the hierarchical structure of the real world. The real world's nested structure is mirrored by the nested structure of your cortex."
There is no "hierarchical structure" in the "real world", because any groupings of any kind are simply the work of our abstraction process. Just as Hawkins described earlier in the book that, on a close look, the edge of a log and the bank were indistinguishable, the same is true for any putative "structure" in the "real" world. The "matter" of which the universe is made has no structure other than that which we impose on it. The "real world" is organized hierarchically only by the instrument (our brains) observing it. That is not to say that we can't experience predictable regularity among our abstractions. But to infer that distinctions made by an abstraction process "actually exist" in the substratum is to be guilty of the Aristotelian view that there are essences. While the view that the structures do not exist may seem like nihilism, it does not deny that agents can assign values to their responses; it only acknowledges that there is no way to obtain direct knowledge of any structure other than by interpreting the responses of an observing medium or instrument. Applying Occam's razor shears away any postulate that the distinctions made by the instrumentation derive not from the structure of the instrument but from some pre-existing structure. It is more appropriate to assume only an amorphous, undifferentiated source for our reactions. Anything else is a matter of faith, not science. Just as there is no intrinsic value, there is no intrinsic structure. The values, and the structures, we attribute to the "what is going on" are projections - predictive interpretations "imposed" on reality by our efforts to make sense of it based on our prior responses.
In general semantics, Stuart Mayper and Bob Pula used to refer to the event level as the "mad dance of electrons", but even "electrons" are structures we impose, by inference, on the WIGO (what is going on).
Korzybski has said that general semantics may be completely superseded in time. With the publication of On Intelligence, a major foundation of general semantics is in need of significant revision.
Korzybski presented general semantics as the answer to the general question of epistemology, "How do we know what we know?" Korzybski's answer: we know with our nervous systems and brains, abstracting in multi-level stages - from detecting and sensing, to abstracting characteristics into neurological levels, to abstracting into verbal levels, and continuing through higher and higher verbal levels to consciousness of abstracting. General semantics focuses almost exclusively on hierarchical abstracting in the "proper" order - from the objective low-level abstracting to the subjective high-level abstracting. In the information technology profession, when we write programs and design applications this way, we call it "bottom-up" processing: the analysis begins with the details and seeks to organize them into higher and higher structural relations. This is the natural-science method that yielded morphological classification systems involving species, genera, orders, classes, phyla, and finally kingdoms. Virtually no theory is present in general semantics that provides for the reverse process, "top-down" processing. In fact, general semantics has a strong, direct bias against this perspective. We are not supposed to "assume" what the result is going to be; we are supposed to observe objectively. This has been harped on again and again as the "reverse of the natural order". We are supposed to fight our tendency to "identify" across levels of abstracting. "Non-identification" has become a watchword among general semanticists as a desirable orientation. We are urged to "get to the silent level" in our thinking, to be more "objective", to be more "extensional". All of these put "top-down" processing in a "bad light" from the view of general semantics. This is about to change.
Jeff Hawkins's memory-prediction framework model of intelligence is based on research and observations of brain structure and function, and that research is beginning to deviate significantly from the theory that general semantics presents. For general semantics, abstraction in an inward (and upward) direction is the dominant principle. "Identification" is conceived of as "a bad thing". This is so much a part of general semantics that the current "politically correct" orientation is to completely eliminate the use of the "is of identity" from our speaking habits - to learn to speak using E-Prime, English without the verb 'to be'. General semantics does not even speak of memory in its basic presentation of the theory. I have been harping on this lack for literally decades. I have frequently stated that we abstract from our memory as well as from our sensory inputs - that we abstract into pre-existing structures. General semantics is, in this regard, an over-simplified theory of sensory input processing that is carried through a hypothesized transition from neurological to verbal levels and continues on. The emphasis is wholly on the input direction; it is virtually silent on any reverse direction. The structural differential chart sold by the Institute of General Semantics shows a single dotted arrow back to the event level, and Science and Sanity, page 471, shows some label tags tacked back to the event level from the verbal levels. These are interpreted in terms of the parochially named Sapir-Whorf hypothesis. Going "backwards" from higher levels of abstraction to lower levels is considered the "wrong" direction. The Sapir-Whorf hypothesis essentially holds that we perceive the world in terms of the language structures we use to describe it. It is also called the linguistic relativity hypothesis, as it was hypothesized centuries before Whorf made it well known.
The view that "different languages carve the world up in different ways" has as a common underlying theme the idea that language in general shapes how we perceive the world. In simple terms, we "see" what we have words for. Because Whorf figures prominently in general semantics, it should be relatively easy to expand the importance of this view to parallel the importance of abstracting. The structure of our language can be thought of as the "destination" for the abstraction process. At general semantics seminars Don Kerr always emphasized that each of us brings our own "experiential" (non-verbal) elements to any symbol. As I have said many times before, "We abstract into our pre-existing linguistic categories," and thus understand our experiences in terms of those categories. To discuss the massive influence of those pre-existing linguistic categories, built up by our experiences in the world as we grow up, we need some kind of model or metaphor. I will construct one using the relatively staid and static general semantics device, the structural differential, together with something else very dynamic and lively. That something will need to reach down the levels of abstraction from the high levels into the lower levels, and it will need to direct and guide the inputs into the "proper" category structures. It will also need to be able to "grab" onto something unique and new and quickly bring it to the highest levels of our consciousness.
Let's turn the structural differential upside down, so that the "up" abstraction direction corresponds with the physical "up" direction, as Hayakawa did when he presented his version, called the abstraction ladder. Superimpose on it an octopus with dangling tentacles hanging down into the lower levels of abstraction. Our sensory inputs travel up through successive levels of abstracting until they are caught by the dangling verbal tentacles of the octopus, each arm representing some pre-existing linguistic category. The dangling arms divide and guide the input process at every step of the way. The model that general semantics presents entails an octopus that is virtually asleep: nothing happens to its arms except once in a great while, when something very new is learned that moves or divides one of them. In walks Jeff Hawkins, and the octopus wakes up and becomes very active. It is constantly wiggling its arms, poking here and there, looking for everything it remembers about this place (where the organism is now) and everything it remembers about similar places.
The memory-prediction framework model holds all the memories in the octopus's brain. It is constantly thinking about and predicting what is coming next. It is wiggling and vibrating its arms at every step of the way, right down to very low levels, to guide the incoming sensory system in what to expect next: "I remember this was here; look for it there. If you find it there, don't bother me about it, but let me know if it's not where it's supposed to be." Sing the opening verse of "Joy to the World". The octopus is using memory to tweak the hearing part of the brain to "prepare" the nerves that will experience the next lower note in the sequence. When it arrives, these nerves say "thank goodness" and go back to sleep, because everything is proceeding according to plan. But let a different note be heard - one that was NOT expected - and these nerves wake up and shake the octopus's dangling arms to get its attention: "Something is WRONG; it's not happening the way you said it would." In the case that the unexpected note makes sense because it fits into a different melody, we have the makings of a musical joke. Another arm of the octopus has said, at a higher level, "I remember this too; look for it." Cognitive restructuring, as this is called, shows that we can quickly switch between two understandable interpretations when we suddenly discover that the more probable interpretation was not the one that is arriving. The only joke I can remember goes like this: "I took the ferry to Staten Island. And on the way, I found out that he was a very nice fellow." Since "ferry" is much more commonly used to refer to a boat, we naturally expect the rest of the story to be about the boat, but when we get to "he" we suddenly realize that the less common (and now politically incorrect) usage as a slang term for a homosexual person is what was originally meant. If our mental processing were as the general semantics theory of abstraction says it is, then we should not have been surprised.
We should be evaluating the words as they arrive, retaining ambiguities until all the data is in (a "delayed reaction"). The memory-prediction framework model, however, states that we are continually and actively predicting what we will experience next, right down to the lowest level of abstracting. We can do this because we have a comprehensive memory of all the experiences we have previously had.
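The note-by-note expectation just described can be sketched as a toy program. This is my own illustration, not code from the book: we memorize a short melody, then "listen" to incoming notes and raise an alarm only when a note violates the remembered prediction.

```python
# Toy memory-based predictor (an illustrative sketch of the idea).
def learn(sequence):
    """Remember, for each note, the note that followed it."""
    return {a: b for a, b in zip(sequence, sequence[1:])}

def listen(memory, heard):
    """Compare each incoming note against the remembered prediction."""
    alarms = []
    for prev, note in zip(heard, heard[1:]):
        expected = memory.get(prev)
        if expected is not None and expected != note:
            alarms.append((prev, expected, note))  # prediction violated
    return alarms

# Opening of "Joy to the World": a descending major scale.
joy = ["C", "B", "A", "G", "F", "E", "D", "C"]
memory = learn(joy)
print(listen(memory, joy))                        # [] - all as expected
print(listen(memory, ["C", "B", "A", "G", "A"]))  # alarm: expected F after G
```

When every note matches, nothing propagates upward; only the surprise after "G" demands attention - which is the essence of the memory-prediction story.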
How can having a good memory allow us to "predict" what's coming next? It turns out that the most successful attempts to model some human brain processing involve what is called "auto-associative memory". Auto-associative memory works this way: if something is stored in it, then feeding a part of that something back in causes the memory to spit out the whole thing. Feeding "my address is" into my human auto-associative memory results in it sending back "my address is 191 White Oaks Road, Williamstown, MA, 01267-2259, USA, North America, Sol 3". Can I remember my paper route from 48 years ago? Sure. I just start with: my paper route starts at the drug store, where I pick up the papers at 5:45 A.M. Feeding that back, I get a memory of folding all the papers. Feeding that back, I get a memory of starting to walk the route. Down Main Street, across the bridge, turn left on River Street, one house on the left, three on the right, back to the hotel at the corner of Main and River Streets, across the street, cut across the lot to Maple Street, etc. Each time I think of a part of the sequence, the next part comes back. Can I tell you now the exact sequence of the final configuration? Not any more, because a thousand days have become rolled up into one primary sequence, and 48 years is a long time to not use a memory. I have a strong suspicion that if I were to go back and actually walk the route, much more detail would come back to me. One part won't work, though, because a new building now blocks the lot I used to cut across.
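A minimal sketch of such a pattern-completing memory, assuming a classic Hopfield-style network (my illustration; Hawkins discusses the concept, not this particular code): one pattern is stored in a Hebbian weight matrix, and feeding in a fragment recovers the whole.

```python
import numpy as np

def train(patterns):
    """Hebbian weight matrix for an auto-associative (Hopfield-style) memory."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)
    np.fill_diagonal(W, 0)  # no self-connections
    return W / n

def recall(W, cue, steps=10):
    """Iteratively complete a partial or corrupted +/-1 pattern."""
    s = cue.copy()
    for _ in range(steps):
        s = np.where(W @ s >= 0, 1, -1)
    return s

stored = np.array([[1, -1, 1, 1, -1, -1, 1, -1]])  # one 8-element pattern
W = train(stored)
cue = stored[0].copy()
cue[4:] = 1                  # wipe out the second half of the pattern
print(recall(W, cue))        # the full stored pattern comes back
```

Submitting half the pattern is enough for the network to settle back into the complete stored memory, just as "my address is" brings back the whole address.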
Memory training schools teach us to associate unconnected lists with sequences we already know. Bring up the known sequence, and all the associated items will be right there. When I can't remember somebody's name, I begin to recite the alphabet to myself, sounding out the letters. Frequently I am rewarded with the person's name popping up when I get to the appropriate letter or sound. Bingo! Adding just a little bit of linguistic data to the input pattern - my vague recollection - makes the submitted pattern just a little bigger - more data - and that causes the auto-associative memory to "click" and send back to the conscious levels the rest of the information. It works, and it works very well. Neural net researchers have built auto-associative networks with as few as four layers of neurons. These networks can be trained to recognize patterns, and, when they are presented with a part of a pattern, they respond with the entire pattern. The human cortex has six functional layers. It is also estimated that we have about 30 billion cells; each layer is more than one cell thick, and the thickness varies from functional area to functional area.
A main characteristic of auto-associative networks is that they contain lots of feedback connections, and these work in the learning phase as well as in the recognition phase. If one draws a parallel with my structural differential and octopus analogy, the octopus arms correspond to the processing in the feedback connections, while the structural differential represents the normal forward (inward and upward) flow of responding. When one first wakes up in the morning and opens one's eyes, one already expects to see what one remembers from the night before, and previous nights, etc., but if one wakes up in a totally different place, one can feel very disoriented. My very first morning away from home after graduating from high school began in a barracks at Navy boot camp, at about 0530, to the sound of a Coke bottle being run around the inside of a large, empty, zinc-plated garbage can. It was a horrendous noise. We had arrived very late, after dark, exhausted from a day of input processing in Albany, a late charter flight to Chicago, and a bus trip to the Great Lakes Naval Station boot camp. There had been no time to become familiar with our surroundings. When I burst out of the unfamiliar rack and stood up, I could not even see. I was totally disoriented. Noise like I had never heard before, a drill instructor barking commands, bright lights after too short a sleep. It took several seconds for my visual cortex to "boot up" and begin processing totally unfamiliar inputs. I saw a gray haze with unrecognizable multi-colored blobs that gradually took shape, not unlike the process of adjusting binoculars which had been totally out of focus. The octopus had been shocked into nearly complete immobility. I did not know what to expect, so I could not even understand what I was seeing. As memory came back - of where I was, how I had gotten here, where I put my things last night - I began to go with the flow.
Almost everything was new to me - the physical environment, the social environment, the cultural expectations, etc. - but all the little daily routines soon found a way to fit into this new physical and semantic environment. The cortex was rapidly learning how to predict what was coming next, and within two days the routine was established: reveille at 0600, shower, march to exercise, march to class, march to lunch, march to PT, march to more classes, march to supper, march to more PT and drill, march to the barracks, do laundry, maybe have some time to read or write, taps - lights out at 2200. The cortex adapted; prediction worked like clockwork. The disorientation and uncertainty disappeared as rapidly as new memories were acquired. Take me to the chow hall once, and I can find my way back there, following the same path as the first time. Each step of the way stimulates the octopus to recall the next step, and the feedback connections prepare the lower-level cells to expect the remembered experience. When it arrives: "no big deal, things are just as expected". But wait - this next corner looks exactly like the previous one; I can't really tell the difference, so my prediction circuits send an alarm up to the next level. That level is already looking at the plan, and the next higher level remembers that there are two corners that look the same, so it has programmed the intermediate level to be expecting a "same corner" alarm. In case you haven't really noticed, I'm talking as if I were walking around in the levels of the hierarchy of nervous processing, listening in on the "thoughts" of the anthropomorphized levels. It's a metaphor we use very often to explain the structure of something; we virtually "go there" and look around. But I digress.
I've described the process in terms of levels, with higher levels feeding information down the structure as to what to expect, and lower levels feeding information up the structure - but primarily when the received information does not match what the higher levels "told" the lower levels to expect. This leads me to the hierarchical nature of both the brain's organization and the structural differential. The main difference, however, is that general semantics describes abstracting as a one-way process going up the hierarchy, and that is beginning to look wrong as a result of the brain research of the ensuing 70 years, particularly the last 30.
So the neocortex is not like a computer, parallel or otherwise. Instead of computing answers to problems, the neocortex uses stored memories to solve problems and produce behavior. Computers have memory too, in the form of hard drives and memory chips; however, there are four attributes of neocortical memory that are fundamentally different from computer memory:
- The neocortex stores sequences of patterns.
- The neocortex recalls patterns auto-associatively.
- The neocortex stores patterns in an invariant form.
- The neocortex stores patterns in a hierarchy. (*)
Each "level of abstraction" in the neurological processing in the cortex involves 6 cortical layers of cells, and there are many levels to be processed before even getting close to linguistic levels. General semantics presents the most simplified and rudimentary model, far too abstract to be even remotely useful, because the general semantics model shows only one level for sensory processing (the interface between the event level and the object level), one level for neurological processing (the object level), and one level for neurolinguistic processing (the interface between the object level and the first verbal level). Others have occasionally proposed expanding this into a few more levels - Tom Nelson was one. Brain research has shown examples where edges are abstracted from raw data. Combinations of edges are abstracted above that to detect closed figures. Another level above that detects certain combinations of closed figures - very important for face recognition. As each level uses the six layers of neurons described earlier, the absolute minimum number of neurological levels of processing to prepare to recognize a face is already up to 21 levels - 22 including the retina - plus another 6 to distinguish among different faces, and it would require at the very least another 6 levels of neurons to identify a verbal name associated with a particular face. We've used up a minimum of 34 levels of neuron processing to get from the retina to a name, and it will take several more six-layer levels to exercise the motor circuitry to speak that name. How long can this take? The average time it takes a neuron to recharge after firing, so it can fire again, is about 5 milliseconds. 34 levels of processing will need 170 milliseconds - nearly two tenths of a second. Simple recognition tasks show a best speed of about one half second - 500 milliseconds - in psychological research. This is equivalent to what general semantics would call a very fast signal reaction.
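The level-and-timing arithmetic can be checked in a few lines. The counts and the 5-millisecond recharge figure are this essay's working assumptions, not measured constants:

```python
MS_PER_FIRING = 5            # essay's figure for neuron recharge time

# Level counts taken from the discussion above.
levels_to_face = 22          # retina through face-template recognition
distinguish_faces = 6        # telling one face from another
attach_name = 6              # associating a verbal name with the face

total_levels = levels_to_face + distinguish_faces + attach_name
print(total_levels)                   # 34 levels of neuron processing
print(total_levels * MS_PER_FIRING)   # 170 ms - nearly two tenths of a second
```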
500 milliseconds corresponds to a sequence of only 100 neurons firing one after another. We've seen that 34 (or more) of those firings are used up just in getting to recognize a person's name, and probably half again as many more to speak it. This input and output processing alone uses up around half of the available time for neurological processing of the shortest signal reactions. As each level of abstraction requires 6 stages of neurological processing, that leaves a maximum of about 8 levels of abstraction. In a standard computer, a binary search limited to 8 levels would go as follows.
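A binary search capped at 8 comparison levels can be sketched like this (my own illustration, not code from the book); 8 levels suffice to discriminate among at most 2**8 = 256 sorted items.

```python
def binary_search(items, target, max_levels=8):
    """Binary search that gives up after max_levels comparisons."""
    lo, hi = 0, len(items) - 1
    for level in range(1, max_levels + 1):
        if lo > hi:
            break
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid, level          # found: index and levels used
        elif items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, max_levels              # not found within the level budget

data = list(range(0, 512, 2))          # 256 sorted even numbers
print(binary_search(data, 34))         # found within the 8-level budget
print(binary_search(data, 33))         # absent item: budget exhausted
```

Each level halves the remaining candidates, so the budget of levels, not the size of the list, is what limits a sequential search - the point of the comparison with neural timing.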
Each additional quarter of a second could allow an eight-fold increase in search capability, but we can select the right response from hundreds of thousands of possibilities in much less time than would be required by sequential search techniques - even the fastest binary search algorithm. There just isn't enough time for any conventional computing model to achieve the results our brains do with a cascade of neurons only 100 steps deep. The model being suggested, the memory-prediction framework model, provides a way around this time limitation.
How can this work? Partial stimulation recalls
More later
This page was updated by Ralph Kenyon on 2013/06/08 at 10:21 and has been accessed 7497 times at 24 hits per month.