The Darwinian metaphor, to which Skinner was an early contributor, has been a commonplace for several years. Operant learning is seen as an interplay between response emission (variation) and reinforcement (selection). In applying his ideas to teaching, Skinner emphasized selection almost exclusively. But the real puzzle posed by non-rote learning, in both animals and humans, is not selection but the sources of variation that cause an action or an idea to appear for the first time. It is in this sense that Skinner’s whole discussion of teaching may have missed the point.
B. F. Skinner was much interested in teaching, although he himself was far from charismatic as a lecturer and, in his later years, spent rather little time with undergraduate students (at least, that was my experience as a teaching assistant in his large Nat. Sci. 114 class at Harvard). Nevertheless, he was quite sure about several points: First and most important, much can be learned from experiments with animals. Strategies that work best for the training of animals can and should be applied to the education of humans. He believed animal experiments to show that positive reinforcement is much better than punishment as a motivator. His errorless-learning experiments with Herb Terrace convinced him that learning without errors was possible. Since making mistakes is unpleasant, and aversive control is bad, he advocated programmed instruction, which he designed so as to eliminate errors, as the teaching method of choice.
But do Skinner’s claims about how best to teach people, especially intelligent people learning difficult things, in fact follow from what we know about behavior analysis as a science? I don’t think so, and as evidence, I offer first a couple of anecdotes. One is about the abilities of an animal, the other about the learning of a schoolboy. They raise obvious questions: Do the most striking examples of animal intelligence in fact show the effect of the kind of training that Skinner advocated? Do the greatest examples of human education exemplify the effects of exclusive positive reinforcement and errorless training?
On February 22, 1818, Blackwood's Magazine published a letter from the ‘Shepherd Poet’ James Hogg which recounted an extraordinary feat of animal intelligence. Hogg wrote about his dog Sirrah and an experience he had when 700 lambs, newly separated from their dams, escaped at midnight on to the Scottish moor. As Hogg began to search, he could not see his dog in the dark, but spoke and whistled to him nonetheless. He and a companion looked for the lambs until daybreak. Failing to find them, or the dog, they concluded that they must return to their master and tell him his whole flock of lambs was lost. But then:
On our way home, however, we discovered a body of lambs at the bottom of a deep ravine...and the indefatigable Sirrah standing in front of them, looking all around for some relief, but still standing true to his charge...When we first came in view of them, we concluded that it was one of the divisions of the lambs...But what was our astonishment, when we discovered that not one lamb of the whole flock was wanting! How had he got all the divisions collected in the dark is beyond my comprehension. The charge was left entirely to himself from midnight until the rising of the sun; and if all the shepherds in the Forest had been there to have assisted him, they could not have effected it with greater propriety.
There are two methods to train a dog. The quickest, least dependent on individual aptitude, and most obviously related to Skinnerian methods, is clicker training. A clicker is sounded every time the dog gets a little “treat.” He will associate the clicker with reward, and pretty soon the sound of the clicker itself works as a reward, just so long as the clicker-treat pairing is occasionally maintained. Even when he cannot give a treat, the owner can now sound the clicker whenever the animal does whatever is required of him: sit, stay, beg or whatever. If, as is usually the case, the beast fails to show the correct behavior full-blown on his own, he can be rewarded for approximations, until the desired behavior does come about and can be rewarded.
This method of training is called “shaping by successive approximations.” It is the method used by circus trainers and the contestants on Animal Stars: we have all noticed the little bit of fish slipped to the dolphin, the treat given to Fifi after each trick. It is effective and reliable, especially if what is to be taught is well-defined and predictable. It is the method of choice for behavior analysts. It emphasizes reinforcement, which is to say selection.
But there is another method to train a dog. It relies much more on the animal’s instincts and on his relationship to his owner. Dog trainers often say “the dog wants to please his owner,” and there is some truth to it. More important, the dog is a social creature. The owner, if he behaves properly, will become the “alpha male” (or female: dogs are not sexist). “Positive reinforcement” is still involved, but now the reinforcement is primarily social. Moreover, the dog will behave in a different and more interesting way if he perceives his owner as a fellow creature rather than simply as a source of food. The emphasis now is not on reinforcement, although of course there always is reinforcement, but on variation, on creating an environment where the animal will “show what he can do.” This is the approach used by shepherds to train their dogs, animals that already “know” what sheep are and instinctively herd them on first sight. The sheepdog loves to work the sheep and asks only to be shown the sheep and told what to do. Hogg’s wonderful dog Sirrah learned in this way and showed his versatility when circumstances demanded it:
[When I bought him, Sirrah] was scarcely then a year old, and knew so little of herding that he had never turned a sheep in his life; but as soon as he discovered it was his duty to do so I can never forget with what anxiety and eagerness he learned his different evolutions. He would try everywhere deliberately till he found out what I wanted him to do, and when once I made him understand a direction he never forgot or mistook it again. Well as I knew him, he often astonished me, for, when hard pressed in accomplishing the task he was put to, he had expedience at the moment that bespoke a great share of reasoning faculty. (my emphasis)
Yes, we can learn a lot about teaching from work with animals. But we have attended to only part of the story. Since the early 1950s, it is the first approach, treats, explicit positive reinforcement and “shaping,” rather than the second much-harder-to-define method, that has formed the “scientific” basis for education. It is this approach to teaching that was advocated most forcefully, and with the most elegant methods, by B. F. Skinner. It is the origin of “time-outs” as punishment (in lieu of swifter and more vigorous methods), of programmed instruction and of positive reinforcement as the major engine for behavioral change. It is also the basis for regarding teaching as training in a “skill,” like a trick to be taught to an animal. It treats students like dogs, and pretty dim ones at that — Odie rather than Lassie.
The training of Sirrah is an alternative approach. It means creating an environment in which the animal’s natural propensities (which, in an intelligent animal, go far beyond reflex response) can flower to their full extent. Not an easy thing to do, perhaps. Not something that can be reduced to the kind of algorithm represented by the law of effect.
A human illustration is beautifully described by Richard Dawkins in his moving account of ‘Sanderson of Oundle’ — Oundle, a British ‘public’ school famous for its output of talent, and Sanderson, its headmaster early in the 20th century.
Sanderson’s hatred of any locked door which might stand between a boy and some worthwhile enthusiasm symbolised his whole attitude to education. A certain boy was so keen on a project he was working on that he used to steal out of the dormitory at 2 am to read in the (unlocked, of course) library. The Headmaster caught him there, and roared his terrible wrath for this breach of discipline (he had a famous temper and one of his maxims was, “Never punish except in anger”)... [The] boy himself tells the story.
“The thunderstorm passed. ‘And what are you reading, my boy, at this hour?’ I told him of the work that had taken possession of me, work for which the daytime was all too full. Yes, yes, he understood that. He looked over the notes I had been taking and they set his mind going. He sat down beside me to read them. They dealt with the development of metallurgical processes, and he began to talk to me of discovery and the values of discovery, the incessant reaching out of men towards knowledge and power, the significance of this desire to know and make and what we in the school were doing in that process. We talked, he talked for nearly an hour in that still nocturnal room. It was one of the greatest, most formative hours in my life... ‘Go back to bed, my boy. We must find some time for you in the day for this’.”
Dawkins adds, “That story brings me close to tears…”
This story, like the one about Sirrah, shows a kind of creativity in teaching and a kind of spontaneous flowering in learning that seems to lie quite outside the rhetoric of “successive approximations” and the teaching of tricks. Sanderson’s pupil was not “shaped” to show an interest in metallurgy. Undoubtedly he had felt Sanderson’s ire for past errors, as he felt it now for breaking the school rules. And yet, under Sanderson’s tutelage, in the environment Sanderson had created, he developed a passionate interest in learning of the kind we should all love to see in our own students.
But are these examples fair criticism? Behavior analysts will object that I am merely countering science with anecdote. Isn’t this just the anthropomorphism of George Romanes and your grandmother warmed over? I don’t think so. To explain why, we need to go back to what the science really is.
Skinner made at least two great discoveries in his analysis of operant behavior. One was hardly original at all; yet it is the one for which he has gotten the greatest credit — and which he himself thought the most important, namely the principle of reinforcement. But humanity knew about carrots (although they were usually paired with sticks) for countless generations before Skinner came along. And even the scientific version of reward was experimentally demonstrated by Thorndike, some time before The Behavior of Organisms.
So I think that Skinner’s second contribution is more important than the reinforcement principle but, because it is still not fully understood, it has received much less attention. It is the idea that operant behavior is emitted; that it is essentially spontaneous, at least on first occurrence. Years ago I compared this dichotomy — emitted behavior selected by reinforcement — to the Darwinian idea of selection and variation (Staddon & Simmelhag, 1971). Variation was Darwin’s term for the then-unknown processes that produced variants (variant phenotypes as we would now call them) from which natural selection would pick the winners. In similar fashion, the processes that govern the emission of operant behavior produce an initial repertoire from which reinforcement can then select (see Catania & Harnad, 1988, for a selection of articles on the Darwinian theme in operant conditioning).
There are of course many differences between Darwinian selection in phylogeny and selection by reinforcement during ontogeny. Behavioral variation (unlike much, but not all, genetic/developmental variation) is far from random. But the most striking difference is that presentation of reinforcement by itself changes the repertoire; not just by selecting from what is available, but also by changing, usually enlarging, the set of behaviors that make up the repertoire itself. It is as if the operation of selection by itself were to change the range of genotypes. Darwin thought that natural selection worked this way (although he understood nothing of genotypes, of course), when he summarized the sources of variability in the last paragraph of the Origin: “Variability from the indirect and direct action of the conditions of life, and from use and disuse…”
We know now that genetic variation is essentially independent of “the conditions of life” and “use and disuse.” But the same is not true of behavioral variability: “use and disuse” is just habit, which certainly affects behavior. And as for the “conditions of life,” what are they in the Darwinian metaphor for operant conditioning? Well, I have suggested a few candidates, generalization and Pavlovian conditioning, for example, in my book The New Behaviorism and earlier articles, but the fact is that we really know very little about the “conditions of life” that produce the kind of behavior shown by Sanderson’s pupil at Oundle, or by the dog Sirrah.
The education establishment, simpleminded as usual, has an obsession with “self-esteem,” which is a crude way of addressing the “variation” issue. If a pupil has high self-esteem, we might expect him to be more willing to try out alternatives and be creative. But, of course, “self-esteem” can as well lead to smugness and self-satisfaction. It is a poor proxy for the kind of behavioral variation induced by the very best teachers.
All we can be sure of is that the causation involved in generating effective behavior in challenging situations is complex, involving both nature and nurture in an uncertain mix. But three things seem clear. That there are processes in creative teaching that are understood in an intuitive way by our great teachers, like Sanderson of Oundle and the Shepherd Poet. That the Darwinian framework for behavior analysis points to the fact that processes of variation exist, even though they have been sorely neglected in favor of an almost exclusive focus on reinforcement and selection. And that behavior analysts need to take time out from pressing the “reinforcement” lever, and look around for those sources of variation that yield the most exciting kinds of teaching. Such a change of direction would not be an abandonment of behavior analysis. It would mean only opening a door that has been closed for too long.
Catania, A. C., & Harnad, S. (Eds.). (1988). The selection of behavior: the operant behaviorism of B. F. Skinner. New York: Cambridge University Press.
Skinner, B. F. (1938). The behavior of organisms. New York: Appleton-Century.
Skinner, B. F. (1966). The phylogeny and ontogeny of behavior. Science, 153, 1205-1213.
Staddon, J. E. R., & Simmelhag, V. L. (1971). The "superstition" experiment: a reexamination of its implications for the principles of adaptive behavior. Psychological Review, 78, 3-43.
 The Guardian, Saturday July 6, 2002