ClickerSolutions Training Articles

A Clicker Training Primer

The following article was written for the Curly Coated Retriever Club of America's newsletter "The Commentator." It originally appeared in the July 2002 edition.

I’ve agreed to write a series of articles for the “Commentator” introducing the concepts and philosophies of clicker training. In upcoming issues, I’ll debunk some common myths about clicker training, explain how to get behavior (and keep it once you’ve got it), discuss how to deal with unwanted or problem behavior, and delve into the intricacies of reliability, fluency, and training plans. In this first article I’m going to define clicker training, introduce some common terms, and explain how it relates to operant conditioning.

Clicker Training Defined

Believe it or not, it’s not easy to define clicker training. There is no “official” definition, and so one trainer might define it quite differently from another. People have used the clicker as a marker within an otherwise-traditional program, as an aversive to mark (or punish) undesired behavior, as a cue for attention, and as a cue for a recall. Though all of those are valid uses of the tin noisemaker called a clicker, none of them are examples of “clicker training.” Clicker training is both a technology for training animals and a training philosophy.

As a technology, clicker training relies on positive reinforcement to make it more likely the dog will repeat desired behavior in the future. Two things, however, make clicker training unique.

First, its practitioners emphasize the science underlying the method. Clicker training is based on the principles of operant and classical conditioning. Traditional trainers often argue that their methods can also be explained using the terminology of operant conditioning—and that’s true. Clicker trainers, however, aren’t just using the terminology to explain results. They’re actively applying the principles before and during training. This makes clicker training more than a method, more than a set of step-by-step recipes to get behavior. Clicker trainers who learn the underlying principles have at their disposal a powerful set of tools that enable them to analyze behaviors, modify existing methods for individual animals, and create new methods where none previously existed.

Second, as the trainer you use a marker signal, the clicker, to tell the animal when he does what you want. The clicker is like a camera “taking a picture” of the behavior you are training for.

The technology is, at its core, very simple:

  1. Get the behavior.
  2. Mark the behavior.
  3. Reinforce the behavior.

For example, say you want to teach your dog to sit. When he sits, you click. Then you give him a bite of his favorite treat. The click means “That behavior right there! That’s what I want!” and “You’ve earned a reward.” If you click and reinforce every time your dog sits, he will soon figure out that sitting earns a treat and offer the sit more often. You can then add a cue, “sit,” to tell him when you want him to do the behavior.

More importantly, clicker training is not just a matter of using a clicker to train your dog. It's a different way of thinking, a way of relating to animals that creates a partnership that is mutually reinforcing and pleasurable. As a philosophy, clicker training has evolved from the works and ideas of Karen Pryor, Jean Donaldson, Bob and Marian Bailey, Turid Rugaas, Murray Sidman, and others who believe it's possible to train a dog (or raise a family, or live a successful life) using the principle of positive reinforcement instead of coercion or force.

The Link to Operant Conditioning

As mentioned above, clicker training is based on the principles of operant conditioning. In the video “Patient Like the Chipmunks,” Bob Bailey defines operant conditioning as both the science of explaining behavior and the powerful technology of changing it. The principles of operant conditioning describe how animals learn. When trainers use operant conditioning, they apply the principles to obtain the results they want. Operant conditioning breaks learning into three parts:

  • the stimulus that elicits behavior,
  • the actual behavior the animal does,
  • the consequence that occurs as a result of the behavior.

According to this theoretical framework, the consequence of a behavior determines whether it will be repeated or not in the future. If the consequence strengthens a behavior—causes it to occur more frequently—we say the behavior has been reinforced. Clicker trainers use positive reinforcement to teach new skills. On the other hand, behavior that leads to unpleasant consequences occurs less frequently. Punishment (as defined below) suppresses unwanted behaviors.

In either case, the consequence results from something being either added to (+) or taken away from (-) the environment. This leads us to the definitions of four key operant conditioning terms.

  • Positive reinforcement (R+) means adding something the animal will work for to strengthen (increase the frequency of) a behavior. For example, giving the dog a treat for sitting will increase the probability the dog will sit again.
  • Positive punishment (P+) means adding something the animal will work to avoid to suppress (lessen the frequency of) a behavior. Jerking on the lead to stop a dog from jumping on people is an example of P+ used to suppress the behavior of jumping. Other common examples of P+ include yelling, nose taps, spanking, electric shock, and assorted "booby traps."
  • Negative reinforcement (R-) means removing something the animal will work to avoid in order to strengthen (increase the frequency of) a behavior. An ear pinch, traditionally used to train the forced retrieve, is a classic example of R-. The trainer pinches the ear until the dog opens its mouth, whereupon the trainer inserts the dumbbell. To reinforce taking the dumbbell, the trainer then releases (removes) the ear pinch. R- requires that an aversive first be applied or threatened in order for it to be removed.
  • Negative punishment (P-) means taking away something the animal will work for to suppress (lessen the frequency of) a behavior. For example, a dog jumps on you to get attention. By turning your back or leaving the room you apply P- by removing the attention he wants.
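
For readers who find a lookup table easier to remember, the four definitions above can be sketched in a few lines of Python. This is only an illustrative sketch: the names Change, Effect, and classify are made up for this example and are not standard terminology. The one idea it encodes is the one stated above: the sign says whether something was added or taken away, and the letter says whether the behavior is strengthened or suppressed.

```python
from enum import Enum


class Change(Enum):
    """What happens to the environment after the behavior (hypothetical name)."""
    ADDED = "+"      # something is added
    REMOVED = "-"    # something is taken away


class Effect(Enum):
    """What happens to the behavior's future frequency (hypothetical name)."""
    STRENGTHENS = "R"   # reinforcement: the behavior occurs more often
    SUPPRESSES = "P"    # punishment: the behavior occurs less often


def classify(change: Change, effect: Effect) -> str:
    """Return the operant conditioning term for a change/effect pair."""
    sign = "positive" if change is Change.ADDED else "negative"
    kind = "reinforcement" if effect is Effect.STRENGTHENS else "punishment"
    return f"{sign} {kind} ({effect.value}{change.value})"


# The four examples from the definitions above.
print(classify(Change.ADDED, Effect.STRENGTHENS))    # treat given for sitting
print(classify(Change.ADDED, Effect.SUPPRESSES))     # lead jerked for jumping
print(classify(Change.REMOVED, Effect.STRENGTHENS))  # ear pinch released
print(classify(Change.REMOVED, Effect.SUPPRESSES))   # attention withdrawn
```

Note that nothing in the sketch asks whether the consequence is "good" or "bad." The term depends only on whether something was added or removed and on what then happens to the behavior.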

People commonly refer to the four principles of reinforcement and punishment as the “four quadrants of operant conditioning.” That phrase is misleading in two ways.

First, it implies that all four principles are equally weighted or of equal use in a training program. In reality, punishment—particularly positive punishment—has several drawbacks, some extreme, which make it inappropriate for most training issues. In addition, because an aversive must be applied or threatened before negative reinforcement can occur, negative reinforcement is also a poor choice.

Second, the quadrant description doesn’t include a fifth principle of operant conditioning, one that clicker training makes particular use of. This is the principle of extinction. With extinction, a behavior is weakened through the absence of any kind of reinforcement. For example, if no one answers your knock at a door, you will eventually stop knocking. If a dog can't reach a dog biscuit on the other side of a fence, it will eventually stop trying. Because extinction doesn’t have the drawbacks associated with punishment, clicker trainers use extinction to reduce or eliminate most unwanted behaviors.

A more accurate depiction of the relationship between the principles of operant conditioning and clicker training begins with the image of a pie. In clicker training, positive reinforcement is the largest piece, taking up perhaps two-thirds of the pie. The second largest piece is extinction. The third largest is negative punishment. Positive punishment and negative reinforcement are just two tiny slivers. The most important thing to note is that a complete, reliable training program can be composed entirely of positive reinforcement, extinction, and, to a far lesser extent, negative punishment.

Is it important to know these definitions? Yes, for two reasons.

First, it helps us understand each other much better. In everyday usage, the words “positive” and “negative” often mean good and bad. However, in operant conditioning and clicker training, they refer to something added or something taken away. “Punishment” is another word that carries strong connotations in everyday language, but in the context of operant conditioning, punishment means only that which suppresses the occurrence of a behavior.

Second, to clicker train without understanding the science makes clicker training nothing more than a cookbook full of recipes that may or may not work for your dog. Why? Because if you don’t understand the underlying behavioral principles, you can’t examine a training situation, determine why it is—or, more importantly, isn’t—working, and adjust for your particular dog.

Subset, Not Synonymous

Operant conditioning is based on five main principles, and all five are legitimate methods of changing behavior. Clicker training, however, does not make use of all five principles.

Karen Pryor, who coined the term “clicker training,” defines clicker training as a subset of the principles of operant conditioning, including only positive reinforcement, extinction, and to a much lesser extent, negative punishment. The late Marian Breland Bailey, who, with her first husband, Keller Breland, brought operant conditioning out of the laboratory and pioneered and perfected the use of event markers in training, supported this definition.

Negative reinforcement and positive punishment, though sometimes effective for changing behavior, have several possible drawbacks.

  • They are difficult to apply correctly.
  • They may have unexpected side effects, including fear and aggression.
  • They generalize easily—but often inappropriately.
  • They generally rely on fear, pain, or intimidation.
  • They inhibit the animal's willingness to offer behavior.

This last issue, inhibiting the animal's willingness to offer behavior, makes positive punishment and negative reinforcement most incompatible with clicker training. Clicker training can produce incredibly precise behaviors, but shaping these behaviors depends upon the dog's willingness to experiment, to offer a variety of responses, some right, some wrong. A dog that is punished for mistakes isn't going to be eager to try anything new.

Trainers who are new to clicker training often balk at the thought of giving up positive punishment and negative reinforcement. They equate the lack of physical aversives with the lack of consequences. It’s important to realize that no study anywhere has ever determined that positive punishment (or reinforcement) is inherently more effective than negative punishment (or reinforcement). By definition, all are effective, and within each is a continuum from mild to extreme. Reliability is also not an issue. Reliability is not related to method—it is a number, cold data, a percentage of correct trials. In upcoming issues I’ll explain more about how to apply clicker training as defined in this article to get precise, reliable behaviors, but first, in the next issue, I’ll debunk some common myths about clicker training.

Melissa Alexander
mca @ clickersolutions.com
copyright 2002 Melissa Alexander

 
