What Type Of Reinforcement Schedule Is A Slot Machine

We all know what it feels like to be tethered to technology these days. It’s not just our kids – it’s the rest of us as well. Probably one of the biggest hooks is the smartphone. As of 2016, there were about 2.1 billion smartphones worldwide. Considering there are about 7.6 billion people on the planet, that’s a LOT of smartphones! Then there’s social media. Facebook alone has over 2 billion users. Many people worry about smartphone addiction or screen addiction. We all know how we can get sucked into compulsively checking our smartphones, especially to text or use social media. Some people are “news junkies” who always seem to need their next news “fix.” In previous blogs, I discussed how we are drawn to screens to get our needs met. Also, I blogged about how both classical conditioning and supernormal stimuli can compel us to check our screens. There’s a third mechanism through which we get sucked into our devices: variable reinforcement, which is the topic of this blog.

Technology Isn’t Bad

As I’ve said before, technologies such as smartphones, video games, and social media aren’t inherently bad. They provide countless benefits. If they didn’t, we wouldn’t use them! We like to use various technologies because they can meet our psychological needs for relatedness, autonomy, and competence. In a sense, the many practical, entertainment, and social reasons for using smartphones ultimately involve getting our psychological needs met. However, a curious thing happens with our technologies. We start to use them in a compulsive way that often starts to resemble an addiction.

Smartphone Addiction

The way we feel the need to check our phones begins to seem like that smoker who starts jonesing for a cigarette. We just HAVE to check our smartphone. We begin to check our phones so frequently that it interferes with our relationships, productivity, and our safety. Why do we keep checking our smartphones so compulsively? Why is it so hard to disengage from them and leave them in our purses, pockets, or another room? One particular mechanism appears to be involved that can explain this powerful hook: the variable reinforcement schedule.

A Little About Reinforcement Schedules

If you ever took an introductory psychology course, chances are you ran across B.F. Skinner. He was a psychologist and behaviorist who looked at how behavioral responses were established and strengthened by different schedules of reinforcement. For instance, a rat in a cage that is taught to press a lever to earn a food pellet (reward) might be taught that it gets one food pellet for every 3 presses of the lever. This would be an example of a fixed ratio reinforcement schedule.

Although there are a variety of types and subtypes of reinforcement schedules that can affect the likelihood of different behavioral responses, I’m going to briefly discuss the variable ratio reinforcement schedule.

Variable Ratio Reinforcement Schedule
A variable ratio reinforcement schedule occurs when a reward is delivered after an unpredictable number of actions. Using the rat example, the rat doesn’t know how many presses of the lever will produce the food pellet. Sometimes it is 1, other times 5, or 15…it never knows. It soon learns, though, that the faster it pushes the lever, the sooner it will receive the pellet. Researchers have found that variable ratio schedules tend to result in a high rate of responding. Also, variable ratio schedules are extremely resistant to extinction. In the case of the rat, if the researcher stops giving food pellets after lever presses, the rat will push the lever frequently for a very long time until it finally gives up (which is the extinction part). Slot machines are a real-world example of a variable ratio schedule.
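To make this concrete, here’s a small Python sketch of a variable ratio lever (my own illustration – the function name and the way I draw the required press count are assumptions, not taken from any study):

```python
import random

def variable_ratio_lever(mean_ratio=5):
    """Simulate a lever on a variable ratio (VR-5) schedule.

    Each pellet requires a freshly drawn, unpredictable number of
    presses (uniform between 1 and 2*mean_ratio - 1, so the long-run
    average is mean_ratio presses per pellet).
    """
    needed = random.randint(1, 2 * mean_ratio - 1)
    count = 0

    def press():
        nonlocal needed, count
        count += 1
        if count >= needed:  # this press "wins"
            count = 0
            needed = random.randint(1, 2 * mean_ratio - 1)
            return True      # pellet delivered
        return False         # nothing this time

    return press

press = variable_ratio_lever(mean_ratio=5)
pellets = sum(press() for _ in range(1000))
# Over 1,000 presses, pellets lands near 200 (about 1 in 5),
# but no single press is ever predictable.
```

The point of the sketch is that the long-run payout rate is stable even though each individual press is a gamble – exactly the property that keeps the rat (and the gambler) pressing.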

Variable Reinforcement in Our Daily Lives

It turns out, variable ratio reinforcement schedules are involved in many behavioral addictions, such as gambling. Yes, that’s right. In a sense, compulsively checking our phones is much like compulsive gambling. In fact, many “obsessions” and hobbies also involve this variable ratio reinforcement schedule, such as:

  • Fishing
  • Hunting
  • Basically any type of collecting (e.g., collecting Pokemon cards, stamps)
  • Looking for bargains while shopping at the mall, flea markets, or garage sales
  • Channel surfing on TV (though Internet surfing seems to have largely supplanted that pastime)

Why Are Variable Reinforcement Schedules Powerful?

Variable reinforcement schedules are NOT bad. They are a way that we learn, and they can be very powerful. We learn causal relationships by “connecting the dots.” From an evolutionary perspective, learning causal connections enhances our chances of survival. For instance, if I do “Action A,” then “B” is the likely result. When there is a variable relationship, performing “Action A” means “B” might be the result. The reward system in the brain releases dopamine in fairly large amounts in variable situations to motivate the organism to pay attention so that it might learn the causal connection. In essence, the brain is trying to “crack the code.” When the relationship between two stimuli is variable, the reward center of the brain keeps releasing dopamine so that we can try to figure out the connection.

Variable Reinforcement and Technology

It is easy to see how technologies such as social media, texting, and gaming work on a variable reinforcement schedule. Like a box of chocolates, we never know what we are going to get. Who posted to Facebook? What did they post? Who commented on my post? What did they say? My cell is buzzing – what could this be about? I wonder if Trump tweeted something crazy today?

The moment our smartphones buzz or chime, this reward system is activated. Importantly, it is the anticipation phase that is key to the activation of this reward system. We just HAVE to find out this information, whatever it is. It feels like an itch that needs to be scratched or a thirst that needs to be quenched. Variable reinforcement and screens form a powerful combination. In my next blog, I’ll discuss the brain and tech addiction in a bit more detail.

Schedules of reinforcement can affect the results of operant conditioning, which is frequently used in everyday life, such as in the classroom and in parenting. Let’s examine the common types of schedules and their applications.

Schedules Of Reinforcement

Operant conditioning is the process of learning through consequences: voluntary behavior is increased or decreased using reinforcement or punishment.

Schedules of reinforcement are the rules that control the timing and frequency of reinforcer delivery, in order to increase the likelihood that a target behavior will happen again, strengthen, or continue.

A schedule of reinforcement is a contingency schedule. The reinforcers are only applied when the target behavior has occurred, and therefore, the reinforcement is contingent on the desired behavior [1].

There are two main categories of schedules: intermittent and non-intermittent.

Non-intermittent schedules apply reinforcement, or no reinforcement at all, after each correct response while intermittent schedules apply reinforcers after some, but not all, correct responses.

Non-intermittent Schedules of Reinforcement

Two types of non-intermittent schedules are Continuous Reinforcement Schedule and Extinction.

Continuous Reinforcement

A continuous reinforcement schedule (CRF) presents the reinforcer after every performance of the desired behavior. This schedule reinforces the target behavior every single time it occurs, and is the quickest way to teach a new behavior.

Continuous Reinforcement Examples

Continuous schedules of reinforcement are often used in animal training. The trainer rewards the dog to teach it new tricks. When the dog does a new trick correctly, its behavior is reinforced every time by a treat (positive reinforcement).

A continuous schedule also works well with very young children when teaching simple behaviors such as potty training. Toddlers are given candies whenever they use the potty. Their behavior is reinforced every time they succeed and receive rewards.

Partial Schedules of Reinforcement (Intermittent)

Once a new behavior is learned, trainers often turn to another type of schedule – partial or intermittent reinforcement schedule – to strengthen the new behavior.

A partial or intermittent reinforcement schedule rewards desired behaviors occasionally, but not every single time.

Behavior intermittently reinforced by a partial schedule is usually stronger. It is more resistant to extinction (more on this later). Therefore, after a new behavior is learned using a continuous schedule, an intermittent schedule is often applied to maintain or strengthen it.

Many different types of intermittent schedules are possible. The four major types of intermittent schedules commonly used are based on two different dimensions – time elapsed (interval) or the number of responses made (ratio). Each dimension can be categorized into either fixed or variable.

The four resulting intermittent reinforcement schedules are:

  • Fixed interval schedule (FI)
  • Fixed ratio schedule (FR)
  • Variable interval schedule (VI)
  • Variable ratio schedule (VR)
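As a rough sketch of how the four rules differ (illustrative code of my own, not from any textbook), each schedule is just a different answer to “should this response be reinforced now?” – ratio schedules count responses, interval schedules watch the clock:

```python
import random

def fixed_ratio(n):
    """FR-n: reinforce every n-th response."""
    count = 0
    def on_response():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False
    return on_response

def variable_ratio(mean_n):
    """VR-n: reinforce after an unpredictable number of responses
    averaging mean_n (drawn uniformly from 1 to 2*mean_n - 1)."""
    count, target = 0, random.randint(1, 2 * mean_n - 1)
    def on_response():
        nonlocal count, target
        count += 1
        if count >= target:
            count, target = 0, random.randint(1, 2 * mean_n - 1)
            return True
        return False
    return on_response

def fixed_interval(seconds):
    """FI: reinforce the first response after `seconds` have elapsed."""
    last = 0.0
    def on_response(now):
        nonlocal last
        if now - last >= seconds:
            last = now
            return True
        return False
    return on_response

def variable_interval(mean_seconds):
    """VI: like FI, but each interval is drawn at random."""
    last, wait = 0.0, random.uniform(0, 2 * mean_seconds)
    def on_response(now):
        nonlocal last, wait
        if now - last >= wait:
            last, wait = now, random.uniform(0, 2 * mean_seconds)
            return True
        return False
    return on_response

fr3 = fixed_ratio(3)
print([fr3() for _ in range(6)])  # [False, False, True, False, False, True]
```

The fixed/variable distinction is just whether the trigger (count or clock) is re-drawn at random after each reinforcement.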

Fixed Interval Schedule

Interval schedules reinforce targeted behavior after a certain amount of time has passed since the previous reinforcement.

A fixed interval schedule delivers a reward when a set amount of time has elapsed. Subjects on this schedule, whether human or animal, typically learn to time the interval: they slow their response rate right after a reinforcement and then speed up toward the end of the interval.

A “scalloping” pattern of break-run behavior is the characteristic of this type of reinforcement schedule. The subject pauses every time after the reinforcement is delivered and then behavior occurs at a faster rate as the next reinforcement approaches [2].

Fixed Interval Example

College students studying for final exams is an example of the Fixed Interval schedule.

Most universities schedule final exams at fixed intervals, typically at the end of each semester.

Many students whose grades depend entirely on the exam performance don’t study much at the beginning of the semester, but they cram when it’s almost exam time.

Here, studying is the targeted behavior and the exam result is the reinforcement given after the final exam at the end of the semester.

Because an exam only occurs at fixed intervals, usually at the end of a semester, many students do not pay attention to studying during the semester until the exam time comes.

Variable Interval Schedule (VI)

A variable interval schedule delivers the reinforcer after a variable amount of time interval has passed since the previous reinforcement.

This schedule usually generates a steady rate of performance due to the uncertainty about the time of the next reward and is thought to be habit-forming [3].

Variable Interval Example

Students whose grades depend on the performance of pop quizzes throughout the semester study regularly instead of cramming at the end.

Students know the teacher will give pop quizzes throughout the year, but they cannot determine when it occurs.

Without knowing the specific schedule, the student studies regularly throughout the entire time instead of postponing studying until the last minute.

Variable interval schedules are more effective than fixed interval schedules of reinforcement in teaching and reinforcing behavior that needs to be performed at a steady rate [4].

Fixed Ratio Schedule (FR)

A fixed ratio schedule delivers reinforcement after a set number of responses are made.

Fixed ratio schedules produce high rates of response until a reward is received, which is then followed by a pause in the behavior.

Fixed Ratio Example

A toymaker produces toys and the store only buys toys in batches of 5. When the maker produces toys at a high rate, he makes more money.

In this case, the toymaker is paid only when all five toys have been made, so the reward arrives on a fixed ratio of one payment per five toys.

People who follow such a fixed ratio schedule usually take a break after they are rewarded and then the cycle of fast-production begins again.

Variable Ratio Schedule (VR)

Variable ratio schedules deliver reinforcement after a variable number of responses are made.

This schedule produces high and steady response rates.

Variable Ratio Example

Gambling at a slot machine or lottery games is a classic example of a variable ratio reinforcement schedule [5].

Gambling rewards unpredictably. Each win requires a different number of lever pulls. Gamblers keep pulling the lever many times in hopes of winning. Therefore, for some people, gambling is not only habit-forming but is also very addictive and hard to stop [6].

Extinction

An extinction schedule (Ext) is a special type of non-intermittent reinforcement schedule, in which the reinforcer is discontinued leading to a progressive decline in the occurrence of the previously reinforced response.

How fast complete extinction happens depends partially on the reinforcement schedules used in the initial learning process.

Among the different types of reinforcement schedules, the variable-ratio schedule (VR) is the most resistant to extinction whereas the continuous schedule is the least [7].
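A back-of-the-envelope way to see why (a toy calculation of my own, not a behavioral model from the literature): extinction can only begin once the learner’s experience stops matching the old schedule, and under a variable ratio, long dry streaks were always normal:

```python
def presses_until_surprised(p_reward, surprise_threshold=0.01):
    """Consecutive unrewarded presses before that dry streak would
    have had probability < surprise_threshold under the schedule
    the learner originally experienced."""
    streak_prob = 1.0
    presses = 0
    while streak_prob >= surprise_threshold:
        presses += 1
        streak_prob *= (1 - p_reward)
    return presses

crf = presses_until_surprised(p_reward=1.0)  # continuous: reward every press
vr5 = presses_until_surprised(p_reward=0.2)  # VR-5: ~1 reward per 5 presses
print(crf, vr5)  # 1 21
```

Under continuous reinforcement, a single miss is already out of line with everything the learner has experienced; under VR-5, a run of twenty unrewarded presses is still unremarkable, so the behavior persists far longer.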

Schedules of Reinforcement in Parenting

Many parents use various types of reinforcement to teach new behavior, strengthen desired behavior or reduce undesired behavior.

A continuous schedule of reinforcement is often the best in teaching a new behavior. Once the response has been learned, intermittent reinforcement can be used to strengthen the learning.

Reinforcement Schedules Example

Let’s go back to the potty-training example.

When parents first introduce the concept of potty training, they may give the toddler a candy whenever they use the potty successfully. That is a continuous schedule.

After the child has been using the potty consistently for a few days, the parents would transition to only reward the behavior intermittently using variable reinforcement schedules.

Sometimes, parents may unknowingly reinforce undesired behavior.

Because such reinforcement is unintended, it is often delivered inconsistently. The inconsistency serves as a type of variable reinforcement schedule, leading to a learned behavior that is hard to stop even after the parents have stopped applying the reinforcement.

Variable Ratio Example in Parenting

When a toddler throws a tantrum in the store, parents usually refuse to give in. But once in a while, if they’re tired or in a hurry, they may decide to buy the candy, believing they will do it just that one time.

But from the child’s perspective, such a concession is a reinforcer that encourages tantrum-throwing. Because the reinforcement (candy buying) is delivered on a variable schedule, the toddler ends up throwing fits regularly in hopes of the next give-in.

This is one reason why consistency is important in disciplining children.

References

  1. Case DA, Fantino E. The delay-reduction hypothesis of conditioned reinforcement and punishment: Observing behavior. Journal of the Experimental Analysis of Behavior. Published online January 1981:93-108. doi:10.1901/jeab.1981.35-93
  2. Dews PB. Studies on responding under fixed-interval schedules of reinforcement: II. The scalloped pattern of the cumulative record. J Exp Anal Behav. Published online January 1978:67-75. doi:10.1901/jeab.1978.29-67
  3. DeRusso AL. Instrumental uncertainty as a determinant of behavior under interval schedules of reinforcement. Front Integr Neurosci. Published online 2010. doi:10.3389/fnint.2010.00017
  4. Schoenfeld W, Cumming W, Hearst E. On the classification of reinforcement schedules. Proc Natl Acad Sci U S A. 1956;42(8):563-570. https://www.ncbi.nlm.nih.gov/pubmed/16589906
  5. Dixon MR, Hayes LJ, Aban IB. Examining the Roles of Rule Following, Reinforcement, and Preexperimental Histories on Risk-Taking Behavior. Psychol Rec. Published online October 2000:687-704. doi:10.1007/bf03395378
  6. Redish AD, Jensen S, Johnson A, Kurth-Nelson Z. Reconciling reinforcement learning models with behavioral extinction and renewal: Implications for addiction, relapse, and problem gambling. Psychological Review. Published online July 2007:784-805. doi:10.1037/0033-295x.114.3.784
  7. Azrin NH, Lindsley OR. The reinforcement of cooperation between children. The Journal of Abnormal and Social Psychology. Published online 1956:100-102. doi:10.1037/h0042490
