Does the timing and pattern of reward change how you respond? Press a lever under four different schedules and watch the classic cumulative records emerge.
Ratio (N): responses needed for FR; mean responses for VR
Interval (T): seconds for FI; mean seconds for VI
🏭 Like factory piecework – paid per N widgets assembled
🍪 Like checking cookies in the oven – first peek after N minutes
🎰 Like a slot machine – you never know which pull will pay
📧 Like checking email – a reply could arrive at any moment
Start the simulation to see how four different reinforcement schedules produce four distinctly different patterns of behavior – even though the "reward" itself is the same.
B.F. Skinner discovered that the pattern of reinforcement matters as much as the reinforcement itself. Even with the same reward, changing when and how it's delivered produces dramatically different behavioral patterns.
Reinforcement schedules are defined along two dimensions:
| | Fixed | Variable |
|---|---|---|
| **Ratio** (response count) | **FR** – reinforce every Nth response. Break-and-run pattern. | **VR** – reinforce after ~N responses (varies). High, steady rate. |
| **Interval** (elapsed time) | **FI** – reinforce first response after T sec. Scalloped pattern. | **VI** – reinforce first response after ~T sec (varies). Moderate, steady rate. |
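The four rules in the table can be expressed compactly in code. Below is a minimal Python sketch of the reinforcement logic – not the simulation's actual implementation; the `make_schedule` function and its parameter names are invented for illustration. Each schedule decides, on every lever press, whether that press earns a reward.

```python
import random

def make_schedule(kind, n=10, t=5.0):
    """Return a respond(now) -> bool function for one schedule.

    kind: 'FR', 'VR', 'FI', or 'VI'
    n:    ratio requirement (exact for FR, mean for VR)
    t:    interval in seconds (exact for FI, mean for VI)
    """
    state = {"resp": 0, "last": 0.0, "req": n, "wait": t}

    def respond(now):
        state["resp"] += 1
        if kind == "FR":    # every Nth response
            hit = state["resp"] >= n
        elif kind == "VR":  # after ~N responses, requirement varies
            hit = state["resp"] >= state["req"]
        elif kind == "FI":  # first response once T seconds have elapsed
            hit = now - state["last"] >= t
        else:               # VI: first response once ~T seconds have elapsed
            hit = now - state["last"] >= state["wait"]
        if hit:
            # Reset the counter/clock and draw fresh variable requirements
            state["resp"] = 0
            state["last"] = now
            state["req"] = random.randint(1, 2 * n)           # mean ~ n
            state["wait"] = random.uniform(0.5 * t, 1.5 * t)  # mean ~ t
        return hit

    return respond
```

Note how FR and VR ignore the clock entirely, while FI and VI ignore how many presses have accumulated – only the first press after the interval "opens up" matters.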
Fixed Ratio (FR): Produces a break-and-run pattern. After each reinforcement, the organism pauses (the "post-reinforcement pause"), then responds in a rapid burst until the next reinforcement. The cumulative record shows flat segments followed by steep ramps. Think of a student powering through homework problems – a break after each set, then a concentrated burst.
Fixed Interval (FI): Produces the classic scallop. Responding is very slow right after reinforcement (why respond when it won't pay off yet?) and accelerates as the interval approaches. Like checking cookies in the oven – you don't bother at first, but peek more and more often as the timer approaches zero.
Variable Ratio (VR): Produces the highest, steadiest response rate of all four schedules. Since any response could be the one that pays off, there's no logical time to pause. This is the slot-machine schedule – and it's why gambling is so hard to quit. It's also the most resistant to extinction.
Variable Interval (VI): Produces a moderate, steady response rate. Since the interval varies unpredictably, a consistent checking rate is the best strategy – not too fast (wastes effort), not too slow (misses opportunities). Like checking your email for a reply that could come at any point.
The cumulative record is Skinner's signature visualization. The x-axis is time, the y-axis is total responses so far. The line can only go up or stay flat โ never down.
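The "up or flat, never down" property falls directly out of how the record is built: it is a running count of responses sampled over time. Here is a short sketch of how one might compute it from press timestamps (the function name and signature are hypothetical, for illustration only):

```python
def cumulative_record(press_times, duration, dt=1.0):
    """Sample the cumulative record at regular time steps.

    press_times: sorted timestamps of lever presses (seconds)
    duration:    total session length (seconds)
    Returns a list of (time, total_responses_so_far) points.
    The count only ever increases or stays flat -- never decreases.
    """
    record = []
    count = 0
    idx = 0
    steps = int(duration / dt)
    for i in range(steps + 1):
        t = i * dt
        # Advance past every press that happened by time t
        while idx < len(press_times) and press_times[idx] <= t:
            count += 1
            idx += 1
        record.append((t, count))
    return record
```

Plotting these points yields the flat-segments-and-ramps shapes described above: steep where presses are dense (FR bursts, the end of an FI scallop), flat during pauses.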
1. Switch to Simulation mode and press Start – watch all four patterns develop simultaneously.
2. In Manual mode, try pressing the FR lever rapidly. Notice how you get reinforced every N presses.
3. Try the FI lever – can you learn to wait for the interval and respond right when it "opens up"?
4. Increase the ratio to 30 – the post-reinforcement pause in FR gets longer.
5. Compare VR and FR – same average requirement, but VR keeps you pressing without any pause!
Reinforcement schedules explain patterns of behavior we see everywhere: why slot machines are addictive (VR), why students cram before exams (FI scalloping), why piecework pay motivates bursts of productivity (FR), and why we check social media at a steady rate (VI). Understanding these schedules gives us a powerful framework for analyzing – and designing – incentive structures.