Module 9 Exam Prep — Generalization & Discrimination Learning

🎯

Section 1

The Core Problem: Generalization vs. Discrimination

⟳ Course Arc — Extending the R-W Framework

This week extends the prediction-error framework from Weeks 5–8. The Rescorla-Wagner (R-W) model explains how associative weights update; Week 9 asks what the stimulus being represented actually is. How you represent a stimulus determines what you generalize to.

Generalization: transfer of past learning to novel events. The brain's default — treat similar things similarly.

Discrimination: perception of differences between stimuli; knowing when not to generalize.

Core tension: Generalize too broadly and you fail (treating all vegetables like broccoli). Generalize too narrowly and you also fail (treating only exactly 580 nm as yellow). Adaptive behavior requires calibrated generalization.

Classic Evidence: Guttman & Kalish (1956)

Pigeons were trained to peck at a yellow light (580 nm) for food, then tested across the 520–620 nm range. Result: a bell-shaped generalization gradient that peaks at the trained stimulus and falls off roughly symmetrically on both sides. The gradient stays symmetric unless discrimination training intervenes.

🔢

Section 2

Stimulus Representation: Discrete vs. Distributed

Feature	Discrete-Component	Distributed (Shared Elements)
Encoding	One unique node per stimulus	Overlapping node sets for similar stimuli
Generalization	None — untrained node = response 0	Automatic — trained nodes activate for similar stimuli
Example (yellow → orange)	Orange node weight = 0; response = 0	Node 5 shared → response = 0.33
When sufficient	Categorically distinct stimuli (tone vs. light); simple organisms	Stimuli on continuous dimensions (color, pitch, brightness)

Distributed Representation Example — Light Wavelengths

Yellow (580 nm) → nodes [3, 4, 5] weight = 0.33 each after training
Yellow-orange → nodes [4, 5, 6] → response = 0.33+0.33 = 0.66
Orange → nodes [5, 6, 7] → response = 0.33 (only node 5 trained)
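
The node arithmetic above can be reproduced with a small simulation. This is a minimal sketch, not the course's official model: it assumes a 10-node input layer, a learning rate of 0.1, and 200 training trials (all illustrative choices), and uses the standard R-W update in which every active node's weight changes by learning rate × prediction error. The names `train_rw`, `response`, `STIMULI`, and `DISCRETE` are made up for this example.

```python
def train_rw(active_nodes, n_nodes=10, outcome=1.0, lr=0.1, trials=200):
    """R-W style training: all nodes active on a trial share one prediction error."""
    w = [0.0] * n_nodes
    for _ in range(trials):
        prediction = sum(w[i] for i in active_nodes)
        error = outcome - prediction          # prediction error
        for i in active_nodes:
            w[i] += lr * error                # same update for every active node
    return w


def response(w, active_nodes):
    """Response to a stimulus = summed weight of the nodes it activates."""
    return sum(w[i] for i in active_nodes)


# Distributed coding: neighboring colors share nodes (node sets from the example above).
STIMULI = {"yellow": [3, 4, 5], "yellow-orange": [4, 5, 6], "orange": [5, 6, 7]}
# Discrete-component coding: one private node per stimulus (assumed node indices).
DISCRETE = {"yellow": [0], "yellow-orange": [1], "orange": [2]}

weights = train_rw(STIMULI["yellow"])
discrete_weights = train_rw(DISCRETE["yellow"])

for name in STIMULI:
    print(f"{name}: distributed={response(weights, STIMULI[name]):.2f}, "
          f"discrete={response(discrete_weights, DISCRETE[name]):.2f}")
# yellow: distributed=1.00, discrete=1.00
# yellow-orange: distributed=0.67, discrete=0.00
# orange: distributed=0.33, discrete=0.00
```

Each trained node ends up with a weight of 1/3, so the exact yellow-orange response is 2/3; the 0.66 in the worked example comes from adding the rounded 0.33 values.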

A1 (primary auditory cortex) is tonotopically organized. Tuning curves get broader as signals climb from cochlea to cortex, so there is more generalization at higher levels, not less. Removing A1 in cats flattens the gradient: the animals respond to all tones alike (massive overgeneralization). A1 is therefore necessary for appropriately graded generalization.

⚠ Common Misconception

Distributed representations are NOT always required. Categorically distinct stimuli (tone vs. light) → discrete is sufficient. Hospital wall colors (blue/red/green recovery times) → REQUIRES distributed (colors lie on a continuous wavelength dimension, so interpolation is needed).

📈

Section 3

Discrimination Training & Peak Shift

Discrimination training sharpens gradient (Jenkins & Harrison, 1962): Standard 1000 Hz training → broad gradient. Discrimination (1000 Hz food; 950 Hz no food) → much steeper, narrow gradient centered on 1000 Hz. Why: S- builds inhibitory gradient counteracting generalization from S+.

Human analog (Wills & McLaren, 1997): A-only training → broader gradients than A-vs-B discrimination training.

Peak Shift — Hanson (1959)

After S+=550 nm, S-=555 nm discrimination, peak responding SHIFTS AWAY from S+ (to ~540 nm), toward stimuli OPPOSITE the S-. Spence's theory: excitatory gradient around S+; inhibitory gradient around S-. Net = excitation minus inhibition. Maximum net is on the side of S+ furthest from S-.
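
Spence's subtraction can be sketched numerically. The example below assumes Gaussian-shaped gradients with a common width of 10 nm and an inhibitory gradient 0.6 times as strong as the excitatory one; those shapes and numbers are illustrative assumptions, not values from Hanson (1959), so the exact peak location will not match the ~540 nm he reported. Only the direction of the shift is the point.

```python
import math


def gaussian(x, center, width):
    """Bell-shaped gradient centered on `center` (assumed shape)."""
    return math.exp(-((x - center) ** 2) / (2 * width ** 2))


def net_response(x, s_plus=550, s_minus=555, width=10.0, inhibition=0.6):
    """Net responding = excitation around S+ minus inhibition around S-."""
    excitation = gaussian(x, s_plus, width)
    inhibitory = inhibition * gaussian(x, s_minus, width)
    return excitation - inhibitory


wavelengths = range(500, 601)                 # test stimuli in 1-nm steps
peak = max(wavelengths, key=net_response)     # wavelength with the highest net response

print(f"S+ = 550 nm, S- = 555 nm, peak of net gradient = {peak} nm")
print(f"net at S+ = {net_response(550):.3f}, net at peak = {net_response(peak):.3f}")
```

Because the inhibitory gradient overlaps S+ itself, it removes more responding at 550 nm than at slightly shorter wavelengths, so the net maximum lands below 550 nm, on the side of S+ away from S-.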

Universality: Peak shift occurs in bees, horses, rats, goldfish, chickens, pigeons, and humans — one of the most robust findings in learning research.
Human relevance: Caricatures work via peak shift — exaggerating features that distinguish a face from the average face.

📝 Exam Tip

Know S+, S-, and direction of shift. Peak moves AWAY from S- (opposite side from S-). S+=550, S-=555 → peak at ~540 (not 560).

🔗

Section 4

Meaning-Based Generalization

Not all generalization is similarity-based. Two stimuli sharing the same consequences can generalize even without physical similarity. Three paradigms illustrate this — know all three cold.

4.1 Sensory Preconditioning (3-phase procedure)

Phase Structure — Sensory Preconditioning

Phase 1: Tone + Light presented together (compound-exposure group) vs. presented separately (control group)
Phase 2: Light → Airpuff [both groups learn to blink to light]
Phase 3: Test Tone alone → compound group blinks; control does NOT

Interpretation: Co-occurrence creates association. Light's meaning (airpuff) transfers to tone — cross-modality, meaning-based generalization without physical similarity.

Hippocampal dependence: Rabbits with fornix lesions show NO sensory preconditioning. Lesioned compound-exposure animals look exactly like controls (Port & Patterson, 1984).

4.2 Acquired Equivalence

A1 and A2 (stimulus labels here, not auditory cortex) are both paired with the same outcome, X1, and so become "equivalent." When A1 is then reinforced, animals generalize strongly to A2 but not to B2, a stimulus from a different equivalence class.

Human version (Myers et al., 2003): Cartoon people preferring fish. Brown-haired and blonde both prefer blue fish. Brown-haired newly prefers red fish → adults infer blonde also prefers red fish.

Hippocampal dependence: Entorhinal cortex lesions in rats eliminate Phase 3 transfer. Phase 1+2 intact but generalization fails.

4.3 Latent Inhibition

Pre-exposing an animal to the CS alone (without the US) slows later CS-US conditioning. The animal must first "unlearn" the tone-nothing association before it can build the tone-shock association.

Entorhinal cortex lesions ELIMINATE latent inhibition — lesioned pre-exposed animals learn as fast as controls. Brain damage HELPS learning speed by removing the prior inhibitory memory.

⟳ Common Thread

In ALL three cases, the hippocampal-region deficit shows up in behavior at the transfer/test stage (Phase 3), never during acquisition (Phases 1 and 2). The hippocampus builds the representational scaffold during the earlier training; other areas use it later.

🔬

Section 5

Cortical Plasticity & Acetylcholine

Weinberger's studies: Paired 2500 Hz tone with shock in guinea pigs. A1 neurons that originally preferred ~1000 Hz shifted to respond best to ~2500 Hz (trained frequency). More A1 neurons become tuned to trained stimulus. Requires pairing (CS-US): tone alone → habituation; unpaired → no shift.

How does A1 "know" what's important? A1 doesn't directly receive input from pain/reward systems. The answer: Nucleus Basalis.

Neuromodulator	Source	Signal	Role
Dopamine	VTA / Substantia Nigra	Valence (signed PE: good or bad)	Reward/punishment prediction error
Acetylcholine (ACh)	Nucleus Basalis (basal forebrain)	Salience (unsigned: how important)	Cortical plasticity & remapping

Mechanism — Cortical Remapping via ACh

Salient event → Amygdala signals Nucleus Basalis
→ ACh released broadly across cortex
→ Enables cortical remapping (tuning shifts toward trained stimulus)

Nucleus basalis lesion + prior auditory discrimination: Previously learned auditory discrimination intact (remapping already happened).
Nucleus basalis lesion + new visual discrimination: IMPAIRED — cannot narrow gradient to distinguish colors; overgeneralizes. New cortical remapping requires ACh.

🗺️

Section 6

Hippocampus in Generalization

The hippocampus is NOT required for simple S-R associations. It IS required for meaning-based generalization (sensory preconditioning, acquired equivalence) and context-dependent discrimination (latent inhibition).

Gluck & Myers (1993) model: Hippocampus compresses redundant/unimportant information AND differentiates (expands the representation of) useful information.

New Yorker map analogy: Fine detail on 9th/10th Ave (expanded for what matters); featureless Midwest (compressed for what doesn't). This is what the hippocampus does with stimulus representations.

Key fMRI Finding

Hippocampal activation during Phase 1 (equivalence training) — NOT during Phase 3 testing — predicts Phase 3 accuracy. Hippocampus builds the representation; other areas use it.

⟳ Connect — Across All Three Paradigms

Sensory preconditioning, acquired equivalence, latent inhibition: in ALL cases, hippocampal region deficit is at TRANSFER (Phase 3), never at acquisition (Phase 1+2). This is the behavioral signature of hippocampal involvement.

💬

Section 7

Categorization, Language & Sapir-Whorf

Prototype: "Average" or most central member of a category. Assign new observations by distance from prototype. Animals can form categories: pigeons distinguish benign/malignant tumors at radiologist-level accuracy.
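
Prototype-based categorization can be sketched as a nearest-prototype rule. In the example below, the two categories, the two feature dimensions, and every exemplar value are hypothetical, and Euclidean distance is an assumed choice of distance measure.

```python
import math


def prototype(exemplars):
    """Prototype = the average of a category's exemplars, dimension by dimension."""
    n = len(exemplars)
    return tuple(sum(dim) / n for dim in zip(*exemplars))


def categorize(item, prototypes):
    """Assign the item to the category whose prototype is closest."""
    return min(prototypes, key=lambda label: math.dist(item, prototypes[label]))


# Hypothetical training exemplars: (height in cm, weight in kg).
training = {
    "small_dog": [(25, 5), (30, 7), (28, 6)],
    "large_dog": [(60, 30), (65, 35), (70, 40)],
}
prototypes = {label: prototype(ex) for label, ex in training.items()}

print(categorize((32, 8), prototypes))    # small_dog
print(categorize((62, 33), prototypes))   # large_dog
```

Neither test item was seen in training; each is categorized purely by which category average it sits nearest to.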

Sapir-Whorf hypothesis: The language you speak can affect how you categorize the world.

Example	Observation	Principle
Russian blue shades	Two words for blue → Russian speakers show finer blue discrimination than English speakers (same visual hardware)	Category labels tune generalization gradients
Arabic camel (~160 words)	Finer discrimination of camel types	Rich vocabulary → more cortical resources → finer gradient
English hen/rooster vs. one word for lion	Gender-specific terms improve discrimination	Naming creates category boundary

Directionality Debate

Does language shape thought, OR does thought/experience shape language? Ali was explicit: "Is it because of the language that I think in a certain way, or is it because the way I think that we develop this language?" Both directions possible; unresolved.

⟳ Connect to Cortical Maps

Things that are important to you get more cortical representation. Same principle as the homunculus, violinists' hand expansion, and Weinberger's remapping. Language may drive cortical allocation.

🏥

Section 8

Clinical: Schizophrenia & Generalization Deficits

Neural substrate: Schizophrenia is associated with reduced hippocampal volume and reduced hippocampal fMRI activity.

Core behavioral profile: Learn simple associations NORMALLY; TRANSFER generalization impaired.

Task	Performance	Interpretation
Acquired equivalence — Phases 1 & 2 (acquisition)	Normal	Basic associative learning intact
Acquired equivalence — Phase 3 (transfer)	Significantly impaired (Keri et al., 2005)	Hippocampal encoding of equivalences failed
Transitive inference — non-relational pairs (AE)	Succeed	Rote memory intact (A always wins, E always loses)
Transitive inference — relational pairs (BD)	Fail	Requires hippocampal relational inference across chain
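
Why AE is "rote" and BD is "relational" can be seen by counting. The sketch below assumes the usual five-item chain trained as adjacent pairs (A>B, B>C, C>D, D>E). The two strategies, `rote_choice` and `relational_choice`, are illustrative stand-ins for the two kinds of processing, not models taken from the studies cited.

```python
from collections import defaultdict

# Training premises as (winner, loser) pairs; assumed to form one chain with no cycles.
PREMISES = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]


def win_rates(premises):
    """Fraction of training trials on which each item was the winner."""
    wins, seen = defaultdict(int), defaultdict(int)
    for winner, loser in premises:
        wins[winner] += 1
        seen[winner] += 1
        seen[loser] += 1
    return {item: wins[item] / seen[item] for item in seen}


def rote_choice(x, y, rates):
    """Pick whichever item won more often in training; None if they are tied."""
    if rates[x] == rates[y]:
        return None
    return x if rates[x] > rates[y] else y


def outranks(x, y, premises):
    """Follow winner -> loser links from x and report whether y is reached."""
    beats = dict(premises)        # each item beats at most one other item here
    node = x
    while node in beats:
        node = beats[node]
        if node == y:
            return True
    return False


def relational_choice(x, y, premises):
    """Pick the item that sits higher in the chain; None if the items are unrelated."""
    if outranks(x, y, premises):
        return x
    if outranks(y, x, premises):
        return y
    return None


rates = win_rates(PREMISES)
print(rates)                                    # A = 1.0, E = 0.0, B/C/D = 0.5
print(rote_choice("A", "E", rates))             # A
print(rote_choice("B", "D", rates))             # None
print(relational_choice("B", "D", PREMISES))    # B
```

A always wins and E always loses, so reinforcement history alone answers AE. B and D each won half the time, so only a strategy that links the premises into a chain can answer BD.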

Antipsychotic medications partially remediate the transfer generalization deficit.
Behavioral signature: Intact acquisition + impaired generalization transfer points specifically to hippocampal dysfunction.

⟳ Forward Connection to Week 10

The schizophrenia generalization deficit established here connects directly to the Week 10 clinical section on psychiatric disorders and hippocampal memory.

Key Terms — 20 Flashcards

Behavioral

Generalization

Transfer of learning to novel stimuli; the brain's default mode — treat similar things similarly.

Behavioral

Discrimination

Perceiving and responding to differences between stimuli; knowing when NOT to generalize.

Behavioral

Generalization Gradient

Bell-shaped curve showing graded responding as a function of stimulus difference from the trained stimulus. Peaks at training value; falls symmetrically.

Behavioral

Peak Shift

After discrimination training, peak responding moves away from S+ toward the side opposite S-. Explained by Spence's net excitatory minus inhibitory gradients.

Behavioral

Latent Inhibition

Pre-exposing an animal to CS (without US) slows later CS-US conditioning. Must overcome prior tone-nothing association. Eliminated by entorhinal cortex lesions.

Theoretical

Discrete-Component Representation

Unique node per stimulus; zero generalization to similar stimuli. Sufficient for categorically distinct stimuli; fails for continuous dimensions.

Theoretical

Distributed Representation

Stimuli encoded by overlapping node sets; similar stimuli share nodes → automatic generalization gradient. Produces smooth, realistic gradients for continuous dimensions.

Theoretical

Spence's Theory

Peak shift arises from net excitatory (around S+) minus inhibitory (around S-) gradients. Maximum net responding falls on side of S+ opposite to S-.

Theoretical

Configural Node

AND-gate node firing only for a specific combination of stimuli; solves negative patterning (AB compound → no US even when A alone and B alone → US).

Theoretical

Negative Patterning

A alone → US; B alone → US; AB compound → no US. Requires configural representation; cannot be solved by simple element-based models.

Theoretical

Sapir-Whorf Hypothesis

The language you speak can influence how you categorize and perceive the world. Directionality (language → thought OR thought → language) remains debated.

Theoretical

Gluck-Myers Model

Hippocampus compresses redundant/unimportant info and differentiates useful info for learning. New Yorker map analogy: compressed Midwest, detailed 9th/10th Ave.

Spatial

Sensory Preconditioning

3-phase: Phase 1 co-expose AB; Phase 2 B→US; Phase 3 A→CR. Hippocampus-dependent. Demonstrates cross-modality meaning-based generalization via co-occurrence.

Spatial

Acquired Equivalence

Two stimuli sharing same outcome become "equivalent"; learning about one transfers to the other. Hippocampus-dependent at Phase 3 (transfer).

Spatial

Prototype

The central/average member of a category. New observations assigned by distance from prototype. Animals (even pigeons) can form prototype-based categories.

Neural

Cortical Remapping

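Experience-driven reorganization of cortical tuning; salient stimuli come to be represented by more cortical neurons. New remapping requires ACh from the nucleus basalis; remapping that has already occurred persists after a nucleus basalis lesion (established discriminations stay intact).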

Neural

Nucleus Basalis

Basal forebrain structure; releases ACh broadly when salient events occur. Lesion: established discriminations intact; NEW discrimination learning impaired (overgeneralization).

Neural

Acetylcholine (ACh)

Signals salience (not valence); promotes cortical plasticity. Contrast: dopamine signals valence (good/bad). ACh broadcasts "something important happened" regardless of reward/punishment.

Neural

Hippocampal Region

Includes hippocampus + entorhinal/perirhinal/parahippocampal cortices. Critical for meaning-based generalization and relational memory. NOT required for simple S-R associations.

Clinical

Schizophrenia Generalization Deficit

Intact acquisition + impaired transfer generalization. Signature of hippocampal region dysfunction. Antipsychotics partially remediate. Fails relational pairs (BD) but succeeds on rote pairs (AE) in transitive inference.
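
The Configural Node and Negative Patterning cards above can be checked with a small simulation. This sketch trains a single-layer network with the delta rule (R-W style error correction) on A → US, B → US, AB → no US, once with elemental inputs only and once with an added AND-gate input. The learning rate, number of epochs, and helper names are assumptions made for illustration.

```python
def output(w, x):
    """Network output = weighted sum of the active inputs."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train(patterns, n_inputs, lr=0.2, epochs=2000):
    """Delta-rule training: each weight moves by lr * error * input."""
    w = [0.0] * n_inputs
    for _ in range(epochs):
        for x, target in patterns:
            error = target - output(w, x)
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]
    return w


def max_error(w, patterns):
    """Largest absolute error across the trial types after training."""
    return max(abs(target - output(w, x)) for x, target in patterns)


# Trial types: A alone -> US (1), B alone -> US (1), AB compound -> no US (0).
elemental = [((1, 0), 1.0), ((0, 1), 1.0), ((1, 1), 0.0)]            # inputs: A, B
configural = [((1, 0, 0), 1.0), ((0, 1, 0), 1.0), ((1, 1, 1), 0.0)]  # inputs: A, B, A-AND-B

w_elem = train(elemental, n_inputs=2)
w_conf = train(configural, n_inputs=3)

print("elemental max error: ", round(max_error(w_elem, elemental), 2))
print("configural max error:", round(max_error(w_conf, configural), 2))
print("configural weights:  ", [round(w, 2) for w in w_conf])
```

With elemental inputs the response to AB is forced to equal the response to A plus the response to B, so no set of weights fits all three trial types. With the configural input, the AND node can take a negative weight (about -2 here) that cancels the two excitatory weights on compound trials.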

36 Questions

Generalization & Discrimination

The transfer of past learning to new situations and problems is known as:

Generalization & Discrimination

Miranda was bitten by a small brown dog and now she has a fear of all dogs, regardless of their size or color. This is an example of:

Generalization & Discrimination

Brian can tell from the way his baby cries whether she is hungry, needs changing, is sick, or is tired. This is an example of:

Generalization & Discrimination

The process by which one learns about new categories usually based on common features is known as:

Generalization & Discrimination

Tommy is visiting his grandmother and cooking dinner for her. Since his grandmother has a different model of microwave oven than his, he has learned that he needs to push a different sequence of buttons on his grandmother's microwave oven when he wants to use it for cooking. This is an example of:

Generalization

Isaac once became sick after eating pepperoni pizza. Based on the idea of a generalization gradient, which food would he be MOST likely to avoid in the future?

Generalization

If a generalization gradient were a flat horizontal line, it would mean that:

Generalization

A typical generalization gradient for many different stimuli across a broad range of generalization experiments resembles a:

Generalization & Discrimination

Suppose a person reinforces a rat for responding to an 800-Hz tone and then observes that its response to a novel 750-Hz tone is about 50% of its response to the 800-Hz tone. The lower response to the 750-Hz tone occurs because the rat:

Generalization

The shape of generalization gradients shows that two very similar stimuli are _____, while two very different stimuli are _____.

Generalization & Discrimination

_____ representations use a unique node to represent each individual stimulus.

Representations

When a discrete-component representation is used, there is:

Representations

If one trains a discrete-component model to respond to a blue light, how will it respond to a blue-green light?

Representations

If one trains a distributed model to respond to a blue light and then presents it with a blue-green light, it responds to the blue-green light because:

Representations

The discrete-component model and the distributed model differ in that only the:

Generalization

Which model(s) can account for generalization?

Generalization

Compared with the generalization gradient that is observed when no discrimination training is given, the generalization gradient that is observed after discrimination training is:

Generalization

Studies have compared discrimination training with examples of two categories with standard acquisition training where participants were shown exemplars for only one category. Results show _____ generalization gradients after standard acquisition training and _____ generalization gradients after discrimination training.

Discrimination

A training procedure in which difficult discrimination is learned by starting with an easy version of the task and proceeding to incrementally harder versions as the easier ones are mastered is referred to as _____ learning.

Generalization

Training in which presentation of two stimuli together as a compound results in a later tendency to generalize what is known about one of these stimuli to the other is known as:

Acquired Equivalence

Suppose one pairs a light and a tone in the first phase of a sensory preconditioning paradigm. If one then pairs just the light with a food pellet, such that the light elicits a salivation response, the tone presented alone will:

Generalization & Discrimination

Austin and Evan both enjoy video games. If a person later learns that Austin also enjoys playing soccer, the person may infer that Evan also enjoys playing soccer. This is an example of:

Generalization

Even though physical similarity is a frequent cause of generalization, _____ demonstrated that learning can be generalized if dissimilar stimuli have a history of co-occurring or predicting the same consequence.

Generalization & Discrimination

If one uses a red light and a blue light as stimuli in a negative-patterning task, one would reward responding:

Negative Patterning

Negative patterning is difficult to learn because it requires the organism to suppress its tendency to:

Negative Patterning

Negative patterning is one example of a larger class of learning phenomena that involve configurations of stimuli and that CANNOT be explained using:

Representations

In a single-layer network using a discrete-component representation, negative patterning:

Generalization

If one lesions the primary auditory cortex of a cat, the generalization gradient:

Generalization & Discrimination

If a tone and shock are repeatedly paired, neurons in A1:

Generalization & Discrimination

The different sensory cortices receive information about:

Cortical Plasticity

The nucleus basalis:

Generalization & Discrimination

Lesions of the hippocampal region lead to:

Generalization & Discrimination

According to Gluck and Myers's model, the hippocampal region should be MOST active:

Brain & Clinical

On an acquired-equivalence task, people with schizophrenia:

Generalization

Which finding confirms Gluck and Myers's prediction regarding the role of the hippocampus in generalization?

Brain & Clinical

Suppose one gives the following information to a patient with schizophrenia: Joe is taller than Frank; Frank is taller than Jim; Jim is taller than Mike; and Mike is taller than Steve. With which comparison would the patient have the MOST difficulty?

Big Picture Synthesis

How the pieces connect — the unifying framework for Module 9.

Unifying Principle

How a stimulus is REPRESENTED in the brain determines everything about what generalizes to what. The same learning mechanism (prediction error update) produces radically different generalization patterns depending on whether stimuli are encoded discretely or as overlapping distributed patterns.

The hippocampus does not learn associations directly — it builds the representational scaffold (compressing irrelevant variation, differentiating useful distinctions) that lets other brain areas generalize correctly.

Behavior

Generalization gradient

Peak shift

Sensory preconditioning

Acquired equivalence

Circuit

A1 tonotopic maps

Hippocampal region (compression/differentiation)

Nucleus basalis (ACh broadcast)

Synapse

ACh enables cortical plasticity

Hippocampal Phase 1 encoding → later transfer

Structure

A1 cortical remapping

Hippocampal volume/activity predicts generalization

Key Exam Themes

Discrete vs. distributed — which model is required when (hospital color example)
Peak shift direction — away from S-, know S+ and S-
3-phase structures — sensory preconditioning and acquired equivalence
Hippocampus deficit — Phase 3 ONLY, not acquisition
ACh vs. dopamine — salience vs. valence
Sapir-Whorf — directionality debate unresolved
Schizophrenia — intact acquisition + impaired transfer

Cross-Course Connections

R-W (Weeks 5-8): R-W supplies the prediction-error weight update; the distributed representation determines how that learning spreads to similar stimuli (graded generalization)
ACh + Dopamine: Complementary neuromodulators — salience vs. valence
Hippocampus (Week 6): The same structure that supports spatial learning and declarative memory
Schizophrenia → Week 10: Clinical generalization deficits connect to psychiatric disorders module
Cortical maps: Same principle as homunculus, violinists' hand expansion, and Weinberger's remapping

Common Confusions to Avoid

Peak shift direction: Peak moves AWAY from S- (not toward it). S+=550, S-=555 → peak at ~540, NOT ~565.
Latent inhibition lesion paradox: Entorhinal lesion HELPS speed (not hurts) — removes interference from prior tone-nothing memory.
Hippocampus timing: Critical at ENCODING (Phase 1), not retrieval (Phase 3). fMRI at Phase 1 predicts success; Phase 3 activity does not.
ACh vs. dopamine: ACh = unsigned salience (good or bad matters equally). Dopamine = signed valence (which direction matters).
Nucleus basalis lesion: Old discriminations INTACT; only NEW discriminations fail. ACh needed for future remapping, not maintenance of past.
