Construct 4: Episodic Memory

Definition

This construct measures how well individuals can store, maintain, and retrieve detailed information in long-term memory. It is highly sensitive to normal aging processes and shows robust deficits in mild cognitive impairment and Alzheimer’s disease (Koen & Yonelinas, 2014). Two classic papers by Endel Tulving (1972, 2002) provide both a theoretical conceptualization of episodic memory and relevant empirical measures.

References

Koen, J. D., & Yonelinas, A. P. (2014). The effects of healthy aging, amnestic mild cognitive impairment, and Alzheimer’s disease on recollection and familiarity: A meta-analytic review. Neuropsychology Review, 24(3), 332-354. https://doi.org/10.1007/s11065-014-9266-5
Tulving, E. (1972). Episodic and semantic memory. In E. Tulving & W. Donaldson (Eds.), Organization of memory (pp. 381-403). Academic Press.
Tulving, E. (2002). Episodic memory: From mind to brain. Annual Review of Psychology, 53, 1-25. https://doi.org/10.1146/annurev.psych.53.100901.135114

Note: For all included memory tasks, the same item lists were used at each wave of data collection as there was an approximately 4-year interval between testing sessions.

4.15 Hopkins Verbal Learning, Parts 1-4

Description (task duration: 6 minutes):

  • Encoding: Participants memorize a semantically categorized list of 12 concrete nouns that are read aloud by the experimenter at a rate of one word every 1.5 seconds. The three semantic categories are sports, professions, and vegetables, with 4 words in each category.
  • Immediate Recall: Immediately following the presentation, participants are asked to recall aloud as many words from the list as they can in any order. The experimenter records the words recalled on a scoring sheet. The dependent measure is the number of items correctly recalled out of 12.
  • Delayed Recall: After approximately 20 minutes, participants are again asked to recall aloud as many words as possible from the previous list in any order. The experimenter records the words recalled on a scoring sheet. The dependent measure is the number of items correctly recalled out of 12.
  • Delayed Recognition: Following delayed recall, participants are given a recognition test in which the experimenter reads another list of 24 words, including 12 target words (from the recall list) and 12 new words (lures). Of the 12 lures, 6 are semantically related to the target items (2 for each semantic category) and 6 are not semantically related to the target items. Participants make “yes”/“no” judgments to indicate whether the word was on the original study list. The dependent measure is the total number of correct judgments (hits + correct rejections) out of 24. In addition, false alarm rates are available for related and unrelated items, as are hits to old items.
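
The recognition measures described above can be derived from a participant's yes/no responses. The following is a minimal sketch, not the study's scoring procedure: the function name, the dictionary keys, and the placeholder item labels are assumptions made for illustration, and the real test words are not reproduced.

```python
def score_recognition(responses, targets, related_lures, unrelated_lures):
    """Score a yes/no recognition test.

    responses maps each test word to True ("yes") or False ("no").
    """
    hits = sum(responses[w] for w in targets)
    fa_related = sum(responses[w] for w in related_lures)
    fa_unrelated = sum(responses[w] for w in unrelated_lures)
    fa_total = fa_related + fa_unrelated
    # A correct rejection is a "no" response to a lure.
    n_lures = len(related_lures) + len(unrelated_lures)
    correct_rejections = n_lures - fa_total
    return {
        "hits": hits,
        "fa_related": fa_related,
        "fa_unrelated": fa_unrelated,
        "fa_total": fa_total,
        "total_correct": hits + correct_rejections,
        "hits_minus_fa": hits - fa_total,
    }


# Placeholder labels (hypothetical, NOT the actual test items).
targets = [f"target{i}" for i in range(12)]
related = [f"related{i}" for i in range(6)]
unrelated = [f"unrelated{i}" for i in range(6)]

# Hypothetical participant: "yes" to 10 targets and to 2 related lures.
responses = {w: False for w in targets + related + unrelated}
for w in targets[:10] + related[:2]:
    responses[w] = True

scores = score_recognition(responses, targets, related, unrelated)
print(scores)
```

In this hypothetical case the participant has 10 hits and 10 correct rejections, giving a total correct score of 20 out of 24 and a hits minus false alarms score of 8.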
Primary Reference:

Brandt, J. (1991). The Hopkins Verbal Learning Test: Development of a new memory test with six equivalent forms. The Clinical Neuropsychologist, 5(2), 125-142. https://doi.org/10.1080/13854049108403297

4.16 CANTAB Verbal Recognition Memory, Parts 1-4

Description (task duration: 7 minutes):

  • Encoding: Twelve nouns are presented on the computer screen one at a time. Participants are asked to read each word aloud and remember as many as they can.
  • Immediate Recall: Immediately following the presentation of the word list, participants are asked to recall aloud as many of the words as possible in any order. Data for the number of items recalled, out of 12, are available.
  • Immediate Recognition: Immediately following recall, participants complete a recognition test in which the computer displays the 12 target items and 12 distractor items, one at a time. Participants answer whether they remember seeing the item earlier in the task on the computer (“yes” or “no”). Performance was near ceiling for this test, and data are not currently processed/checked.
  • Delayed Recognition: The recognition phase is repeated after a delay of approximately 40 minutes. Data for the number of items recognized (and correct rejections), out of 24, are available. Performance was near ceiling for this task and we advise against using it but include it to provide a complete accounting of the methodology.
Primary Reference:

Robbins, T. W., James, M., Owen, A. M., Sahakian, B. J., McInnes, L., & Rabbitt, P. (1994). Cambridge Neuropsychological Test Automated Battery (CANTAB): A factor analytic study of a large sample of normal elderly volunteers. Dementia, 5, 266-281. https://doi.org/10.1159/000106735

Software Reference:

CANTAB Eclipse. Cambridge Cognition (2007).
https://www.cambridgecognition.com/cantab/cognitive-tests/memory/verbal-recognition-memory-vrm/

Note: We recommend that the delayed recognition score (CantabVrmDelayRcg16) NOT be used. It has strong ceiling effects, resulting in severe skewness and kurtosis, and standard data transformations were unable to correct this issue.
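
For readers who want to check a score distribution for this kind of ceiling effect, the sketch below computes the proportion of scores at the maximum along with moment-based skewness and excess kurtosis. It is an illustration only: the function name is an assumption, the scores are invented, and this is not necessarily how the ceiling effect was assessed in the study.

```python
from statistics import mean, pstdev


def ceiling_summary(scores, max_score):
    """Return the proportion at ceiling, skewness, and excess kurtosis."""
    n = len(scores)
    m = mean(scores)
    sd = pstdev(scores)  # population SD, used for the moment-based formulas
    z = [(x - m) / sd for x in scores]
    return {
        "prop_at_ceiling": sum(x == max_score for x in scores) / n,
        "skewness": sum(v ** 3 for v in z) / n,
        "excess_kurtosis": sum(v ** 4 for v in z) / n - 3,
    }


# Hypothetical scores out of 24, invented for illustration (not study data).
example_scores = [24] * 14 + [23] * 4 + [22, 18]
summary = ceiling_summary(example_scores, max_score=24)
print(summary)
```

A large share of scores at the maximum combined with strongly negative skewness is the pattern that the note above describes.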

4.17 Woodcock-Johnson Memory for Names, Parts 1-3

Description (task duration: 15 minutes):

  • Task Overview: In this paired-associate recognition task, participants are given 12 trials that each include both an encoding and a recognition component. The task is administered using color illustrations in a printed flip-book. On each trial, the participant first learns the name of a single cartoon space creature. Then, they must identify that creature in an array of nine aliens. Finally, they are asked to identify previously learned creatures in that array. The difficulty increases across trials as participants are required to remember the names of an increasingly large set of creatures (up to 12 unique creatures). A separate delayed recognition test is administered 20 minutes later.
  • Encoding: For the encoding component of each trial, participants are shown a color illustration of the space creature by itself on a page. Participants are told the name of the creature and are asked to point to it on the page (e.g., “This is Meegoy. Point to Meegoy.”; see figure below).
  • Immediate Recognition: Next, for the recognition component of that trial, participants are shown a page of nine space creatures and are asked to point to the newly-introduced creature among the distractors (“Now point to Meegoy”; see figure below). Then, they are asked to point to previously learned creatures (“Now point to Kiptron”). For each trial, the previously learned creatures are tested in a novel order, and whenever the participant responds incorrectly, they are corrected (e.g., “No, this is Meegoy. Point to Meegoy.”). For each trial, they are tested on all previously learned creatures up to a total of 9 creatures; for trials 10-12, the earliest creatures are dropped to keep that total at 9. Specifically, the total number of creatures to be recognized on each trial progresses across the 12 trials as follows: 1, 2, 3, 4, 5, 6, 7, 8, 9, 9, 9, 9 (total = 72 items). The dependent measure is the number of creatures recognized out of 72.
  • Delayed Recognition: After a 20-minute delay, participants are given a surprise recognition test in which they are asked to point to each space creature when prompted by the experimenter. This delayed test has 3 parts, with 12 trials per part. In each trial, the participant is shown an array of nine space creatures, as before, and is asked to point to a previously learned creature (“Now point to Meegoy”). Next, they are shown a new array and asked to point to a different creature (“Now point to Kiptron”). In this test, incorrect responses are no longer corrected by the experimenter, and creatures are not presented in the order originally learned. This process repeats for part 1 until they have been asked to recognize all 12 unique space creatures, each in one of the 12 different arrays. For parts 2 and 3, they repeat this process, going through the 12 arrays in the same order but with the tested items in a new order; for example, in trial 1 they may now be asked to identify “Delton” instead of “Meegoy”. Thus, all 12 space creatures are tested 3 times each, and the dependent measure is the number of creatures recognized out of 36.
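
The maximum scores stated above (72 for immediate recognition, 36 for delayed recognition) follow directly from the trial structure. A short arithmetic sketch, with variable names chosen here for illustration:

```python
# Immediate recognition: on each of the 12 trials, every creature learned so
# far is tested, capped at 9 creatures per trial.
MAX_PER_TRIAL = 9
items_per_trial = [min(trial, MAX_PER_TRIAL) for trial in range(1, 13)]
immediate_max = sum(items_per_trial)

# Delayed recognition: 3 parts, each testing all 12 creatures once.
delayed_max = 3 * 12

print(items_per_trial)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 9, 9, 9]
print(immediate_max, delayed_max)  # 72 36
```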
Primary Reference:

Woodcock, R. W., & Johnson, M. B. (1989). Woodcock-Johnson Tests of Achievement. Allen, TX: DLM Teaching Resources.

4.18 Wechsler Memory Scale (WMS-III) Logical Memory, Parts 1-3

Description (task duration: 7 minutes):

  • Encoding: The experimenter reads two highly detailed stories to the participant. One story describes a fictional character reporting a robbery and another describes a character listening to a weather bulletin.
  • Immediate Recall: Immediately after each story, the participant is asked to recall as much of the story as they can, verbatim. The participant’s response is recorded via tape recorder. Reviewing the tape, the experimenter scores the participant’s response by awarding one point per highly specific detail recalled by the participant (called Story Units, e.g., the main character’s name is Anna, the story took place in Boston, the weather forecast predicted rain and hail, etc.). Story Unit scores for Story A and Story B (each out of 25) are calculated by summing all correct details (total out of 50).
  • Delayed Recall: After a delay of approximately 30 minutes, the participant is asked to repeat as much of each of the two stories as they can remember, and their answers are recorded. Story Unit scores for Story A and Story B (each out of 25) are again calculated, with a combined score out of 50.
Task Example:

Story A: This story involves a fictional character, Anna Thompson, reporting at a police station that she was robbed, including additional details about her profession and family. (length: 351 characters)

Story B: This story involves a fictional character, Joe Garcia, hearing a detailed weather bulletin about inclement weather and then Joe deciding to stay home for the day. (length: 470 characters)

Primary Reference:

Wechsler, D. (1997). Wechsler memory scale (WMS-III). San Antonio, TX: Psychological Corporation.

Note: The Logical Memory task can also be scored based on the participants’ recall of seven or eight thematic details from the stories (e.g., broadly, indication of character’s gender, indication of major events in the story – storm, robbery, etc.). The thematic score is not checked or verified and is not used.

4.19 NIH Toolbox Picture Sequence Memory, Parts 1-2

Description (task duration: 7 minutes):

  • Encoding: This test involves recalling increasingly lengthy series of illustrated objects and activities that are presented in a particular order on the computer screen. These picture sequences revolve around two scenarios: playing in a park and going camping. During encoding, each picture is presented individually in the center of the screen for approximately 5 s, accompanied by pre-recorded instructions describing the image (e.g., “roasting a marshmallow”). The picture is then placed below in a left-to-right sequence that mirrors the presentation order (see example below).
  • Retrieval: After all items are placed, these pictures are then returned to the center of the screen in a jumbled pattern, and the participant’s task is to move them below again in the correct sequence. There are 15 items in the first trial, and 18 items in the second trial.
  • Scoring: Participants are given credit for each adjacent pair of pictures that are put in the correct sequence, regardless of location. For example, if the pictures belonging in locations 7 and 8 are placed in that order and adjacent to each other anywhere (such as slots 1 and 2), one point is awarded. The maximum score for each trial is one less than the trial length, which equates to 14 points for trial 1 and 17 points for trial 2 (Total Score Range: 0-31). Multiple dependent variables are provided via NIH Toolbox: (1) a raw score is the combined score across the two trials (Score Range: 0-31); (2) a computed score uses item response theory to put everyone on a scale of 200-750; (3) an unadjusted scale score compares this computed score with the full NIH Toolbox nationally representative normative sample (normative M = 100, SD = 15); (4) an age-adjusted scale score compares the computed score of the test-taker to those in the NIH normative sample at the same age (M = 100, SD = 15); (5) an age-adjusted national percentile represents the percentage of people nationally above whom the participant’s score ranks (using the NIH normative sample); and (6) a fully-adjusted scale score further adjusts for key demographic variables from the NIH normative sample, including age, gender, race/ethnicity (white/Asian, black, Hispanic, multiracial), and educational attainment (M = 50, SD = 10; NIH Toolbox: Scoring and Interpretation Guide, 2016).
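
The adjacent-pair rule for the raw score can be sketched as follows. This is an illustration written from the description above, not the NIH Toolbox implementation: the function name is an assumption, pictures are represented by their correct positions (1 to N), and each picture is assumed to appear exactly once.

```python
def adjacent_pair_score(placed, correct):
    """Count adjacent pairs in `placed` that are also adjacent, in the same
    order, in `correct`. The slot where the pair appears does not matter."""
    correct_pairs = set(zip(correct, correct[1:]))
    return sum(pair in correct_pairs for pair in zip(placed, placed[1:]))


trial1 = list(range(1, 16))  # 15 pictures in the correct order
trial2 = list(range(1, 19))  # 18 pictures in the correct order

# Perfect placement earns one point per adjacent pair: 14 and 17 points.
max_trial1 = adjacent_pair_score(trial1, trial1)
max_trial2 = adjacent_pair_score(trial2, trial2)

# Hypothetical response: the first two pictures are swapped, which breaks
# the pairs (1, 2) and (2, 3) but leaves the other 12 pairs intact.
swapped = [2, 1] + list(range(3, 16))
swapped_score = adjacent_pair_score(swapped, trial1)

print(max_trial1, max_trial2, max_trial1 + max_trial2)  # 14 17 31
print(swapped_score)  # 12
```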
Primary Reference:

Dikmen, S. S., Bauer, P. J., Weintraub, S., Mungas, D., Slotkin, J., Beaumont, J. L., … & Heaton, R. K. (2014). Measuring episodic memory across the lifespan: NIH Toolbox Picture Sequence Memory Test. Journal of the International Neuropsychological Society, 20(6), 611-619. https://doi.org/10.1017/S1355617714000460

Software Reference:

NIH Toolbox for the iPad test ver. 2.1
https://nihtoolbox.force.com/s/article/nih-toolbox-scoring-and-interpretation-guide

Note: Participants in DLBS Wave 2 performed the NIH Toolbox Picture Sequence Memory task on a desktop computer, whereas participants in DLBS Wave 3 performed the task on an iPad. For additional details, see the NIH Toolbox website: https://www.healthmeasures.net/explore-measurement-systems/nih-toolbox/obtain-and-administer-measures

Episodic Memory Data Set: Key to Names and Data Structure in Data Set

Item Name | Abbreviation | Description | Measurement
Subject Number | S# | Subject identifier |
Age Interval | AgeInterval | Age at wave recoded into 3-year intervals | 20-100
Sex | Sex | Participant’s biological sex. | m = Male; f = Female
Race | Race | Race that the participant self-identifies with. | 1 = Asian American/Pacific Islander; 2 = Black/African American; 3 = Multiracial; 4 = Native American; 5 = White/Caucasian; 6 = Other; 7 = Unknown
Ethnicity | Ethnicity | Ethnicity that the participant self-identifies with. | 1 = Hispanic/Latin(o/a); 0 = Non-Hispanic
Handedness Score | HandednessScore | Average score of participant hand preference while completing various tasks. Higher scores indicate preference for the right hand. | Score range: 0-4; 0 = Always left; 1 = Usually left; 2 = No preference; 3 = Usually right; 4 = Always right
Mini-Mental State Exam Total | MMSE | Total # of items answered correctly. | Score range: 0-30
Cognitive Battery Wave 1-2 Interval | CogW1toW2 | Interval between cognitive testing day 1 for waves 1-2. | # of Years
Cognitive Battery Wave 2-3 Interval | CogW2toW3 | Interval between cognitive testing day 1 for waves 2-3. | # of Years
Cognitive Battery Wave 1-3 Interval | CogW1toW3 | Interval between cognitive testing day 1 for waves 1-3. | # of Years
Take Home Wave 1-2 Interval | TakeHomeW1toW2 | Interval between Take Home for waves 1-2. | # of Years
Take Home Wave 2-3 Interval | TakeHomeW2toW3 | Interval between Take Home for waves 2-3. | # of Years
Take Home Wave 1-3 Interval | TakeHomeW1toW3 | Interval between Take Home for waves 1-3. | # of Years
MRI Wave 1-2 Interval | MRIW1toW2 | Interval between MRI scan for waves 1-2. | # of Years
MRI Wave 2-3 Interval | MRIW2toW3 | Interval between MRI scan for waves 2-3. | # of Years
MRI Wave 1-3 Interval | MRIW1toW3 | Interval between MRI scan for waves 1-3. | # of Years
Amyloid PET Wave 1-2 Interval | PETAmyW1toW2 | Interval between amyloid PET scan for waves 1-2. | # of Years
Amyloid PET Wave 2-3 Interval | PETAmyW2toW3 | Interval between amyloid PET scan for waves 2-3. | # of Years
Amyloid PET Wave 1-3 Interval | PETAmyW1toW3 | Interval between amyloid PET scan for waves 1-3. | # of Years
Highest Level of Education Completed | EduComp5 | This is an ordinal measure of participants’ self-reported highest level of education completed. | 1 = Less than high school graduate; 2 = High school graduate/GED; 3 = Some college/trade/technical/business school; 4 = Bachelor’s degree; 5 = Some graduate work; 6 = Master’s degree; 7 = MD/JD/PhD/other advanced degree
Education Estimated Years Capped | EduYrsEstCap5 | This is a conversion of the participant’s self-reported highest level of education into a capped estimated number of years it would take to reach this highest level of education. The cap applies when someone spends longer than usual on a certain degree but does not complete it. In short, someone with many years of education who did not complete a degree will not score higher than someone who did complete the degree. | 11 maximum = Less than High school; 12 = High School; 15 maximum = Some College; 16 = Bachelor’s degree; 20 maximum = Some Graduate Work; 18 = Master’s degree; 21 = MD/JD/PhD/Advanced degree
Construct Name | ConstructName | Episodic Memory |
Construct Number | ConstructNumber | Construct 4 |
Wave | Wave | Denotes the data collection wave. See individual differences data set for more detail, including testing date intervals. | 1 = Wave 1; 2 = Wave 2; 3 = Wave 3
Has Data | HasData | 1 = Yes, returned for wave; 2 = No, did not return for wave |
Number of Tasks in Construct | NumTasks | How many tasks make up the episodic memory construct | 5 tasks for Episodic Memory
Task 15 - Hopkins Verbal Learning | Task15 (column: AA) | 1 = Has data; 2 = Task data partial; 3 = No task data |
Hopkins immediate recall | HopImmRcll15 | Total correctly recalled | Score Range: 0-12
Hopkins delayed recall | HopDelayRcll15 | Total correctly recalled | Score Range: 0-12
Hopkins delayed recognition | HopRcgCrrct15 | Total correct (hits + correct rejections) | Score Range: 0-24
Hopkins delayed recognition | HopRcgHit15 | Total hits (calling old item old) | Score Range: 0-12
Hopkins delayed recognition | HopRcgFaRelat15 | Total false alarms to distractors semantically related to target (calling new item old) | Score Range: 0-6
Hopkins delayed recognition | HopRcgFaUnrelat15 | Total false alarms to distractors semantically unrelated to target (calling new item old) | Score Range: 0-6
Hopkins delayed recognition | HopRcgFaTotal15 | Total false alarms to distractors (calling new item old) | Score Range: 0-12
Hopkins delayed recognition | HopRcgHitminusfa15 | Total hits minus false alarms | Score Range: -12 to 12
Task 16 - CANTAB Verbal Recognition Memory | Task16 (column: AJ) | 1 = Has data; 2 = Task data partial; 3 = No task data |
CANTAB Verbal Recognition immediate recall | CantabVrmImmRcll16 | Total correctly recalled | Score Range: 0-12
CANTAB Verbal Recognition delayed | CantabVrmDelayRcg16 | Total correctly recognized (hits + correct rejections) | Score Range: 0-24
Task 17 - Woodcock-Johnson Memory for Names | Task17 (column: AM) | 1 = Has data; 2 = Task data partial; 3 = No task data |
Woodcock-Johnson immediate recognition | WjImm17 | Total correctly recognized | Score Range: 0-72
Woodcock-Johnson delayed recognition | WjDelay17 | Total correctly recognized | Score Range: 0-36
Task 18 - Wechsler Memory Scale Logical Memory | Task18 (column: AP) | 1 = Has data; 2 = Task data partial; 3 = No task data |
Logical memory immediate recall | LmStoryAImm18 | Total immediate Story A recall score | Score Range: 0-25
Logical memory immediate recall | LmStoryBImm18 | Total immediate Story B recall score | Score Range: 0-25
Logical memory immediate recall | LmStoryImm18 | Total immediate Story A+B recall score | Score Range: 0-50
Logical memory delayed recall | LmStoryADelay18 | Total delayed Story A recall score | Score Range: 0-25
Logical memory delayed recall | LmStoryBDelay18 | Total delayed Story B recall score | Score Range: 0-25
Logical memory delayed recall | LmStoryDelay18 | Total delayed Story A+B recall score | Score Range: 0-50
Task 19 - NIH Toolbox Picture Sequence Memory | Task19 (column: AW) | 1 = Has data; 2 = Task data partial; 3 = No task data |
NIH Toolbox Picture Sequence Memory | NIHPicSeqRaw19 | Total number of pictures placed in the correct sequence across both trials | Score Range: 0-31
NIH Toolbox Picture Sequence Memory | NIHPicSeqComp19 | This computed score uses item response theory to put everyone on a scale of 200-750 | Score Range: 200-750
NIH Toolbox Picture Sequence Memory | NIHPicSeqUn19 | This score compares the performance of the test-taker to those in the entire NIH Toolbox nationally representative normative sample, regardless of age or any other variable. | Normative Mean = 100, SD = 15
NIH Toolbox Picture Sequence Memory | NIHPicSeqAge19 | This score compares the score of the test-taker to those in the NIH Toolbox nationally representative normative sample at the same age, where a score of 100 indicates performance that was at the national average for the test-taking participant’s age. Age-corrected standard scores were derived for adults (ages 18-85). | Mean = 100, Standard Deviation = 15
NIH Toolbox Picture Sequence Memory | NIHPicSeqPercent19 | A percentile represents the percentage of people nationally above whom the participant’s score ranks (the comparison group will be based on whichever normative score is used) | Percentile Rank: 0-100
NIH Toolbox Picture Sequence Memory | NIHPicSeqFully19 | This score compares the score of the test-taker to those in the NIH Toolbox nationally representative normative sample, while adjusting for key demographic variables (education, gender, and race/ethnicity) collected during the NIH Toolbox national norming study. | Mean = 50, Standard Deviation = 10
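
As a usage illustration of the key above, the sketch below selects rows where the participant returned for the wave (HasData = 1) and has complete Hopkins data (Task15 = 1) before reading the recall scores. The column names come from the key, but the CSV layout, the inline sample rows, and the subject numbers are all hypothetical; the actual file name and format are not specified in this document.

```python
import csv
import io

# Hypothetical rows, invented for illustration (not study data). Blank cells
# stand in for missing scores.
SAMPLE = """S#,Wave,HasData,Task15,HopImmRcll15,HopDelayRcll15
1001,1,1,1,9,8
1001,2,1,1,8,7
1002,1,1,3,,
1002,2,2,3,,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# Keep rows where the participant returned for the wave (HasData = 1)
# and the Hopkins task has complete data (Task15 = 1).
complete = [r for r in rows if r["HasData"] == "1" and r["Task15"] == "1"]

immediate_recall = [int(r["HopImmRcll15"]) for r in complete]
print(len(complete), immediate_recall)  # 2 [9, 8]
```

Filtering on the task-level flag before converting scores to numbers avoids errors from blank cells in rows coded as partial or missing.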