
The Google Effect and Miller’s Magic Number: Implications for Questionnaire and Clinical Outcome Assessment Design.

Dec 9, 2025

Mark Gibson, United Kingdom

Health Communication and Research Specialist

We have written a great deal about Miller’s Magic Number, the Google Effect and their impact on our cognitive abilities. We revisit them frequently because they have a direct impact on health information provision and writing for patient audiences. Clinical Outcome Assessments are particularly vulnerable to our patterns of cognitive offloading. This article explains why.

Memory Constraints in Self-Reported Assessments

In the age of digital information, working memory is no longer what it used to be. The Google Effect, also known as digital amnesia, describes how people tend to forget information that they can easily retrieve from the internet. Meanwhile, Miller’s Magic Number 7 (±2) suggests that short-term memory can only hold about 5-9 items at once. Later work by Nelson Cowan challenged this generous estimate, suggesting instead that the true capacity of working memory is closer to 4 ± 1 items. In other words, our real limit is narrower than Miller proposed, and the buffer smaller than many assume. With the Google Effect, there is evidence to suggest that this number has dropped to 3 to 4. We do not yet know the extent to which overuse of LLMs has further atrophied this ability. These cognitive limitations have significant implications for questionnaire design and Clinical Outcome Assessments (COAs), where memory recall is essential for accurate responses.

Questionnaires and COAs often rely on self-reported data, which assumes that respondents can accurately recall past experiences, symptoms or behaviours. However, the Google Effect means that people are more likely to remember where to find information rather than the information itself. This makes memory-based questions prone to errors, particularly when respondents:

·       Struggle to recall medical history, such as dates of diagnoses or when they started treatments.

·       Forget details about symptoms over long periods, e.g. “How often did you feel fatigued in the past six months?”

·       Rely on external sources (Google, medical records) to reconstruct answers rather than recalling them naturally.

A solution could be to reduce the recall periods in COA items. Instead of asking about a broad time frame, such as “in the past year”, use shorter recall periods like “in the past week…” to improve accuracy. Could reliance on recall also be supported by allowing reference to medical records, symptom-tracking apps or wearable devices?

Cognitive Offloading and Digital Questionnaire Design

Digital surveys and COAs allow patients to pause, return and look up information, which can help or hinder accuracy. The Google Effect encourages cognitive offloading, meaning that rather than storing information in memory, respondents expect to look it up when needed.

How This Affects Digital Surveys

·       Increased search behaviour: Respondents might Google symptoms before answering, influencing their self-perception and biasing responses.

·       Reduced engagement with complex surveys: If a questionnaire is too long or difficult, users may skip questions or select random answers.

·       Better accuracy when referencing external sources: If the goal is factual accuracy, such as medication lists, allowing respondents to access medical records improves reliability.

One solution could be to optimise digital forms so that they (a brief sketch follows this list):

·       auto-save responses and allow respondents to return later;

·       suggest possible answers, e.g. medication names in a drop-down menu instead of open text entry;

·       are mobile-friendly, aligning with how people access digital information.
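To make the drop-down suggestion and auto-save ideas concrete, here is a minimal Python sketch. It is not a survey platform: the medication list, the survey_draft.json file and the function names are all illustrative assumptions.

```python
# Minimal sketch (illustrative only): prefix-based suggestions for a
# medication field, plus auto-saving partial responses so a respondent
# can pause and return later.

import json
from pathlib import Path

MEDICATIONS = ["Amlodipine", "Amoxicillin", "Atorvastatin", "Metformin", "Sertraline"]
DRAFT_FILE = Path("survey_draft.json")  # hypothetical auto-save location


def suggest_medications(prefix: str, limit: int = 5) -> list[str]:
    """Return up to `limit` medication names matching what the respondent has typed."""
    prefix = prefix.strip().lower()
    return [m for m in MEDICATIONS if m.lower().startswith(prefix)][:limit]


def autosave(responses: dict) -> None:
    """Persist partial responses so the respondent can return later."""
    DRAFT_FILE.write_text(json.dumps(responses, indent=2))


def load_draft() -> dict:
    """Restore any previously saved partial responses."""
    return json.loads(DRAFT_FILE.read_text()) if DRAFT_FILE.exists() else {}


if __name__ == "__main__":
    responses = load_draft()              # resume where the respondent left off
    print(suggest_medications("am"))      # ['Amlodipine', 'Amoxicillin']
    responses["current_medication"] = "Amlodipine"
    autosave(responses)                   # saved after every answer, not only at the end
```

The point of the sketch is the pattern rather than the code itself: suggestions turn free recall into recognition, and saving after every answer removes the pressure to complete the form in one sitting.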

Overcoming the Google Effect in Self-Reported Data

One of the biggest challenges in clinical assessments is ensuring that self-reported data is as reliable as possible. The Google Effect means that people often rely on search results rather than personal memory, which can create bias and inconsistency in questionnaire responses.

This is how it affects clinical assessments:

·       A patient with mild depression may Google symptoms before answering and report feeling worse than they actually do.

·       A participant in a cognitive study may remember a symptom after seeing it listed rather than genuinely recalling it.

·       A participant taking a survey on past hospital visits may estimate incorrectly rather than checking records.

Solution: Reduce Bias and Improve Accuracy

·       Instead of free recall questions, use anchored response scales, such as “Compared with last week, how has your pain changed?”

·       If self-reported data is critical, cross-check responses with objective data, such as medical records or clinical assessments.

·       Use adaptive questioning. For example, if a participant selects “Yes” to taking medication, the system asks, “Do you recall the dosage?” instead of forcing an unnecessary open-ended response. A minimal sketch of this branching follows below.
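To illustrate the adaptive questioning point, here is a minimal sketch of the branching logic, assuming a simple console-based survey. The question wording and the follow-up rule are illustrative assumptions; a real COA would define this branching within a validated instrument, not in ad-hoc code.

```python
# Minimal sketch of adaptive questioning: the dosage follow-up is shown
# only when the respondent reports taking medication.

def ask(question: str) -> str:
    """Prompt the respondent and return their trimmed answer."""
    return input(f"{question} ").strip()


def medication_module() -> dict:
    """Ask dosage questions only after a 'Yes' to taking medication."""
    answers = {}
    answers["takes_medication"] = ask("Are you currently taking any medication? (Yes/No)")

    if answers["takes_medication"].lower().startswith("y"):
        # Respondents who answered "No" never see these follow-ups,
        # which avoids an unnecessary open-ended question.
        answers["recalls_dosage"] = ask("Do you recall the dosage? (Yes/No)")
        if answers["recalls_dosage"].lower().startswith("y"):
            answers["dosage"] = ask("Please enter the dosage:")

    return answers


if __name__ == "__main__":
    print(medication_module())
```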

Managing Cognitive Load with Miller’s and Cowan’s Numbers

Miller’s theory states that short-term memory can only hold about 7 ± 2 items at a time. Cowan’s refinement suggests that a more realistic figure is 4 ± 1, which makes the design implications even sharper: if patients are really only comfortable juggling four or so items, then even moderate-length response lists risk overload. This has direct implications for questionnaire design, especially in multiple-choice formats and Likert scales.

How Cognitive Load Affects Response Quality

·       Too many answer choices overwhelm respondents, leading to decision fatigue.

·       Long questions increase cognitive strain, making respondents more likely to misinterpret or skip questions.

·       Multiple questions per page reduce accuracy as respondents struggle to remember previous responses.

One solution is to optimise question and response design by (a short sketch follows this list):

·       Limiting response options to 5 to 7 choices to align with cognitive capacity. Or, if we take Cowan seriously, aiming for 3 to 5 options may be closer to the true human limit.

·       Using chunking techniques: breaking complex questions into smaller, digestible sections.

·       Including a progress bar in long surveys to maintain engagement and reduce dropout rates.
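As a small illustration of chunking, the sketch below splits a long list of placeholder items into pages of at most four, following Cowan’s 4 ± 1 estimate discussed above, and prints a simple “Page x of y” line that doubles as a progress indicator. The item texts and the page size are assumptions for illustration only.

```python
# Minimal sketch of chunking: split a questionnaire into pages of at most
# `chunk_size` items and show simple progress.

from typing import Iterator

ITEMS = [f"Question {i}" for i in range(1, 13)]  # 12 placeholder items


def chunk(items: list[str], chunk_size: int = 4) -> Iterator[list[str]]:
    """Yield successive pages containing at most `chunk_size` items each."""
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


if __name__ == "__main__":
    pages = list(chunk(ITEMS))
    for page_number, page in enumerate(pages, start=1):
        print(f"Page {page_number} of {len(pages)}")  # doubles as a progress indicator
        for item in page:
            print(f"  - {item}")
```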

From Miller’s 7 ± 2 to Google’s 3-4 and Now the LLM Era

If Miller’s number originally set the limit at 5-9, the Google Effect appears to have reduced this practical span to 3 to 4 items by encouraging people to externalise memory. In reality, Cowan had already placed the natural human limit at 4 ± 1; what Google and now LLMs may be doing is shrinking us below even that baseline. Large Language Models (LLMs) such as ChatGPT may go further by not just storing information but also processing it on behalf of the user.

This means that we are not only offloading what we know but, increasingly, how we think: summarisation, synthesis and evaluation can all now be outsourced. If people become habituated to delegating these operations, their ability to juggle multiple response options may further atrophy. Instead of weighing 5 to 7 choices, users may expect only 2 to 3 simplified alternatives.

Implications for Clinical Outcome Assessments

·       Choice overload becomes a risk even with 5 to 7 options; shorter 3- to 4-point scales may yield more reliable data.

·       Bias may increase if patients “prime” themselves with prototypical symptom descriptions from AI systems.

·       Recall-based measures may increasingly reflect cognitive offloading strategies rather than true memory.

In short, we may now be entering an era where 3 or fewer items are the practical limit of working memory span for many digital-age participants, with direct consequences for survey design, scale length and the interpretation of self-reported outcomes.


Thank you for reading,


Mark Gibson

Cockermouth, United Kingdom, August 2025

Originally written in English.