Methodology

How we measure cognitive performance — and what we don't claim.

A short walk through the construct mapping, scoring, aggregation, and privacy posture behind the WelloWork platform. We publish methodology as we generate pilot data.

01 Adaptive task: Difficulty calibrates per session
02 Construct mapping: One primary cognitive construct
03 Per-employee baseline normalisation: Compared to your own trend
04 Team-level aggregate: Threshold-gated, weekly smoothed

No clinical claims. No population ranking.

Working Memory, Processing Speed, Attention, Problem Solving, Cognitive Flexibility

What is the WelloWork methodology, in one paragraph?

WelloWork measures cognitive performance via short adaptive tasks mapped to five validated constructs: working memory, processing speed, attention, problem solving, and cognitive flexibility. Per-session scores are normalised against a per-employee baseline and aggregated into team-level trends with minimum-team-size enforcement. We make no clinical or diagnostic claims about biomarker reports.

How are exercises mapped to constructs?

Each task on the platform maps to one primary construct and at most one secondary construct. The primary mapping drives scoring; the secondary mapping is captured for methodology audit but does not contribute to the headline metric. We use established paradigms — N-back, span tasks, symbol-substitution, Posner cueing, Raven-style reasoning, task-switching — adapted for short on-platform sessions.

Task                   Primary construct

Sequence recall        Working Memory
Symbol substitution    Processing Speed
Target detection       Attention

Each task maps to one primary construct. Secondary signals are captured but do not drive the score.
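
The mapping rule above (one primary construct, at most one secondary, only the primary drives the score) can be sketched as a small data structure. This is an illustrative sketch, not the platform's implementation: the names TaskMapping, TASK_MAPPINGS, and scoring_construct are hypothetical, and secondary constructs are left empty because the page does not list them.

```python
from dataclasses import dataclass
from typing import Optional

# The five constructs named on this page.
CONSTRUCTS = {
    "Working Memory",
    "Processing Speed",
    "Attention",
    "Problem Solving",
    "Cognitive Flexibility",
}


@dataclass(frozen=True)
class TaskMapping:
    """Hypothetical record: one primary construct, at most one secondary."""
    task: str
    primary: str
    secondary: Optional[str] = None  # kept for audit only, never scored

    def __post_init__(self):
        if self.primary not in CONSTRUCTS:
            raise ValueError(f"unknown primary construct: {self.primary}")
        if self.secondary is not None:
            if self.secondary not in CONSTRUCTS:
                raise ValueError(f"unknown secondary construct: {self.secondary}")
            if self.secondary == self.primary:
                raise ValueError("secondary must differ from primary")


# Only the three rows shown in the table above; secondaries are not published.
TASK_MAPPINGS = [
    TaskMapping("Sequence recall", "Working Memory"),
    TaskMapping("Symbol substitution", "Processing Speed"),
    TaskMapping("Target detection", "Attention"),
]


def scoring_construct(mapping: TaskMapping) -> str:
    """The headline metric uses the primary construct only."""
    return mapping.primary
```

Validating at construction time means a mis-specified mapping fails when it is defined rather than surfacing later as a wrongly attributed score.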

Why WelloWork scenarios — not cognitive games?

Research consistently shows people get better at cognitive games — not at the complex decisions their jobs actually require. WelloWork scenarios are grounded in behavioral field work.

Standard cognitive game: Game-transfer problem

Meta-analyses consistently show people improve at the game. Transfer to real workplace complexity: minimal.

WelloWork behavioral scenario: Ecologically valid

Built from field observations of how people actually reason at work. Not a proxy. Not a game.

How are scores computed?

Per-session performance is normalised against the employee's own running baseline: each score is expressed as a z-score relative to that employee's own sessions from the last 90 days. This deliberately avoids comparing one employee against another at the individual level, since population-relative scoring is sensitive to noise that doesn't matter in a workplace context.

Population ranking: Not used

Comparing an employee against a peer distribution. Noisy at the individual level, and uncomfortable in a workplace.

Within-person baseline: WelloWork approach

Each session is scored against the employee's own running baseline from the last 90 days.

We don't compare employees to each other. We track each person against their own established baseline.
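
As a rough illustration of within-person scoring, the sketch below z-scores a new session against the same employee's sessions from the preceding 90 days. Only the 90-day window comes from the text above. The function name baseline_z, the minimum of five baseline sessions, and the choice to return None when no stable baseline exists are assumptions made for the example.

```python
from datetime import date, timedelta
from statistics import mean, stdev
from typing import List, Optional, Tuple

BASELINE_DAYS = 90          # from the methodology text
MIN_BASELINE_SESSIONS = 5   # assumption: not stated on this page


def baseline_z(
    history: List[Tuple[date, float]],
    session_day: date,
    raw_score: float,
) -> Optional[float]:
    """Z-score one session against the employee's own recent sessions.

    history holds (day, raw_score) pairs for ONE employee; no peer
    data is involved at any point.
    """
    window_start = session_day - timedelta(days=BASELINE_DAYS)
    window = [s for d, s in history if window_start <= d < session_day]
    if len(window) < MIN_BASELINE_SESSIONS:
        return None  # not enough history to form a baseline
    sd = stdev(window)
    if sd == 0:
        return None  # flat baseline: z-score is undefined
    return (raw_score - mean(window)) / sd
```

A score equal to the employee's recent mean maps to 0, and a score one baseline standard deviation above it maps to 1, regardless of how colleagues performed.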

How are aggregations done?

Minimum team size enforced

Aggregates only appear when the team threshold is met.

Weekly smoothing

Single-day spikes don't drive manager attention.

Work event annotations

Sprint reviews, releases, and on-call periods are tagged automatically.

Level vs. variability

Reported separately — they answer different questions.
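
The aggregation rules above can be sketched together as follows. This is one possible reading under stated assumptions, not the production pipeline: "weekly smoothing" is interpreted as an ISO-week mean, "variability" as the spread of per-employee weekly means, and the threshold value MIN_TEAM_SIZE = 5 is hypothetical, since the page does not state the actual number. Work event annotations are left out.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev

MIN_TEAM_SIZE = 5  # hypothetical value; the real threshold is not stated here


def weekly_team_aggregate(sessions):
    """sessions: iterable of (employee_id, day, z_score) for one team.

    Returns {(iso_year, iso_week): summary or None}; None marks a
    week suppressed because too few employees contributed.
    """
    by_week = defaultdict(lambda: defaultdict(list))
    for employee_id, day, z in sessions:
        iso = day.isocalendar()
        by_week[(iso[0], iso[1])][employee_id].append(z)

    result = {}
    for week, per_employee in sorted(by_week.items()):
        if len(per_employee) < MIN_TEAM_SIZE:
            result[week] = None  # threshold gate: suppress at the source
            continue
        # Average within each employee first so heavy users do not dominate.
        employee_means = [mean(zs) for zs in per_employee.values()]
        result[week] = {
            "level": mean(employee_means),          # where the team sits
            "variability": pstdev(employee_means),  # how spread out it is
            "n": len(employee_means),
        }
    return result
```

Level and variability are returned as separate fields because a stable team mean can hide widening spread, and the reverse.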

What do we deliberately not claim?

We do not claim transfer of cognitive training to specific business outcomes (revenue, productivity). We do not claim clinical or diagnostic value for biomarker reports. We do not claim individual employee ranking is reliable from short adaptive tasks — only that trends are. And we do not invent metrics: every claim ties back to a published construct or to a methodology note we will publish under /research.

What we measure: Supported

Cognitive trends over time
Team-level patterns
Construct-mapped behavioral signals
Within-person variance

What we do not claim: Out of scope

Transfer to specific business outcomes
Clinical or diagnostic value
Reliable individual ranking from short tasks
Self-invented metrics

How do we publish updates?

Methodology notes will be posted under /research/science-insight as pilot cohorts produce enough data to write something defensible. We will not publish individual customer data, and we will not publish aggregates that don't meet our minimum-team threshold.

M1: Pilot cohort data collected
M2: Methodology note drafted + reviewed
M3: Published under /research/science-insight

We publish when the data is defensible. Not before.

Privacy in the methodology

Methodology and privacy are linked. The platform's choice to normalise within an employee, not against a population, is also what makes it harder to "de-anonymise" an aggregate.

Why within-person normalisation protects privacy

An individual baseline carries no information about peers — there is no peer distribution to back-solve against.

Why we don't publish sub-threshold aggregates

Small-cohort aggregates can be reverse-engineered to a single employee. We suppress them at the source.

The same methodological choices that make our measurement accurate also make de-anonymisation structurally harder.
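
To make the small-cohort risk concrete, here is a minimal worked example of the back-solving arithmetic. The function name back_solve_last_member is hypothetical, and the numbers are made up for illustration: if a team mean is published and all but one member's scores are known (for instance, your own), the remaining score follows directly.

```python
def back_solve_last_member(team_mean, team_size, known_scores):
    """Recover the one unknown score from a published team mean.

    mean = (sum(known) + unknown) / team_size, so
    unknown = mean * team_size - sum(known).
    """
    if len(known_scores) != team_size - 1:
        raise ValueError("need every score except exactly one")
    return team_mean * team_size - sum(known_scores)


# A two-person "team" with a published mean of 0.5: someone who knows
# their own score was 1.0 learns that their colleague scored 0.0.
colleague_score = back_solve_last_member(0.5, 2, [1.0])
```

The larger the cohort, the more scores an observer would need to know before this works, which is the reasoning behind suppressing sub-threshold aggregates.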

Want the methodology walked through live?

A demo includes a 5–10 minute slot to dig into how a specific exercise maps to a construct, and how the score lands in the manager view.
