Product

By respondent, group, question, or cohort

Combinable filters

Transparent methodology

A language score that is readable, actionable, and traceable.

The language level score combines readability, sentence structure, lexical richness, punctuation, and register signals to help you interpret open answers with more nuance.


Global level

0 to 100

A synthetic score to compare your segments quickly.

Indicative CEFR reading

A1 to C2

A pedagogical reference point, not an official certification.

Visible components

5 families

Each subscore is displayed so you can interpret the result.

Scoring methodology

1. Scope aggregation

The engine aggregates responses for the selected scope (question, respondent, group, cohort) using the same filters as your other analyses.

2. Metric extraction

We extract simple indicators: average sentence length, lexical density, vocabulary variety, punctuation, abbreviations, and out-of-language words.
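As an illustration of this step, here is a minimal Python sketch that computes three of the indicators named above for a single response. The function name, the tokenization rules, and the exact definitions are assumptions made for the example, not Harmate's implementation; lexical density, abbreviations, and out-of-language words would need a part-of-speech tagger or word lists and are left out.

```python
import re
import string


def extract_metrics(text: str) -> dict:
    """Compute a few simple surface indicators for one response (illustrative only)."""
    # Naive sentence split: runs of . ! ? followed by whitespace or end of text.
    sentences = [s for s in re.split(r"[.!?]+(?:\s+|$)", text) if s.strip()]
    # Word tokens: runs of letters, allowing internal apostrophes or hyphens.
    words = re.findall(r"[^\W\d_]+(?:['’-][^\W\d_]+)*", text.lower())
    # ASCII punctuation characters only (an assumption; typographic marks are ignored).
    punct = [ch for ch in text if ch in string.punctuation]
    n_words = len(words)
    return {
        "avg_sentence_length": n_words / len(sentences) if sentences else 0.0,
        "vocabulary_variety": len(set(words)) / n_words if n_words else 0.0,
        "punctuation_per_word": len(punct) / n_words if n_words else 0.0,
    }


metrics = extract_metrics("The cat sat. The cat ran!")
```

Here vocabulary variety is a plain type-token ratio, which is sensitive to text length; it is used only because it is the simplest possible stand-in.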

3. Component normalization

Each indicator family is normalized onto a shared scale to prevent one raw metric from artificially dominating the score.
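One common way to do this is min-max scaling with clamping, sketched below. The 0 to 100 target range matches the global score described on this page, but the function name and the reference bounds are hypothetical; the page does not state which normalization Harmate uses.

```python
def normalize(value: float, low: float, high: float) -> float:
    """Map a raw metric onto 0-100, clamping values outside [low, high]."""
    if high <= low:
        raise ValueError("high must be greater than low")
    scaled = (value - low) / (high - low) * 100.0
    return max(0.0, min(100.0, scaled))


# Hypothetical bounds: 5 words per sentence maps to 0, 25 words per sentence to 100.
sentence_length_score = normalize(15, low=5, high=25)
```

Clamping keeps a single extreme value, such as one very long sentence, from pushing a component beyond the shared scale.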

4. Composite score and reading

The final score combines weighted components and provides an indicative language level reading (A1 to C2) to simplify reporting.
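The sketch below shows what a weighted combination and a band lookup could look like. The weights, the component keys, and the band cut-offs are invented for the example; the page does not publish the actual values.

```python
from bisect import bisect_right

# Hypothetical weights and cut-offs, chosen only for illustration.
WEIGHTS = {
    "readability": 0.25,
    "syntax": 0.25,
    "lexical": 0.25,
    "punctuation": 0.10,
    "register": 0.15,
}
BAND_FLOORS = [20, 35, 50, 65, 80]  # lower bounds of A2, B1, B2, C1, C2
BANDS = ["A1", "A2", "B1", "B2", "C1", "C2"]


def composite_score(subscores: dict) -> float:
    """Weighted mean of the 0-100 subscores that are present."""
    if not subscores:
        raise ValueError("at least one subscore is required")
    total_weight = sum(WEIGHTS[name] for name in subscores)
    return sum(WEIGHTS[name] * value for name, value in subscores.items()) / total_weight


def indicative_band(score: float) -> str:
    """Return the indicative band whose lower bound the score has reached."""
    return BANDS[bisect_right(BAND_FLOORS, score)]


score = composite_score(
    {"readability": 80, "syntax": 60, "lexical": 70, "punctuation": 50, "register": 40}
)
band = indicative_band(score)
```

Dividing by the sum of the weights actually used keeps the result on the 0 to 100 scale even when a component is missing for a given scope.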

Tracked components

Readability

Estimates reading ease with formulas based on sentence and word length.
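For reference, the Flesch Reading Ease formula for English is 206.835 - 1.015 × (words per sentence) - 84.6 × (syllables per word). The sketch below applies it with a rough vowel-group syllable counter; the helper names and the syllable heuristic are assumptions for the example, and the page does not say which formulas or language adaptations Harmate applies.

```python
import re


def count_syllables(word: str) -> int:
    """Rough English heuristic: one syllable per vowel group, at least one per word."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text: str):
    """Flesch Reading Ease for English text, or None if the text has no words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return None
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


ease = flesch_reading_ease("The cat sat.")
```

Higher values indicate easier text. Very short, simple inputs can exceed 100, which is one reason raw readability values are normalized before being combined with other components.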

Syntactic structure

Measures sentence complexity through construction variety and structural depth.

Lexical richness

Measures word diversity with indices that stay stable across texts of different lengths.
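MTLD, one of the measures compared in McCarthy & Jarvis (2010), is an example of a length-robust diversity index. The following is a simplified sketch of it, using the commonly cited 0.72 type-token threshold; the function names and the choice to return None for texts without any repeated word are assumptions for the example, and nothing here implies that Harmate uses this exact measure.

```python
def _mtld_pass(tokens: list, threshold: float) -> float:
    """One directional pass: count how many 'factors' the text contains."""
    factors = 0.0
    seen = set()
    count = 0
    for tok in tokens:
        count += 1
        seen.add(tok)
        # A factor ends when the running type-token ratio drops below the threshold.
        if len(seen) / count < threshold:
            factors += 1.0
            seen.clear()
            count = 0
    if count:
        # Partial credit for the unfinished segment at the end of the text.
        factors += (1.0 - len(seen) / count) / (1.0 - threshold)
    return len(tokens) / factors


def mtld(tokens: list, threshold: float = 0.72):
    """Mean of forward and backward passes; None when no word is ever repeated."""
    if not tokens or len(set(tokens)) == len(tokens):
        return None  # assumption: treat the measure as undefined in this case
    forward = _mtld_pass(tokens, threshold)
    backward = _mtld_pass(tokens[::-1], threshold)
    return (forward + backward) / 2.0


diversity = mtld("the cat saw the dog and the dog saw the cat".split())
```

Unlike a plain type-token ratio, this value reflects how many tokens it takes on average for diversity to fall to the threshold, so it does not shrink simply because a response is longer.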

Punctuation

Analyzes punctuation density and diversity as an indicator of discourse structuring.

Register signals

Detects abbreviations, code-mixing, and informal markers.

Response volume

Accounts for usable response length so that very short verbatims do not lead to fragile conclusions.

How to interpret this score

Compare before evaluating

The score is especially useful comparatively: between cohorts, between groups, or before/after an intervention.

Read subscores

The same global score can come from different linguistic profiles. Components explain what actually changes.

Use domain context

Expected level depends on context (training, recruiting, internal diagnosis). The score is not an end in itself.

Avoid use for automated selection

We recommend using it as a reading and facilitation aid, never as a single criterion for individual decisions.

Methodology sources

The score relies on metric families widely documented in readability, linguistic complexity, and CEFR literature.


Flesch (1948) - readability

Foundational article on readability assessment based on length indicators.

Journal of Applied Psychology


Kincaid et al. (1975) - readability formulas

Technical report describing grade-level and readability formulas.

U.S. Navy / University of Central Florida


McCarthy & Jarvis (2010) - lexical diversity

Comparison of MTLD, vocd-D, and HD-D to measure lexical richness.

Behavior Research Methods


Lu (2010) - syntactic complexity

Reference work on automatic analysis of syntactic complexity in L2 writing.

International Journal of Corpus Linguistics


CEFR Companion Volume (Council of Europe)

A1 to C2 descriptors used as pedagogical interpretation references.

Council of Europe


GLUECoS (ACL 2020) - code-switching

Reference benchmark for multilingual text analysis and code-mixing.

Association for Computational Linguistics


Some academic references are available through DOI or scientific publishers. Harmate applies these principles pragmatically for operational interpretation of open responses.

Related pages

Measure the linguistic quality of your open responses

Enable the language score in your Harmate analyses and compare populations with explicit components.