Wiktionary:Frequency lists/Esperanto/Tekstaro 2023

This list is based on the words found in the Tekstaro dump of 2023-03-15. All words are reduced to their base form: the plural -j and accusative -n are stripped, and the verb endings -as/-is/-os/-us/-u are replaced by the infinitive -i. Each word is listed in its most typical case form (lower-case, capitalized, or all-caps). Proper names that have not been Esperantized are mostly omitted unless they are listed in common dictionaries. The corpus contains more than 10 million words in total.
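The reduction to base forms can be illustrated with a minimal Python sketch. This is not the script used to build the list; the function name `base_form` and the exception set `INVARIANT` are hypothetical, and the exception set is deliberately incomplete. It only shows why a naive suffix rule needs exceptions: words such as "kaj", "kun" or "kiu" merely look as if they carried a grammatical ending.

```python
# Hypothetical, incomplete exception list: words whose final letters only
# look like a grammatical ending and must be left untouched.
INVARIANT = {
    "kaj", "en", "jen", "nun", "kun", "sen", "tamen", "tuj", "ajn", "kvin",
    "unu", "plu", "plus", "minus", "kiu", "tiu", "iu", "ĉiu", "neniu",
}

# Finite verb endings that are mapped to the infinitive ending -i.
VERB_ENDINGS = ("as", "is", "os", "us", "u")


def base_form(word: str) -> str:
    """Reduce a lower-case or capitalized Esperanto word to a base form (sketch)."""
    if word.lower() in INVARIANT:
        return word

    stem = word
    stripped = False
    # The accusative -n follows the plural -j, so it is removed first.
    if len(stem) > 2 and stem.endswith("n"):
        stem = stem[:-1]
        stripped = True
    if len(stem) > 2 and stem.endswith("j"):
        stem = stem[:-1]
        stripped = True
    if stripped:
        # A word that carried -j or -n is not a finite verb form.
        return stem

    for ending in VERB_ENDINGS:
        # Require at least two letters before the ending, so that short
        # words such as "ĝis" or "du" are not mistaken for verbs.
        if len(stem) > len(ending) + 1 and stem.endswith(ending):
            return stem[: -len(ending)] + "i"
    return stem
```

For example, `base_form("hundojn")` returns `"hundo"` and `base_form("parolas")` returns `"paroli"`, while `base_form("kaj")` is returned unchanged. A production version would need a much fuller exception list and handling for all-caps input.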

A version easier to copy and to parse is available on GitHub. There is also a version where words that are not present in ESPDIC (Esperanto-English Dictionary) are filtered out, and one version where each Esperanto word is directly followed by English translations.

First hundred by frequency

100 most common words

Together these 100 words cover 52.70% of the whole corpus.
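A coverage figure of this kind is the summed count of the top-ranked words divided by the total number of word tokens in the corpus. The following Python sketch shows the arithmetic; the function name `coverage` is hypothetical, and the counts in the usage note are toy values, not Tekstaro data.

```python
from collections import Counter


def coverage(counts: Counter, top_n: int) -> float:
    """Percentage of all tokens accounted for by the top_n most frequent words."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("empty corpus")
    # most_common() returns (word, count) pairs sorted by descending count.
    top = sum(count for _, count in counts.most_common(top_n))
    return 100.0 * top / total
```

With the toy counts `Counter({"la": 50, "kaj": 30, "de": 15, "hundo": 5})`, `coverage(counts, 2)` gives 80.0, because the two most frequent words account for 80 of the 100 tokens.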

Second hundred

Frequency rank of 101–200

Together these 200 words cover 59.60% of the whole corpus.

Third hundred

Frequency rank of 201–300

Together these 300 words cover 63.66% of the whole corpus.

Fourth hundred

Frequency rank of 301–400

Together these 400 words cover 66.56% of the whole corpus.

Fifth hundred

Frequency rank of 401–500

Together these 500 words cover 68.83% of the whole corpus.

Frequency rank of 501–1000

Together these 1000 words cover 76.15% of the whole corpus.

Frequency rank of 1001–2000

Together these 2000 words cover 83.46% of the whole corpus.

Frequency rank of 2001–3000

Together these 3000 words cover 87.37% of the whole corpus.

Frequency rank of 3001–4000

Together these 4000 words cover 89.89% of the whole corpus.

Frequency rank of 4001–5000

Together these 5000 words cover 91.66% of the whole corpus.

Frequency rank of 5001–6000

Together these 6000 words cover 92.98% of the whole corpus.

Frequency rank of 6001–7000

Together these 7000 words cover 94.01% of the whole corpus.

Frequency rank of 7001–8000

Together these 8000 words cover 94.84% of the whole corpus.

Frequency rank of 8001–9000

Together these 9000 words cover 95.51% of the whole corpus.

Frequency rank of 9001–10000

Together these 10000 words cover 96.08% of the whole corpus.

Frequency rank of 10001–11000

Together these 11000 words cover 96.55% of the whole corpus.

Frequency rank of 11001–12000

Together these 12000 words cover 96.96% of the whole corpus.

Frequency rank of 12001–13000

Together these 13000 words cover 97.31% of the whole corpus.

Frequency rank of 13001–14000

Together these 14000 words cover 97.61% of the whole corpus.

Frequency rank of 14001–15000

Together these 15000 words cover 97.88% of the whole corpus.