Based on the words found in the Tekstaro dump of 2023-03-15. All words are reduced to their base form: the plural ending -j and the accusative ending -n are stripped, and the verb endings -as/-is/-os/-us/-u are replaced by the infinitive ending -i. Each word is listed in its most typical letter case (lower-case, capitalized, or all-caps). Proper names that have not been adapted to Esperanto are mostly omitted, unless they are listed in common dictionaries. The corpus contains more than 10 million words in total.
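The reduction to base forms described above can be sketched roughly as follows. This is only an illustration of the stated rules, not the script actually used to build the list: the function name `base_form` and the `INVARIANT` exception set are assumptions, and the exception set is deliberately incomplete. It is needed because many short function words (kaj, ĝis, plus, nun, tiu, ...) only look as if they carry a grammatical ending. Choosing the most typical letter case is a separate step that is not shown here.

```python
# Simplified sketch of the base-form reduction (assumed rules, not the
# original tooling). Input is lower-cased before the endings are examined.

# Hypothetical, incomplete list of words whose final letters must be kept.
INVARIANT = {
    "kaj", "tuj", "ajn", "sen", "jen", "kun", "nun", "tamen",
    "ĝis", "ĵus", "plus", "minus",
    "du", "unu", "nu", "ju", "ĉu", "plu",
    "tiu", "kiu", "ĉiu", "iu", "neniu",
}

VERB_ENDINGS = ("as", "is", "os", "us")


def base_form(word: str) -> str:
    """Reduce an Esperanto word form to an approximate base form."""
    w = word.lower()
    if w in INVARIANT:
        return w

    # Strip the accusative -n (hundon -> hundo, min -> mi).
    if len(w) > 2 and w.endswith("n") and w[-2] in "aeioju":
        w = w[:-1]

    # Strip the plural -j (hundoj -> hundo, tiuj -> tiu).
    if len(w) > 2 and w.endswith("j") and w[-2] in "aou":
        w = w[:-1]

    # Correlatives such as "tiu" must not be mistaken for imperatives.
    if w in INVARIANT:
        return w

    # Replace finite verb endings with the infinitive -i.
    for ending in VERB_ENDINGS:
        if w.endswith(ending) and len(w) > len(ending):
            return w[: -len(ending)] + "i"

    # Imperative -u (venu -> veni).
    if w.endswith("u") and len(w) > 1:
        return w[:-1] + "i"

    return w


for form in ["hundojn", "belajn", "parolas", "venu", "tiujn", "kaj"]:
    print(form, "->", base_form(form))
```

A purely suffix-based approach like this will still misclassify some words, such as names ending in -as, so a real pipeline would probably need a dictionary lookup or a much longer exception list.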
A version that is easier to copy and parse is available on GitHub. There is also a version that filters out words not present in ESPDIC (Esperanto-English Dictionary), and another in which each Esperanto word is directly followed by its English translations.
The 100 most frequent words together cover 52.70% of the whole corpus.
The 200 most frequent words together cover 59.60% of the whole corpus.
The 300 most frequent words together cover 63.66% of the whole corpus.
The 400 most frequent words together cover 66.56% of the whole corpus.
The 500 most frequent words together cover 68.83% of the whole corpus.
The 1000 most frequent words together cover 76.15% of the whole corpus.
The 2000 most frequent words together cover 83.46% of the whole corpus.
The 3000 most frequent words together cover 87.37% of the whole corpus.
The 4000 most frequent words together cover 89.89% of the whole corpus.
The 5000 most frequent words together cover 91.66% of the whole corpus.
The 6000 most frequent words together cover 92.98% of the whole corpus.
The 7000 most frequent words together cover 94.01% of the whole corpus.
The 8000 most frequent words together cover 94.84% of the whole corpus.
The 9000 most frequent words together cover 95.51% of the whole corpus.
The 10000 most frequent words together cover 96.08% of the whole corpus.
The 11000 most frequent words together cover 96.55% of the whole corpus.
The 12000 most frequent words together cover 96.96% of the whole corpus.
The 13000 most frequent words together cover 97.31% of the whole corpus.
The 14000 most frequent words together cover 97.61% of the whole corpus.
The 15000 most frequent words together cover 97.88% of the whole corpus.
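Coverage figures of this kind are cumulative: the occurrences of the N most frequent base forms are summed and divided by the total number of word occurrences in the corpus. A minimal sketch of that calculation follows. The function name `coverage` and the toy counts are made up for illustration and are not taken from the Tekstaro data.

```python
from collections import Counter


def coverage(counts: Counter, cutoffs: list[int]) -> dict[int, float]:
    """Return the percentage of all tokens covered by the top-n words."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("counts must contain at least one occurrence")
    # Frequencies sorted from most to least common.
    ranked = sorted(counts.values(), reverse=True)
    return {n: 100 * sum(ranked[:n]) / total for n in cutoffs}


# Toy frequency table (hypothetical numbers, 10 tokens in total).
toy_counts = Counter({"la": 5, "kaj": 3, "de": 1, "hundo": 1})
print(coverage(toy_counts, [1, 2, 4]))
```

With the toy table, the single most frequent word accounts for 5 of 10 tokens and the top two for 8 of 10, which mirrors the diminishing returns visible in the real figures.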