10 results found for "Wiktionary:Corpora"

Wiktionary:Corpora

the work of creating a dictionary. These collections are often known as "corpora" or less commonly "corpuses". Many of them feature functions like full-text...

Wiktionary:Frequency lists/Danish

(multiple corpora, 2014-2021) Word frequency lists for Danish and other languages from 10K up to 1M, available for download as part of the Leipzig Corpora Collection...

Wiktionary:Frequency lists/Finnish

languages from 10K up to 1M, available for download as part of the Leipzig Corpora Collection (CC BY-4.0) 50K and larger word lists based on www.opensubtitles...

Wiktionary:Frequency lists/Uyghur

Word frequency lists for English and other languages from 10K up to 1M, available for download as part of the Leipzig Corpora Collection (CC BY-4.0)...

Wiktionary:Frequency lists/Cebuano

Cebuano from 10K up to 1M, available for download as part of the Leipzig Corpora Collection (CC BY-4.0) 200 most used words in Cebuano according to Binisaya...

Wiktionary:Frequency lists/Odia

Odia and other languages from 10K up to 1M, available for download as part of the Leipzig Corpora Collection (CC BY-4.0) 1,000 most common Odia words...

Wiktionary:Frequency lists/Silesian

Word frequency lists for Silesian and other languages from 10K up to 1M, available for download as part of the Leipzig Corpora Collection (CC BY-4.0)...

Wiktionary:Frequency lists/Manx

languages from 10K up to 1M, available for download as part of the Leipzig Corpora Collection (CC BY-4.0) 1,000 most common Manx words Archived link....

Wiktionary:Frequency lists/North Frisian

frequency lists for North Frisian and other languages from 10K up to 1M, available for download as part of the Leipzig Corpora Collection (CC BY-4.0)...

Wiktionary:Frequency lists/Kabardian

Word frequency lists for Kabardian and other languages from 10K up to 1M, available for download as part of the Leipzig Corpora Collection (CC BY-4.0)...

Wiktionary:Corpora

This page lists collections of texts useful for the work of creating a dictionary. These collections are often known as "corpora" or, less commonly, "corpuses". Many of them offer functions such as full-text search, term frequency information, and collocation search. For a more user-friendly introduction to some of the most prominent corpora, as well as other resources such as dictionaries, see Wiktionary:Quotations/Resources. Another page, Wiktionary:Searchable external archives, focuses more specifically on archives that can reliably provide citations passing Wiktionary's criteria for inclusion. Note that corpora containing text in multiple languages, but in which English text makes up a significant portion, are listed in the English table below, with the word "Multilingual" included in their "Dialect" field. If you know of any other resources that aren't listed here, please add them or suggest them on the talk page.

