User talk:Equinox/code/ExtractBookWords

Hello @Equinox,

I used this, changing the line

               if ((ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z'))

to

               if ((ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z') || (ch == 'æ') || (ch == 'ø') || (ch == 'å') || (ch == 'Æ') || (ch == 'Ø') || (ch == 'Å'))

yet the program still throws out those extra letters. It works fine apart from that. Perhaps you can tell me what I've done wrong?__Gamren (talk) 14:00, 5 October 2016 (UTC)
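
A side note on the condition itself: instead of listing each extra letter, the check could probably rely on `char.IsLetter`, which is true for any Unicode letter. The sketch below is only an illustration under that assumption; `WordChars`, `IsWordChar` and `KeepLetters` are hypothetical names, not part of the original program, and this only helps once the file has been decoded into the right characters in the first place.

```csharp
using System;
using System.Text;

static class WordChars
{
    // Hypothetical helper: true for any Unicode letter,
    // which covers A-Z and a-z as well as æ, ø, å and their capitals.
    public static bool IsWordChar(char ch)
    {
        return char.IsLetter(ch);
    }

    // Hypothetical helper: replaces every non-letter with a space,
    // so the remaining runs of letters can be split into words.
    public static string KeepLetters(string text)
    {
        var sb = new StringBuilder(text.Length);
        foreach (char ch in text)
        {
            sb.Append(IsWordChar(ch) ? ch : ' ');
        }
        return sb.ToString();
    }
}
```

Note that `char.IsLetter` also accepts letters from other scripts (Greek, Cyrillic and so on), so the explicit list is still the better choice if only the Danish alphabet should count.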

My immediate thought is that the program uses File.ReadAllText(string) and not File.ReadAllText(string, Encoding): if your file contains complex characters, you might need to specify the encoding, such as UTF-8 or Windows-Latin-1. That's just a guess. Since "throws out" in your comment might mean either "discards" or "outputs" (isn't English dumb?), I don't really understand what problem you are having. Equinox 20:00, 5 October 2016 (UTC)
Thank you! UTF8 didn't work for some reason, so I used
           string s = File.ReadAllText(INPUT_FILE, Encoding.UTF7);
which worked. I am a complete newbie to C#, or C in general. By "throws out" I meant "discards".__Gamren (talk) 06:26, 6 October 2016 (UTC)
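
A likely explanation, offered as a guess rather than a diagnosis: if the input file was saved in a single-byte encoding such as ISO-8859-1 (Latin-1), then the bytes for æ, ø and å are not valid UTF-8 and get decoded as replacement characters, which the letter check then discards. .NET's UTF-7 decoder happens to pass such bytes through as the characters with the same code, which would explain why `Encoding.UTF7` appeared to work; it is marked obsolete in newer .NET versions, so naming the real encoding is probably safer. The sketch below assumes a Latin-1 file; `EncodingSketch`, `ReadWith` and `RoundTrip` are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;
using System.Text;

static class EncodingSketch
{
    // Hypothetical helper: reads a whole text file with an explicit encoding,
    // i.e. the File.ReadAllText(string, Encoding) overload mentioned above.
    public static string ReadWith(string path, Encoding encoding)
    {
        return File.ReadAllText(path, encoding);
    }

    // Hypothetical helper: writes text to a temporary file in one encoding
    // and reads it back in another, to show what a mismatch does.
    public static string RoundTrip(string text, Encoding writeAs, Encoding readAs)
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, text, writeAs);
            return ReadWith(path, readAs);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

With `Encoding latin1 = Encoding.GetEncoding("iso-8859-1");`, calling `EncodingSketch.RoundTrip("blåbær", latin1, latin1)` returns the text unchanged, whereas reading the same bytes back with `Encoding.UTF8` yields U+FFFD replacement characters in place of å and æ.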
By the way, how would you recommend easily getting books in text format, apart from Gutenberg?__Gamren (talk) 09:20, 6 October 2016 (UTC)
I don't really know any other (legal!) sources. Equinox 16:11, 7 October 2016 (UTC)
If copyright is the issue, how about if one changed the sequence of words, e.g. through alphabetization? Surely that would not constitute infringement?__Gamren (talk) 09:40, 8 October 2016 (UTC)
You're using the copyrighted work to create another work, so I think that counts as a "derivative work" or something. IANAL. Equinox 09:51, 8 October 2016 (UTC)
I mean, I don't think that generating a word list from a typical novel etc. is a problem, but I thought you were asking where to get hold of computerised copies of books that are still in copyright. You'd have to go to illegal torrents etc. (or maybe hack Amazon Kindle's DRM!). Equinox 09:53, 8 October 2016 (UTC)