Module:string/compare

local byte = string.byte
local match = string.match
local sub = string.sub

--[==[
A comparison function for strings, which returns {true} if {a} sorts before {b}, or otherwise {false}; it can be used as the sort function with {table.sort}.

This function always sorts in byte order, which makes it roughly equivalent to the {<} operator, but with fixes for two serious bugs:
* {<} is supposed to compare UTF-8 codepoints in the two strings, but when a codepoint that is U+10000 or above is encountered in the left-hand string, {<} always returns {false}, irrespective of the content of the other string.
* {<} treats unassigned codepoints and non-UTF-8 byte sequences as being higher than {"\0"} but lower than {"\1"}, instead of sorting according to byte order.]==]
return function(a, b)
	-- Equality check.
	if a == b then
		return false
	end
	-- Byte comparison is slow, so only do it when it's really needed:
	-- iterate over both strings, grabbing a set of ASCII bytes followed by
	-- a set of non-ASCII bytes from each (either of which could be empty),
	-- and compare them with ==. If the ASCII substrings are unequal, just
	-- use <, since the bug won't affect it. Otherwise, compare bytes in the
	-- non-ASCII substrings.
	local loc, ascii_a, nonascii_a, ascii_b, nonascii_b = 1
	repeat
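		-- First capture: a run of ASCII bytes (0-127); second capture: a run
		-- of non-ASCII bytes (128-255). Either run may be empty.
		ascii_a, nonascii_a = match(a, "^([^\128-\255]*)([\128-\255]*)", loc)
		ascii_b, nonascii_b, loc = match(b, "^([^\128-\255]*)([\128-\255]*)()", loc) -- update `loc` on the second call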
		-- When comparing ASCII sets, use <. The lower substring will be
		-- from the lower string *except* when it comprises the start of the
		-- other substring and is followed by a non-ASCII character. For
		-- instance, if `ascii_a` is "pqrs":
		-- If `ascii_b` is "abc", `b` is lower, since "abc" < "pqrs".
		-- If `ascii_b` is "pqr" and followed by non-ASCII "ž", `a` is
		-- lower, since "pqrs" < "pqrž".
		-- If `ascii_b` is "pqr" and at the end of `b`, `b` is lower, since
		-- "pqr" < "pqrs".
		if ascii_a ~= ascii_b then
			if ascii_a < ascii_b then
				return nonascii_a == "" or ascii_a ~= sub(ascii_b, 1, #ascii_a)
			end
			return not (nonascii_b == "" or ascii_b ~= sub(ascii_a, 1, #ascii_b))
		end
	-- If the non-ASCII parts are not equal, terminate the loop.
	until nonascii_a ~= nonascii_b
	-- If either one is the empty string, then the end of that string has
	-- been reached, making it the lower string.
	if nonascii_a == "" then
		return true
	elseif nonascii_b == "" then
		return false
	end
	loc = 1
	while true do
		-- Reading 4 bytes at a time balances two costs: it keeps the number
		-- of byte() calls low, without grabbing many unnecessary bytes
		-- beyond the first difference.
		local b_a1, b_a2, b_a3, b_a4 = byte(nonascii_a, loc, loc + 3)
		if b_a1 == nil then
			return true
		end
		local b_b1, b_b2, b_b3, b_b4 = byte(nonascii_b, loc, loc + 3)
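		-- Compare explicitly against nil so that the result is always a
		-- boolean: a missing byte means that string has ended, so it is lower.
		if b_a1 ~= b_b1 then
			return b_b1 ~= nil and b_a1 < b_b1
		elseif b_a2 ~= b_b2 then
			return b_a2 == nil or (b_b2 ~= nil and b_a2 < b_b2)
		elseif b_a3 ~= b_b3 then
			return b_a3 == nil or (b_b3 ~= nil and b_a3 < b_b3)
		elseif b_a4 ~= b_b4 then
			return b_a4 == nil or (b_b4 ~= nil and b_a4 < b_b4)
		end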
		loc = loc + 4
	end
end
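
The contract described in the documentation comment (return true when `a` sorts before `b` in byte order, so that the function can be passed to `table.sort`) can be illustrated with a minimal sketch. The function `byte_less` below is a hypothetical stand-in written for this example: it is a naive byte-by-byte comparison, not the module's optimized implementation, and it is assumed only to produce the same ordering. The sample strings are the ones used in the code comments above, with "ž" written as its UTF-8 bytes.

```lua
-- Hypothetical stand-in for the module's returned function: a naive
-- byte-by-byte comparison with the same contract (true if a sorts before b).
local function byte_less(a, b)
	local len_a, len_b = #a, #b
	for i = 1, math.min(len_a, len_b) do
		local byte_a, byte_b = a:byte(i), b:byte(i)
		if byte_a ~= byte_b then
			return byte_a < byte_b
		end
	end
	-- One string is a prefix of the other (or they are equal): the shorter
	-- string sorts first, and equal strings return false.
	return len_a < len_b
end

-- "\197\190" is the UTF-8 encoding of "ž" (U+017E).
local words = {"pqrs", "pqr\197\190", "abc", "pqr"}
table.sort(words, byte_less)
-- words is now {"abc", "pqr", "pqrs", "pqr\197\190"}
```

In a module that loads the real function, the same `table.sort` call would take that function in place of `byte_less`.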