Module:encodings

The module defines a set of "encoders", each of which converts text into a given encoding. More encoders can be added to the module as necessary.

encode

Function encode(text, encoding)

Takes 'text' in UTF-8, converts it into the encoding named by 'encoding', percent-encodes the result, and returns the resulting string.

Explanation

  • UTF-8: the wiki software and all its pages and output use UTF-8.
  • %-encoding: 1) the encoded text is bound to contain byte sequences that are invalid UTF-8, and Scribunto does not allow modules to return invalid UTF-8 (it replaces any invalid bytes in the output with �); 2) the primary use of this function is to encode text for use in URLs (external links) on sites that use older encodings.
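
As an illustration of the first point, here is a minimal JavaScript sketch (not part of the module, which is written in Lua; the variable names are made up for this example). It takes the ISO 8859-1 bytes of "abundància", decodes them as if they were UTF-8, and then checks that the percent-encoded form is valid UTF-8.

```javascript
// Illustration only; not part of the Lua module.
// ISO 8859-1 maps each code point below 256 to the byte of the same value.
const latin1Bytes = Uint8Array.from("abund\u00E0ncia", (ch) => ch.codePointAt(0));

// Read as UTF-8, the lone byte 0xE0 is invalid and is replaced by U+FFFD.
const decoded = new TextDecoder("utf-8").decode(latin1Bytes);
console.log(decoded);

// The percent-encoded form is plain ASCII, so a strict UTF-8 decoder accepts it.
const escapedBytes = new TextEncoder().encode("abund%E0ncia");
const roundTrip = new TextDecoder("utf-8", { fatal: true }).decode(escapedBytes);
console.log(roundTrip);
```

The `fatal: true` option makes the decoder throw on invalid input instead of substituting U+FFFD, so the second decode succeeding shows that the escaped string is safe to return.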

Custom encoder methods

Number difference

  • Paste the input and output hex numbers into LibreOffice Calc and sort by input.
  • Convert from base 16 to base 10 (e.g. with onlinenumbertools.com, toolslick.com, or the custom converter below).
  • Calculate the difference between input and output (using cell references).
  • Sort by difference, color the background of repeating differences, then sort by input again.
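
The same steps can also be scripted rather than done in a spreadsheet. The following is a rough JavaScript sketch, assuming the data is available as pairs of hex strings; `differenceTable` is a hypothetical helper and the four sample pairs (Cyrillic letters with their Windows-1251 bytes) are illustrative only.

```javascript
// Hypothetical helper, not part of the module.
// Each pair is [input code point in hex, output byte in hex].
function differenceTable(pairs) {
  const rows = pairs.map(([inHex, outHex]) => {
    const input = parseInt(inHex, 16);   // base 16 -> number
    const output = parseInt(outHex, 16);
    return { input, output, diff: input - output };
  });
  rows.sort((a, b) => a.input - b.input); // sort by input

  // Count how often each difference occurs. A difference that repeats
  // suggests a range that can be handled with a single offset.
  const counts = new Map();
  for (const row of rows) {
    counts.set(row.diff, (counts.get(row.diff) || 0) + 1);
  }
  return { rows, counts };
}

// Illustrative sample data.
const { rows, counts } = differenceTable([
  ["0411", "C1"], // Б
  ["0410", "C0"], // А
  ["044F", "FF"], // я
  ["0401", "A8"], // Ё
]);

for (const [diff, n] of counts) {
  console.log(`difference ${diff} occurs ${n} time(s)`);
}
```

Differences that occur many times are candidates for a fixed subtraction over a whole range, while one-off differences need a per-character lookup table.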

Custom converter

<!DOCTYPE html>
<html>
<head>
<script>
function baseconvert() {
	var base0 = parseInt( document.querySelector("#base0").innerText );
	var base1 = parseInt( document.querySelector("#base1").innerText );
	var dText0 = document.querySelector("#text0");
	var dText1 = document.querySelector("#text1");
	var ns = dText0.value.split("\n"); // numbers
	dText1.innerHTML = "";
	for (var i=0; i < ns.length; i++) {
		if (i)
			dText1.append( "\n" );
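		// convert only lines made of digits and letters (assumed character class, covers bases up to 36)
		if ( ns[i].search(/^[0-9a-z]+$/i) != -1 )
			dText1.append( parseInt(ns[i],base0).toString(base1) ); // XX > 10 > ZZ
	}
	dText0.style.height = "32px";  // reset, then grow each textarea to fit its content
	dText1.style.height = "32px";
	dText0.style.height = dText0.scrollHeight + "px";
	dText1.style.height = dText1.scrollHeight + "px";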
}
document.addEventListener("DOMContentLoaded", function() {
	baseconvert();
	document.querySelector("#text0").addEventListener("input", baseconvert);
});
</script>
<style>
body {background: lightgray; display: grid; grid-template-columns: auto auto; gap: 8px;}
#desc {grid-column: 1 / -1;}
#desc, #base0, #base1 {justify-self: center; width: auto;}
#text0, #text1 {font-family: monospace, monospace; resize: none;}
</style>
</head>
<body>
	<div id="desc">Convert from left to right</div>
	<div id="base0" contenteditable>16</div>
	<div id="base1" contenteditable>10</div>
	<textarea id="text0">BADCAFE</textarea>
	<textarea id="text1" readonly></textarea>
</body>
</html>

Examples

{{R:IEC2}} (Catalan IEC dictionary) required input in ISO 8859-1 encoding:

  • {{#invoke:encodings|encode|abundància|ISO 8859-1}}
  • abund%E0ncia (< abundància)
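
To make the example concrete, here is a hedged JavaScript analogue of what `encode` does for ISO 8859-1. It is a sketch, not the module's code: `encodeLatin1Percent` is a hypothetical name, and the escaping rule is a simplification that does not reproduce `mw.uri.encode` exactly (for instance, the default mode of `mw.uri.encode` turns spaces into `+`, while this sketch emits `%20`).

```javascript
// Hypothetical analogue of encode(text, "ISO 8859-1"); the real module is Lua.
function encodeLatin1Percent(text) {
  let out = "";
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (cp >= 256) {
      // ISO 8859-1 only covers code points 0..255.
      throw new RangeError(`Invalid ISO 8859-1 character "${ch}".`);
    }
    // Simplified rule: keep ASCII letters, digits, "-", "_" and "." as they are,
    // and write every other byte as %XX.
    out += /^[A-Za-z0-9\-_.]$/.test(ch)
      ? ch
      : "%" + cp.toString(16).toUpperCase().padStart(2, "0");
  }
  return out;
}

console.log(encodeLatin1Percent("abund\u00E0ncia"));
```

Characters outside ISO 8859-1, such as Cyrillic letters, make the function throw, mirroring the error raised by the module's ISO 8859-1 encoder.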


local export = {}

local m_str_utils = require("Module:string utilities")

local char = string.char
local concat = table.concat
local gcodepoint = m_str_utils.gcodepoint
local insert = table.insert
local u = m_str_utils.char

local encoders = {}

encoders["ISO 8859-1"] = function(text)
	local ret = {}
	
	for cp in gcodepoint(text) do
		if cp >= 256 then
			error("Invalid ISO 8859-1 character \"" .. u(cp) .. "\".")
		end
		
		insert(ret, char(cp))
	end
	
	return concat(ret)
end

encoders = function(text)	-- ]
	local ret = {}
	local range -- 0 1 2 3 4 5 6 7
	local diff2 = {=57, =98, =98, =58, =40, =52, =56, =69, =95, =94, =93, =95, =77, =96}
	local diff4 = {=21, =62, =76, =22, =19, =31, =20, =24, =59, =58, =57, =59, =56, =60, =103, =89}
	local diff6 = {=61, =61, =71, =71, =88, =73, =73, =90, =90, =90, =77, =97, =103, =110, =95, =228, =285, =329}
	
	for cp in gcodepoint(text) do
		range = (191<cp and 1 or 0) + (1023<cp and 1 or 0) + (1039<cp and 1 or 0) + (1104<cp and 1 or 0) + (1169<cp and 1 or 0) + (8210<cp and 1 or 0) + (8482<cp and 1 or 0)
		if range==2 and diff2[cp] then
			cp = cp - 800 - diff2[cp]
		elseif range==3 then
			cp = cp - 848
		elseif range==4 and diff4[cp] then
			cp = cp - 900 - diff4[cp]
		elseif range==6 and diff6[cp] then
			cp = cp - 8000 - diff6[cp]
		elseif range>0 then
			cp = 63 -- '?'
		end
		insert(ret, char(cp))
	end
	return concat(ret)
end

function export.encode(text, encoding)
	if type(text) == "table" then
		local params = {
			[1] = {required = true, allow_empty = true},
			[2] = {required = true},
		}
		
		local args = require("Module:parameters").process(text.args, params)
		text = args[1]
		encoding = args[2]
	end
	
	local encoder = encoders[encoding]
	
	if not encoder then
		error("No encoder exists for the encoding \"" .. encoding .. "\".")
	end
	
	return mw.uri.encode(encoder(text))
end

return export