Module:Languages/documentation
This module is used to retrieve and manage the languages that can have Wiktionary entries, and the information associated with them. See Wiktionary:Languages for more information.
For the languages and language varieties that may be used in etymologies, see Module:etymology languages. For language families, which sometimes also appear in etymologies, see Module:families.
This module provides access to other modules. To access the information from within a template, see Module:languages/templates.
The information itself is stored in the various data modules that are subpages of this module. These modules should not be used directly by any other module; the data should only be accessed through the functions provided by this module.
Data submodules:
- Two-letter codes
- Three-letter codes, split by their first letter
- Codes containing hyphens
Extra data submodules (for less frequently used data):
- Two-letter codes
- Three-letter codes, split by their first letter
- Codes containing hyphens
Finding and retrieving languages
The module exports a number of functions that are used to find languages.
getByCode
getByCode(code, paramForError, allowEtymLang, allowFamily)
Finds the language whose code matches the one provided. If it exists, it returns a Language object representing the language. Otherwise, it returns nil, unless paramForError is given, in which case an error is generated. If paramForError is true, a generic error message mentioning the bad code is generated; otherwise paramForError should be a string or number specifying the parameter that the code came from, and this parameter will be mentioned in the error message along with the bad code. If allowEtymLang is specified, etymology language codes are allowed and looked up along with normal language codes. If allowFamily is specified, language family codes are allowed and looked up along with normal language codes.
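The two failure modes described above can be sketched as follows. This is only an illustration under stated assumptions: nameOrFallback and languageFromParam are hypothetical helper names, not part of the module, and require("Module:languages") resolves only inside Scribunto on the wiki.

```lua
-- Hypothetical helpers; only getByCode and getCanonicalName come from
-- the documentation above.

-- Returns the canonical name for `code`, or `fallback` if the code is unknown.
-- No paramForError is passed, so getByCode returns nil instead of raising.
local function nameOrFallback(code, fallback)
    local lang = require("Module:languages").getByCode(code)
    if lang == nil then
        return fallback
    end
    return lang:getCanonicalName()
end

-- Looks up the code found in template parameter `param` (a string or number).
-- Because paramForError is given, an unknown code raises an error that
-- mentions both the parameter and the bad code.
local function languageFromParam(args, param)
    return require("Module:languages").getByCode(args[param], param)
end
```

The first form suits code that can continue without a language; the second suits template entry points, where the editor needs to be told which parameter held the bad code.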
getByCanonicalName
getByCanonicalName(name, paramForError, allowEtymLang, allowFamily)
Finds the language whose canonical name (the name used to represent that language on Wiktionary) or other name matches the one provided. If it exists, it returns a Language object representing the language. Otherwise, it returns nil, unless paramForError is given, in which case an error is generated. If allowEtymLang is specified, etymology languages are also looked up by name along with normal languages. If allowFamily is specified, language families are also looked up by name along with normal languages.
The canonical name of languages should always be unique (it is an error for two languages on Wiktionary to share the same canonical name), so this is guaranteed to give at most one result.
This function is powered by Module:languages/canonical names, which contains a pre-generated mapping of non-etymology-language canonical names to codes. It is generated by going through the Category:Language data modules for non-etymology languages. When allowEtymLang is specified for the above function, Module:etymology languages/by name may also be used, and when allowFamily is specified for the above function, Module:families/by name may also be used.
getByName
getByName(name)
Like getByCanonicalName, except it also looks at the otherNames listed in the non-etymology language data modules, and does not (currently) have options to look up etymology languages and families.
getNonEtymological
getNonEtymological(lang)
If given an etymology language, this iterates through parents until a regular language or family is found, and the corresponding object is returned. If given a regular language or family, the object itself is returned.
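As a rough usage sketch, the function can be combined with getByCode to map any code onto its regular language. regularCodeFor is a hypothetical helper, and the sketch assumes the returned object offers getCode, which this page documents only for Language objects (a family could also be returned).

```lua
-- Hypothetical helper: resolves a language or etymology-language code to the
-- code of the regular language (or family) it belongs to.
local function regularCodeFor(code)
    local m_languages = require("Module:languages")
    -- paramForError is nil (no error on failure); allowEtymLang is true so
    -- that etymology language codes are found as well.
    local lang = m_languages.getByCode(code, nil, true)
    if lang == nil then
        return nil
    end
    -- For a regular language, the same object is returned unchanged.
    -- Assumption: the result has a getCode method.
    return m_languages.getNonEtymological(lang):getCode()
end
```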
Finding all languages
Use Module:languages/iterateAll to find all languages.
Language objects
A Language object is returned from one of the functions above. It is a Lua representation of a language and the data associated with it. It has a number of methods that can be called on it, using the : syntax. For example:
local m_languages = require("Module:languages")
local lang = m_languages.getByCode("fr")
local name = lang:getCanonicalName()
-- "name" will now be "French"
Language:getCode
:getCode()
Returns the language code of the language. Example: "fr" for French.
Language:getCanonicalName
:getCanonicalName()
Returns the canonical name of the language. This is the name used to represent that language on Wiktionary, and is guaranteed to be unique to that language alone. Example: "French" for French.
Language:getDisplayForm
:getDisplayForm()
Returns the display form of the language. The display form of a language, family or script is the form it takes when appearing as the SOURCE in categories such as English terms derived from SOURCE or English given names from SOURCE, and is also the displayed text in :makeCategoryLink links. For regular and etymology languages, this is the same as the canonical name, but for families, it reads "NAME languages" (e.g. "Germanic languages"), and for scripts, it reads "NAME script" (e.g. "Arabic script").
Language:getOtherNames
:getOtherNames(onlyOtherNames)
Returns a table of the "other names" that the language is known by, excluding the canonical name. The names are not guaranteed to be unique, in that sometimes more than one language is known by the same name. Example: {"Manx Gaelic", "Northern Manx", "Southern Manx"} for Manx. If onlyOtherNames is given and is non-nil, only names explicitly listed in the otherNames field are returned; otherwise, names listed under otherNames, aliases and varieties are combined together and returned. For example, for Manx, Manx Gaelic is listed as an alias, while Northern Manx and Southern Manx are listed as varieties. Note that the otherNames field itself is deprecated, and entries listed there should eventually be moved to either aliases or varieties.
Language:getAliases
:getAliases()
Returns a table of the aliases that the language is known by, excluding the canonical name. Aliases are synonyms for the language in question. The names are not guaranteed to be unique, in that sometimes more than one language is known by the same name. Example: lua
for German.
Language:getVarieties
:getVarieties(flatten)
Returns a table of the known subvarieties of a given language, excluding subvarieties that have been given explicit etymology language codes. The names are not guaranteed to be unique, in that sometimes a given name refers to a subvariety of more than one language. Example: lua
for Aymara. Note that the returned value can have nested tables in it, when a subvariety goes by more than one name. Example: a table containing the nested entry {"Afshar", "Afshari", "Afshar Azerbaijani", "Afchar"} for Azerbaijani. Here, Afshar, Afshari, Afshar Azerbaijani and Afchar all refer to the same subvariety, whose preferred name is Afshar (the one listed first). To avoid a return value with nested tables in it, specify a non-nil value for the flatten parameter; in that case, the four names appear as separate top-level entries of the returned table, in the same order.
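The effect of the flatten parameter can be pictured with the short sketch below. It is not the module's own code: flattenVarieties is a hypothetical name, and "Variety A" and "Variety B" are placeholders standing in for real variety names; only the Afshar group is taken from the text above.

```lua
-- Flattens one level of nesting: each nested table of alternative names is
-- expanded in place, preserving the original order.
local function flattenVarieties(varieties)
    local flat = {}
    for _, entry in ipairs(varieties) do
        if type(entry) == "table" then
            for _, name in ipairs(entry) do
                table.insert(flat, name)
            end
        else
            table.insert(flat, entry)
        end
    end
    return flat
end

-- Illustrative input only (placeholders plus the Afshar group).
local nested = {
    "Variety A",
    { "Afshar", "Afshari", "Afshar Azerbaijani", "Afchar" },
    "Variety B",
}
local flat = flattenVarieties(nested)
-- flat is { "Variety A", "Afshar", "Afshari", "Afshar Azerbaijani", "Afchar", "Variety B" }
```

Callers that need the preferred name of each subvariety should keep the nested form and read the first element of each group, since flattening discards the grouping.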
Language:getType
:getType()
Returns the type of language, which can be "regular", "reconstructed" or "appendix-constructed".
Language:getWikimediaLanguages
:getWikimediaLanguages()
Returns a table containing WikimediaLanguage objects (see Module:wikimedia languages), which represent languages and their codes as they are used in Wikimedia projects for interwiki linking and such. More than one object may be returned, as a single Wiktionary language may correspond to multiple Wikimedia languages. For example, Wiktionary's single code sh (Serbo-Croatian) maps to four Wikimedia codes: sh (Serbo-Croatian), bs (Bosnian), hr (Croatian) and sr (Serbian).
The code for the Wikimedia language is retrieved from the wikimedia_codes property in the data modules. If that property is not present, the code of the current language is used. If none of the available codes is actually a valid Wikimedia code, an empty table is returned.
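The fallback rule just described can be sketched as below. This is an assumption-laden illustration, not the module's implementation: it works on plain code strings rather than WikimediaLanguage objects, it assumes wikimedia_codes is stored as a table of strings, and isValidWikimediaCode is a hypothetical predicate supplied by the caller.

```lua
-- `data` stands for a language's entry in the data modules and `ownCode` for
-- the language's own code. `isValidWikimediaCode` is a hypothetical predicate.
local function wikimediaCodesFor(data, ownCode, isValidWikimediaCode)
    -- Use wikimedia_codes when present; otherwise fall back to the own code.
    local candidates = data.wikimedia_codes or { ownCode }
    local valid = {}
    for _, code in ipairs(candidates) do
        if isValidWikimediaCode(code) then
            table.insert(valid, code)
        end
    end
    -- Empty when none of the candidates is a valid Wikimedia code.
    return valid
end
```

With the Serbo-Croatian example above, an entry whose wikimedia_codes lists sh, bs, hr and sr would yield all four codes, provided the predicate accepts each of them.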
Language:getWikipediaArticle
:getWikipediaArticle()
Returns the name of the Wikipedia article for the language. If the wikipedia_article property is present in the data module, it is used. Otherwise, a sitelink is generated from :getWikidataItem (if set). Failing that, :getCategoryName is used as a fallback.
Language:getWikidataItem
:getWikidataItem()
Returns the Wikidata item ID for the language, or nil. This corresponds to the second field in the data modules.
Language:getScripts
:getScripts()
Returns a table of Script objects for all scripts that the language is written in. See Module:scripts.
Language:getScriptCodes
:getScriptCodes()
Returns the table of script codes in the language's data file.
Language:getFamily
:getFamily()
Returns a Family object for the language family that the language belongs to. See Module:families.
Language:getAncestors
:getAncestors()
Returns a table of Language objects for all languages that this language is directly descended from. Generally this is only a single language, but creoles, pidgins and mixed languages can have multiple ancestors.
Language:getCategoryName
:getCategoryName(nocap)
Returns the name of the main category of that language. Example: "French language" for French, whose category is at Category:French language. Unless the optional argument nocap is given, the language name at the beginning of the returned value will be capitalized. This capitalization is correct for category names, but not if the language name is lowercase and the returned value of this function is used in the middle of a sentence.
Language:makeCategoryLink
:makeCategoryLink()
Creates a link to the category; the link text is the canonical name.
Language:makeEntryName
:makeEntryName(term)
Converts the given term into the form used in the names of entries. This removes diacritical marks from the term if they are not considered part of the normal written form of the language and are therefore not permitted in page names. It also removes certain punctuation characters, such as final question marks or periods, which are never present in page names. Example for Latin: "amīcus" → "amicus" (macron is removed).
The replacements made by this function are defined by the entry_name setting for each language in the data modules.
Language:makeSortKey
:makeSortKey(entryName)
Creates a sort key for the given entry name, following the rules appropriate for the language. This removes diacritical marks from the entry name if they are not considered significant for sorting, and may perform some other changes. Any initial hyphen is also removed, and anything in parentheses is removed as well.
The sort_key setting for each language in the data modules defines the replacements made by this function, or it gives the name of the module that takes the entry name and returns a sort key.
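Since makeSortKey expects an entry name rather than a raw term, the two methods are naturally chained. The sketch below shows one way to do that; entryNameAndSortKey is a hypothetical helper, and lang is assumed to be a Language object obtained from one of the lookup functions.

```lua
-- Hypothetical helper: derives both the page name and the sort key for a term.
local function entryNameAndSortKey(lang, term)
    -- Strip diacritics and punctuation that never appear in page names.
    local entryName = lang:makeEntryName(term)
    -- Build the sort key from the entry name, not from the raw term.
    local sortKey = lang:makeSortKey(entryName)
    return entryName, sortKey
end
```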
Language:transliterate
:transliterate(text, sc, module_override)
Transliterates the text from the given script into the Latin script (see Wiktionary:Transliteration and romanization). The language must have the translit_module property for this to work; if it is not present, nil is returned.
The sc parameter is handled by the transliteration module, and how it is handled is specific to that module. Some transliteration modules may tolerate nil as the script; others require it to be one of the scripts that the module can transliterate, and will show an error if it is not. For this reason, the sc parameter should always be provided when writing non-language-specific code.
The module_override parameter is used to override the default module that is used to provide the transliteration. This is useful when you need to demonstrate a particular module in use but there is no default module yet, or when you want to demonstrate an alternative version of a transliteration module before making it official. It should not be used in real modules or templates, only for testing. All uses of this parameter are tracked by Template:tracking/module_override.
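A defensive calling pattern for non-language-specific code might look like the sketch below. transliterateIfPossible is a hypothetical helper; lang is assumed to be a Language object, and sc is whatever script value the caller already holds, passed through unchanged.

```lua
-- Hypothetical helper: returns a transliteration of `text`, or nil when the
-- language has no transliteration module.
local function transliterateIfPossible(lang, text, sc)
    if not lang:hasTranslit() then
        return nil
    end
    -- sc is always passed, since some transliteration modules require it.
    -- module_override is deliberately omitted: it is meant for testing only.
    return lang:transliterate(text, sc)
end
```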
Language:hasTranslit
:hasTranslit()
Returns true if the language has a transliteration module, false if it doesn't.
Language:getRawData
:getRawData()
- This function is not for use in entries or other content pages.
Returns a blob of data about the language. The format of this blob is undocumented, and perhaps unstable; it's intended for things like the module's own unit-tests, which are "close friends" with the module and will be kept up-to-date as the format changes.
Language:getRawExtraData
:getRawExtraData()
- This function is not for use in entries or other content pages.
Returns a blob of data about the language that contains the "extra data". Much like with getRawData, the format of this blob is undocumented, and perhaps unstable; it's intended for things like the module's own unit-tests, which are "close friends" with the module and will be kept up-to-date as the format changes.
Error function
See also
- Module:etymology languages
- Module:families
- Module:languages/templates
- Module:languages/JSON