pywikibot Package

The initialization file for the Pywikibot framework.

class pywikibot.__init__.UnicodeMixin[source]

Bases: object

Mixin class to add __str__ method in Python 2 or 3.

pywikibot.__init__.translate(code, xdict, parameters=None, fallback=False)[source]

Return the most appropriate translation from a translation dict.

Given a language code and a dictionary, returns the dictionary's value for key 'code' if this key exists; otherwise tries to return a value for an alternative language that is most applicable to use on the wiki in language 'code', unless fallback is False.

The language itself is always checked first, then languages that have been defined to be alternatives, and finally English. If none of the options gives a result, the first language in the list is used.

For PLURAL support, have a look at the twntranslate method.

Parameters:
  • code (string or Site object) – The language code
  • xdict (dict, string, unicode) – dictionary with language codes as keys or extended dictionary with family names as keys containing language dictionaries or a single (unicode) string. May contain PLURAL tags as described in twntranslate
  • parameters (dict, string, unicode, int) – For passing (plural) parameters
  • fallback (boolean) – Try an alternate language code
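
A minimal usage sketch (the dictionary contents here are invented for illustration)::

    import pywikibot

    greeting = {
        'en': u'Hello world',
        'de': u'Hallo Welt',
    }
    # Returns u'Hallo Welt'; for a code without an entry, alternative
    # languages and finally English are tried when fallback is True.
    text = pywikibot.translate('de', greeting, fallback=True)
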
class pywikibot.__init__.Page(source, title='', ns=0, defaultNamespace='[deprecated name of ns]', insite=NotImplemented)[source]

Bases: pywikibot.page.BasePage

Page: A MediaWiki page.

class pywikibot.__init__.FilePage(source, title='', insite=NotImplemented)[source]

Bases: pywikibot.page.Page

A subclass of Page representing a file description page.

Supports the same interface as Page, with some added methods.

fileIsOnCommons()[source]

DEPRECATED. Check if the image is stored on Wikimedia Commons.

Returns:bool
fileIsShared()[source]

Check if the file is stored on any known shared repository.

Returns:bool
fileUrl()[source]

Return the URL for the file described on this page.

getFileMd5Sum()[source]

Return image file’s MD5 checksum.

getFileSHA1Sum()[source]

Return the file’s SHA1 checksum.

getFileVersionHistory()[source]

Return the file’s version history.

Returns:A list of dictionaries with the following keys:
[comment, sha1, url, timestamp, metadata,
height, width, mime, user, descriptionurl, size]
getFileVersionHistoryTable()[source]

Return the version history in the form of a wiki table.

getImagePageHtml()[source]

Download the file page, and return the HTML, as a unicode string.

Caches the HTML code, so that if you run this method twice on the same FilePage object, the page will only be downloaded once.

usingPages(step=None, total=None, content=False)[source]

Yield Pages on which the file is displayed.

Parameters:
  • step – limit each API call to this number of pages
  • total – iterate no more than this number of pages in total
  • content – if True, load the current content of each iterated page (default False)
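
A short sketch of typical FilePage usage (the site and file title are placeholders)::

    import pywikibot

    site = pywikibot.Site('en', 'wikipedia')
    image = pywikibot.FilePage(site, u'File:Example.jpg')

    if not image.fileIsShared():
        pywikibot.output(image.fileUrl())

    # Iterate at most ten pages that display the file.
    for page in image.usingPages(total=10):
        pywikibot.output(page.title())
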
class pywikibot.__init__.Category(source, title='', sortKey=None, insite=NotImplemented)[source]

Bases: pywikibot.page.Page

A page in the Category: namespace.

articles(recurse=False, step=None, total=None, content=False, namespaces=None, sortby='', starttime=None, endtime=None, startsort=None, endsort=None, startFrom='[deprecated name of startsort]')[source]

Yield all articles in the current category.

By default, yields all pages in the category that are not subcategories!

Parameters:
  • recurse (int or bool) – if not False or 0, also iterate articles in subcategories. If an int, limit recursion to this number of levels. (Example: recurse=1 will iterate articles in first-level subcats, but no deeper.)
  • step – limit each API call to this number of pages
  • total – iterate no more than this number of pages in total (at all levels)
  • namespaces (int or list of ints) – only yield pages in the specified namespaces
  • content – if True, retrieve the content of the current version of each page (default False)
  • sortby (str) – determines the order in which results are generated, valid values are “sortkey” (default, results ordered by category sort key) or “timestamp” (results ordered by time page was added to the category). This applies recursively.
  • starttime (pywikibot.Timestamp) – if provided, only generate pages added after this time; not valid unless sortby=”timestamp”
  • endtime (pywikibot.Timestamp) – if provided, only generate pages added before this time; not valid unless sortby=”timestamp”
  • startsort (str) – if provided, only generate pages >= this title lexically; not valid if sortby=”timestamp”
  • endsort (str) – if provided, only generate pages <= this title lexically; not valid if sortby=”timestamp”
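
For illustration, iterating a category with one level of subcategories (the category name is a placeholder)::

    import pywikibot

    site = pywikibot.Site('en', 'wikipedia')
    cat = pywikibot.Category(site, u'Category:Physics')

    # Pages in the category itself plus first-level subcategories,
    # capped at 20 pages overall.
    for article in cat.articles(recurse=1, total=20):
        pywikibot.output(article.title())
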
articlesList(recurse=False)[source]

DEPRECATED: equivalent to list(self.articles(...)).

aslink(sortKey=None)[source]

Return a link to place a page in this Category.

Use this only to generate a “true” category link, not for interwikis or text links to category pages.

Parameters:sortKey ((optional) unicode) – The sort key for the article to be placed in this Category; if omitted, default sort key is used.
categoryinfo

Return a dict containing information about the category.

The dict contains values for:

Numbers of pages, subcategories, files, and total contents.

Returns:dict
copyAndKeep(catname, cfdTemplates, message)[source]

Copy partial category page text (not contents) to a new title.

Like copyTo above, except this removes a list of templates (like deletion templates) that appear in the old category text. It also removes all text between the two HTML comments BEGIN CFD TEMPLATE and END CFD TEMPLATE. (This is to deal with CFD templates that are substituted.)

Parameters:
  • catname – New category title (without namespace)
  • cfdTemplates – A list (or iterator) of templates to be removed from the page text
Returns:

True if copying was successful, False if target page already existed.

copyTo(cat, message)[source]

Copy text of category page to a new page. Does not move contents.

Parameters:
  • cat (unicode or Category) – New category title (without namespace) or Category object
  • message (unicode) – message to use for the category creation; if two %s are provided in message, they will be replaced by (self.title, authorsList)
Returns:

True if copying was successful, False if target page already existed.
isEmptyCategory()[source]

Return True if category has no members (including subcategories).

isHiddenCategory()[source]

Return True if the category is hidden.

members(recurse=False, namespaces=None, step=None, total=None, content=False)[source]

Yield all category contents (subcats, pages, and files).

subcategories(recurse=False, step=None, total=None, content=False, cacheResults=NotImplemented, startFrom=NotImplemented)[source]

Iterate all subcategories of the current category.

Parameters:
  • recurse (int or bool) – if not False or 0, also iterate subcategories of subcategories. If an int, limit recursion to this number of levels. (Example: recurse=1 will iterate direct subcats and first-level sub-sub-cats, but no deeper.)
  • step – limit each API call to this number of categories
  • total – iterate no more than this number of subcategories in total (at all levels)
  • content – if True, retrieve the content of the current version of each category description page (default False)
subcategoriesList(recurse=False)[source]

DEPRECATED: Equivalent to list(self.subcategories(...)).

supercategories()[source]

DEPRECATED: equivalent to self.categories().

supercategoriesList()[source]

DEPRECATED: equivalent to list(self.categories(...)).

class pywikibot.__init__.Link(text, source=None, defaultNamespace=0)[source]

Bases: pywikibot.tools.ComparableMixin

A MediaWiki link (local or interwiki).

Has the following attributes:

- site:  The Site object for the wiki linked to
- namespace: The namespace of the page linked to (int)
- title: The title of the page linked to (unicode); does not include
  namespace or section
- section: The section of the page linked to (unicode or None); this
  contains any text following a '#' character in the title
- anchor: The anchor text (unicode or None); this contains any text
  following a '|' character inside the link
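
A small sketch of how these attributes are populated (the link text is a made-up example)::

    import pywikibot

    site = pywikibot.Site('en', 'wikipedia')
    link = pywikibot.Link(u'de:Hund#Anatomie', source=site)
    link.parse()  # normally triggered by attribute access
    # link.site is the German Wikipedia; title is u'Hund',
    # section is u'Anatomie', namespace is 0.
    pywikibot.output(link.title)
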
anchor

Return the anchor of the link.

Returns:unicode
astext(onsite=None)[source]

Return a text representation of the link.

Parameters:onsite – if specified, present as a (possibly interwiki) link from the given site; otherwise, present as an internal link on the source site.
canonical_title()[source]

Return full page title, including localized namespace.

static fromPage(page, source=None)[source]

Create a Link to a Page.

Parameters:
  • page (Page) – target Page
  • source (Site) – Link from site source
Returns:

Link

illegal_titles_pattern = re.compile('[\\x00-\\x1f\\x23\\x3c\\x3e\\x5b\\x5d\\x7b\\x7c\\x7d\\x7f]|%[0-9A-Fa-f]{2}|&[A-Za-z0-9\x80-ÿ]+;|&#[0-9]+;|&#x[0-9A-Fa-f]+;')
static langlinkUnsafe(lang, title, source)[source]

Create a “lang:title” Link linked from source.

Assumes that the lang & title come clean, no checks are made.

Parameters:
  • lang (str) – target site code (language)
  • title (unicode) – target page title
  • source (Site) – Link from site source
Returns:

Link

namespace

Return the namespace of the link.

Returns:int
ns_title(onsite=None)[source]

Return full page title, including namespace.

Parameters:onsite – Site object; if specified, present the title using the onsite localized namespace, otherwise use the canonical namespace.

If no corresponding namespace is found on onsite, pywikibot.Error is raised.

parse()[source]

Parse wikitext of the link.

Called internally when accessing attributes.

parse_site()[source]

Parse only enough text to determine which site the link points to.

This method does not parse anything after the first ':'; links with multiple interwiki prefixes (such as "wikt:fr:Parlais") need to be re-parsed on the first linked wiki to get the actual site.

Returns:tuple of (family-name, language-code) for the linked site.
section

Return the section of the link.

Returns:unicode
site

Return the site of the link.

Returns:Site
title

Return the title of the link.

Returns:unicode
class pywikibot.__init__.User(source, title='', site='[deprecated name of source]', name='[deprecated name of title]')[source]

Bases: pywikibot.page.Page

A class that represents a Wiki user.

This class also represents the Wiki page User:<username>.

block(expiry, reason, anononly=True, nocreate=True, autoblock=True, noemail=False, reblock=False)[source]

Block user.

Parameters:
  • expiry (pywikibot.Timestamp|str) – When the block should expire
  • reason (basestring) – Block reason
  • anononly (bool) – Whether block should only affect anonymous users
  • nocreate (bool) – Whether to block account creation
  • autoblock (bool) – Whether to enable autoblock
  • noemail (bool) – Whether to disable email access
  • reblock (bool) – Whether to reblock if a block already is set
Returns:

None

contributions(total=500, namespaces=[], namespace='[deprecated name of namespaces]', limit='[deprecated name of total]')[source]

Yield tuples describing this user's edits.

Each tuple is composed of a pywikibot.Page object, the revision id (int), the edit timestamp (as a pywikibot.Timestamp object), and the comment (unicode). Pages returned are not guaranteed to be unique.

Parameters:
  • total (int) – limit result to this number of pages
  • namespaces (list) – only iterate links in these namespaces
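
A brief sketch of inspecting a user (the username is a placeholder)::

    import pywikibot

    site = pywikibot.Site('en', 'wikipedia')
    user = pywikibot.User(site, u'Example')

    if user.isRegistered() and not user.isBlocked():
        pywikibot.output(user.getUserTalkPage().title())
        for page, revid, timestamp, comment in user.contributions(total=5):
            pywikibot.output(u'%s at %s' % (page.title(), timestamp))
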
editCount(force=False)[source]

Return edit count for a registered user.

Always returns 0 for ‘anonymous’ users.

Parameters:force (bool) – if True, forces reloading the data from API
Returns:long
editedPages(total=500, limit='[deprecated name of total]')[source]

DEPRECATED. Use contributions().

Yields pywikibot.Page objects that this user has edited, with an upper bound of ‘total’. Pages returned are not guaranteed to be unique.

Parameters:total (int.) – limit result to this number of pages.
getUserPage(subpage='')[source]

Return a Page object relative to this user’s main page.

Parameters:subpage (unicode) – subpage part to be appended to the main page title (optional)
getUserTalkPage(subpage='')[source]

Return a Page object relative to this user’s main talk page.

Parameters:subpage (unicode) – subpage part to be appended to the main talk page title (optional)
getprops(force=False)[source]

Return properties about the user.

Parameters:force (bool) – if True, forces reloading the data from API
Returns:dict
groups(force=False)[source]

Return a list of groups to which this user belongs.

The list of groups may be empty.

Parameters:force (bool) – if True, forces reloading the data from API
Returns:list
isAnonymous()[source]

Determine if the user is editing as an IP address.

Returns:bool
isBlocked(force=False)[source]

Determine whether the user is currently blocked.

Parameters:force (bool) – if True, forces reloading the data from API
Returns:bool
isEmailable(force=False)[source]

Determine whether emails may be sent to this user through MediaWiki.

Parameters:force (bool) – if True, forces reloading the data from API
Returns:bool
isRegistered(force=False)[source]

Determine if the user is registered on the site.

It is possible to have a page named User:xyz and not have a corresponding user with username xyz.

The page does not need to exist for this method to return True.

Parameters:force (bool) – if True, forces reloading the data from API
Returns:bool
name()[source]

The username.

Returns:unicode
registration(force=False)[source]

Fetch registration date for this user.

Parameters:force (bool) – if True, forces reloading the data from API
Returns:pywikibot.Timestamp or None
registrationTime(force=False)[source]

DEPRECATED. Fetch registration date for this user.

Parameters:force (bool) – if True, forces reloading the data from API
Returns:long (MediaWiki’s internal timestamp format) or 0
sendMail(subject, text, ccme=False)[source]

Send an email to this user via MediaWiki’s email interface.

Return True on success, False otherwise. This method can raise a UserActionRefuse exception if this user does not allow receiving email, or if the currently logged-in bot does not have the right to send emails.

Parameters:
  • subject (unicode) – the subject header of the mail
  • text (unicode) – mail body
  • ccme (bool) – if True, sends a copy of this email to the bot
uploadedImages(total=10, number='[deprecated name of total]')[source]

Yield tuples describing files uploaded by this user.

Each tuple is composed of a pywikibot.Page, the timestamp (str in ISO8601 format), comment (unicode) and a bool for pageid > 0. Pages returned are not guaranteed to be unique.

Parameters:total (int) – limit result to this number of pages
username

The username.

Convenience method that returns the title of the page with namespace prefix omitted, which is the username.

Returns:unicode
class pywikibot.__init__.ItemPage(site, title=None)[source]

Bases: pywikibot.page.WikibasePage

Wikibase entity of type ‘item’.

A Wikibase item may be defined by either a ‘Q’ id (qid), or by a site & title.

If an item is defined by site & title, once an item’s qid has been looked up, the item is then defined by the qid.
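
As a rough sketch, resolving the item behind an article (the page title and qid are illustrative)::

    import pywikibot

    site = pywikibot.Site('en', 'wikipedia')
    page = pywikibot.Page(site, u'Douglas Adams')

    item = pywikibot.ItemPage.fromPage(page)
    item.get()  # fetch and cache all item data
    pywikibot.output(item.title())  # the qid, e.g. u'Q42'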

addClaim(claim, bot=True, **kwargs)[source]

Add a claim to the item.

Parameters:
  • claim (Claim) – The claim to add
  • bot (bool) – Whether to flag as bot (if possible)
classmethod fromPage(page, lazy_load=False)[source]

Get the ItemPage for a Page that links to it.

Parameters:
  • page (pywikibot.Page) – Page to look for corresponding data item
  • lazy_load (bool) – Do not raise NoPage if either page or corresponding ItemPage does not exist.
Returns:

ItemPage

Raises NoPage:

There is no corresponding ItemPage for the page

get(force=False, *args, **kwargs)[source]

Fetch all item data, and cache it.

Parameters:
  • force (bool) – override caching
  • args – values of props

getSitelink(site, force=False)[source]

Return the title for the specific site.

If the item doesn’t have that language, raise NoPage.

Parameters:
  • site (pywikibot.Site or database name) – Site to find the linked page of.
  • force – override caching
Returns:

unicode

iterlinks(family=None)[source]

Iterate through all the sitelinks.

Parameters:family (str|pywikibot.family.Family) – string/Family object which represents what family of links to iterate
Returns:iterator of pywikibot.Page objects
mergeInto(item, **kwargs)[source]

Merge the item into another item.

Parameters:item (pywikibot.ItemPage) – The item to merge into
removeClaims(claims, **kwargs)[source]

Remove the claims from the item.

removeSitelink(site, **kwargs)[source]

Remove a sitelink.

A site can either be a Site object, or it can be a dbName.

removeSitelinks(sites, **kwargs)[source]

Remove sitelinks.

Sites should be a list, with values either being Site objects, or dbNames.

setSitelink(sitelink, **kwargs)[source]

Set a sitelink. Calls setSitelinks().

A sitelink can either be a Page object, or a {‘site’:dbname,’title’:title} dictionary.

setSitelinks(sitelinks, **kwargs)[source]

Set sitelinks.

Sitelinks should be a list. Each item in the list can either be a Page object, or a dict with a value for ‘site’ and ‘title’.

title(**kwargs)[source]

Return ID as title of the ItemPage.

If the ItemPage was lazy-loaded via ItemPage.fromPage, this method will fetch the wikibase item ID for the page, potentially raising NoPage with the page on the linked wiki if it does not exist, or does not have a corresponding wikibase item ID.

This method also refreshes the title if the id property was set, e.g. item.id = 'Q60'.

All optional keyword parameters are passed to the superclass.

toJSON(diffto=None)[source]
class pywikibot.__init__.PropertyPage(source, title='')[source]

Bases: pywikibot.page.WikibasePage, pywikibot.page.Property

A Wikibase entity in the property namespace.

Should be created as::

PropertyPage(DataSite, 'P21')
get(force=False, *args)[source]

Fetch the property entity, and cache it.

Parameters:
  • force – override caching
  • args – values of props
newClaim(*args, **kwargs)[source]

Helper function to create a new claim object for this property.

Returns:Claim
class pywikibot.__init__.Claim(site, pid, snak=None, hash=None, isReference=False, isQualifier=False, **kwargs)[source]

Bases: pywikibot.page.Property

A Claim on a Wikibase entity.

Claims are standard claims as well as references.
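
A minimal sketch of building and attaching a claim (the property and item ids are illustrative, and data_repository() is assumed to return the associated Wikibase repository)::

    import pywikibot

    repo = pywikibot.Site('wikidata', 'wikidata').data_repository()
    item = pywikibot.ItemPage(repo, u'Q42')

    claim = pywikibot.Claim(repo, u'P31')             # 'instance of'
    claim.setTarget(pywikibot.ItemPage(repo, u'Q5'))  # 'human'
    item.addClaim(claim)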

addQualifier(qualifier, **kwargs)[source]

Add the given qualifier.

Parameters:qualifier (Claim) – the qualifier to add
addSource(claim, **kwargs)[source]

Add the claim as a source.

Parameters:claim (pywikibot.Claim) – the claim to add
addSources(claims, **kwargs)[source]

Add the claims as one source.

Parameters:claims (list of pywikibot.Claim) – the claims to add
changeSnakType(value=None, **kwargs)[source]

Save the new snak value.

TODO: Is this function really needed?

changeTarget(value=None, snaktype='value', **kwargs)[source]

Set the target value in the data repository.

Parameters:
  • value (object) – The new target value.
  • snaktype (str (‘value’, ‘somevalue’, or ‘novalue’)) – The new snak type.
static fromJSON(site, data)[source]

Create a claim object from JSON returned in the API call.

Parameters:data (dict) – JSON containing claim data
Returns:Claim
getRank()[source]

Return the rank of the Claim.

getSnakType()[source]

Return the type of snak.

Returns:str (‘value’, ‘somevalue’ or ‘novalue’)
getSources()[source]

Return a list of sources, each being a list of Claims.

Returns:list
getTarget()[source]

Return the target value of this Claim.

None is returned if no target is set

Returns:object
static qualifierFromJSON(site, data)[source]

Create a Claim for a qualifier from JSON.

Qualifier objects are represented a bit differently from references, but I'm not sure if this even requires its own function.

Returns:Claim
static referenceFromJSON(site, data)[source]

Create a dict of claims from reference JSON returned in the API call.

Reference objects are represented a bit differently, and require some more handling.

Returns:dict
removeSource(source, **kwargs)[source]

Remove the source. Calls removeSources().

Parameters:source (pywikibot.Claim) – the source to remove
removeSources(sources, **kwargs)[source]

Remove the sources.

Parameters:sources (list of pywikibot.Claim) – the sources to remove
setRank()[source]

Set the rank of the Claim.

Has not been implemented in the Wikibase API yet

setSnakType(value)[source]

Set the type of snak.

Parameters:value (str (‘value’, ‘somevalue’, or ‘novalue’)) – Type of snak
setTarget(value)[source]

Set the target value in the local object.

Parameters:value (object) – The new target value.
Raises ValueError:
 if value is not of the type required for the Claim type.
toJSON()[source]
pywikibot.__init__.html2unicode(text, ignore=None)[source]

Replace HTML entities with equivalent unicode.

Parameters:
  • ignore – HTML entities to ignore
  • ignore – list of int
Returns:

unicode
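
For example (entity values chosen for illustration)::

    import pywikibot

    # Named entities and a decimal numeric entity are all converted.
    text = pywikibot.html2unicode(u'&Delta;x &ge; &#48;')
    # u'Δx ≥ 0'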

pywikibot.__init__.url2unicode(title, encodings='utf-8', site2=NotImplemented, site='[deprecated name of encodings]')[source]

Convert URL-encoded text to unicode using several encodings.

Uses the first encoding that doesn’t cause an error.

Parameters:
  • title (str) – URL-encoded character data to convert
  • encodings (str, list or Site) – Encodings to attempt to use during conversion.
Returns:

unicode

Raises UnicodeError:
 

Could not convert using any encoding.

pywikibot.__init__.unicode2html(x, encoding)[source]

Convert unicode string to requested HTML encoding.

Attempt to encode the string in the requested encoding; if that works, return it unchanged. Otherwise, encode the unicode into HTML &#; entities.

Parameters:
  • x (unicode) – String to update
  • encoding (str) – Encoding to use
Returns:

str

pywikibot.__init__.stdout(text, decoder=None, newline=True, **kwargs)[source]

Output script results to the user via the userinterface.

pywikibot.__init__.output(text, decoder=None, newline=True, toStdout=False, **kwargs)[source]

Output a message to the user via the userinterface.

Works like print, but uses the encoding used by the user’s console (console_encoding in the configuration file) instead of ASCII.

If decoder is None, text should be a unicode string. Otherwise it should be encoded in the given encoding.

If newline is True, a line feed will be added after printing the text.

If toStdout is True, the text will be sent to standard output, so that it can be piped to another process. All other text will be sent to stderr. See: https://en.wikipedia.org/wiki/Pipeline_%28Unix%29

text can contain special sequences to create colored output. These consist of the escape character \03 and the color name in curly braces, e.g. \03{lightpurple}. \03{default} resets the color.

Other keyword arguments are passed unchanged to the logger; so far, the only argument that is useful is “exc_info=True”, which causes the log message to include an exception traceback.
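
A short sketch contrasting output() and stdout()::

    import pywikibot

    # Status text goes to stderr, so it will not pollute piped output;
    # the \03{...} sequences produce colored text on the console.
    pywikibot.output(u'\03{lightpurple}Done:\03{default} 5 pages updated')

    # Script results meant for piping go through stdout() instead.
    pywikibot.stdout(u'Example title')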

pywikibot.__init__.warning(text, decoder=None, newline=True, **kwargs)[source]

Output a warning message to the user via the userinterface.

pywikibot.__init__.error(text, decoder=None, newline=True, **kwargs)[source]

Output an error message to the user via the userinterface.

pywikibot.__init__.critical(text, decoder=None, newline=True, **kwargs)[source]

Output a critical record to the log file.

pywikibot.__init__.debug(text, layer, decoder=None, newline=True, **kwargs)[source]

Output a debug record to the log file.

Parameters:layer – The name of the logger that text will be sent to.
pywikibot.__init__.exception(msg=None, decoder=None, newline=True, tb=False, **kwargs)[source]

Output an error traceback to the user via the userinterface.

Use directly after an ‘except’ statement::

...
except:
    pywikibot.exception()
...

or alternatively::

...
except Exception as e:
    pywikibot.exception(e)
...
Parameters:tb – Set to True in order to output traceback also.
pywikibot.__init__.input_choice(question, answers, default=None, return_shortcut=True, automatic_quit=True)[source]

Ask the user the question and return one of the valid answers.

Parameters:
  • question (basestring) – The question asked without trailing spaces.
  • answers (Iterable containing an iterable of length two) – The valid answers each containing a full length answer and a shortcut. Each value must be unique.
  • default (basestring) – The result if no answer was entered. It must not be in the valid answers and can be disabled by setting it to None. If it should be linked with the valid answers it must be its shortcut.
  • return_shortcut (bool) – Whether the shortcut or the index of the answer is returned.
  • automatic_quit (bool) – Adds the option 'Quit' ('q') and throws a QuitKeyboardInterrupt if selected.
Returns:

The selected answer shortcut or index. It is -1 if the default is selected, return_shortcut is False, and the default is not a valid shortcut.

Return type:

int (if not return shortcut), basestring (otherwise)
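
For illustration, a three-way prompt defaulting to 'no'::

    import pywikibot

    choice = pywikibot.input_choice(
        u'Overwrite the existing page?',
        [('yes', 'y'), ('no', 'n'), ('edit', 'e')],
        default='n')
    if choice == 'y':
        pywikibot.output(u'Overwriting...')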

pywikibot.__init__.input(question, password=False)[source]

Ask the user a question, return the user’s answer.

Parameters:
  • question (unicode) – a string that will be shown to the user. Don’t add a space after the question mark/colon, this method will do this for you.
  • password (bool) – if True, hides the user’s input (for password entry).
Return type:

unicode

pywikibot.__init__.input_yn(question, default=None, automatic_quit=True)[source]

Ask the user a yes/no question and return the answer as a bool.

Parameters:
  • question (basestring) – The question asked without trailing spaces.
  • default (basestring or bool) – The result if no answer was entered. It must be a bool or ‘y’ or ‘n’ and can be disabled by setting it to None.
  • automatic_quit (bool) – Adds the option 'Quit' ('q') and throws a QuitKeyboardInterrupt if selected.
Returns:

Return True if the user selected yes and False if the user selected no. If the default is not None it’ll return True if default is True or ‘y’ and False if default is False or ‘n’.

Return type:

bool

pywikibot.__init__.inputChoice(question, answers, hotkeys, default=None)[source]

Ask the user a question with several options, return the user’s choice.

DEPRECATED: Use input_choice instead!

The user’s input will be case-insensitive, so the hotkeys should be distinctive case-insensitively.

Parameters:
  • question (basestring) – a string that will be shown to the user. Don’t add a space after the question mark/colon, this method will do this for you.
  • answers (list of basestring) – a list of strings that represent the options.
  • hotkeys – a list of one-letter strings, one for each answer.
  • default – an element of hotkeys, or None. The default choice that will be returned when the user just presses Enter.
Returns:

a one-letter string in lowercase.

Return type:

str

pywikibot.__init__.handle_args(args=None, do_help=True)[source]

Handle standard command line arguments, and return the rest as a list.

Takes the command line arguments as Unicode strings, processes all global parameters such as -lang or -log, and initialises the logging layer, which emits startup information to the log at level 'verbose'.

This makes sure that global arguments are applied first, regardless of the order in which the arguments were given.

args may be passed as an argument, thereby overriding sys.argv

Parameters:
  • args (list of unicode) – Command line arguments
  • do_help (bool) – Handle parameter ‘-help’ to show help and invoke sys.exit
Returns:

list of arguments not recognised globally

Return type:

list of unicode
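
A typical script entry point might look like this (a sketch; what the script does with its own arguments is up to it)::

    import pywikibot

    def main(*args):
        # Global options (-lang, -family, -log, ...) are consumed here;
        # whatever is left over belongs to the script itself.
        local_args = pywikibot.handle_args(args)
        for arg in local_args:
            pywikibot.output(u'Unhandled argument: %s' % arg)

    if __name__ == '__main__':
        main()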

pywikibot.__init__.handleArgs(*args)[source]

DEPRECATED. Use handle_args().

pywikibot.__init__.showHelp(module_name=None)[source]

Show help for the Bot.

pywikibot.__init__.log(text, decoder=None, newline=True, **kwargs)[source]

Output a record to the log file.

pywikibot.__init__.calledModuleName()[source]

Return the name of the module calling this function.

This is required because the -help option loads the module’s docstring and because the module name will be used for the filename of the log.

Return type:unicode
class pywikibot.__init__.Bot(**kwargs)[source]

Bases: object

Generic Bot to be subclassed.

This class provides a run() method for basic processing of a generator one page at a time.

If the subclass places a page generator in self.generator, Bot will process each page in the generator, invoking the method treat() which must then be implemented by subclasses.

If the subclass does not set a generator, or does not override treat() or run(), NotImplementedError is raised.
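
A minimal subclass sketch (the bot name, replacement, and page list are invented)::

    import pywikibot

    class SpellingBot(pywikibot.Bot):

        def __init__(self, generator, **kwargs):
            super(SpellingBot, self).__init__(**kwargs)
            self.generator = generator

        def treat(self, page):
            self.current_page = page
            old = page.text
            new = old.replace(u'recieve', u'receive')
            # Shows the diff and honours the 'always' option.
            self.userPut(page, old, new, comment=u'Fix spelling')

    site = pywikibot.Site()
    gen = (pywikibot.Page(site, title) for title in [u'Sandbox'])
    SpellingBot(generator=gen).run()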

availableOptions = {'always': False}
current_page

Return the current working page as a property.

getOption(option)[source]

Get the current value of an option.

Parameters:option – key defined in Bot.availableOptions
quit()[source]

Cleanup and quit processing.

run()[source]

Process all pages in generator.

setOptions(**kwargs)[source]

Set the instance options.

Parameters:kwargs (dict) – options
site

Site that the bot is using.

treat(page)[source]

Process one page (Abstract method).

userPut(page, oldtext, newtext, **kwargs)[source]

Save a new revision of a page, with user confirmation as required.

Print differences, ask user for confirmation, and puts the page if needed.

Option used:
  • 'always'

Keyword args used:
  • 'async' – passed to page.save
  • 'comment' – passed to page.save
  • 'show_diff' – show changes between oldtext and newtext (enabled)
  • 'ignore_save_related_errors' – report and ignore (disabled)
  • 'ignore_server_errors' – report and ignore (disabled)

user_confirm(question)[source]

Obtain user response if bot option ‘always’ not enabled.

class pywikibot.__init__.WikidataBot(**kwargs)[source]

Bases: pywikibot.bot.Bot

Generic Wikidata Bot to be subclassed.

Source claims (P143) can be created for specific sites.

cacheSources()[source]

Fetch the sources from the list on Wikidata.

It is stored internally and reused by getSource()

getSource(site)[source]

Create a Claim usable as a source for Wikibase statements.

Parameters:site (Site) – site that is the source of assertions.
Returns:Claim
run()[source]

Process all pages in generator.

exception pywikibot.__init__.Error(arg)[source]

Bases: pywikibot.tools.UnicodeMixin, Exception

Pywikibot error

exception pywikibot.__init__.InvalidTitle(arg)[source]

Bases: pywikibot.exceptions.Error

Invalid page title

exception pywikibot.__init__.BadTitle(arg)[source]

Bases: pywikibot.exceptions.Error

Server responded with BadTitle.

exception pywikibot.__init__.NoPage(page, message=None)[source]

Bases: pywikibot.exceptions.PageRelatedError

Page does not exist

message = "Page %s doesn't exist."
exception pywikibot.__init__.SectionError(arg)[source]

Bases: pywikibot.exceptions.Error

The section specified by # does not exist

exception pywikibot.__init__.SiteDefinitionError(arg)[source]

Bases: pywikibot.exceptions.Error

Site does not exist

pywikibot.__init__.NoSuchSite

alias of SiteDefinitionError

exception pywikibot.__init__.UnknownSite(arg)[source]

Bases: pywikibot.exceptions.SiteDefinitionError

Site does not exist in Family

exception pywikibot.__init__.UnknownFamily(arg)[source]

Bases: pywikibot.exceptions.SiteDefinitionError

Family is not registered

exception pywikibot.__init__.NoUsername(arg)[source]

Bases: pywikibot.exceptions.Error

Username is not in user-config.py.

exception pywikibot.__init__.UserBlocked(arg)[source]

Bases: pywikibot.exceptions.Error

Your username or IP has been blocked

exception pywikibot.__init__.PageRelatedError(page, message=None)[source]

Bases: pywikibot.exceptions.Error

Abstract Exception, used when the exception concerns a particular Page.

This class should be used when the Exception concerns a particular Page, and when a generic message can be written once for all.

getPage()[source]
message = None
exception pywikibot.__init__.IsRedirectPage(page, message=None)[source]

Bases: pywikibot.exceptions.PageRelatedError

Page is a redirect page

message = 'Page %s is a redirect page.'
exception pywikibot.__init__.IsNotRedirectPage(page, message=None)[source]

Bases: pywikibot.exceptions.PageRelatedError

Page is not a redirect page

message = 'Page %s is not a redirect page.'
exception pywikibot.__init__.PageSaveRelatedError(page, message=None)[source]

Bases: pywikibot.exceptions.PageRelatedError

Saving the page has failed

args
message = 'Page %s was not saved.'
pywikibot.__init__.PageNotSaved

alias of PageSaveRelatedError

exception pywikibot.__init__.OtherPageSaveError(page, reason)[source]

Bases: pywikibot.exceptions.PageSaveRelatedError

Saving the page has failed due to uncatchable error.

args
message = 'Edit to page %(title)s failed:\n%(reason)s'
exception pywikibot.__init__.LockedPage(page, message=None)[source]

Bases: pywikibot.exceptions.PageSaveRelatedError

Page is locked

message = 'Page %s is locked.'
exception pywikibot.__init__.CascadeLockedPage(page, message=None)[source]

Bases: pywikibot.exceptions.LockedPage

Page is locked due to cascading protection

message = 'Page %s is locked due to cascading protection.'
exception pywikibot.__init__.LockedNoPage(page, message=None)[source]

Bases: pywikibot.exceptions.LockedPage

Title is locked against creation

message = 'Page %s does not exist and is locked preventing creation.'
exception pywikibot.__init__.NoCreateError(page, message=None)[source]

Bases: pywikibot.exceptions.PageSaveRelatedError

Parameter nocreate doesn’t allow page creation.

message = 'Page %s could not be created due to parameter nocreate'
exception pywikibot.__init__.EditConflict(page, message=None)[source]

Bases: pywikibot.exceptions.PageSaveRelatedError

There has been an edit conflict while uploading the page

message = 'Page %s could not be saved due to an edit conflict'
exception pywikibot.__init__.PageDeletedConflict(page, message=None)[source]

Bases: pywikibot.exceptions.EditConflict

Page was deleted since being retrieved

message = 'Page %s has been deleted since last retrieved.'
exception pywikibot.__init__.PageCreatedConflict(page, message=None)[source]

Bases: pywikibot.exceptions.EditConflict

Page was created by another user

message = 'Page %s has been created since last retrieved.'
exception pywikibot.__init__.UploadWarning(code, message)[source]

Bases: pywikibot.data.api.APIError

Upload failed with a warning message (passed as the argument).

message
exception pywikibot.__init__.ServerError(arg)[source]

Bases: pywikibot.exceptions.Error

Got unexpected server response

exception pywikibot.__init__.FatalServerError(arg)[source]

Bases: pywikibot.exceptions.ServerError

A fatal server error will not be corrected by resending the request.

exception pywikibot.__init__.Server504Error(arg)[source]

Bases: pywikibot.exceptions.Error

Server timed out with HTTP 504 code

exception pywikibot.__init__.CaptchaError(arg)[source]

Bases: pywikibot.exceptions.Error

Captcha is asked and config.solve_captcha == False.

exception pywikibot.__init__.SpamfilterError(page, url)[source]

Bases: pywikibot.exceptions.PageSaveRelatedError

Page save failed because MediaWiki detected a blacklisted spam URL.

message = 'Edit to page %(title)s rejected by spam filter due to content:\n%(url)s'
exception pywikibot.__init__.CircularRedirect(page, message=None)[source]

Bases: pywikibot.exceptions.PageRelatedError

Page is a circular redirect.

Exception argument is the redirect target; this may be the same title as this page or a different title (in which case the target page directly or indirectly redirects back to this one)

message = 'Page %s is a circular redirect.'
exception pywikibot.__init__.WikiBaseError(arg)[source]

Bases: pywikibot.exceptions.Error

Wikibase related error.

exception pywikibot.__init__.CoordinateGlobeUnknownException(arg)[source]

Bases: pywikibot.exceptions.WikiBaseError, NotImplementedError

This globe is not implemented yet in either WikiBase or pywikibot.

exception pywikibot.__init__.QuitKeyboardInterrupt[source]

Bases: KeyboardInterrupt

The user has cancelled processing at a prompt.

pywikibot.__init__.unescape(*a, **kw)
pywikibot.__init__.replaceExcept(*a, **kw)
pywikibot.__init__.removeDisabledParts(*a, **kw)
pywikibot.__init__.removeHTMLParts(*a, **kw)
pywikibot.__init__.isDisabled(*a, **kw)
pywikibot.__init__.interwikiFormat(*a, **kw)
pywikibot.__init__.interwikiSort(*a, **kw)
pywikibot.__init__.removeLanguageLinksAndSeparator(*a, **kw)
pywikibot.__init__.categoryFormat(*a, **kw)
pywikibot.__init__.removeCategoryLinksAndSeparator(*a, **kw)
pywikibot.__init__.replaceCategoryInPlace(*a, **kw)
pywikibot.__init__.compileLinkR(*a, **kw)
pywikibot.__init__.extract_templates_and_params(*a, **kw)

bot Module

User-interface related functions for building bots.

class pywikibot.bot.Bot(**kwargs)[source]

Bases: object

Generic Bot to be subclassed.

This class provides a run() method for basic processing of a generator one page at a time.

If the subclass places a page generator in self.generator, Bot will process each page in the generator, invoking the method treat() which must then be implemented by subclasses.

If the subclass does not set a generator, or does not override treat() or run(), NotImplementedError is raised.

availableOptions = {'always': False}
current_page

Return the current working page as a property.

getOption(option)[source]

Get the current value of an option.

Parameters:option – key defined in Bot.availableOptions
quit()[source]

Cleanup and quit processing.

run()[source]

Process all pages in generator.

setOptions(**kwargs)[source]

Set the instance options.

Parameters:kwargs (dict) – options
site

Site that the bot is using.

treat(page)[source]

Process one page (Abstract method).

userPut(page, oldtext, newtext, **kwargs)[source]

Save a new revision of a page, with user confirmation as required.

Print differences, ask user for confirmation, and puts the page if needed.

Option used:
  • 'always'

Keyword args used:
  • 'async' – passed to page.save
  • 'comment' – passed to page.save
  • 'show_diff' – show changes between oldtext and newtext (enabled)
  • 'ignore_save_related_errors' – report and ignore (disabled)
  • 'ignore_server_errors' – report and ignore (disabled)

user_confirm(question)[source]

Obtain user response if bot option ‘always’ not enabled.

class pywikibot.bot.LoggingFormatter(fmt=None, datefmt=None, style='%')[source]

Bases: logging.Formatter

Format LogRecords for output to file.

This formatter ignores the ‘newline’ key of the LogRecord, because every record written to a file must end with a newline, regardless of whether the output to the user’s console does.

formatException(ei)[source]

Convert exception trace to unicode if necessary.

Make sure that the exception trace is converted to unicode.

exceptions.Error traces are encoded in our console encoding, which is needed for plainly printing them. However, when logging them using logging.exception, the Python logging module will try to use these traces, and it will fail if they are console encoded strings.

Formatter.formatException also strips the trailing n, which we need.

exception pywikibot.bot.QuitKeyboardInterrupt[source]

Bases: KeyboardInterrupt

The user has cancelled processing at a prompt.

class pywikibot.bot.RotatingFileHandler(filename, mode='a', maxBytes=0, backupCount=0, encoding=None, delay=False)[source]

Bases: logging.handlers.RotatingFileHandler

Modified RotatingFileHandler supporting unlimited amount of backups.

doRollover()[source]

Modified naming system for logging files.

Overrides the default rollover renaming by inserting the count number between the file name root and extension. If backupCount is >= 1, the system will successively create new files with the same pathname as the base file, but with ".1", ".2" etc. inserted in front of the filename suffix. For example, with a backupCount of 5 and a base file name of "app.log", you would get "app.log", "app.1.log", "app.2.log", ... through to "app.5.log". The file being written to is always "app.log" - when it gets filled up, it is closed and renamed to "app.1.log", and if files "app.1.log", "app.2.log" etc. already exist, then they are renamed to "app.2.log", "app.3.log" etc. respectively. If backupCount is -1, do not rotate but create new numbered filenames. The newest file has the highest number, except when some older numbered files were deleted and the bot was restarted; in this case the ordering starts from the lowest available (unused) number.

format(record)[source]

Strip trailing newlines before outputting text to file.

class pywikibot.bot.WikidataBot(**kwargs)[source]

Bases: pywikibot.bot.Bot

Generic Wikidata Bot to be subclassed.

Source claims (P143) can be created for specific sites.

cacheSources()[source]

Fetch the sources from the list on Wikidata.

It is stored internally and reused by getSource()

getSource(site)[source]

Create a Claim usable as a source for Wikibase statements.

Parameters:site (Site) – site that is the source of assertions.
Returns:Claim
run()[source]

Process all pages in generator.

pywikibot.bot.calledModuleName()[source]

Return the name of the module calling this function.

This is required because the -help option loads the module’s docstring and because the module name will be used for the filename of the log.

Return type:unicode
pywikibot.bot.critical(text, decoder=None, newline=True, **kwargs)[source]

Output a critical record to the log file.

pywikibot.bot.debug(text, layer, decoder=None, newline=True, **kwargs)[source]

Output a debug record to the log file.

Parameters:layer – The name of the logger that text will be sent to.
pywikibot.bot.error(text, decoder=None, newline=True, **kwargs)[source]

Output an error message to the user via the userinterface.

pywikibot.bot.exception(msg=None, decoder=None, newline=True, tb=False, **kwargs)[source]

Output an error traceback to the user via the userinterface.

Use directly after an ‘except’ statement::

...
except:
    pywikibot.exception()
...

or alternatively::

...
except Exception as e:
    pywikibot.exception(e)
...
Parameters:tb – Set to True in order to output traceback also.
pywikibot.bot.handleArgs(*args)[source]

DEPRECATED. Use handle_args().

pywikibot.bot.handle_args(args=None, do_help=True)[source]

Handle standard command line arguments, and return the rest as a list.

Takes the command line arguments as Unicode strings, processes all global parameters such as -lang or -log, and initialises the logging layer, which emits startup information to the log at level 'verbose'.

This makes sure that global arguments are applied first, regardless of the order in which the arguments were given.

args may be passed as an argument, thereby overriding sys.argv

Parameters:
  • args (list of unicode) – Command line arguments
  • do_help (bool) – Handle parameter ‘-help’ to show help and invoke sys.exit
Returns:

list of arguments not recognised globally

Return type:

list of unicode

pywikibot.bot.init_handlers(strm=None)[source]

Initialize logging system for terminal-based bots.

This function must be called before using pywikibot.output(); and must be called again if the destination stream is changed.

Note: this function is called by handleArgs(), so it should normally not need to be called explicitly

All user output is routed through the logging module. Each type of output is handled by an appropriate handler object. This structure is used to permit eventual development of other user interfaces (GUIs) without modifying the core bot code.

The following output levels are defined:
  • DEBUG: only for file logging; debugging messages.
  • STDOUT: output that must be sent to sys.stdout (for bots that may have their output redirected to a file or other destination).
  • VERBOSE: optional progress information for display to user.
  • INFO: normal (non-optional) progress information for display to user.
  • INPUT: prompts requiring user response.
  • WARN: user warning messages.
  • ERROR: user error messages.
  • CRITICAL: fatal error messages.

Accordingly, do not use print statements in bot code; instead, use the pywikibot.output function.

Parameters:strm – Output stream. If None, re-uses the last stream if one was defined, otherwise uses sys.stderr
pywikibot.bot.input(question, password=False)[source]

Ask the user a question, return the user’s answer.

Parameters:
  • question (unicode) – a string that will be shown to the user. Don’t add a space after the question mark/colon, this method will do this for you.
  • password (bool) – if True, hides the user’s input (for password entry).
Return type:

unicode

pywikibot.bot.inputChoice(question, answers, hotkeys, default=None)[source]

Ask the user a question with several options, return the user’s choice.

DEPRECATED: Use input_choice instead!

The user’s input will be case-insensitive, so the hotkeys should be distinctive case-insensitively.

Parameters:
  • question (basestring) – a string that will be shown to the user. Don’t add a space after the question mark/colon, this method will do this for you.
  • answers (list of basestring) – a list of strings that represent the options.
  • hotkeys – a list of one-letter strings, one for each answer.
  • default – an element of hotkeys, or None. The default choice that will be returned when the user just presses Enter.
Returns:

a one-letter string in lowercase.

Return type:

str

pywikibot.bot.input_choice(question, answers, default=None, return_shortcut=True, automatic_quit=True)[source]

Ask the user the question and return one of the valid answers.

Parameters:
  • question (basestring) – The question asked without trailing spaces.
  • answers (Iterable containing an iterable of length two) – The valid answers each containing a full length answer and a shortcut. Each value must be unique.
  • default (basestring) – The result if no answer was entered. It must not be in the valid answers and can be disabled by setting it to None. If it should be linked with the valid answers it must be its shortcut.
  • return_shortcut (bool) – Whether the shortcut or the index of the answer is returned.
  • automatic_quit (bool) – Adds the option 'Quit' ('q') and throws a QuitKeyboardInterrupt if selected.
Returns:

The selected answer shortcut or index. It is -1 if the default is selected, return_shortcut is False, and the default is not a valid shortcut.

Return type:

int (if not return shortcut), basestring (otherwise)

pywikibot.bot.input_yn(question, default=None, automatic_quit=True)[source]

Ask the user a yes/no question and return the answer as a bool.

Parameters:
  • question (basestring) – The question asked without trailing spaces.
  • default (basestring or bool) – The result if no answer was entered. It must be a bool or ‘y’ or ‘n’ and can be disabled by setting it to None.
  • automatic_quit (bool) – Adds the option 'Quit' ('q') and throws a QuitKeyboardInterrupt if selected.
Returns:

Return True if the user selected yes and False if the user selected no. If the default is not None it’ll return True if default is True or ‘y’ and False if default is False or ‘n’.

Return type:

bool

pywikibot.bot.log(text, decoder=None, newline=True, **kwargs)[source]

Output a record to the log file.

pywikibot.bot.logoutput(text, decoder=None, newline=True, _level=20, _logger='', **kwargs)[source]

Format output and send to the logging module.

Helper function used by all the user-output convenience functions.

pywikibot.bot.output(text, decoder=None, newline=True, toStdout=False, **kwargs)[source]

Output a message to the user via the userinterface.

Works like print, but uses the encoding used by the user’s console (console_encoding in the configuration file) instead of ASCII.

If decoder is None, text should be a unicode string. Otherwise it should be encoded in the given encoding.

If newline is True, a line feed will be added after printing the text.

If toStdout is True, the text will be sent to standard output, so that it can be piped to another process. All other text will be sent to stderr. See: https://en.wikipedia.org/wiki/Pipeline_%28Unix%29

text can contain special sequences to create colored output. These consist of the escape character \03 and the color name in curly braces, e.g. \03{lightpurple}. \03{default} resets the color.

Other keyword arguments are passed unchanged to the logger; so far, the only argument that is useful is “exc_info=True”, which causes the log message to include an exception traceback.

pywikibot.bot.showHelp(module_name=None)[source]

Show help for the Bot.

pywikibot.bot.stdout(text, decoder=None, newline=True, **kwargs)[source]

Output script results to the user via the userinterface.

pywikibot.bot.warning(text, decoder=None, newline=True, **kwargs)[source]

Output a warning message to the user via the userinterface.

pywikibot.bot.writelogheader()[source]

Save additional version, system and status info to the log file in use.

This may help the user to track errors or report bugs.

botirc Module

config2 Module

Module to define and load pywikibot configuration.

Provides two family class methods which can be used in the user-config:

- register_family_file
- register_families_folder

Sets module global base_dir and provides utility methods to build paths relative to base_dir:

- makepath
- datafilepath
- shortpath
pywikibot.config2.datafilepath(*filename)[source]

Return an absolute path to a data file in a standard location.

Argument(s) are zero or more directory names, optionally followed by a data file name. The return path is offset to config.base_dir. Any directories in the path that do not already exist are created.
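
For example (file names invented)::

    from pywikibot import config2 as config

    # Resolves below config.base_dir, creating missing directories.
    logfile = config.datafilepath('logs', 'mybot.log')
    print(config.shortpath(logfile))  # path relative to base_dir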

pywikibot.config2.get_base_dir(test_directory=None)[source]

Return the directory in which user-specific information is stored.

This is determined in the following order::
  1. If the script was called with a -dir: argument, use the directory provided in this argument.
  2. If the user has a PYWIKIBOT2_DIR environment variable, use the value of it.
  3. If user-config.py is present in the current directory, use the current directory.
  4. If user-config.py is present in the pwb.py directory, use that directory.
  5. Use (and if necessary create) a 'pywikibot' folder under 'Application Data' or 'AppData\Roaming' (Windows) or a '.pywikibot' directory (Unix and similar) under the user's home directory.

Set PYWIKIBOT2_NO_USER_CONFIG=1 to disable loading user-config.py

Parameters:test_directory (str or None) – Assume that a user config file exists in this directory. Used to test whether placing a user config file in this directory will cause it to be selected as the base directory.
Return type:unicode
pywikibot.config2.makepath(path)[source]

Return a normalized absolute version of the path argument.

  • if the given path already exists in the filesystem the filesystem is not modified.
  • otherwise makepath creates directories along the given path using the dirname() of the path. You may append a ‘/’ to the path if you want it to be a directory path.

from holger@trillke.net 2002/03/18

pywikibot.config2.register_families_folder(folder_path)[source]

Register all family class files contained in a directory.

pywikibot.config2.register_family_file(family_name, file_path)[source]

Register a single family class file.

pywikibot.config2.shortpath(path)[source]

Return a file path relative to config.base_dir.

date Module

Date data and manipulation module.

class pywikibot.date.FormatDate(site)[source]

Bases: object

pywikibot.date.MakeParameter(decoder, param)[source]
pywikibot.date.addFmt1(lang, isMnthOfYear, patterns)[source]

Add 12 month formats for a specific type ('January', 'February', ...) for a given language.

The function must accept one parameter for the ->int or ->string conversions, just like everywhere else in the formats map. The patterns parameter is a list of 12 elements to be used for each month.

pywikibot.date.addFmt2(lang, isMnthOfYear, pattern, makeUpperCase=None)[source]
pywikibot.date.alwaysTrue(x)[source]

Return True, always.

It is used for multiple value selection function to accept all other values.

Parameters:x – not used
Returns:True
Return type:bool
pywikibot.date.decSinglVal(v)[source]
pywikibot.date.dh(value, pattern, encf, decf, filter=None)[source]

This function helps in year parsing.

Usually it will be used as a lambda call in a map::

lambda v: dh(v, u'pattern string', encf, decf)
Parameters:
  • encf

    Converts from an integer parameter to another integer or a tuple of integers. Depending on the pattern, each integer will be converted to a proper string representation, and will be passed as a format argument to the pattern::

    pattern % encf(value)
    

    This function is a complement of decf.

  • decf – Converts a tuple/list of non-negative integers found in the original value string into a normalized value. The normalized value can be passed right back into dh() to produce the original string. This function is a complement of encf. dh() interprets %d as a decimal and %s as a roman numeral number.
pywikibot.date.dh_centuryAD(value, pattern)[source]
pywikibot.date.dh_centuryBC(value, pattern)[source]
pywikibot.date.dh_constVal(value, ind, match)[source]

This function helps with matching a single value to a constant.

formats['CurrEvents']['en'](ind) => u'Current Events'
formats['CurrEvents']['en'](u'Current Events') => ind

pywikibot.date.dh_dayOfMnth(value, pattern)[source]

Helper for decoding a single integer value.

The single integer should be <=31, no conversion, no rounding (used in days of month).

pywikibot.date.dh_decAD(value, pattern)[source]

Helper for decoding a single integer value.

It should be no conversion, round to decimals (used in decades)

pywikibot.date.dh_decBC(value, pattern)[source]

Helper for decoding a single integer value.

It should be no conversion, round to decimals (used in decades)

pywikibot.date.dh_millenniumAD(value, pattern)[source]
pywikibot.date.dh_millenniumBC(value, pattern)[source]
pywikibot.date.dh_mnthOfYear(value, pattern)[source]

Helper for decoding a single integer value.

The value should be >=1000, no conversion, no rounding (used in month of the year)

pywikibot.date.dh_noConv(value, pattern, limit)[source]

Helper for decoding a single integer value, no conversion, no rounding.

pywikibot.date.dh_number(value, pattern)[source]
pywikibot.date.dh_simpleYearAD(value)[source]

Helper for decoding a single integer value.

This value should be representing a year with no extra symbols.

pywikibot.date.dh_singVal(value, match)[source]
pywikibot.date.dh_yearAD(value, pattern)[source]

Helper for decoding a year value.

The value should have no conversion, no rounding, limits to 3000.

pywikibot.date.dh_yearBC(value, pattern)[source]

Helper for decoding a year value.

The value should have no conversion, no rounding, limits to 3000.

pywikibot.date.encDec0(i)[source]
pywikibot.date.encDec1(i)[source]
pywikibot.date.encNoConv(i)[source]
pywikibot.date.escapePattern2(pattern)[source]

Convert a string pattern into a regex expression and cache.

Allows matching of any _digitDecoders inside the string. Returns a compiled regex object and a list of digit decoders.

pywikibot.date.formatYear(lang, year)[source]
pywikibot.date.getAutoFormat(lang, title, ignoreFirstLetterCase=True)[source]

Return first matching formatted date value.

Parameters:
  • lang – language code
  • title – value to format
Returns:

dictName (‘YearBC’, ‘December’, ...) and value (a year, date, ...)

Return type:

tuple
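
For example::

    from pywikibot import date

    # u'January' matches the 'MonthName' format with value 1;
    # a string matching nothing yields (None, None).
    dictName, value = date.getAutoFormat('en', u'January')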

pywikibot.date.getNumberOfDaysInMonth(month)[source]

Return the number of days in a given month, 1 being January, etc.

pywikibot.date.intToLocalDigitsStr(value, digitsToLocalDict)[source]
pywikibot.date.intToRomanNum(i)[source]
pywikibot.date.localDigitsStrToInt(value, digitsToLocalDict, localToDigitsDict)[source]
pywikibot.date.makeMonthList(pattern)[source]
pywikibot.date.makeMonthNamedList(lang, pattern, makeUpperCase=None)[source]

Create a list of 12 elements based on the name of the month.

The language-dependent month name is used as a formatting argument to the pattern. The pattern must have one %s that will be replaced by the localized month name. Use %%d for any other parameters that should be preserved.

pywikibot.date.monthName(lang, ind)[source]
pywikibot.date.multi(value, tuplst)[source]

Run multiple pattern checks for the same entry.

For example: 1st century, 2nd century, etc.

The tuplst is a list of tuples. Each tuple must contain two functions: the first encodes/decodes a single value (e.g. simpleInt); the second is a predicate function with an integer parameter that returns true or false. When the 2nd function evaluates to true, the 1st function is used.

pywikibot.date.romanNumToInt(v)[source]
pywikibot.date.slh(value, lst)[source]

This function helps in simple list value matching.

!!!!! The index starts at 1, so 1st element has index 1, not 0 !!!!!

Usually it will be used as a lambda call in a map::

lambda v: slh(v, [u'January',u'February',...])

Usage scenarios::

formats['MonthName']['en'](1) => u'January'
formats['MonthName']['en'](u'January') => 1
formats['MonthName']['en'](u'anything else') => raise ValueError

diff Module

Diff module.

class pywikibot.diff.Hunk(a, b, grouped_opcode)[source]

Bases: object

One change hunk between a and b.

Note: parts of this code are taken from difflib.get_grouped_opcodes().

APPR = 1
NOT_APPR = -1
PENDING = 0
apply()[source]

Turn a into b for this hunk.

color_line(line, line_ref=None)[source]

Color line characters.

If line_ref is None, the whole line is colored. If line_ref[i] is not blank, line[i] is colored. The color depends on whether the line starts with + or -.

line: string
line_ref: string

create_diff()[source]

Generator of diff text for this hunk, without formatting.

format_diff()[source]

Color diff lines.

get_header()[source]

Provide header of unified diff.

class pywikibot.diff.PatchManager(text_a, text_b, n=0, by_letter=False)[source]

Bases: object

Apply patches to text_a to obtain a new text.

If all hunks are approved, text_b will be obtained.

apply()[source]

Apply changes. If there are undecided changes, ask to review.

get_blocks()[source]

Return a list with blocks of indexes which compose a and, where applicable, b.

Format of each block::

[-1, (i1, i2), (-1, -1)] -> block a[i1:i2] does not change from a to b,
    so there is no corresponding hunk.
[hunk index, (i1, i2), (j1, j2)] -> block a[i1:i2] becomes b[j1:j2]
print_hunks()[source]
review_hunks()[source]

Review hunks.

pywikibot.diff.cherry_pick(oldtext, newtext, n=0, by_letter=False)[source]

Propose a list of changes for approval.

Text with approved changes will be returned.

n: int, lines of context as defined in difflib.get_grouped_opcodes().
by_letter: if text_a and text_b are single lines, comparison can be done letter by letter.
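
A minimal sketch of inspecting hunks without applying them (the texts are illustrative)::

from pywikibot.diff import PatchManager

# Hedged sketch: build the hunks between two texts and print them.
text_a = 'The quick brown fox\njumps over the lazy dog\n'
text_b = 'The quick red fox\njumps over the lazy dog\n'
patch = PatchManager(text_a, text_b)
patch.print_hunks()  # one colored hunk with its unified-diff header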

echo Module

Classes and functions for working with the Echo extension.

class pywikibot.echo.Notification(site)[source]

Bases: object

A notification issued by the Echo extension.

classmethod fromJSON(site, data)[source]

Construct a Notification object from JSON data returned by the API.

Return type:Notification
mark_as_read()[source]

Mark the notification as read.

editor Module

Text editor class for your favourite editor.

class pywikibot.editor.TextEditor[source]

Bases: object

Text editor.

command(tempFilename, text, jumpIndex=None)[source]

Return editor selected in user-config.py.

convertLinebreaks(text)[source]

Convert line-breaks.

edit(text, jumpIndex=None, highlight=None)[source]

Call the editor and thus allow the user to change the text.

Halts the thread’s operation until the editor is closed.

Parameters:
  • text (unicode) – the text to be edited
  • jumpIndex (int) – position at which to put the caret
  • highlight (unicode) – each occurrence of this substring will be highlighted
Returns:

the modified text, or None if the user did not save the text file in the editor

Return type:

unicode or None

restoreLinebreaks(text)[source]

Restore line-breaks.
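
A minimal usage sketch, assuming an editor is configured in user-config.py::

from pywikibot.editor import TextEditor

# Hedged sketch: open the configured editor on some text and highlight a
# substring; edit() returns None if the user quits without saving.
editor = TextEditor()
new_text = editor.edit(u'Some wikitext to fix.', highlight=u'wikitext')
if new_text is None:
    print('The text was not saved.')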

exceptions Module

Exception classes used throughout the framework.

Error: Base class; all exceptions should be subclasses of this class.
  • NoUsername: Username is not in user-config.py, or it is invalid.
  • UserBlocked: Your username or IP has been blocked
  • AutoblockUser: requested action on a virtual autoblock user not valid
  • UserActionRefuse
  • BadTitle: Server responded with BadTitle
  • InvalidTitle: Invalid page title
  • PageNotFound: Page not found in list
  • CaptchaError: A captcha is required and config.solve_captcha is False
  • Server504Error: Server timed out with HTTP 504 code
SiteDefinitionError: Site loading problem
  • UnknownSite: Site does not exist in Family
  • UnknownFamily: Family is not registered
PageRelatedError: any exception which is caused by an operation on a Page.
  • NoPage: Page does not exist
  • IsRedirectPage: Page is a redirect page
  • IsNotRedirectPage: Page is not a redirect page
  • CircularRedirect: Page is a circular redirect
  • SectionError: The section specified by # does not exist
PageSaveRelatedError: page exceptions within the save operation on a Page (alias: PageNotSaved).
  • SpamfilterError: MediaWiki spam filter detected a blacklisted URL
  • OtherPageSaveError: misc. other save-related exception.
  • LockedPage: Page is locked
    • LockedNoPage: Title is locked against creation
    • CascadeLockedPage: Page is locked due to cascading protection
  • EditConflict: Edit conflict while uploading the page
    • PageDeletedConflict: Page was deleted since being retrieved
    • PageCreatedConflict: Page was created by another user
    • ArticleExistsConflict: Page article already exists
  • NoCreateError: parameter nocreate does not allow page creation
ServerError: a problem with the server.
  • FatalServerError: A fatal/non-recoverable server error
WikiBaseError: any issue specific to Wikibase.
  • CoordinateGlobeUnknownException: globe is not implemented yet.
  • EntityTypeUnknownException: entity type is not available on the site.
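
A minimal sketch of catching the page-related exceptions above (the page title is illustrative)::

import pywikibot
from pywikibot.exceptions import IsRedirectPage, NoPage

# Hedged sketch: handle a missing page and a redirect when fetching text.
site = pywikibot.Site('en', 'wikipedia')
page = pywikibot.Page(site, u'Some title')
try:
    text = page.get()
except NoPage:
    text = u''                             # the page does not exist
except IsRedirectPage:
    text = page.getRedirectTarget().get()  # follow the redirect once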

family Module

Objects representing MediaWiki families.

class pywikibot.family.AutoFamily(name, url, site=None)[source]

Bases: pywikibot.family.Family

Family that automatically loads the site configuration.

protocol(code)[source]

Return the protocol of the URL.

scriptpath(code)[source]

Extract the script path from the URL.

class pywikibot.family.Family[source]

Bases: object

Parent class for all wiki families.

apipath(code)[source]
category_redirects(code, fallback='_default')[source]
code2encoding(code)[source]

Return the encoding for a specific language wiki.

code2encodings(code)[source]

Return list of historical encodings for a specific language Wiki.

dbName(code)[source]
disambig(code, fallback='_default')[source]
encoding(code)[source]

Return the encoding for a specific language Wiki.

encodings(code)[source]

Return list of historical encodings for a specific language Wiki.

from_url(url)[source]

Return whether this family matches the given url.

The protocol must match, if it is present in the URL. It must match URLs generated via self.langs and Family.nice_get_address or Family.path.

It uses Family._get_path_regex to generate a regex defining the path after the domain.

Returns:The language code of the url. None if that url is not from this family.
Return type:str or None
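
A minimal usage sketch (the interwiki-style URL with a $1 placeholder is illustrative)::

from pywikibot.family import Family

# Hedged sketch: map a URL back to a language code within a family.
family = Family.load('wikipedia')
print(family.from_url('https://en.wikipedia.org/wiki/$1'))  # 'en'
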
get_cr_templates(code, fallback)[source]
get_known_families(site)[source]
has_query_api(code)[source]

Check whether query.php is installed in the wiki.

hostname(code)[source]

The hostname to use for standard http connections.

ignore_certificate_error(code)[source]

Return whether an HTTPS certificate error should be ignored.

Parameters:code (string) – language code
Returns:flag to allow access if certificate has an error.
Return type:bool
isPublic(code)[source]

Check whether the wiki requires logging in before viewing it.

iwkeys
linktrail(code, fallback='_default')[source]

Return regex for trailing chars displayed as part of a link.

Returns a string, not a compiled regular expression object.

This reads from the family file, and not from [[MediaWiki:Linktrail]], because the MW software currently uses a built-in linktrail from its message files and ignores the wiki value.

static load(fam=None, fatal=NotImplemented)[source]

Import the named family.

Parameters:fam (str) – family name (if omitted, uses the configured default)
Returns:a Family instance configured for the named family.
Raises UnknownFamily:
 family not known
maximum_GET_length(code)[source]
nice_get_address(code, title)[source]
nicepath(code)[source]
path(code)[source]
post_get_convert(site, getText)[source]

Do a conversion on the retrieved text from the Wiki.

For example a X-conversion in Esperanto https://en.wikipedia.org/wiki/Esperanto_orthography#X-system.

pre_put_convert(site, putText)[source]

Do a conversion on the text to insert on the Wiki.

For example a X-conversion in Esperanto https://en.wikipedia.org/wiki/Esperanto_orthography#X-system.

protocol(code)[source]

The protocol to use to connect to the site.

May be overridden to return ‘https’. Other protocols are not supported.

Parameters:code (string) – language code
Returns:protocol that this family uses
Return type:string
querypath(code)[source]
scriptpath(code)[source]

The prefix used to locate scripts on this wiki.

This is the value displayed when you enter {{SCRIPTPATH}} on a wiki page (often displayed at [[Help:Variables]] if the wiki has copied the master help page correctly).

The default value is the one used on Wikimedia Foundation wikis, but needs to be overridden in the family file for any wiki that uses a different value.

server_time(code)[source]

DEPRECATED, use Site.getcurrenttime() instead.

Return a datetime object representing server time.

shared_data_repository(code, transcluded=False)[source]

Return the shared Wikibase repository, if any.

shared_image_repository(code)[source]

Return the shared image repository, if any.

ssl_hostname(code)[source]

The hostname to use for SSL connections.

ssl_pathprefix(code)[source]

The path prefix for secure HTTP access.

version(code)[source]

Return MediaWiki version number as a string.

Use pywikibot.tools.MediaWikiVersion to compare version strings.

versionnumber(code)[source]

DEPRECATED, use version() instead.

Return an int identifying the MediaWiki version. Use pywikibot.tools.MediaWikiVersion to compare version strings instead.

Currently this is implemented as returning the minor version number; i.e., ‘X’ in version ‘1.X.Y’.

class pywikibot.family.WikimediaFamily[source]

Bases: pywikibot.family.Family

Class for all Wikimedia families.

protocol(code)[source]

Return ‘https’ as the protocol.

shared_image_repository(code)[source]

fixes Module

File containing all standard fixes. Currently available predefined fixes are:

  • HTML - Convert HTML tags to wiki syntax, and fix XHTML.
  • isbn - Fix badly formatted ISBNs.
  • syntax - Try to fix bad wiki markup. Do not run this in automatic mode, as the bot may make mistakes.
  • syntax-safe - Like syntax, but less risky, so you can run this in automatic mode.
  • case-de - Fix upper/lower case errors in German.
  • grammar-de - Fix grammar and typography in German.
  • vonbis - Replace a hyphen or dash by “bis” in German.
  • music - Fix links to disambiguation pages in German.
  • datum - Specific date formats in German.
  • correct-ar - Corrections for Arabic Wikipedia and any Arabic wiki.
  • yu-tld - The yu top-level domain will soon be disabled, see https://lists.wikimedia.org/pipermail/wikibots-l/2009-February/000290.html
  • fckeditor - Try to convert FCKeditor HTML tags to wiki syntax.

i18n Module

Various i18n functions.

Helper functions for both the internal translation system and for TranslateWiki-based translations.

exception pywikibot.i18n.TranslationError(arg)[source]

Bases: pywikibot.exceptions.Error

Raised when no correct translation could be found.

pywikibot.i18n.input(twtitle, parameters=None, password=False)[source]

Ask the user a question, return the user’s answer.

The prompt message is retrieved via twtranslate and either uses the config variable ‘userinterface_lang’ or the default locale as the language code.

Parameters:
  • twtitle – The TranslateWiki string title, in <package>-<key> format
  • parameters – The values which will be applied to the translated text
  • password – Hides the user’s input (for password entry)
Return type:

unicode string

pywikibot.i18n.translate(code, xdict, parameters=None, fallback=False)[source]

Return the most appropriate translation from a translation dict.

Given a language code and a dictionary, returns the dictionary’s value for key ‘code’ if this key exists; otherwise tries to return a value for an alternative language that is most applicable to use on the wiki in language ‘code’, unless fallback is False.

The language itself is always checked first, then languages that have been defined to be alternatives, and finally English. If none of the options gives a result, we just take the first language in the list.

For PLURAL support, have a look at the twntranslate method.

Parameters:
  • code (string or Site object) – The language code
  • xdict (dict, string, unicode) – dictionary with language codes as keys or extended dictionary with family names as keys containing language dictionaries or a single (unicode) string. May contain PLURAL tags as described in twntranslate
  • parameters (dict, string, unicode, int) – For passing (plural) parameters
  • fallback (boolean) – Try an alternate language code
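
A minimal usage sketch (the dictionary is illustrative)::

from pywikibot import i18n

# Hedged sketch: select the best translation from a plain language dict.
xdict = {'en': u'Hello world', 'de': u'Hallo Welt'}
print(i18n.translate('de', xdict))                 # u'Hallo Welt'
print(i18n.translate('fr', xdict, fallback=True))  # falls back, typically to the English entry
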
pywikibot.i18n.twhas_key(code, twtitle)[source]

Check if a message has a translation in the specified language code.

The translations are retrieved from i18n.<package>, based on the caller’s import table.

No code fallback is made.

Parameters:
  • code – The language code
  • twtitle – The TranslateWiki string title, in <package>-<key> format
pywikibot.i18n.twntranslate(code, twtitle, parameters=None)[source]

Translate a message with plural support.

Support is implemented like in the MediaWiki extension. If the TranslateWiki message contains a plural tag which looks like::

{{PLURAL:<number>|<variant1>|<variant2>[|<variantn>]}}

it takes that variant calculated by the plural_rules depending on the number value. Multiple plurals are allowed.

As an example, if we had a test dictionary in test.py like::

msg = {
    'en': {
        # number value as format string is allowed
        'test-plural': u'Bot: Changing %(num)s {{PLURAL:%(num)d|page|pages}}.',
    },
    'nl': {
        # format string inside PLURAL tag is allowed
        'test-plural': u'Bot: Pas {{PLURAL:num|1 pagina|%(num)d pagina\'s}} aan.',
    },
    'fr': {
        # additional string inside or outside PLURAL tag is allowed
        'test-plural': u'Robot: Changer %(descr)s {{PLURAL:num|une page|quelques pages}}.',
    },
}
>>> from pywikibot import i18n
>>> i18n.messages_package_name = 'tests.i18n'
>>> # use a number
>>> str(i18n.twntranslate('en', 'test-plural', 0) % {'num': 'no'})
'Bot: Changing no pages.'
>>> # use a string
>>> str(i18n.twntranslate('en', 'test-plural', '1') % {'num': 'one'})
'Bot: Changing one page.'
>>> # use a dictionary
>>> str(i18n.twntranslate('en', 'test-plural', {'num':2}))
'Bot: Changing 2 pages.'
>>> # use additional format strings
>>> str(i18n.twntranslate('fr', 'test-plural', {'num': 1, 'descr': 'seulement'}))
'Robot: Changer seulement une page.'
>>> # use format strings also outside
>>> str(i18n.twntranslate('fr', 'test-plural', 10) % {'descr': 'seulement'})
'Robot: Changer seulement quelques pages.'
>>> i18n.messages_package_name = 'scripts.i18n'

The translations are retrieved from i18n.<package>, based on the caller’s import table.

Parameters:
  • code – The language code
  • twtitle – The TranslateWiki string title, in <package>-<key> format
  • parameters – For passing (plural) parameters.
pywikibot.i18n.twtranslate(code, twtitle, parameters=None)[source]

Translate a message.

The translations are retrieved from i18n.<package>, based on the caller’s import table.

Parameters:
  • code – The language code
  • twtitle – The TranslateWiki string title, in <package>-<key> format
  • parameters – For passing parameters.
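
A minimal usage sketch; ‘test-localized’ is a hypothetical message key, real keys live in the i18n package in <package>-<key> format::

from pywikibot import i18n

# Hedged sketch: fetch a translated message and apply parameters.
# 'test-localized' is a hypothetical key assumed to contain %(page)s.
summary = i18n.twtranslate('de', 'test-localized', {'page': u'Hauptseite'})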

interwiki_graph Module

Module with the Graphviz drawing calls.

class pywikibot.interwiki_graph.GraphDrawer(subject)[source]

Bases: object

Graphviz (dot) code creator.

addDirectedEdge(page, refPage)[source]
addNode(page)[source]
createGraph()[source]

Create graph of the interwiki links.

For more info see http://meta.wikimedia.org/wiki/Interwiki_graphs

getLabel(page)[source]
saveGraphFile()[source]
exception pywikibot.interwiki_graph.GraphImpossible[source]

Bases: Exception

Drawing a graph is not possible on your system.

class pywikibot.interwiki_graph.GraphSavingThread(graph, originPage)[source]

Bases: threading.Thread

Threaded graph renderer.

Rendering a graph can take an extremely long time, so we use multithreading.

TODO: Find out if several threads running in parallel can slow down the system too much. Consider adding a mechanism to kill a thread if it takes too long.

run()[source]
pywikibot.interwiki_graph.getFilename(page, extension=None)[source]

Create a filename that is unique for the page.

Parameters:
  • page (Page) – page used to create the new filename
  • extension (str) – file extension
Returns:

filename of <family>-<lang>-<page>.<ext>

Return type:

str

logentries Module

Objects representing Mediawiki log entries.

class pywikibot.logentries.BlockEntry(apidata)[source]

Bases: pywikibot.logentries.LogEntry

Block log entry.

duration()[source]

Return a datetime.timedelta representing the block duration.

Returns:datetime.timedelta, or None if block is indefinite.
Raises Error:the entry is an unblocking log entry.
expiry()[source]

Return a Timestamp representing the block expiry date.

Raises Error:the entry is an unblocking log entry.
flags()[source]

Return a list of (str) flags associated with the block entry.

It raises an Error if the entry is an unblocking log entry.

isAutoblockRemoval()[source]
title()[source]

Return the blocked account or IP.

Returns:the Page object of username or IP if this block action targets a username or IP, or the blockid if this log reflects the removal of an autoblock
Return type:Page or int
class pywikibot.logentries.DeleteEntry(apidata)[source]

Bases: pywikibot.logentries.LogEntry

Deletion log entry.

class pywikibot.logentries.ImportEntry(apidata)[source]

Bases: pywikibot.logentries.LogEntry

Import log entry.

class pywikibot.logentries.LogDict[source]

Bases: dict

Simple custom dict that raises a custom KeyError when a key is missing.

It also logs debugging information when a key is missing.

class pywikibot.logentries.LogEntry(apidata)[source]

Bases: object

Generic log entry.

action()[source]
comment()[source]
logid()[source]
ns()[source]
pageid()[source]
timestamp()[source]

Timestamp object corresponding to event timestamp.

title()[source]

Page on which action was performed.

type()[source]
user()[source]
class pywikibot.logentries.LogEntryFactory(logtype=None)[source]

Bases: object

LogEntry Factory.

Only available method is create()

create(logdata)[source]

Instantiate the LogEntry object representing logdata.

Parameters:logdata (dict) – <item> returned by the api
Returns:LogEntry object representing logdata
class pywikibot.logentries.MoveEntry(apidata)[source]

Bases: pywikibot.logentries.LogEntry

Move log entry.

new_ns()[source]
new_title()[source]

Return page object of the new title.

suppressedredirect()[source]

Return True if no redirect was created during the move.

Return type:bool
class pywikibot.logentries.NewUsersEntry(apidata)[source]

Bases: pywikibot.logentries.LogEntry

New user log entry.

class pywikibot.logentries.PatrolEntry(apidata)[source]

Bases: pywikibot.logentries.LogEntry

Patrol log entry.

class pywikibot.logentries.ProtectEntry(apidata)[source]

Bases: pywikibot.logentries.LogEntry

Protection log entry.

class pywikibot.logentries.RightsEntry(apidata)[source]

Bases: pywikibot.logentries.LogEntry

Rights log entry.

class pywikibot.logentries.UploadEntry(apidata)[source]

Bases: pywikibot.logentries.LogEntry

Upload log entry.
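
A minimal sketch of obtaining log entries; site.logevents() yields objects built by LogEntryFactory::

import pywikibot

# Hedged sketch: print a few recent move log entries.
site = pywikibot.Site('en', 'wikipedia')
for entry in site.logevents(logtype='move', total=5):
    print(entry.timestamp(), entry.title(), '->', entry.new_title())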

login Module

Library to log the bot in to a wiki account.

class pywikibot.login.LoginManager(password=None, sysop=False, site=None, user=None, verbose=NotImplemented, username='[deprecated name of user]')[source]

Bases: object

Site login manager.

botAllowed()[source]

Check whether the bot is listed on a specific page.

This allows bots to comply with the policy on the respective wiki.

getCookie(remember=True, captcha=None)[source]

Login to the site.

remember: Remember login (default: True)
captcha: a dictionary containing the captcha id and answer, if any

Returns cookie data if successful, None otherwise.

login(retry=False)[source]

Attempt to log into the server.

Parameters:retry (bool) – infinitely retry if the API returns an unknown error
Raises NoUsername:
 Username is not recognised by the site.
readPassword()[source]

Read passwords from a file.

DO NOT FORGET TO REMOVE READ ACCESS FOR OTHER USERS!!! Use chmod 600 password-file.

All lines below should be valid Python tuples in the form (code, family, username, password), (family, username, password) or (username, password) to set a default password for a username. The last matching entry will be used, so default usernames should occur above specific usernames.

If the username or password contain non-ASCII characters, they should be stored using the utf-8 encoding.

Example:

(u"my_username", u"my_default_password")
(u"my_sysop_user", u"my_sysop_password")
(u"wikipedia", u"my_wikipedia_user", u"my_wikipedia_pass")
(u"en", u"wikipedia", u"my_en_wikipedia_user", u"my_en_wikipedia_pass")
showCaptchaWindow(url)[source]
storecookiedata(data)[source]

Store cookie data.

The argument data is the raw data, as returned by getCookie().

Returns nothing.

page Module

Objects representing various types of MediaWiki pages, including Wikibase pages.

This module also includes objects:

  • Link: an internal or interwiki link in wikitext.
  • Revision: a single change to a wiki page.
  • Property: a type of semantic data.
  • Claim: an instance of a semantic assertion.

pagegenerators Module

This module offers a wide variety of page generators.

A page generator is an object that is iterable (see http://legacy.python.org/dev/peps/pep-0255/ ) and that yields page objects on which other scripts can then work.

Pagegenerators.py cannot be run as a script. For testing purposes listpages.py can be used instead, to print page titles to standard output.

These parameters are supported to specify which page titles to print (a usage sketch with GeneratorFactory follows the list):

-cat              Work on all pages which are in a specific category.
                  Argument can also be given as "-cat:categoryname" or
                  as "-cat:categoryname|fromtitle" (using # instead of |
                  is also allowed in this one and the following)

-catr             Like -cat, but also recursively includes pages in
                  subcategories, sub-subcategories etc. of the
                  given category.
                  Argument can also be given as "-catr:categoryname" or
                  as "-catr:categoryname|fromtitle".

-subcats          Work on all subcategories of a specific category.
                  Argument can also be given as "-subcats:categoryname" or
                  as "-subcats:categoryname|fromtitle".

-subcatsr         Like -subcats, but also includes sub-subcategories etc. of
                  the given category.
                  Argument can also be given as "-subcatsr:categoryname" or
                  as "-subcatsr:categoryname|fromtitle".

-uncat            Work on all pages which are not categorised.

-uncatcat         Work on all categories which are not categorised.

-uncatfiles       Work on all files which are not categorised.

-file             Read a list of pages to treat from the named text file.
                  Page titles in the file may be either enclosed with
                  [[brackets]], or be separated by new lines.
                  Argument can also be given as "-file:filename".

-filelinks        Work on all pages that use a certain image/media file.
                  Argument can also be given as "-filelinks:filename".

-search           Work on all pages that are found in a MediaWiki search
                  across all namespaces.

-namespaces       Filter the page generator to only yield pages in the
-namespace        specified namespaces. Separate multiple namespace
-ns               numbers with commas. Example "-ns:0,2,4"
                  If used with -newpages, -namespace/-ns must be provided
                  before -newpages.
                  If used with -recentchanges, efficiency is improved if
                  -namespace/-ns is provided before -recentchanges.

-interwiki        Work on the given page and all equivalent pages in other
                  languages. This can, for example, be used to fight
                  multi-site spamming.
                  Attention: this will cause the bot to modify
                  pages on several wiki sites, this is not well tested,
                  so check your edits!

-limit:n          When used with any other argument that specifies a set
                  of pages, work on no more than n pages in total.

-links            Work on all pages that are linked from a certain page.
                  Argument can also be given as "-links:linkingpagetitle".

-imagesused       Work on all images that are contained on a certain page.
                  Argument can also be given as "-imagesused:linkingpagetitle".

-newimages        Work on the 100 newest images. If given as -newimages:x,
                  will work on the x newest images.

-newpages         Work on the most recent new pages. If given as -newpages:x,
                  will work on the x newest pages.

-recentchanges    Work on the pages with the most recent changes. If
                  given as -recentchanges:x, will work on the x most recently
                  changed pages.

-ref              Work on all pages that link to a certain page.
                  Argument can also be given as "-ref:referredpagetitle".

-start            Specifies that the robot should go alphabetically through
                  all pages on the home wiki, starting at the named page.
                  Argument can also be given as "-start:pagetitle".

                  You can also include a namespace. For example,
                  "-start:Template:!" will make the bot work on all pages
                  in the template namespace.

-prefixindex      Work on pages commencing with a common prefix.

-step:n           When used with any other argument that specifies a set
                  of pages, only retrieve n pages at a time from the wiki
                  server.

-titleregex       Work on titles that match the given regular expression.

-transcludes      Work on all pages that use a certain template.
                  Argument can also be given as "-transcludes:Title".

-unusedfiles      Work on all description pages of images/media files that are
                  not used anywhere.
                  Argument can be given as "-unusedfiles:n" where
                  n is the maximum number of articles to work on.

-lonelypages      Work on all articles that are not linked from any other
                  article.
                  Argument can be given as "-lonelypages:n" where
                  n is the maximum number of articles to work on.

-unwatched        Work on all articles that are not watched by anyone.
                  Argument can be given as "-unwatched:n" where
                  n is the maximum number of articles to work on.

-usercontribs     Work on all articles that were edited by a certain user.
                  (Example : -usercontribs:DumZiBoT)

-weblink          Work on all articles that contain an external link to
                  a given URL; may be given as "-weblink:url"

-withoutinterwiki Work on all pages that don't have interlanguage links.
                  Argument can be given as "-withoutinterwiki:n" where
                  n is the total to fetch.

-mysqlquery       Takes a MySQL query string like
                  "SELECT page_namespace, page_title FROM page
                  WHERE page_namespace = 0" and works on the resulting pages.

-wikidataquery    Takes a WikidataQuery query string like claim[31:12280]
                  and works on the resulting pages.

-random           Work on random pages returned by [[Special:Random]].
                  Can also be given as "-random:n" where n is the number
                  of pages to be returned, otherwise the default is 10 pages.

-randomredirect   Work on random redirect pages returned by
                  [[Special:RandomRedirect]]. Can also be given as
                  "-randomredirect:n" where n is the number of pages to be
                  returned, else 10 pages are returned.

-untagged         Work on image pages that don't have any license template on a
                  site given in the format "<language>.<project>.org", e.g.
                  "ja.wikipedia.org" or "commons.wikimedia.org".
                  Using an external Toolserver tool.

-google           Work on all pages that are found in a Google search.
                  You need a Google Web API license key. Note that Google
                  doesn't give out license keys anymore. See google_key in
                  config.py for instructions.
                  Argument can also be given as "-google:searchstring".

-yahoo            Work on all pages that are found in a Yahoo search.
                  Depends on python module pYsearch.  See yahoo_appid in
                  config.py for instructions.

-page             Work on a single page. Argument can also be given as
                  "-page:pagetitle", and supplied multiple times for
                  multiple pages.

-grep             A regular expression that needs to match the article
                  otherwise the page won't be returned.
                  Multiple -grep:regexpr can be provided and the page will
                  be returned if content is matched by any of the regexpr
                  provided.
                  Case insensitive regular expressions will be used and
                  dot matches any character, including a newline.

-intersect        Work on the intersection of all the provided generators.
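
A minimal sketch of wiring such arguments through GeneratorFactory (the category name is illustrative)::

import pywikibot
from pywikibot import pagegenerators

# Hedged sketch: parse generator arguments the way scripts do, then
# iterate the combined, preloaded result.
factory = pagegenerators.GeneratorFactory()
for arg in ('-cat:Physics', '-ns:0'):
    factory.handleArg(arg)
gen = factory.getCombinedGenerator()
for page in pagegenerators.PreloadingGenerator(gen, step=50):
    print(page.title())
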
pywikibot.pagegenerators.AllpagesPageGenerator(start='!', namespace=0, includeredirects=True, site=None, step=None, total=None, content=False)[source]

Iterate Page objects for all titles in a single namespace.

If includeredirects is False, redirects are not included. If includeredirects equals the string ‘only’, only redirects are added.

Parameters:
  • step (int) – Maximum number of pages to retrieve per API query
  • total (int) – Maximum number of pages to retrieve in total
  • content – If True, load current version of each page (default False)
  • site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.AncientPagesPageGenerator(total=100, site=None, number='[deprecated name of total]', repeat=NotImplemented)[source]

Ancient page generator.

Parameters:
  • total (int) – Maximum number of pages to retrieve in total
  • site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.CategorizedPageGenerator(category, recurse=False, start=None, step=None, total=None, content=False)[source]

Yield all pages in a specific category.

If recurse is True, pages in subcategories are included as well; if recurse is an int, only subcategories to that depth will be included (e.g., recurse=2 will get pages in subcats and sub-subcats, but will not go any further).

If start is a string value, only pages whose sortkey comes after start alphabetically are included.

If content is True (default is False), the current page text of each retrieved page will be downloaded.
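
A minimal usage sketch (the category name is illustrative)::

import pywikibot
from pywikibot.pagegenerators import CategorizedPageGenerator

# Hedged sketch: pages of a category, one subcategory level deep.
site = pywikibot.Site('en', 'wikipedia')
cat = pywikibot.Category(site, 'Category:Physics')
for page in CategorizedPageGenerator(cat, recurse=1, total=10):
    print(page.title())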

pywikibot.pagegenerators.CategoryGenerator(generator)[source]

Yield pages from another generator as Category objects.

Makes sense only if it is ascertained that only categories are being retrieved.

pywikibot.pagegenerators.CombinedPageGenerator(generators)[source]

Yield from each iterable until exhausted, then proceed with the next.

pywikibot.pagegenerators.DayPageGenerator(startMonth=1, endMonth=12, site=None)[source]

Day page generator.

Parameters:site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.DeadendPagesPageGenerator(total=100, site=None, number='[deprecated name of total]', repeat=NotImplemented)[source]

Dead-end page generator.

Parameters:
  • total (int) – Maximum number of pages to retrieve in total
  • site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.DequePreloadingGenerator(generator, step=50)[source]

Preload generator of type DequeGenerator.

pywikibot.pagegenerators.DuplicateFilterPageGenerator(generator)[source]

Yield all unique pages from another generator, omitting duplicates.

pywikibot.pagegenerators.EdittimeFilterPageGenerator(generator, last_edit_start=None, last_edit_end=None, first_edit_start=None, first_edit_end=None, show_filtered=False, begintime='[deprecated name of last_edit_start]', endtime='[deprecated name of last_edit_end]')[source]

Wrap a generator to filter pages outside last or first edit range.

Parameters:
  • generator – A generator object
  • last_edit_start (datetime) – Only yield pages last edited after this time
  • last_edit_end (datetime) – Only yield pages last edited before this time
  • first_edit_start (datetime) – Only yield pages first edited after this time
  • first_edit_end (datetime) – Only yield pages first edited before this time
  • show_filtered (bool) – Output a message for each page not yielded
pywikibot.pagegenerators.FileGenerator(generator)[source]

Yield pages from another generator as FilePage objects.

Makes sense only if it is ascertained that only images are being retrieved.

pywikibot.pagegenerators.FileLinksGenerator(referredFilePage, step=None, total=None, content=False)[source]

Yield Pages on which the file referredFilePage is displayed.

class pywikibot.pagegenerators.GeneratorFactory(site=None)[source]

Bases: object

Process command line arguments and return appropriate page generator.

This factory is responsible for processing command line arguments that are used by many scripts and that determine which pages to work on.

getCategoryGen(arg, recurse=False, content=False, gen_func=None)[source]

Return generator based on Category defined by arg and gen_func.

getCombinedGenerator(gen=None)[source]

Return the combination of all accumulated generators.

Only call this after all arguments have been parsed.

handleArg(arg)[source]

Parse one argument at a time.

If it is recognized as an argument that specifies a generator, a generator is created and added to the accumulation list, and the function returns true. Otherwise, it returns false, so that the caller can try parsing the argument. Call getCombinedGenerator() after all arguments have been parsed to get the final output generator.

site

Generator site.

Returns:Site given to constructor, otherwise the default Site.
Return type:pywikibot.site.BaseSite
class pywikibot.pagegenerators.GoogleSearchPageGenerator(query=None, site=None)[source]

Bases: object

Page generator using Google search results.

To use this generator, you need to install the package ‘google’. https://pypi.python.org/pypi/google

This package has been available since 2010, hosted on github since 2012, and provided by pypi since 2013.

As there are concerns about Google’s Terms of Service, this generator prints a warning for each query.

queryGoogle(query)[source]

Perform a query using python package ‘google’.

The terms of service as at June 2014 give two conditions that may apply to use of search:

1. Don't access [Google Services] using a method other than
   the interface and the instructions that [they] provide.
2. Don't remove, obscure, or alter any legal notices
   displayed in or along with [Google] Services.

Both of those issues should be managed by the package ‘google’, however Pywikibot will at least ensure the user sees the TOS in order to comply with the second condition.

pywikibot.pagegenerators.ImageGenerator(generator)

Yield pages from another generator as FilePage objects.

Makes sense only if it is ascertained that only images are being retrieved.

pywikibot.pagegenerators.ImagesPageGenerator(pageWithImages, step=None, total=None, content=False)[source]

Yield FilePages displayed on pageWithImages.

pywikibot.pagegenerators.InterwikiPageGenerator(page)[source]

Iterate over all interwiki (non-language) links on a page.

pywikibot.pagegenerators.LanguageLinksPageGenerator(page, step=None, total=None)[source]

Iterate over all interwiki language links on a page.

pywikibot.pagegenerators.LinkedPageGenerator(linkingPage, step=None, total=None, content=False)[source]

Yield all pages linked from a specific page.

pywikibot.pagegenerators.LinksearchPageGenerator(link, namespaces=None, step=None, total=None, site=None)[source]

Yield all pages that include a specified link.

Obtains data from [[Special:Linksearch]].

Parameters:
  • step (int) – Maximum number of pages to retrieve per API query
  • total (int) – Maximum number of pages to retrieve in total
  • site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.LonelyPagesPageGenerator(total=100, site=None, number='[deprecated name of total]', repeat=NotImplemented)[source]

Lonely page generator.

Parameters:
  • total (int) – Maximum number of pages to retrieve in total
  • site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.LongPagesPageGenerator(total=100, site=None, number='[deprecated name of total]', repeat=NotImplemented)[source]

Long page generator.

Parameters:
  • total (int) – Maximum number of pages to retrieve in total
  • site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.MySQLPageGenerator(query, site=None)[source]

Yield a list of pages based on a MySQL query.

Each query should provide the page namespace and page title. An example query that yields all ns0 pages might look like::

SELECT
 page_namespace,
 page_title
FROM page
WHERE page_namespace = 0;

Requires oursql <https://pythonhosted.org/oursql/> or MySQLdb <https://sourceforge.net/projects/mysql-python/>

Returns:

iterator of pywikibot.Page

pywikibot.pagegenerators.NamespaceFilterPageGenerator(generator, namespaces, site=None)[source]

A generator yielding pages from another generator in given namespaces.

The namespace list can contain both integers (namespace numbers) and strings/unicode strings (namespace names).

NOTE: API-based generators that have a “namespaces” parameter perform namespace filtering more efficiently than this generator.

Parameters:
  • namespaces (list of int) – list of namespace numbers to limit results
  • site (pywikibot.site.BaseSite) – Site for generator results, only needed if namespaces contains namespace names.
pywikibot.pagegenerators.NewimagesPageGenerator(step=None, total=None, site=None, number='[deprecated name of total]')[source]

New file generator.

Parameters:
  • step (int) – Maximum number of pages to retrieve per API query
  • total (int) – Maximum number of pages to retrieve in total
  • site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.NewpagesPageGenerator(get_redirect=False, site=None, namespaces=[0], step=None, total=None, namespace='[deprecated name of namespaces]', number='[deprecated name of total]', repeat=NotImplemented)[source]

Iterate Page objects for all new titles in a single namespace.

Parameters:
  • step (int) – Maximum number of pages to retrieve per API query
  • total (int) – Maximum number of pages to retrieve in total
  • site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.PageTitleFilterPageGenerator(generator, ignore_list, ignoreList='[deprecated name of ignore_list]')[source]

Yield only those pages that are not listed in the ignore list.

Parameters:ignore_list (dict) – family names are mapped to dictionaries in which language codes are mapped to lists of page titles. Each title must be a valid regex as they are compared using re.search.
pywikibot.pagegenerators.PageWithTalkPageGenerator(generator)[source]

Yield pages and associated talk pages from another generator.

Only yields talk pages if the original generator yields a non-talk page, and does not check if the talk page in fact exists.

pywikibot.pagegenerators.PagesFromTitlesGenerator(iterable, site=None)[source]

Generate pages from the titles (unicode strings) yielded by iterable.

Parameters:site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.PrefixingPageGenerator(prefix, namespace=None, includeredirects=True, site=None, step=None, total=None, content=False)[source]

Prefixed Page generator.

Parameters:
  • step (int) – Maximum number of pages to retrieve per API query
  • total (int) – Maximum number of pages to retrieve in total
  • content – If True, load current version of each page (default False)
  • site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.PreloadingGenerator(generator, step=50, lookahead=NotImplemented, pageNumber='[deprecated name of step]')[source]

Yield preloaded pages taken from another generator.

Parameters:
  • generator – pages to iterate over
  • step (int) – how many pages to preload at once
pywikibot.pagegenerators.PreloadingItemGenerator(generator, step=50)[source]

Yield preloaded pages taken from another generator.

This function is basically copied from above, but for ItemPages.

Parameters:
  • generator – pages to iterate over
  • step (int) – how many pages to preload at once
pywikibot.pagegenerators.RandomPageGenerator(total=10, site=None, number='[deprecated name of total]')[source]

Random page generator.

Parameters:
  • total (int) – Maximum number of pages to retrieve in total
  • site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.RandomRedirectPageGenerator(total=10, site=None, number='[deprecated name of total]')[source]

Random redirect generator.

Parameters:
  • total (int) – Maximum number of pages to retrieve in total
  • site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.RecentChangesPageGenerator(start=None, end=None, reverse=False, namespaces=None, pagelist=None, changetype=None, showMinor=None, showBot=None, showAnon=None, showRedirects=None, showPatrolled=None, topOnly=False, step=None, total=None, user=None, excludeuser=None, site=None)[source]

Generate pages that are in the recent changes list.

Parameters:
  • start (pywikibot.Timestamp) – Timestamp to start listing from
  • end (pywikibot.Timestamp) – Timestamp to end listing at
  • reverse (bool) – if True, start with oldest changes (default: newest)
  • pagelist (list of Pages) – iterate changes to pages in this list only
  • changetype (basestring) – only iterate changes of this type (“edit” for edits to existing pages, “new” for new pages, “log” for log entries)
  • showMinor (bool or None) – if True, only list minor edits; if False, only list non-minor edits; if None, list all
  • showBot (bool or None) – if True, only list bot edits; if False, only list non-bot edits; if None, list all
  • showAnon (bool or None) – if True, only list anon edits; if False, only list non-anon edits; if None, list all
  • showRedirects (bool or None) – if True, only list edits to redirect pages; if False, only list edits to non-redirect pages; if None, list all
  • showPatrolled (bool or None) – if True, only list patrolled edits; if False, only list non-patrolled edits; if None, list all
  • topOnly (bool) – if True, only list changes that are the latest revision (default False)
  • user (basestring|list) – if not None, only list edits by this user or users
  • excludeuser (basestring|list) – if not None, exclude edits by this user or users
  • site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.RedirectFilterPageGenerator(generator, no_redirects=True, show_filtered=False)[source]

Yield pages from another generator, either only redirects or only non-redirects.

Parameters:
  • no_redirects (bool) – Exclude redirects if True, else only include redirects.
  • show_filtered (bool) – Output a message for each page not yielded
pywikibot.pagegenerators.ReferringPageGenerator(referredPage, followRedirects=False, withTemplateInclusion=True, onlyTemplateInclusion=False, step=None, total=None, content=False)[source]

Yield all pages referring to a specific page.

class pywikibot.pagegenerators.RegexFilter[source]

Bases: object

Regex filter.

classmethod contentfilter(generator, regex, quantifier='any')[source]

Yield pages from another generator whose body matches regex.

Uses regex option re.IGNORECASE depending on the quantifier parameter.

For parameters see titlefilter above.

classmethod titlefilter(generator, regex, quantifier='any', ignore_namespace=True, inverse='[deprecated name of quantifier]')[source]

Yield pages from another generator whose title matches regex.

Uses regex option re.IGNORECASE depending on the quantifier parameter.

If ignore_namespace is False, the whole page title is compared. NOTE: if you want to check for a match at the beginning of the title, you have to start the regex with “^”

Parameters:
  • generator (any generator or iterator) – another generator
  • regex (a single regex string or a list of regex strings or a compiled regex or a list of compiled regexes) – a regex which should match the page title
  • quantifier (string of (‘all’, ‘any’, ‘none’)) – must be one of the following values:
    ‘all’ - yields page if title is matched by all regexes
    ‘any’ - yields page if title is matched by any regexes
    ‘none’ - yields page if title is NOT matched by any regexes
  • ignore_namespace (bool) – ignore the namespace when matching the title
Returns:

return a page depending on the matching parameters
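
A minimal usage sketch (the titles are illustrative)::

import pywikibot
from pywikibot.pagegenerators import PagesFromTitlesGenerator, RegexFilter

# Hedged sketch: keep only titles starting with 'List of'.
site = pywikibot.Site('en', 'wikipedia')
gen = PagesFromTitlesGenerator(
    [u'List of physicists', u'Physics', u'List of lists'], site=site)
for page in RegexFilter.titlefilter(gen, u'^List of'):
    print(page.title())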

pywikibot.pagegenerators.RepeatingGenerator(generator, key_func=lambda x: x, sleep_duration=60, total=None, **kwargs)[source]

Yield items in live time.

The provided generator must support the parameters ‘start’, ‘end’, ‘reverse’ and ‘total’, as site.recentchanges() and site.logevents() do.

To fetch revisions in recentchanges in live time::

gen = RepeatingGenerator(site.recentchanges, lambda x: x['revid'])

To fetch new pages in live time::

gen = RepeatingGenerator(site.newpages, lambda x: x[0])

Note that other parameters not listed below will be passed to the generator function. Parameters ‘reverse’, ‘start’ and ‘end’ will always be discarded to prevent the generator from yielding items in the wrong order.

Parameters:
  • generator – a function returning a generator that will be queried
  • key_func – a function returning key that will be used to detect duplicate entry
  • sleep_duration – duration between each query
  • total (int or None) – if it is a positive number, iterate no more than this number of items in total. Otherwise, iterate forever
Returns:

a generator yielding items in ascending order by time

pywikibot.pagegenerators.SearchPageGenerator(query, step=None, total=None, namespaces=None, site=None)[source]

Yield pages from the MediaWiki internal search engine.

Parameters:
  • step (int) – Maximum number of pages to retrieve per API query
  • total (int) – Maximum number of pages to retrieve in total
  • site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.ShortPagesPageGenerator(total=100, site=None, number='[deprecated name of total]', repeat=NotImplemented)[source]

Short page generator.

Parameters:
  • total (int) – Maximum number of pages to retrieve in total
  • site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.SubCategoriesPageGenerator(category, recurse=False, start=None, step=None, total=None, content=False)[source]

Yield all subcategories in a specific category.

If recurse is True, pages in subcategories are included as well; if recurse is an int, only subcategories to that depth will be included (e.g., recurse=2 will get pages in subcats and sub-subcats, but will not go any further).

If start is a string value, only categories whose sortkey comes after start alphabetically are included.

If content is True (default is False), the current page text of each category description page will be downloaded.

pywikibot.pagegenerators.TextfilePageGenerator(filename=None, site=None)[source]

Iterate pages from a list in a text file.

The file must contain page links between double-square-brackets or, alternatively, separated by newlines. The generator will yield each corresponding Page object.

Parameters:
  • filename (unicode) – the name of the file that should be read. If no name is given, the generator prompts the user.
  • site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.UnCategorizedCategoryGenerator(total=100, site=None, number='[deprecated name of total]', repeat=NotImplemented)[source]

Uncategorized category generator.

Parameters:
  • total (int) – Maximum number of pages to retrieve in total
  • site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.UnCategorizedImageGenerator(total=100, site=None, number='[deprecated name of total]', repeat=NotImplemented)[source]

Uncategorized file generator.

Parameters:
  • total (int) – Maximum number of pages to retrieve in total
  • site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.UnCategorizedPageGenerator(total=100, site=None, number='[deprecated name of total]', repeat=NotImplemented)[source]

Uncategorized page generator.

Parameters:
  • total (int) – Maximum number of pages to retrieve in total
  • site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.UnCategorizedTemplateGenerator(total=100, site=None)[source]

Uncategorized template generator.

Parameters:
  • total (int) – Maximum number of pages to retrieve in total
  • site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.UntaggedPageGenerator(untaggedProject, limit=500, site=None)[source]

Yield pages from defunct toolserver UntaggedImages.php.

It was using this tool: https://toolserver.org/~daniel/WikiSense/UntaggedImages.php

Parameters:site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.UnusedFilesGenerator(total=100, site=None, extension=NotImplemented, number='[deprecated name of total]', repeat=NotImplemented)[source]

Unused files generator.

Parameters:
  • total (int) – Maximum number of pages to retrieve in total
  • site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.UnwatchedPagesPageGenerator(total=100, site=None, number='[deprecated name of total]', repeat=NotImplemented)[source]

Unwatched page generator.

Parameters:
  • total (int) – Maximum number of pages to retrieve in total
  • site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.UserContributionsGenerator(username, namespaces=None, site=None, step=None, total=None, number='[deprecated name of total]')[source]

Yield unique pages edited by user:username.

Parameters:
  • step (int) – Maximum number of pages to retrieve per API query
  • total (int) – Maximum number of pages to retrieve in total
  • namespaces (list of int) – list of namespace numbers to fetch contribs from
  • site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.WantedPagesPageGenerator(total=100, site=None)[source]

Wanted page generator.

Parameters:
  • total (int) – Maximum number of pages to retrieve in total
  • site (pywikibot.site.BaseSite) – Site for generator results.
pywikibot.pagegenerators.WikibaseItemFilterPageGenerator(generator, has_item=True, show_filtered=False)[source]

A wrapper generator used to include or exclude pages depending on whether they have a Wikibase item.

Parameters:
  • gen (generator) – Generator to wrap.
  • has_item (bool) – Exclude pages without an item if True, or only include pages without an item if False
  • show_filtered (bool) – Output a message for each page not yielded
Returns:

Wrapped generator

Return type:

generator

pywikibot.pagegenerators.WikibaseItemGenerator(gen)[source]

A wrapper generator used to yield Wikibase items of another generator.

Parameters:gen (generator) – Generator to wrap.
Returns:Wrapped generator
Return type:generator
pywikibot.pagegenerators.WikidataItemGenerator(gen)

A wrapper generator used to yield Wikibase items of another generator.

Parameters:gen (generator) – Generator to wrap.
Returns:Wrapped generator
Return type:generator
pywikibot.pagegenerators.WikidataQueryPageGenerator(query, site=None)[source]

Generate pages that result from the given WikidataQuery.

pywikibot.pagegenerators.WithoutInterwikiPageGenerator(total=100, site=None, number='[deprecated name of total]', repeat=NotImplemented)[source]

Page lacking interwikis generator.

Parameters:
  • total – Maximum number of pages to retrieve in total
  • site (pywikibot.site.BaseSite) – Site for generator results.
class pywikibot.pagegenerators.YahooSearchPageGenerator(query=None, count=100, site=None)[source]

Bases: object

Page generator using Yahoo! search results.

To use this generator, you need to install the package ‘pYsearch’. https://pypi.python.org/pypi/pYsearch

queryYahoo(query)[source]

Perform a query using python package ‘pYsearch’.

pywikibot.pagegenerators.YearPageGenerator(start=1, end=2050, site=None)[source]

Year page generator.

Parameters:site (pywikibot.site.BaseSite) – Site for generator results.

plural Module

Module containing plural rules of various languages.

site Module

Objects representing MediaWiki sites (wikis).

This module also includes functions to load families, which are groups of wikis on the same topic in different languages.

class pywikibot.site.APISite(code, fam=None, user=None, sysop=None)[source]

Bases: pywikibot.site.BaseSite

API interface to MediaWiki site.

Do not use directly; use pywikibot.Site function.
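
A minimal sketch of the supported entry point::

import pywikibot

# Hedged sketch: obtain an APISite via the pywikibot.Site factory.
site = pywikibot.Site('en', 'wikipedia')
print(site.version())  # MediaWiki version string, e.g. '1.24wmf22' (illustrative)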

class OnErrorExc

Bases: tuple

OnErrorExc(exception, on_new_page)

_asdict()

Return a new OrderedDict which maps field names to their values.

_fields = ('exception', 'on_new_page')
classmethod _make(iterable, new=tuple.__new__, len=len)

Make a new OnErrorExc object from a sequence or iterable

_replace(_self, **kwds)

Return a new OnErrorExc object replacing specified fields with new values

_source = "from builtins import property as _property, tuple as _tuple\nfrom operator import itemgetter as _itemgetter\nfrom collections import OrderedDict\n\nclass OnErrorExc(tuple):\n 'OnErrorExc(exception, on_new_page)'\n\n __slots__ = ()\n\n _fields = ('exception', 'on_new_page')\n\n def __new__(_cls, exception, on_new_page):\n 'Create new instance of OnErrorExc(exception, on_new_page)'\n return _tuple.__new__(_cls, (exception, on_new_page))\n\n @classmethod\n def _make(cls, iterable, new=tuple.__new__, len=len):\n 'Make a new OnErrorExc object from a sequence or iterable'\n result = new(cls, iterable)\n if len(result) != 2:\n raise TypeError('Expected 2 arguments, got %d' % len(result))\n return result\n\n def _replace(_self, **kwds):\n 'Return a new OnErrorExc object replacing specified fields with new values'\n result = _self._make(map(kwds.pop, ('exception', 'on_new_page'), _self))\n if kwds:\n raise ValueError('Got unexpected field names: %r' % list(kwds))\n return result\n\n def __repr__(self):\n 'Return a nicely formatted representation string'\n return self.__class__.__name__ + '(exception=%r, on_new_page=%r)' % self\n\n @property\n def __dict__(self):\n 'A new OrderedDict mapping field names to their values'\n return OrderedDict(zip(self._fields, self))\n\n def _asdict(self):\n 'Return a new OrderedDict which maps field names to their values.'\n return self.__dict__\n\n def __getnewargs__(self):\n 'Return self as a plain tuple. Used by copy and pickle.'\n return tuple(self)\n\n def __getstate__(self):\n 'Exclude the OrderedDict from pickling'\n return None\n\n exception = _property(_itemgetter(0), doc='Alias for field number 0')\n\n on_new_page = _property(_itemgetter(1), doc='Alias for field number 1')\n\n"
exception

Alias for field number 0

on_new_page

Alias for field number 1

APISite.TOKENS_0 = {'import', 'move', 'watch', 'protect', 'delete', 'email', 'edit', 'block', 'unblock'}
APISite.TOKENS_1 = {'centralauth', 'import', 'move', 'watch', 'setglobalaccountstatus', 'protect', 'delete', 'email', 'edit', 'options', 'deleteglobalaccount', 'block', 'patrol', 'unblock'}
APISite.TOKENS_2 = {'watch', 'rollback', 'setglobalaccountstatus', 'userrights', 'csrf', 'deleteglobalaccount', 'patrol'}
APISite._build_namespaces()[source]
APISite._dl_errors = {'cantdelete': 'Could not delete [[%(title)s]]. Maybe it was deleted already.', 'writeapidenied': 'User %(user)s not allowed to edit through the API', 'permissiondenied': 'User %(user)s not authorized to delete pages on %(site)s wiki.', 'noapiwrite': 'API editing not enabled on %(site)s wiki'}
APISite._ep_errors = {'editconflict': <class 'pywikibot.exceptions.EditConflict'>, 'articleexists': <class 'pywikibot.exceptions.PageCreatedConflict'>, 'pagedeleted': <class 'pywikibot.exceptions.PageDeletedConflict'>, 'noimageredirect-anon': 'Bot is not logged in, and anon users are not authorized to create image redirects on %(site)s wiki', 'contenttoobig': '%(info)s', 'filtered': '%(info)s', 'missingtitle': <class 'pywikibot.exceptions.NoCreateError'>, 'cantcreate': 'User %(user)s not authorized to create new pages on %(site)s wiki', 'cascadeprotected': <class 'pywikibot.exceptions.CascadeLockedPage'>, 'noapiwrite': 'API editing not enabled on %(site)s wiki', 'protectedtitle': <class 'pywikibot.exceptions.LockedNoPage'>, 'protectedpage': <class 'pywikibot.exceptions.LockedPage'>, 'noimageredirect': 'User %(user)s not authorized to create image redirects on %(site)s wiki', 'cantcreate-anon': 'Bot is not logged in, and anon users are not authorized to create new pages on %(site)s wiki', 'noedit-anon': 'Bot is not logged in, and anon users are not authorized to edit on %(site)s wiki', 'writeapidenied': 'User %(user)s is not authorized to edit on %(site)s wiki', 'noedit': 'User %(user)s not authorized to edit pages on %(site)s wiki'}
APISite._generator(gen_class, type_arg=None, namespaces=None, step=None, total=None, **args)[source]

Convenience method that returns an API generator.

All keyword args not listed below are passed to the generator’s constructor unchanged.

Parameters:
  • gen_class – the type of generator to construct (must be a subclass of pywikibot.data.api.QueryGenerator)
  • type_arg (str) – query type argument to be passed to generator’s constructor unchanged (not all types require this)
  • namespaces (int, or list of ints) – if not None, limit the query to namespaces in this list
  • step (int) – if not None, limit each API call to this many items
  • total (int) – if not None, limit the generator to yielding this many items in total
APISite._mv_errors = {'immobilenamespace': 'Pages in %(oldnamespace)s namespace cannot be moved on %(site)s wiki', 'protectedtitle': OnErrorExc(exception=<class 'pywikibot.exceptions.LockedNoPage'>, on_new_page=True), 'protectedpage': OnErrorExc(exception=<class 'pywikibot.exceptions.LockedPage'>, on_new_page=None), 'articleexists': OnErrorExc(exception=<class 'pywikibot.exceptions.ArticleExistsConflict'>, on_new_page=True), 'filetypemismatch': '[[%(newtitle)s]] file extension does not match content of [[%(oldtitle)s]]', 'cantmove': 'User %(user)s is not authorized to move pages on %(site)s wiki', 'cantmove-anon': 'Bot is not logged in, and anon users are not authorized to move pages on\n%(site)s wiki', 'nonfilenamespace': 'Cannot move a file to %(newnamespace)s namespace on %(site)s wiki', 'nosuppress': 'User %(user)s is not authorized to move pages without creating redirects', 'writeapidenied': 'User %(user)s is not authorized to edit on %(site)s wiki', 'noapiwrite': 'API editing not enabled on %(site)s wiki'}
APISite._patrol_errors = {'notpatrollable': "The revision %(revid)s can't be patrolled as it's too old.", 'nosuchrcid': 'There is no change with rcid %(rcid)s', 'patroldisabled': 'Patrolling is disabled on %(site)s wiki', 'noautopatrol': "User %(user)s has no permission to patrol its own changes, 'autopatrol' is needed", 'nosuchrevid': 'There is no change with revid %(revid)s'}
APISite._protect_errors = {'cantedit': "User %(user) can't protect this page because user %(user) can't edit it.", 'protect-invalidlevel': 'Invalid protection level', 'writeapidenied': 'User %(user)s not allowed to edit through the API', 'permissiondenied': 'User %(user)s not authorized to protect pages on %(site)s wiki.', 'noapiwrite': 'API editing not enabled on %(site)s wiki'}
APISite._rb_errors = {'alreadyrolled': 'Page [[%(title)s]] already rolled back; action aborted.', 'writeapidenied': 'User %(user)s not allowed to edit through the API', 'noapiwrite': 'API editing not enabled on %(site)s wiki'}
APISite._update_page(page, query, method_name)[source]
APISite.allcategories(start='!', prefix='', step=None, total=None, reverse=False, content=False)[source]

Iterate categories used (which need not have a Category page).

Iterator yields Category objects. Note that, in practice, links that were found on pages that have been deleted may not have been removed from the database table, so this method can return false positives.

Parameters:
  • start – Start at this category title (category need not exist).
  • prefix – Only yield categories starting with this string.
  • reverse – if True, iterate in reverse Unicode lexicographic order (default: iterate in forward order)
  • content – if True, load the current content of each iterated page (default False); note that this means the contents of the category description page, not the pages that are members of the category
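
Example (a minimal sketch; assumes a working user-config.py, and the prefix is arbitrary):

import pywikibot

site = pywikibot.Site('en', 'wikipedia')
# Yield up to five Category objects whose titles start with 'Ast'.
for cat in site.allcategories(prefix='Ast', total=5):
    print(cat.title())
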
APISite.allimages(start='!', prefix='', minsize=None, maxsize=None, reverse=False, sha1=None, sha1base36=None, step=None, total=None, content=False)[source]

Iterate all images, ordered by image title.

Yields FilePages, but these pages need not exist on the wiki.

Parameters:
  • start – start at this title (name need not exist)
  • prefix – only iterate titles starting with this substring
  • minsize – only iterate images of at least this many bytes
  • maxsize – only iterate images of no more than this many bytes
  • reverse – if True, iterate in reverse lexicographic order
  • sha1 – only iterate images (it is theoretically possible there could be more than one) with this SHA1 hash
  • sha1base36 – same as sha1 but in base 36
  • content – if True, load the current content of each iterated page (default False); note that this means the content of the image description page, not the image itself

APISite.alllinks(start='!', prefix='', namespace=0, unique=False, fromids=False, step=None, total=None)[source]

Iterate all links to pages (which need not exist) in one namespace.

Note that, in practice, links that were found on pages that have been deleted may not have been removed from the links table, so this method can return false positives.

Parameters:
  • start – Start at this title (page need not exist).
  • prefix – Only yield pages starting with this string.
  • namespace – Iterate pages from this (single) namespace (default: 0)
  • unique – If True, only iterate each link title once (default: iterate once for each linking page)
  • fromids – if True, include the pageid of the page containing each link (default: False) as the ‘_fromid’ attribute of the Page; cannot be combined with unique
APISite.allpages(start='!', prefix='', namespace=0, filterredir=None, filterlanglinks=None, minsize=None, maxsize=None, protect_type=None, protect_level=None, reverse=False, includeredirects='[deprecated name of filterredir]', step=None, total=None, content=False, throttle=NotImplemented, limit='[deprecated name of total]')[source]

Iterate pages in a single namespace.

Note: parameters includeredirects and throttle are deprecated and included only for backwards compatibility.

Parameters:
  • start – Start at this title (page need not exist).
  • prefix – Only yield pages starting with this string.
  • namespace – Iterate pages from this (single) namespace (default: 0)
  • filterredir – if True, only yield redirects; if False (and not None), only yield non-redirects (default: yield both)
  • filterlanglinks – if True, only yield pages with language links; if False (and not None), only yield pages without language links (default: yield both)
  • minsize – if present, only yield pages at least this many bytes in size
  • maxsize – if present, only yield pages at most this many bytes in size
  • protect_type (str) – only yield pages that have a protection of the specified type
  • protect_level – only yield pages that have protection at this level; can only be used if protect_type is specified
  • reverse – if True, iterate in reverse Unicode lexicographic order (default: iterate in forward order)
  • includeredirects – DEPRECATED, use filterredir instead
  • content – if True, load the current content of each iterated page (default False)
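
Example (a minimal sketch iterating main-namespace non-redirects; assumes a working user-config.py):

import pywikibot

site = pywikibot.Site('en', 'wikipedia')
# filterredir=False skips redirects; total caps the overall yield.
for page in site.allpages(namespace=0, filterredir=False, total=10):
    print(page.title())
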
APISite.allusers(start='!', prefix='', group=None, step=None, total=None)[source]

Iterate registered users, ordered by username.

Iterated values are dicts containing ‘name’, ‘editcount’, ‘registration’, and (sometimes) ‘groups’ keys. ‘groups’ will be present only if the user is a member of at least 1 group, and will be a list of unicodes; all the other values are unicodes and should always be present.

Parameters:
  • start – start at this username (name need not exist)
  • prefix – only iterate usernames starting with this substring
  • group (str) – only iterate users that are members of this group
APISite.ancientpages(step=None, total=None, number=NotImplemented, repeat=NotImplemented)[source]

Yield Pages and datestamps from Special:Ancientpages.

Parameters:
  • step – request batch size
  • total – number of pages to return
APISite.assert_valid_iter_params(msg_prefix, start, end, reverse)[source]

Validate iterating API parameters.

APISite.blocks(starttime=None, endtime=None, reverse=False, blockids=None, users=None, step=None, total=None)[source]

Iterate all current blocks, in order of creation.

Note that logevents only logs user blocks, while this method iterates all blocks including IP ranges. The iterator yields dicts containing keys corresponding to the block properties (see https://www.mediawiki.org/wiki/API:Query_-_Lists for documentation).

Parameters:
  • starttime – start iterating at this Timestamp
  • endtime – stop iterating at this Timestamp
  • reverse – if True, iterate oldest blocks first (default: newest)
  • blockids – only iterate blocks with these id numbers
  • users – only iterate blocks affecting these usernames or IPs
APISite.blockuser(user, expiry, reason, anononly=True, nocreate=True, autoblock=True, noemail=False, reblock=False)[source]
APISite.botusers(step=None, total=None)[source]

Iterate bot users.

Iterated values are dicts containing ‘name’, ‘userid’, ‘editcount’, ‘registration’, and ‘groups’ keys. ‘groups’ will be present only if the user is a member of at least 1 group, and will be a list of unicodes; all the other values are unicodes and should always be present.

APISite.broken_redirects(step=None, total=None)[source]

Yield Pages with broken redirects from Special:BrokenRedirects.

Parameters:
  • step – request batch size
  • total – number of pages to return
APISite.case()[source]

Return this site’s capitalization rule.

APISite.categories(number=10, repeat=False)[source]

DEPRECATED.

APISite.categoryinfo(category)[source]
APISite.categorymembers(category, namespaces=None, sortby='', reverse=False, starttime=None, endtime=None, startsort=None, endsort=None, step=None, total=None, content=False)[source]

Iterate members of specified category.

Parameters:
  • category – The Category to iterate.
  • namespaces (list of ints) – If present, only return category members from these namespaces. For example, use namespaces=[14] to yield subcategories, use namespaces=[6] to yield image files, etc. Note, however, that the iterated values are always Page objects, even if in the Category or Image namespace.
  • sortby (str) – determines the order in which results are generated, valid values are “sortkey” (default, results ordered by category sort key) or “timestamp” (results ordered by time page was added to the category)
  • reverse – if True, generate results in reverse order (default False)
  • starttime (pywikibot.Timestamp) – if provided, only generate pages added after this time; not valid unless sortby=”timestamp”
  • endtime (pywikibot.Timestamp) – if provided, only generate pages added before this time; not valid unless sortby=”timestamp”
  • startsort (str) – if provided, only generate pages >= this title lexically; not valid if sortby=”timestamp”
  • endsort (str) – if provided, only generate pages <= this title lexically; not valid if sortby=”timestamp”
  • content – if True, load the current content of each iterated page (default False)
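
Example (a minimal sketch; the category title is hypothetical):

import pywikibot

site = pywikibot.Site('en', 'wikipedia')
cat = pywikibot.Category(site, 'Category:Physics')  # hypothetical category
# sortby='timestamp' with reverse=True yields the newest additions first.
for member in site.categorymembers(cat, sortby='timestamp', reverse=True, total=5):
    print(member.title())
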
APISite.checkBlocks(sysop=False)[source]

Check if the user is blocked, and raise an exception if so.

APISite.data_repository()[source]

Return Site object for data repository e.g. Wikidata.

APISite.dbName()[source]

Return this site’s internal id.

APISite.deadendpages(step=None, total=None, number=NotImplemented, repeat=NotImplemented)[source]

Yield Page objects retrieved from Special:Deadendpages.

Parameters:
  • step – request batch size
  • total – number of pages to return
APISite.deletedrevs(page, start=None, end=None, reverse=None, get_text=False, step=None, total=None)[source]

Iterate deleted revisions.

Each value returned by the iterator will be a dict containing the ‘title’ and ‘ns’ keys for a particular Page and a ‘revisions’ key whose value is a list of revisions in the same format as recentchanges (plus a ‘content’ element if requested). If get_text is True, the top-level dict will contain a ‘token’ key as well.

Parameters:
  • page – The page to check for deleted revisions
  • start – Iterate revisions starting at this Timestamp
  • end – Iterate revisions ending at this Timestamp
  • reverse – Iterate oldest revisions first (default: newest)
  • get_text – If True, retrieve the content of each revision and an undelete token
APISite.deletepage(page, summary)[source]

Delete page from the wiki. Requires appropriate privilege level.

Parameters:
  • page – Page to be deleted.
  • summary – Edit summary (required!).
APISite.double_redirects(step=None, total=None)[source]

Yield Pages with double redirects from Special:DoubleRedirects.

Parameters:
  • step – request batch size
  • total – number of pages to return
APISite.editpage(page, summary, minor=True, notminor=False, bot=True, recreate=True, createonly=False, nocreate=False, watch=None)[source]

Submit an edited Page object to be saved to the wiki.

Parameters:
  • page – The Page to be saved; its .text property will be used as the new text to be saved to the wiki
  • summary – the edit summary (required!)
  • minor – if True (default), mark edit as minor
  • notminor – if True, override account preferences to mark edit as non-minor
  • recreate – if True (default), create new page even if this title has previously been deleted
  • createonly – if True, raise an error if this title already exists on the wiki
  • nocreate – if True, raise an error if the page does not exist
  • watch – Specify how the watchlist is affected by this edit; set to one of “watch” (add the page to the watchlist), “unwatch” (remove the page from the watchlist), “preferences” (use the preference settings; default), or “nochange” (don’t change the watchlist)
  • bot – if True, mark edit with bot flag
Returns:

True if edit succeeded, False if it failed
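
Example (a minimal sketch; the page title is hypothetical, a logged-in account with edit rights is required, and Page.save() is the usual higher-level wrapper around this call):

import pywikibot

site = pywikibot.Site('test', 'wikipedia')
page = pywikibot.Page(site, 'Sandbox')  # hypothetical target page
page.text += '\n\nTest edit.'
# Low-level save; returns True if the edit succeeded.
site.editpage(page, summary='Testing editpage()', minor=True)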

APISite.expand_text(text, title=None, includecomments=None)[source]

Parse the given text for preprocessing and rendering.

e.g. expand templates and strip comments if the includecomments parameter is not True. Keeps text inside <nowiki></nowiki> tags unchanged, etc. Can be used to parse magic parser words like {{CURRENTTIMESTAMP}}.

Parameters:
  • text (unicode) – text to be expanded
  • title (unicode) – page title without section
  • includecomments (bool) – if True do not strip comments
Returns:

unicode
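
Example (a minimal sketch resolving a magic word; assumes a configured site):

import pywikibot

site = pywikibot.Site('en', 'wikipedia')
# Expands to the current server timestamp, e.g. '20150101120000'.
print(site.expand_text('{{CURRENTTIMESTAMP}}'))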

APISite.exturlusage(url, protocol='http', namespaces=None, step=None, total=None, content=False)[source]

Iterate Pages that contain links to the given URL.

Parameters:
  • url – The URL to search for (without the protocol prefix); this may include a ‘*’ as a wildcard, only at the start of the hostname
  • protocol – The protocol prefix (default: “http”)
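
Example (a minimal sketch; the domain is hypothetical, and the leading ‘*.’ wildcard matches subdomains):

import pywikibot

site = pywikibot.Site('en', 'wikipedia')
for page in site.exturlusage('*.example.org', protocol='http', total=5):
    print(page.title())
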
APISite.forceLogin(*a, **kw)
static APISite.fromDBName(dbname)[source]
APISite.getFilesFromAnHash(hash_found=None)[source]

Return all images that have the same hash.

Useful to find duplicates or nowcommons.

NOTE: it also returns the image itself; if you don’t want it, just filter the returned list.

NOTE 2: it returns the image title WITHOUT the image namespace.

APISite.getImagesFromAnHash(hash_found=None)[source]
APISite.get_searched_namespaces(force=False)[source]

Retrieve the default searched namespaces for the user.

If no user is logged in, it returns the namespaces used by default. Otherwise it returns the user preferences. It caches the last result and returns it, if the username or login status hasn’t changed.

Parameters:force – Whether the cache should be discarded.
Returns:The namespaces which are searched by default.
Return type:set of Namespace
APISite.get_tokens(types, all=False)[source]

Preload one or multiple tokens.

For all MediaWiki versions prior to 1.20, only one token can be retrieved at a time. MediaWiki versions since 1.24wmfXXX use a new token system which reduced the number of token types available; most of them were merged into the ‘csrf’ token. If the token type in the parameter is not known, it will default to the ‘csrf’ token. The other token types available are:

- deleteglobalaccount
- patrol
- rollback
- setglobalaccountstatus
- userrights
- watch
Parameters:
  • types (iterable) – the types of token (e.g., “edit”, “move”, “delete”); see API documentation for full list of types
  • all (bool) – if True, load all available tokens; if None, load all tokens only if it can be done in one request.

Returns:a dict with retrieved valid tokens.
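
Example (a minimal sketch; requires login, and on MediaWiki 1.24+ most requested types resolve to the ‘csrf’ token):

import pywikibot

site = pywikibot.Site('en', 'wikipedia')
tokens = site.get_tokens(['edit', 'move'])
print(tokens)  # dict mapping valid token types to token strings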

APISite.getcategoryinfo(category)[source]

Retrieve data on contents of category.

APISite.getcurrenttime()[source]

Return a Timestamp object representing the current server time.

For wikis with a version newer than 1.16 it uses the ‘time’ property of the siteinfo ‘general’ query, forcing a reload before returning the time. For older wikis it requests expansion of the text ‘{{CURRENTTIMESTAMP}}’.

Returns:the current server time
Return type:Timestamp
APISite.getcurrenttimestamp()[source]

Return the server time as a MediaWiki timestamp string.

It calls getcurrenttime first so it queries the server to get the current server time.

Returns:the server time
Return type:str (as ‘yyyymmddhhmmss’)
APISite.getglobaluserinfo()[source]

Retrieve globaluserinfo from site and cache it.

self._globaluserinfo will be a dict with the following keys and values:

- id: user id (numeric str)
- home: dbname of home wiki
- registration: registration date as Timestamp
- groups: list of groups (could be empty)
- rights: list of rights (could be empty)
- editcount: global editcount
APISite.getmagicwords(word)[source]

Return list of localized “word” magic words for the site.

APISite.getredirtarget(page)[source]

Return Page object for the redirect target of page.

APISite.getuserinfo()[source]

Retrieve userinfo from site and store in _userinfo attribute.

self._userinfo will be a dict with the following keys and values:

- id: user id (numeric str)
- name: username (if user is logged in)
- anon: present if user is not logged in
- groups: list of groups (could be empty)
- rights: list of rights (could be empty)
- message: present if user has a new message on talk page
- blockinfo: present if user is blocked (dict)
APISite.globaluserinfo

Retrieve globaluserinfo from site and cache it.

self._globaluserinfo will be a dict with the following keys and values:

- id: user id (numeric str)
- home: dbname of home wiki
- registration: registration date as Timestamp
- groups: list of groups (could be empty)
- rights: list of rights (could be empty)
- editcount: global editcount
APISite.hasExtension(name, unknown=None)[source]

Determine whether extension name is loaded.

Use has_extension instead!

Parameters:
  • name (str) – The extension to check for, case-insensitive
  • unknown – Old parameter which shouldn’t be used anymore.
Returns:

If the extension is loaded

Return type:

bool

APISite.has_all_mediawiki_messages(keys)[source]

Confirm that the site defines a set of MediaWiki messages.

Parameters:keys (set of str) – names of MediaWiki messages
Returns:bool
APISite.has_data_repository

Return True if site has a shared data repository like Wikidata.

APISite.has_extension(name)[source]

Determine whether extension name is loaded.

Parameters:name (str) – The extension to check for, case-insensitive
Returns:If the extension is loaded
Return type:bool
APISite.has_group(group, sysop=False)[source]

Return true if and only if the user is a member of specified group.

Possible values of ‘group’ may vary depending on wiki settings, but will usually include bot.

APISite.has_image_repository

Return True if site has a shared image repository like Commons.

APISite.has_mediawiki_message(key)[source]

Determine if the site defines a MediaWiki message.

Parameters:key (str) – name of MediaWiki message
Returns:bool
APISite.has_right(right, sysop=False)[source]

Return true if and only if the user has a specific right.

Possible values of ‘right’ may vary depending on wiki settings, but will usually include:

* Actions: edit, move, delete, protect, upload
* User levels: autoconfirmed, sysop, bot
APISite.has_transcluded_data

Return True if site has a shared data repository like Wikidata.

APISite.image_repository()[source]

Return Site object for image repository e.g. commons.

APISite.imageusage(image, namespaces=None, filterredir=None, step=None, total=None, content=False)[source]

Iterate Pages that contain links to the given FilePage.

Parameters:
  • image (FilePage) – the image to search for (FilePage need not exist on the wiki)
  • filterredir – if True, only yield redirects; if False (and not None), only yield non-redirects (default: yield both)
  • content – if True, load the current content of each iterated page (default False)
APISite.isAllowed(right, sysop=False)[source]

DEPRECATED.

APISite.isBlocked(sysop=False)[source]

DEPRECATED.

APISite.isBot(username)[source]

Return True if username is a bot user.

APISite.is_blocked(sysop=False)[source]

Return true if and only if user is blocked.

Parameters:sysop – If true, log in to sysop account (if available)
APISite.is_data_repository()[source]

Return True if Site object is the data repository.

APISite.is_image_repository()[source]

Return True if Site object is the image repository.

APISite.is_uploaddisabled()[source]

Return True if upload is disabled on site.

If not called directly, it is cached by the first attempted upload action.

APISite.lang

Return the code for the language of this Site.

APISite.language()[source]

Return the code for the language of this Site.

APISite.linksearch(siteurl, limit=None)[source]

Backwards-compatible interface to exturlusage().

APISite.list_to_text(args)[source]

Convert a list of strings into human-readable text.

The MediaWiki message ‘and’ is used as separator between the last two arguments. If present, other arguments are joined using a comma.

Parameters:args (iterable) – text to be expanded
Returns:unicode
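
For example, on an English-language site (where the separator messages are ‘, ’ and ‘ and ’) the result is roughly:

>>> site.list_to_text(['fr', 'de', 'nl'])
'fr, de and nl'
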
APISite.live_version(force=False)[source]

Return the ‘real’ version number found on [[Special:Version]].

By default the version number is cached for one day.

Parameters:force (bool) – If True, always read the version from the server and never from the cache.
Returns:A tuple containing the major and minor version numbers and any text after that. If an error occurred, (0, 0, 0) is returned.
Return type:int, int, str
APISite.loadcoordinfo(page)[source]

Load [[mw:Extension:GeoData]] info.

APISite.loadflowinfo(page)[source]

Load Flow-related information about a given page.

FIXME: Assumes that the Flow extension is installed.

APISite.loadimageinfo(page, history=False)[source]

Load image info from api and save in page attributes.

Parameters:history – if true, return the image’s version history
APISite.loadpageinfo(page, preload=False)[source]

Load page info from api and store in page attributes.

APISite.loadpageprops(page)[source]
APISite.loadrevisions(page, getText=False, revids=None, startid=None, endid=None, starttime=None, endtime=None, rvdir=None, user=None, excludeuser=None, section=None, sysop=False, step=None, total=None, rollback=False)[source]

Retrieve and store revision information.

By default, retrieves the last (current) revision of the page, unless any of the optional parameters revids, startid, endid, starttime, endtime, rvdir, user, excludeuser, or total are specified. Unless noted below, all parameters not specified default to False.

If rvdir is False or not specified, startid must be greater than endid if both are specified; likewise, starttime must be greater than endtime. If rvdir is True, these relationships are reversed.

Parameters:
  • page – retrieve revisions of this Page (required unless revids is specified)
  • getText – if True, retrieve the wiki-text of each revision; otherwise, only retrieve the revision metadata (default)
  • section (int) – if specified, retrieve only this section of the text (getText must be True); section must be given by number (top of the article is section 0), not name
  • revids (an int, a str or a list of ints or strings) – retrieve only the specified revision ids (raises an Exception if any of the revids does not correspond to the page)
  • startid – retrieve revisions starting with this revid
  • endid – stop upon retrieving this revid
  • starttime – retrieve revisions starting at this Timestamp
  • endtime – stop upon reaching this Timestamp
  • rvdir – if false, retrieve newest revisions first (default); if true, retrieve earliest first
  • user – retrieve only revisions authored by this user
  • excludeuser – retrieve all revisions not authored by this user
  • sysop – if True, switch to sysop account (if available) to retrieve this page
APISite.logevents(logtype=None, user=None, page=None, start=None, end=None, reverse=False, step=None, total=None)[source]

Iterate all log entries.

Parameters:
  • logtype – only iterate entries of this type (see wiki documentation for available types, which will include “block”, “protect”, “rights”, “delete”, “upload”, “move”, “import”, “patrol”, “merge”)
  • user – only iterate entries that match this user name
  • page – only iterate entries affecting this page
  • start – only iterate entries from and after this Timestamp
  • end – only iterate entries up to and through this Timestamp
  • reverse – if True, iterate oldest entries first (default: newest)
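
Example (a minimal sketch iterating recent deletion log entries; assumes a configured site):

import pywikibot

site = pywikibot.Site('en', 'wikipedia')
for entry in site.logevents(logtype='delete', total=5):
    print(entry.timestamp(), entry.user())
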
APISite.loggedInAs(sysop=False)[source]

Return the current username if logged in, otherwise return None.

DEPRECATED (use .user() method instead)

Parameters:sysop (bool) – if True, test if user is logged in as the sysop user instead of the normal user.
Returns:the current username, or None
APISite.logged_in(sysop=False)[source]

Verify the bot is logged into the site as the expected user.

The expected usernames are those provided as either the user or sysop parameter at instantiation.

Parameters:sysop (bool) – if True, test if user is logged in as the sysop user instead of the normal user.
Returns:bool
APISite.login(sysop=False)[source]

Log the user in if not already logged in.

APISite.logout()[source]

Logout of the site and load details for the logged out user.

Also logs out of the global account if linked to the user.

APISite.lonelypages(step=None, total=None, number=NotImplemented, repeat=NotImplemented)[source]

Yield Pages retrieved from Special:Lonelypages.

Parameters:
  • step – request batch size
  • total – number of pages to return
APISite.longpages(step=None, total=None, number=NotImplemented, repeat=NotImplemented)[source]

Yield Pages and lengths from Special:Longpages.

Yields a tuple of Page object, length(int).

Parameters:
  • step – request batch size
  • total – number of pages to return
APISite.mediawiki_message(key)[source]

Fetch the text for a MediaWiki message.

Parameters:key (str) – name of MediaWiki message
Returns:unicode
APISite.mediawiki_messages(keys)[source]

Fetch the text of a set of MediaWiki messages.

If keys is ‘*’ or [‘*’], all messages will be fetched. The returned dict uses each key to store the associated message.

Parameters:keys (set of str, ‘*’ or [‘*’]) – MediaWiki messages to fetch
Returns:dict
APISite.messages(sysop=False)[source]

Return true if the user has new messages, and false otherwise.

APISite.months_names

Obtain month names from the site messages.

The list is zero-indexed, ordered by calendar month, and should be in the original site language.

Returns:list of tuples (month name, abbreviation)
APISite.movepage(page, newtitle, summary, movetalk=True, noredirect=False)[source]

Move a Page to a new title.

Parameters:
  • page – the Page to be moved (must exist)
  • newtitle (unicode) – the new title for the Page
  • summary – edit summary (required!)
  • movetalk – if True (default), also move the talk page if possible
  • noredirect – if True, suppress creation of a redirect from the old title to the new one
Returns:

Page object with the new title

APISite.namespace(num, all=False)[source]

Return string containing local name of namespace ‘num’.

If optional argument ‘all’ is true, return a list of all recognized values for this namespace.

APISite.namespaces

Return dict of valid namespaces on this wiki.

APISite.newfiles(user=None, start=None, end=None, reverse=False, step=None, total=None)[source]

Yield information about newly uploaded files.

Yields a tuple of FilePage, Timestamp, user(unicode), comment(unicode).

N.B. the API does not provide direct access to Special:Newimages, so this is derived from the “upload” log events instead.

APISite.newimages(number=NotImplemented, repeat=NotImplemented, *args, **kwargs)[source]
APISite.newpages(user=None, returndict=False, start=None, end=None, reverse=False, showBot=False, showRedirects=False, excludeuser=None, showPatrolled=None, namespaces=None, step=None, total=None, namespace='[deprecated name of namespaces]', number='[deprecated name of step]', repeat=NotImplemented, rc_show=NotImplemented, get_redirect=NotImplemented)[source]

Yield new articles (as Page objects) from recent changes.

Starts with the newest articles and yields no more than total articles.

The objects yielded are dependent on parameter returndict. When true, it yields a tuple composed of a Page object and a dict of attributes. When false, it yields a tuple composed of the Page object, timestamp (unicode), length (int), an empty unicode string, username or IP address (str), comment (unicode).

APISite.nice_get_address(title)[source]

Return shorter URL path to retrieve page titled ‘title’.

APISite.notifications(**kwargs)[source]

Yield Notification objects from the Echo extension.

APISite.notifications_mark_read(**kwargs)[source]

Mark selected notifications as read.

Returns:whether the action was successful
Return type:bool
APISite.page_can_be_edited(page)[source]

Determine if the page can be edited.

Return True if and only if:
  • page is unprotected, and bot has an account for this site, or
  • page is protected, and bot has a sysop account for this site.
Returns:bool
APISite.page_embeddedin(page, filterRedirects=None, namespaces=None, step=None, total=None, content=False)[source]

Iterate all pages that embedded the given page as a template.

Parameters:
  • page – The Page to get inclusions for.
  • filterRedirects – If True, only return redirects that embed the given page. If False, only return non-redirect links. If None, return both (no filtering).
  • namespaces – If present, only return links from the namespaces in this list.
  • content – if True, load the current content of each iterated page (default False)
APISite.page_exists(page)[source]

Return True if and only if page is an existing page on site.

APISite.page_extlinks(page, step=None, total=None)[source]

Iterate all external links on page, yielding URL strings.

APISite.page_isredirect(page)[source]

Return True if and only if page is a redirect.

APISite.page_restrictions(page)[source]

Return a dictionary reflecting page protections.

APISite.pagebacklinks(page, followRedirects=False, filterRedirects=None, namespaces=None, step=None, total=None, content=False)[source]

Iterate all pages that link to the given page.

Parameters:
  • page – The Page to get links to.
  • followRedirects – Also return links to redirects pointing to the given page.
  • filterRedirects – If True, only return redirects to the given page. If False, only return non-redirect links. If None, return both (no filtering).
  • namespaces – If present, only return links from the namespaces in this list.
  • step – Limit on number of pages to retrieve per API query.
  • total – Maximum number of pages to retrieve in total.
  • content – if True, load the current content of each iterated page (default False)
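
Example (a minimal sketch; the page title is hypothetical):

import pywikibot

site = pywikibot.Site('en', 'wikipedia')
page = pywikibot.Page(site, 'Python (programming language)')
# followRedirects=True also yields pages that link via a redirect.
for ref in site.pagebacklinks(page, followRedirects=True, total=10):
    print(ref.title())
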
APISite.pagecategories(page, step=None, total=None, content=False, withSortKey=NotImplemented)[source]

Iterate categories to which page belongs.

Parameters:content – if True, load the current content of each iterated page (default False); note that this means the contents of the category description page, not the pages contained in the category
APISite.pageimages(page, step=None, total=None, content=False)[source]

Iterate images used (not just linked) on the page.

Parameters:content – if True, load the current content of each iterated page (default False); note that this means the content of the image description page, not the image itself
APISite.pageinterwiki(page)[source]

Iterate all interlanguage links on page, yielding Link objects.

Parameters:include_obsolete – if true, yield even Link objects whose site is obsolete

APISite.pagelinks(page, namespaces=None, follow_redirects=False, total=None, step=None, content=False)[source]

Iterate internal wikilinks contained (or transcluded) on page.

Parameters:
  • namespaces (list of ints) – Only iterate pages in these namespaces (default: all)
  • follow_redirects – if True, yields the target of any redirects, rather than the redirect page
  • content – if True, load the current content of each iterated page (default False)
APISite.pagename2codes(default=NotImplemented)[source]

Return list of localized PAGENAMEE tags for the site.

APISite.pagenamecodes(default=NotImplemented)[source]

Return list of localized PAGENAME tags for the site.

APISite.pagereferences(page, followRedirects=False, filterRedirects=None, withTemplateInclusion=True, onlyTemplateInclusion=False, namespaces=None, step=None, total=None, content=False)[source]

Convenience method combining pagebacklinks and page_embeddedin.

APISite.pagetemplates(page, namespaces=None, step=None, total=None, content=False)[source]

Iterate templates transcluded (not just linked) on the page.

Parameters:content – if True, load the current content of each iterated page (default False)
APISite.patrol(rcid=None, revid=None, revision=None)[source]

Return a generator of patrolled pages.

Pages to be patrolled are identified by rcid, revid or revision. At least one of the parameters is mandatory. See https://www.mediawiki.org/wiki/API:Patrol.

Parameters:
  • rcid (iterable/iterator which returns a number or string which contains only digits; it also supports a string (as above) or int) – an int/string/iterable/iterator providing rcid of pages to be patrolled.
  • revid (iterable/iterator which returns a number or string which contains only digits; it also supports a string (as above) or int.) – an int/string/iterable/iterator providing revid of pages to be patrolled.
  • revision (iterable/iterator which returns a Revision object; it also supports a single Revision.) – a Revision/iterable/iterator providing Revision objects of pages to be patrolled.
Return type:

iterator of dict with ‘rcid’, ‘ns’ and ‘title’ of the patrolled page.

APISite.prefixindex(prefix, namespace=0, includeredirects=True)[source]

Yield all pages with a given prefix. Deprecated.

Use allpages() with the prefix= parameter instead of this method.

APISite.preloadpages(pagelist, groupsize=50, templates=False, langlinks=False)[source]

Return a generator to a list of preloaded pages.

Note that [at least in current implementation] pages may be iterated in a different order than in the underlying pagelist.

Parameters:
  • pagelist – an iterable that returns Page objects
  • groupsize (int) – how many Pages to query at a time
  • templates – preload list of templates in the pages
  • langlinks – preload list of language links found in the pages
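
Example (a minimal sketch; the titles are hypothetical and assumed to exist):

import pywikibot

site = pywikibot.Site('en', 'wikipedia')
pages = (pywikibot.Page(site, t) for t in ['Alpha', 'Beta', 'Gamma'])
# Content is fetched in batches of groupsize; .text is then already loaded.
for page in site.preloadpages(pages, groupsize=50):
    print(page.title(), len(page.text))
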
APISite.protect(page, protections, reason, expiry=None, summary='[deprecated name of reason]', **kwargs)[source]

(Un)protect a wiki page. Requires administrator status.

Parameters:
  • protections (dict) – A dict mapping type of protection to protection level of that type. Valid types of protection are ‘edit’, ‘move’, ‘create’, and ‘upload’. Valid protection levels (in MediaWiki 1.12) are ‘’ (equivalent to ‘none’), ‘autoconfirmed’, and ‘sysop’. If None is given, however, that protection will be skipped.
  • reason (basestring) – Reason for the action
  • expiry (pywikibot.Timestamp, string in GNU timestamp format (including ISO 8601).) – When the protection should expire. This expiry will be applied to all protections. If None, ‘infinite’, ‘indefinite’, ‘never’, or ‘’ is given, there is no expiry.
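
Example (a minimal sketch; requires a sysop account, and the title and reason are hypothetical):

import pywikibot

site = pywikibot.Site('en', 'wikipedia')
page = pywikibot.Page(site, 'Example')
# Restrict editing and moving to sysops for one week.
site.protect(page, {'edit': 'sysop', 'move': 'sysop'},
             reason='Persistent vandalism', expiry='1 week')
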
APISite.protection_levels()[source]

Return the protection levels available on this site.

Returns:protection levels available
Return type:set of unicode instances
See:Siteinfo._get_default()
APISite.protection_types()[source]

Return the protection types available on this site.

Returns:protection types available
Return type:set of unicode instances
See:Siteinfo._get_default()
APISite.purgepages(pages, **kwargs)[source]

Purge the server’s cache for one or multiple pages.

Parameters:pages – list of Page objects
Returns:True if API returned expected response; False otherwise
APISite.randompage(redirect=False)[source]

DEPRECATED.

Parameters:redirect – Return a random redirect page
Returns:pywikibot.Page
APISite.randompages(step=None, total=10, namespaces=None, redirects=False, content=False)[source]

Iterate a number of random pages.

Pages are listed in a fixed sequence, only the starting point is random.

Parameters:
  • total – the maximum number of pages to iterate (default: 10)
  • namespaces – only iterate pages in these namespaces.
  • redirects – if True, include only redirect pages in results (default: include only non-redirects)
  • content – if True, load the current content of each iterated page (default False)
APISite.randomredirectpage()[source]
APISite.recentchanges(start=None, end=None, reverse=False, namespaces=None, pagelist=None, changetype=None, showMinor=None, showBot=None, showAnon=None, showRedirects=None, showPatrolled=None, topOnly=False, step=None, total=None, user=None, excludeuser=None)[source]

Iterate recent changes.

Parameters:
  • start (pywikibot.Timestamp) – Timestamp to start listing from
  • end (pywikibot.Timestamp) – Timestamp to end listing at
  • reverse (bool) – if True, start with oldest changes (default: newest)
  • pagelist – iterate changes to pages in this list only
  • pagelist – list of Pages
  • changetype (basestring) – only iterate changes of this type (“edit” for edits to existing pages, “new” for new pages, “log” for log entries)
  • showMinor (bool or None) – if True, only list minor edits; if False, only list non-minor edits; if None, list all
  • showBot (bool or None) – if True, only list bot edits; if False, only list non-bot edits; if None, list all
  • showAnon (bool or None) – if True, only list anon edits; if False, only list non-anon edits; if None, list all
  • showRedirects (bool or None) – if True, only list edits to redirect pages; if False, only list edits to non-redirect pages; if None, list all
  • showPatrolled (bool or None) – if True, only list patrolled edits; if False, only list non-patrolled edits; if None, list all
  • topOnly (bool) – if True, only list changes that are the latest revision (default False)
  • user (basestring|list) – if not None, only list edits by this user or users
  • excludeuser (basestring|list) – if not None, exclude edits by this user or users
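
Example (a minimal sketch iterating the latest non-bot edits; assumes a configured site):

import pywikibot

site = pywikibot.Site('en', 'wikipedia')
for change in site.recentchanges(changetype='edit', showBot=False, total=5):
    print(change['timestamp'], change['title'], change['user'])
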
APISite.redirect(default=NotImplemented)[source]

Return the localized #REDIRECT keyword.

APISite.redirectRegex()[source]

Return a compiled regular expression matching on redirect pages.

Group 1 in the regex match object will be the target title.

APISite.redirectpages(step=None, total=None)[source]

Yield redirect pages from Special:ListRedirects.

Parameters:
  • step – request batch size
  • total – number of pages to return
APISite.rollbackpage(page, **kwargs)[source]

Roll back page to version before last user’s edits.

The keyword arguments are those supported by the rollback API.

As a precaution against errors, this method will fail unless the page history contains at least two revisions, and at least one that is not by the same user who made the last edit.

Parameters:page – the Page to be rolled back (must exist)
APISite.search(searchstring, namespaces=None, where='text', getredirects=False, step=None, total=None, content=False, number='[deprecated name of total]')[source]

Iterate Pages that contain the searchstring.

Note that this may include non-existing Pages if the wiki’s database table contains outdated entries.

Parameters:
  • searchstring (unicode) – the text to search for
  • where – Where to search; value must be “text” or “titles” (many wikis do not support title search)
  • namespaces (list of ints, or an empty list to signal all namespaces) – search only in these namespaces (defaults to 0)
  • getredirects – if True, include redirects in results. Since MediaWiki 1.23 redirects are always included.
  • content – if True, load the current content of each iterated page (default False)
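
Example (a minimal sketch searching article text in the main namespace; the query string is arbitrary):

import pywikibot

site = pywikibot.Site('en', 'wikipedia')
for page in site.search('neural network', namespaces=[0], where='text', total=10):
    print(page.title())
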
APISite.shortpages(step=None, total=None, number='[deprecated name of total]', repeat=NotImplemented)[source]

Yield Pages and lengths from Special:Shortpages.

Yields a tuple of Page object, length(int).

Parameters:
  • step – request batch size
  • total – number of pages to return
APISite.siteinfo

Site information dict.

APISite.token(page, tokentype)[source]

Return token retrieved from wiki to allow changing page content.

Parameters:
  • page – the Page for which a token should be retrieved
  • tokentype – the type of token (e.g., “edit”, “move”, “delete”); see API documentation for full list of types
APISite.unblockuser(user, reason)[source]
APISite.uncategorizedcategories(number=NotImplemented, repeat=NotImplemented, step=None, total=None)[source]

Yield Categories from Special:Uncategorizedcategories.

Parameters:
  • step – request batch size
  • total – number of pages to return
APISite.uncategorizedfiles(number=NotImplemented, repeat=NotImplemented, step=None, total=None)

Yield FilePages from Special:Uncategorizedimages.

Parameters:
  • step – request batch size
  • total – number of pages to return
APISite.uncategorizedimages(number=NotImplemented, repeat=NotImplemented, step=None, total=None)[source]

Yield FilePages from Special:Uncategorizedimages.

Parameters:
  • step – request batch size
  • total – number of pages to return
APISite.uncategorizedpages(number=NotImplemented, repeat=NotImplemented, step=None, total=None)[source]

Yield Pages from Special:Uncategorizedpages.

Parameters:
  • step – request batch size
  • total – number of pages to return
APISite.uncategorizedtemplates(number=NotImplemented, repeat=NotImplemented, step=None, total=None)[source]

Yield Pages from Special:Uncategorizedtemplates.

Parameters:
  • step – request batch size
  • total – number of pages to return
APISite.unusedcategories(step=None, total=None, number=NotImplemented, repeat=NotImplemented)[source]

Yield Category objects from Special:Unusedcategories.

Parameters:
  • step – request batch size
  • total – number of pages to return
APISite.unusedfiles(step=None, total=None)[source]

Yield FilePage objects from Special:Unusedimages.

Parameters:
  • step – request batch size
  • total – number of pages to return
APISite.unusedimages(number=NotImplemented, repeat=NotImplemented, *args, **kwargs)[source]
APISite.unwatchedpages(step=None, total=None, number=NotImplemented, repeat=NotImplemented)[source]

Yield Pages from Special:Unwatchedpages (requires Admin privileges).

Parameters:
  • step – request batch size
  • total – number of pages to return
APISite.upload(filepage, source_filename=None, source_url=None, comment=None, text=None, watch=False, ignore_warnings=False, chunk_size=0, imagepage='[deprecated name of filepage]')[source]

Upload a file to the wiki.

Either source_filename or source_url, but not both, must be provided.

Parameters:
  • filepage – a FilePage object from which the wiki-name of the file will be obtained.
  • source_filename – path to the file to be uploaded
  • source_url – URL of the file to be uploaded
  • comment – Edit summary; if this is not provided, then filepage.text will be used. An empty summary is not permitted. This may also serve as the initial page text (see below).
  • text – Initial page text; if this is not set, then filepage.text will be used, or comment.
  • watch – If true, add filepage to the bot user’s watchlist
  • ignore_warnings – if true, ignore API warnings and force upload (for example, to overwrite an existing file); default False
  • chunk_size (int) – The chunk size in bytes for chunked uploading (see https://www.mediawiki.org/wiki/API:Upload#Chunked_uploading). It will only upload in chunks if the MediaWiki version is 1.20 or higher and the chunk size is positive but lower than the file size.
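
Example (a minimal sketch; the file names are hypothetical, and upload rights are required on the target wiki):

import pywikibot

site = pywikibot.Site('commons', 'commons')
filepage = pywikibot.FilePage(site, 'File:Example-upload.png')
site.upload(filepage, source_filename='/tmp/example.png',
            comment='Initial upload', ignore_warnings=False)
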
APISite.usercontribs(user=None, userprefix=None, start=None, end=None, reverse=False, namespaces=None, showMinor=None, step=None, total=None, top_only=False)[source]

Iterate contributions by a particular user.

Iterated values are in the same format as recentchanges.

Parameters:
  • user – Iterate contributions by this user (name or IP)
  • userprefix – Iterate contributions by all users whose names or IPs start with this substring
  • start – Iterate contributions starting at this Timestamp
  • end – Iterate contributions ending at this Timestamp
  • reverse – Iterate oldest contributions first (default: newest)
  • showMinor – if True, iterate only minor edits; if False and not None, iterate only non-minor edits (default: iterate both)
  • top_only – if True, iterate only edits which are the latest revision
APISite.userinfo

Retrieve userinfo from site and store in _userinfo attribute.

self._userinfo will be a dict with the following keys and values:

- id: user id (numeric str)
- name: username (if user is logged in)
- anon: present if user is not logged in
- groups: list of groups (could be empty)
- rights: list of rights (could be empty)
- message: present if user has a new message on talk page
- blockinfo: present if user is blocked (dict)
APISite.users(usernames)[source]

Iterate info about a list of users by name or IP.

Parameters:usernames (list, or other iterable, of unicodes) – a list of user names
APISite.validate_tokens(types)[source]

Validate if requested tokens are acceptable.

Valid tokens depend on mw version.

APISite.version()[source]

Return live project version number as a string.

This overwrites the corresponding family method for APISite class. Use pywikibot.tools.MediaWikiVersion to compare MediaWiki versions.

APISite.wantedcategories(step=None, total=None)[source]

Yield Pages from Special:Wantedcategories.

Parameters:
  • step – request batch size
  • total – number of pages to return
APISite.wantedpages(step=None, total=None)[source]

Yield Pages from Special:Wantedpages.

Parameters:
  • step – request batch size
  • total – number of pages to return
APISite.watchlist_revs(start=None, end=None, reverse=False, namespaces=None, showMinor=None, showBot=None, showAnon=None, step=None, total=None)[source]

Iterate revisions to pages on the bot user’s watchlist.

Iterated values will be in same format as recentchanges.

Parameters:
  • start – Iterate revisions starting at this Timestamp
  • end – Iterate revisions ending at this Timestamp
  • reverse – Iterate oldest revisions first (default: newest)
  • showMinor – if True, only list minor edits; if False (and not None), only list non-minor edits
  • showBot – if True, only list bot edits; if False (and not None), only list non-bot edits
  • showAnon – if True, only list anon edits; if False (and not None), only list non-anon edits
APISite.watchpage(page, unwatch=False)[source]

Add or remove page from watchlist.

Parameters:unwatch – If True, remove page from watchlist; if False (default), add it.
Returns:True if API returned expected response; False otherwise
APISite.withoutinterwiki(step=None, total=None, number=NotImplemented, repeat=NotImplemented)[source]

Yield Pages without language links from Special:Withoutinterwiki.

Parameters:
  • step – request batch size
  • total – number of pages to return
class pywikibot.site.BaseSite(code, fam=None, user=None, sysop=None)[source]

Bases: pywikibot.tools.ComparableMixin

Site methods that are independent of the communication interface.

_cache_interwikimap(force=False)[source]

Cache the interwikimap with usable site instances.

_cmpkey()[source]

Perform equality and inequality tests on Site objects.

category_namespace()[source]
category_namespaces()[source]
category_on_one_line()[source]

Return True if this site wants all category links on one line.

code

The identifying code for this Site.

By convention, this is usually an ISO language code, but it does not have to be.

disambcategory()[source]

Return Category in which disambig pages are listed.

doc_subpage

Return the documentation subpage for this Site.

Returns:tuple
fam()[source]

Return Family object for this Site.

family

The Family object for this Site’s wiki family.

getNamespaceIndex(*a, **kw)
getSite(code)[source]

Return Site object for language ‘code’ in this Family.

getUrl(path, retry=True, sysop=False, data=None, compress=True, no_hostname=False, cookie_only=False)[source]

DEPRECATED.

Retained for compatibility only. All arguments except path and data are ignored.

image_namespace()[source]
interwiki(prefix)[source]

Return the site for a corresponding interwiki prefix.

Raises SiteDefinitionError: if the url given in the interwiki table doesn’t match any of the existing families.

Raises KeyError: if the prefix is not an interwiki prefix.

interwiki_prefix(site)[source]

Return the interwiki prefixes going to that site.

The interwiki prefixes are ordered first by length (shortest first) and then alphabetically.

Parameters:site (BaseSite) – The targeted site, which might be itself.
Returns:The interwiki prefixes
Return type:list (guaranteed to be non-empty)

Raises KeyError: if there is no interwiki prefix for that site.

interwiki_putfirst()[source]

Return list of language codes for ordering of interwiki links.

interwiki_putfirst_doubled(list_of_links)[source]

isInterwikiLink(text)[source]

Return True if text is in the form of an interwiki link.

If a link object constructed using “text” as the link text parses as belonging to a different site, this method returns True.

lang

The ISO language code for this Site.

Presumed to be equal to the wiki prefix, but this can be overridden.

languages()[source]

Return list of all valid language codes for this site’s Family.

linkto(title, othersite=None)[source]

DEPRECATED. Return a wikilink to a page.

Parameters:
  • title (unicode) – Title of the page to link to
  • othersite (Site (optional)) – Generate a interwiki link for use on this site.
Returns:

unicode

local_interwiki(prefix)[source]

Return whether the interwiki prefix is local.

A local interwiki prefix is handled by the target site like a normal link, so if that link in turn contains an interwiki link, the target site follows it as long as it is also a local link.

Raises SiteDefinitionError: if the url given in the interwiki table doesn’t match any of the existing families.

Raises KeyError: if the prefix is not an interwiki prefix.

lock_page(page, block=True)[source]

Lock page for writing. Must be called before writing any page.

We don’t want different threads trying to write to the same page at the same time, even to different sections.

Parameters:
  • page (pywikibot.Page) – the page to be locked
  • block – if true, wait until the page is available to be locked; otherwise, raise an exception if page can’t be locked
mediawiki_namespace()[source]
namespaces

Return dict of valid namespaces on this wiki.

normalizeNamespace(*a, **kw)
ns_index(namespace)[source]

Return the Namespace for a given namespace name.

Parameters:namespace (unicode) – name
Returns:The matching Namespace object on this Site
Return type:Namespace, or None if invalid
ns_normalize(value)[source]

Return canonical local form of namespace name.

Parameters:value (unicode) – A namespace name
pagename2codes(default=NotImplemented)[source]

Return list of localized PAGENAMEE tags for the site.

pagenamecodes(default=NotImplemented)[source]

Return list of localized PAGENAME tags for the site.

postData(address, data, contentType=None, sysop=False, compress=True, cookies=None)[source]

DEPRECATED.

postForm(address, predata, sysop=False, cookies=None)[source]

DEPRECATED.

redirect(default=NotImplemented)[source]

Return list of localized redirect tags for the site.

redirectRegex(pattern=None)[source]

Return a compiled regular expression matching on redirect pages.

Group 1 in the regex match object will be the target title.

sametitle(title1, title2)[source]

Return True if title1 and title2 identify the same wiki page.

title1 and title2 may be unequal but still identify the same page, if they use different aliases for the same namespace.

sitename

String representing this Site’s name and code.

special_namespace()[source]
template_namespace()[source]
throttle

Return this Site’s throttle. Initialize a new one if needed.

unlock_page(page)[source]

Unlock page. Call as soon as a write operation has completed.

Parameters:page (pywikibot.Page) – the page to be unlocked
urlEncode(query)[source]

DEPRECATED.

user()[source]

Return the currently-logged in bot user, or None.

username(sysop=False)[source]

validLanguageLinks()[source]

Return list of language codes that can be used in interwiki links.

class pywikibot.site.DataSite(code, fam, user, sysop)[source]

Bases: pywikibot.site.APISite

Wikibase data capable site.

_cache_entity_namespaces()[source]

Find namespaces for each known wikibase entity type.

_get_propertyitem(props, source, **params)[source]

Generic method to get the data for multiple Wikibase items.

addClaim(item, claim, bot=True, **kwargs)[source]
categories(*args, **kwargs)[source]
changeClaimTarget(claim, snaktype='value', bot=True, **kwargs)[source]

Set the claim target to the value of the provided claim target.

Parameters:
  • claim (Claim) – The source of the claim target value
  • snaktype (str (‘value’, ‘novalue’ or ‘somevalue’)) – An optional snaktype. Default: ‘value’
createNewItemFromPage(page, bot=True, **kwargs)[source]

Create a new Wikibase item for a provided page.

Parameters:
  • page (pywikibot.Page) – page to fetch links from
  • bot – whether to mark the edit as bot
Returns:

pywikibot.ItemPage of newly created item

editEntity(identification, data, bot=True, **kwargs)[source]
editQualifier(claim, qualifier, new=False, bot=True, **kwargs)[source]

Create/Edit a qualifier.

Parameters:
  • claim (Claim) – A Claim object to add the qualifier to
  • qualifier (Claim) – A Claim object to be used as a qualifier
editSource(claim, source, new=False, bot=True, **kwargs)[source]

Create/Edit a source.

Parameters:
  • claim (Claim) – A Claim object to add the source to
  • source (Claim) – A Claim object to be used as a source
  • new (bool) – Whether to create a new one if the “source” already exists
fam()[source]
getPropertyType(prop)[source]

Obtain the type of a property.

This is used specifically because we can cache the value for a much longer time (near infinite).

getUrl(*args, **kwargs)[source]
get_item(source, **params)[source]

Get the data for multiple Wikibase items.

isAllowed(*args, **kwargs)[source]
isBlocked(*args, **kwargs)[source]
item_namespace

Return namespace for items.

Returns:item namespace
Return type:Namespace
linkTitles(page1, page2, bot=True)[source]

Link two pages together.

Parameters:
  • page1 (pywikibot.Page) – First page to link
  • page2 (pywikibot.Page) – Second page to link
  • bot – whether to mark edit as bot
Returns:

dict API output

linksearch(*args, **kwargs)[source]
linkto(*args, **kwargs)[source]
loadcontent(identification, *props)[source]

Fetch the current content of a Wikibase item.

This is called loadcontent since wbgetentities does not support fetching old revisions. Eventually this will get replaced by an actual loadrevisions.

Parameters:
  • identification (dict) – Parameters used to identify the page(s)
  • props – the optional properties to fetch.
loggedInAs(*args, **kwargs)[source]
mergeItems(fromItem, toItem, **kwargs)[source]

Merge two items together.

Parameters:
  • fromItem (pywikibot.ItemPage) – Item to merge from
  • toItem (pywikibot.ItemPage) – Item to merge into
Returns:

dict API output

newimages(*args, **kwargs)[source]
postData(*args, **kwargs)[source]
postForm(*args, **kwargs)[source]
prefixindex(*args, **kwargs)[source]
preloaditempages(pagelist, groupsize=50)[source]

Yield ItemPages with content prefilled.

Note that pages will be iterated in a different order than in the underlying pagelist.

Parameters:
  • pagelist – an iterable that yields either WikibasePage objects, or Page objects linked to an ItemPage.
  • groupsize (int) – how many pages to query at a time
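
A short usage sketch (page titles hypothetical; repo is a DataSite as above):

>>> pages = [pywikibot.Page(site, title) for title in ('Foo', 'Bar')]
>>> for item in repo.preloaditempages(pages, groupsize=50):
...     print(item.title())
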
property_namespace

Return namespace for properties.

Returns:property namespace
Return type:Namespace
removeClaims(claims, bot=True, **kwargs)[source]
removeSources(claim, sources, bot=True, **kwargs)[source]

Remove sources.

Parameters:
  • claim (Claim) – A Claim object to remove the sources from
  • sources (list of Claim) – A list of Claim objects that are sources
urlEncode(*args, **kwargs)[source]
class pywikibot.site.LoginStatus(state)[source]

Bases: object

Enum for Login statuses.

>>> LoginStatus.NOT_ATTEMPTED
-3
>>> LoginStatus.AS_USER
0
>>> LoginStatus.name(-3)
'NOT_ATTEMPTED'
>>> LoginStatus.name(0)
'AS_USER'
AS_SYSOP = 1
AS_USER = 0
IN_PROGRESS = -2
NOT_ATTEMPTED = -3
NOT_LOGGED_IN = -1
classmethod name(search_value)[source]
class pywikibot.site.Namespace(id, canonical_name=None, custom_name=None, aliases=None, use_image_name=False, **kwargs)[source]

Bases: collections.abc.Iterable, pywikibot.tools.ComparableMixin, pywikibot.tools.UnicodeMixin

Namespace site data object.

This is backwards compatible with the structure of entries in site._namespaces, which were a list of:

[customised namespace,
 canonical namespace name?,
 namespace alias*]

If the canonical_name is not provided for a namespace between -2 and 15, the MediaWiki 1.14+ built-in names are used. Enable use_image_name to use the built-in names from MediaWiki 1.13 and earlier instead.

Image and File are aliases of each other by default.

If only one of canonical_name and custom_name are available, both properties will have the same value.

_cmpkey()[source]

Return the ID as a comparison key.

static _colons(id, name)[source]

Return the name with required colons, depending on the ID.

_contains_lowercase_name(name)[source]

Determine whether a lowercase normalised name is a name of this namespace.

Return type:bool
_distinct()[source]
static builtin_namespaces(use_image_name=False)[source]

Return a dict of the builtin namespaces.

canonical_namespaces = {0: '', 1: 'Talk', 2: 'User', 3: 'User talk', 4: 'Project', 5: 'Project talk', 6: 'File', 7: 'File talk', 8: 'MediaWiki', 9: 'MediaWiki talk', 10: 'Template', 11: 'Template talk', 12: 'Help', 13: 'Help talk', 14: 'Category', 15: 'Category talk', -1: 'Special', -2: 'Media'}
canonical_prefix()[source]

Return the canonical name with required colons.

custom_prefix()[source]

Return the custom name with required colons.

static lookup_name(name, namespaces=None)[source]

Find the namespace for a name.

Parameters:
  • name – Name of the namespace.
  • namespaces (dict of Namespace) – namespaces to search default: builtins only
Returns:

Namespace or None
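
For example, a sketch using only the builtin namespaces (namespaces=None):

>>> Namespace.lookup_name('File').id
6
>>> Namespace.lookup_name('Image').id  # ‘Image’ is a default alias of ‘File’
6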

static normalize_name(name)[source]

Remove an optional colon before and after name.

TODO: reject illegal characters.

exception pywikibot.site.PageInUse(arg)[source]

Bases: pywikibot.exceptions.Error

Page cannot be reserved for writing due to existing lock.

class pywikibot.site.Siteinfo(site)[source]

Bases: collections.abc.Container

A ‘dictionary’ like container for siteinfo.

This class queries the server to get the requested siteinfo property. Optionally it can cache this directly in the instance so that later requests don’t need to query the server.

All values of the siteinfo property ‘general’ are directly available.

WARNING_REGEX = re.compile("^Unrecognized values? for parameter 'siprop': ([^,]+(?:, [^,]+)*)$")
_get_cached(key)[source]

Return the cached value or a KeyError exception if not cached.

static _get_default(key)[source]

Return the default value for different properties.

If the property is ‘restrictions’ it returns a dictionary with:
  • ‘cascadinglevels’: ‘sysop’
  • ‘semiprotectedlevels’: ‘autoconfirmed’
  • ‘levels’: ‘’ (everybody), ‘autoconfirmed’, ‘sysop’
  • ‘types’: ‘create’, ‘edit’, ‘move’, ‘upload’

Otherwise it returns pywikibot.tools.EMPTY_DEFAULT.

Parameters:key (str) – The property name
Returns:The default value
Return type:dict or pywikibot.tools.EmptyDefault
_get_general(key, expiry)[source]

Return a siteinfo property which is loaded by default.

The property ‘general’ will be queried if it hasn’t been queried yet or if the query is forced. Additionally all uncached default properties are queried. This way multiple default properties are queried with one request. All results are always cached.

Parameters:
  • key (str) – The key to search for.
  • expiry (int (days), datetime.timedelta, False (never)) – If the cache is older than the expiry it ignores the cache and queries the server to get the newest value.
Returns:

The value of the property, if it was retrieved via this method. Returns None if the key was not in the retrieved values.

Return type:

various (the value), bool (if the default value is used)

_get_siteinfo(prop, expiry)[source]

Retrieve a siteinfo property.

All properties which the site doesn’t support contain the default value. Because pre-1.12 versions return no data when a property doesn’t exist, each property is queried independently if any property is invalid.

Parameters:
  • prop (str or iterable) – The property names of the siteinfo.
  • expiry (int (days), datetime.timedelta, False (config)) – The expiry date of the cached request.
Returns:

A dictionary with the properties of the site. Each entry in the dictionary is a tuple of the value and a boolean to save if it is the default value.

Return type:

dict (the values)

See:

https://www.mediawiki.org/wiki/API:Meta#siteinfo_.2F_si

static _is_expired(cache_date, expire)[source]

Return true if the cache date is expired.

get(key, get_default=True, cache=True, expiry=False)[source]

Return a siteinfo property.

It will never throw an APIError that merely states the siteinfo property doesn’t exist; instead it will use the default value.

Parameters:
  • key (str) – The name of the siteinfo property.
  • get_default (bool) – Whether to return a default value if the key is invalid, instead of raising a KeyError.
  • cache (bool) – Caches the result internally so that future accesses via this method won’t query the server.
  • expiry (int/float (days), datetime.timedelta, False (never)) – If the cache is older than the expiry it ignores the cache and queries the server to get the newest value.
Returns:

The gathered property

Return type:

various

Raises:KeyError – If the key is not a valid siteinfo property and the get_default option is set to False.
See:_get_siteinfo
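
A sketch of typical access (both property names are standard siprop values):

>>> site = pywikibot.Site('en', 'wikipedia')
>>> generator = site.siteinfo.get('generator')          # a ‘general’ property
>>> groups = site.siteinfo.get('usergroups', expiry=7)  # re-query if cached copy is older than 7 days
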
get_requested_time(key)[source]

Return when ‘key’ was successfully requested from the server.

If the property is actually in the siprop ‘general’ it returns the last request from the ‘general’ siprop.

Parameters:key (basestring) – The siprop value or a property of ‘general’.
Returns:The last time the siprop of ‘key’ was requested.
Return type:None (never), False (default), datetime.datetime (cached)
is_recognised(key)[source]

Return if ‘key’ is a valid property name. ‘None’ if not cached.

class pywikibot.site.TokenWallet(site)[source]

Bases: object

Container for tokens.

load_tokens(types, all=False)[source]

Preload one or multiple tokens.

Parameters:
  • types (iterable) – the types of token.
  • all (bool) – load all available tokens; if None, load all only if it can be done in one request.
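
A sketch, assuming site.tokens is this site’s TokenWallet:

>>> site.tokens.load_tokens(['edit', 'move'])  # one request for both types
>>> token = site.tokens['edit']
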
pywikibot.site.must_be(group=None, right=None)[source]

Decorator to require a certain user status when method is called.

Parameters:
  • group (str (‘user’ or ‘sysop’)) – The group the logged-in user should belong to. This parameter can be overridden by the keyword argument ‘as_group’.
  • right – The rights the logged-in user should have. Not supported yet and thus ignored.
Returns:

method decorator

pywikibot.site.need_version(version)[source]

Decorator to require a certain MediaWiki version number.

Parameters:version (str) – the mw version number required
Returns:a decorator to make sure the requirement is satisfied when the decorated function is called.
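
A sketch of how these decorators are typically combined on APISite methods (the method itself is hypothetical):

>>> class MySite(APISite):
...     @must_be(group='user')
...     @need_version('1.21')
...     def do_maintenance(self):
...         pass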

textlib Module

Functions for manipulating wiki-text.

Unless otherwise noted, all functions take a unicode string as the argument and return a unicode string.

class pywikibot.textlib.TimeStripper(site=None)[source]

Bases: object

Find timestamp in page and return it as timezone aware datetime object.

findmarker(text, base='@@', delta='@')[source]

Find a string which is not part of text.

fix_digits(line)[source]

Convert non-Latin digits (e.g. Persian) to Latin so they can be parsed.

last_match_and_replace(txt, pat)[source]

Take the rightmost match and replace with marker.

It does so to prevent spurious earlier matches.

timestripper(line)[source]

Find timestamp in line and convert it to time zone aware datetime.

All of the following items must be matched, otherwise None is returned: year, month, hour, time, day, minute, tzinfo.
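
A brief sketch (the recognised timestamp format depends on the site’s language):

>>> ts = TimeStripper(site=pywikibot.Site('en', 'wikipedia'))
>>> dt = ts.timestripper('comment 10:15, 1 January 2015 (UTC)')  # tz-aware datetime, or None
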
pywikibot.textlib.categoryFormat(categories, insite=None)[source]

Return a string containing links to all categories in a list.

‘categories’ should be a list of Category or Page objects or strings which can be either the raw name, [[Category:..]] or [[cat_localised_ns:...]].

The string is formatted for inclusion in insite. Category namespace is converted to localised namespace.

pywikibot.textlib.compileLinkR(withoutBracketed=False, onlyBracketed=False)[source]

Return a regex that matches external links.

pywikibot.textlib.does_text_contain_section(pagetext, section)[source]

Determine whether the page text contains the given section title.

It does not care whether a section string contains spaces or underscores; both will match.

If the section parameter contains an internal link, it will match the section with or without a preceding colon, which is required for a text link e.g. for categories and files.

Parameters:
  • pagetext (unicode or string) – The wikitext of a page
  • section (unicode or string) – a section of a page including wikitext markups
pywikibot.textlib.expandmarker(text, marker='', separator='')[source]
pywikibot.textlib.extract_templates_and_params(text)[source]

Return a list of templates found in text.

Return value is a list of tuples. There is one tuple for each use of a template in the page, with the template title as the first entry and a dict of parameters as the second entry. Parameters are indexed by strings; as in MediaWiki, an unnamed parameter is given a parameter name with an integer value corresponding to its position among the unnamed parameters. If this results in multiple parameters with the same name, only the last value provided will be returned.

This uses a third party library (mwparserfromhell) if it is installed and enabled in the user-config.py. Otherwise it falls back on a regex based function defined below.

Parameters:text (unicode or string) – The wikitext from which templates are extracted
Returns:list of template name and params
Return type:list of tuple
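
A short sketch of the returned structure:

>>> from pywikibot import textlib
>>> result = textlib.extract_templates_and_params('{{foo|bar|name=baz}}')
>>> # result is roughly [('foo', {'1': 'bar', 'name': 'baz'})]
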
pywikibot.textlib.extract_templates_and_params_regex(text)[source]

Extract templates with params using a regex.

This function should not be called directly.

Use extract_templates_and_params, which will fall back to this regex-based implementation when the mwparserfromhell implementation is not used.

Parameters:text (unicode or string) – The wikitext from which templates are extracted
Returns:list of template name and params
Return type:list of tuple
pywikibot.textlib.findmarker(text, startwith='@@', append=None)[source]

Find a string which is not part of text.

pywikibot.textlib.getCategoryLinks(text, site=None, include=[])[source]

Return a list of category links found in text.

Parameters:include (list) – list of tags which should not be removed by removeDisabledParts() and where CategoryLinks can be searched.
Returns:all category links found
Return type:list of Category objects

pywikibot.textlib.getLanguageLinks(text, insite=None, pageLink='[[]]', template_subpage=False)[source]

Return a dict of inter-language links found in text.

The returned dict uses language codes as keys and Page objects as values.

Do not call this routine directly, use Page.interwiki() method instead.

pywikibot.textlib.glue_template_and_params(template_and_params)[source]

Return wiki text of template glued from params.

You can use items from extract_templates_and_params here to get an equivalent template wiki text (it may happen that the order of the params changes).

pywikibot.textlib.interwikiFormat(links, insite=None)[source]

Convert interwiki link dict into a wikitext string.

‘links’ should be a dict with the Site objects as keys, and Page or Link objects as values.

Return a unicode string that is formatted for inclusion in insite (defaulting to the current site).

pywikibot.textlib.interwikiSort(sites, insite=None)[source]
pywikibot.textlib.isDisabled(text, index, tags=['*'])[source]

Return True if text[index] is disabled, e.g. by a comment or by nowiki tags.

For the tags parameter, see removeDisabledParts.

pywikibot.textlib.removeCategoryLinks(text, site=None, marker='')[source]

Return text with all category links removed.

Put the string marker after the last replacement (at the end of the text if there is no replacement).

pywikibot.textlib.removeCategoryLinksAndSeparator(text, site=None, marker='', separator='')[source]

Return text with all category links and preceding separators removed.

Put the string marker after the last replacement (at the end of the text if there is no replacement).

pywikibot.textlib.removeDisabledParts(text, tags=['*'], include=[])[source]

Return text without portions where wiki markup is disabled.

Parts that can/will be removed are:
  • HTML comments
  • nowiki tags
  • pre tags
  • includeonly tags

The exact set of parts which should be removed can be passed as the ‘tags’ parameter, which defaults to all. Alternatively, default parts that shall not be removed can be specified via the ‘include’ parameter.

pywikibot.textlib.removeHTMLParts(text, keeptags=['tt', 'nowiki', 'small', 'sup'])[source]

Return text without portions where HTML markup is disabled.

Parts that can/will be removed are:
  • HTML and all wiki tags

The exact set of parts which should NOT be removed can be passed as the ‘keeptags’ parameter, which defaults to [‘tt’, ‘nowiki’, ‘small’, ‘sup’].

pywikibot.textlib.removeLanguageLinks(text, site=None, marker='')[source]

Return text with all inter-language links removed.

If a link to an unknown language is encountered, a warning is printed. If a marker is defined, that string is placed at the location of the last occurrence of an interwiki link (at the end if there are no interwiki links).

pywikibot.textlib.removeLanguageLinksAndSeparator(text, site=None, marker='', separator='')[source]

Return text with inter-language links and preceding separators removed.

If a link to an unknown language is encountered, a warning is printed. If a marker is defined, that string is placed at the location of the last occurrence of an interwiki link (at the end if there are no interwiki links).

pywikibot.textlib.replaceCategoryInPlace(oldtext, oldcat, newcat, site=None)[source]

Replace old category with new one and return the modified text.

Parameters:
  • oldtext – the text in which the category is to be replaced
  • oldcat – pywikibot.Category object of the old category
  • newcat – pywikibot.Category object of the new category
Returns:

the modified text

Return type:

unicode

pywikibot.textlib.replaceCategoryLinks(oldtext, new, site=None, addOnly=False)[source]

Replace all existing category links with new category links.

Parameters:
  • oldtext – The text that needs to be replaced.
  • new – Should be a list of Category objects or strings which can be either the raw name or [[Category:..]].
  • addOnly – If addOnly is True, the old category won’t be deleted and the category(s) given will be added (and so they won’t replace anything).
pywikibot.textlib.replaceExcept(text, old, new, exceptions, caseInsensitive=False, allowoverlap=False, marker='', site=None)[source]

Return text with ‘old’ replaced by ‘new’, ignoring specified types of text.

Skips occurrences of ‘old’ within exceptions; e.g., within nowiki tags or HTML comments. If caseInsensitive is true, case-insensitive regex matching is used. If allowoverlap is true, overlapping occurrences are all replaced (watch out when using this, as it might lead to infinite loops!).

Parameters:
  • old – a compiled or uncompiled regular expression
  • new – a unicode string (which can contain regular expression references), or a function which takes a match object as parameter. See parameter repl of re.sub().
  • exceptions – a list of strings which signal what to leave out, e.g. [‘math’, ‘table’, ‘template’]
  • marker – a string that will be added to the last replacement; if nothing is changed, it is added at the end
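
For example, a sketch that leaves nowiki content untouched:

>>> textlib.replaceExcept('a colour <nowiki>colour</nowiki>', 'colour', 'color',
...                       ['nowiki'])
'a color <nowiki>colour</nowiki>'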

pywikibot.textlib.replaceLanguageLinks(oldtext, new, site=None, addOnly=False, template=False, template_subpage=False)[source]

Replace inter-language links in the text with a new set of links.

‘new’ should be a dict with the Site objects as keys, and Page or Link objects as values (i.e., just like the dict returned by getLanguageLinks function).

pywikibot.textlib.to_local_digits(phrase, lang)[source]

Change Latin digits based on language to localized version.

Be aware that this function only converts digits for a limited set of languages and leaves the input untouched for all other languages.

Parameters:
  • phrase – The phrase to convert to localized numerals
  • lang – language code
Returns:The localized version
Return type:unicode
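
For example, assuming Persian (‘fa’) is among the supported languages:

>>> to_local_digits('2015', 'fa')
'۲۰۱۵'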

class pywikibot.textlib.tzoneFixedOffset(offset, name)[source]

Bases: datetime.tzinfo

Class building tzinfo objects for fixed-offset time zones.

Parameters:
  • offset – a number indicating fixed offset in minutes east from UTC
  • name – a string with name of the timezone
dst(dt)[source]
tzname(dt)[source]
utcoffset(dt)[source]
pywikibot.textlib.unescape(s)[source]

Replace escaped HTML-special characters by their originals.

throttle Module

Mechanics to slow down wiki read and/or write rate.

class pywikibot.throttle.Throttle(site, mindelay=None, maxdelay=None, writedelay=None, multiplydelay=True)[source]

Bases: object

Control rate of access to wiki server.

Calling this object blocks the calling thread until at least ‘delay’ seconds have passed since the previous call.

Each Site initiates one Throttle object (site.throttle) to control the rate of access.
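
A sketch of typical use (assuming the throttle’s call signature accepts a write flag, as used by API requests):

>>> site = pywikibot.Site('en', 'wikipedia')
>>> site.throttle()            # block until the read delay has elapsed
>>> site.throttle(write=True)  # block until the write delay has elapsed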

checkMultiplicity()[source]

Count running processes for site and set process_multiplicity.

drop()[source]

Remove me from the list of running bot processes.

getDelay(write=False)[source]

Return the actual delay, accounting for multiple processes.

This value is the maximum wait between reads/writes, not taking account of how much time has elapsed since the last access.

lag(lagtime)[source]

Seize the throttle lock due to server lag.

This will prevent any thread from accessing this site.

setDelays(delay=None, writedelay=None, absolute=False)[source]

Set the nominal delays in seconds. Defaults to config values.

wait(seconds)[source]

Wait for seconds seconds.

Announce the delay if it exceeds a preset limit.

waittime(write=False)[source]

Return waiting time in seconds if a query would be made right now.

titletranslate Module

Title translate module.

pywikibot.titletranslate.appendFormatedDates(result, dictName, value)[source]
pywikibot.titletranslate.getPoisonedLinks(pl)[source]

Return a list of known corrupted links that should be removed if seen.

pywikibot.titletranslate.translate(page, hints=None, auto=True, removebrackets=False, site=None, family=None)[source]

Return a list of links to pages on other sites based on hints.

Entries for single page titles list those pages. Page titles for entries such as “all:” or “xyz:” or “20:” are first built from the page title of ‘page’ and then listed. When ‘removebrackets’ is True, a trailing pair of brackets and the text between them is removed from the page title. If ‘auto’ is true, known year and date page titles are autotranslated to all known target languages and inserted into the list.

tools Module

Miscellaneous helper functions (not wiki-dependent).

exception pywikibot.tools.CombinedError[source]

Bases: KeyError, IndexError

An error that gets caught by both KeyError and IndexError.

class pywikibot.tools.ComparableMixin[source]

Bases: object

Mixin class to allow comparing to other objects which are comparable.

class pywikibot.tools.DequeGenerator[source]

Bases: collections.deque

A generator that allows items to be added during generating.

next()[source]

Python 3 iterator method.

class pywikibot.tools.EmptyDefault[source]

Bases: str, collections.abc.Mapping

A default for a non-existing siteinfo property.

It should be chosen if there is no better default known. It acts like an empty collection, so it can safely be iterated over, whether treated as a list, tuple, set or dictionary. It is also basically an empty string.

Accessing a value via __getitem__ will result in a combined KeyError and IndexError.

_empty_iter()[source]

An iterator which does nothing and drops the argument.

iteritems()

An iterator which does nothing and drops the argument.

iterkeys()

An iterator which does nothing and drops the argument.

itervalues()

An iterator which does nothing and drops the argument.

class pywikibot.tools.MediaWikiVersion(vstring=None)[source]

Bases: distutils.version.Version

Version object to allow comparing ‘wmf’ versions with normal ones.

MEDIAWIKI_VERSION = re.compile('(\\d+(?:\\.\\d+)*)(?:wmf(\\d+))?')
_cmp(other)[source]
parse(vstring)[source]
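
For example:

>>> MediaWikiVersion('1.25wmf4') > MediaWikiVersion('1.24')
True
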
class pywikibot.tools.ModuleDeprecationWrapper(module)[source]

Bases: object

A wrapper for a module to deprecate classes or variables of it.

_add_deprecated_attr(name, replacement=None, replacement_name=None)[source]

Add the name to the local deprecated names dict.

Parameters:
  • name (str) – The name of the deprecated class or variable. It may not be already deprecated.
  • replacement (any) – The replacement value which should be returned instead. If the name is already an attribute of that module this must be None. If None it’ll return the attribute of the module.
  • replacement_name (str) – The name of the new replaced value. Required if replacement is not None and it has no __name__ attribute.
class pywikibot.tools.SelfCallDict[source]

Bases: pywikibot.tools.SelfCallMixin, dict

Dict with SelfCallMixin.

class pywikibot.tools.SelfCallMixin[source]

Bases: object

Return self when called.

class pywikibot.tools.SelfCallString[source]

Bases: pywikibot.tools.SelfCallMixin, str

Unicode string with SelfCallMixin.

class pywikibot.tools.ThreadList(limit=128, *args)[source]

Bases: list

A simple threadpool class to limit the number of simultaneous threads.

Any threading.Thread object can be added to the pool using the append() method. If the maximum number of simultaneous threads has not been reached, the Thread object will be started immediately; if not, the append() call will block until the thread is able to start.

>>> pool = ThreadList(limit=10)
>>> def work():
...     time.sleep(1)
...
>>> for x in range(20):
...     pool.append(threading.Thread(target=work))
...
_logger = 'threadlist'
active_count()[source]

Return the number of alive threads, and delete all non-alive ones.

append(thd)[source]

Add a thread to the pool and start it.

stop_all()[source]

Stop all threads in the pool.

class pywikibot.tools.ThreadedGenerator(group=None, target=None, name='GeneratorThread', args=(), kwargs=None, qsize=65536)[source]

Bases: threading.Thread

Look-ahead generator class.

Runs a generator in a separate thread and queues the results; can be called like a regular generator.

Subclasses should override self.generator, not self.run

Important: the generator thread will stop itself if the generator’s internal queue is exhausted; but, if the calling program does not use all the generated values, it must call the generator’s stop() method to stop the background thread. Example usage:

>>> gen = ThreadedGenerator(target=range, args=(20,))
>>> try:
...     data = list(gen)
... finally:
...     gen.stop()
>>> data
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

run()[source]

Run the generator and store the results on the queue.

stop()[source]

Stop the background thread.

class pywikibot.tools.UnicodeMixin[source]

Bases: object

Mixin class to add __str__ method in Python 2 or 3.

pywikibot.tools.add_decorated_full_name(obj)[source]

Extract full object name, including class, and store in __full_name__.

This must be done on all decorators that are chained together, otherwise the second decorator will have the wrong full name.

Parameters:obj (object) – An object being decorated
pywikibot.tools.add_full_name(obj)[source]

A decorator to add __full_name__ to the function being decorated.

This should be done for all decorators used in pywikibot, as any decorator that does not add __full_name__ will prevent other decorators in the same chain from being able to obtain it.

This can be used to monkey-patch decorators in other modules. e.g. <xyz>.foo = add_full_name(<xyz>.foo)

Parameters:obj (callable) – The function to decorate
Returns:decorating function
Return type:function
pywikibot.tools.deprecate_arg(old_arg, new_arg)[source]

Decorator to declare old_arg deprecated and replace it with new_arg.

pywikibot.tools.deprecated(*outer_args, **outer_kwargs)[source]

Outer wrapper.

The outer wrapper may be the replacement function if the decorated decorator was called without arguments, or the replacement decorator if the decorated decorator was called with arguments.

Parameters:
  • outer_args (list) – args
  • outer_kwargs (dict) – kwargs

pywikibot.tools.deprecated_args(**arg_pairs)[source]

Decorator to declare multiple args deprecated.

Parameters:arg_pairs – Each entry points to the new argument name. With True or None it drops the value and prints a warning. If False it just drops the value.
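
A sketch with hypothetical argument names:

>>> @deprecated_args(oldParam='new_param', obsolete=None)
... def func(new_param=None):
...     return new_param
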
pywikibot.tools.empty_iterator()[source]

An iterator which does nothing.

pywikibot.tools.intersect_generators(genlist)[source]

Intersect generators listed in genlist.

Yield items only if they are yielded by all generators in genlist. Threads (via ThreadedGenerator) are used in order to run generators in parallel, so that items can be yielded before generators are exhausted.

Threads are stopped when they are either exhausted or Ctrl-C is pressed. Quitting before all generators are finished is attempted if there is no more chance of finding an item in all queues.

Parameters:genlist (list) – list of page generators
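
A sketch intersecting two category generators (category names hypothetical):

>>> cats = [pywikibot.Category(site, 'Category:Foo'),
...         pywikibot.Category(site, 'Category:Bar')]
>>> for page in intersect_generators([cat.articles() for cat in cats]):
...     print(page.title())
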
pywikibot.tools.itergroup(iterable, size)[source]

Make an iterator that returns lists of (up to) size items from iterable.

Example:

>>> i = itergroup(range(25), 10)
>>> print(next(i))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> print(next(i))
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> print(next(i))
[20, 21, 22, 23, 24]
>>> print(next(i))
Traceback (most recent call last):
...
StopIteration

pywikibot.tools.redirect_func(target, source_module=None, target_module=None, old_name=None, class_name=None)[source]

Return a function which can be used to redirect to ‘target’.

It also marks the function as deprecated and copies all parameters.

Parameters:
  • target (callable) – The targeted function which is to be executed.
  • source_module (basestring) – The module of the old function. If ‘.’ defaults to target_module. If ‘None’ (default) it tries to guess it from the executing function.
  • target_module (basestring) – The module of the target function. If ‘None’ (default) it tries to get it from the target. Might not work with nested classes.
  • old_name (basestring) – The old function name. If None it uses the name of the new function.
  • class_name (basestring) – The name of the class. It’s added to the target and source module (separated by a ‘.’).
Returns:

A new function which adds a warning prior to each execution.

Return type:

callable

pywikibot.tools.signature(obj)[source]

Safely return function Signature object (PEP 362).

inspect.signature was introduced in 3.3, however backports are available. In Python 3.3, it does not support all types of callables, and should not be relied upon. Python 3.4 works correctly.

Any exception calling inspect.signature is ignored and None is returned.

Parameters:obj (callable) – Function to inspect
Return type:inspect.Signature or None

version Module

Module to determine the pywikibot version (tag, revision and date).

exception pywikibot.version.ParseError[source]

Bases: Exception

Parsing went wrong.

pywikibot.version.getfileversion(filename)[source]

Retrieve revision number of file.

Extracts the __version__ variable containing the Id tag, without importing it (thus it can be done for any file).

Parameters:filename (string) – Name of the file to get the version from

pywikibot.version.getversion(online=True)[source]

Return a pywikibot version string.

Parameters:online – (optional) Include information obtained online
pywikibot.version.getversion_git(path=None)[source]
pywikibot.version.getversion_nightly()[source]
pywikibot.version.getversion_onlinerepo(repo=None)[source]

Retrieve current framework revision number from online repository.

Parameters:repo (URL or string) – (optional) Online repository location
pywikibot.version.getversion_svn(path=None)[source]

Get version info for a Subversion checkout.

Parameters:path – directory of the Subversion checkout
Returns:
  • tag (name for the repository),
  • rev (current Subversion revision identifier),
  • date (date of current revision),
  • hash (git hash for the Subversion revision)
Return type:tuple of 4 str
pywikibot.version.getversiondict()[source]
pywikibot.version.github_svn_rev2hash(tag, rev)[source]

Convert a Subversion revision to a Git hash using Github.

Parameters:
  • tag – name of the Subversion repo on Github
  • rev – Subversion revision identifier
Returns:

the git hash

Return type:

str

pywikibot.version.package_versions(modules=None, builtins=False, standard_lib=None)[source]

Retrieve package version information.

When builtins or standard_lib are None, they will be included only if a version was found in the package.

Parameters:
  • modules (list of strings) – Modules to inspect
  • builtins (Boolean, or None for automatic selection) – Include builtins
  • standard_lib (Boolean, or None for automatic selection) – Include standard library packages
pywikibot.version.svn_rev_info(path)[source]

Fetch information about the current revision of a Subversion checkout.

Parameters:path – directory of the Subversion checkout
Returns:
  • tag (name for the repository),
  • rev (current Subversion revision identifier),
  • date (date of current revision),
Return type:tuple of 3 str

weblib Module

Functions for manipulating external links or querying third-party sites.

pywikibot.weblib.getInternetArchiveURL(url, timestamp=None)[source]

Return archived URL by Internet Archive.

See [[:mw:Archived Pages]] and https://archive.org/help/wayback_api.php for more details.

Parameters:
  • url – url to search an archived version for
  • timestamp – requested archive date. The version closest to that moment is returned. Format: YYYYMMDDhhmmss or part thereof.
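
A sketch with a hypothetical URL:

>>> archived = getInternetArchiveURL('http://example.org',
...                                  timestamp='20130101000000')
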
pywikibot.weblib.getWebCitationURL(url, timestamp=None)[source]

Return archived URL by Web Citation.

See http://www.webcitation.org/doc/WebCiteBestPracticesGuide.pdf for more details

Parameters:
  • url – url to search an archived version for
  • timestamp – requested archive date. The version closest to that moment is returned. Format: YYYYMMDDhhmmss or part thereof.

xmlreader Module

XML reading module.

Each XmlEntry object represents a page, as read from an XML source.

The XmlDump class reads a pages_current XML dump (like the ones offered on https://dumps.wikimedia.org/backup-index.html) and offers a generator over XmlEntry objects which can be used by other bots.

class pywikibot.xmlreader.XmlDump(filename, allrevisions=False)[source]

Bases: object

Represents an XML dump file.

Reads the local file at initialization, parses it, and offers access to the resulting XmlEntries via a generator.

Parameters:allrevisions (boolean) – If True, parse all revisions instead of only the latest one. Default: False.
parse()[source]

Generator using cElementTree iterparse function.
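
A sketch iterating over a dump (the filename is hypothetical):

>>> dump = XmlDump('enwiki-pages-articles.xml.bz2')
>>> for entry in dump.parse():
...     print(entry.title)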

class pywikibot.xmlreader.XmlEntry(title, ns, id, text, username, ipedit, timestamp, editRestriction, moveRestriction, revisionid, comment, redirect)[source]

Bases: object

Represent a page.

class pywikibot.xmlreader.XmlParserThread(filename, handler)[source]

Bases: threading.Thread

XML parser that will run as a single thread.

This allows the XmlDump generator to yield pages before the parser has finished reading the entire dump.

There surely are more elegant ways to do this.

run()[source]
pywikibot.xmlreader.parseRestrictions(restrictions)[source]

Parse the characters within a restrictions tag.

Returns strings representing user groups allowed to edit and to move a page, where None means there are no restrictions.