scripts Package

This directory holds bot scripts for the new framework.

add_text Module

This is a bot that adds text at the end of a page's content.

By default it adds the text above categories, interwiki links and the interwiki star templates.

Alternatively, it can add the text at the top of the page. These command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

Furthermore, the following command line parameters are supported:

-page             Use a page as generator

-talkpage         Put the text onto the talk page instead of the generated one
-talk             Same as -talkpage

-text             Define which text to add. "\n" is interpreted as a newline.

-textfile         Define a text file name which contains the text to add

-summary          Define the summary to use

-except           Use a regex to check if the text is already in the page

-excepturl        Check for the text in the rendered HTML page rather than
                  in the wiki source.

-newimages        Add the text to newly uploaded images

-always           If used, the bot won't ask if it should add the text
                  specified

-up               If used, put the text at the top of the page

-noreorder        Do not reorder categories and interwiki links

--- Example ---

1. This command adds a template to the top of the pages in category:catname. Warning! Keep the command on one line, otherwise it won’t work correctly.

python add_text.py -cat:catname -summary:"Bot: Adding a template" -text:"{{Something}}" -except:"{{([Tt]emplate:|)[Ss]omething" -up

2. Command used on it.wikipedia to put the template in pages without any category. Warning! Keep the command on one line, otherwise it won’t work correctly.

python add_text.py -excepturl:"class='catlinks'>" -uncat -text:"{{Categorizzare}}" -except:"{{([Tt]emplate:|)[Cc]ategorizzare" -summary:"Bot: Aggiungo template Categorizzare"

--- Credits and Help ---

This script has been written by Botwiki’s staff. If you want to help us, or you need help regarding this script, you can find us here:

* http://botwiki.sno.cc/wiki/Main_Page
scripts.add_text.add_text(page, addText, summary=None, regexSkip=None, regexSkipUrl=None, always=False, up=False, putText=True, oldTextGiven=None, reorderEnabled=True, create=False)[source]

Add text to a page.

Return type:tuple of (text, newtext, always)
scripts.add_text.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments
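A minimal sketch of calling the module function directly from Python, using the signature documented above (the page title, text and summary are illustrative):

    import pywikibot
    from scripts.add_text import add_text

    site = pywikibot.Site()
    page = pywikibot.Page(site, 'Example page')   # illustrative title
    # Per the signature above, a (text, newtext, always) tuple is returned.
    (text, newtext, always) = add_text(
        page, '{{Something}}',
        summary='Bot: Adding a template',
        regexSkip=r'\{\{([Tt]emplate:|)[Ss]omething',
        always=True)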

archivebot Module

archivebot.py - discussion page archiving bot.

usage:

python pwb.py archivebot [OPTIONS] TEMPLATE_PAGE

The bot examines backlinks (Special:WhatLinksHere) to TEMPLATE_PAGE, then goes through all those pages (unless a specific page is specified using options) and archives old discussions. This is done by breaking a page into threads, then scanning each thread for timestamps. Threads older than a specified threshold are then moved to another page (the archive), which can be named either based on the thread’s name, or the name can contain a counter which is incremented when the archive reaches a certain size.

The transcluded template may contain the following parameters:

{{TEMPLATE_PAGE
|archive =
|algo =
|counter =
|maxarchivesize =
|minthreadsleft =
|minthreadstoarchive =
|archiveheader =
|key =
}}

Meanings of parameters are:

archive Name of the page to which archived threads will be put.
Must be a subpage of the current page. Variables are supported.
algo specifies the maximum age of a thread. Must be in the form
old(<delay>) where <delay> specifies the age in hours or days like 24h or 5d. Default is old(24h)
counter The current value of a counter which can be used as a
variable. It is updated by the bot. Initial value is 1.
maxarchivesize The maximum archive size before incrementing the counter.
The value can be given with a trailing letter K or M, which indicates KByte or MByte. Default value is 1000M.
minthreadsleft Minimum number of threads that should be left on a page.
Default value is 5.
minthreadstoarchive The minimum number of threads to archive at once. Default
value is 2.
archiveheader Content that will be put on new archive pages as the
header. This parameter supports the use of variables. Default value is {{talkarchive}}
key A secret key that (if valid) allows archives to not be
subpages of the page being archived.
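A worked configuration, using the %(counter)d variable described below, might look like this (page names and values are illustrative):

    {{TEMPLATE_PAGE
    |archive = Talk:Example/Archive %(counter)d
    |algo = old(30d)
    |counter = 1
    |maxarchivesize = 200K
    |minthreadsleft = 4
    |minthreadstoarchive = 2
    |archiveheader = {{talkarchive}}
    }}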

Variables below can be used in the value for “archive” in the template above:

%(counter)d          the current value of the counter
%(year)d             year of the thread being archived
%(isoyear)d          ISO year of the thread being archived
%(isoweek)d          ISO week number of the thread being archived
%(quarter)d          quarter of the year of the thread being archived
%(month)d            month (as a number 1-12) of the thread being archived
%(monthname)s        English name of the month above
%(monthnameshort)s   first three letters of the name above
%(week)d             week number of the thread being archived

The ISO calendar starts with the Monday of the week that has at least four days in the new Gregorian year. If January 1st falls between Monday and Thursday (inclusive), the first week of that year starts on the Monday of that week, which may lie in the previous year if January 1st is not a Monday. If it falls between Friday and Sunday (inclusive), the following week is the first week of the year. So up to three days at the start of January may still be counted as part of the previous ISO year.
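Python’s datetime.date.isocalendar() implements the same rule and can be used to check a given date, for example:

    import datetime

    # 2021-01-01 fell on a Friday, so it still belongs to ISO year 2020:
    datetime.date(2021, 1, 1).isocalendar()   # (2020, 53, 5)
    # ISO week 1 of 2021 starts on Monday 2021-01-04:
    datetime.date(2021, 1, 4).isocalendar()   # (2021, 1, 1)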

Options (may be omitted):

-help              show this help message and exit
-calc:PAGE         calculate key for PAGE and exit
-file:FILE         load list of pages from FILE
-force             override security options
-locale:LOCALE     switch to locale LOCALE
-namespace:NS      only archive pages from a given namespace
-page:PAGE         archive a single PAGE, default ns is a user talk page
-salt:SALT         specify salt

exception scripts.archivebot.AlgorithmError(arg)[source]

Bases: scripts.archivebot.MalformedConfigError

Invalid specification of archiving algorithm.

exception scripts.archivebot.ArchiveSecurityError(arg)[source]

Bases: pywikibot.exceptions.Error

Page title is not a valid archive of page being archived.

The page title is neither a subpage of the page being archived, nor does it match the key specified in the archive configuration template.

class scripts.archivebot.DiscussionPage(source, archiver, params=None)[source]

Bases: pywikibot.page.Page

A class that represents a single page of discussion threads.

Feed threads to it and run an update() afterwards.

feed_thread(thread, max_archive_size=(256000, 'B'))[source]
load_page()[source]

Load the page to be archived and break it up into threads.

size()[source]
update(summary, sort_threads=False)[source]
class scripts.archivebot.DiscussionThread(title, now, timestripper)[source]

Bases: object

An object representing a discussion thread on a page.

It represents something that is of the form:

== Title of thread ==

Thread content here. ~~~~ :Reply, etc. ~~~~

feed_line(line)[source]
should_be_archived(archiver)[source]
size()[source]
to_text()[source]
exception scripts.archivebot.MalformedConfigError(arg)[source]

Bases: pywikibot.exceptions.Error

There is an error in the configuration template.

exception scripts.archivebot.MissingConfigError(arg)[source]

Bases: pywikibot.exceptions.Error

The config is missing in the header.

It’s in one of the threads or transcluded from another page.

class scripts.archivebot.PageArchiver(page, tpl, salt, force=False)[source]

Bases: object

A class that encapsulates all archiving methods.

__init__ expects a pywikibot.Page object. Execute by running the .run() method.

algo = 'none'
analyze_page()[source]
attr2text()[source]
feed_archive(archive, thread, max_archive_size, params=None)[source]

Feed the thread to one of the archives.

If it doesn’t exist yet, create it. If archive name is an empty string (or None), discard the thread (/dev/null). Also checks for security violations.

get_attr(attr, default='')[source]
key_ok()[source]
load_config()[source]
run()[source]
saveables()[source]
set_attr(attr, value, out=True)[source]
class scripts.archivebot.TZoneUTC[source]

Bases: datetime.tzinfo

Class building a UTC tzinfo object.

dst(dt)[source]
tzname(dt)[source]
utcoffset(dt)[source]
scripts.archivebot.generate_transclusions(site, template, namespaces=[])[source]
scripts.archivebot.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments
scripts.archivebot.str2localized_duration(site, string)[source]

Localise a shorthand duration.

Translates a duration written in shorthand notation (e.g. “24h”, “7d”) into an expression in the local language of the wiki (“24 hours”, “7 days”).

scripts.archivebot.str2size(string)[source]

Return a size for a shorthand size.

Accepts a string defining a size:

1337 - 1337 bytes
150K - 150 kilobytes
2M   - 2 megabytes

Returns a tuple (size, unit), where size is an integer and unit is ‘B’ (bytes) or ‘T’ (threads).

scripts.archivebot.str2time(string)[source]

Return a timedelta for a shorthand duration.

Accepts a string defining a time period:

7d  - 7 days
36h - 36 hours

Returns the corresponding timedelta object.
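A short usage sketch based on the documented behaviour (the exact byte counts returned depend on the kilo/mega factor the implementation uses):

    from scripts.archivebot import str2size, str2time

    str2time('36h')    # datetime.timedelta spanning 36 hours
    str2time('7d')     # datetime.timedelta spanning 7 days
    str2size('1337')   # (1337, 'B')
    str2size('150K')   # 150 kilobytes, returned with unit 'B'
    str2size('2M')     # 2 megabytes, returned with unit 'B'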

basic Module

An incomplete sample script.

This is not a complete bot; rather, it is a template from which simple bots can be made. You can rename it to mybot.py, then edit it in whatever way you want.

The following parameters are supported:

This script supports use of pywikibot.pagegenerators arguments.

-dry If given, doesn’t make any real changes, but only shows what would have been changed.
class scripts.basic.BasicBot(generator, dry)[source]

Bases: object

An incomplete sample bot.

load(page)[source]

Load the text of the given page.

run()[source]

Process each page from the generator.

save(text, page, comment=None, minorEdit=True, botflag=True)[source]

Update the given page with new text.

treat(page)[source]

Load the given page, make some changes, and save it.
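A minimal sketch of how the template is meant to be adapted, using the methods documented above (the replacement logic and page generator are purely illustrative):

    import pywikibot
    from pywikibot import pagegenerators
    from scripts.basic import BasicBot

    class MyBot(BasicBot):

        def treat(self, page):
            text = self.load(page)                     # fetch current wikitext
            if not text:
                return
            newtext = text.replace('colour', 'color')  # illustrative change
            if newtext != text:
                self.save(newtext, page, comment='Bot: example replacement')

    site = pywikibot.Site()
    gen = pagegenerators.AllpagesPageGenerator(site=site, total=5)
    MyBot(gen, dry=True).run()                         # dry=True: only show changes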

scripts.basic.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

blockpageschecker Module

A bot to remove stale protection templates from pages that are not protected.

Very often sysops protect pages for a set time but then forget to remove the warning template! This script is useful if you want to remove those stale warnings left on these pages.

Parameters:

These command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

-xml Retrieve information from a local XML dump (pages-articles or pages-meta-current, see https://download.wikimedia.org). Argument can also be given as “-xml:filename”.
-protectedpages: Check all the protected pages; useful when you have no
categories or when you have problems with them. (Add the namespace after ”:” to check only that namespace; by default all protected pages are checked.)

-moveprotected: Same as -protectedpages, for moveprotected pages

Furthermore, the following command line parameters are supported:

-always         Don't ask every time whether the bot should make the change;
                always do it.

-show           When the bot can't delete the template from the page (wrong
                regex or something like that) it will ask you whether it
                should show the page in your browser.
                (attention: pages included may give false positives!)

-move           The bot will also check whether the page is protected against
                moves, not only against edits

--- Example of how to use the script ---

 python blockpageschecker.py -always

 python blockpageschecker.py -cat:Geography -always

 python blockpageschecker.py -show -protectedpages:4
scripts.blockpageschecker.main(*args)[source]

Process command line arguments and perform task.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments
scripts.blockpageschecker.showQuest(page)[source]
scripts.blockpageschecker.understandBlock(text, TTP, TSP, TSMP, TTMP, TU)[source]

Understand if the page is blocked and if it has the right template.

blockreview Module

This bot implements a blocking review process for de-wiki first.

For other sites this bot script must be changed.

This script is run by [[de:User:xqt]]. It should not be run by other users without prior contact.

The following parameters are supported:

-
class scripts.blockreview.BlockreviewBot(dry=False)[source]

Bases: object

Block review bot.

SysopGenerator()[source]
getInfo(user)[source]
load(page)[source]

Load the given page and return the page text.

msg_admin = {'de': 'Bot-Benachrichtigung: Sperrprüfungswunsch von [[%(user)s]]'}
msg_done = {'de': 'Bot: Sperrprüfung abgeschlossen. Benutzer ist entsperrt.'}
msg_user = {'de': 'Bot: Administrator [[Benutzer:%(admin)s|%(admin)s]] für Sperrprüfung benachrichtigt'}
note_admin = {'de': "\n\n== Sperrprüfungswunsch ==\nHallo %(admin)s,\n\n[[%(user)s]] wünscht die Prüfung seiner/ihrer Sperre vom %(time)s über die Dauer von %(duration)s. Kommentar war ''%(comment)s''. Bitte äußere Dich dazu auf der [[%(usertalk)s#%(section)s|Diskussionsseite]]. -~~~~"}
note_project = {'de': "\n\n== [[%(user)s]] ==\n* gesperrt am %(time)s durch {{Benutzer|%(admin)s}} für eine Dauer von %(duration)s.\n* Kommentar war ''%(comment)s''.\n* [[Benutzer:%(admin)s]] wurde [[Benutzer Diskussion:%(admin)s#Sperrprüfungswunsch|benachrichtigt]].\n* [[%(usertalk)s#%(section)s|Link zur Diskussion]]\n:<small>-~~~~</small>\n;Antrag entgegengenommen"}
project_name = {'de': 'Benutzer:TAXman/Sperrprüfung Neu', 'pt': 'Wikipedia:Pedidos a administradores/Discussão de bloqueio'}
review_cat = {'de': 'Wikipedia:Sperrprüfung'}
run()[source]
save(text, page, comment, minorEdit=True, botflag=True)[source]
treat(userPage)[source]

Load the given page, make some changes, and save it.

unblock_tpl = {'de': 'Benutzer:TAXman/Sperrprüfungsverfahren', 'pt': 'Predefinição:Discussão de bloqueio'}
scripts.blockreview.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

capitalize_redirects Module

Bot to create capitalized redirects.

It creates redirects where the first character of the first word is uppercase and the remaining characters and words are lowercase.

Command-line arguments:

This script supports use of pywikibot.pagegenerators arguments.

-always Don’t prompt to make changes, just do them.
-titlecase creates a titlecased redirect version of a given page where all words of the title start with an uppercase character and the remaining characters are lowercase.

Example: “python capitalize_redirects.py -start:B -always”

class scripts.capitalize_redirects.CapitalizeBot(generator, **kwargs)[source]

Bases: pywikibot.bot.Bot

Capitalization Bot.

treat(page)[source]
scripts.capitalize_redirects.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

casechecker Module

Bot to find all pages on the wiki with mixed Latin and Cyrillic alphabets.

class scripts.casechecker.CaseChecker[source]

Bases: object

Case checker.

AddNoSuggestionTitle(title)[source]
AppendLineToLog(filename, text)[source]
ColorCodeWord(word, toScreen=False)[source]
FindBadWords(title)[source]
MakeMoveSummary(fromTitle, toTitle)[source]
OpenLogFile(filename)[source]
Page(title)[source]
PickTarget(title, original, candidates)[source]
ProcessDataBlock(data)[source]
ProcessTitle(title)[source]
PutNewPage(pageObj, pageTxt, msg)[source]
Run()[source]
RunQuery(params)[source]
WikiLog(text)[source]
alwaysInLatin = ['II', 'III']
alwaysInLocal = ['СССР', 'Как', 'как']
apfrom = ''
aplimit = None
autonomous = False
doFailed = False
failedTitles = 'failedTitles.txt'
filterredir = 'nonredirects'
latClrFnt = '<font color=brown>'
latLtr = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
latinKeyboard = 'qwertyuiopasdfghjklzxcvbnm'
latinSuspects = 'ABEKMHOPCTXIËÏaeopcyxiëï'
lclClrFnt = '<font color=green>'
localKeyboard = 'йцукенгшщзфывапролдячсмить'
localLowerLtr = 'ёіїўабвгдежзийклмнопрстуфхцчшщъыьэюяґ'
localLtr = 'ЁІЇЎАБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯҐёіїўабвгдежзийклмнопрстуфхцчшщъыьэюяґ'
localSuspects = 'АВЕКМНОРСТХІЁЇаеорсухіёї'
localUpperLtr = 'ЁІЇЎАБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯҐ'
namespaces = []
nosuggestions = 'nosuggestions.txt'
replace = False
romanNumChars = 'IVXLMC'
romanNumSfxPtrn = re.compile('^[IVXLMC]+[ёіїўабвгдежзийклмнопрстуфхцчшщъыьэюяґ]+$')
romannumSuffixes = 'ёіїўабвгдежзийклмнопрстуфхцчшщъыьэюяґ'
stopAfter = -1
stripChars = ' \t,'
suffixClr = '</font>'
title = None
titleList = None
titles = True
whitelists = {'ru': 'ВП:КЛ/Проверенные'}
wikilog = None
wikilogfile = 'wikilog.txt'
wordBreaker = re.compile('[ _\\-/\\|#[\\]():]')
scripts.casechecker.SetColor(color)[source]
scripts.casechecker.xuniqueCombinations(items, n)[source]

catall Module

Add or change categories on a number of pages.

Usage: catall.py [start]

Provides the categories on the page and asks whether to change them.

If no starting name is provided, the bot starts at ‘A’.

Options:
-onlynew : Only run on pages that do not yet have a category.
scripts.catall.choosecats(pagetext)[source]
scripts.catall.main(*args)[source]

Process command line arguments and perform task.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments
scripts.catall.make_categories(page, list, site=None)[source]

category Module

Scripts to manage categories.

Syntax: python category.py action [-option]

where action can be one of these:
  • add - mass-add a category to a list of pages
  • remove - remove category tag from all pages in a category
  • move - move all pages in a category to another category
  • tidy - tidy up a category by moving its articles into subcategories
  • tree - show a tree of subcategories of a given category
  • listify - make a list of all of the articles that are in a category

and option can be one of these:

Options for “add” action:
  • -person - sort persons by their last name
  • -create - If a page doesn’t exist, do not skip it, create it instead
  • -redirect - Follow redirects

If action is “add”, the following options are supported:

This script supports use of pywikibot.pagegenerators arguments.

Options for “listify” action:
  • -overwrite - This overwrites the current page with the list even if something is already there.
  • -showimages - This displays images rather than linking them in the list.
  • -talkpages - This outputs the links to talk pages of the pages to be listified in addition to the pages themselves.

Options for “remove” action:
  • -nodelsum - This specifies not to use the custom edit summary as the deletion reason. Instead, it uses the default deletion reason for the language, which is “Category was disbanded” in English.

Options for “move” action:
  • -hist - Creates a nice wikitable on the talk page of the target category that contains detailed page history of the source category.
  • -nodelete - Don’t delete the old category after the move
  • -nowb - Don’t update the wikibase repository
  • -allowsplit - If this option is not set, the talk page and main page are only moved together.
  • -mvtogether - Only move the pages/subcategories of a category if the target page (and talk page, if -allowsplit is not set) doesn’t exist.

Options for several actions:
  • -rebuild - reset the database
  • -from: - The category to move from (for the move option). Also the category to remove from in the remove option, and the category to make a list of in the listify option.
  • -to: - The category to move to (for the move option). Also the name of the list to make in the listify option.

    NOTE: If the category names have spaces in them you may need to use a special syntax in your shell so that the names aren’t treated as separate parameters. For instance, in BASH, use single quotes, e.g. -from:'Polar bears'

  • -batch - Don’t prompt to delete emptied categories (do it automatically).
  • -summary: - Pick a custom edit summary for the bot.
  • -inplace - Use this flag to change categories in place rather than rearranging them.
  • -recurse - Recurse through all subcategories of categories.
  • -pagesonly - While removing pages from a category, keep the subpage links and do not remove them.
  • -match - Only work on pages whose titles match the given regex (for move and remove actions).
  • -depth: - The maximum depth beyond which no subcategories will be listed.

For the actions tidy and tree, the bot will store the category structure locally in category.dump. This saves time and server load, but if it uses these data later, they may be outdated; use the -rebuild parameter in this case.

For example, to create a new category from a list of persons, type:

python category.py add -person

and follow the on-screen instructions.

Or to do it all from the command-line, use the following syntax:

python category.py move -from:US -to:'United States'

This will move all pages in the category US to the category United States.

category_redirect Module

This bot will move pages out of redirected categories.

Usage: category_redirect.py [options]

The bot will look for categories that are marked with a category redirect template, take the first parameter of the template as the target of the redirect, and move all pages and subcategories of the category there. It also changes hard redirects into soft redirects, and fixes double redirects. A log is written under <userpage>/category_redirect_log. Only category pages that haven’t been edited for a certain cooldown period (currently 7 days) are taken into account.

class scripts.category_redirect.CategoryRedirectBot[source]

Bases: object

Page category update bot.

get_log_text()[source]

Rotate log text and return the most recent text.

move_contents(oldCatTitle, newCatTitle, editSummary)[source]

The worker function that moves pages out of oldCat into newCat.

readyToEdit(cat)[source]

Return True if cat not edited during cooldown period, else False.

run()[source]

Run the bot.

scripts.category_redirect.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

cfd Module

This script processes the Categories for discussion working page.

It parses out the actions that need to be taken as a result of CFD discussions (as posted to the working page by an administrator) and performs them.

Syntax: python cfd.py

class scripts.cfd.ReCheck[source]

Bases: object

Helper class.

check(pattern, text)[source]
scripts.cfd.findDay(pageTitle, oldDay)[source]
scripts.cfd.main(*args)[source]

Process command line arguments and perform task.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

checkimages Module

Script to check recently uploaded files.

This script checks if a file description is present and if there are other problems in the image’s description.

This script will have to be configured for each language. Please submit translations as addition to the Pywikibot framework.

Everything that needs customisation is indicated by comments.

This script understands the following command-line arguments:

-limit              The number of images to check (default: 80)

-commons            The bot will check whether an image with the same name
                    exists on Commons and, if so, report the image.

-duplicates[:#]     Check whether the image has duplicates (if an argument is
                    given, it sets how many rollbacks to wait before reporting
                    the image in the report instead of tagging it);
                    default: 1 rollback.

-duplicatesreport   Report the duplicates in a log *AND* put the template in
                    the images.

-sendemail          Send an email after tagging.

-break              Stop the bot after the first check (default: run
                    recursively)

-time[:#]           Time in seconds between repeat runs (default: 30)

-wait[:#]           Wait x seconds before checking the images (default: 0)
                    NOT YET IMPLEMENTED

-skip[:#]           The bot skips the first [:#] images (default: 0)

-start[:#]          Use allpages() as generator
                    (it starts from File:[:#])

-cat[:#]            Use a category as generator

-regex[:#]          Use a regex; must be used with -url or -page

-page[:#]           Define the name of the wiki page containing the images

-url[:#]            Define the URL containing the images

-nologerror         If given, this option will disable the error that is
                    raised when the log is full.

---- Instructions for the real-time settings  ----
  • For every new block you have to add:

<------- ------->

In this way the Bot can understand where the block starts in order to take the right parameter.

  • Name= Set the name of the block

  • Find= Use it to define what to search for in the text of the image’s
    description, while Findonly= searches only for the exact text that you
    give in the image’s description.

  • Summary= The summary that the bot will use when it notifies the problem.

  • Head= The heading that the bot will use for the message.

  • Text= The template that the bot will use when it reports the image’s
    problem.
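Putting these together, a settings block on the wiki might look like this (the field values are illustrative, not taken from a real configuration):

    <------- ------->
    Name= Files without a source
    Find= {{no source
    Summary= Bot: Notifying uploader about a file without a source
    Head= == File without a source ==
    Text= Please add a source for the file that you uploaded. ~~~~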

---- Known issues/FIXMEs ----
  • Clean the code, some passages are pretty difficult to understand if you’re not the coder.
  • Add the “catch the language” function for commons.
  • Fix and reorganise the new documentation
  • Add a report for the image tagged.
exception scripts.checkimages.LogIsFull(arg)[source]

Bases: pywikibot.exceptions.Error

Log is full and the Bot cannot add other data to prevent Errors.

exception scripts.checkimages.NothingFound(arg)[source]

Bases: pywikibot.exceptions.Error

Regex returned [] instead of results.

class scripts.checkimages.checkImagesBot(site, logFulNumber=25000, sendemailActive=False, duplicatesReport=False, logFullError=True)[source]

Bases: object

A robot to check recently uploaded files.

checkImageDuplicated(duplicates_rollback)[source]

Function to check the duplicated files.

checkImageOnCommons()[source]

Checking if the file is on commons.

checkStep()[source]
convert_to_url(page)[source]

Return the page title in a form suitable for a URL.

countEdits(pagename, userlist)[source]

Function to count the edit of a user or a list of users in a page.

findAdditionalProblems()[source]
isTagged()[source]

Understand if a file is already tagged or not.

load(raw)[source]

Load a list of objects from a string using regex.

loadHiddenTemplates()[source]

Function to load the white templates.

load_licenses()[source]

Load the list of the licenses.

miniTemplateCheck(template)[source]

Check if template is in allowed licenses or in licenses to skip.

put_mex_in_talk()[source]

Function to put the warning in talk page of the uploader.

regexGenerator(regexp, textrun)[source]

Find page to yield using regex to parse text.

report(newtext, image_to_report, notification=None, head=None, notification2=None, unver=True, commTalk=None, commImage=None)[source]

Function to make the reports easier.

report_image(image_to_report, rep_page=None, com=None, rep_text=None, addings=True, regex=None)[source]

Report the files to the report page when needed.

returnOlderTime(listGiven, timeListGiven)[source]

Get some time and return the oldest of them.

setParameters(imageName)[source]

Set parameters.

Currently this is only used for images, but maybe it can be used for other things in the future.

skipImages(skip_number, limit)[source]

Given a number of files, skip the first -number- files.

smartDetection()[source]

Detect templates.

Instead of merely checking whether a template is present in the image’s description, the bot also checks whether that template is a license or something else. In this sense this type of check is smart.

tag_image(put=True)[source]

Add template to the Image page and find out the uploader.

takesettings()[source]

Function to take the settings from the wiki.

templateInList()[source]

Check if template is in list.

The problem is that calls to the MediaWiki system can be pretty slow, while searching in a list of objects is really fast. So first of all let’s see if we can find something in the info that we already have, then make a deeper check.

uploadBotChangeFunction(reportPageText, upBotArray)[source]

Detect the user that has uploaded the file through the upload bot.

wait(waitTime, generator, normal, limit)[source]

Skip the images uploaded before x seconds.

Give the users time to fix the image’s problems themselves during the first x seconds.

scripts.checkimages.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments
scripts.checkimages.printWithTimeZone(message)[source]

Print the messages followed by the TimeZone encoded correctly.

claimit Module

A script that adds claims to Wikidata items based on categories.

Usage:

python claimit.py [pagegenerators] P1 Q2 P123 Q456

You can use any typical pagegenerator to provide a list of pages. Then list the property->target pairs to add.

For geographic coordinates:

python claimit.py [pagegenerators] P625 [lat-dec],[long-dec],[prec]

[lat-dec] and [long-dec] represent the latitude and longitude respectively, and [prec] represents the precision. All values are in decimal degrees, not DMS. If [prec] is omitted, the default precision is 0.0001 degrees.

Example:

 python claimit.py [pagegenerators] P625 -23.3991,-52.0910,0.0001

------------------------------------------------------------------------------

By default, claimit.py does not add a claim if one with the same property already exists on the page. To override this behavior, use the ‘exists’ option:

python claimit.py [pagegenerators] P246 "string example" -exists:p

Suppose the claim you want to add has the same property as an existing claim and the “-exists:p” argument is used. Now, claimit.py will not add the claim if it has the same target, sources, and/or qualifiers as the existing claim. To override this behavior, add ‘t’ (target), ‘s’ (sources), or ‘q’ (qualifiers) to the ‘exists’ argument.

For instance, to add the claim to each page even if one with the same property, target, and qualifiers already exists:

python claimit.py [pagegenerators] P246 "string example" -exists:ptq

Note that the ordering of the letters in the ‘exists’ argument does not matter, but ‘p’ must be included.

class scripts.claimit.ClaimRobot(generator, claims, exists_arg='')[source]

Bases: pywikibot.bot.WikidataBot

A bot to add Wikidata claims.

treat(page, item)[source]

Treat each page.

scripts.claimit.listsEqual(list1, list2)[source]

Return true if the lists are probably equal, ignoring order.

Works for lists of unhashable items (like dictionaries).

scripts.claimit.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

clean_sandbox Module

This bot resets a (user) sandbox with predefined text.

This script understands the following command-line arguments:

-hours:#       Use this parameter to make the script repeat itself
                after # hours. Hours can be defined as a decimal: 0.01
                hours is 36 seconds; 0.1 is 6 minutes.

-delay:#       Use this parameter for a wait time after the last edit
                was made. If no parameter is given it takes it from
                hours and limits it between 5 and 15 minutes.
                The minimum delay time is 5 minutes.

-user           Use this parameter to run the script in the user
                namespace.
                > ATTENTION: on most wikis THIS IS FORBIDDEN FOR BOTS! <
                > (please talk with your admin first)                  <
                Since it is considered bad style to edit user pages
                without permission, the 'user_sandboxTemplate' for the
                given language has to be set up (no fallback will be
                used). All pages containing that template will get
                cleaned. Please also be aware that the rules for when
                to clean the user sandbox differ from those for the
                project sandbox.

-page          Run the bot on a specific page; you can use this when
                you haven't configured clean_sandbox for your wiki.

-text          The text to substitute into the sandbox; you can use this
                when you haven't configured clean_sandbox for your wiki.

-summary       Summary of the edit made by bot.
class scripts.clean_sandbox.SandboxBot(**kwargs)[source]

Bases: pywikibot.bot.Bot

Sandbox reset bot.

availableOptions = {'delay_td': None, 'user': False, 'no_repeat': True, 'summary': '', 'text': '', 'page': None, 'hours': 1, 'delay': None}
run()[source]

Run bot.

scripts.clean_sandbox.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

commonscat Module

With this tool you can add the template {{commonscat}} to categories.

The tool works by following the interwiki links. If the template is present on another language page, the bot will use it.

You could probably use it on articles as well, but this hasn’t been tested.

This bot uses pagegenerators to get a list of pages. The following options are supported:

This script supports use of pywikibot.pagegenerators arguments.

-always Don’t prompt for each replacement; the warning message does not have to be confirmed. ATTENTION: Use this with care!
-summary:XYZ Set the action summary message for the edit to XYZ,
otherwise it uses messages from add_text.py as default.
-checkcurrent Work on all category pages that use the primary commonscat template.

For example, to go through all categories: commonscat.py -start:Category:!

class scripts.commonscat.CommonscatBot(generator, always, summary=None)[source]

Bases: pywikibot.bot.Bot

Commons categorisation bot.

addCommonscat(page)[source]

Add CommonsCat template to page.

Take a page. Go through all the interwiki pages looking for a commonscat template. When all the interwiki links are checked and a proper category is found, add it to the page.

changeCommonscat(page=None, oldtemplate='', oldcat='', newtemplate='', newcat='', linktitle='', description='')[source]

Change the current commonscat template and target.

Return the name of a valid commons category.

If the page is a redirect, this function tries to follow it. If the page doesn’t exist, the function will return an empty string.

Find CommonsCat template on interwiki pages.

In Pywikibot 2.0, page.interwiki() now returns Link objects, not Page objects

Return type:unicode, name of a valid commons category

Find CommonsCat template on page.

Return type:tuple of (<templatename>, <target>, <linktext>, <note>)
classmethod getCommonscatTemplate(code=None)[source]

Get the template name of a site. Expects the site code.

Returns a tuple containing the primary template and its alternatives.

skipPage(page)[source]

Determine if the page should be skipped.

treat(page)[source]

Load the given page, do some changes, and save it.

scripts.commonscat.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

coordinate_import Module

Coordinate importing script.

Usage:

python coordinate_import.py -lang:en -family:wikipedia -cat:Category:Coordinates_not_on_Wikidata

This will work on all pages in the category “coordinates not on Wikidata” and will import the coordinates on these pages to Wikidata.

The data from the “GeoData” extension (https://www.mediawiki.org/wiki/Extension:GeoData) is used, so that extension has to be set up properly. You can look at the [[Special:Nearby]] page on your local wiki to see whether it’s populated.

You can use any typical pagegenerator to provide a list of pages:

python coordinate_import.py -lang:it -family:wikipedia -transcludes:Infobox_stazione_ferroviaria -namespace:0

This script supports use of pywikibot.pagegenerators arguments.

class scripts.coordinate_import.CoordImportRobot(generator)[source]

Bases: pywikibot.bot.WikidataBot

A bot to import coordinates to Wikidata.

has_coord_qualifier(claims)[source]

Check if self.prop is used as property for a qualifier.

Parameters:claims (dict) – the Wikibase claims to check in
Returns:the first property for which self.prop is used as qualifier, or None if there is none
Return type:unicode or None
treat(page, item)[source]

Treat page/item.

scripts.coordinate_import.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

cosmetic_changes Module

This module can do slight modifications to tidy a wiki page’s source code.

The changes are not supposed to change the look of the rendered wiki page.

The following parameters are supported:

This script supports use of pywikibot.pagegenerators arguments.

-always Don’t prompt you for each replacement. Warning (see below) has not to be confirmed. ATTENTION: Use this with care!
-async Put page on queue to be saved to wiki asynchronously.
-summary:XYZ Set the summary message text for the edit to XYZ, bypassing
the predefined message texts with original and replacements inserted.
-ignore: Ignores an error if one occurred, and either skips the page or
only that method. It can be set to ‘page’ or ‘method’.

&warning;

For regular use, it is recommended to put this line into your user-config.py:

cosmetic_changes = True

You may enable cosmetic changes for additional languages by adding the dictionary cosmetic_changes_enable to your user-config.py. It should contain a tuple of languages for each site where you wish to enable them, in addition to your own language, if cosmetic_changes_mylang_only is True (see below). Please set your dictionary by adding such lines to your user-config.py:

cosmetic_changes_enable['wikipedia'] = ('de', 'en', 'fr')

There is another config variable: You can set

cosmetic_changes_mylang_only = False

if you’re running a bot on multiple sites and want to do cosmetic changes on all of them, but be careful if you do.

You may disable cosmetic changes by adding all unwanted languages to the dictionary cosmetic_changes_disable in your user-config.py. It should contain a tuple of languages for each site where you wish to disable cosmetic changes. You may use it when cosmetic_changes_mylang_only is False, but you can also disable your own language. This also overrides the settings in the dictionary cosmetic_changes_enable. Please set this dictionary by adding such lines to your user-config.py:

cosmetic_changes_disable['wikipedia'] = ('de', 'en', 'fr')

You may disable cosmetic changes for a given script by appending all unwanted scripts to the list cosmetic_changes_deny_script in your user-config.py. By default it contains cosmetic_changes.py itself and touch.py. This overrides all other enabling settings for cosmetic changes. Please modify the given list by adding such lines to your user-config.py:

cosmetic_changes_deny_script.append('your_script_name_1')

or by adding a list to the given one:

cosmetic_changes_deny_script += ['your_script_name_1', 'your_script_name_2']
class scripts.cosmetic_changes.CosmeticChangesBot(generator, **kwargs)[source]

Bases: pywikibot.bot.Bot

Cosmetic changes bot.

treat(page)[source]
class scripts.cosmetic_changes.CosmeticChangesToolkit(site, diff=False, redirect=False, namespace=None, pageTitle=None, ignore=False, debug='[deprecated name of diff]')[source]

Bases: object

Cosmetic changes toolkit.

change(text)[source]

Execute all clean up methods and catch errors if activated.

cleanUpSectionHeaders(text)[source]

Add a space between the equal signs and the section title.

Example: ==Section title== becomes == Section title ==

NOTE: This space is recommended in the syntax help on the English and German Wikipedia. It might be that it is not wanted on other wikis. If there are any complaints, please file a bug report.

commonsfiledesc(text)[source]
fixArabicLetters(text)[source]
fixHtml(text)[source]
fixReferences(text)[source]
fixSelfInterwiki(text)[source]

Interwiki links to the site itself are displayed like local links.

Remove their language code prefix.

fixStyle(text)[source]
fixSyntaxSave(text)[source]
fixTypo(text)[source]
static isbn_execute(text)[source]

Hyphenate ISBN numbers and catch ‘InvalidIsbnException’.

putSpacesInLists(text)[source]

Add a space between the * or # and the text.

NOTE: This space is recommended in the syntax help on the English, German, and French Wikipedia. It might be that it is not wanted on other wikis. If there are any complaints, please file a bug report.

removeNonBreakingSpaceBeforePercent(text)[source]

Remove a non-breaking space between number and percent sign.

Newer MediaWiki versions automatically place a non-breaking space in front of a percent sign, so it is no longer required to place it manually.

FIXME: which version should this be run on?

removeUselessSpaces(text)[source]
replaceDeprecatedTemplates(text)[source]
resolveHtmlEntities(text)[source]
safe_execute(method, text)[source]

Execute the method and catch exceptions if enabled.

standardizePageFooter(text)[source]

Standardize page footer.

Makes sure that interwiki links, categories and star templates are put in the correct position and into the right order. This combines the old instances standardizeInterwiki and standardizeCategories. The page footer has the following sections in this sequence:

1. categories
2. ## TODO: template beyond categories ##
3. additional information depending on local site policy
4. star templates for featured and good articles
5. interwiki links

translateAndCapitalizeNamespaces(text)[source]

Use localized namespace names.

translateMagicWords(text)[source]

Use localized magic words.

validXhtml(text)[source]
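A minimal sketch of applying the toolkit directly to a string of wikitext, using the constructor and change() method documented above (the site and sample text are illustrative):

    import pywikibot
    from scripts.cosmetic_changes import CosmeticChangesToolkit

    site = pywikibot.Site('en', 'wikipedia')
    toolkit = CosmeticChangesToolkit(site, namespace=0, pageTitle='Example')
    cleaned = toolkit.change('==Section title==')
    # cleanUpSectionHeaders() should turn this into '== Section title =='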
scripts.cosmetic_changes.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

create_categories Module

Program to batch create categories.

The program expects a generator containing a list of page titles to be used as base.

The following command line parameters are supported:

-always         Don't ask, just do the edit.

-overwrite      (not implemented yet).

-parent         The name of the parent category.

-basename       The base to be used for the new category names.

Example:

create_categories.py -lang:commons -family:commons -links:User:Multichill/Wallonia -parent:"Cultural heritage monuments in Wallonia" -basename:"Cultural heritage monuments in"
class scripts.create_categories.CreateCategoriesBot(generator, parent, basename, **kwargs)[source]

Bases: pywikibot.bot.Bot

Category creator bot.

create_category(page)[source]
run()[source]
scripts.create_categories.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

data_ingestion Module

A generic bot to do data ingestion (batch uploading) to Commons.

scripts.data_ingestion.CSVReader(fileobj, urlcolumn, *args, **kwargs)[source]

CSV reader.

class scripts.data_ingestion.DataIngestionBot(reader, titlefmt, pagefmt, site=Site("commons", "commons"))[source]

Bases: object

Data ingestion bot.

doSingle()[source]
run()[source]
class scripts.data_ingestion.Photo(URL, metadata)[source]

Bases: object

Represents a Photo (or other file), with metadata, to upload to Commons.

The constructor takes two parameters: URL (string) and metadata (dict with str:str key:value pairs) that can be referred to from the title & template generation.

downloadPhoto()[source]

Download the photo and store it in a io.BytesIO object.

TODO: Add exception handling

findDuplicateImages(site=Site("commons", "commons"))[source]

Find duplicates of the photo.

Calculates the SHA1 hash and asks the MediaWiki api for a list of duplicates.

TODO: Add exception handling, fix site thing

getDescription(template, extraparams={})[source]

Generate a description for a file.

getTitle(fmt)[source]

Populate format string with %(name)s entries using metadata.

Parameters:fmt (unicode) – format string
Returns:formatted string
Return type:unicode
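A sketch of the title generation described above (the URL and metadata are illustrative):

    from scripts.data_ingestion import Photo

    photo = Photo('http://example.org/photo.jpg',
                  {'name': 'Example', 'author': 'Somebody'})
    # getTitle() fills the %(...)s placeholders from the metadata dict:
    photo.getTitle('%(name)s by %(author)s.jpg')  # 'Example by Somebody.jpg'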

delete Module

This script can be used to delete and undelete pages en masse.

Of course, you will need an admin account on the relevant wiki.

These command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

Furthermore, the following command line parameters are supported:

-always:          Don't prompt to delete pages, just do it.

-summary:         Supply a custom edit summary.

-undelete:        Actually undelete pages instead of deleting.
                  Obviously makes sense only with -page and -file.

Usage: python delete.py [-category categoryName]

Examples:

Delete everything in the category “To delete” without prompting.

python delete.py -cat:"To delete" -always
class scripts.delete.DeletionRobot(generator, summary, **kwargs)[source]

Bases: pywikibot.bot.Bot

This robot allows deletion of pages en masse.

treat(page)[source]

Delete one page from the generator.

scripts.delete.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

disambredir Module

User assisted updating redirect links on disambiguation pages.

Usage:
python disambredir.py [start]

If no starting name is provided, the bot starts at ‘!’.

scripts.disambredir.firstcap(string)[source]
scripts.disambredir.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments
scripts.disambredir.treat(text, linkedPage, targetPage)[source]

Based on the method of the same name in solve_disambiguation.py.

scripts.disambredir.workon(page, links)[source]

editarticle Module

Edit a Wikipedia article with your favourite editor.

TODO:
  • non existing pages
  • edit conflicts
  • minor edits
  • watch/unwatch
  • ...
class scripts.editarticle.ArticleEditor(*args)[source]

Bases: object

Edit a wiki page.

handle_edit_conflict(new)[source]
run()[source]
set_options(*args)[source]

Parse commandline and set options attribute.

setpage()[source]

Set page and page title.

scripts.editarticle.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

featured Module

Manage featured/good article/list status template.

This script understands various command-line arguments:

Task commands:

-featured         use this script for featured articles. Default task if no task
                  command is specified

-good             use this script for good articles.

-lists            use this script for featured lists.

-former           use this script for removing {{Link FA|xx}} from former
                  featured articles

                  NOTE: you may have all of these commands in one run

Option commands:

-interactive:     ask before changing each page

-nocache          don't use the cache file to remember whether an article
                  was already verified.

-nocache:xx,yy    you may ignore language codes xx,yy,... from the cache file

-fromlang:xx,yy   xx,yy,zz,.. are the languages to be verified.
-fromlang:ar--fi  a range of languages may also be given

-fromall          to verify all languages.

-tolang:xx,yy     xx,yy,zz,.. are the languages to be updated

-after:zzzz       process pages after and including page zzzz
                  (sorry, not implemented yet)

-side             use -side if you want to move all {{Link FA|lang}} next to the
                  corresponding interwiki links. Default is placing
                  {{Link FA|lang}} on top of the interwiki links.
                  (This option is deprecated with wikidata)

-count            Only counts how many featured/good articles exist
                  on the given languages (when using the "-fromlang"
                  argument) or on all wikis (when using the "-fromall"
                  argument).
                  Example: featured.py -fromlang:en,he -count
                  counts how many featured articles exist in the en and he
                  wikipedias.

-quiet            no corresponding pages are displayed.
scripts.featured.BACK(site, name, hide)[source]
scripts.featured.CAT(site, name, hide)[source]
scripts.featured.DATA(site, name, hide)[source]
class scripts.featured.FeaturedBot(**kwargs)[source]

Bases: pywikibot.bot.Bot

Featured article bot.

add_template(source, dest, task, fromsite)[source]

Place or remove the Link_GA/FA template on/from a page.

featuredArticles(site, task, cache)[source]
featuredWithInterwiki(fromsite, task)[source]

Read featured articles and find the corresponding pages.

Find corresponding pages on other sites, place the template and remember the page in the cache dict.

findTranslated(page, oursite=None)[source]
getTemplateList(code, task)[source]
hastemplate(task)[source]
itersites(task)[source]

Generator for site codes to be processed.

readcache(task)[source]
run()[source]
run_task(task)[source]
treat(fromsite, task)[source]
writecache()[source]
scripts.featured.TMPL(site, name, hide)[source]
scripts.featured.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

fixing_redirects Module

Correct all redirect links in featured pages or only one page of each wiki.

Can be used with: This script supports use of pywikibot.pagegenerators arguments.

-featured Run over featured pages

Run fixing_redirects.py -help to see all the command-line options -file, -ref, -links, ...

scripts.fixing_redirects.firstcap(string)[source]
scripts.fixing_redirects.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments
scripts.fixing_redirects.treat(text, linkedPage, targetPage)[source]

Based on the method of the same name in solve_disambiguation.py.

scripts.fixing_redirects.workon(page)[source]

flickrripper Module

freebasemappingupload Module

Script to upload the mappings of Freebase to Wikidata.

It can easily be adapted to upload other string identifiers as well.

This bot needs the dump from https://developers.google.com/freebase/data#freebase-wikidata-mappings

The script takes a single parameter:

-filename: the filename to read the freebase-wikidata mappings from;
          default: fb2w.nt.gz
class scripts.freebasemappingupload.FreebaseMapperRobot(filename)[source]

Bases: object

Freebase Mapping bot.

processLine(line)[source]
run()[source]
scripts.freebasemappingupload.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

harvest_template Module

Template harvesting script.

Usage:

python harvest_template.py -transcludes:"..." template_parameter PID [template_parameter PID]

 or

python harvest_template.py [generators] -template:"..." template_parameter PID [template_parameter PID]

This will work on all pages that transclude the template in the article namespace

These command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

Examples:

python harvest_template.py -lang:nl -cat:Sisoridae -template:"Taxobox straalvinnige" -namespace:0 orde P70 familie P71 geslacht P74
class scripts.harvest_template.HarvestRobot(generator, templateTitle, fields)[source]

Bases: pywikibot.bot.WikidataBot

A bot to add Wikidata claims.

getTemplateSynonyms(title)[source]

Fetch redirects of the title, so we can check against them.

treat(page, item)[source]

Process a single page/item.

scripts.harvest_template.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

illustrate_wikidata Module

Bot to add images to Wikidata items. The image is extracted from the page_props.

For this to be available the PageImages extension (https://www.mediawiki.org/wiki/Extension:PageImages) needs to be installed

Usage:

python illustrate_wikidata.py <some generator>

This script supports use of pywikibot.pagegenerators arguments.

class scripts.illustrate_wikidata.IllustrateRobot(generator, wdproperty='P18')[source]

Bases: pywikibot.bot.WikidataBot

A bot to add Wikidata image claims.

treat(page, item)[source]

Treat a page / item.

scripts.illustrate_wikidata.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

image Module

This script can be used to change one image to another or remove an image.

Syntax: python image.py image_name [new_image_name]

If only one command-line parameter is provided then that image will be removed; if two are provided, then the first image will be replaced by the second one on all pages.

Command line options:

-summary:  Provide a custom edit summary.  If the summary includes spaces,
          surround it with single quotes, such as::
          -summary:'My edit summary'
-always    Don't prompt to make changes, just do them.
-loose     Do loose replacements.  This will replace all occurrences of the name
          of the image (and not just explicit image syntax).  This should work
          to catch all instances of the image, including where it is used as a
          template parameter or in image galleries.  However, it can also make
          more mistakes.  This only works with image replacement, not image
          removal.

Examples:

The image “FlagrantCopyvio.jpg” is about to be deleted, so let’s first remove it from everything that displays it:

python image.py FlagrantCopyvio.jpg

The image “Flag.svg” has been uploaded, making the old “Flag.jpg” obsolete:

python image.py Flag.jpg Flag.svg
class scripts.image.ImageRobot(generator, old_image, new_image=None, **kwargs)[source]

Bases: pywikibot.bot.Bot

This bot will replace or remove all occurrences of an old image.

msg_remove = {'nl': 'Bot: afbeelding %s verwijderd', 'de': 'Bot: Entferne Bild %s', 'lt': 'robotas: Šalinamas vaizdas %s', 'pt': 'Bot: Alterando imagem %s', 'nn': 'robot: fjerna biletet %s', 'ko': '로봇 - %s 그림을 제거', 'fr': 'Bot: Enleve image %s', 'es': 'Robot - Retirando imagen %s', 'zh': '機器人:移除圖像 %s', 'no': 'robot: fjerner bildet %s', 'it': "Bot: Rimuovo l'immagine %s", 'pl': 'Robot usuwa obraz %s', 'fa': 'ربات: برداشتن تصویر %s', 'ru': 'Бот: удалил файл %s', 'ja': 'ロボットによる:画像削除 %s', 'en': 'Robot: Removing image %s', 'he': 'בוט: מסיר את התמונה %s', 'ar': 'روبوت - إزالة الصورة %s'}
msg_replace = {'nl': 'Bot: afbeelding %s vervangen door %s', 'de': 'Bot: Ersetze Bild %s durch %s', 'lt': 'robotas: vaizdas %s keičiamas į %s', 'pt': 'Bot: Alterando imagem %s para %s', 'nn': 'robot: erstatta biletet %s med %s', 'ko': '로봇 - 그림 %s을 %s로 치환', 'fr': 'Bot: Remplace image %s par %s', 'es': 'Robot - Reemplazando imagen %s por %s', 'zh': '機器人:取代圖像 %s 至 %s', 'no': 'robot: erstatter bildet %s med %s', 'it': "Bot: Sostituisco l'immagine %s con %s", 'pl': 'Robot zamienia obraz %s na %s', 'fa': 'ربات: جایگزین کردن تصویر %s با %s', 'ru': 'Бот: Замена файла %s на %s', 'ja': 'ロボットによる:画像置き換え %s から %s へ', 'en': 'Bot: Replacing image %s with %s', 'he': 'בוט: מחליף את התמונה %s בתמונה %s', 'ar': 'روبوت - استبدال الصورة %s مع %s'}
run()[source]

Start the bot’s action.

scripts.image.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

imagerecat Module

Program to (re)categorize images at Commons.

The program uses commonshelper for category suggestions. It takes the suggestions and the current categories, puts them through some filters, and adds the result.

The following command line parameters are supported:

-onlyfilter     Don't use Commonsense to get categories, just filter the current
                categories

-onlyuncat      Only work on uncategorized images. Will prevent the bot from
                working on an image multiple times.

-hint           Give Commonsense a hint.
                For example -hint:li.wikipedia.org

-onlyhint       Give Commonsense a hint, and only work on this hint.
                Syntax is the same as -hint. Some special hints are possible::
                 _20 : Work on the top 20 wikipedias
                 _80 : Work on the top 80 wikipedias
                 wps : Work on all wikipedias
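
The overall flow can be pieced together from the helpers documented below. A rough sketch only, based on the docstrings (the real script also tracks usage and gallery information, and actual signatures and return shapes may differ):

    import scripts.imagerecat as imagerecat

    def suggest_categories(imagepage):
        # Load the list of countries and the category blacklist from Commons.
        imagerecat.initLists()
        # Combine the categories already on the image with the CommonSense
        # suggestions, then run everything through the filter chain.
        cats = list(imagerecat.getCurrentCats(imagepage))
        cats += imagerecat.getCommonshelperCats(imagepage)
        return imagerecat.applyAllFilters(cats)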
scripts.imagerecat.applyAllFilters(categories)[source]

Apply all filters on categories.

scripts.imagerecat.categorizeImages(generator, onlyFilter, onlyUncat)[source]

Loop over all images in generator and try to categorize them.

Get category suggestions from CommonSense.

scripts.imagerecat.filterBlacklist(categories)[source]

Filter out categories which are on the blacklist.

scripts.imagerecat.filterCountries(categories)[source]

Try to filter out ...by country categories.

First make a list of any ...by country categories and try to find some countries. If a by country category has a subcategory containing one of the countries found, add it. The ...by country categories remain in the set and should be filtered out by filterParents.

scripts.imagerecat.filterDisambiguation(categories)[source]

Filter out disambiguation categories.

scripts.imagerecat.filterParents(categories)[source]

Remove all parent categories from the set to prevent overcategorization.

scripts.imagerecat.followRedirects(categories)[source]

If a category is a redirect, replace the category with the target.

scripts.imagerecat.getCategoryByName(name, parent='', grandparent='')[source]

Get category by name.

scripts.imagerecat.getCheckCategoriesTemplate(usage, galleries, ncats)[source]

Build the check categories template with all parameters.

scripts.imagerecat.getCommonshelperCats(imagepage)[source]

Get category suggestions from CommonSense.

Return type:list of unicode
scripts.imagerecat.getCurrentCats(imagepage)[source]

Get the categories currently on the image.

scripts.imagerecat.getOpenStreetMap(latitude, longitude)[source]

Get the result from https://nominatim.openstreetmap.org/reverse .

Return type:list of tuples
scripts.imagerecat.getOpenStreetMapCats(latitude, longitude)[source]

Get a list of location categories based on the OSM Nominatim tool.

scripts.imagerecat.getUsage(use)[source]

Parse the Commonsense output to get the usage.

scripts.imagerecat.initLists()[source]

Get the list of countries & the blacklist from Commons.

scripts.imagerecat.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments
scripts.imagerecat.removeTemplates(oldtext='')[source]

Remove {{Uncategorized}} and {{Check categories}} templates.

scripts.imagerecat.saveImagePage(imagepage, newcats, usage, galleries, onlyFilter)[source]

Remove the old categories and add the new categories to the image.

imagetransfer Module

Script to copy images to Wikimedia Commons, or to another wiki.

Syntax:

python imagetransfer.py pagename [-interwiki] [-tolang:xx] [-tofamily:yy]

Arguments:

-interwiki   Look for images in pages found through interwiki links.

-keepname    Keep the filename and do not verify description while replacing

-tolang:xx   Copy the image to the wiki in language xx

-tofamily:yy Copy the image to a wiki in the family yy

-file:zz     Upload many files from textfile: [[Image:xx]]
                                                 [[Image:yy]]

If pagename is an image description page, offers to copy the image to the target site. If it is a normal page, it will offer to copy any of the images used on that page, or if the -interwiki argument is used, any of the images used on a page reachable via interwiki links.

class scripts.imagetransfer.ImageTransferBot(generator, targetSite=None, interwiki=False, keep_name=False, ignore_warning=False)[source]

Bases: object

Image transfer bot.

run()[source]
showImageList(imagelist)[source]
transferImage(sourceImagePage)[source]

Download image and its description, and upload it to another site.

Returns:the filename which was used to upload the image
scripts.imagetransfer.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

imageuncat Module

Program to add uncat template to images without categories at Commons.

See imagerecat.py (still working on that one) to add these images to categories.

scripts.imageuncat.addUncat(page)[source]

Add the uncat template to the page.

Parameters:page – Page to be modified
Return type:Page
scripts.imageuncat.isUncat(page)[source]

Do we want to skip this page?

If we find a category which is not in the ignore list, the page is already categorized, so skip it. If we find a template which is in the ignore list, skip the page.

scripts.imageuncat.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments
scripts.imageuncat.recentChanges(site=None, delay=0, block=70)[source]

Return a pagegenerator containing all the images edited in a certain timespan.

The delay is the number of minutes to wait and the block is the timespan to return images in. This should probably be moved somewhere else.
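
A hedged usage sketch (parameter values are arbitrary):

    import pywikibot
    from scripts.imageuncat import recentChanges

    site = pywikibot.Site('commons', 'commons')
    # delay: minutes to wait; block: timespan to return images in (see above)
    for page in recentChanges(site=site, delay=30, block=70):
        print(page.title())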

scripts.imageuncat.uploadedYesterday(site)[source]

Return a pagegenerator containing all the pictures uploaded yesterday.

This should probably be moved somewhere else.

interwiki Module

Script to check language links for general pages.

Uses existing translations of a page, plus hints from the command line, to download the equivalent pages from other languages. All such pages are downloaded as well and checked recursively for interwiki links until no new links are encountered. A rationalization process then selects the right interwiki links, and if this is unambiguous, the interwiki links in the original page will be automatically updated and the modified page uploaded.

These command-line arguments can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

-days:         Like -years, but runs through all date pages. Stops at Dec 31.
               If the argument is given in the form -days:X, it will start at
               month no. X through Dec 31. If the argument is simply given as
               -days, it will run from Jan 1 through Dec 31. E.g. for -days:9
               it will run from Sep 1 through Dec 31.

-years:        Run on all year pages in numerical order. Stop at year 2050.
               If the argument is given in the form -years:XYZ, it will run
               from [[XYZ]] through [[2050]]. If XYZ is a negative value, it
               is interpreted as a year BC. If the argument is simply given
               as -years, it will run from 1 through 2050.

This implies -noredirect.

-new:          Work on the 100 newest pages. If given as -new:x, will work on
               the x newest pages. When multiple -namespace parameters are
               given, x pages are inspected, and only the ones in the selected
               namespaces are processed. Use -namespace:all for all
               namespaces. Without -namespace, only article pages are
               processed.

This implies -noredirect.

-restore:      Restore a set of "dumped" pages the bot was working on when it
               terminated. The dump file will be subsequently removed.

-restore:all   Restore a set of "dumped" pages of all dump files to a given
               family remaining in the "interwiki-dumps" directory. All these
               dump files will be subsequently removed. If the restoring
               process is interrupted again, it saves all unprocessed pages
               in one new dump file of the given site.

-continue:     Like -restore, but after having gone through the dumped pages,
               continue alphabetically starting at the last of the dumped
               pages. The dump file will be subsequently removed.

-warnfile:     Used as -warnfile:filename, reads all warnings from the given
               file that apply to the home wiki language, and reads the rest
               of the warning as a hint. Then treats all the mentioned pages.
               A quicker way to implement warnfile suggestions without
               verifying them against the live wiki is using the warnfile.py
               script.

Additionally, these arguments can be used to restrict the bot to certain pages:

-namespace:n   Number or name of namespace to process. The parameter can be
                used multiple times. It works in combination with all other
                parameters, except for the -start parameter. If you e.g.
                want to iterate over all categories starting at M, use
                -start:Category:M.

-number:       used as -number:#, specifies that the bot should process
                that amount of pages and then stop. This is only useful in
                combination with -start. The default is not to stop.

-until:        used as -until:title, specifies that the bot should
                process pages in wiki default sort order up to, and
                including, "title" and then stop. This is only useful in
                combination with -start. The default is not to stop.
                Note: do not specify a namespace, even if -start has one.

-bracket       only work on pages that have (in the home language)
                parenthesis in their title. All other pages are skipped.
                (note: without ending colon)

-skipfile:     used as -skipfile:filename, skip all links mentioned in
                the given file. This does not work with -number!

-skipauto      use to skip all pages that can be translated automatically,
                like dates, centuries, months, etc.
                (note: without ending colon)

-lack:         used as -lack:xx with xx a language code: only work on pages
                without links to language xx. You can also add a number nn
                like -lack:xx:nn, so that the bot only works on pages with
                at least nn interwiki links (the default value for nn is 1).

These arguments control miscellaneous bot behaviour:

-quiet         Use this option to get less output
                (note: without ending colon)

-async         Put page on queue to be saved to wiki asynchronously. This
                enables loading pages during saving throttling and gives a
                better performance.
                NOTE: For post-processing it always assumes that saving
                the pages was successful.
                (note: without ending colon)

-summary:      Set an additional action summary message for the edit. This
                could be used for further explanation of the bot action.
                This will only be used in non-autonomous mode.

-hintsonly     The bot does not ask for a page to work on, even if none of
                the above page sources was specified.  This will make the
                first existing page of -hint or -hintfile slip in as the start
                page, determining properties like namespace, disambiguation
                state, and so on.  When no existing page is found in the
                hints, the bot does nothing.
                Hitting return without input on the "Which page to check:"
                prompt has the same effect as using -hintsonly.
                Options like -back, -same or -wiktionary are in effect only
                after a page has been found to work on.
                (note: without ending colon)

These arguments are useful to provide hints to the bot:

-hint:         used as -hint:de:Anweisung to give the bot a hint
                where to start looking for translations. If no text
                is given after the second ':', the name of the page
                itself is used as the title for the hint, unless the
                -hintnobracket command line option (see there) is also
                selected.

                There are some special hints, trying a number of languages
               at once::
                   * all:       All languages with at least ca. 100 articles.
                   * 10:        The 10 largest languages (sites with most
                                 articles). Analogous for any other natural
                                 number.
                   * arab:      All languages using the Arabic alphabet.
                   * cyril:     All languages that use the Cyrillic alphabet.
                   * chinese:   All Chinese dialects.
                   * latin:     All languages using the Latin script.
                   * scand:     All Scandinavian languages.

                Names of families that forward their interlanguage links
                to the wiki family being worked upon can be used (with
               -family=wikipedia only), they are::
                   * commons:   Interlanguage links of Wikimedia Commons.
                   * incubator: Links in pages on the Wikimedia Incubator.
                   * meta:      Interlanguage links of named pages on Meta.
                   * species:   Interlanguage links of the Wikispecies wiki.
                   * strategy:  Links in pages on Wikimedia's strategy wiki.
                   * test:      Take interwiki links from Test Wikipedia

                Languages, groups and families having the same page title
                can be combined, as in -hint:5,scand,sr,pt,commons:New_York

-hintfile:     similar to -hint, except that hints are taken from the given
                file, enclosed in [[]] each, instead of the command line.

-askhints:     for each page one or more hints are asked. See hint: above
                for the format, one can for example give "en:something" or
                "20:" as hint.

-repository    Include data repository

-same          looks over all 'serious' languages for the same title.
                -same is equivalent to -hint:all:
                (note: without ending colon)

-wiktionary:   similar to -same, but will ONLY accept names that are
                identical to the original. Also, if the title is not
                capitalized, it will only go through other wikis without
                automatic capitalization.

-untranslated: works normally on pages with at least one interlanguage
                link; asks for hints for pages that have none.

-untranslatedonly: same as -untranslated, but pages which already have a
                translation are skipped. Hint: do NOT use this in
                combination with -start without a -number limit, because
                you will go through the whole alphabet before any queries
                are performed!

-showpage      when asking for hints, show the first bit of the text
                of the page always, rather than doing so only when being
                asked for (by typing '?'). Only useful in combination
                with a hint-asking option like -untranslated, -askhints
                or -untranslatedonly.
                (note: without ending colon)

-noauto        Do not use the automatic translation feature for years and
                dates, only use found links and hints.
                (note: without ending colon)

-hintnobracket used to make the bot strip everything in brackets,
                and surrounding spaces from the page name, before it is
                used in a -hint:xy: where the page name has been left out,
                or -hint:all:, -hint:10:, etc. without a name, or
                an -askhint reply, where only a language is given.

These arguments define how much user confirmation is required:

-autonomous    run automatically, do not ask any questions. If a question
-auto          to an operator is needed, write the name of the page
                to autonomous_problems.dat and continue on the next page.
                (note: without ending colon)

-confirm       ask for confirmation before any page is changed on the
                live wiki. Without this argument, additions and
                unambiguous modifications are made without confirmation.
                (note: without ending colon)

-force         do not ask permission to make "controversial" changes,
                like removing a language because none of the found
                alternatives actually exists.
                (note: without ending colon)

-cleanup       like -force but only removes interwiki links to non-existent
                or empty pages.

-select        ask for each link whether it should be included before
                changing any page. This is useful if you want to remove
                invalid interwiki links and if you do multiple hints of
                which some might be correct and others incorrect. Combining
                -select and -confirm is possible, but seems like overkill.
                (note: without ending colon)

These arguments specify in which way the bot should follow interwiki links:

-noredirect    do not follow redirects nor category redirects.
                (note: without ending colon)

-initialredirect  work on its target if a redirect or category redirect is
                entered on the command line or by a generator (note: without
                ending colon). It is recommended to use this option with the
                -movelog pagegenerator.

-neverlink:    used as -neverlink:xx where xx is a language code::
                Disregard any links found to language xx. You can also
                specify a list of languages to disregard, separated by
                commas.

-ignore:       used as -ignore:xx:aaa where xx is a language code, and
                aaa is a page title to be ignored.

-ignorefile:   similar to -ignore, except that the pages are taken from
                the given file instead of the command line.

-localright    do not follow interwiki links from other pages than the
                starting page. (Warning! Should be used very sparingly,
                only when you are sure you have first gotten the interwiki
                links on the starting page exactly right).
                (note: without ending colon)

-hintsareright do not follow interwiki links to sites for which hints
                on existing pages are given. Note that, hints given
                interactively, via the -askhint command line option,
                are only effective once they have been entered, thus
                interwiki links on the starting page are followed
                regardless of hints given when prompted.
                (Warning! Should be used with caution!)
                (note: without ending colon)

-back          only work on pages that have no backlink from any other
                language; if a backlink is found, all work on the page
                will be halted.  (note: without ending colon)

The following arguments are only important for users who have accounts for multiple languages, and specify on which sites the bot should modify pages:

-localonly     only work on the local wiki, not on other wikis in the
                family I have a login at. (note: without ending colon)

-limittwo      only update two pages - one in the local wiki (if logged-in)
                and one in the top available one.
                For example, if the local page has links to de and fr,
                this option will make sure that only the local site and
                the de: (larger) sites are updated. This option is useful
                to quickly set two way links without updating all of the
                wiki family's sites.
                (note: without ending colon)

-whenneeded    works like limittwo, but other languages are changed in the
               following cases::
                * If there are no interwiki links at all on the page
                * If an interwiki link must be removed
                * If an interwiki link must be changed and there has been
                  a conflict for this page
                Optionally, -whenneeded can be given an additional number
                (for example -whenneeded:3), in which case other languages
                will be changed if there are that number or more links to
                change or add. (note: without ending colon)

The following arguments influence how many pages the bot works on at once:

-array:        The number of pages the bot tries to work on at once.
                If the number of pages loaded is lower than this number,
                a new set of pages is loaded from the starting wiki. The
                default is 100, but can be changed in the config variable
                interwiki_min_subjects

-query:        The maximum number of pages that the bot will load at once.
                Default value is 50.

Some configuration options can be used to change the working of this bot:

interwiki_min_subjects: the minimum number of subjects that should be processed
at the same time.
interwiki_backlink: if set to True, all problems in foreign wikis will
be reported

interwiki_shownew: should interwiki.py display every new link it discovers?

interwiki_graph: output a graph PNG file on conflicts? You need pydot for
this: https://pypi.python.org/pypi/pydot/1.0.2 https://code.google.com/p/pydot/

interwiki_graph_format: the file format for interwiki graphs

without_interwiki: save file with local articles without interwikis

All these options can be changed through the user-config.py configuration file.

If interwiki.py is terminated before it is finished, it will write a dump file to the interwiki-dumps subdirectory. The program will read it if invoked with the “-restore” or “-continue” option, and finish all the subjects in that list. After finishing the dump file will be deleted. To run the interwiki-bot on all pages on a language, run it with option “-start:!”, and if it takes so long that you have to break it off, use “-continue” next time.

exception scripts.interwiki.GiveUpOnPage(arg)[source]

Bases: pywikibot.exceptions.Error

The user chose not to work on this page and its linked pages any more.

class scripts.interwiki.Global[source]

Bases: object

Container class for global settings.

Use of globals outside of this is to be avoided.

always = False
askhints = False
async = False
auto = True
autonomous = False
cleanup = False
confirm = False
contentsondisk = False
followinterwiki = True
followredirect = True
force = False
hintnobracket = False
hints = []
hintsareright = False
ignore = []
initialredirect = False
lacklanguage = None
limittwo = False
localonly = False
maxquerysize = 50
minsubjects = 100
needlimit = 0
nobackonly = False
parenthesesonly = False
quiet = False
readOptions(arg)[source]

Read all commandline parameters for the global container.

rememberno = False
remove = []
repository = False
restoreAll = False
same = False
select = False
showtextlinkadd = 300
skip = set()
skipauto = False
strictlimittwo = False
summary = ''
untranslated = False
untranslatedonly = False
class scripts.interwiki.InterwikiBot[source]

Bases: object

A class keeping track of a list of subjects.

It controls which pages are queried from which languages when.

add(page, hints=None)[source]

Add a single subject to the list.

dump(append=True)[source]
firstSubject()[source]

Return the first subject that is still being worked on.

generateMore(number)[source]

Generate more subjects.

This is called internally when the list of subjects becomes too small, but only if there is a PageGenerator.

isDone()[source]

Check whether there is still more work to do.

maxOpenSite()[source]

Return the site that has the most open queries plus the number.

If there is nothing left, return None. Only languages that are TODO for the first Subject are returned.

minus(site, count=1)[source]

Helper routine that the Subject class expects in a counter.

oneQuery()[source]

Perform one step in the solution process.

Returns True if pages could be preloaded, or False otherwise.

plus(site, count=1)[source]

Helper routine that the Subject class expects in a counter.

queryStep()[source]
run()[source]

Start the process until finished.

selectQuerySite()[source]

Select the site the next query should go out for.

setPageGenerator(pageGenerator, number=None, until=None)[source]

Add a generator of subjects.

Once the list of subjects gets too small, this generator is called to produce more Pages.

exception scripts.interwiki.LinkMustBeRemoved(arg)[source]

Bases: scripts.interwiki.SaveError

An interwiki link has to be removed, but this can’t be done because of user preferences or because the user chose not to change the page.

class scripts.interwiki.PageTree[source]

Bases: object

Structure to manipulate a set of pages.

Allows filtering efficiently by Site.

add(page)[source]
filter(site)[source]

Iterate over pages that are in Site site.

remove(page)[source]
removeSite(site)[source]

Remove all pages from Site site.

siteCounts()[source]

Yield (Site, number of pages in site) pairs.

exception scripts.interwiki.SaveError(arg)[source]

Bases: pywikibot.exceptions.Error

An attempt to save a page with changed interwiki has failed.

class scripts.interwiki.StoredPage(page)[source]

Bases: pywikibot.page.Page

Store the Page contents on disk.

This is to avoid using too much memory when a large number of Page objects are loaded at the same time.

SPcopy = ['_editrestriction', '_site', '_namespace', '_section', '_title', 'editRestriction', 'moveRestriction', '_permalink', '_userName', '_ipedit', '_editTime', '_startTime', '_revisionId', '_deletedRevs']
SPdelContents()[source]
static SPdeleteStore()[source]
SPgetContents()[source]
SPpath = None
SPsetContents(contents)[source]
SPstore = None
class scripts.interwiki.Subject(originPage=None, hints=None)[source]

Bases: object

Class to follow the progress of a single ‘subject’.

(i.e. a page with all its translations)

Subject is a transitive closure of the binary relation on Page: “has_a_langlink_pointing_to”.

A formal way to compute that closure would be:

With P a set of pages, NL (‘NextLevel’) a function on sets defined as::

    NL(P) = { target | ∃ source ∈ P, target ∈ source.langlinks() }

pseudocode::

    todo <- [originPage]
    done <- []
    while todo != []:
        pending <- todo
        todo <- NL(pending) / done
        done <- NL(pending) U done
    return done

There is, however, one limitation that is induced by implementation: to compute NL(P) efficiently, one has to load the page contents of the pages in P. (Not only do the langlinks have to be parsed from each Page, but we also want to know if the Page is a redirect, a disambiguation, etc...)

Because of this, the pages in pending have to be preloaded. However, because the pages in pending are likely to be in several sites we cannot “just” preload them as a batch.

Instead of doing “pending <- todo” at each iteration, we have to elect a Site, and we put in pending all the pages from todo that belong to that Site:

Code becomes::

    todo <- {originPage.site: [originPage]}
    done <- []
    while todo != {}:
        site <- electSite()
        pending <- todo[site]
        preloadpages(site, pending)
        todo[site] <- NL(pending) / done
        done <- NL(pending) U done
    return done
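
For illustration, a direct Python transcription of that pseudocode, filing new pages under their own site as the prose describes (NL, electSite and preloadpages are the hypothetical helpers from the description, not functions of this module):

    def crawl(originPage):
        todo = {originPage.site: {originPage}}
        done = set()
        while todo:
            site = electSite(todo)
            pending = todo.pop(site)
            preloadpages(site, pending)
            new = NL(pending) - done          # NL(pending) / done
            done |= NL(pending)               # NL(pending) U done
            for page in new:
                todo.setdefault(page.site, set()).add(page)
        return done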

Subject objects only operate on pages that should have been preloaded before. In fact, at any time:

* todo contains new Pages that have not been loaded yet
* done contains Pages that have been loaded, and that have been treated.
* If batch preloadings are successful, Page._get() is never called from
  this Object.
addIfNew(page, counter, linkingPage)[source]

Add the pagelink given to the todo list, if it hasn't been seen yet.

If it is added, update the counter accordingly.

Also remembers where we found the page, regardless of whether it had already been found before or not.

Returns True if the page is new.

askForHints(counter)[source]
assemble()[source]
batchLoaded(counter)[source]

Notify that the promised batch of pages was loaded.

This is called by a worker to tell us that the promised batch of pages was loaded. In other words, all the pages in self.pending have already been preloaded.

The only argument is an instance of a counter class, that has methods minus() and plus() to keep counts of the total work todo.

clean()[source]

Delete the contents that are stored on disk for this Subject.

We cannot afford to define this in a StoredPage destructor because StoredPage instances can get referenced cyclically: that would stop the garbage collector from destroying some of those objects.

It’s also not necessary to set these lines as a Subject destructor: deleting all stored content one entry by one entry when bailing out after a KeyboardInterrupt, for example, is redundant, because the whole storage file will eventually be removed.

disambigMismatch(page, counter)[source]

Check whether the given page has a different disambiguation status.

Returns a tuple (skip, alternativePage).

skip is True if the pages have mismatching statuses and the bot is either in autonomous mode, or the user chose not to use the given page.

alternativePage is either None, or a page that the user has chosen to use instead of the given page.

finish()[source]

Round up the subject, making any necessary changes.

This should be called exactly once after the todo list has gone empty.

getFoundDisambig(site)[source]

Return the first disambiguation found.

If we found a disambiguation on the given site while working on the subject, this method returns it. If several ones have been found, the first one will be returned. Otherwise, None will be returned.

getFoundInCorrectNamespace(site)[source]

Return the first page in the extended namespace.

If we found a page that has the expected namespace on the given site while working on the subject, this method returns it. If several ones have been found, the first one will be returned. Otherwise, None will be returned.

getFoundNonDisambig(site)[source]

Return the first non-disambiguation found.

If we found a non-disambiguation on the given site while working on the subject, this method returns it. If several ones have been found, the first one will be returned. Otherwise, None will be returned.

isDone()[source]

Return True if all the work for this subject has completed.

isIgnored(page)[source]
makeForcedStop(counter)[source]

End work on the page before the normal end.

namespaceMismatch(linkingPage, linkedPage, counter)[source]

Check whether or not the given page has a different namespace.

Returns True if the namespaces are different and the user has selected not to follow the linked page.

openSites()[source]

Iterator.

Yields (site, count) pairs:

* site is a site where we still have work to do
* count is the number of items in that Site that need work

problem(txt, createneed=True)[source]

Report a problem with the resolution of this subject.

replaceLinks(page, newPages)[source]

Return True if saving was successful.

reportBacklinks(new, updatedSites)[source]

Report missing back links. This will be called from finish() if needed.

updatedSites is a list that contains all sites we changed, to avoid reporting of missing backlinks for pages we already fixed

reportInterwikilessPage(page)[source]
skipPage(page, target, counter)[source]
translate(hints=None, keephintedsites=False)[source]

Add the given translation hints to the todo list.

whatsNextPageBatch(site)[source]

Return the next page batch.

By calling this method, you ‘promise’ this instance that you will preload all the ‘site’ Pages that are in the todo list.

This routine will return a list of pages that can be treated.

whereReport(page, indent=4)[source]
wiktionaryMismatch(page)[source]
scripts.interwiki.botMayEdit(page)[source]
scripts.interwiki.compareLanguages(old, new, insite)[source]
scripts.interwiki.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments
scripts.interwiki.readWarnfile(filename, bot)[source]

isbn Module

This script reports and fixes invalid ISBN numbers.

Additionally, it can convert all ISBN-10 codes to the ISBN-13 format, and correct the ISBN format by placing hyphens.

These command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

-namespace:n   Number or name of namespace to process. The parameter can be
               used multiple times. It works in combination with all other
               parameters, except for the -start parameter. If you e.g. want
               to iterate over all categories starting at M, use
               -start:Category:M.

Furthermore, the following command line parameters are supported:

-to13             Converts all ISBN-10 codes to ISBN-13.
                  NOTE: This needn't be done, as MediaWiki still supports
                  (and will keep supporting) ISBN-10, and all libraries and
                  bookstores will most likely do so as well.

-format           Corrects the hyphenation.
                  NOTE: This is in here for testing purposes only. Usually
                  it's not worth creating an edit for such a minor issue.
                  The recommended way of doing this is enabling
                  cosmetic_changes, so that these changes are made on-the-fly
                  to all pages that are modified.

-always           Don't prompt you for each replacement.
class scripts.isbn.ISBN[source]

Bases: object

Abstract superclass.

format()[source]

Put hyphens into this ISBN number.

class scripts.isbn.ISBN10(code)[source]

Bases: scripts.isbn.ISBN

ISBN 10.

checkChecksum()[source]

Raise an InvalidIsbnException if the ISBN checksum is incorrect.

checkValidity()[source]
digits()[source]

Return a list of the digits and Xs in the ISBN code.

format()[source]
possiblePrefixes()[source]
toISBN13()[source]

Create a 13-digit ISBN from this 10-digit ISBN.

Adds the GS1 prefix ‘978’ and recalculates the checksum. The hyphenation structure is taken from the format of the original ISBN number.
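
The arithmetic can be illustrated with a minimal sketch (hyphens are dropped here for brevity, whereas the real toISBN13() keeps the hyphenation structure):

    def isbn10_to_isbn13(isbn10):
        digits = isbn10.replace('-', '')
        body = '978' + digits[:9]   # GS1 prefix plus the ISBN-10 body
        # ISBN-13 checksum: digits weighted alternately by 1 and 3.
        total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(body))
        return body + str((10 - total % 10) % 10)

    isbn10_to_isbn13('0-306-40615-2')   # -> '9780306406157'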

class scripts.isbn.ISBN13(code, checksumMissing=False)[source]

Bases: scripts.isbn.ISBN

ISBN 13.

calculateChecksum()[source]
checkValidity()[source]
digits()[source]

Return a list of the digits in the ISBN code.

possiblePrefixes()[source]
exception scripts.isbn.InvalidIsbnException(arg)[source]

Bases: pywikibot.exceptions.Error

Invalid ISBN.

class scripts.isbn.IsbnBot(generator, **kwargs)[source]

Bases: pywikibot.bot.Bot

ISBN bot.

run()[source]
treat(page)[source]
scripts.isbn.convertIsbn10toIsbn13(text)[source]

Helper function to convert ISBN 10 to ISBN 13.

scripts.isbn.getIsbn(code)[source]

Return an ISBN object for the code.

scripts.isbn.hyphenateIsbnNumbers(text)[source]

Helper function to hyphenate an ISBN.

scripts.isbn.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

listpages Module

Print a list of pages, as defined by page generator parameters.

Optionally, it also prints page content to STDOUT.

These parameters are supported to specify which page titles to print:

-format  Defines the output format.

        Can be a custom string according to python string.format() notation or
        can be selected by a number from following list (1 is default format)::
        1 - u'{num:4d} {page.title}'
            --> 10 PageTitle

        2 - u'{num:4d} [[{page.title}]]'
            --> 10 [[PageTitle]]

        3 - u'{page.title}'
            --> PageTitle

        4 - u'[[{page.title}]]'
            --> [[PageTitle]]

        5 - u'{num:4d} \03{{lightred}}{page.loc_title:<40}\03{{default}}'
            --> 10 PageTitle (colorised in lightred)

        6 - u'{num:4d} {page.loc_title:<40} {page.can_title:<40}'
            --> 10 localised_Namespace:PageTitle canonical_Namespace:PageTitle

        7 - u'{num:4d} {page.loc_title:<40} {page.trs_title:<40}'
            --> 10 localised_Namespace:PageTitle outputlang_Namespace:PageTitle
            (*) requires "outputlang:lang" set.

        num is the sequential number of the listed page.

-outputlang   Language for translation of namespaces

-notitle Page title is not printed.

-get     Page content is printed.

A custom format can be applied to the following items, extrapolated from a page object:

site: obtained from page._link._site

title: obtained from page._link._title

loc_title: obtained from page._link.canonical_title()

can_title: obtained from page._link.ns_title()
based either on the canonical namespace name or on the namespace name in the language specified by the -trans param; a default value '**' will be used if no ns is found.

onsite: obtained from pywikibot.Site(outputlang, self.site.family)

trs_title: obtained from page._link.ns_title(onsite=onsite)

This script supports use of pywikibot.pagegenerators arguments.
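
As an illustration of the string.format() notation, built-in format 2 applied to a stand-in page object (FakePage is hypothetical; the real script wraps pages in the Formatter class below):

    class FakePage(object):
        title = 'Python (programming language)'

    fmt = u'{num:4d} [[{page.title}]]'   # built-in format number 2
    print(fmt.format(num=10, page=FakePage()))
    # -->   10 [[Python (programming language)]]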

class scripts.listpages.Formatter(page, outputlang=None, default='******')[source]

Bases: object

Structure with Page attributes exposed for formatting from cmd line.

fmt_need_lang = ['7']
fmt_options = {'3': '{page.title}', '2': '{num:4d} [[{page.title}]]', '5': '{num:4d} \x03{{lightred}}{page.loc_title:<40}\x03{{default}}', '1': '{num:4d} {page.title}', '6': '{num:4d} {page.loc_title:<40} {page.can_title:<40}', '7': '{num:4d} {page.loc_title:<40} {page.trs_title:<40}', '4': '[[{page.title}]]'}
output(num=None, fmt=1)[source]

Output formatted string.

scripts.listpages.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

login Module

Script to log the bot in to a wiki account.

Suggestion is to make a special account to use for bot use only. Make sure this bot account is well known on your home wiki before using.

Parameters:

-family:FF
-lang:LL     Log in to the LL language of the FF family.
              Example: -family:wiktionary -lang:fr will log you in at
              fr.wiktionary.org.

-all         Try to log in on all sites where a username is defined in
              user-config.py.

-logout      Log out of the current site. Combine with -all to log out of
              all sites, or with -family and -lang to log out of a specific
              site.

-force       Ignores if the user is already logged in, and tries to log in.

-pass        Useful in combination with -all when you have accounts for
              several sites and use the same password for all of them.
              Asks you for the password, then logs in on all given sites.

-pass:XXXX   Uses XXXX as password. Be careful if you use this
              parameter because your password will be shown on your
              screen, and will probably be saved in your command line
              history. This is NOT RECOMMENDED for use on computers
              where others have either physical or remote access.
              Use -pass instead.

-sysop       Log in with your sysop account.

If not given as parameter, the script will ask for your username and password (password entry will be hidden), log in to your home wiki using this combination, and store the resulting cookies (containing your password hash, so keep it secured!) in a file in the data subdirectory.

All scripts in this library will be looking for this cookie file and will use the login information if it is present.

To log out, throw away the *.lwp file that is created in the data subdirectory.

scripts.login.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

lonelypages Module

This is a script written to add the template “orphan” to pages.

These command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

-xml           Retrieve information from a local XML dump (pages-articles or
               pages-meta-current, see https://download.wikimedia.org).
               Argument can also be given as "-xml:filename".

-page          Only edit a specific page. Argument can also be given as
               "-page:pagetitle". You can give this parameter multiple times
               to edit multiple pages.

Furthermore, the following command line parameters are supported:

-enable:          Enable or disable the bot via a Wiki Page.

-disambig:        Set a page where the bot saves the name of the disambig
                  pages found (default: skip the pages)

-limit:           Set how many pages to check.

-always           Always say yes, won't ask


--- Examples ---
 python lonelypages.py -enable:User:Bot/CheckBot -always
class scripts.lonelypages.LonelyPagesBot(generator, **kwargs)[source]

Bases: pywikibot.bot.Bot

Orphan page tagging bot.

enable_page()[source]
run()[source]
treat(page)[source]
scripts.lonelypages.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

misspelling Module

This script fixes links that contain common spelling mistakes.

This is only possible on wikis that have a template for these misspellings.

Command line options:

-always:XY  instead of asking the user what to do, always perform the same
            action. For example, XY can be "r0", "u" or "2". Be careful with
            this option, and check the changes made by the bot. Note that
            some choices for XY don't make sense and will result in a loop,
            e.g. "l" or "m".

-start:XY   goes through all misspellings in the category on your wiki
            that is defined (to the bot) as the category containing
            misspelling pages, starting at XY. If the -start argument is not
            given, it starts at the beginning.

-main       only check pages in the main namespace, not in the talk,
            wikipedia, user, etc. namespaces.
class scripts.misspelling.MisspellingRobot(always, firstPageTitle, main_only)[source]

Bases: scripts.solve_disambiguation.DisambiguationRobot

Spelling bot.

createPageGenerator(firstPageTitle)[source]
findAlternatives(disambPage)[source]
misspellingCategory = {'hu': 'Átirányítások hibás névről', 'de': 'Kategorie:Wikipedia:Falschschreibung', 'en': 'Redirects from misspellings', 'da': 'Omdirigeringer af fejlstavninger', 'nl': 'Categorie:Wikipedia:Redirect voor spelfout'}
misspellingTemplate = {'hu': None, 'de': 'Falschschreibung', 'en': None, 'da': None, 'nl': None}
setSummaryMessage(disambPage, new_targets=[], unlink=False, dn=False)[source]
scripts.misspelling.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

movepages Module

This script can move pages.

These command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

Furthermore, the following command line parameters are supported:

-from and -to     The page to move from and the page to move to.

-noredirect       Leave no redirect behind.

-notalkpage       Do not move this page's talk page (if it exists)

-prefix           Move pages by adding a namespace prefix to the names of the
                  pages. (Will remove the old namespace prefix if any)
                  Argument can also be given as "-prefix:namespace:".

-always           Don't prompt to make changes, just do them.

-skipredirects    Skip redirect pages (Warning: increases server load)

-summary          Prompt for a custom summary, bypassing the predefined message
                  texts. Argument can also be given as "-summary:XYZ".

-pairs            Read pairs of page names from a file. The file must be in a
                  format [[frompage]] [[topage]] [[frompage]] [[topage]] ...
                  Argument can also be given as "-pairs:filename"
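
For example, a pairs file (hypothetical titles, say pairs.txt) could contain:

    [[Old title 1]] [[New title 1]]
    [[Old title 2]] [[New title 2]]

and be applied with:

    python movepages.py -pairs:pairs.txt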
class scripts.movepages.MovePagesBot(generator, **kwargs)[source]

Bases: pywikibot.bot.Bot

Page move bot.

moveOne(page, newPageTitle)[source]
treat(page)[source]
scripts.movepages.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

newitem Module

This script creates new items on Wikidata based on certain criteria.

  • When was the (Wikipedia) page created?
  • When was the last edit on the page?
  • Does the page contain interwikis?

This script understands various command-line arguments:

-lastedit         The minimum number of days that has passed since the page was
                  last edited.

-pageage          The minimum number of days that has passed since the page was
                  created.

-touch            Do a null edit on every page which has a wikibase item.
class scripts.newitem.NewItemRobot(generator, **kwargs)[source]

Bases: pywikibot.bot.WikidataBot

A bot to create new items.

treat(page, item)[source]

Treat page/item.

scripts.newitem.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

noreferences Module

This script adds a missing references section to pages.

It goes over multiple pages, searches for pages where <references /> is missing although a <ref> tag is present, and in that case adds a new references section.
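
For illustration (the exact section title and its placement are language- and wiki-dependent), a page containing only:

    Some statement.<ref>A source.</ref>

would gain a section along the lines of:

    == References ==
    <references />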

These command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

-xml           Retrieve information from a local XML dump (pages-articles or
               pages-meta-current, see https://download.wikimedia.org).
               Argument can also be given as "-xml:filename".

-namespace:n   Number or name of namespace to process. The parameter can be
               used multiple times. It works in combination with all other
               parameters, except for the -start parameter. If you e.g. want
               to iterate over all categories starting at M, use
               -start:Category:M.

-always        Don't prompt you for each replacement.

-quiet         Use this option to get less output

If neither a page title nor a page generator is given, it takes all pages from the default maintenance category.

It is strongly recommended not to run this script over the entire article namespace (using the -start parameter), as that would consume too much bandwidth. Instead, use the -xml parameter, or use another way to generate a list of affected articles.

class scripts.noreferences.NoReferencesBot(generator, **kwargs)[source]

Bases: pywikibot.bot.Bot

References section bot.

addReferences(oldText)[source]

Add a references tag into an existing section where it fits.

If there is no such section, creates a new section containing the references tag.

Returns:the modified page text

createReferenceSection(oldText, index, ident='==')[source]
lacksReferences(text)[source]

Check whether or not the page is lacking a references tag.

run()[source]
class scripts.noreferences.XmlDumpNoReferencesPageGenerator(xmlFilename)[source]

Bases: object

Generator which will yield Pages that might lack a references tag.

These pages will be retrieved from a local XML dump file (pages-articles or pages-meta-current).

scripts.noreferences.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

nowcommons Module

Script to delete files that are also present on Wikimedia Commons.

Do not run this script on Wikimedia Commons itself. It works based on a given array of templates defined below.

Files are downloaded and compared. If the files match, the file can be deleted on the source wiki. If multiple versions of the file exist, the script will not delete. If the MD5 comparison does not match, the script will not delete.
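
The comparison idea can be sketched as follows (a minimal illustration, not the script's actual code):

    import hashlib

    def same_file(local_path, commons_path):
        # Compare two downloaded files by their MD5 digests.
        with open(local_path, 'rb') as a, open(commons_path, 'rb') as b:
            return (hashlib.md5(a.read()).hexdigest() ==
                    hashlib.md5(b.read()).hexdigest())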

A sysop account on the local wiki is required if you want all features of this script to work properly.

This script understands various command-line arguments:

-always        run automatically, do not ask any questions. All files that
               qualify for deletion are deleted. Reduced screen output.

-replace       replace links if the files are equal and the file names differ

-replacealways replace links if the files are equal and the file names differ
               without asking for confirmation

-replaceloose  Do loose replacements. This will replace all occurrences of
               the name of the image (and not just explicit image syntax).
               This should work to catch all instances of the file, including
               where it is used as a template parameter or in galleries.
               However, it can also make more mistakes.

-replaceonly   Use this if you do not have a local sysop account, but do wish
               to replace links from the NowCommons template.

-hash          Use the hash to identify the images that are the same. It
               doesn't always work, so the bot opens two tabs to let the user
               check if the images are equal or not.

--- Example ---

python nowcommons.py -replaceonly -hash -replace -replaceloose -replacealways

--- Known issues ---

Please fix these if you are capable and motivated:

  • if a file marked nowcommons is not present on Wikimedia Commons, the bot
    will exit.
class scripts.nowcommons.NowCommonsDeleteBot(**kwargs)[source]

Bases: pywikibot.bot.Bot

Bot to delete migrated files.

findFilenameOnCommons(localImagePage)[source]
getPageGenerator()[source]
ncTemplates()[source]
run()[source]
useHashGenerator()[source]
scripts.nowcommons.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

pagefromfile Module

Bot to upload pages from a file.

This bot takes its input from a file that contains a number of pages to be put on the wiki. The pages should all have the same begin and end text (which may not overlap).

By default the text should have the intended title of the page as the first text in bold (that is, between ''' and '''); you can modify this behavior with command line options.

The default is not to include the begin and end text in the page. If you want to include that text, use the -include option.

Specific arguments:

-start:xxx      Specify the text that marks the beginning of a page
-end:xxx        Specify the text that marks the end of a page
-file:xxx       Give the filename we are getting our material from
                (default: dict.txt)
-include        The beginning and end markers should be included
                in the page.
-titlestart:xxx Use xxx in place of ''' for identifying the
                beginning of page title
-titleend:xxx   Use xxx in place of ''' for identifying the
                end of page title
-notitle        do not include the title, including titlestart and
                titleend, in the page
-nocontent      If the page has this statement it doesn't append
                (example: -nocontent:"{{infobox")
-noredirect     Do not upload to redirect pages; by default the bot also
                adds content to redirect pages.
-summary:xxx    Use xxx as the edit summary for the upload - if
                a page exists, standard messages are appended
                after xxx for appending, prepending, or replacement
-autosummary    Use MediaWikis autosummary when creating a new page,
                overrides -summary in this case
-minor          set minor edit flag on page edits

If the page to be uploaded already exists:

-safe           do nothing (default)
-appendtop      add the text to the top of it
-appendbottom   add the text to the bottom of it
-force          overwrite the existing page
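
For example, assuming the usual {{-start-}} and {{-stop-}} default markers (verify against your version's defaults; they can be overridden with -start and -end), an input file entry could look like:

    {{-start-}}
    '''Example title'''
    Text of the page goes here.
    {{-stop-}}

Running "python pagefromfile.py -file:dict.txt" would then try to create a page titled "Example title" with that text.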
exception scripts.pagefromfile.NoTitle(offset)[source]

Bases: Exception

No title found.

class scripts.pagefromfile.PageFromFileReader(filename, pageStartMarker, pageEndMarker, titleStartMarker, titleEndMarker, include, notitle)[source]

Bases: object

Responsible for reading the file.

The run() method yields a (title, contents) tuple for each found page.

findpage(text)[source]

Find page to work on.

run()[source]

Read file and yield page title and content.

class scripts.pagefromfile.PageFromFileRobot(reader, **kwargs)[source]

Bases: pywikibot.bot.Bot

Responsible for writing pages to the wiki.

Titles and contents are given by a PageFromFileReader.

run()[source]

Start file processing and upload content.

save(title, contents)[source]

Upload page content.

scripts.pagefromfile.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

protect Module

This script can be used to protect and unprotect pages en masse.

Of course, you will need an admin account on the relevant wiki. These command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

Furthermore, the following command line parameters are supported:

-always           Don't prompt to protect pages, just do it.

-summary:         Supply a custom edit summary. Tries to generate summary from
                  the page selector. If no summary is supplied or one can't
                  be determined from the selector, it'll ask for one.

-unprotect        Acts like "default:none"

-default:         Sets the default protection level (default 'sysop'). If no
                  level is defined it doesn't change unspecified levels.

-[type]:[level]   Set [type] protection level to [level]

Usual values for [level] are: sysop, autoconfirmed, none; further levels may be provided by some wikis.

For all protection types (edit, move, etc.) it chooses the default protection level. This is "sysop", or "none" if -unprotect was selected. If multiple -unprotect or -default arguments are used, only the last occurrence is applied.

Usage: python protect.py <OPTIONS>

Examples:

Protect everything in the category 'To protect', with prompting:

python protect.py -cat:'To protect'

Unprotect all pages listed in the text file 'unprotect.txt' without prompting:

python protect.py -file:unprotect.txt -unprotect -always
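
A -[type]:[level] invocation could, for instance, set edit and move protection separately (hypothetical page title):

python protect.py -page:Example -edit:sysop -move:autoconfirmed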
class scripts.protect.ProtectionRobot(generator, protections, **kwargs)[source]

Bases: pywikibot.bot.Bot

This bot allows protection of pages en masse.

run()[source]

Start the bot’s action.

Loop through everything in the page generator and apply the protections.

scripts.protect.check_protection_level(operation, level, levels, default=None)[source]

Check if the protection level is valid or asks if necessary.

Returns:a valid protection level
Return type:string

scripts.protect.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

redirect Module

Script to resolve double redirects, and to delete broken redirects.

Requires access to MediaWiki’s maintenance pages or to a XML dump file. Delete function requires adminship.

Syntax:

python redirect.py action [-arguments ...]

where action can be one of these:

double         Fix redirects which point to other redirects. Shortcut action
               command is "do".

broken         Tries to fix a broken redirect to the last moved target of the
               destination page. If this fails and the -delete option is
               given, it deletes redirects whose targets don't exist if the
               bot has admin rights; otherwise it marks the page with a
               speedy deletion template if available. Shortcut action command
               is "br".

both           Both of the above. Retrieves redirect pages from the live
               wiki, not from a special page.

and arguments can be:

-xml           Retrieve information from a local XML dump
              (https://download.wikimedia.org). Argument can also be given as
              "-xml:filename.xml". Cannot be used with -fullscan or -moves.

-fullscan      Retrieve redirect pages from live wiki, not from a special page
              Cannot be used with -xml.

-moves         Use the page move log to find double-redirect candidates. Only
              works with action "double", does not work with -xml.

              NOTE: You may use only one of these options above.
              If neither of -xml -fullscan -moves is given, info will be
              loaded from a special page of the live wiki.

-page:title    Work on a single page

-namespace:n   Namespace to process. Can be given multiple times, for several
              namespaces. If omitted, only the main (article) namespace is
              treated.

-offset:n      With -moves, the number of hours ago to start scanning moved
              pages. With -xml, the number of the redirect to restart with
              (see progress). Otherwise, ignored.

-start:title   The starting page title in each namespace. Page need not exist.

-until:title   The possible last page title in each namespace. Page need not
              exist.

-step:n        The number of entries retrieved at once via the API

-total:n       The maximum count of redirects to work upon. If omitted, there
              is no limit.

-delete        Enables deletion of broken redirects.

-always        Don't prompt you for each replacement.
class scripts.redirect.RedirectGenerator(xmlFilename=None, namespaces=[], offset=-1, use_move_log=False, use_api=False, start=None, until=None, number=None, step=None, page_title=None)[source]

Bases: object

Redirect generator.

get_moved_pages_redirects()[source]

Generate redirects to recently-moved pages.

get_redirect_pages_via_api()[source]

Yield Pages that are redirects.

get_redirects_from_dump(alsoGetPageTitles=False)[source]

Extract redirects from dump.

Load a local XML dump file, look at all pages which have the redirect flag set, and find out where they point. Return a dictionary where the redirect names are the keys and the redirect targets are the values.

get_redirects_via_api(maxlen=8)[source]

Return a generator that yields tuples of data about redirect Pages.

0 - page title of a redirect page

1 - type of redirect:

         0 - broken redirect, target page title missing
         1 - normal redirect, target page exists and is not a
             redirect
 2..maxlen - start of a redirect chain of that many redirects
             (currently, the API seems not to return sufficient
             data to make these return values possible, but
             that may change)
  maxlen+1 - start of an even longer chain, or a loop
             (currently, the API seems not to return sufficient
             data to allow this return value, but that may
             change)
      None - start of a redirect chain of unknown length, or a
             loop

2 - target page title of the redirect, or chain (may not exist)

3 - target page of the redirect, or end of chain, or page title where
    chain or loop detection was halted, or None if unknown
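For orientation, a minimal sketch (illustrative, not part of the script) of
how a caller might consume these tuples, assuming a site configured in
user-config.py:

    from scripts.redirect import RedirectGenerator

    gen = RedirectGenerator(use_api=True)
    for title, redir_type, target, final in gen.get_redirects_via_api(maxlen=8):
        if redir_type == 0:
            print('broken redirect: %s -> %s (target missing)' % (title, target))
        elif redir_type == 1:
            print('normal redirect: %s -> %s' % (title, target))
        else:
            # a chain, a loop, or (None) a chain of unknown length
            print('chain or loop starting at %s' % title)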
retrieve_broken_redirects()[source]
retrieve_double_redirects()[source]
class scripts.redirect.RedirectRobot(action, generator, **kwargs)[source]

Bases: pywikibot.bot.Bot

Redirect bot.

delete_1_broken_redirect(redir_name)[source]
delete_broken_redirects()[source]
fix_1_double_redirect(redir_name)[source]
fix_double_or_delete_broken_redirects()[source]
fix_double_redirects()[source]
has_valid_template(twtitle)[source]

Check whether a template from translatewiki.net exists on the wiki.

We assume we are always working on self.site

Parameters:twtitle – a string which is the i18n key

moved_page(source)[source]
run()[source]

Run the script method selected by ‘action’ parameter.

scripts.redirect.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

replace Module

This bot will make direct text replacements.

It will retrieve information on which pages might need changes either from an XML dump or a text file, or only change a single page.

These command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

Furthermore, the following command line parameters are supported:

-xml              Retrieve information from a local XML dump (pages-articles
                  or pages-meta-current, see https://download.wikimedia.org).
                  Argument can also be given as "-xml:filename".

-regex            Make replacements using regular expressions. If this argument
                  isn't given, the bot will make simple text replacements.

-nocase           Use case insensitive regular expressions.

-dotall           Make the dot match any character at all, including a newline.
                  Without this flag, '.' will match anything except a newline.

-multiline        '^' and '$' will now match begin and end of each line.

-xmlstart         (Only works with -xml) Skip all articles in the XML dump
                  before the one specified (may also be given as
                  -xmlstart:Article).

-addcat:cat_name  Adds "cat_name" category to every altered page.

-excepttitle:XYZ  Skip pages with titles that contain XYZ. If the -regex
                  argument is given, XYZ will be regarded as a regular
                  expression.

-requiretitle:XYZ Only do pages with titles that contain XYZ. If the -regex
                  argument is given, XYZ will be regarded as a regular
                  expression.

-excepttext:XYZ   Skip pages which contain the text XYZ. If the -regex
                  argument is given, XYZ will be regarded as a regular
                  expression.

-exceptinside:XYZ Skip occurrences of the to-be-replaced text which lie
                  within XYZ. If the -regex argument is given, XYZ will be
                  regarded as a regular expression.

-exceptinsidetag:XYZ Skip occurrences of the to-be-replaced text which lie
                  within an XYZ tag.

-summary:XYZ      Set the summary message text for the edit to XYZ, bypassing
                  the predefined message texts with original and replacements
                  inserted.

-sleep:123        If you use -fix, multiple regexes may be checked on every
                  page. This can consume a lot of CPU because the bot checks
                  every regex back to back. This option makes the bot sleep
                  the given number of seconds between one regex and the next,
                  so it does not hog the CPU.

-fix:XYZ          Perform one of the predefined replacements tasks, which are
                  given in the dictionary 'fixes' defined inside the files
                  fixes.py and user-fixes.py.
                  The -regex and -nocase argument and given replacements will
                  be ignored if you use -fix.

                 The available fixes are listed in pywikibot.fixes.

-always           Don't prompt you for each replacement

-recursive        Recurse replacement as long as possible. Be careful, this
                  might lead to an infinite loop.

-allowoverlap     When occurrences of the pattern overlap, replace all of them.
                  Be careful, this might lead to an infinite loop.

other:            First argument is the old text, second argument is the new
                  text. If the -regex argument is given, the first argument
                  will be regarded as a regular expression, and the second
                  argument might contain expressions like \1 or \g<name>. It
                  is possible to introduce more than one pair of old text and
                  replacement.

Examples:

If you want to change templates from the old syntax, e.g. {{msg:Stub}}, to the new syntax, e.g. {{Stub}}, download an XML dump file (pages-articles) from https://download.wikimedia.org, then use this command:

python replace.py -xml -regex "{{msg:(.*?)}}" "{{\1}}"

If you have a dump called foobar.xml and want to fix typos in articles, e.g. Errror -> Error, use this:

python replace.py -xml:foobar.xml "Errror" "Error" -namespace:0
If you want to do more than one replacement at a time, use this:

python replace.py -xml:foobar.xml "Errror" "Error" "Faail" "Fail" -namespace:0

If you have a page called ‘John Doe’ and want to fix the format of ISBNs, use:

python replace.py -page:John_Doe -fix:isbn

This command will change ‘referer’ to ‘referrer’, but not in pages which talk about HTTP, where the typo has become part of the standard:

python replace.py referer referrer -file:typos.txt -excepttext:HTTP
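The same kind of replacement can also be driven from Python. A minimal sketch
(generator choice and summary are placeholders), assuming a site configured in
user-config.py; per the ReplaceRobot constructor documented below, each
replacement is a 2-tuple of compiled pattern and replacement string:

    import re

    import pywikibot
    from pywikibot import pagegenerators
    from scripts.replace import ReplaceRobot

    site = pywikibot.Site()
    # Work on the first ten main-namespace pages; any page generator fits here.
    gen = pagegenerators.AllpagesPageGenerator(namespace=0, total=10, site=site)

    replacements = [(re.compile('Errror'), 'Error')]
    bot = ReplaceRobot(gen, replacements, summary='Bot: Fixing a typo', site=site)
    bot.run()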

Please type "replace.py -help | more" if you can't read the top of the help.

class scripts.replace.ReplaceRobot(generator, replacements, exceptions={}, acceptall=False, allowoverlap=False, recursive=False, addedCat=None, sleep=None, summary='', site=None)[source]

Bases: pywikibot.bot.Bot

A bot that can do text replacements.

count_changes(page, err)[source]

Count successfully changed pages.

doReplacements(original_text)[source]

Apply all replacements to the given text.

Return type:unicode
isTextExcepted(original_text)[source]

Return True iff one of the exceptions applies for the given text.

Return type:bool
isTitleExcepted(title)[source]

Return True iff one of the exceptions applies for the given title.

Return type:bool
run()[source]

Start the bot.

class scripts.replace.XmlDumpReplacePageGenerator(xmlFilename, xmlStart, replacements, exceptions, site)[source]

Bases: object

Iterator that will yield Pages that might contain text to replace.

These pages will be retrieved from a local XML dump file. Arguments:

* xmlFilename  - The dump's path, either absolute or relative
* xmlStart     - Skip all articles in the dump before this one
* replacements - A list of 2-tuples of original text (as a
                  compiled regular expression) and replacement
                  text (as a string).
* exceptions   - A dictionary which defines when to ignore an
                  occurrence. See the documentation of the
                  ReplaceRobot constructor above.
isTextExcepted(text)[source]

Return True iff one of the exceptions applies for the given text.

Return type:bool
isTitleExcepted(title)[source]

Return True iff one of the exceptions applies for the given title.

Return type:bool
scripts.replace.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments
scripts.replace.prepareRegexForMySQL(pattern)[source]

Convert regex to MySQL syntax.

replicate_wiki Module

This bot replicates pages in a wiki to a second wiki within one family.

Example:

python replicate_wiki.py [-r] -ns 10 -f wikipedia -o nl li fy

to copy all templates from nlwiki to liwiki and fywiki. It will show which pages have to be changed if -r is not present, and will only actually write pages if -r is present.

You can add replicate_replace to your user_config.py, which has the following format:

replicate_replace = {
    'wikipedia:li': {'Hoofdpagina': 'Veurblaad'}
}

to replace all occurrences of 'Hoofdpagina' with 'Veurblaad' when writing to liwiki. Note that this does not take the origin wiki into account.

class scripts.replicate_wiki.SyncSites(options)[source]

Bases: object

Work is done in here.

check_namespace(namespace)[source]

Check an entire namespace.

check_namespaces()[source]

Check all namespaces, to be ditched for clarity.

check_page(pagename)[source]

Check one page.

check_sysops()[source]

Check if sysops are the same on all wikis.

generate_overviews()[source]

Create page on wikis with overview of bot results.

put_message(site)[source]
scripts.replicate_wiki.multiple_replace(text, word_dict)[source]

Replace all occurrences in text of key value pairs in word_dict.
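A minimal sketch of what such a helper can look like (an illustration, not
necessarily the script's exact implementation):

    import re

    def multiple_replace(text, word_dict):
        """Replace each key of word_dict found in text by its value."""
        if not word_dict:
            return text
        # One alternation, so each position is replaced at most once.
        pattern = re.compile('|'.join(re.escape(key) for key in word_dict))
        return pattern.sub(lambda match: word_dict[match.group(0)], text)

    print(multiple_replace('Hoofdpagina en meer', {'Hoofdpagina': 'Veurblaad'}))
    # -> 'Veurblaad en meer'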

scripts.replicate_wiki.namespaces(site)[source]

Return a dict from namespace number to prefix.

revertbot Module

This script can be used for reverting certain edits.

The following command line parameters are supported:

This script supports use of pywikibot.pagegenerators arguments.

-username         Revert the edits of the given user.

-rollback         Rollback edits instead of reverting them. Note that in
                  rollback, no diff would be shown.
class scripts.revertbot.BaseRevertBot(site, user=None, comment=None, rollback=False)[source]

Bases: object

Base revert bot.

Subclass this bot and override callback to get it to do something useful.
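myRevertBot below shows this pattern; here is another minimal, hypothetical
sketch (the class name, username and title check are placeholders; that each
item is a contribution record with a 'title' key is an assumption):

    import pywikibot
    from scripts.revertbot import BaseRevertBot

    class SandboxRevertBot(BaseRevertBot):

        """Hypothetical subclass: only revert edits to sandbox pages."""

        def callback(self, item):
            # Assumption: item is a contribution record with a 'title' key.
            return 'Sandbox' in item['title']

    site = pywikibot.Site()
    bot = SandboxRevertBot(site, user='ExampleUser', comment='Reverting sandbox edits')
    bot.revert_contribs()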

callback(item)[source]
get_contributions(max=500, ns=None)[source]
log(msg)[source]
revert(item)[source]
revert_contribs(callback=None)[source]
scripts.revertbot.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments
class scripts.revertbot.myRevertBot(site, user=None, comment=None, rollback=False)[source]

Bases: scripts.revertbot.BaseRevertBot

Example revert bot.

callback(item)[source]

script_wui Module

shell Module

Spawns an interactive Python shell.

Usage:

python pwb.py shell [args]

If no arguments are given, the pywikibot library will not be loaded.

The following parameters are supported:

This script supports use of pywikibot.pagegenerators arguments.

scripts.shell.main(*args)[source]

Script entry point.

solve_disambiguation Module

Script to help a human solve disambiguations by presenting a set of options.

Specify the disambiguation page on the command line.

The program will pick up the page, and look for all alternative links, and show them with a number adjacent to them. It will then automatically loop over all pages referring to the disambiguation page, and show 30 characters of context on each side of the reference to help you make the decision between the alternatives. It will ask you to type the number of the appropriate replacement, and perform the change.

It is possible to choose to replace only the link (just type the number) or replace both link and link-text (type ‘r’ followed by the number).

Multiple references in one page will be scanned in order, but typing ‘n’ (next) on any one of them will leave the complete page unchanged. To leave only some reference unchanged, use the ‘s’ (skip) option.

Command line options:

-pos:XXXX   adds XXXX as an alternative disambiguation

-just       only use the alternatives given on the command line, do not
            read the page for other possibilities

-dnskip     Skip links already marked with a disambiguation-needed
            template (e.g., {{dn}})

-primary    "primary topic" disambiguation (Begriffsklärung nach Modell 2).
            That's titles where one topic is much more important, the
            disambiguation page is saved somewhere else, and the important
            topic gets the nice name.

-primary:XY like the above, but use XY as the only alternative, instead of
            searching for alternatives in [[Keyword (disambiguation)]].
            Note: this is the same as -primary -just -pos:XY

-file:XYZ   reads a list of pages from a text file. XYZ is the name of the
            file from which the list is taken. If XYZ is not given, the
            user is asked for a filename. Page titles should be inside
            [[double brackets]]. The -pos parameter won't work if -file
            is used.

-always:XY  instead of asking the user what to do, always perform the same
            action. For example, XY can be "r0", "u" or "2". Be careful with
            this option, and check the changes made by the bot. Note that
            some choices for XY don't make sense and will result in a loop,
            e.g. "l" or "m".

-main       only check pages in the main namespace, not in the talk,
            wikipedia, user, etc. namespaces.

-start:XY   goes through all disambiguation pages in the category on your
            wiki that is defined (to the bot) as the category containing
            disambiguation pages, starting at XY. If only '-start' or
            '-start:' is given, it starts at the beginning.

-min:XX     (XX being a number) only work on disambiguation pages for which
            at least XX links are to be worked on.

To complete a move of a page, one can use:

python solve_disambiguation.py -just -pos:New_Name Old_Name
class scripts.solve_disambiguation.DisambiguationRobot(always, alternatives, getAlternatives, dnSkip, generator, primary, main_only, minimum=0)[source]

Bases: pywikibot.bot.Bot

Disambiguation bot.

checkContents(text)[source]

Check if the text matches any of the ignore regexes.

For a given text, returns False if none of the regular expressions given in the dictionary at the top of this class matches a substring of the text. Otherwise returns the substring which is matched by one of the regular expressions.

findAlternatives(disambPage)[source]
ignore_contents = {'kk': ('{{[Ii]nuse}}', '{{[Pp]rocessing}}'), 'de': ('{{[Ii]nuse}}', '{{[Ll]öschen}}'), 'nl': ('{{wiu2}}', '{{nuweg}}'), 'ru': ('{{[Ii]nuse}}', '{{[Pp]rocessing}}'), 'fi': ('{{[Tt]yöstetään}}',)}
listAlternatives()[source]
makeAlternativesUnique()[source]
primary_redir_template = {'hu': 'Egyért-redir'}
run()[source]
setSummaryMessage(disambPage, new_targets=[], unlink=False, dn=False)[source]
setupRegexes()[source]
treat(refPage, disambPage)[source]

Treat a page.

Parameters:
    disambPage – The disambiguation page or redirect we don't want
        anything to link to
    refPage – A page linking to disambPage

Returns False if the user pressed q to completely quit the program. Otherwise, returns True.

class scripts.solve_disambiguation.PrimaryIgnoreManager(disambPage, enabled=False)[source]

Bases: object

Primary ignore manager.

If run with the -primary argument, reads from a file which pages should not be worked on; these are the ones where the user pressed n last time. If run without the -primary argument, doesn’t ignore any pages.

ignore(refPage)[source]
isIgnored(refPage)[source]
class scripts.solve_disambiguation.ReferringPageGeneratorWithIgnore(disambPage, primary=False, minimum=0)[source]

Bases: object

Referring Page generator, with an ignore manager.

scripts.solve_disambiguation.correctcap(link, text)[source]
scripts.solve_disambiguation.firstcap(string)[source]
scripts.solve_disambiguation.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

spamremove Module

Script to remove links that are being or have been spammed.

Usage:

spamremove.py www.spammedsite.com

It will use Special:Linksearch to find the pages on the wiki that link to that site, then for each page make a proposed change consisting of removing all the lines where that URL occurs. You can choose to:

* accept the changes as proposed
* edit the page yourself to remove the offending link
* not change the page in question

Command line options:

-always      Do not ask, but remove the lines automatically. Be very careful
             in using this option!

-namespace:  Filters the search to a given namespace. If this is specified
             multiple times it will search all given namespaces.
scripts.spamremove.main(*args)[source]

Process command line arguments and perform task.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

template Module

Very simple script to replace a template with another one.

It also converts the old MediaWiki boilerplate format to the new template format.

Syntax: python template.py [-remove] [xml[:filename]] oldTemplate [newTemplate]

Specify the template on the command line. The program will pick up the template page, and look for all pages using it. It will then automatically loop over them, and replace the template.

Command line options:

-remove      Remove every occurrence of the template from every article

-subst       Resolves the template by putting its text directly into the
            article. This is done by changing {{...}} or {{msg:...}} into
            {{subst:...}}

-assubst     Replaces the first argument as old template with the second
            argument as new template but substitutes it like -subst does.
            Using both options -remove and -subst in the same command line has
            the same effect.

-xml         retrieve information from a local dump
            (https://download.wikimedia.org). If this argument isn't given,
            info will be loaded from the maintenance page of the live wiki.
            argument can also be given as "-xml:filename.xml".

-user:       Only process pages edited by a given user

-skipuser:   Only process pages not edited by a given user

-timestamp:  (With -user or -skipuser.) Only check edits by the user that are
            not older than the given timestamp. The timestamp must be written
            in MediaWiki timestamp format, which is "%Y%m%d%H%M%S" (see the
            sketch after this list). If this parameter is missing, all edits
            are checked, but this is restricted to the last 100 edits.

-summary:    Lets you pick a custom edit summary.  Use quotes if edit summary
            contains spaces.

-always      Don't bother asking to confirm any of the changes, Just Do It.

-category:   Appends the given category to every page that is edited.  This is
            useful when a category is being broken out from a template
            parameter or when templates are being upmerged but more information
            must be preserved.
other:       First argument is the old template name, second one is the new
            name.

If you want to address a template which has spaces, put quotation marks around it, or use underscores.
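For -timestamp, a current timestamp in the required format can be produced
like this (a convenience sketch, not part of the script):

    from datetime import datetime, timezone

    # MediaWiki timestamp format "%Y%m%d%H%M%S", in UTC.
    print(datetime.now(timezone.utc).strftime('%Y%m%d%H%M%S'))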

Examples:

If you have a template called [[Template:Cities in Washington]] and want to change it to [[Template:Cities in Washington state]], start

python template.py "Cities in Washington" "Cities in Washington state"

Move the page [[Template:Cities in Washington]] manually afterwards.

If you have a template called [[Template:test]] and want to substitute it only on pages in the User: and User talk: namespaces, do:

python template.py test -subst -namespace:2 -namespace:3

Note that -namespace: is a global Pywikibot parameter

This next example substitutes the template lived with a supplied edit summary. It only performs substitutions in main article namespace and doesn’t prompt to start replacing. Note that -putthrottle: is a global Pywikibot parameter.

python template.py -putthrottle:30 -namespace:0 lived -subst -always
-summary:"BOT: Substituting {{lived}}, see [[WP:SUBST]]."

This next example removes the templates {{cfr}}, {{cfru}}, and {{cfr-speedy}} from five category pages as given:

python template.py cfr cfru cfr-speedy -remove -always
      -page:"Category:Mountain monuments and memorials" -page:"Category:Indian family names"
      -page:"Category:Tennis tournaments in Belgium" -page:"Category:Tennis tournaments in Germany"
      -page:"Category:Episcopal cathedrals in the United States"
      -summary:"Removing Cfd templates from category pages that survived."

This next example substitutes templates test1, test2, and space test on all pages:

python template.py test1 test2 "space test" -subst -always
class scripts.template.TemplateRobot(generator, templates, **kwargs)[source]

Bases: pywikibot.bot.Bot

This bot will replace, remove or subst all occurrences of a template.

run()[source]

Start the robot’s action.

scripts.template.UserEditFilterGenerator(generator, username, timestamp=None, skip=False)[source]

Generator which will yield Pages modified by username.

It only looks at the last 100 edits. If timestamp is set, in MediaWiki format JJJJMMDDhhmmss, older edits are ignored. If skip is set, pages edited by the given user are ignored; otherwise only pages edited by this user are given back.

class scripts.template.XmlDumpTemplatePageGenerator(templates, xmlfilename)[source]

Bases: object

Generator which yields Pages that transclude a template.

These pages will be retrieved from a local XML dump file (cur table), and may no longer transclude the template.

scripts.template.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

templatecount Module

This script will display the list of pages transcluding a given list of templates.

It can also be used to simply count the number of pages (rather than listing each individually).

Syntax: python templatecount.py command [arguments]

Command line options:

-count        Counts the number of times each template (passed in as an
             argument) is transcluded.

-list         Gives the list of all of the pages transcluding the templates
             (rather than just counting them).

-namespace:   Filters the search to a given namespace.  If this is specified
             multiple times it will search all given namespaces.

Examples:

Counts how many times {{ref}} and {{note}} are transcluded in articles.

python templatecount.py -count -namespace:0 ref note

Lists all the category pages that transclude {{cfd}} and {{cfdu}}.

python templatecount.py -list -namespace:14 cfd cfdu
class scripts.templatecount.TemplateCountRobot[source]

Bases: object

Template count bot.

static countTemplates(templates, namespaces)[source]
static listTemplates(templates, namespaces)[source]
static template_dict(templates, namespaces)[source]
static template_dict_generator(templates, namespaces)[source]
scripts.templatecount.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

touch Module

This bot goes over multiple pages of a wiki, and edits them without changes.

This is for example used to get category links in templates working.

This script understands various command-line arguments:

This script supports use of pywikibot.pagegenerators arguments.

-purge            Do not touch but purge the page.

-redir            Specifies that the bot should work on redirect pages;
                  otherwise, they will be skipped.
class scripts.touch.TouchBot(generator, **kwargs)[source]

Bases: pywikibot.bot.Bot

Page touch bot.

run()[source]
scripts.touch.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

transferbot Module

This script transfers pages from a source wiki to a target wiki.

It also copies edit history to a subpage.

-tolang:          The target site code.

-tofamily:        The target site family.

-prefix:          Page prefix on the new site.

-overwrite:       Existing pages are skipped by default. Use this option to
                  overwrite pages.

Internal links are not repaired!

Pages to work on can be specified using any of:

This script supports use of pywikibot.pagegenerators arguments.

Example commands:

# Transfer all pages in category "Query service" from the Toolserver wiki to
# wikitech, adding Nova_Resource:Tools/Tools/ as prefix

transferbot.py -v -family:toolserver -tofamily:wikitech -cat:"Query service" -prefix:Nova_Resource:Tools/Tools/

# Copy the template "Query service" from the Toolserver wiki to wikitech

transferbot.py -v -family:toolserver -tofamily:wikitech -page:"Template:Query service"

exception scripts.transferbot.TargetPagesMissing[source]

Bases: scripts.transferbot.WikiTransferException

Thrown if no page range has been specified for the script to operate on.

exception scripts.transferbot.TargetSiteMissing[source]

Bases: scripts.transferbot.WikiTransferException

Thrown when the target site is the same as the source site.

Based on the way each are initialized, this is likely to happen when the target site simply hasn’t been specified.

exception scripts.transferbot.WikiTransferException[source]

Bases: Exception

Base class for exceptions from this script.

Makes it easier for clients to catch all expected exceptions that the script might throw.

scripts.transferbot.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

unusedfiles Module

This bot appends some text to all unused images and notifies uploaders.

Parameters:

-always     Don't ask for confirmation every time.
class scripts.unusedfiles.UnusedFilesBot(site, **kwargs)[source]

Bases: pywikibot.bot.Bot

Unused files bot.

append_text(page, apptext)[source]
run()[source]
scripts.unusedfiles.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

upload Module

Script to upload images to wikipedia.

Arguments:

-keep         Keep the filename as is
-filename     Target filename
-noverify     Do not ask for verification of the upload description if one
               is given
-abortonwarn: Abort upload on the specified warning type. If no warning type
               is specified, aborts on any warning.
-ignorewarn:  Ignores specified upload warnings. If no warning type is
               specified, ignores all warnings. Use with caution
-chunked:     Upload the file in chunks (more overhead, but restartable). If
               no value is specified the chunk size is 1 MiB. The value must
               be a number which can be preceded by a suffix. The units are:
                  No suffix: Bytes
                  'k': Kilobytes (1000 B)
                  'M': Megabytes (1000000 B)
                  'Ki': Kibibytes (1024 B)
                  'Mi': Mebibytes (1024x1024 B)
               The suffixes are case insensitive.
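The suffix rules translate to bytes as in this illustrative sketch (not the
script's actual parser):

    import re

    # Multipliers for the documented suffixes, compared case-insensitively.
    SUFFIXES = {'': 1, 'k': 10**3, 'm': 10**6, 'ki': 2**10, 'mi': 2**20}

    def parse_chunk_size(value):
        """Turn a -chunked value such as '1Mi', '512k' or '4096' into bytes."""
        match = re.fullmatch(r'(\d+)(k|m|ki|mi)?', value.strip(), re.IGNORECASE)
        if not match:
            raise ValueError('invalid chunk size: %r' % value)
        number, suffix = match.groups()
        return int(number) * SUFFIXES[(suffix or '').lower()]

    assert parse_chunk_size('1Mi') == 1024 * 1024
    assert parse_chunk_size('1M') == 1000000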

If any other arguments are given, the first is the URL or filename to upload, and the rest is a proposed description to go with the upload. If none of these are given, the user is asked for the file or URL to upload. The bot will then upload the image to the wiki.

The script will ask for the location of an image, if not given as a parameter, and for a description.

class scripts.upload.UploadRobot(url, urlEncoding=None, description='', useFilename=None, keepFilename=False, verifyDescription=True, ignoreWarning=False, targetSite=None, uploadByUrl=False, aborts=[], chunk_size=0)[source]

Bases: object

Upload bot.

abort_on_warn(warn_code)[source]

Determine if the warning message should cause an abort.

ignore_on_warn(warn_code)[source]

Determine if the warning message should be ignored.

process_filename()[source]

Return base filename portion of self.url.

read_file_content()[source]

Return name of temp file in which remote file is saved.

run()[source]

Run bot.

upload_image(debug=False)[source]

Upload the image at self.url to the target wiki.

Return the filename that was used to upload the image. If the upload fails, ask the user whether to try again or not. If the user chooses not to retry, return None.

urlOK()[source]

Return True if self.url is a URL or an existing local file.

scripts.upload.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments

version Module

Script to determine the Pywikibot version (tag, revision and date).

watchlist Module

Allows access to the bot account’s watchlist.

The function refresh() downloads the current watchlist and saves it to disk. It is run automatically when a bot first tries to save a retrieved page. The watchlist can be updated manually by running this script. The list will also be reloaded automatically once a month.
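A minimal sketch of using the helpers documented below (the page title is a
placeholder), assuming a site configured in user-config.py:

    import pywikibot
    from scripts import watchlist

    site = pywikibot.Site()
    watchlist.refresh(site)  # fetch a fresh copy of the watchlist

    if watchlist.isWatched('Project:Sandbox', site=site):
        print('The sandbox is on the watchlist.')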

Syntax: python watchlist [-all | -new]

Command line options:

-all         Reloads watchlists for all wikis where a watchlist is already
             present.

-new         Load watchlists for all wikis where accounts are set in
             user-config.py.

scripts.watchlist.get(site=None)[source]

Load the watchlist, fetching it if necessary.

scripts.watchlist.isWatched(pageName, site=None)[source]

Check whether a page is being watched.

scripts.watchlist.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments
scripts.watchlist.refresh(site, sysop=False)[source]

Fetch the watchlist.

scripts.watchlist.refresh_all(sysop=False)[source]

Reload watchlists for all wikis where a watchlist is already present.

scripts.watchlist.refresh_new(sysop=False)[source]

Load watchlists of all wikis for accounts set in user-config.py.

weblinkchecker Module

This bot is used for checking external links found at the wiki.

It checks several pages at once, with a limit set by the config variable max_external_links, which defaults to 50.

The bot won’t change any wiki pages; it will only report dead links so that people can fix or remove the links themselves.

The bot will store all links found dead in a .dat file in the deadlinks subdirectory. To avoid the removing of links which are only temporarily unavailable, the bot ONLY reports links which were reported dead at least two times, with a time lag of at least one week. Such links will be logged to a .txt file in the deadlinks subdirectory.

After running the bot and waiting for at least one week, you can re-check those pages where dead links were found, using the -repeat parameter.

In addition to the logging step, it is possible to automatically report dead links to the talk page of the article where the link was found. To use this feature, set report_dead_links_on_talk = True in your user-config.py, or specify “-talk” on the command line. Adding “-notalk” switches this off irrespective of the configuration variable.

When a link is found alive, it will be removed from the .dat file.

These command line parameters can be used to specify which pages to work on:

This script supports use of pywikibot.pagegenerators arguments.

-repeat      Work on all pages where dead links were found before. This is
             useful to confirm that the links are dead after some time (at
             least one week), which is required before the script will report
             the problem.

-namespace   Only process templates in the namespace with the given number or
             name. This parameter may be used multiple times.

-xml         Should be used instead of a simple page fetching method from
             pagegenerators.py for performance and load issues.

-xmlstart    Page to start with when using an XML dump.

-ignore      HTTP return codes to ignore. Can be provided several times:
             -ignore:401 -ignore:500

Furthermore, the following command line parameters are supported:

-talk        Overrides the report_dead_links_on_talk config variable, enabling
            the feature.

-notalk      Overrides the report_dead_links_on_talk config variable, disabling
            the feature.
-day         Only report a link if it was first found dead at least this
            many days ago (default: 7 days).

The following config variables are supported:

max_external_links - The maximum number of web pages that should be loaded
    simultaneously. You should change this according to your Internet
    connection speed. Be careful: if it is set too high, the script might get
    socket errors because your network is congested, and will then think that
    the page is offline.

report_dead_links_on_talk - If set to true, causes the script to report dead
    links on the article's talk page if (and ONLY if) the linked page has been
    unavailable at least two times during a timespan of at least one week.
Syntax examples:

python weblinkchecker.py -start:!
    Loads all wiki pages in alphabetical order using the Special:Allpages
    feature.

python weblinkchecker.py -start:Example_page
    Loads all wiki pages using the Special:Allpages feature, starting at
    "Example page".

python weblinkchecker.py -weblink:www.example.org
    Loads all wiki pages that link to www.example.org.

python weblinkchecker.py Example page
    Only checks links found in the wiki page "Example page".

python weblinkchecker.py -repeat
    Loads all wiki pages where dead links were found during a prior run.
class scripts.weblinkchecker.DeadLinkReportThread[source]

Bases: threading.Thread

A Thread that is responsible for posting error reports on talk pages.

There is only one DeadLinkReportThread, and it is using a semaphore to make sure that two LinkCheckerThreads can not access the queue at the same time.

kill()[source]
report(url, errorReport, containingPage, archiveURL)[source]

Report error on talk page of the page containing the dead link.

run()[source]
shutdown()[source]
class scripts.weblinkchecker.History(reportThread)[source]

Bases: object

Store previously found dead links.

The URLs are dictionary keys, and values are lists of tuples where each tuple represents one time the URL was found dead. Tuples have the form (title, date, error) where title is the wiki page where the URL was found, date is an instance of time, and error is a string with error code and message.

We assume that the first element in the list represents the first time we found this dead link, and the last element represents the last time.

Example:

dict = {
    'https://www.example.org/page': [
        ('WikiPageTitle', DATE, '404: File not found'),
        ('WikiPageName2', DATE, '404: File not found'),
    ]
}
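The "dead at least twice, at least one week apart" rule can be checked
against such an entry as in this sketch (illustrative; the function name is
hypothetical and dates are taken as seconds since the epoch):

    import time

    WEEK = 7 * 24 * 3600  # one week, in seconds

    def should_report(occurrences):
        """Dead at least twice, first and last findings a week or more apart."""
        if len(occurrences) < 2:
            return False
        first_date = occurrences[0][1]   # oldest finding comes first
        last_date = occurrences[-1][1]   # newest finding comes last
        return last_date - first_date >= WEEK

    now = time.time()
    assert should_report([('Page', now - 8 * 24 * 3600, '404'),
                          ('Page', now, '404')])
    assert not should_report([('Page', now, '404')])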

log(url, error, containingPage, archiveURL)[source]

Log an error report to a text file in the deadlinks subdirectory.

save()[source]

Save the .dat file to disk.

setLinkAlive(url)[source]

Record that the link is now alive.

If link was previously found dead, remove it from the .dat file.

Returns:True if previously found dead, else returns False.
setLinkDead(url, error, page, day)[source]

Add the fact that the link was found dead to the .dat file.

class scripts.weblinkchecker.LinkCheckThread(page, url, history, HTTPignore, day)[source]

Bases: threading.Thread

A thread responsible for checking one URL.

After checking the page, it will die.

run()[source]
class scripts.weblinkchecker.LinkChecker(url, redirectChain=[], serverEncoding=None, HTTPignore=[])[source]

Bases: object

Check links.

Given a HTTP URL, tries to load the page from the Internet and checks if it is still online.

Returns a (boolean, string) tuple saying if the page is online and including a status reason.

Warning: Also returns false if your Internet connection isn’t working correctly! (This will give a Socket Error)

changeUrl(url)[source]
check(useHEAD=False)[source]

Return True and the server status message if the page is alive.

Return type:tuple of (bool, unicode)
getConnection()[source]
getEncodingUsedByServer()[source]
readEncodingFromResponse(response)[source]
resolveRedirect(useHEAD=False)[source]

Return the redirect target URL as a string, if it is a HTTP redirect.

If useHEAD is true, uses the HTTP HEAD method, which saves bandwidth by not downloading the body. Otherwise, the HTTP GET method is used.

Return type:unicode or None
scripts.weblinkchecker.RepeatPageGenerator()[source]

Generator for pages in History.

class scripts.weblinkchecker.WeblinkCheckerRobot(generator, HTTPignore=None, day=7)[source]

Bases: object

Bot which will search for dead weblinks.

It uses several LinkCheckThreads at once to process pages from generator.

checkLinksIn(page)[source]
run()[source]
class scripts.weblinkchecker.XmlDumpPageGenerator(xmlFilename, xmlStart, namespaces)[source]

Bases: object

Xml generator that yields pages containing a web link.

next()[source]
scripts.weblinkchecker.check(url)[source]

Perform a check on the URL.

scripts.weblinkchecker.countLinkCheckThreads()[source]

Count LinkCheckThread threads.

Returns:number of LinkCheckThread threads
Return type:int
scripts.weblinkchecker.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments
scripts.weblinkchecker.weblinksIn(text, withoutBracketed=False, onlyBracketed=False)[source]

Yield web links from text.

TODO: move to textlib
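A small usage sketch (the sample text is a placeholder; that the function
yields the URLs it finds as strings is an assumption):

    from scripts.weblinkchecker import weblinksIn

    text = ('See [http://www.example.org Example] and '
            'https://www.example.org/page for details.')
    for url in weblinksIn(text):
        print(url)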

welcome Module

Script to welcome new users.

This script works out of the box for wikis that have been defined in the script. It is currently used on the Dutch, Norwegian, Albanian and Italian Wikipedias, on Wikimedia Commons, and on the English Wikiquote.

Ensure you have community support before running this bot!

URLs to current implementations:

* Wikimedia Commons: https://commons.wikimedia.org/wiki/Commons:Welcome_log
* Dutch Wikipedia: https://nl.wikipedia.org/wiki/Wikipedia:Logboek_welkom
* Italian Wikipedia: https://it.wikipedia.org/wiki/Wikipedia:Benvenuto_log
* English Wikiquote: https://en.wikiquote.org/wiki/Wikiquote:Welcome_log
* Persian Wikipedia: https://fa.wikipedia.org/wiki/ویکی‌پدیا:سیاهه_خوشامد
* Korean Wikipedia: https://ko.wikipedia.org/wiki/위키백과:Welcome_log

Everything that needs customisation to support additional projects is indicated by comments.

Description of basic functionality:

  • Request a list of new users every period (default: 3600 seconds). You can choose to break the script after the first check (see arguments).
  • Check if the new user has passed a threshold for a number of edits (default: 1 edit).
  • Optional: check the username for bad words, or whether it consists solely of numbers; log this somewhere on the wiki (default: False). Update: a whitelist has been added (explanation below).
  • If the user has made enough edits (it can also be 0), check if the user has an empty talk page.
  • If the user has an empty talk page, add a welcome message.
  • Optional: once the set number of users have been welcomed, add this to the configured log page, one for each day (default: True).
  • If no log page exists, create a header for the log page first.

This script (by default not yet implemented) uses two templates that need to be on the local wiki:

* {{WLE}}: contains markup code for log entries (just copy it from Commons)
* {{welcome}}: contains the information for new users

This script understands the following command-line arguments:

-edit[:#]       Define how many edits a new user needs to be welcomed
                 (default: 1, max: 50)

-time[:#]       Define how many seconds the bot sleeps before restart
                 (default: 3600)

-break          Use it if you don't want the bot to restart at the end
                 (it will stop after the first check) (default: False)

-nlog           Use this parameter if you do not want the bot to log all
                 welcomed users (default: False)

-limit[:#]      Use this parameter to define how many users should be
                 checked (default: 50)

-offset[:TIME]  Skip the latest new users (those newer than TIME)
                 to give interactive users a chance to welcome the
                 new users (default: now)
                 Timezone is the server timezone, GMT for Wikimedia
                 TIME format : yyyymmddhhmmss

-timeoffset[:#] Skip the latest new users, accounts newer than
                 # minutes

-numberlog[:#]  The number of users to welcome before refreshing the
                 welcome log (default: 4)

-filter         Enable the username checks for bad names (default: False)

-ask            Use this parameter if you want to confirm each possible
                 bad username (default: False)

-random         Use a random signature, taking the signatures from a wiki
                 page (for instruction, see below).

-file[:#]       Use a file instead of a wiki page to take the random
                 signature from. If you use this parameter, you don't need
                 to use -random.

-sign           Use one signature from command line instead of the default

-savedata       This feature saves the random signature index so that the
                 bot can continue welcoming with the last signature used.

-sul            Welcome the auto-created users (default: False)

-quiet          Prevents users without contributions from being displayed

-quick          Provide a quick check by bulk-retrieving user data via the API

***************************** GUIDE *******************************

Report, badword and whitelist guide:

  1. Set in the code which pages the bot will use to load the badword list, the whitelist and the report.

  2. On these pages, add a "tuple" with the names that you want to add to the two lists, for example: ('cat', 'mouse', 'dog'). You can also write other text on the page; it will work without problems.

  3. What do the two pages do? The bot checks whether a badword occurs in the username and, if so, sets "warning" to True. Then the bot checks whether a whitelisted word occurs in the username; if so, it removes that word and rechecks the badword list to see whether other badwords remain in the username (see the sketch after this list). Example:

    * dio is a badword
    * Claudio is a normal name
    * The username is "Claudio90 fuck!"
    * The Bot finds dio and sets "warning"
    * The Bot finds Claudio and sets "ok"
    * The Bot finds fuck at the end and sets "warning"
    * Result: The username is reported.
    
  4. When a user is reported, you have to check the account and then:
    • If the user is OK, put the {{welcome}} template on the talk page.
    • If not, block the user.
    • You can decide whether or not to add a "you are blocked, choose another username" template.
    • Delete the username from the report page.

    IMPORTANT: The bot checks each user in this order:
    • If the user has a talk page, skip.
    • If the user is blocked, skip.
    • If the user is already on the report page, skip.
    • Otherwise, report the user.
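A sketch of the check described in step 3 (illustrative only, not the
script's actual code): whitelisted words are removed from the name before
the badword list is rechecked.

    def is_suspect(username, badwords, whitelist):
        """Flag a username unless whitelisted words explain every badword hit."""
        name = username.lower()
        # Remove whitelisted words first, then look for remaining badwords.
        for good in whitelist:
            name = name.replace(good.lower(), '')
        return any(bad.lower() in name for bad in badwords)

    badwords = ('dio', 'fuck')
    whitelist = ('claudio',)
    assert is_suspect('Claudio90 fuck!', badwords, whitelist)   # 'fuck' remains
    assert not is_suspect('Claudio90', badwords, whitelist)     # 'dio' inside 'Claudio'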

Random signature guide:

Some welcomed users will answer to the one who has signed the welcome message. When you welcome many new users, you might be overwhelmed with such answers. Therefore you can define usernames of other users who are willing to receive some of these messages from newbies.

  1. Set the page that the bot will load
  2. Add the signatures in this way:

*<SPACE>SIGNATURE <NEW LINE>

Example:

<pre>
* [[User:Filnik|Filnik]]
* [[User:Rock|Rock]]
</pre>

NOTE: The white space and <pre></pre> aren't required, but I suggest you use
them.

*********************** Known issues/FIXMEs ************************

* The regex to load the user might be slightly different from project to
  project. (In this case, write to Filnik or the PWRF for help...)
* Use a class to group together the functions used.

**************************** Badwords ******************************

The badword list in the code is open. If you think that a word is international and must be blocked on all projects, feel free to add it. Likewise, if you think that a word isn't so international, feel free to delete it.

However, there is a dynamic wiki page from which the badwords of your project can be loaded, or you can add them directly to the source code that you are using, without adding to or deleting from the shared list.

Some words, like "Administrator", "Dio" (God in Italian) or "Jimbo", aren't badwords at all but can be used in bad nicknames.

exception scripts.welcome.FilenameNotSet(arg)[source]

Bases: pywikibot.exceptions.Error

An exception indicating that a signature filename was not specified.

class scripts.welcome.Global[source]

Bases: object

Container class for global settings.

attachEditCount = 1
confirm = False
defaultSign = '--~~~~'
dumpToLog = 15
filtBadName = False
makeWelcomeLog = True
offset = 0
queryLimit = 50
quick = False
quiet = False
randomSign = False
recursive = True
saveSignIndex = False
signFileName = None
timeRecur = 3600
timeoffset = 0
welcomeAuto = False
class scripts.welcome.WelcomeBot[source]

Bases: object

Bot to add welcome messages on User pages.

badNameFilter(name, force=False)[source]
check_managed_sites()[source]

Check that site is managed by welcome.py.

defineSign(force=False)[source]
makelogpage(queue=None)[source]
parseNewUserLog()[source]
reportBadAccount(name=None, final=False)[source]
run()[source]
scripts.welcome.load_word_function(raw)[source]

Load the badword list and the whitelist.

scripts.welcome.main(*args)[source]

Process command line arguments and invoke bot.

If args is an empty list, sys.argv is used.

Parameters:args (list of unicode) – command line arguments
scripts.welcome.showStatus(n=0)[source]

Output colorized status.