User:Brett Sargent/cis410/Homework5

From MCIS Wiki

Python Final Project

I wrote a program that gathers stories from a Western State news feed and snowfall data for a few of the local ski resorts. The program formats this information and posts it to my user page on Wikipedia.

I had initially intended this program to work with the MCIS wiki, and I was halfway to getting it working before I found out that the MCIS Wiki API is missing a few methods, such as the one to edit a page. Because Wikipedia and the MCIS wiki are both powered by MediaWiki, I didn't have to change any of the code I had written.

The MediaWiki API was easy to use and well documented. The trickiest part is logging in: you have to send your credentials to the server in order to receive a cookie containing various tokens. Here is the code for loading a cookie file that may already be present:

import os
import re
import urllib
import urllib2
import cookielib

COOKIEFILE = 'cookies.lwp'

# Load any cookies saved from a previous session.
cj = cookielib.LWPCookieJar()
if os.path.isfile(COOKIEFILE):
	cj.load(COOKIEFILE)

# Install an opener so the cookie jar is used on every urllib2 request.
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
urllib2.install_opener(opener)

The cookie file is set up to be read from and written to, and the urllib2.build_opener and urllib2.install_opener calls make urllib2 use the cookie jar every time it opens a URL. When logging in to Wikipedia, the cookie is then handled automatically.
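In Python 3, urllib2 and cookielib were merged into urllib.request and http.cookiejar; a minimal sketch of the same setup (including saving the jar back to disk so the next run can reuse the session) looks like this:

```python
# Python 3 equivalent of the cookie setup above (urllib2 and cookielib
# became urllib.request and http.cookiejar in Python 3).
import os
import urllib.request
from http.cookiejar import LWPCookieJar

COOKIEFILE = 'cookies.lwp'

# Load any cookies saved from a previous session.
cj = LWPCookieJar(COOKIEFILE)
if os.path.isfile(COOKIEFILE):
    cj.load()

# Install a global opener so every urllib.request.urlopen() call
# sends and stores cookies through this jar.
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
urllib.request.install_opener(opener)

# After a successful request, persist the jar for the next run.
cj.save()
```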

Here is the code for logging in.

def loadURL(url, values):
	# POST the form-encoded values and return the response body.
	req = urllib2.Request(url, urllib.urlencode(values))
	return urllib2.urlopen(req).read()

def login(name, pw):
	# Capture the result attribute from the XML response,
	# e.g. <login result="Success" ... />.
	WikiResult = re.compile('login result="([a-zA-Z]*)')
	# format=xml asks for plain XML, so the quote is a literal character.
	url = 'http://en.wikipedia.org/w/api.php?action=login&format=xml'
	values = {'lgname' : name, 'lgpassword' : pw}
	result = WikiResult.search(loadURL(url, values))
	print result.group(1)
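Because the regex match covers the attribute name as well as its value, a capture group (rather than slicing the match by a hard-coded offset) is the safest way to pull out just the result. A quick offline check of that parsing, against a made-up response fragment, looks like this:

```python
import re

# A made-up fragment of the XML the login action returns; a real
# response carries more attributes (token, userid, cookie prefix, ...).
sample = '<api><login result="Success" lguserid="12345" /></api>'

# Capture just the value of the result attribute.
wiki_result = re.compile(r'login result="([a-zA-Z]*)')
match = wiki_result.search(sample)
print(match.group(1))  # -> Success
```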

The next step is to get an edit token for the page. You query the page asking for a token, and it is returned in the XML response. The token can then be passed to the edit action to edit the page. Here is the code for those methods:

def getToken(page):
	# Ask the query action for an edit token on the given page.
	url = 'http://en.wikipedia.org/w/api.php?action=query&format=xml'
	values = {'prop'   : 'info|revisions',
	          'intoken': 'edit',
	          'titles' : page}
	the_page = loadURL(url, values)
	# Capture the token value, e.g. edittoken="abc123+\".
	findToken = re.compile('edittoken="([A-Za-z0-9]*)')
	searchRes = findToken.search(the_page)
	# The character class stops before the trailing "+\", so append it back.
	return searchRes.group(1) + "+\\"

def editPage(page, textToAdd, section):
	WikiResult = re.compile('edit result="([a-zA-Z]*)')
	url = 'http://en.wikipedia.org/w/api.php?action=edit&format=xml'
	token = getToken(page)
	values = {'section': section,
	          'text'   : textToAdd,
	          'title'  : page,
	          'token'  : token}
	result = WikiResult.search(loadURL(url, values))
	print result.group(1)
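Edit tokens end in the two characters "+\", which the regex's character class does not match, so they have to be re-appended after extraction. That extraction can be checked offline against a made-up query response:

```python
import re

# Made-up fragment of a query response; real edit tokens end in "+\".
sample = '<page pageid="42" title="Sandbox" edittoken="d41d8cd98f00b2+\\" />'

# Capture the alphanumeric part, then restore the trailing "+\".
find_token = re.compile(r'edittoken="([A-Za-z0-9]*)')
token = find_token.search(sample).group(1) + "+\\"
print(token)  # -> d41d8cd98f00b2+\
```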

That is all of the interesting stuff; the rest is just parsing the XML from the RSS feeds and formatting the output with wiki markup.
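As a rough illustration of that remaining work (a sketch, not the project's actual parsing code, using a made-up feed), pulling item titles out of an RSS document and emitting a wiki bullet list can be done with the standard library alone:

```python
# Hedged sketch: parse a (made-up) RSS feed with the standard library
# and format the item titles as a wiki markup bullet list.
import xml.etree.ElementTree as ET

sample_rss = """<rss version="2.0"><channel>
  <title>Example News Feed</title>
  <item><title>First example story</title></item>
  <item><title>Second example story</title></item>
</channel></rss>"""

root = ET.fromstring(sample_rss)
# "* text" is wiki markup for a bullet list entry.
lines = ["* " + item.findtext("title") for item in root.iter("item")]
wiki_text = "\n".join(lines)
print(wiki_text)
```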

Here is the source code for the project: Media:BrettSargent CIS410 Homework5.zip