Python:Ifexistslog.py

From Botwiki
# ifexistslog.py
# Fetch the page
#   http://noc.wikimedia.org/~tstarling/ifexist.log
# store it, and then find all strings like:
#   2007-12-03 06:27:16 zh_yuewiki: 131 http://zh-yue.wikipedia.org/wiki/%E6%B4%9B%E7%A3%AF%E5%B1%B1%E8%84%88
 
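# Written for Python 3 (urllib.request, input() and print() replace the
# Python 2 urllib.urlopen, raw_input() and print statement).
import re
import urllib.request

# Customize the following line for your language.
# The dot after the host prefix is escaped, and \S+ stops the match at whitespace.
urlX = re.compile(r'http://zh-yue\.\S+')

# Download the log. URLs in it are percent-encoded, so stray undecodable
# bytes are replaced instead of aborting the run.
with urllib.request.urlopen('http://noc.wikimedia.org/~tstarling/ifexist.log') as response:
    text = response.read().decode('utf-8', errors='replace')

# Keep a local copy of the raw log (appended to on every run).
with open('ifexists.log', 'a', encoding='utf-8') as saveFile:
    saveFile.write(text)

urls = urlX.findall(text)

# Pause and ring the terminal bell so the user knows the download is done.
input('\n################# Download finished; press return to list the matches.\n\a')

# TODO: replace the following with something that removes the duplicates.

for url in urls:
    print(url)
print(len(urls), 'matches found.')

# Write the matches to ifexists.sort, one per line. Press return to step
# through them one at a time; type anything else to write the rest unprompted.
answer = ''
with open('ifexists.sort', 'a', encoding='utf-8') as sortFile:
    for url in urls:
        print(url)
        if answer == '':
            answer = input('press return to continue; type something else to automate: ')
        sortFile.write(url + '\n')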
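
The script notes that its output still contains duplicates. Below is a minimal sketch of one way to remove them, written for Python 3.7+ (where dicts keep insertion order). It assumes every log line follows the layout of the sample in the header comment: date, time, wiki name followed by a colon, a number, and a URL. The names `LOG_LINE` and `unique_urls` are hypothetical and not part of the original script, and the meaning of the numeric field is not stated on this page, so the sketch only parses it and otherwise ignores it.

```python
import re

# Assumed layout, taken from the sample line in the script header:
#   <date> <time> <wiki>: <number> <url>
LOG_LINE = re.compile(
    r'^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) '
    r'(?P<wiki>\w+): (?P<number>\d+) (?P<url>http://\S+)$'
)


def unique_urls(log_text, wiki='zh_yuewiki'):
    """Return the URLs logged for `wiki`, in first-seen order, without duplicates."""
    seen = {}  # dict keys act as an ordered set
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        # Lines that do not fit the assumed layout are skipped silently.
        if match and match.group('wiki') == wiki:
            seen.setdefault(match.group('url'), None)
    return list(seen)
```

Filtering on the wiki name rather than on the URL prefix is a design choice for this sketch; the result of `unique_urls(text)` could be printed and written to `ifexists.sort` in place of the raw match list.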