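Talking about Python (7): Starting from Go...ogle
As soon as class began today, Teacher Daniu gave everyone a problem:
Programming: in the string “goOoOogle”, find the part that starts with “O” and ends with “O”.
“That's easy, watch me.” Before long, Xiaocai had an answer: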
>>> s="goOoOogle"
>>> s.find("O")
2
>>> s.find("O",3)
4
>>> s[2:5]
'OoO'
>>>
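“Xiaocai, how can you be so dense!” Xiaobai sighed to himself. “Isn't the teacher obviously testing us on regular expressions with this problem?”
Sure enough, just as Xiaobai expected, Teacher Daniu gave another problem:
In “goOoOogle”, now find the run of consecutive “o” characters, ignoring case.
“It's just a loop with a check, keep calling find.” Xiaocai set off on another algorithmic journey of his own.
“Xiaobai, you answer this one.” Seeing that Xiaobai seemed to be done already, Daniu called on him.
“Isn't this simple? Just match it with a regular expression.” Xiaobai scratched the back of his head. “As for the program... I haven't written it yet, heh.”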
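Xiaobai's idea can be sketched with the standard re module. This is only an illustration, not the program from the class: the names s, first and runs are made up here, and the first pattern assumes that both ends of the match must be an uppercase “O”.

```python
import re

s = "goOoOogle"

# Problem 1: the part that starts with "O" and ends with "O".
# The greedy ".*" stretches from the first "O" to the last one.
first = re.search(r"O.*O", s)
print(first.group())   # OoO

# Problem 2: runs of consecutive "o", ignoring case.
runs = re.findall(r"o+", s, flags=re.IGNORECASE)
print(runs)            # ['oOoOo']
```

Compared with Xiaocai's repeated find calls, the pattern states what to look for and leaves the scanning to the regex engine.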
First, let's look at our “enhanced” crawler:
import chardet, urllib.request, urllib.parse, re
from sparrow.http.response import Response

class Crawler(object):
    def __init__(self, reqQueue=None, resQueue=None):
        # Two queues: one supplies page requests, the other stores the crawl results
        self.reqQueue = reqQueue
        self.resQueue = resQueue

    def getData(self, request):
        # Fetch the page data and the encoding reported by the server
        response = urllib.request.urlopen(request)
        info = Response(response)
        encoding = info.charset
        data = response.read()
        return (encoding, data)

    def getLinks(self, url, content):
        # Parse the page content and collect every link
        rule_link = re.compile(r"(?i)<a\s+href=(?P<link>.+?)[\s>]")  # regular expression
        links = rule_link.findall(content)
        norLinks = []
        for link in links:
            # Complete partial links (usually links within the same site)
            link = link.strip("\"\'")  # strip surrounding quotes
            us = urllib.parse.urlsplit(link)
            if us.scheme == '':
                link = urllib.parse.urljoin(url, link)
            norLinks.append(link)
        return norLinks

    def getEnco
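
To see what the getLinks step does without the queues or the sparrow package, here is a minimal standalone sketch. It reuses the same pattern and the same urlsplit/urljoin logic; the function name extract_links, the sample HTML and the example.org URLs are assumptions made up for this illustration.

```python
import re
import urllib.parse

# Same pattern as in getLinks above.
RULE_LINK = re.compile(r"(?i)<a\s+href=(?P<link>.+?)[\s>]")

def extract_links(base_url, content):
    """Return an absolute URL for every <a href=...> found in content."""
    result = []
    for link in RULE_LINK.findall(content):
        link = link.strip("\"'")  # drop surrounding quotes
        if urllib.parse.urlsplit(link).scheme == "":
            # Relative link: resolve it against the URL of the page.
            link = urllib.parse.urljoin(base_url, link)
        result.append(link)
    return result

# Hypothetical page content, made up for this example.
html = ('<a href="/about">About</a> '
        "<A HREF='http://example.com/x' class=c>X</A> "
        '<a href=page2.html>Next</a>')
links = extract_links("http://example.org/docs/index.html", html)
print(links)
```

The pattern is deliberately simple: it only recognizes href when it is the first attribute after <a, so a real crawler would probably want an HTML parser instead.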
Related documents:
Lines below that start with "#" are Ubuntu terminal commands
1. First, install Ubuntu 10.04
See http://wiki.ubuntu.org.cn/
2. Change the root user's password
3. Log in to the system as root
4. Ubuntu already ships with Python 2.6.5 installed by default
5. Download stackless
See http://zope.stackless.com/download/sdocument_view
# cd /usr/src
# wget http://www.sta ......
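Since version 1.5, Python has included the re module, which provides Perl-style regular expression patterns. Versions before Python 1.5 provided Emacs-style patterns through the regex module. Emacs-style patterns are somewhat less readable and also less powerful, so avoid the regex module when writing new code, although you may still occasionally come across it in old code.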
......
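Recently I have been processing some data with Python. The data has to be stored in a MySQL database, and I use MySQLdb for the database operations, but one problem troubled me for a long time. When opening the database, MySQLdb.connect(self.host, self.user, self.password, self.database, port=self.port) raises an exception, and the place where the exception occurs is very strange.
It occurs at line 164 of converters.py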
from decimal import ......
The exec statement is used to execute Python statements stored in a string or a file. For example, we can generate a string containing Python code at runtime and then run those statements with exec. Here is a simple example.
>>> exec 'print "Hello World"'
Hello World
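The session above uses Python 2 syntax, where exec is a statement. In Python 3, exec is a built-in function, so the same idea would look roughly like the sketch below. The names code and namespace are made up for this illustration.

```python
# Python 3: exec is a function, so the code string goes in parentheses.
exec('print("Hello World")')

# Build a statement at runtime and run it in its own namespace dict.
code = "total = " + " + ".join(str(n) for n in range(1, 5))
namespace = {}
exec(code, namespace)
print(code)                 # total = 1 + 2 + 3 + 4
print(namespace["total"])   # 10
```

Passing a separate dict keeps names created by the executed string out of the caller's own variables.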
eval is used to evaluate a valid Python expression stored in a string. Here is a simple example.
>>> ......
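The original example is cut off here. As a separate illustration (not the elided one), eval takes a string holding a single expression and returns its value. The names expr, value and outcome are made up for this sketch.

```python
expr = "2 * 3 + len('abc')"
value = eval(expr)   # evaluates the expression and returns the result
print(value)         # 9

# eval only accepts expressions; a statement such as an assignment fails.
try:
    eval("x = 1")
    outcome = "evaluated"
except SyntaxError:
    outcome = "not an expression"
print(outcome)       # not an expression
```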
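We have mentioned two ways of executing Python statements. In fact, the two ways are essentially the same: in both, the interpreter interprets and executes the Python statements we provide. Interpreted execution here is meant in contrast to compiled execution. As we know, a program written in a compiled language such as C or C++ can be converted from its source files into the machine language used by the computer, and is then linked by a linker into a binary executable file. When we ......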