Talking Python (4): Welcome, Little Sparrow
С°×ÊǸö΢ÈíÃÔ£¬ËûµÄżÏñÊDZȶû´óÊ壬ÔÒòµ±È»ÊǵØÇòÈ˶¼ÖªµÀÀ²¡£´ó¶þµÄʱºò£¬ËûµÄ“ê¡Ñ§¼Æ»®”ÔøÒ»¶ÈµÃ³Ñ£¬ÔÒòÊÇËû¹Ò¿ÆÌ«¶à¡£µ±È»£¬´óÈýÐÂѧÆÚ¿ªÊ¼µÄʱºò£¬Ãæ¶Ô¹«ÖÚÖÊÒÉ£¬Ð¡°×Õ¾ÔÚÒÎ×ÓÉÏ£¬Ïñ¼«ÁË¡¶´óÄÚÃÜ̽ÁãÁã·¢¡·ÀïµÄÎ÷ÃÅ´µÑ©£º“ÊÀ½çÊ׸»±È²»Ò»¶¨Óжà³öÉ«£¬ÕâÖ»²»¹ýÊÇÄãÃÇÕâЩÐǶ·ÊÐÃñÒ»ÏáÇéÔ¸µÄÏë·¨°ÕÁË¡£”
ÿÌìС°×¶¼»áÔÚËÞÉá×ªÓÆ£¬ºÃÏñºÜ“¹Â¶À”£¬×ìÀïÄîÄîÓдʣº“Õâ¸öÊÀ½çÕýÔÚ·¢Éú×Å·Ì츲µØµÄ±ä»¯£¬¶øÎÒÃÇÈ´ÏñÃ«Â¿ËÆµÄÉú»î¡£”×îºó£¬Ëû×Ü»áÀ´Ò»¾ä£º“ÎÒÒª³ÉÁ¢µÚ¶þ¸ö¹È¸è£¡”
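In this lesson we will get to know search engines and write a small web crawler.
What are the parts of a search engine?
First, thanks to the original author of this diagram, and above all to Country. From the diagram we can see the overall flow. First, the web spider fetches pages and stores their content and links in a database. Next, the indexing module builds an index from keywords to URLs for the retrieval module to query. The retrieval module then pulls data out of the index database according to what you typed in. The main modules are: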
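Page-fetching module: consists of the Crawler and Crawler control. The Crawler fetches pages and analyzes their links, returning page and url; Crawler control manages and schedules the Crawler.
Page storage module: the Page cache, which stores the page content fetched by the Crawler.
Indexing module: builds an index from keywords to links and pages.
Retrieval module: breaks the query down into words suitable for lookup.
User interface: accepts user input and passes it to the retrieval module.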
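To make the indexing and retrieval modules a little more concrete, here is a minimal sketch of a keyword-to-URL index and a lookup over it. It is only an illustration under assumptions: the names `tokenize`, `build_index`, and `search`, the sample pages, and the "every term must match" rule are all hypothetical choices, not part of Sparrow itself.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately naive splitter)."""
    return re.findall(r"\w+", text.lower())


def build_index(pages):
    """Map each keyword to the set of URLs whose page text contains it.

    `pages` is a dict of url -> page text, standing in for the page cache.
    """
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the sorted URLs that contain every term of the query."""
    terms = tokenize(query)
    if not terms:
        return []
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return sorted(results)


# Hypothetical page cache contents, used only for illustration.
pages = {
    "http://example.com/a": "Python makes writing a web crawler easy",
    "http://example.com/b": "A search engine needs an index",
    "http://example.com/c": "Python search engine tutorial",
}
index = build_index(pages)
print(search(index, "python search"))  # only page c contains both words
```

Splitting the query with the same `tokenize` function that built the index keeps the two modules consistent: a word is found only if it was stored in exactly the same form.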
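Over the coming lessons we will use the Python we have learned to develop a small search engine. Its name is Sparrow, as in the saying "small as the sparrow is, it has all the vital organs". Our "sparrow" will fly higher and higher as our knowledge grows, and who knows, it may even turn into a phoenix. For now, of course, it has not yet taken off.
Let's begin our search engine journey!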
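The first module we will study is the page-fetching module (Crawler), also known as a web spider (Spider).
This module is implemented by the Crawler class. On initialization the class receives the url passed in by the Crawler control module; when it finishes, it returns the page content, page, and the url links found in the page, link. The source code is as follows: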
import urllib.request  # used to fetch page content
import urllib.parse  # module for parsing URLs
import re  # regular expressions
import queue  # module for working with queues
class Crawler(object):  # web spider
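As a rough illustration of the behaviour described above (take a url, return the page and the links found in it), here is a minimal sketch. It is not the original Sparrow source. The `crawl` method name, the `fetch` hook, the `HREF_RE` pattern, the `crawl_site` loop standing in for Crawler control, and the fake site used to try it offline are all assumptions made for this example.

```python
import queue
import re
import urllib.parse
import urllib.request

# Naive pattern for href="..." attributes; a real crawler would use an HTML parser.
HREF_RE = re.compile(r'href\s*=\s*["\']([^"\'#]+)', re.IGNORECASE)


class Crawler(object):
    """Fetch one URL and return its page content plus the links found in it."""

    def __init__(self, url, fetch=None):
        self.url = url
        # `fetch` is a hypothetical hook so the class can be tried offline.
        self.fetch = fetch or self._download

    def _download(self, url):
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.read().decode("utf-8", errors="replace")

    def crawl(self):
        page = self.fetch(self.url)
        links = []
        for href in HREF_RE.findall(page):
            # Resolve relative links against the URL of the current page.
            absolute = urllib.parse.urljoin(self.url, href.strip())
            if absolute.startswith(("http://", "https://")) and absolute not in links:
                links.append(absolute)
        return page, links


def crawl_site(start_url, fetch=None, max_pages=10):
    """Breadth-first scheduling loop, standing in for Crawler control."""
    pending = queue.Queue()
    pending.put(start_url)
    seen = {start_url}
    pages = {}
    while not pending.empty() and len(pages) < max_pages:
        url = pending.get()
        try:
            page, links = Crawler(url, fetch).crawl()
        except (OSError, ValueError):
            continue  # skip pages that cannot be fetched
        pages[url] = page
        for link in links:
            if link not in seen:
                seen.add(link)
                pending.put(link)
    return pages


# A made-up three-page site so the sketch runs without network access.
FAKE_SITE = {
    "http://example.com/": '<a href="/about">About</a> <a href="news.html">News</a>',
    "http://example.com/about": '<a href="/">Home</a>',
    "http://example.com/news.html": "No links here",
}


def fake_fetch(url):
    if url not in FAKE_SITE:
        raise OSError("page not found: " + url)
    return FAKE_SITE[url]


pages = crawl_site("http://example.com/", fetch=fake_fetch)
print(sorted(pages))
```

Leaving out the `fetch` argument makes the class download real pages with `urllib.request.urlopen`. The `seen` set keeps the scheduler from queueing the same url twice, and `max_pages` puts a hard limit on how far a crawl can run.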