
A Detailed Walkthrough of a Python Crawler Program

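#!/usr/bin/python
The "magic" shebang line tells the system to run this script with Python.

from sys import argv
The sys module exposes parameters related to the Python interpreter and its environment; argv holds the command-line arguments.
from os import makedirs,unlink,sep
The os module mainly provides the functions needed for working with system paths, renaming files, and deleting files.
makedirs creates directories recursively. Suppose we want to create a new directory, /python/HTML/crawl, and none of these three folders exists yet. With the mkdir command we would have to run it three times, but a single call to os.makedirs creates the whole path.
os.makedirs(os.path.join(os.environ["HOME"],"python","HTML","crawl"))
os.unlink(path) deletes the file at path; it is the same as remove().
sep is os.sep, the character the system uses to separate path components.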
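A minimal sketch of the three names imported from os, assuming Python 3 and a throwaway temporary directory instead of the HOME path used in the article (the file name page.html is made up for illustration):

```python
import os
import tempfile

# Work inside a temporary directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, "python", "HTML", "crawl")
    os.makedirs(target)                 # creates all three missing levels at once
    created = os.path.isdir(target)

    scratch = os.path.join(target, "page.html")   # hypothetical file name
    with open(scratch, "w") as f:
        f.write("<html></html>")
    os.unlink(scratch)                  # same effect as os.remove(scratch)
    removed = not os.path.exists(scratch)

    # os.sep is the separator that os.path.join placed between the components.
    components = os.path.relpath(target, base).split(os.sep)

print(created, removed, components)
```

Calling os.makedirs on a path that already exists raises an error by default, which is one reason a crawler would check the directory first.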
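from os.path import dirname,exists,isdir,splitext
These os.path functions work on path names: dirname extracts the directory part of a path; exists and isdir are file tests (whether a path exists, and whether it is a directory); splitext separates a file name from its extension, giving two parts: the path with the file name, and the extension.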
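The four os.path helpers can be tried on a sample path. This is only a sketch: the path below is hypothetical and does not need to exist on disk.

```python
import os
from os.path import dirname, exists, isdir, splitext, join

path = join("www.example.com", "docs", "index.html")   # hypothetical local path

directory = dirname(path)         # the directory part of the path
root, extension = splitext(path)  # everything before the extension, and the extension
present = exists(path)            # True only if this path happens to exist
current_is_dir = isdir(os.curdir) # the current directory is a directory

print(directory)
print(root, extension)
print(present, current_is_dir)
```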
from string import replace,find,lower
Import functions from the string module for replacing, searching, and lowercasing strings.
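The string-module functions imported here exist only in Python 2. As a sketch, assuming Python 3, the same three operations are available as methods on the string itself (the URL is a made-up example):

```python
url = "HTTP://WWW.Example.COM/Docs/Index.html"   # hypothetical URL

lowered = url.lower()                        # Python 2: lower(url)
position = lowered.find("example")           # Python 2: find(lowered, "example"); -1 if absent
renamed = lowered.replace("index", "home")   # Python 2: replace(lowered, "index", "home")

print(lowered)
print(position)
print(renamed)
```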
from htmllib import HTMLParser
from urllib import urlretrieve
The urlretrieve() function downloads an entire HTML file to your local hard disk.
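A small sketch of urlretrieve(), assuming Python 3, where the function lives in urllib.request. To keep the example runnable offline, a local file addressed through a file:// URL stands in for a web page; the file names are made up.

```python
import os
import tempfile
from pathlib import Path
from urllib.request import urlretrieve   # Python 2: from urllib import urlretrieve

with tempfile.TemporaryDirectory() as base:
    # Stand-in for a remote page: a local file reached through a file:// URL.
    source = Path(base) / "source.html"
    source.write_text("<html><body>hello</body></html>", encoding="utf-8")

    saved_to = os.path.join(base, "copy.html")
    # urlretrieve returns the local file name and the response headers.
    local_name, headers = urlretrieve(source.as_uri(), saved_to)

    copied = Path(local_name).read_text(encoding="utf-8")

print(copied)
```

With an http:// URL the call has the same shape; only the first argument changes.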
from urlparse import urlparse,urljoin
urlparse splits a URL into 6 components,
while urljoin combines a base URL with a new URL.
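The two functions can be sketched as follows, assuming Python 3, where the urlparse module has become urllib.parse. The URLs are made-up examples.

```python
from urllib.parse import urlparse, urljoin   # Python 2: from urlparse import ...

parts = urlparse("http://www.example.com/docs/index.html;v=1?lang=en#top")
count = len(parts)   # the six components: scheme, netloc, path, params, query, fragment

print(count)
print(parts.scheme, parts.netloc, parts.path)
print(parts.params, parts.query, parts.fragment)

base_url = "http://www.example.com/docs/index.html"
relative = urljoin(base_url, "chapter1.html")   # link relative to the base page
rooted = urljoin(base_url, "/about.html")       # link relative to the site root

print(relative)
print(rooted)
```

A crawler needs urljoin because links inside a page are often relative, and they must be turned into full URLs before they can be downloaded.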
from formatter import DumbWriter,AbstractFormatter
The formatter module is mainly used to format text.
from cStringIO import StringIO
StringIO from cStringIO is used to work with a file held in memory.
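A brief sketch of an in-memory file, assuming Python 3, where io.StringIO takes the place of cStringIO.StringIO:

```python
from io import StringIO   # Python 2: from cStringIO import StringIO

buffer = StringIO()
buffer.write("first line\n")     # write to it like an ordinary file
buffer.write("second line\n")

text = buffer.getvalue()         # everything written so far, as one string
buffer.seek(0)                   # rewind before reading it back
lines = buffer.readlines()
buffer.close()

print(text)
print(lines)
```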

class Retriever:
The Retriever class is responsible for downloading pages from the web and analyzing the links in each document; links that meet the download criteria are added to the “to be processed” queue. Every page downloaded from the web has its own Retriever instance. Retriever has several helper methods: the constructor (__init__()), filename(), download(), and parseAndGetLinks().
 def __init__(self,url):
This defines the constructor. self is a reference to the current instance of the class, that is, the newly created object; the other parameter is url. The constructor instantiates a Retriever object and stores the URL string, together with the corresponding file name returned by filename(), as local attributes.
  self.url=url
Assigns the value of url to self.url.
  self.file=self.filename(url)
Stores the file name that filename() returns for this URL in self.file.
 def filename(self,url,deffile="index
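The constructor described above can be sketched in Python 3 as follows. The article's own filename() is cut off, so the body used here is a hypothetical stand-in written for illustration only; the default_name parameter and the URL-to-path mapping are assumptions, not the original code.

```python
import os
from urllib.parse import urlparse


class Retriever:
    """Sketch of the constructor only; download() and parseAndGetLinks() are omitted."""

    def __init__(self, url):
        self.url = url                   # keep the page's URL
        self.file = self.filename(url)   # keep the matching local file name

    def filename(self, url, default_name="index.html"):
        # Hypothetical mapping (an assumption, not the article's code):
        # host plus URL path, with default_name added when no file is named.
        parts = urlparse(url)
        name = parts.path
        if not os.path.splitext(name)[1]:        # no extension: treat as a directory
            name = name.rstrip("/") + "/" + default_name
        return (parts.netloc + name).replace("/", os.sep)


page = Retriever("http://www.example.com/docs/")
print(page.url)
print(page.file)
```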


Related documents:

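Python and Chinese text

If Chinese characters appear in a Python source file, add a code-page (encoding) declaration like the following as the first line of the file:
# -*- coding:gbk -*-
If the program's output contains Chinese, you can include the following code at the start of the program so that the Chinese output is displayed correctly: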
    import sys
    reload(sys)
    sys.setdefaultencoding('gbk')
......

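Beginning Python summary, chapter 6: Abstraction

1. Use def to define functions.
2. Function documentation: a string written at the start of a function is stored as part of the function and is called a "docstring".
   The built-in help function is very useful; if you use it in the interactive interpreter, you can get ......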

Python Strings

A string is a sequence of characters. Strings are basically just a bunch of words.
I can almost guarantee that you will use strings in every Python program you write, so pay close attention to the following part. Here is how to use strings in Python.
Using single quotes (')
You can specify strings using single quotes, such as 'Quote me on this'. All white space, i.e. spaces and tabs, is preserved as-is. ......