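A detailed walkthrough of a Python crawler program
1 #!/usr/bin/python
The shebang ("magic" first line) tells the system to run this script with the Python interpreter.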
2
3 from sys import argv
The sys module exposes parameters related to the Python interpreter and its environment; argv is the list of command-line arguments.
4 from os import makedirs,unlink,sep
The os module mainly provides the functions needed for working with system paths, renaming files, and deleting files.
makedirs creates directories recursively. Suppose we want to create a new directory, /python/HTML/crawl, and none of these three folders exists yet. With mkdir it takes three calls, but a single call to os.makedirs creates the whole path:
os.makedirs(os.path.join(os.environ["HOME"],"python","HTML","crawl"))
os.unlink(path) deletes the file at path; it is the same as os.remove().
os.sep is the character the operating system uses to separate path components.
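The three names imported from os can be tried out safely in a temporary directory. This is a minimal sketch written for Python 3; the directory layout mirrors the example above, while the file name page.txt and the variable names are assumptions made for illustration.

```python
import os
import tempfile

# Work inside a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as base:
    # One call creates python/, python/HTML/ and python/HTML/crawl/.
    target = os.path.join(base, "python", "HTML", "crawl")
    os.makedirs(target)
    created = os.path.isdir(target)

    # Create a file (hypothetical name), then delete it with unlink,
    # which behaves the same as os.remove.
    page = os.path.join(target, "page.txt")
    with open(page, "w") as handle:
        handle.write("placeholder")
    os.unlink(page)
    removed = not os.path.exists(page)

    # os.sep is the separator that os.path.join placed between components.
    uses_sep = target.endswith(os.sep.join(["python", "HTML", "crawl"]))

print(created, removed, uses_sep)
```

Note that os.makedirs raises an error if the final directory already exists, unless exist_ok=True is passed (Python 3 only).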
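5 from os.path import dirname,exists,isdir,splitext
These os.path functions work on path names: dirname extracts the directory part of a path, exists tests whether a path exists, isdir tests whether it is a directory, and splitext separates a file name from its extension, returning two parts: the path including the file name, and the extension.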
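As a quick illustration of the four functions, here is a small sketch. The path python/HTML/crawl/page.html is a hypothetical example and does not need to exist, because dirname and splitext only manipulate strings.

```python
import os
from os.path import dirname, exists, isdir, splitext

# Hypothetical path, built with the platform's own separator.
path = os.path.join("python", "HTML", "crawl", "page.html")

directory = dirname(path)         # everything before the last separator
root, extension = splitext(path)  # path without extension, and the extension

print(directory)
print(root, extension)

# exists and isdir look at the real file system; the current
# directory always exists and is a directory.
print(exists(os.curdir), isdir(os.curdir))
```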
6 from string import replace,find,lower
These functions from the string module are used to replace text in a string, search a string, and convert a string to lowercase.
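The listing targets Python 2, where replace, find, and lower can be imported from the string module. In Python 3 those module-level functions are gone and the same operations are methods of str. A minimal sketch of the Python 3 form, with a made-up input string:

```python
url = "HTTP://Example.COM/Some Page.HTML"  # hypothetical input

lowered = url.lower()               # Python 2: lower(url)
position = lowered.find("example")  # Python 2: find(lowered, "example")
safe = lowered.replace(" ", "_")    # Python 2: replace(lowered, " ", "_")
missing = lowered.find("ftp")       # find returns -1 when there is no match

print(lowered)
print(position, missing)
print(safe)
```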
7 from htmllib import HTMLParser
8 from urllib import urlretrieve
urlretrieve() downloads an entire HTML file to your local hard disk.
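To see what urlretrieve() returns without touching the network, the sketch below points it at a local file through a file:// URL. It assumes Python 3, where the function lives in urllib.request; the file names source.html and copy.html are placeholders.

```python
import os
import tempfile
from pathlib import Path
from urllib.request import urlretrieve  # Python 2: from urllib import urlretrieve

with tempfile.TemporaryDirectory() as base:
    # Stand-in for a remote page: a local file addressed by a file:// URL.
    source = Path(base) / "source.html"
    source.write_text("<html><body>hello</body></html>")

    # urlretrieve(url, filename) copies the resource to filename and
    # returns a (filename, headers) pair.
    destination = os.path.join(base, "copy.html")
    saved_path, headers = urlretrieve(source.as_uri(), destination)

    with open(saved_path) as handle:
        content = handle.read()

print(content)
```

With an http:// URL the call is the same; only the source of the bytes changes.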
9 from urlparse import urlparse,urljoin
urlparse splits a URL into six components (scheme, netloc, path, params, query, fragment),
while urljoin combines a base URL with a new URL.
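Both functions are easy to explore interactively. The sketch below assumes Python 3, where they live in urllib.parse; the example.com and example.org URLs are placeholders.

```python
from urllib.parse import urlparse, urljoin  # Python 2: from urlparse import ...

parts = urlparse("http://www.example.com/docs/index.html;v=1?lang=en#top")
# parts is a 6-field named tuple:
# scheme, netloc, path, params, query, fragment
print(parts.scheme, parts.netloc, parts.path)
print(parts.params, parts.query, parts.fragment)

# urljoin resolves a link found in a page against that page's own URL.
base = "http://www.example.com/docs/index.html"
same_dir = urljoin(base, "guide.html")    # relative link, same directory
site_root = urljoin(base, "/about.html")  # link relative to the site root
absolute = urljoin(base, "http://other.example.org/x.html")  # absolute link is kept

print(same_dir)
print(site_root)
print(absolute)
```

A crawler typically calls urljoin on every link it extracts, so that relative links become full URLs before they are queued.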
10 from formatter import DumbWriter,AbstractFormatter
The formatter module is mainly used to format text.
11 from cStringIO import StringIO
cStringIO provides StringIO, a file-like object for working with data held in memory.
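A StringIO object behaves like an open file but never touches the disk. The sketch below assumes Python 3, where the class is imported from io instead of cStringIO; the sample text is arbitrary.

```python
from io import StringIO  # Python 2: from cStringIO import StringIO

buffer = StringIO()
buffer.write("first line\n")
buffer.write("second line\n")

text = buffer.getvalue()    # everything written so far, as one string

buffer.seek(0)              # rewind, then read it back like a file
lines = buffer.readlines()
buffer.close()

print(text)
print(lines)
```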
12
13 class Retriever:
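The Retriever class downloads pages from the web and analyzes the links in each document; links that meet the download criteria are added to the "to-do" queue. Every page downloaded from the web has its own Retriever instance. Retriever has several methods that help it do its job: the constructor (__init__()), filename(), download(), and parseAndGetLinks().
14     def __init__(self,url):
This defines the constructor. self refers to the current instance of the class, that is, the newly created object; the other parameter is url. The constructor instantiates a Retriever object and stores the URL string, together with the matching file name returned by filename(), as instance attributes.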
15         self.url=url
Assigns the value of url to self.url.
16         self.file=self.filename(url)
Calls filename() with the URL and stores the file name it returns in self.file.
17     def filename(self,url,deffile="index
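The link-analysis step described for Retriever can be sketched in Python 3 as follows. This is not the listing's parseAndGetLinks(): htmllib no longer exists in Python 3, so the sketch swaps in html.parser.HTMLParser, and the names LinkCollector and extract_links, as well as the sample HTML, are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Hypothetical helper: records the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(page_url, html_text):
    """Return absolute URLs for all links found in html_text."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    # Resolve each link against the URL of the page it came from.
    return [urljoin(page_url, link) for link in collector.links]


sample = '<p><a href="a.html">A</a> <a href="/b.html">B</a> <a name="x">no href</a></p>'
links = extract_links("http://www.example.com/docs/index.html", sample)
print(links)
```

A crawler would then filter this list (for example, keeping only links on the same site) before adding them to its queue.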
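Related documents:
To get UE (UltraEdit) to support the Python language I googled a lot, but nothing worked. Only after reading the blog post below did I find out what I had been doing wrong, so I am summarizing it here in case I forget.
http://wangtao.name/2009/12/20/ultraedit_python.html
The download page for the Python extension is on the official site: http://www.ultraedit.com/downloads/extras.html
It offers extensions for many languages, which enable syntax highlighting.
python 2.5: http://www.u ......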
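1. How can Python commands be run from the command line?
To run Python commands from the command line, append the directory that contains python.exe to the PATH environment variable.
2. How can the Python interpreter directly import third-party modules located outside the default installation path?
To import third-party modules outside the default installation path (such as modules you wrote yourself), create a new PYTHONPATH environment variable, with its value set ......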
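Python dictionaries
A dictionary is like an address book in which you look up a contact's address and details by name; that is, we associate keys (names) with values (details). Note that keys must be unique, just as you could not find the right information if two people happened to share the same name.
Note that you can only use immutable objects (such as strings) as dictionary keys, but you can use either immutable or mutable objects as dictionary ......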
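The address-book analogy translates directly into code. A minimal sketch with made-up names and details, showing unique keys and the rule that a mutable object such as a list cannot be used as a key:

```python
# Keys are names (immutable strings); values hold the contact details.
address_book = {
    "Alice": {"city": "Beijing", "phones": ["555-0100"]},  # a mutable value
    "Bob": ("Shanghai", "555-0101"),                        # an immutable value
}

alice = address_book["Alice"]  # look up details by name

# Assigning to an existing key replaces the old value: keys are unique.
address_book["Bob"] = ("Shenzhen", "555-0102")
entries = len(address_book)

# A list is mutable, so Python refuses it as a key.
try:
    address_book[["Carol", "Dave"]] = "shared entry"
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

print(alice["city"], entries, list_key_allowed)
```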
Enough talk, let's go straight to the code!
import os
path = 'e:/Download/'
kzm = []
kzmTemp = set()
kzmTemp2 = []
dict = {}
for root,dirs,files in os.walk(path):
    for file in files:
        ext = os.path.splitext(file)[1][1:]
  ......
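I first came across Python last year, before I had even graduated. Last week I was a bit bored and went through it again, and found that it really is quite good. The syntax is simple and an afternoon is enough to learn the basics; developing in Eclipse with the Pydev plugin is essentially hassle-free. The key point is that it is very efficient to work with: code runs directly, with no compile step. It is well suited to data preprocessing. Nice. I will make good use of it when I get the chance. ......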