Fetching the content of an HTML page with Python

# Python 2 script: urllib.urlopen and the HTMLParser module are Python 2 names.
import urllib
from HTMLParser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the page title and the text found inside <div> elements."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.title = ''
        self.divcontent = ''
        self.readingtitle = 0
        self.readingdiv = 0

    def handle_starttag(self, tag, attrs):
        if tag == 'title':
            self.readingtitle = 1
        if tag == 'div':
            self.readingdiv = 1

    def handle_data(self, data):
        if self.readingtitle:
            # Ordinarily, this is slow and a bad practice, but
            # we can get away with it because a title is usually
            # small and simple.
            self.title += data
        if self.readingdiv:
            self.divcontent += data

    def handle_endtag(self, tag):
        if tag == 'title':
            self.readingtitle = 0
        if tag == 'div':
            # The first closing </div> stops collection, even inside nested divs.
            self.readingdiv = 0

    def gettitle(self):
        return self.title

    def getdiv(self):
        return self.divcontent


def getweb(url):
    # Fetch the page at the given URL and return its raw content.
    return urllib.urlopen(url).read()


web = getweb('http://blog.chinaunix.net/u3/105068/showart_2223566.html')
test = TitleParser()
test.feed(web)

# Write the title, a line break, and the collected div text to a file.
with open('abinfile', 'w') as file_object:
    file_object.write(test.title)
    file_object.write("\r\n")
    file_object.write(test.divcontent)
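
The script above relies on Python 2 names (urllib.urlopen and the HTMLParser module). As a rough sketch only, the same idea could look like this on Python 3. The names TitleDivParser, parse_html and fetch are made up for this illustration, the depth counter for nested divs is an addition that the original script does not have, and the default UTF-8 encoding in fetch is an assumption (the real page may use a different charset).

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TitleDivParser(HTMLParser):
    """Collect the <title> text and all text nested inside <div> elements."""

    def __init__(self):
        super().__init__()
        self.title = ''
        self.div_parts = []
        self.in_title = False
        self.div_depth = 0  # > 0 while inside at least one <div>

    def handle_starttag(self, tag, attrs):
        if tag == 'title':
            self.in_title = True
        elif tag == 'div':
            self.div_depth += 1

    def handle_endtag(self, tag):
        if tag == 'title':
            self.in_title = False
        elif tag == 'div' and self.div_depth > 0:
            self.div_depth -= 1

    def handle_data(self, data):
        if self.in_title:
            self.title += data
        if self.div_depth > 0:
            self.div_parts.append(data)

    @property
    def divcontent(self):
        return ''.join(self.div_parts)


def parse_html(html):
    """Return (title, div_text) for an HTML string."""
    parser = TitleDivParser()
    parser.feed(html)
    parser.close()
    return parser.title, parser.divcontent


def fetch(url, encoding='utf-8'):
    """Download a page and decode it; the encoding default is an assumption."""
    with urlopen(url) as response:
        return response.read().decode(encoding, errors='replace')
```

To use it, pass the result of fetch(url) to parse_html and write the two returned strings to a file opened with an explicit encoding, for example open('abinfile', 'w', encoding='utf-8').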


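Related documents:

HTML 5 event attributes


Standard event attributes
HTML 4 added the ability to trigger actions in the browser through events, such as starting a piece of JavaScript when the user clicks an element.
To learn more about programming with these events, see our JavaScript tutorial and DHTML tutorial.
The table below lists the standard event attributes that can be inserted into HTML 5 elements to define event actions.
Differences between HTML 4.01 and HTML 5 ......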

Windows & Python & Emacs


Set the HOME for Emacs; to check it, enter (insert (getenv "HOME")) in the scratch buffer
Download python-mode.el
Open python-mode.el in Emacs, then run M-x byte-compile-file to compile python-mode.el to elc
Use C-h v to inspect the load-path variable
Add (setq load-path (cons "D:\\emacs-23.1-bin-i386" load-path)) to HOME/.emacs.d/init.el
py files and pytho ......

Character encoding in Python

1. The str type can be understood as a binary block, or multibyte
2. multibyte_str.decode("<multibyte_encode_method>")  -> unicode
3. unicode_str.encode("<multibyte_encode_method>")  -> multibyte_str (binary block)
4. Arguments passed to unicode_str operations should also be unicode, e.g. unicode_str.find("样本".deco ......
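
The rules above describe Python 2, where str is a byte string and unicode is the text type. As a hedged illustration only, here is how the same decode/encode round trip looks on Python 3, where bytes takes the role of the multibyte str and str takes the role of unicode. The helper names to_text, to_bytes and find_in_bytes are invented for this sketch, and GBK is just an example encoding.

```python
def to_text(raw: bytes, encoding: str) -> str:
    """Rule 2: decode a multibyte byte string into text."""
    return raw.decode(encoding)


def to_bytes(text: str, encoding: str) -> bytes:
    """Rule 3: encode text into a multibyte byte string."""
    return text.encode(encoding)


def find_in_bytes(raw: bytes, needle: str, encoding: str) -> int:
    """Rule 4: decode first, so both sides of find() are text."""
    return to_text(raw, encoding).find(needle)


sample = '样本'
gbk_bytes = to_bytes(sample, 'gbk')      # 2 bytes per character in GBK
utf8_bytes = to_bytes(sample, 'utf-8')   # 3 bytes per character in UTF-8

# Decoding with the matching encoding restores the original text.
round_trip_ok = to_text(gbk_bytes, 'gbk') == sample

# Character index of the needle inside the decoded text.
position = find_in_bytes(to_bytes('这是样本', 'gbk'), sample, 'gbk')
```

Python 3 enforces rule 4 by itself: calling find on a str with a bytes argument raises TypeError instead of silently mixing the two types.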

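Java, JSP, HTML, and XML files must all follow the encoding rules below!

1 All .java|.jsp|.html|.xml source files are saved to disk in UTF-8 encoding.
For example, when editing a file in Eclipse, select the file, open the right-click menu and choose Properties, then set the text file encoding to Other and select UTF-8; alternatively, under
Eclipse > Preferences > General > Content Types you can set the default encoding for each file type, so that from then on all text files use a unified ......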

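Do you really know how an HTML page and its resources are loaded?

Do you really know how an HTML page and its resources are loaded? (Understanding when each part is downloaded and executed)
Original article: http://www.cnblogs.com/mindsbook/archive/2009/12/03/sequence_of_response.html
All content on this blog is licensed under Creative Commons Licenses. When quoting this content, please retain 朱涛 (Zhu Tao), the source, and non-commercial& ......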