
Using Java HTML Parser

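I recently looked into the Java HTML Parser library for a project. These notes record the main points of using it.
Main classes and interfaces:
1. The Parser class
The main parser class. It loads the HTML source and parses it.
2. The Node interface
Represents a syntactic unit produced during parsing. Take the following HTML fragment as an example: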
<span> ----Tag node
text ----Text Node
</span>
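The text and the tags are each separate nodes. The text node is a child node of the span tag.
3. NodeFilter
The node filter interface, used to pick out nodes of a particular kind from a Parser or a NodeList.
4. NodeList
A data structure that holds a collection of Node objects.
Points that need special attention:
Both Parser and NodeList have a method named extractAllNodesThatMatch(NodeFilter filter) that returns the nodes satisfying a given condition, but the two are implemented differently.
Parser implements it with an iterator on top of the parsing machinery. After each call you must call the reset method, otherwise the result of the next call is affected.
NodeList instead loops over its internal array and tests each node, so separate calls do not affect one another, and it is also more efficient than the Parser version. It is the recommended choice.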
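To make the difference concrete, here is a small stand-alone model in plain Java. It does not use the HTML Parser library and is not its implementation: OnePassSource, extractFromList and the string tokens are hypothetical names invented for this sketch. It only assumes that one side consumes its input through a shared cursor while the other loops over an in-memory list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

class OnePassVsList {
    /** Hypothetical stand-in for a parser that reads its input through one shared cursor. */
    static class OnePassSource {
        private final List<String> tokens;
        private Iterator<String> cursor;

        OnePassSource(List<String> tokens) {
            this.tokens = tokens;
            this.cursor = tokens.iterator();
        }

        /** Consumes whatever is left under the cursor and keeps the matching tokens. */
        List<String> extract(Predicate<String> filter) {
            List<String> out = new ArrayList<>();
            while (cursor.hasNext()) {
                String t = cursor.next();
                if (filter.test(t)) {
                    out.add(t);
                }
            }
            return out;
        }

        /** Rewinds the cursor to the start of the input. */
        void reset() {
            cursor = tokens.iterator();
        }
    }

    /** List-based filtering: every call loops over the whole list, with no shared cursor. */
    static List<String> extractFromList(List<String> nodes, Predicate<String> filter) {
        List<String> out = new ArrayList<>();
        for (String n : nodes) {
            if (filter.test(n)) {
                out.add(n);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("<span>", "text", "</span>");
        Predicate<String> isTag = t -> t.startsWith("<");

        OnePassSource source = new OnePassSource(tokens);
        System.out.println(source.extract(isTag).size()); // 2: first pass sees both tags
        System.out.println(source.extract(isTag).size()); // 0: the cursor is already at the end
        source.reset();
        System.out.println(source.extract(isTag).size()); // 2: works again after reset

        System.out.println(extractFromList(tokens, isTag).size()); // 2
        System.out.println(extractFromList(tokens, isTag).size()); // 2: calls are independent
    }
}
```

In this model the second extract call returns nothing until reset is called, which is the kind of interference between calls described above. The list-based version has no such state to rewind.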
Code example:
Implementing a getElementById function
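<code>
import org.htmlparser.Node;
import org.htmlparser.NodeFilter;
import org.htmlparser.Tag;
import org.htmlparser.util.NodeList;
import org.htmlparser.util.ParserException;

// Accepts only start tags whose id attribute equals the given id.
public class NodeIDFilter implements NodeFilter {
    private String id;

    public NodeIDFilter(String id) {
        this.id = id;
    }

    public boolean accept(Node node) {
        if (node instanceof Tag) {
            Tag tag = (Tag) node;
            // Skip end tags such as </span>.
            if (!tag.isEndTag()) {
                String s = tag.getAttribute("id");
                if (s != null)
                    return s.equals(this.id);
            }
        }
        return false;
    }
}

public class MHTMLParser
{
    // .... (other members, including the mNodeList field, are omitted here)
    protected Node getElementById(String id) throws ParserException {
        // Filtering a NodeList needs no reset, so this call stays commented out:
        //this.myparser.reset();
        if (this.mNodeList == null || this.mNodeList.size() == 0) return null;
        NodeIDFilter nodef = new NodeIDFilter(id);
        // The second argument (true) makes the search recurse into child nodes.
        NodeList nl = this.mNodeList.extractAllNodesThatMatch(nodef, true);
        if (nl.size() != 0) {
            return nl.elementAt(0);
        }
        return null;
    }
}
</code>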


© 2009 ej38.com All Rights Reserved.