HTML Parser

From BC$ MobileTV Wiki
  • Hpricot is a Ruby HTML parser (also usable from JRuby) inspired by jQuery and HTree:

http://code.whytheluckystiff.net/hpricot/

  • Cobra is a pure Java HTML parser:

http://lobobrowser.org/cobra.jsp

  • JTidy, a Java port of HTML Tidy, can be used as an HTML 4.0 parser:

http://jtidy.sourceforge.net/

  • Apache Axiom is an XML object model; it can read well-formed XHTML but is not a general-purpose HTML parser:

http://ws.apache.org/commons/axiom/

  • Jericho HTML Parser:

http://jerichohtml.sourceforge.net/doc/index.html

  • NekoHTML parses HTML into a standard DOM tree, which can then be queried with XPath:

http://nekohtml.sourceforge.net/

  • Many HTML parsing resources can be found through ProgrammersBible:

http://www.programmersbible.com/
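
As a rough illustration of the XPath approach mentioned in the NekoHTML entry, the sketch below runs an XPath query over a DOM tree using only classes that ship with the JDK. It is not NekoHTML code: the JDK's XML parser stands in for the HTML parser, so the sample input must be well-formed XHTML. With real-world HTML, a parser such as NekoHTML would supply the DOM instead. The class name XPathOverHtml, its methods, and the example URLs are hypothetical and chosen only for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class: queries a DOM tree with XPath.
public class XPathOverHtml {

    // Assumption: the input is well-formed XHTML, so the JDK's XML parser
    // can build the DOM. A tolerant HTML parser would replace this step
    // when the input is ordinary, possibly malformed HTML.
    public static Document parse(String xhtml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xhtml)));
    }

    // Collects the href attribute of every <a> element, in document order.
    public static List<String> linkTargets(Document doc) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                "//a/@href", doc, XPathConstants.NODESET);
        List<String> hrefs = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            hrefs.add(nodes.item(i).getNodeValue());
        }
        return hrefs;
    }

    public static void main(String[] args) throws Exception {
        String page =
                "<html><body>"
                + "<p><a href=\"http://example.org/one\">One</a></p>"
                + "<p><a href=\"http://example.org/two\">Two</a></p>"
                + "</body></html>";
        for (String href : linkTargets(parse(page))) {
            System.out.println(href);
        }
    }
}
```

Once a parser has produced an org.w3c.dom.Document, the query step stays the same regardless of which parser built the tree, which is the main appeal of pairing an HTML parser with XPath.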