Parsing HTML with Jodd Jerry

http://jodd.org/doc/jerry/index.html
 
http://www.oschina.net/code/snippet_12_7758
 
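// Imports assume a Jodd 3.x release where Jerry lives in jodd.jerry;
// adjust the package names to match your Jodd version.
import java.io.File;
import java.io.IOException;

import jodd.io.FileUtil;
import jodd.io.NetUtil;
import jodd.jerry.Jerry;
import jodd.jerry.JerryFunction;
import jodd.util.SystemUtil;

public class AllMusicNewReleases {

    public static void main(String[] args) throws IOException {

        // download the page into a temp file
        File file = new File(SystemUtil.getTempDir(), "allmusic.html");
        NetUtil.downloadFile("http://allmusic.com", file);

        // create Jerry, i.e. the document context
        Jerry doc = Jerry.jerry(FileUtil.readString(file));

        // select every new-release item and print its title and artist
        doc.$("div#new_releases div.list_item").each(new JerryFunction() {
            public boolean onNode(Jerry $this, int index) {
                System.out.println("-----");
                System.out.println($this.$("div.album_title").text());
                System.out.println($this.$("div.album_artist").text().trim());
                // returning true continues the iteration
                return true;
            }
        });
    }
}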

Reposted from zhitangrui2010.iteye.com/blog/2211960