A crawler I wrote recently uses a thread pool: I create a pool with a fixed number of threads and then keep submitting jobs to it.
The requirement now is that once the site's link tasks have all been crawled, crawling should start over again, so I want to restart the task after every task in the thread pool has finished executing.
A demo is as follows:
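import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public static void ex(Connection conn) throws InterruptedException {
    while (true) { // loop rather than recurse, so the call stack does not grow with every restart
        UrlTask urlTask = new UrlTask(7, conn); // my own task class
        ExecutorService pool = Executors.newFixedThreadPool(50); // create a fixed-size thread pool
        pool.execute(urlTask);
        pool.shutdown(); // accept no new tasks; tasks already submitted keep running
        // Check once a minute whether the tasks in the pool have finished.
        // awaitTermination returns false on timeout and true once the pool has terminated.
        while (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            // not finished yet, keep waiting
        }
        // All tasks in the pool are done; the next iteration runs the task again
    }
}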
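
To try the "wait until the pool has drained, then start again" pattern without a real crawler, here is a minimal self-contained sketch. It is only an illustration under assumptions: the class name CrawlLoop, the method runRounds, the pool size of 4, and the counter that stands in for UrlTask are all hypothetical and not part of my actual code. It also runs a bounded number of rounds instead of looping forever, so that it terminates.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class CrawlLoop {

    /**
     * Runs the given number of rounds one after another and returns how many
     * tasks completed in total. Each round uses a fresh pool, because an
     * ExecutorService rejects new tasks once shutdown() has been called.
     */
    public static int runRounds(int rounds, int tasksPerRound) throws InterruptedException {
        AtomicInteger completed = new AtomicInteger();
        for (int round = 0; round < rounds; round++) {
            ExecutorService pool = Executors.newFixedThreadPool(4); // assumed pool size
            for (int i = 0; i < tasksPerRound; i++) {
                // Hypothetical stand-in for the real crawl job (UrlTask in the demo)
                pool.execute(() -> completed.incrementAndGet());
            }
            pool.shutdown(); // no new tasks; submitted ones still run to completion
            // awaitTermination returns false on timeout, so keep waiting until it returns true
            while (!pool.awaitTermination(1, TimeUnit.SECONDS)) {
                // this round is still running
            }
            // Reaching this point means every task of the round has finished
        }
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runRounds(3, 5)); // prints 15
    }
}
```

A new round only starts after awaitTermination has returned true for the previous pool, which is the ordering I am after: rounds never overlap, and no second pool is created while the first is still crawling.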