1. Introduction
The Executor multi-threading framework is built into the JDK (package java.util.concurrent), so no third-party JAR is needed.
To make a crawler fetch pages faster, we need to run the fetches on multiple threads.
Compared with creating Thread objects by hand, the Executor framework is easier to use and to manage, and it supports thread pools, which reuse a fixed set of threads instead of starting a new thread for every task.
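To make the thread pool point concrete, here is a minimal sketch, assuming Java 8 or later. The class name PoolReuseDemo and the method distinctWorkers are made up for illustration. It runs many small tasks on a fixed pool and counts how many distinct worker threads actually executed them.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class PoolReuseDemo {

    // Hypothetical helper: runs taskCount tasks on a pool of poolSize threads
    // and returns how many distinct worker threads ran them.
    static int distinctWorkers(int poolSize, int taskCount) {
        Set<String> workerNames = ConcurrentHashMap.newKeySet(); // thread-safe set
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < taskCount; i++) {
            // Each task only records the name of the thread that runs it
            pool.execute(() -> workerNames.add(Thread.currentThread().getName()));
        }
        pool.shutdown(); // no more tasks will be submitted
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return workerNames.size();
    }

    public static void main(String[] args) {
        System.out.println("Worker threads used for 100 tasks: " + distinctWorkers(10, 100));
    }
}
```

However many tasks are submitted, the count never exceeds the pool size, because the same worker threads pick up one task after another.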
2. Commonly used methods
Create a thread pool with a fixed number of threads
ExecutorService java.util.concurrent.Executors.newFixedThreadPool(int nThreads)
Submit a task for execution on one of the pool's threads
void java.util.concurrent.Executor.execute(Runnable command)
Get the approximate number of threads that are currently running tasks
int java.util.concurrent.ThreadPoolExecutor.getActiveCount()
Shut down the pool: no new tasks are accepted, and tasks already submitted are allowed to finish
void java.util.concurrent.ExecutorService.shutdown()
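The behavior of shutdown() is easy to misread, so here is a small sketch of what it does and does not do, assuming Java 8 or later. The class ShutdownDemo and its run method are illustrative names only. One task is held open with a latch while the pool is shut down. A task submitted afterwards should be rejected, and the task that was already running should still complete.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class ShutdownDemo {

    // Hypothetical helper: returns a one-line summary of what was observed.
    static String run() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CountDownLatch started = new CountDownLatch(1); // task has begun
        CountDownLatch release = new CountDownLatch(1); // task may finish
        AtomicBoolean finished = new AtomicBoolean(false);
        try {
            pool.execute(() -> {
                started.countDown();
                try {
                    release.await();    // keep the worker busy
                    finished.set(true); // reached only if not interrupted
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            started.await();
            // One task is running now; the API documents this value as approximate
            int active = ((ThreadPoolExecutor) pool).getActiveCount();
            pool.shutdown(); // stop accepting tasks, do not interrupt running ones
            boolean rejected = false;
            try {
                pool.execute(() -> { }); // submitted too late
            } catch (RejectedExecutionException e) {
                rejected = true;
            }
            release.countDown(); // let the running task complete
            boolean terminated = pool.awaitTermination(10, TimeUnit.SECONDS);
            return "active=" + active + " rejected=" + rejected
                    + " finished=" + finished.get() + " terminated=" + terminated;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            pool.shutdownNow();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

To stop tasks that are still running, ExecutorService also offers shutdownNow(), which attempts to interrupt them.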
3. Crawl 100 web pages with a pool of 10 threads
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;

public class ExecutorTest {

    private static int pages = 1;          // Next page number; used only by the main thread
    private static boolean exeFlag = true; // Execution flag

    public static void main(String[] args) {
        // Create a thread pool with 10 worker threads
        ExecutorService executorService = Executors.newFixedThreadPool(10);
        while (exeFlag) {
            if (pages <= 100) {
                // Take the page number here so that worker threads never touch the shared counter
                final int page = pages++;
                executorService.execute(new Runnable() {
                    @Override
                    public void run() {
                        System.out.println("Crawled web page " + page + "...");
                    }
                });
            } else {
                // All pages are submitted; wait until no worker is running a task
                if (((ThreadPoolExecutor) executorService).getActiveCount() == 0) {
                    executorService.shutdown(); // Stop accepting tasks and let the pool exit
                    exeFlag = false;
                    System.out.println("The crawler task has been completed");
                }
            }
            try {
                Thread.sleep(100); // Pause for 0.1 seconds
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
}
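The program above polls getActiveCount() every 0.1 seconds to find out when the work is done. As an alternative sketch (not the approach used above), all tasks can be submitted up front and the main thread can then block in awaitTermination. This assumes Java 8 or later. The names CrawlAllPages, crawl and crawlAll are placeholders, and crawl only prints a line where a real crawler would fetch the page.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class CrawlAllPages {

    // Placeholder for the real work of downloading and parsing one page
    static void crawl(int page) {
        System.out.println("Crawled web page " + page + "...");
    }

    // Crawls pages 1..totalPages on a fixed pool and returns how many completed
    static int crawlAll(int threads, int totalPages) {
        AtomicInteger done = new AtomicInteger(); // safe to update from many threads
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int page = 1; page <= totalPages; page++) {
            final int current = page; // each task gets its own page number
            pool.execute(() -> {
                crawl(current);
                done.incrementAndGet();
            });
        }
        pool.shutdown(); // queued tasks still run; nothing new is accepted
        try {
            // Block until every task has finished or the timeout expires
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                pool.shutdownNow(); // give up on whatever is still running
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            pool.shutdownNow();
        }
        return done.get();
    }

    public static void main(String[] args) {
        int completed = crawlAll(10, 100);
        System.out.println("The crawler task has been completed: " + completed + " pages");
    }
}
```

Tasks beyond the first 10 wait in the pool's internal queue, so the main thread does not need to pace the submissions with Thread.sleep.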