Web crawlers (1) - basic use of HttpClient

GET request

    Basic use of a GET request

    import org.apache.http.HttpEntity;
    import org.apache.http.client.methods.CloseableHttpResponse;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;
    import org.apache.http.util.EntityUtils;

    // 1. "Open the browser": create the HttpClient object
    CloseableHttpClient httpClient = HttpClients.createDefault();

    // 2. "Enter the URL": create an HttpGet object for the GET request
    HttpGet get = new HttpGet("http://112.124.1.187/index.html?typeId=16");

    // 3. "Press Enter": send the request with the HttpClient object and receive the response
    CloseableHttpResponse response = httpClient.execute(get);

    // 4. Parse the response and read the data
    if (response.getStatusLine().getStatusCode() == 200) {
        HttpEntity entity = response.getEntity();
        String content = EntityUtils.toString(entity, "UTF-8");
        System.out.println(content);
    }
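
The charset passed to EntityUtils.toString in step 4 matters as soon as a page contains non-ASCII text. The following is a minimal sketch using only JDK classes; the class name CharsetSketch, the helper decode, and the sample string are illustrative assumptions and not part of the original code. It shows that the same response bytes only read back correctly when decoded with the charset they were written in:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetSketch {
    // Hypothetical helper: turns raw response bytes into text using the given charset.
    static String decode(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "café", written with an escape to stay ASCII-only
        byte[] body = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with the matching charset restores the text.
        System.out.println(decode(body, StandardCharsets.UTF_8).equals(original));      // true
        // Decoding with a different charset yields garbled characters.
        System.out.println(decode(body, StandardCharsets.ISO_8859_1).equals(original)); // false
    }
}
```

For a crawler this suggests checking which charset the target site actually uses instead of assuming "UTF-8" everywhere.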

 

    GET request with parameters (the parameters can be written directly into the address, but that hard-codes them)

    // Additional imports for this snippet
    import java.io.IOException;
    import java.net.URISyntaxException;
    import org.apache.http.client.utils.URIBuilder;

    // 1. "Open the browser": create the HttpClient object
    CloseableHttpClient httpClient = HttpClients.createDefault();
    try {
        // Target address with parameters: http://112.124.1.187/index.html?typeId=16
        // Create the URIBuilder
        URIBuilder uriBuilder = new URIBuilder("http://112.124.1.187/index.html");
        // Add a parameter; further parameters can be added by chaining
        // more setParameter(key, value) calls
        uriBuilder.setParameter("typeId", "16");
        // 2. "Enter the URL": create an HttpGet object for the GET request
        HttpGet get = new HttpGet(uriBuilder.build());
        // 3. "Press Enter": send the request with the HttpClient object and receive the response
        CloseableHttpResponse response = null;
        try {
            response = httpClient.execute(get);
            // 4. Parse the response and read the data
            if (response.getStatusLine().getStatusCode() == 200) {
                HttpEntity entity = response.getEntity();
                String content = EntityUtils.toString(entity, "UTF-8");
                System.out.println(content);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    } catch (URISyntaxException e) {
        e.printStackTrace();
    }
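
One reason to build the query string from key/value pairs instead of hard-coding it is that values may contain characters with a special meaning in a URL. The sketch below uses only JDK classes to illustrate that kind of escaping; it is not how URIBuilder is implemented, and the class QuerySketch, the helper withQuery, and the second parameter keyword are hypothetical names introduced for illustration:

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class QuerySketch {
    // Hypothetical helper: appends form-encoded key=value pairs to a base URL.
    // Assumes baseUrl has no query string yet and params is not empty.
    static URI withQuery(String baseUrl, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> p : params.entrySet()) {
            query.add(URLEncoder.encode(p.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(p.getValue(), StandardCharsets.UTF_8));
        }
        return URI.create(baseUrl + "?" + query);
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps insertion order
        params.put("typeId", "16");
        params.put("keyword", "a&b"); // hypothetical parameter containing a reserved character
        // prints http://112.124.1.187/index.html?typeId=16&keyword=a%26b
        System.out.println(withQuery("http://112.124.1.187/index.html", params));
    }
}
```

Without the escaping, the `&` inside the value would be read as the start of another parameter.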

 

POST request

    Basic use is the same as for GET; just replace HttpGet with HttpPost.

    POST request with parameters

    // Additional imports for this snippet
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.http.NameValuePair;
    import org.apache.http.client.entity.UrlEncodedFormEntity;
    import org.apache.http.client.methods.HttpPost;
    import org.apache.http.message.BasicNameValuePair;

    // 1. "Open the browser": create the HttpClient object
    CloseableHttpClient httpClient = HttpClients.createDefault();

    // The parameter typeId=16 is sent in the form body instead of the address
    // 2. "Enter the URL": create an HttpPost object for the POST request
    HttpPost post = new HttpPost("http://112.124.1.187/index.html");
    // 2.1 Declare a List that holds the form parameters
    List<NameValuePair> params = new ArrayList<>();
    // 2.2 Add a parameter
    params.add(new BasicNameValuePair("typeId", "16"));
    // 2.3 Create the form entity, which URL-encodes the parameters
    UrlEncodedFormEntity formEntity = new UrlEncodedFormEntity(params, "UTF-8");
    // 2.4 Attach the form entity to the POST request
    post.setEntity(formEntity);

    // 3. "Press Enter": send the request with the HttpClient object and receive the response
    CloseableHttpResponse response = null;
    try {
        response = httpClient.execute(post);
        // 4. Parse the response and read the data
        if (response.getStatusLine().getStatusCode() == 200) {
            HttpEntity entity = response.getEntity();
            String content = EntityUtils.toString(entity, "UTF-8");
            System.out.println(content);
        }
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        // Release the response first, then the client
        if (response != null) {
            response.close();
        }
        httpClient.close();
    }
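
Because CloseableHttpClient and CloseableHttpResponse both implement Closeable, the cleanup in the finally block could also be written with try-with-resources. The sketch below uses hypothetical stand-in resources (the class CloseOrderSketch and its nested Resource are assumptions for illustration, not HttpClient types) to show the order in which Java closes them:

```java
import java.util.ArrayList;
import java.util.List;

class CloseOrderSketch {
    // Hypothetical stand-in for a closeable resource such as a client or a response.
    static class Resource implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Resource(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void close() {
            log.add(name); // record the moment this resource is closed
        }
    }

    // Opens two resources and returns the order in which they were closed.
    static List<String> run() {
        List<String> closed = new ArrayList<>();
        try (Resource client = new Resource("client", closed);
             Resource response = new Resource("response", closed)) {
            // the response would be read here
        }
        return closed;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [response, client]
    }
}
```

Resources are closed in the reverse order of their declaration, which matches the manual version: the response is released before the client.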

 

As with any connection-based operation, the HttpClient above opens a connection, uses it once, and closes it; the next request has to open a new connection and close it again. This wastes resources, so we need the concept of a "pool".

HttpClient connection pool

    // Additional import for this snippet
    import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

    public static void main(String[] args) {
        // Create the connection pool manager
        PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
        // Set the maximum total number of connections
        cm.setMaxTotal(10);
        // Set the maximum number of connections per host (route)
        cm.setDefaultMaxPerRoute(2);

        // Send requests through the connection pool manager
        doGet(cm);
        doGet(cm);
    }

    private static void doGet(PoolingHttpClientConnectionManager cm) {
        // Build an HttpClient that takes its connections from the pool
        CloseableHttpClient httpClient = HttpClients.custom().setConnectionManager(cm).build();

        HttpGet httpGet = new HttpGet("http://112.124.1.187");
        CloseableHttpResponse response = null;
        try {
            response = httpClient.execute(httpGet);
            if (response.getStatusLine().getStatusCode() == 200) {
                String content = EntityUtils.toString(response.getEntity(), "UTF-8");
                System.out.println(content.length());
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (response != null) {
                try {
                    response.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            // Do not close the HttpClient here; the pool manages the connections
            // httpClient.close();
        }
    }
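
To make the "pool" idea concrete, here is a deliberately simplified sketch of an object pool in plain Java. It is not how PoolingHttpClientConnectionManager is implemented; the class SimplePool and its methods are hypothetical and only illustrate the two ideas used above: a released object is reused instead of being recreated, and there is an upper limit comparable to setMaxTotal.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

class SimplePool<T> {
    private final Deque<T> idle = new ArrayDeque<>(); // objects waiting to be reused
    private final Supplier<T> factory;                // creates a new object when none is idle
    private final int maxTotal;                       // upper bound on objects ever created
    private int created = 0;

    SimplePool(int maxTotal, Supplier<T> factory) {
        this.maxTotal = maxTotal;
        this.factory = factory;
    }

    // Hands out an idle object if there is one, otherwise creates a new one up to maxTotal.
    synchronized T borrow() {
        if (!idle.isEmpty()) {
            return idle.pop();
        }
        if (created >= maxTotal) {
            throw new IllegalStateException("pool exhausted");
        }
        created++;
        return factory.get();
    }

    // Returns an object to the pool so that a later borrow() can reuse it.
    synchronized void release(T item) {
        idle.push(item);
    }

    synchronized int createdCount() {
        return created;
    }

    public static void main(String[] args) {
        SimplePool<Object> pool = new SimplePool<>(2, Object::new);
        Object first = pool.borrow();
        pool.release(first);
        Object second = pool.borrow();
        System.out.println(first == second);     // true: the released object was reused
        System.out.println(pool.createdCount()); // 1: nothing new had to be created
    }
}
```

This is also why doGet releases the response but leaves the client open: releasing returns the connection for reuse, whereas closing everything would throw the reusable connections away.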

 

Request parameters

    These request parameters are not the parameters appended to the URL. They are rules, fixed in advance, for how the request itself is carried out. For example, a request may take a long time to complete because of the network or the target server, so we need to set the relevant timeouts ourselves.

    // Additional import for this snippet
    import org.apache.http.client.config.RequestConfig;

    HttpGet get = new HttpGet("http://112.124.1.187/index.html?typeId=16");
    // Configure the request
    RequestConfig config = RequestConfig.custom()
                            .setConnectTimeout(10000)            // maximum time to establish a connection, in milliseconds
                            .setConnectionRequestTimeout(500)    // maximum time to obtain a connection from the connection manager, in milliseconds
                            .setSocketTimeout(10 * 1000)         // maximum time to wait for data during transfer, in milliseconds
                            .build();
    // Apply the configuration to the request
    get.setConfig(config);
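
Of the three timeouts, the connection request timeout is the one tied to the pool: it limits how long a request waits for a free connection when all of them are in use. The sketch below models only that waiting behaviour with a JDK Semaphore; the class LeaseTimeoutSketch and its methods are hypothetical and do not reflect HttpClient's actual implementation.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class LeaseTimeoutSketch {
    private final Semaphore slots; // one permit per connection the pool may hand out

    LeaseTimeoutSketch(int maxConnections) {
        this.slots = new Semaphore(maxConnections);
    }

    // Waits up to timeoutMillis for a free slot; returns false if none became free in time.
    boolean lease(long timeoutMillis) {
        try {
            return slots.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    // Gives the slot back so that another caller can lease it.
    void release() {
        slots.release();
    }

    public static void main(String[] args) {
        LeaseTimeoutSketch pool = new LeaseTimeoutSketch(1);
        System.out.println(pool.lease(50)); // true: the only slot is free
        System.out.println(pool.lease(50)); // false: the slot is still in use, gives up after about 50 ms
        pool.release();
        System.out.println(pool.lease(50)); // true: the slot was returned
    }
}
```

A short value such as the 500 ms above makes a crawler fail fast when the pool is saturated instead of queuing requests indefinitely.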

 


Origin www.cnblogs.com/jr-xiaojian/p/12310470.html