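Fetching pages with httpClient, jsoup, okhttp, and junitHtml (unfinished)

1. Purpose of this article

To record what I use day to day and share it with other readers.

2. What httpClient, jsoup, okhttp, and junitHtml can do

They send and receive HTTP requests. In other words, they can fetch web pages and scrape information from them, which makes them the building blocks of a crawler written in Java.

3. Sending and receiving requests with httpClient, okhttp, and restTemplate

3.1 GET requests with httpClient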

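  // Imports used by the snippets in this section
  // (assumed: Apache HttpClient 4.x, JUnit 4, commons-lang3, commons-collections4)
  import java.io.IOException;
  import java.util.HashMap;
  import java.util.List;
  import java.util.Map;
  import org.apache.commons.collections4.MapUtils;
  import org.apache.commons.lang3.StringUtils;
  import org.apache.http.HttpEntity;
  import org.apache.http.client.CookieStore;
  import org.apache.http.client.config.RequestConfig;
  import org.apache.http.client.methods.CloseableHttpResponse;
  import org.apache.http.client.methods.HttpGet;
  import org.apache.http.client.protocol.HttpClientContext;
  import org.apache.http.cookie.Cookie;
  import org.apache.http.impl.client.BasicCookieStore;
  import org.apache.http.impl.client.CloseableHttpClient;
  import org.apache.http.impl.client.DefaultConnectionKeepAliveStrategy;
  import org.apache.http.impl.client.DefaultRedirectStrategy;
  import org.apache.http.impl.client.HttpClientBuilder;
  import org.apache.http.impl.client.HttpClients;
  import org.apache.http.util.EntityUtils;
  import org.junit.Test;

  /**
  * Example of a GET request with httpClient
  */
  @Test
    public void getHtmlGet(){
        String url = "https://search.17k.com/search.xhtml";
        Map<String,Object> map = new HashMap<>();
        map.put("c.st",0);
        map.put("c.q","近战狂兵");
        // Append the query parameters to the URL once, here
        url = setDoGetUrl(url,map);
        RequestConfig requestConfig = RequestConfig.custom().setConnectTimeout(120000).setSocketTimeout(60000)
                .setConnectionRequestTimeout(60000).build();

        // The cookie store keeps cookies across requests made with this client
        CookieStore cookieStore = new BasicCookieStore();
        // Build the httpClient instance
        CloseableHttpClient httpclient = HttpClientBuilder.create().setKeepAliveStrategy(new DefaultConnectionKeepAliveStrategy())
                .setRedirectStrategy(new DefaultRedirectStrategy()).setDefaultRequestConfig(requestConfig)
                .setDefaultCookieStore(cookieStore).build();
        HttpClientContext httpClientContext = HttpClientContext.create();
        // The URL already carries the parameters, so use the overload without a map
        String resGet = doGet(url,httpclient,httpClientContext);
        System.out.println(resGet);
    }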
	
    private String doGet(String url) {
        return doGet(url,null,null,null);
    }
    private String doGet(String url,CloseableHttpClient httpclient, HttpClientContext httpClientContext) {
        return doGet(url,null,httpclient,httpClientContext);
    }
	/*
	* Wrapper method for a GET request
	*/
    private String doGet(String url, Map<String,Object> map, CloseableHttpClient httpclient, HttpClientContext httpClientContext) {
        // Close the client at the end only if it was created here; a client passed in
        // by the caller stays open so that later requests can reuse its session
        boolean ownsClient = (httpclient == null);
        if(ownsClient) httpclient = HttpClients.createDefault();
        if(httpClientContext==null) httpClientContext = HttpClientContext.create();
        // Append the parameters unless the caller has already built the query string
        if(map!=null && !url.contains("?")) url = setDoGetUrl(url,map);
        CookieStore cookieStore = httpClientContext.getCookieStore();
        if(cookieStore!=null){
            System.out.println("=============== cookies already in the context ===============");
            List<Cookie> listCookie = cookieStore.getCookies();
            System.out.println(listCookie);
        }
        String web="";
        HttpGet httpget = new HttpGet(url);
        CloseableHttpResponse response = null;
        // Request headers. Content-Encoding, Server, Transfer-Encoding and Vary are
        // response headers, so they are not set on the request
        httpget.setHeader("Connection", "close");
        httpget.setHeader("Accept", "text/html");
        try {
            // This step matters: execute(httpget) alone is a plain GET request, while passing the
            // HttpClientContext lets several requests share one session, provided the httpClient is not closed in between
            response = httpclient.execute(httpget,httpClientContext);
            HttpEntity entity = response.getEntity();
            if (entity != null) {
                // Read the response Content-Type so that different types can be handled differently
                String contentType = entity.getContentType()==null ? null : entity.getContentType().getValue();
                System.out.println("Content-Type: " + contentType);
                web = EntityUtils.toString(entity,"utf-8");
                // Print the cookies held in the context after the request
                if(httpClientContext.getCookieStore()!=null){
                    httpClientContext.getCookieStore().getCookies().forEach(System.out::println);
                }
            }
        }catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Close the connection and release resources (the response may be null if execute failed)
            try {
                if(response!=null) response.close();
                if(ownsClient) httpclient.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        return web;
    }
    
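	/*
	*	Hand-written helper that appends the map entries to the URL as GET query parameters
	*/
    private static String setDoGetUrl(String url, Map<String,Object> map) {
        if(StringUtils.isBlank(url))return url;
        if(MapUtils.isEmpty(map))return url;
        StringBuilder sb = new StringBuilder(url);
        // Start a query string, or continue the one the URL already has
        if(!url.contains("?")) sb.append("?");
        for (Map.Entry<String,Object> entry:map.entrySet()) {
            // Skip entries without a usable key or value
            if(StringUtils.isBlank(entry.getKey()) || entry.getValue()==null)continue;
            char last = sb.charAt(sb.length()-1);
            if(last!='?' && last!='&') sb.append("&");
            try {
                // Percent-encode key and value so non-ASCII text and reserved characters are safe in the URL
                sb.append(java.net.URLEncoder.encode(entry.getKey(),"UTF-8")).append("=")
                  .append(java.net.URLEncoder.encode(entry.getValue().toString(),"UTF-8"));
            } catch (java.io.UnsupportedEncodingException e) {
                throw new IllegalStateException(e); // UTF-8 is always available
            }
        }
        return sb.toString();
    }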
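
To see what a query-parameter helper of this kind should produce, here is a minimal standalone sketch that uses only the JDK. The class name `QueryStringSketch` and the method `withParams` are made up for this illustration and are not part of the code above. It assumes percent-encoding with `URLEncoder` is wanted, which also turns spaces into `+`.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical standalone helper: appends encoded query parameters to a URL.
class QueryStringSketch {

    static String withParams(String url, Map<String, Object> params) {
        if (url == null || params == null || params.isEmpty()) return url;
        StringBuilder sb = new StringBuilder(url);
        // Pick the first separator depending on whether a query string already exists
        String sep = !url.contains("?") ? "?"
                : (url.endsWith("?") || url.endsWith("&")) ? "" : "&";
        for (Map.Entry<String, Object> e : params.entrySet()) {
            if (e.getKey() == null || e.getValue() == null) continue; // skip unusable entries
            sb.append(sep)
              .append(encode(e.getKey()))
              .append('=')
              .append(encode(e.getValue().toString()));
            sep = "&";
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output is predictable
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("c.st", 0);
        // The search term from the test above, written with Unicode escapes
        params.put("c.q", "\u8fd1\u6218\u72c2\u5175");
        System.out.println(withParams("https://search.17k.com/search.xhtml", params));
        // prints https://search.17k.com/search.xhtml?c.st=0&c.q=%E8%BF%91%E6%88%98%E7%8B%82%E5%85%B5
    }
}
```

A `HashMap` would also work, but its iteration order is unspecified, so the parameters may come out in a different order from run to run.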

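3.2 GET requests with restTemplate

restTemplate does not reimplement the underlying HTTP layer. Instead it is configurable and can delegate to OkHttp, HttpClient, and others. Put simply, it is a wrapper around these clients.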
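
The delegation described above can be pictured with a small JDK-only sketch. Every name in it (`HttpBackend`, `TemplateSketch`, `getForObject` as written here) is hypothetical and is not Spring's API. In Spring itself the pluggable piece is, as far as I know, the `ClientHttpRequestFactory` handed to `RestTemplate`. The two backends are stand-ins that return canned strings instead of calling OkHttp or HttpClient.

```java
import java.util.function.Function;

// Hypothetical transport abstraction: whatever library actually sends the request.
interface HttpBackend {
    String get(String url); // perform a GET and return the response body
}

// Hypothetical "template": adds convenience on top and delegates the transport.
class TemplateSketch {
    private final HttpBackend backend;

    TemplateSketch(HttpBackend backend) {
        this.backend = backend;
    }

    // Fetch the body through the configured backend, then convert it for the caller
    <T> T getForObject(String url, Function<String, T> converter) {
        return converter.apply(backend.get(url));
    }

    public static void main(String[] args) {
        // Stand-in backends; real ones would wrap OkHttp or HttpClient calls
        HttpBackend okHttpLike = url -> "okhttp handled " + url;
        HttpBackend httpClientLike = url -> "httpclient handled " + url;

        // The calling code is identical; only the backend passed in differs
        String a = new TemplateSketch(okHttpLike).getForObject("/ping", String::trim);
        String b = new TemplateSketch(httpClientLike).getForObject("/ping", String::trim);
        System.out.println(a);
        System.out.println(b);
    }
}
```

The point of the design is that callers depend only on the template, so the HTTP library underneath can be swapped without touching them.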

The GET request itself is quite simple.

Reposted from blog.csdn.net/weixin_43328357/article/details/96442051