Java: building a forum auto-reply ("water-posting") tool with HttpClient + Jsoup, so you never miss an early reply on a hot thread

A long time ago I wanted to write a small program that replies to forum posts automatically, but I never got it working. Recently I had some free time and sat down to study it properly. I only have a smattering of knowledge about the HTTP protocol, so I had to ask the great god Google. After I explained my idea, Google said: this kid is good, this is a contribution to the cause of early replies on hot threads! And it handed me the following tools:

1. HttpClient 4.3.1 (GA)

The main features provided by HttpClient are listed below; more details can be found on the HttpClient home page.

  • Implements all HTTP methods (GET, POST, PUT, HEAD, etc.)
  • Supports automatic redirects
  • Supports the HTTPS protocol
  • Supports proxy servers, etc.

2. Jsoup

The main features of jsoup are as follows:

  • Parse HTML from a URL, file or string
  • Find and extract data using DOM traversal or CSS selectors
  • Manipulate HTML elements, attributes, and text
  • Selector syntax is almost the same as jQuery's

Without further ado, on to the topic. The HttpClient source package contains an examples folder with some basic usage samples, which is enough to get started. Find ClientFormLogin.java; its comments are detailed and clear. Roughly, it simulates an HTTP login request and stores the resulting cookies.

Test Site: http://bbs.dakele.com/

Because this site handles login in its own special way, it may differ from a standard DZ (Discuz!) forum, so adjust the code yourself as needed.

I used Chrome's built-in Inspect Element tool to analyze the site, which took a lot of time.

Login Address: http://passport.dakele.com/login.do?product=bbs

If you enter an incorrect username and password, you will find that the form is actually submitted to http://passport.dakele.com/logon.do. Note the difference between login and logon [i / n]; I did not notice it at first, and it cost me a lot of trouble.

On failure it returns an error message (the text means "incorrect account or password"):

{"err_msg":"帐号或密码错误"}

Entering the correct information returns a result that contains a redirect link.

Requesting that redirect link directly completes the login normally.

Getting the redirect link:

private LoginResult getRedirectUrl(){
        LoginResult loginResult = null;
        CloseableHttpClient httpClient = HttpClients.createDefault();
        HttpPost httpost = new HttpPost(LOGINURL);
        httpost.setHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
        httpost.setHeader("Accept-Language", "zh-CN,zh;q=0.8");
        httpost.setHeader("Cache-Control", "max-age=0");
        httpost.setHeader("Connection", "keep-alive");
        httpost.setHeader("Host", "passport.dakele.com");
        httpost.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36");
        List <NameValuePair> nvps = new ArrayList <NameValuePair>();
        nvps.add(new BasicNameValuePair("product", "bbs"));
        nvps.add(new BasicNameValuePair("surl", "http://bbs.dakele.com/"));
        nvps.add(new BasicNameValuePair("username", "yourname"));// user name
        nvps.add(new BasicNameValuePair("password", "yourpass"));// password
        nvps.add(new BasicNameValuePair("remember", "0"));

        httpost.setEntity(new UrlEncodedFormEntity(nvps, Consts.UTF_8));
        CloseableHttpResponse response2 = null;
        try {
            response2 = httpClient.execute(httpost);
            if(response2.getStatusLine().getStatusCode()==200){
                HttpEntity entity = response2.getEntity();
                String entityString = EntityUtils.toString(entity);
                JSONArray jsonArray = JSONArray.fromObject("["+entityString+"]");
                JsonConfig jsonConfig=new JsonConfig();
                jsonConfig.setArrayMode(JsonConfig.MODE_OBJECT_ARRAY);
                jsonConfig.setRootClass(LoginResult.class);
                LoginResult[] results= (LoginResult[]) JSONSerializer.toJava( jsonArray, jsonConfig );
                if(results.length==1){
                    loginResult = results[0];
                }
            }
        } catch (ClientProtocolException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
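        }finally{
            try {
                // response2 stays null if execute() threw, so guard against a NullPointerException
                if(response2 != null){
                    response2.close();
                }
                httpClient.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }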
        return loginResult;
    }

Login Code:

public boolean login(){
        boolean flag = false;
        LoginResult loginResult = getRedirectUrl();
        // getRedirectUrl() returns null when the request fails, so check before dereferencing
        if(loginResult != null && "true".equals(loginResult.getResult())){
            cookieStore = new BasicCookieStore();
            globalClient = HttpClients.custom().setDefaultCookieStore(cookieStore).build();
            HttpGet httpGet = new HttpGet(loginResult.getRedirect());
            httpGet.setHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
            httpGet.setHeader("Accept-Language", "zh-CN,zh;q=0.8");
            httpGet.setHeader("Connection", "keep-alive");
            httpGet.setHeader("Host", HOST);
            httpGet.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36");
            try {
                // Requesting the redirect link lets the server set the login cookies in cookieStore
                CloseableHttpResponse response = globalClient.execute(httpGet);
                try {
                    EntityUtils.consume(response.getEntity());
                } finally {
                    response.close();
                }
            } catch (ClientProtocolException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
            List<Cookie> cookies2 = cookieStore.getCookies();
            if (cookies2.isEmpty()) {
                log.error("cookie is empty");
            } else {
                // Cookies were stored, so treat the login as successful
                flag = true;
            }
        }

        return flag;
    }
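
The login works because the session lives in the cookies the server sets on that response, and every later request made through the same client sends them back. As a rough offline illustration of that idea, the sketch below uses the JDK's own `java.net.CookieManager` instead of HttpClient's `BasicCookieStore`; the cookie `sid=abc123` is made up for the example and is not a cookie the site is known to use.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieManager;
import java.net.HttpCookie;
import java.net.URI;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class CookieSessionSketch {

    // Feed one Set-Cookie header into the manager, as if it had arrived
    // in a response for the given URI, and return what is now stored.
    static List<HttpCookie> storeCookie(CookieManager manager, String uri, String setCookie) {
        Map<String, List<String>> headers =
                Collections.singletonMap("Set-Cookie", Collections.singletonList(setCookie));
        try {
            manager.put(URI.create(uri), headers);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return manager.getCookieStore().getCookies();
    }

    public static void main(String[] args) {
        CookieManager manager = new CookieManager();
        // Hypothetical session cookie, used only for illustration.
        List<HttpCookie> cookies =
                storeCookie(manager, "http://bbs.dakele.com/", "sid=abc123; Path=/");
        for (HttpCookie c : cookies) {
            System.out.println(c.getName() + "=" + c.getValue());
        }
    }
}
```

An empty cookie list after the redirect request is therefore a reasonable sign that the login did not take effect.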

At this point the login has succeeded. But what can you do with just a login? Leave an early reply on threads that are about to get hot, of course.

First we need the addresses of the posts to reply to. The list pages follow a fairly regular pattern, so instead of writing automatic discovery I just wrote a loop (@1):

for(int i=1;i<200;i++){
    String basurl="http://bbs.dakele.com/forum-43-"+i+".html";
    log.info(basurl);
    List<String> urls = dakele.getThreadURLs(basurl);
    for(String url:urls){
        //log.info(url);
        ReplayContent content = dakele.preReplay(url);
        if(content!=null){
            log.info(content.getUrl());
            log.info(content.getMessage());
            //dakele.replay( content);
            //Thread.sleep(15300);
        }
    }
}
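
The list-page addresses in that loop differ only in the page number. A minimal sketch of pulling the pattern into a helper might look like this; the class and method names are hypothetical and not part of the original project.

```java
import java.util.ArrayList;
import java.util.List;

final class ListPageUrls {

    // Build "http://bbs.dakele.com/forum-<forumId>-<page>.html"
    // for every page from firstPage to lastPage (inclusive).
    static List<String> build(int forumId, int firstPage, int lastPage) {
        List<String> urls = new ArrayList<String>();
        for (int page = firstPage; page <= lastPage; page++) {
            urls.add("http://bbs.dakele.com/forum-" + forumId + "-" + page + ".html");
        }
        return urls;
    }

    public static void main(String[] args) {
        for (String url : build(43, 1, 3)) {
            System.out.println(url);
        }
    }
}
```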

Get the post addresses from the list page:

String html = EntityUtils.toString(entity);
Document document = Jsoup.parse(html,HOST);
Elements elements=document.select("tbody[id^=normalthread_] > tr > td.new > a.xst");
for(int i=0;i<elements.size();i++){
    Element e = elements.get(i);
    urList.add(e.attr("abs:href"));
}
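
The `abs:` prefix in `attr("abs:href")` asks jsoup to resolve the link against the document's base URI (the second argument of `Jsoup.parse`, unless the page overrides it with a `<base>` tag), so that base needs to be an absolute URL for relative links to resolve. The JDK-only sketch below shows the same kind of resolution with `java.net.URI`; the thread address `thread-1001-1-1.html` is a made-up example.

```java
import java.net.URI;

final class AbsoluteHref {

    // Resolve an href against a page's base URI, the way a browser would.
    static String resolve(String baseUri, String href) {
        return URI.create(baseUri).resolve(href).toString();
    }

    public static void main(String[] args) {
        String base = "http://bbs.dakele.com/forum-43-1.html";
        // A relative link is joined onto the base URI.
        System.out.println(resolve(base, "thread-1001-1-1.html"));
        // An already absolute link is returned unchanged.
        System.out.println(resolve(base, "http://bbs.dakele.com/thread-1002-1-1.html"));
    }
}
```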

Get the reply form's submit address and the form fields that have to be sent along with the reply:

public ReplayContent preReplay(String url){
        ReplayContent content = null;
        HttpGet get  = new HttpGet(url);
        get.setHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
        get.setHeader("Accept-Language", "zh-CN,zh;q=0.8");
        get.setHeader("Connection", "keep-alive");
        get.setHeader("Host", HOST);
        get.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36");
        try {
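            CloseableHttpResponse response = globalClient.execute(get);
            HttpEntity entity = response.getEntity();
            String html = EntityUtils.toString(entity);
            // The body has been read completely, so the response can be closed
            response.close();
            Document document = Jsoup.parse(html, HOST);
            Element postForm = document.getElementById("fastpostform");
            // "您现在无权发帖" means "you are not allowed to post right now";
            // postForm is null when the page has no quick-reply form
            if(postForm != null && !postForm.toString().contains("您现在无权发帖")){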
                content = new ReplayContent();
                content.setUrl(url);
                
                log.debug(postForm.attr("abs:action"));
                content.setAction(postForm.attr("abs:action"));
                
                
                ////////
                Elements teElements = document.select("td[id^=postmessage_]");
                String message = "";
                for(int i=0;i<teElements.size();i++){
                    String temp = teElements.get(i).html().replaceAll( "(?is)<.*?>", "");
                    // "发表于" means "posted on"; when it appears, keep only the last whitespace-separated token
                    if(temp.contains("发表于")){
                        String[] me = temp.split("\\s+");
                        temp = me[me.length-1];
                    }
                    message+=temp.replaceAll("\\s+", "");
                }
                log.debug(message.replaceAll("\\s+", ""));
                ///////////////
                /* Take only the last comment
                Element messageElement= document.select("td[id^=postmessage_]").last();
//                String message = messageElement.html().replaceAll("\\&[a-zA-Z]{1,10};", "").replaceAll("<[^>]*>", "").replaceAll("[(/>)<]", "");
                String message = messageElement.html().replaceAll( "(?is)<.*?>", "");
                */
                if(message.contains("发表于")){
                    String[] me = message.split("\\s+");
                    message = me[me.length-1];
                }
                // Strip "&nbsp;" and the words "上传" (upload), "附件" (attachment), "下载" (download)
                content.setMessage(message.replaceAll("&nbsp;", "").replaceAll("上传", "").replaceAll("附件", "").replaceAll("下载", ""));
                Elements inputs = postForm.getElementsByTag("input");
                for(Element input:inputs){
                    log.debug(input.attr("name")+":"+input.attr("value"));
                    if(input.attr("name").equals("posttime")){
                        content.setPosttime(input.attr("value"));
                    }else if(input.attr("name").equals("formhash")){
                        content.setFormhash(input.attr("value"));
                    }else if(input.attr("name").equals("usesig")){
                        content.setUsesig(input.attr("value"));
                    }else if(input.attr("name").equals("subject")){
                        content.setSubject(input.attr("value"));
                    }
                }
            }else{
                log.warn("No permission to post: "+url);
            }
        } catch (ClientProtocolException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return content;
    }
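
The text extraction in `preReplay` boils down to three regular-expression passes: strip anything that looks like a tag, drop the `&nbsp;` entity, and remove all whitespace. The sketch below isolates just those passes so they can be tried on their own; the class name and the sample input are hypothetical. For plain text without tags, jsoup's `Element.text()` may be a simpler alternative to the tag-stripping regex.

```java
final class ReplyTextCleaner {

    static String clean(String html) {
        return html
                .replaceAll("(?is)<.*?>", "")  // drop anything that looks like a tag
                .replaceAll("&nbsp;", "")      // drop the non-breaking-space entity
                .replaceAll("\\s+", "");       // drop all whitespace, as the original code does
    }

    public static void main(String[] args) {
        // Made-up post fragment, used only for illustration.
        System.out.println(clean("<b>Nice</b> post,<br/>\n thanks&nbsp;!"));
    }
}
```

Removing every whitespace character suits Chinese text; for languages that separate words with spaces it would glue the words together.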

With the address and the content in hand, we can start posting replies:

public void replay(ReplayContent content){
        
        HttpPost httpost = new HttpPost(content.getAction());
        httpost.setHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
        httpost.setHeader("Accept-Language", "zh-CN,zh;q=0.8");
        httpost.setHeader("Cache-Control", "max-age=0");
        httpost.setHeader("Connection", "keep-alive");
        httpost.setHeader("Host", HOST);
        httpost.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36");
        List <NameValuePair> nvps = new ArrayList <NameValuePair>();
        nvps.add(new BasicNameValuePair("posttime", content.getPosttime()));
        nvps.add(new BasicNameValuePair("formhash", content.getFormhash()));
        nvps.add(new BasicNameValuePair("usesig", content.getUsesig()));
        nvps.add(new BasicNameValuePair("subject", content.getSubject()));
        nvps.add(new BasicNameValuePair("message", content.getMessage()));

        httpost.setEntity(new UrlEncodedFormEntity(nvps, Consts.UTF_8));
        // The response must be consumed and closed, otherwise the connection is not released for reuse; I did not notice this at first and got stuck here
        CloseableHttpResponse response2 = null;

        try {
            response2 = globalClient.execute(httpost);
            //log.info(content.getAction());
            //log.info(content.getMessage());
            HttpEntity entity = response2.getEntity();
            EntityUtils.consume(entity);
//            BufferedWriter bw= new BufferedWriter(new FileWriter("d:/tt1.html"));
//            bw.write(EntityUtils.toString(response2.getEntity()));
//            bw.flush();
//            bw.close();
            //System.out.println(EntityUtils.toString(response2.getEntity()));
        } catch (ClientProtocolException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if(response2 != null){
                try {
                    response2.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
        
    }
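
`UrlEncodedFormEntity` sends the name/value pairs as an `application/x-www-form-urlencoded` body. To get a feel for roughly what goes over the wire, here is a JDK-only sketch built on `java.net.URLEncoder`; it is an approximation for illustration, not HttpClient's actual implementation, and the field values are made up.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class FormBodySketch {

    // Join the fields as name=value pairs separated by '&', percent-encoding each part.
    static String encode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(enc(field.getKey())).append('=').append(enc(field.getValue()));
        }
        return body.toString();
    }

    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        fields.put("formhash", "abc123");              // hypothetical value
        fields.put("message", "nice post & thanks");   // hypothetical value
        System.out.println(encode(fields));
    }
}
```

Passing `Consts.UTF_8` in the original code matters for the same reason the charset matters here: Chinese text in `message` has to be encoded with the charset the forum expects.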

Of course, this only works for forums without a verification code (CAPTCHA); forums that require one can only be handled some other way.

Flooding is harmful. This is the result after a round of bombardment: [image: QQ screenshot 20140109224028]

As for the reply content: at first I simply took the last comment in the current thread and posted it back as the reply, and I got warned! After that I used the IK word segmenter to extract keywords; the code is attached at the end of the post.

Shortcomings: it does not use multithreading and has not been fully tested.

I will tidy up the code and share it as soon as possible.

Future plans: add daily check-in and task features, and replace the @1 loop with automatic discovery.

This is my first post; if anything falls short, criticism is welcome.

------------------------------------------

Download: http://pan.baidu.com/s/1jGjwA5g

I tidied up the code the next morning and am now sharing it with everyone. It is packaged as a MyEclipse project; unzip it and import it directly.

Change the user name and password in IKFenci.java and it can be run directly.

Reposted from: https://my.oschina.net/chbing/blog/198870

Origin blog.csdn.net/weixin_33816821/article/details/91755672