[Java Crawler in Practice] Automatically Submitting Code to HDU OJ

Project Setup

Create a Spring Boot project and add the following dependencies to pom.xml.

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.3.0.RELEASE</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>cn.jxj4869</groupId>
    <artifactId>ojsubmitter</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>ojsubmitter</name>
    <description>Demo project for Spring Boot</description>

    <properties>
        <java.version>1.8</java.version>
    </properties>

    <dependencies>
        
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
            <exclusions>
                <exclusion>
                    <groupId>org.junit.vintage</groupId>
                    <artifactId>junit-vintage-engine</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <!-- Lombok: remember to install the Lombok plugin in your IDE as well -->
        
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        
        <!-- HTML parsing -->
        <dependency>
            <groupId>org.jsoup</groupId>
            <artifactId>jsoup</artifactId>
            <version>1.10.2</version>
        </dependency>

        <!-- httpclient-->
        <dependency>
            <groupId>org.apache.httpcomponents</groupId>
            <artifactId>httpclient</artifactId>
            <version>4.5.2</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>

</project>

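Requirements Analysis and Implementation

Submitting code automatically takes the following steps:

  1. Prepare an HDU OJ account.
  2. Log in to the account.
  3. Choose a problem and submit the code.
  4. Fetch the submission record, that is, the verdict returned after judging.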

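Account Login

Inspecting the login request with the browser's developer tools shows that the form is sent to the following URL with a POST request.

http://acm.hdu.edu.cn/userloginex.php?action=login

The request carries three parameters: the username, the password, and a login field whose value is always Sign In.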

username: username
userpass: password
login: Sign In
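
To make the request body concrete, here is a minimal sketch that encodes the three fields above as application/x-www-form-urlencoded text using only the JDK. The class name LoginForm and the sample credentials are placeholders for illustration; the article itself lets HttpClient's UrlEncodedFormEntity do this work.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the article's project.
class LoginForm {

    // Collects the three login fields; LinkedHashMap keeps them in insertion order.
    static Map<String, String> loginFields(String userName, String password) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("username", userName);
        fields.put("userpass", password);
        fields.put("login", "Sign In"); // fixed value expected by the login form
        return fields;
    }

    // Joins the fields as key=value pairs separated by '&', URL-encoding keys and values.
    static String encode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        try {
            for (Map.Entry<String, String> field : fields.entrySet()) {
                if (body.length() > 0) {
                    body.append('&');
                }
                body.append(URLEncoder.encode(field.getKey(), "UTF-8"))
                    .append('=')
                    .append(URLEncoder.encode(field.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Placeholder credentials, for illustration only.
        System.out.println(encode(loginFields("alice", "p@ss word")));
    }
}
```

The space in Sign In becomes a plus sign in the encoded body, which is why the value has to be sent through a form encoder rather than concatenated by hand.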

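Create HDUSubmitter.java.

The class has two fields, userName and password, and a constructor that takes both.

It also holds a CloseableHttpClient that sends every request. Once this client has sent the login request, the same client can be used to submit code.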

@Data
@EqualsAndHashCode(callSuper = false)
@Accessors(chain = true)
public class HDUSubmitter {

    private String userName;
    private String password;

    private CloseableHttpClient httpClient = HttpClients.createDefault();
    private RequestConfig config = RequestConfig.custom().setConnectTimeout(1000)
            .setConnectionRequestTimeout(500)
            .setSocketTimeout(10 * 1000)
            .build();
    
    public HDUSubmitter(String userName, String password) throws IOException {
        this.userName = userName;
        this.password = password;
        login();
    }



    private void login() throws IOException {
        HttpPost httpPost = new HttpPost("http://acm.hdu.edu.cn/userloginex.php?action=login&cid=0&notice=0");
        ArrayList<NameValuePair> params = new ArrayList<>();
        params.add(new BasicNameValuePair("login", "Sign In"));
        params.add(new BasicNameValuePair("username", userName));
        params.add(new BasicNameValuePair("userpass", password));
        // Build the form entity: the first argument is the form data, the second is the charset
        UrlEncodedFormEntity formEntity = new UrlEncodedFormEntity(params, "utf8");
        httpPost.setEntity(formEntity);
        // Close the response so the connection returns to the pool; the client keeps the session cookie
        try (CloseableHttpResponse response = httpClient.execute(httpPost)) {
            EntityUtils.consume(response.getEntity());
        }
    }


}

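Choosing a Problem and Submitting Code

Take the first problem, 1000, as an example.

The browser's developer tools show that code is submitted to the following URL with a POST request.

http://acm.hdu.edu.cn/submit.php?action=submit

The request carries the following parameters.

  • check: defaults to 0.
  • problemid: the problem number.
  • usercode: the source code to submit.
  • language: the numeric value of the chosen language, as listed below.
    • G++:0
    • GCC:1
    • C++:2
    • C:3
    • Pascal:4
    • Java:5
    • C#:6

A sample set of form data:

check: 0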
problemid: 1000
language: 0
usercode: #include<bits/stdc++.h>
using namespace std;
int main()
{
   long long a,b;
while(cin>>a>>b)
{
cout<<a+b<<endl;
}
}
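
The language values listed above are easy to mix up when passed around as bare integers. One possible way to name them is an enum like the sketch below; HduLanguage and its constant names are hypothetical and do not appear in the article's project, while the labels and numbers are taken from the list above.

```java
// Hypothetical enum; the labels and codes mirror the list in the article.
enum HduLanguage {
    GPP("G++", 0),
    GCC("GCC", 1),
    CPP("C++", 2),
    C("C", 3),
    PASCAL("Pascal", 4),
    JAVA("Java", 5),
    CSHARP("C#", 6);

    final String label; // name as shown in the list
    final int code;     // value sent in the "language" form field

    HduLanguage(String label, int code) {
        this.label = label;
        this.code = code;
    }

    // Looks up a language by its label and fails loudly for unknown names.
    static HduLanguage fromLabel(String label) {
        for (HduLanguage language : values()) {
            if (language.label.equals(label)) {
                return language;
            }
        }
        throw new IllegalArgumentException("Unknown language: " + label);
    }

    public static void main(String[] args) {
        System.out.println(fromLabel("Java").code);
    }
}
```

With such an enum, a call like HduLanguage.fromLabel("G++").code would yield the value to store in the language field of a submission.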

Create Submission.java to hold the data of one submission.

package cn.jxj4869.ojsubmitter.bean;

import lombok.Data;
import lombok.EqualsAndHashCode;
import lombok.experimental.Accessors;

@Data
@EqualsAndHashCode(callSuper = false)
@Accessors(chain = true)
public class Submission {
    private int language;
    private String sourceCode;
    private String originProblemId;

}

Add a submit method to HDUSubmitter.java.

Also add a Submission field, and remember to set it before submitting.

package cn.jxj4869.ojsubmitter.submitter;

import cn.jxj4869.ojsubmitter.bean.Result;
import cn.jxj4869.ojsubmitter.bean.Submission;
import lombok.Data;
import lombok.EqualsAndHashCode;
import lombok.experimental.Accessors;
import org.apache.http.NameValuePair;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.util.EntityUtils;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

import java.io.IOException;
import java.util.ArrayList;


@Data
@EqualsAndHashCode(callSuper = false)
@Accessors(chain = true)
public class HDUSubmitter {

    private Submission submission;
    private String userName;
    private String password;
    
    private CloseableHttpClient httpClient = HttpClients.createDefault();
    private RequestConfig config = RequestConfig.custom().setConnectTimeout(1000)
            .setConnectionRequestTimeout(500)
            .setSocketTimeout(10 * 1000)
            .build();

    public HDUSubmitter(String userName, String password) throws IOException {
        this.userName = userName;
        this.password = password;
        login();
    }



    private void login() throws IOException {
        HttpPost httpPost = new HttpPost("http://acm.hdu.edu.cn/userloginex.php?action=login&cid=0&notice=0");
        ArrayList<NameValuePair> params = new ArrayList<>();
        params.add(new BasicNameValuePair("login", "Sign In"));
        params.add(new BasicNameValuePair("username", userName));
        params.add(new BasicNameValuePair("userpass", password));
        // Build the form entity: the first argument is the form data, the second is the charset
        UrlEncodedFormEntity formEntity = new UrlEncodedFormEntity(params, "utf8");
        httpPost.setEntity(formEntity);
        // Close the response so the connection returns to the pool; the client keeps the session cookie
        try (CloseableHttpResponse response = httpClient.execute(httpPost)) {
            EntityUtils.consume(response.getEntity());
        }
    }

    private void submit() throws IOException {
        HttpPost httpPost = new HttpPost("http://acm.hdu.edu.cn/submit.php?action=submit");
        ArrayList<NameValuePair> params = new ArrayList<>();
        params.add(new BasicNameValuePair("check", "0"));
        // Take the language and problem id from the Submission instead of hard-coding them
        params.add(new BasicNameValuePair("language", String.valueOf(submission.getLanguage())));
        params.add(new BasicNameValuePair("problemid", submission.getOriginProblemId()));
        params.add(new BasicNameValuePair("usercode", submission.getSourceCode()));
        UrlEncodedFormEntity formEntity = new UrlEncodedFormEntity(params, "utf8");
        httpPost.setEntity(formEntity);
        // try-with-resources closes the response on both branches
        try (CloseableHttpResponse response = httpClient.execute(httpPost)) {
            if (response.getStatusLine().getStatusCode() == 200) {
                String content = EntityUtils.toString(response.getEntity(), "utf8");
                System.out.println(content);
            } else {
                System.out.println(response.getStatusLine());
            }
        }
    }
}

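Fetching the Submission Result

Submission records are fetched from the following URL with a GET request.

http://acm.hdu.edu.cn/status.php

The query parameters below filter the records. To fetch your own submissions, set the user parameter to your username.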

first=&pid=&user=&lang=0&status=0
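
As a small illustration, the query string above can be assembled with the JDK alone. The sketch below fills in only the user parameter and leaves the others at the values shown above; StatusUrl and forUser are hypothetical names, and the article's own code uses HttpClient's URIBuilder instead.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of the article's project.
class StatusUrl {

    // Builds the status page URL that lists the submissions of one user.
    static String forUser(String userName) {
        try {
            return "http://acm.hdu.edu.cn/status.php?first=&pid=&user="
                    + URLEncoder.encode(userName, "UTF-8")
                    + "&lang=0&status=0";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "alice" is a placeholder username.
        System.out.println(forUser("alice"));
    }
}
```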

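The figure below shows the HTML of the submission status page. The submission records are stored in a table.

Because the newest submission is the first record in the table, only that row is needed. The selector below addresses it as the third tr of the table.

A CSS selector picks out all the td cells of that row, and the text of each cell is then extracted.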

#fixed_table > table > tbody > tr:nth-child(3) > td

[Figure: HTML source of the submission status page]

Create Result.java to hold the result of a submission.

package cn.jxj4869.ojsubmitter.bean;

import lombok.Data;
import lombok.EqualsAndHashCode;
import lombok.NoArgsConstructor;
import lombok.experimental.Accessors;

@Data
@EqualsAndHashCode(callSuper = false)
@Accessors(chain = true)
public class Result {
    private String status;
    private String language;
    private String time;
    private String memory;
    private String problemId;

}

Add a getAns method and a Result field to HDUSubmitter.java.

The comments in the code explain the details. Jsoup is used to parse the HTML.

package cn.jxj4869.ojsubmitter.submitter;

import cn.jxj4869.ojsubmitter.bean.Result;
import cn.jxj4869.ojsubmitter.bean.Submission;
import lombok.Data;
import lombok.EqualsAndHashCode;
import lombok.experimental.Accessors;
import org.apache.http.NameValuePair;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.util.EntityUtils;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

import java.io.IOException;
import java.util.ArrayList;


@Data
@EqualsAndHashCode(callSuper = false)
@Accessors(chain = true)
public class HDUSubmitter {

    private Submission submission;
    private Result result;
    private String userName;
    private String password;

    private CloseableHttpClient httpClient = HttpClients.createDefault();
    private RequestConfig config = RequestConfig.custom().setConnectTimeout(1000)
            .setConnectionRequestTimeout(500)
            .setSocketTimeout(10 * 1000)
            .build();

    public HDUSubmitter(String userName, String password) throws IOException {
        this.userName = userName;
        this.password = password;
        login();
    }


    private void login() throws IOException {
        HttpPost httpPost = new HttpPost("http://acm.hdu.edu.cn/userloginex.php?action=login&cid=0&notice=0");
        ArrayList<NameValuePair> params = new ArrayList<>();
        params.add(new BasicNameValuePair("login", "Sign In"));
        params.add(new BasicNameValuePair("username", userName));
        params.add(new BasicNameValuePair("userpass", password));
        // Build the form entity: the first argument is the form data, the second is the charset
        UrlEncodedFormEntity formEntity = new UrlEncodedFormEntity(params, "utf8");
        httpPost.setEntity(formEntity);
        // Close the response so the connection returns to the pool; the client keeps the session cookie
        try (CloseableHttpResponse response = httpClient.execute(httpPost)) {
            EntityUtils.consume(response.getEntity());
        }
    }

    private void submit() throws IOException {
        HttpPost httpPost = new HttpPost("http://acm.hdu.edu.cn/submit.php?action=submit");
        ArrayList<NameValuePair> params = new ArrayList<>();
        params.add(new BasicNameValuePair("check", "0"));
        // Take the language and problem id from the Submission instead of hard-coding them
        params.add(new BasicNameValuePair("language", String.valueOf(submission.getLanguage())));
        params.add(new BasicNameValuePair("problemid", submission.getOriginProblemId()));
        params.add(new BasicNameValuePair("usercode", submission.getSourceCode()));
        UrlEncodedFormEntity formEntity = new UrlEncodedFormEntity(params, "utf8");
        httpPost.setEntity(formEntity);
        // try-with-resources closes the response on both branches
        try (CloseableHttpResponse response = httpClient.execute(httpPost)) {
            if (response.getStatusLine().getStatusCode() == 200) {
                String content = EntityUtils.toString(response.getEntity(), "utf8");
                System.out.println(content);
            } else {
                System.out.println(response.getStatusLine());
            }
        }
    }


    public void getAns() throws Exception {
        URIBuilder uriBuilder = new URIBuilder("http://acm.hdu.edu.cn/status.php");
        uriBuilder.setParameter("user", userName);
        HttpGet httpGet = new HttpGet(uriBuilder.build());
        int count = 0;
        while (true) {
            // After more than four failures, give up: the network or the HDU server is probably down
            if (count > 4) {
                throw new Exception("Could not fetch the judge result from HDU OJ");
            }
            Thread.sleep(1000);
            // try-with-resources closes the response on every path
            try (CloseableHttpResponse response = httpClient.execute(httpGet)) {
                // If the request fails, wait two more seconds and try again
                if (response.getStatusLine().getStatusCode() != 200) {
                    Thread.sleep(2000);
                    count++;
                    continue;
                }
                String content = EntityUtils.toString(response.getEntity(), "utf8");
                Document doc = Jsoup.parse(content);
                Elements tds = doc.select("#fixed_table > table > tbody > tr:nth-child(3) > td");
                String status = tds.get(2).text();

                // While the submission is still waiting or being judged, keep polling
                if (status.contains("ing")) {
                    continue;
                }
                String problemId = tds.get(3).text();
                String time = tds.get(4).text();
                String memory = tds.get(5).text();
                String language = tds.get(7).text();

                result = new Result();
                result.setLanguage(language).setMemory(memory).setTime(time).setStatus(status).setProblemId(problemId);
                System.out.println(result);
                break;
            } catch (IOException | RuntimeException e) {
                // Count network and parsing errors as failures so the loop cannot spin forever
                count++;
            }
        }

    }
}
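
The cell indexes used in getAns are easier to follow in isolation. The sketch below maps the cell texts of one table row to named fields and repeats the "still judging" check, using plain Java so it runs without Jsoup. StatusRow is a hypothetical class, the sample row is invented, and the assumption that the unused cells hold a run id, submit time, code length and author is mine rather than the article's.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical value class; the indexes 2, 3, 4, 5 and 7 follow getAns above.
class StatusRow {
    final String status;
    final String problemId;
    final String time;
    final String memory;
    final String language;

    // cells: the text of every td in one row, in page order.
    StatusRow(List<String> cells) {
        if (cells.size() < 8) {
            // Guards against an unexpected page, for example when the table is missing.
            throw new IllegalArgumentException("Expected at least 8 cells, got " + cells.size());
        }
        status = cells.get(2);
        problemId = cells.get(3);
        time = cells.get(4);
        memory = cells.get(5);
        language = cells.get(7);
    }

    // Same test as getAns: a status containing "ing" is treated as not final yet.
    boolean isPending() {
        return status.contains("ing");
    }

    public static void main(String[] args) {
        // Invented sample row, for illustration only.
        List<String> cells = Arrays.asList("1", "2020-05-20 12:00:00", "Accepted",
                "1000", "15MS", "1400K", "120 B", "G++", "alice");
        StatusRow row = new StatusRow(cells);
        System.out.println(row.status + " pending=" + row.isPending());
    }
}
```

Checking the cell count before indexing turns a malformed page into a clear error message instead of an IndexOutOfBoundsException.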

Testing

Before testing, add the finishing touches to HDUSubmitter.java.

Add two methods: work and wait2SubmitTimeLimit.

package cn.jxj4869.ojsubmitter.submitter;

import cn.jxj4869.ojsubmitter.bean.Result;
import cn.jxj4869.ojsubmitter.bean.Submission;
import lombok.Data;
import lombok.EqualsAndHashCode;
import lombok.experimental.Accessors;
import org.apache.http.NameValuePair;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.util.EntityUtils;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

import java.io.IOException;
import java.util.ArrayList;


@Data
@EqualsAndHashCode(callSuper = false)
@Accessors(chain = true)
public class HDUSubmitter {

    private Submission submission;
    private Result result;
    private String userName;
    private String password;

    private CloseableHttpClient httpClient = HttpClients.createDefault();
    private RequestConfig config = RequestConfig.custom().setConnectTimeout(1000)
            .setConnectionRequestTimeout(500)
            .setSocketTimeout(10 * 1000)
            .build();

    public HDUSubmitter(String userName, String password) throws IOException {
        this.userName = userName;
        this.password = password;
        login();
    }




    private void login() throws IOException {
        HttpPost httpPost = new HttpPost("http://acm.hdu.edu.cn/userloginex.php?action=login&cid=0&notice=0");
        ArrayList<NameValuePair> params = new ArrayList<>();
        params.add(new BasicNameValuePair("login", "Sign In"));
        params.add(new BasicNameValuePair("username", userName));
        params.add(new BasicNameValuePair("userpass", password));
        // Build the form entity: the first argument is the form data, the second is the charset
        UrlEncodedFormEntity formEntity = new UrlEncodedFormEntity(params, "utf8");
        httpPost.setEntity(formEntity);
        // Close the response so the connection returns to the pool; the client keeps the session cookie
        try (CloseableHttpResponse response = httpClient.execute(httpPost)) {
            EntityUtils.consume(response.getEntity());
        }
    }

    private void submit() throws IOException {
        HttpPost httpPost = new HttpPost("http://acm.hdu.edu.cn/submit.php?action=submit");
        ArrayList<NameValuePair> params = new ArrayList<>();
        params.add(new BasicNameValuePair("check", "0"));
        // Take the language and problem id from the Submission instead of hard-coding them
        params.add(new BasicNameValuePair("language", String.valueOf(submission.getLanguage())));
        params.add(new BasicNameValuePair("problemid", submission.getOriginProblemId()));
        params.add(new BasicNameValuePair("usercode", submission.getSourceCode()));
        UrlEncodedFormEntity formEntity = new UrlEncodedFormEntity(params, "utf8");
        httpPost.setEntity(formEntity);
        // try-with-resources closes the response on both branches
        try (CloseableHttpResponse response = httpClient.execute(httpPost)) {
            if (response.getStatusLine().getStatusCode() == 200) {
                String content = EntityUtils.toString(response.getEntity(), "utf8");
                System.out.println(content);
            } else {
                System.out.println(response.getStatusLine());
            }
        }
    }


    public void getAns() throws Exception {
        URIBuilder uriBuilder = new URIBuilder("http://acm.hdu.edu.cn/status.php");
        uriBuilder.setParameter("user", userName);
        HttpGet httpGet = new HttpGet(uriBuilder.build());
        int count = 0;
        while (true) {
            // After more than four failures, give up: the network or the HDU server is probably down
            if (count > 4) {
                throw new Exception("Could not fetch the judge result from HDU OJ");
            }
            Thread.sleep(1000);
            // try-with-resources closes the response on every path
            try (CloseableHttpResponse response = httpClient.execute(httpGet)) {
                // If the request fails, wait two more seconds and try again
                if (response.getStatusLine().getStatusCode() != 200) {
                    Thread.sleep(2000);
                    count++;
                    continue;
                }
                String content = EntityUtils.toString(response.getEntity(), "utf8");
                Document doc = Jsoup.parse(content);
                Elements tds = doc.select("#fixed_table > table > tbody > tr:nth-child(3) > td");
                String status = tds.get(2).text();
                // While the submission is still waiting or being judged, keep polling
                if (status.contains("ing")) {
                    continue;
                }
                String problemId = tds.get(3).text();
                String time = tds.get(4).text();
                String memory = tds.get(5).text();
                String language = tds.get(7).text();

                result = new Result();
                result.setLanguage(language).setMemory(memory).setTime(time).setStatus(status).setProblemId(problemId);
                System.out.println(result);
                break;
            } catch (IOException | RuntimeException e) {
                // Count network and parsing errors as failures so the loop cannot spin forever
                count++;
            }
        }

    }


    /**
     * HDU OJ requires at least 5 seconds between two submissions; this method waits 10 seconds.
     */
    private void wait2SubmitTimeLimit() {
        try {
            Thread.sleep(10000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }

    public void work() {
        try {
            try {
                submit();
            } catch (Exception e) {
                e.printStackTrace();
                Thread.sleep(2000);
                // The session is logged out after a long idle period, so if the first submission fails, log in again before retrying.
                login();
                Thread.sleep(2000);
                submit();
            }
            getAns();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

}
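
The fixed 10-second sleep in wait2SubmitTimeLimit always waits the full time, even when the previous submission was long ago. As an alternative sketch, not the article's method, the class below waits only for whatever remains of a minimum interval. SubmitThrottle and its method names are hypothetical; the 5-second figure in the usage note comes from the comment in the listing above.

```java
// Hypothetical helper that spaces submissions at least minIntervalMillis apart.
class SubmitThrottle {
    private final long minIntervalMillis;
    private boolean submittedBefore = false;
    private long lastSubmitMillis;

    SubmitThrottle(long minIntervalMillis) {
        this.minIntervalMillis = minIntervalMillis;
    }

    // Milliseconds the caller still has to wait at time nowMillis; 0 means "go ahead".
    long remainingWait(long nowMillis) {
        if (!submittedBefore) {
            return 0;
        }
        long elapsed = nowMillis - lastSubmitMillis;
        return Math.max(0, minIntervalMillis - elapsed);
    }

    // Records that a submission happened at time nowMillis.
    void markSubmitted(long nowMillis) {
        submittedBefore = true;
        lastSubmitMillis = nowMillis;
    }

    // Sleeps for the remaining wait, if any, then records the new submission time.
    void awaitTurn() throws InterruptedException {
        long wait = remainingWait(System.currentTimeMillis());
        if (wait > 0) {
            Thread.sleep(wait);
        }
        markSubmitted(System.currentTimeMillis());
    }

    public static void main(String[] args) throws InterruptedException {
        SubmitThrottle throttle = new SubmitThrottle(5000);
        throttle.awaitTurn(); // first call returns immediately
        System.out.println("wait before next submit: " + throttle.remainingWait(System.currentTimeMillis()) + " ms");
    }
}
```

A submitter could hold one new SubmitThrottle(5000) and call awaitTurn() right before each submit(). Passing the time into remainingWait and markSubmitted keeps the interval arithmetic testable without real sleeping.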

Create a Configuration Class

Hand HDUSubmitter over to the Spring container.

package cn.jxj4869.ojsubmitter.config;

import cn.jxj4869.ojsubmitter.submitter.HDUSubmitter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import java.io.IOException;

@Configuration
public class SubmitterConfig {

    // The constructor takes two arguments: the username and the password. Replace the placeholders below with your own.
    @Bean
    public HDUSubmitter hduSubmitter1() throws IOException {
        return new HDUSubmitter(username,password);
    }
}
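
The username and password in the configuration class above are placeholders. One possible way to supply them without writing them into the source is to read them from environment variables, sketched below in plain Java. HduCredentials and the variable names HDU_USERNAME and HDU_PASSWORD are hypothetical and are not used anywhere in the article.

```java
import java.util.Map;

// Hypothetical holder for the two constructor arguments of HDUSubmitter.
class HduCredentials {
    final String userName;
    final String password;

    HduCredentials(String userName, String password) {
        this.userName = userName;
        this.password = password;
    }

    // Reads both values from an environment-style map and fails fast if either is missing.
    static HduCredentials fromEnv(Map<String, String> env) {
        String userName = env.get("HDU_USERNAME"); // assumed variable name
        String password = env.get("HDU_PASSWORD"); // assumed variable name
        if (userName == null || userName.isEmpty() || password == null || password.isEmpty()) {
            throw new IllegalStateException("Set HDU_USERNAME and HDU_PASSWORD before starting");
        }
        return new HduCredentials(userName, password);
    }

    public static void main(String[] args) {
        try {
            HduCredentials credentials = fromEnv(System.getenv());
            System.out.println("Loaded credentials for " + credentials.userName);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Inside the bean method, the two fields of HduCredentials.fromEnv(System.getenv()) could then be passed to the HDUSubmitter constructor.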

Write a Test Class

package cn.jxj4869.ojsubmitter;

import cn.jxj4869.ojsubmitter.bean.Result;
import cn.jxj4869.ojsubmitter.bean.Submission;
import cn.jxj4869.ojsubmitter.submitter.HDUSubmitter;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.boot.test.context.SpringBootTest;

@SpringBootTest
class OjsubmitterApplicationTests {


    @Autowired
    @Qualifier("hduSubmitter1")
    HDUSubmitter hduSubmitter1;

    @Test
    void contextLoads() throws Exception {
        Submission submission=new Submission();
        submission.setLanguage(0).setOriginProblemId(1000+"").setSourceCode("#include<bits/stdc++.h>\n" +
                "using namespace std;\n" +
                "int main()\n" +
                "{\n" +
                "   long long a,b;\n" +
                "while(cin>>a>>b)\n" +
                "{\n" +
                "cout<<a+b<<endl;\n" +
                "}\n" +
                "}");
        hduSubmitter1.setSubmission(submission);
        hduSubmitter1.work();
        Result result = hduSubmitter1.getResult();
        System.out.println(result);
    }

}


Reposted from blog.csdn.net/qq_43058685/article/details/106242826