Breakpoint resume for WinForms file downloads

In the first two articles of this series, I introduced the basic usage of WebClient and WinINet for download tasks, along with some practical techniques.

Today I will cover one of the most common needs when downloading files: resuming an interrupted download from the point where it stopped (breakpoint resume).

To be clear, breakpoint resume in this article refers specifically to resuming over the HTTP protocol. The article describes the approach and the key code for implementing it. Readers who want more detail can download the demo attached to this article.


Working principle

HTTP defines a number of request and response headers. By combining them, the same file can be downloaded in several pieces. For example, one HTTP request asks for only part of the file and the received data is saved; the next request then only needs to ask for the rest. Once all of the data has been downloaded locally, the pieces are merged into the complete file.

HTTP specifies that the range of data to request can be given through the Range header of the request.

The Range header is simple to use and takes the following format:

Range: bytes=500-999

The header above means: request only bytes 500 through 999 of the target file, 500 bytes in total. Byte offsets start at 0, and both ends of the range are inclusive.

For example, suppose a 1000-byte file needs to be downloaded. The first request does not specify a Range header, which means the entire file is requested. If the download is interrupted after the first 500 bytes (offsets 0 to 499) have been received, the next request only needs to ask for the remainder of the file, bytes 500 to 999.
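As a minimal sketch of the arithmetic in this example, the helper below builds the Range value for the part of a file that is still missing. The names RangeMath and BuildRangeHeader and their parameters are made up for illustration; they are not taken from the demo.

```csharp
using System;

public static class RangeMath
{
    // Builds the value of the Range header for the bytes that are still missing.
    // bytesOnDisk: number of bytes already saved locally (hypothetical parameter).
    // totalLength: full size of the file in bytes, or -1 if the size is unknown.
    public static string BuildRangeHeader(long bytesOnDisk, long totalLength)
    {
        if (bytesOnDisk < 0)
            throw new ArgumentOutOfRangeException("bytesOnDisk");
        if (totalLength >= 0 && bytesOnDisk >= totalLength)
            throw new ArgumentException("Nothing left to download.");

        // Offsets are zero-based, so the first missing byte sits at offset bytesOnDisk.
        if (totalLength < 0)
            return "bytes=" + bytesOnDisk + "-"; // open-ended: from here to the end

        // The last valid offset is totalLength - 1.
        return "bytes=" + bytesOnDisk + "-" + (totalLength - 1);
    }
}
```

For the 1000-byte file above, BuildRangeHeader(500, 1000) gives bytes=500-999. With HttpWebRequest, the same header is normally set through its AddRange method rather than by writing the string by hand.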

The principle seems simple, but the following issues need to be considered:

1. Do all web servers support the Range header?

2. There may be a long interval between multiple requests. What if the file on the server changes?

3. How should the partially downloaded data and related information be saved?

4. After the pieces have been stitched back together into a file of the original size, how can we verify that it is exactly the same as the source file?

Next, this article provides solutions for the problems above.


1. How to check whether the server side supports the Range header?

When the server responds to a request, it uses the Accept-Ranges response header to indicate whether it accepts requests for part of a resource. There is one small wrinkle here: different servers may return different values to show that they accept partial requests (a server that supports byte ranges typically returns Accept-Ranges: bytes). The more uniform signal is the negative one: a server that does not support partial requests returns Accept-Ranges: none, so it is enough to check whether the value equals none.

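The code is as follows:

private static bool IsAcceptRanges( WebResponse res )
{
    // Accept-Ranges: none means the server refuses partial requests
    string s = res.Headers["Accept-Ranges"];
    if ( s != null && string.Equals( s.Trim(), "none", StringComparison.OrdinalIgnoreCase ) )
    {
        return false;
    }
    // Any other value, or a missing header, is treated as "ranges accepted"
    return true;
}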
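The Accept-Ranges header is only an announcement. A stricter check, sketched below, looks at the response to an actual Range request: a server that honors the range answers with status 206 and a Content-Range header, while a server that ignores it answers with 200 and sends the whole file. The names RangeProbe and ServerHonoredRange are hypothetical; parsing relies on ContentRangeHeaderValue from System.Net.Http.Headers.

```csharp
using System;
using System.Net.Http.Headers;

public static class RangeProbe
{
    // Returns true when the response shows that the server honored
    // a request of the form "Range: bytes=<expectedStart>-".
    public static bool ServerHonoredRange(int statusCode, string contentRangeHeader, long expectedStart)
    {
        // 200 means the server ignored the Range header and is sending the full file.
        if (statusCode != 206)
            return false;

        // Example of a valid header value: "bytes 500-999/1000".
        ContentRangeHeaderValue parsed;
        if (!ContentRangeHeaderValue.TryParse(contentRangeHeader, out parsed))
            return false;

        // The returned range must start exactly where the local data ends.
        return string.Equals(parsed.Unit, "bytes", StringComparison.OrdinalIgnoreCase)
            && parsed.From == expectedStart;
    }
}
```

With an HttpWebResponse, the arguments would come from (int)response.StatusCode and response.Headers["Content-Range"].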


2. How to check whether the file on the server side has changed?

Suppose a download is interrupted partway through by a network failure or some other cause. If the file on the server has changed in the meantime, the download has to start over from the beginning; resuming from the breakpoint only makes sense when the file on the server has not changed.

So when the download is continued later, how can we tell whether the file on the server is still the same file that was partially downloaded?

For this problem, HTTP response headers give us two options: ETag and Last-Modified.

First look at ETag:

The ETag response-header field provides the current value of the entity tag for the requested variant. (Quoted from RFC 2616, section 14.19, ETag)

Put simply, an ETag is a string that identifies the content of the requested resource. When the resource changes, its ETag changes as well. The simplest approach is therefore to save the ETag from the response to the first request and compare it with the ETag returned by the next request.

The code is as follows:

string newEtag = GetEtag( response );
// tempFileName: the file holding the part of the content already downloaded
// tempFileInfoName: the temporary file that stores the ETag
if ( File.Exists(tempFileName) && File.Exists(tempFileInfoName) )
{
    string oldEtag = File.ReadAllText( tempFileInfoName );
    if ( !string.IsNullOrEmpty(oldEtag) && !string.IsNullOrEmpty(newEtag) && newEtag == oldEtag )
    {
        // The ETag has not changed, so the download can resume from the breakpoint
        resumeDowload = true;
    }
    else if ( !string.IsNullOrEmpty(newEtag) )
    {
        // The file changed on the server: record the new ETag and download again from the start
        File.WriteAllText( tempFileInfoName, newEtag );
    }
}
else
{
    // First download: remember the ETag for later comparison
    if ( !string.IsNullOrEmpty(newEtag) )
    {
        File.WriteAllText( tempFileInfoName, newEtag );
    }
}

// GetEtag function: returns the ETag response header, or null if the server sent none
private static string GetEtag( WebResponse res )
{
    if ( res.Headers["ETag"] != null )
    {
        return res.Headers["ETag"];
    }
    return null;
}

Next, look at Last-Modified:

The Last-Modified entity-header field indicates the date and time at which the origin server believes the variant was last modified. (Quoted from RFC 2616, section 14.29, Last-Modified)

Last-Modified is the time at which the requested resource was last modified on the server. It is used in basically the same way as ETag.

Either ETag or Last-Modified can be used to detect whether the file on the server has changed.

Of course, you can also use both at the same time as a double check to make the detection more reliable.
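The double check could look roughly like the sketch below. It assumes both header values were saved as plain strings on the first request. The names ResumeCheck and IsUnchanged are illustrative, and the rule "restart when nothing can be compared" is a cautious assumption rather than something the demo prescribes.

```csharp
using System;

public static class ResumeCheck
{
    // Returns true only if at least one validator (ETag or Last-Modified) is
    // available on both sides and no available validator disagrees.
    public static bool IsUnchanged(string oldEtag, string newEtag,
                                   string oldLastModified, string newLastModified)
    {
        bool haveEtag = !string.IsNullOrEmpty(oldEtag) && !string.IsNullOrEmpty(newEtag);
        bool haveDate = !string.IsNullOrEmpty(oldLastModified) && !string.IsNullOrEmpty(newLastModified);

        // Nothing to compare: treat the file as changed and restart to be safe.
        if (!haveEtag && !haveDate)
            return false;

        // Compare the raw header strings exactly as the server sent them.
        if (haveEtag && !string.Equals(oldEtag, newEtag, StringComparison.Ordinal))
            return false;
        if (haveDate && !string.Equals(oldLastModified, newLastModified, StringComparison.Ordinal))
            return false;

        return true;
    }
}
```

The Last-Modified value can be read the same way as the ETag, for example with response.Headers["Last-Modified"].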


3. How should the partially downloaded data and related information be saved?

This mainly concerns how to save the data and related information in C#. The general idea is: if a file has not finished downloading, first save the data received so far to a file at some path; when the download is resumed, append the newly downloaded bytes to the end of that file.

For the detailed implementation method, please check the demo code.
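Without access to the demo here, the following is only a sketch of one way to do this, not the demo's actual code. It appends incoming data to a temporary file and uses the length of that file as the offset for the next Range request. The names PartialFile, GetResumeOffset and AppendTo are hypothetical.

```csharp
using System;
using System.IO;

public static class PartialFile
{
    // Number of bytes already saved; this is also the offset of the next byte to request.
    public static long GetResumeOffset(string tempFileName)
    {
        return File.Exists(tempFileName) ? new FileInfo(tempFileName).Length : 0L;
    }

    // Copies everything from source to the end of the temp file and returns the bytes written.
    public static long AppendTo(string tempFileName, Stream source)
    {
        long written = 0;
        byte[] buffer = new byte[8192];

        // FileMode.Append creates the file if it is missing and always writes at its end.
        using (FileStream target = new FileStream(tempFileName, FileMode.Append, FileAccess.Write))
        {
            int read;
            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                target.Write(buffer, 0, read);
                written += read;
            }
        }
        return written;
    }
}
```

In a real download the source stream would be the one returned by response.GetResponseStream(), and the temporary file would be renamed to the final file name once all bytes have arrived.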


4. How to verify the consistency between the downloaded file and the source file?

When resuming from a breakpoint, we download and merge the file byte by byte. If anything goes wrong during the download, the final file may differ from the source file, so it is best to check the downloaded and merged file for consistency with the source. This is a very important step, and also the hardest one to implement, because it needs support from the server: for example, the server must provide not only the file to download but also the MD5 hash of that file.
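Assuming the server does publish the MD5 hash as a hexadecimal string, the client-side check can be sketched as follows. How the expected hash reaches the client (a separate file, a response header, an API) is left open, and the names FileChecksum, ComputeMd5Hex and Matches are illustrative.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class FileChecksum
{
    // Computes the MD5 hash of a file and returns it as lowercase hex.
    public static string ComputeMd5Hex(string path)
    {
        using (MD5 md5 = MD5.Create())
        using (FileStream stream = File.OpenRead(path))
        {
            byte[] hash = md5.ComputeHash(stream);
            // BitConverter yields "5D-41-..."; strip the dashes and lowercase it.
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }

    // Compares the local file against the hash published by the server.
    public static bool Matches(string path, string expectedMd5Hex)
    {
        return string.Equals(ComputeMd5Hex(path), expectedMd5Hex.Trim(),
                             StringComparison.OrdinalIgnoreCase);
    }
}
```

If Matches returns false, the safest reaction is to discard the local data and download the file again from the start.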

Of course, if we also build the server side ourselves, we can implement that support. Some products already provide breakpoint resume for downloads, and the Spread Studio table control is one of them.

Demo download
