[Repost] Multi-threaded download over the HTTP protocol

1. The basic principle: each thread downloads a different part of the file, and the parts are finally merged into the complete data.

2. The advantage of multi-threaded download
    is fast download speed. Why? It is easy to understand. Previously, my download occupied a single thread on the server.
    I am certainly not the only one downloading at that moment; many download threads are serving server resources at the same time. A single CPU core cannot truly execute them all at once.
    Instead, the CPU divides its time into slices, shares them fairly among these threads, and executes them in turn: thread a gets ten milliseconds, then thread b gets ten milliseconds...
    With the method of this article, my download application can occupy any number of threads on the server side at the same time and download through all of them simultaneously (in theory).
    Assuming 50 threads, the application receives roughly 50 times as much of the server CPU's attention as a single-threaded download.
    But the speed will always be capped by the bandwidth of the local network.

3. The length of data each thread is responsible for can be calculated by dividing the "total length of the downloaded data" by the "total number of threads participating in the download". But the case where the length is not evenly divisible must be taken into account.
    Assuming a total length of 10 and 3 threads participating in the download, the formula is:
            int block = totalLength % threadCount == 0 ? totalLength / threadCount : totalLength / threadCount + 1; (if not divisible, add one; here 10 / 3 + 1 = 4)
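The formula above can be tried out with a minimal sketch like the following. The class and method names are made up for illustration, and long is used instead of int only as an assumption to leave room for large files.

```java
class BlockSize {
    /** Ceiling division: the smallest block such that block * threadCount >= totalLength. */
    static long blockSize(long totalLength, int threadCount) {
        return totalLength % threadCount == 0
                ? totalLength / threadCount
                : totalLength / threadCount + 1;
    }

    public static void main(String[] args) {
        System.out.println(blockSize(10, 3)); // 4 (not divisible, so add one)
        System.out.println(blockSize(10, 5)); // 2 (divisible)
    }
}
```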

4. This is similar to paging in a database query. Each thread needs to know where it starts downloading and where it ends.
   First, give each thread an id; ids start from zero: 0 1 2 3...
   Start position: the thread id multiplied by the length of data each thread is responsible for downloading.
   End position: the position just before the start position of the next thread.
   For example:
      int startPosition = threadId * block;
      int endPosition = (threadId + 1) * block - 1;
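As a rough illustration of the two formulas, the sketch below prints the range of every thread for a total length of 10 and 3 threads (so a block of 4). The class and method names are hypothetical.

```java
class ThreadRanges {
    /** First byte position handled by the given thread. */
    static long startPosition(int threadId, long block) {
        return threadId * block;
    }

    /** Last byte position (inclusive) handled by the given thread. */
    static long endPosition(int threadId, long block) {
        return (threadId + 1) * block - 1;
    }

    public static void main(String[] args) {
        long block = 4; // 10 bytes shared by 3 threads, rounded up
        for (int id = 0; id < 3; id++) {
            // prints "thread 0: 0-3", "thread 1: 4-7", "thread 2: 8-11"
            System.out.println("thread " + id + ": "
                    + startPosition(id, block) + "-" + endPosition(id, block));
        }
    }
}
```

Note that the last thread's end position (11) lies past the last byte of the data (position 9). The Range behavior described in point 5 is what makes this harmless.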
       
5. The Range header of the HTTP protocol specifies where in the file the download starts and where it ends. The unit is 1 byte,
   and both positions are inclusive. Range: bytes=2097152-4194304 means that the download starts at the 2M position of the file and ends at the 4M position.
   If the Range asks to read up to byte position 5104389 but the file itself is only 4104389 bytes long, the download automatically stops at the end of the file.
   Therefore, no redundant invalid data is downloaded.
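To make the inclusive positions concrete, here is a small sketch that builds the header value and computes how many bytes a server that honors Range would be expected to send. The helper names are hypothetical, and the clamping logic is my reading of the behavior described above, not code from the original article.

```java
class RangeHeader {
    /** Value for the Range request header; both positions are inclusive. */
    static String rangeValue(long first, long last) {
        return "bytes=" + first + "-" + last;
    }

    /** Bytes expected back for an inclusive range on a file of totalLength bytes. */
    static long expectedBytes(long first, long last, long totalLength) {
        if (first >= totalLength) {
            return 0; // the range starts past the end of the file
        }
        long effectiveLast = Math.min(last, totalLength - 1); // clamp to the last real byte
        return effectiveLast - first + 1;
    }

    public static void main(String[] args) {
        // Exactly 2 MiB starting at the 2 MiB position
        System.out.println(rangeValue(2097152, 4194303));
        // A range that overshoots a 4104389-byte file
        System.out.println(expectedBytes(0, 5104389, 4104389));
    }
}
```

Because the end position is inclusive, a request for exactly 2 MiB ends at 4194303 rather than 4194304.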

6. Another difficulty is how to write the data to the local file in the right order. Because the threads run concurrently, they write data to the local target file at the same time.
   The order in which the threads write does not match the order of the data itself. If you write with a normal sequential OutputStream, the final local file will be corrupted.
   So we will use the following class:
     java.io.RandomAccessFile
     This class implements both the DataOutput and DataInput interfaces, so it can both write and read.
     It maintains a file pointer and can start reading or writing at any position of the file.
     In other words, instances of this class support both reading and writing to a random access file.
    
     For example:
      
        File file = new File("1.txt");
        RandomAccessFile accessFile = new RandomAccessFile(file, "rwd");
        accessFile.setLength(1024);
   
    After executing this code we have not written any data to the target file "1.txt", but its size is already 1 KB. This is the size we set ourselves.
    This operation is similar to storing a large byte array in the file. The array stretches the file to the specified size, waiting to be filled.
    The benefit is that we can randomly access any part of this file through an "index".
    For example, suppose the file size is 500.
    My business requirement may be to write data from position 300 up to 350 the first time,
    and from position 50 up to 100 the second time.
    In short, I do not write this file "in sequence" or "in one go".
    RandomAccessFile supports exactly this kind of operation.
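The 500-byte scenario can be sketched as follows. This is only an illustration: the class name, the temporary file, and the fill characters 'A' and 'B' are my own choices, and "rw" mode is used instead of "rwd" for simplicity.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class OutOfOrderWrite {
    /** Writes 50 bytes at position 300 first, then 50 bytes at position 50, and returns the file content. */
    static byte[] writeOutOfOrder() {
        try {
            Path path = Files.createTempFile("out-of-order-demo", ".bin");
            try (RandomAccessFile file = new RandomAccessFile(path.toFile(), "rw")) {
                file.setLength(500);          // stretch the file to its final size
                file.seek(300);               // first write: positions 300..349
                file.write(filled('B', 50));
                file.seek(50);                // second write: positions 50..99
                file.write(filled('A', 50));
            }
            byte[] content = Files.readAllBytes(path);
            Files.delete(path);
            return content;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static byte[] filled(char value, int count) {
        byte[] bytes = new byte[count];
        Arrays.fill(bytes, (byte) value);
        return bytes;
    }

    public static void main(String[] args) {
        byte[] content = writeOutOfOrder();
        System.out.println("length = " + content.length);
        System.out.println("byte at 50 = " + (char) content[50] + ", byte at 300 = " + (char) content[300]);
    }
}
```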
    
     API
      void setLength(long newLength)
          Sets the length of this file. (Sets the expected size of the file.)
      void seek(long pos)
          Sets the file-pointer offset, measured from the beginning of this file, at which the next read or write occurs.
          For example, passing 1028 to this method means that the next write starts at position 1028 of the file.
      void write(byte[] b, int off, int len)
          Writes len bytes from the specified byte array starting at offset off to this file.
      void write(byte[] b)
          Writes b.length bytes from the specified byte array to this file, starting at the current file pointer.
      void writeUTF(String str)
          Writes a string to the file using modified UTF-8 encoding in a machine-independent manner.
      String readLine()
          Reads the next line of text from this file.
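A short sketch of a few of these calls working together. The class and method names are hypothetical, and readUTF() is not in the list above; it is the RandomAccessFile counterpart of writeUTF() and is used here only to read the string back.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class RandomAccessApiDemo {
    /** Writes text with writeUTF, rewinds with seek(0), and reads it back with readUTF. */
    static String utfRoundTrip(String text) {
        try {
            Path path = Files.createTempFile("api-demo-utf", ".bin");
            try (RandomAccessFile file = new RandomAccessFile(path.toFile(), "rw")) {
                file.writeUTF(text);
                file.seek(0);
                return file.readUTF();
            } finally {
                Files.deleteIfExists(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes only "cde" out of "abcdef" using write(b, off, len), then reads it with readLine. */
    static String partialWrite() {
        try {
            Path path = Files.createTempFile("api-demo-partial", ".bin");
            try (RandomAccessFile file = new RandomAccessFile(path.toFile(), "rw")) {
                byte[] source = "abcdef".getBytes(StandardCharsets.US_ASCII);
                file.write(source, 2, 3); // off = 2, len = 3
                file.seek(0);
                return file.readLine();
            } finally {
                Files.deleteIfExists(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(utfRoundTrip("hello"));
        System.out.println(partialWrite());
    }
}
```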

      Experimental code:
    

import java.io.File;
import java.io.RandomAccessFile;

/* Wrapper class added so that the snippet compiles; the name is arbitrary */
public class RandomAccessFileTest {
    public static void main(String[] args) throws Exception {
        File file = new File("1.txt");
        RandomAccessFile accessFile = new RandomAccessFile(file, "rwd");
        /* set the file size to 3 bytes */
        accessFile.setLength(3);
        /* write '2' to the second position */
        accessFile.seek(1);
        accessFile.write("2".getBytes());
        /* write '1' to the first position */
        accessFile.seek(0);
        accessFile.write("1".getBytes());
        /* write '3' to the third position */
        accessFile.seek(2);
        accessFile.write("3".getBytes());
        accessFile.close();
        // Expect the content of the file to be: 123
    }
}


   
    The experiment succeeds: although the strings are written in the order "2", "1", "3", the file offset is set before each write, so the data finally saved in the file is: 123
    Another question: after these three writes the file is exactly 3 bytes long and completely filled. What happens if we keep writing data into it?
   
    /* Write data to the fourth byte position that exceeds the size */
    accessFile.seek(3);
    accessFile.write("400".getBytes());
   
    In the code above, both the file pointer offset passed to seek and the data written go beyond the 3 bytes initially set for the file.
    My guess was that "accessFile.seek(3)" would at least throw an "ArrayIndexOutOfBoundsException", indicating that the index is out of bounds,
    while "accessFile.write("400".getBytes())" on its own should succeed, because the requirement is reasonable and there should be a mechanism to support it.
    The result of the experiment is that both statements succeed. The large byte array implicit in the file is expanded automatically: seek may set the offset beyond the end of the file, and the file length changes only when data is written there.
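The experiment can be repeated with a sketch like this one, which records the file length at each step. The class name and temporary file are assumptions for illustration, and "rw" mode is used instead of "rwd".

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class GrowOnWrite {
    /** Returns the file length after setLength(3), after seek(3), and after writing "400". */
    static long[] lengths() {
        try {
            Path path = Files.createTempFile("grow-on-write", ".bin");
            try (RandomAccessFile file = new RandomAccessFile(path.toFile(), "rw")) {
                file.setLength(3);
                long initial = file.length();
                file.seek(3);                      // position just past the current end
                long afterSeek = file.length();    // seeking alone does not change the length
                file.write("400".getBytes(StandardCharsets.US_ASCII));
                long afterWrite = file.length();   // the write extends the file
                return new long[] {initial, afterSeek, afterWrite};
            } finally {
                Files.deleteIfExists(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] result = lengths();
        System.out.println(result[0] + " -> " + result[1] + " -> " + result[2]); // 3 -> 3 -> 6
    }
}
```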
   
    But note that every position within the file size should end up holding valid data; none should be left unwritten.
    E.g.:

        accessFile.seek(2);
        accessFile.write("3".getBytes());
       
        accessFile.seek(5);
        accessFile.write("400".getBytes());
    Combined with the previous code, the final result is:
        123  400
    with garbled characters at the two positions that were skipped (positions 3 and 4). This is to be expected, since nothing was ever written there.
   
    Also, suppose we set the file length to one hundred:
        accessFile.setLength(100);
    but actually write values only to its first five positions. The data saved in the file then ends with 95 unwritten bytes, which show up as garbled characters.
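To look at the skipped positions directly, a sketch along these lines dumps the bytes of the gap. The names are hypothetical. The RandomAccessFile documentation does not define the content of an extended region, so the program only prints the gap bytes; on common file systems they tend to be zero bytes, which a text editor displays as garbled characters.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class GapBytes {
    /** Writes "123" at position 0 and "400" at position 5, leaving positions 3 and 4 unwritten. */
    static byte[] writeWithGap() {
        try {
            Path path = Files.createTempFile("gap-demo", ".bin");
            try (RandomAccessFile file = new RandomAccessFile(path.toFile(), "rw")) {
                file.write("123".getBytes(StandardCharsets.US_ASCII));
                file.seek(5); // skip positions 3 and 4
                file.write("400".getBytes(StandardCharsets.US_ASCII));
            }
            byte[] content = Files.readAllBytes(path);
            Files.delete(path);
            return content;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] content = writeWithGap();
        System.out.println("length = " + content.length);
        // Print the two unwritten bytes as numbers instead of characters
        System.out.println("gap bytes: " + content[3] + ", " + content[4]);
    }
}
```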
   
7. The preparations should now be sufficient. The code follows.


  
import java.io.File;   
import java.io.IOException;   
import java.io.InputStream;   
import java.io.RandomAccessFile;   
import java.net.HttpURLConnection;   
import java.net.URL;   
/** 
* Multi-threaded file download 
*/   
public class MulThreadDownload {   
    /* Download URL */   
    private URL downloadUrl;   
    /* Local file for saving*/   
    private File localFile;   
    /* Data length downloaded by each thread */
    private int block;   
    public static void main(String[] args) {   
        /* Can be any legal download address on the network*/   
        String downPath = "http://192.168.1.102:8080/myvideoweb/down.avi";   
        MulThreadDownload threadDownload = new MulThreadDownload();   
        /* Start 10 threads to download */
        try {   
            threadDownload.download(downPath, 10);   
        } catch (Exception e) {   
            e.printStackTrace();   
        }   
    }   
    /** 
     * Multi-threaded file download 
     *  
     * @param path download address 
     * @param threadCount thread count 
     */   
    public void download(String path, int threadCount) throws Exception {   
        downloadUrl = new URL(path);
        HttpURLConnection conn = (HttpURLConnection) downloadUrl
                .openConnection();
        /* Set the GET request method */
        conn.setRequestMethod("GET");
        /* Set the connect timeout to 5 seconds */
        conn.setConnectTimeout(5 * 1000);
        /* Get the local file name */   
        String filename = parseFilename(path);   
        /* Get the total size of the downloaded file*/   
        int dataLen = conn.getContentLength();   
        if (dataLen < 0) {   
            System.out.println("Failed to get data");   
            return;   
        }   
        /* Create a local target file and set its size to the total size of the file to be downloaded*/   
        localFile = new File(filename);   
        RandomAccessFile accessFile = new RandomAccessFile(localFile, "rwd");   
        /* At this point a file whose size equals the total size of the download already exists in the local directory */
        accessFile.setLength(dataLen);   
        accessFile.close();   
        /* Calculate the size of the data to be downloaded by each thread*/   
        block = dataLen % threadCount == 0 ? dataLen / threadCount : dataLen / threadCount + 1;   
        /* Start the threads to download the file */
        for (int i = 0; i < threadCount; i++) {
            new DownloadThread(i).start();
        }   
    }   
    /** 
     * Parse the file name from the download path
     */   
    private String parseFilename(String path) {   
        return path.substring(path.lastIndexOf("/") + 1);   
    }   
    /** 
     * Inner class: file download thread class 
     */   
    private final class DownloadThread extends Thread {   
        /* thread id */   
        private int threadid;   
        /* where to start downloading */   
        private int startPosition;   
        /* End download position*/   
        private int endPosition;   
        /** 
         * Create a new download thread 
         * @param threadid thread id 
         */   
        public DownloadThread(int threadid) {   
            this.threadid = threadid;   
            startPosition = threadid * block;
            endPosition = (threadid + 1) * block - 1;   
        }   
        @Override   
        public void run() {   
            System.out.println("thread'" + threadid + "'Start download..");   
               
            RandomAccessFile accessFile = null;   
            try {   
                /* Open the local file and move the file pointer to where this thread starts writing; "rwd" means read and write access, with every update to the file content written synchronously to the storage device */
                accessFile = new RandomAccessFile(localFile, "rwd");   
                accessFile.seek(startPosition);   
                HttpURLConnection conn = (HttpURLConnection) downloadUrl.openConnection();
                conn.setRequestMethod("GET");   
                conn.setReadTimeout(5 * 1000);   
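                /* Set the Range header of the HTTP request to specify the range of data the server should return */
                conn.setRequestProperty("Range", "bytes=" + startPosition + "-"
                        + endPosition);
                /* A server that honors Range answers 206 Partial Content; any other response would carry the wrong data for this block */
                if (conn.getResponseCode() != HttpURLConnection.HTTP_PARTIAL) {
                    throw new IOException("Range request not honored, response code: " + conn.getResponseCode());
                }
                /* write data to local file */
                writeTo(accessFile, conn);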
                   
                System.out.println("Thread'" + threadid + "'Download completed");   
            } catch (IOException e) {   
                e.printStackTrace();   
            } finally {   
                try {   
                    if(accessFile != null) {   
                        accessFile.close();
                    }
                } catch (IOException ex) {
                    ex.printStackTrace();
                }
            }   
        }   
        /** 
         * Write the downloaded data to a local file 
         */   
        private void writeTo(RandomAccessFile accessFile,   
                HttpURLConnection conn){   
            InputStream is = null;   
            try {   
                is = conn.getInputStream();   
                byte[] buffer = new byte[1024];   
                int len = -1;   
                while ((len = is.read(buffer)) != -1) {   
                    accessFile.write(buffer, 0, len);   
                }   
            } catch (IOException e) {   
                e.printStackTrace();   
            } finally {   
                try {   
                    if(is != null) {   
                        is.close();   
                    }    
                } catch (Exception ex) {   
                    ex.printStackTrace();   
                }   
            }   
        }   
    }   
}   
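The download() method above returns right after starting the threads, so the caller cannot tell when the file is complete. The following sketch shows one possible way to wait for all workers with Thread.join() and to check that blocks written concurrently through separate RandomAccessFile instances merge into the right content. It is not the article's code: it copies from an in-memory byte array instead of an HTTP connection so that it runs without a server, and every name in it is hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class ParallelWriteDemo {
    /** Splits source into blocks, writes each block from its own thread, and returns the merged file content. */
    static byte[] copyInParallel(byte[] source, int threadCount) {
        try {
            Path target = Files.createTempFile("parallel-write-demo", ".bin");
            try (RandomAccessFile file = new RandomAccessFile(target.toFile(), "rw")) {
                file.setLength(source.length); // pre-size the target file
            }
            int block = source.length % threadCount == 0
                    ? source.length / threadCount
                    : source.length / threadCount + 1;
            List<Thread> workers = new ArrayList<>();
            for (int id = 0; id < threadCount; id++) {
                int start = Math.min(id * block, source.length);
                int end = Math.min(start + block, source.length); // exclusive, clamped to the data
                Thread worker = new Thread(() -> writeBlock(target, source, start, end - start));
                workers.add(worker);
                worker.start();
            }
            for (Thread worker : workers) {
                worker.join(); // wait until every block has been written
            }
            byte[] merged = Files.readAllBytes(target);
            Files.delete(target);
            return merged;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Each worker opens its own RandomAccessFile, so the file pointers do not interfere. */
    private static void writeBlock(Path target, byte[] source, int offset, int length) {
        if (length == 0) {
            return; // nothing left for this worker
        }
        try (RandomAccessFile file = new RandomAccessFile(target.toFile(), "rw")) {
            file.seek(offset);
            file.write(source, offset, length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] source = new byte[1000];
        for (int i = 0; i < source.length; i++) {
            source[i] = (byte) (i % 251);
        }
        byte[] merged = copyInParallel(source, 7);
        System.out.println("identical: " + java.util.Arrays.equals(source, merged));
    }
}
```

In the real downloader, the same join loop could be placed after the loop that starts the DownloadThread instances, assuming the threads are kept in a list.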

Origin http://10.200.1.11:23101/article/api/json?id=327058529&siteId=291194637