myZip: a self-made compression tool built on a Huffman tree. I coughed up half a liter of blood writing this so that even a complete newbie can follow it (。→‿←。) (as long as you have hands, you can do this)

Demonstration video link: click through to watch it!

Principle:

In UTF-8, a character often occupies several bytes. If we instead represent each character by a sequence of binary bits (which we can also decode from later), we save a great deal of space.
And because the Huffman tree is built from the characters' occurrence counts, the binary code obtained for each character (pause for a moment and think about how it is obtained) gets shorter the more often the character appears, which makes the compression even more effective.
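
As a quick, hedged illustration (the text and the code assignment below are made up for this example, not taken from the project): for the string "aaaaabbc", one valid Huffman code is a = 0, b = 10, c = 11, so the frequent 'a' gets the shortest code.

```java
// Toy size comparison only; "aaaaabbc" and the codes a=0, b=10, c=11 are illustrative assumptions.
public class HuffmanSizeDemo {
    public static void main(String[] args) {
        String text = "aaaaabbc";
        int utf8Bytes = text.getBytes(java.nio.charset.StandardCharsets.UTF_8).length; // 8 bytes
        int bits = 5 * 1   // five 'a' at 1 bit each
                 + 2 * 2   // two 'b' at 2 bits each
                 + 1 * 2;  // one 'c' at 2 bits
        System.out.println("UTF-8: " + utf8Bytes + " bytes, Huffman: " + bits
                + " bits (about " + ((bits + 7) / 8) + " bytes, before counting the decoding table)");
    }
}
```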

The overall flow (originally shown as a flowchart):
1. Read the text of the file.
2. Use a HashMap, with the character as key and its occurrence count as value, to obtain the weights.
3. Build a Huffman tree from those weights and derive each character's binary code.
4. Write the codes to the compressed file (remember to write the required hashMap as well).
5. For decompression, read the encoded content back, decode it with the hashMap, and write the result to the decompressed file.
Fundamental knowledge points:

File input and output: handled with the byte streams (OutputStream/InputStream); both reading and writing work byte by byte, so you need to know how to write and read files this way.
Hash table: an array of linked lists that looks up a value by its key; you can read the article I wrote on hashMap.
Huffman tree: build a tree from the weight of each item: merge the two items with the smallest weights into a subtree whose root holds the sum of their weights, put that root back among the remaining items, and repeat until only one node is left.
String concatenation: under the hood, every "+" splice creates a new StringBuilder object, copies the existing string into it (a full traversal), and converts the result back to a String after appending (see the sketch right after this list).
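
To illustrate that last point with a hedged toy example (ConcatDemo and its loop are made up for illustration, not part of myZip): repeated "+" concatenation effectively builds a new StringBuilder and copies the whole string on every pass, while one reused StringBuilder just appends in place.

```java
// Toy comparison, not project code: repeated "+" versus a single reused StringBuilder.
public class ConcatDemo {
    public static void main(String[] args) {
        int n = 50_000;

        long t0 = System.currentTimeMillis();
        String s = "";
        for (int i = 0; i < n; i++) {
            s += 'x';                    // each pass: new StringBuilder, copy of s, append, toString()
        }
        long plusMs = System.currentTimeMillis() - t0;

        t0 = System.currentTimeMillis();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append('x');              // appends into the same backing char[] array
        }
        String viaBuilder = sb.toString();
        long builderMs = System.currentTimeMillis() - t0;

        System.out.println("\"+\": " + plusMs + " ms, StringBuilder: " + builderMs + " ms"
                + " (lengths " + s.length() + " / " + viaBuilder.length() + ")");
    }
}
```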

Implementation:

Compression method:

1. Read the document and build the hashMap<>:

Note that whenever we use the input/output streams, we have to deal with the checked exceptions (here they are simply declared with throws).
The text itself is obtained by the method shown in the code below.
Finally, using the hashMap is also very simple: traverse the string's character array, check whether the character (key) already exists, and either update its weight (occurrence count) or store it with a count of 1. Once everything is stored, iterate over the entry Set to create our initial nodes.

//read the file that is waiting to be compressed
    private String readWaitCop(File wait) throws IOException {
        //read the file content
        FileInputStream fip = new FileInputStream(wait);
        byte[] buf = new byte[fip.available()];
        //**Crucial: read the text into this byte array first, otherwise things go badly wrong (advice from someone who learned it the hard way ~_~)
        /**
         * For file I/O, String and byte[] are interchangeable, so both input and output stay readable text
         * write a file: fop.write(string.getBytes());
         * read a file:  fip.read(bytes);
         *               new String(bytes)
         */
        fip.read(buf);
        return new String(buf);
    }
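
As a hedged aside (readAllText below is just an alternative sketch of mine, not the method the project uses): java.nio.file.Files can read the whole file in one call and avoids having to size the buffer with available().

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Alternative sketch: read the entire file without managing the stream by hand.
public class ReadAllDemo {
    static String readAllText(File wait) throws IOException {
        byte[] buf = Files.readAllBytes(wait.toPath()); // always returns the full content
        return new String(buf);                         // same default-charset behavior as the original
    }
}
```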
    
     //get the weights and the content, and build the initial nodes
    private String initSet(File wait) throws IOException {
        String str = readWaitCop(wait);
        HashMap<Character,Integer> hashMap = new HashMap<Character,Integer>();
        char[] chars = str.toCharArray();
        for (char cc:chars) {
            if(hashMap.containsKey(cc)){
                hashMap.put(cc, hashMap.get(cc) + 1);
            }else{
                hashMap.put(cc,1);
            }
        }
        Set<Map.Entry<Character, Integer>> entrys = hashMap.entrySet();
        for(Map.Entry<Character, Integer> entry:entrys) {
            Node node = new Node(null,null, entry.getKey(), entry.getValue(), null);
            arrayList.add(node);
        }
        return str;
    }
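
A hedged aside (not what the project uses): on Java 8 and later the whole counting branch can be collapsed into HashMap.merge, which inserts 1 for a new key or adds 1 to the existing count.

```java
// Equivalent frequency counting with merge().
for (char cc : chars) {
    hashMap.merge(cc, 1, Integer::sum);
}
```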

2. Build the Huffman tree:

First we need to understand how the nodes are set up; have a look at my inner Node class:
the left and right references make it easy to build the tree, the weight (num) is used for comparison, road records the code that was produced, and content records the character itself.

//inner node class
    public class Node{
        private Node left;
        private Node right;
        //occurrence frequency (the weight)
        private int num;
        //records the path, i.e. the code
        private String road;
        //records the character content
        private char content;
        public Node(Node left, Node right,char content, int num,String road) {
            this.left = left;
            this.right = right;
            this.content = content;
            this.num = num;
            this.road = null;   //note: the road parameter is ignored here; road is filled in later by setRoad()
        }
    }

The idea is the same as the Huffman tree described in the principle section above; you can see below how I implemented it.
It is worth noting that every time the minimum-weight node is found it has to be removed from the list, so it cannot be compared again.
Likewise, once only one node is left the tree is finished, so record it as the root.
Since the same ArrayList of nodes is reused later, remember to remove that last node from it.

//build the tree
    private void setTree(){
        while(arrayList.size() != 1){
            //the two nodes with the smallest weights
            Node bro1 = Min();
            Node bro2 = Min();
            Node father = new Node(bro1,bro2,'0',bro1.num + bro2.num,null);
            arrayList.add(father);
        }
        root = arrayList.get(0);
        arrayList.remove(0);
    }
    //find (and remove) the node with the minimum weight
    private Node Min(){
        int minIndex = 0;
        for (int i = 0; i < arrayList.size(); i++) {
            if(arrayList.get(minIndex).num > arrayList.get(i).num){
                minIndex = i;
            }
        }
        Node node = arrayList.get(minIndex);
        arrayList.remove(minIndex);
        return node;
    }

3. Derive each character's code and store it in hashMap<code, character>:

The character code mentioned here is the node's road. How do we obtain it?
We simply traverse the Huffman tree: going left appends a '0' and going right appends a '1', which yields the codes. Thanks to the weights we used earlier, the most common characters sit on shallower levels of the tree, so their codes are shorter, and that is exactly the compression effect we want.

The code below produces the codes:
the nodes that hold actual data are exactly the leaf nodes, i.e. the nodes without children (you can think about why that must be the case).
A tree can be traversed in several ways, such as pre-order and post-order, but what I wrote uses level-order traversal, which avoids recursion and keeps the reasoning simple.
Note, however, that the Queue class here is one I wrote myself, so it may differ from the standard one.
To explain the level-order traversal: a Huffman tree is a full binary tree, meaning each node has either zero or two children, so we can use a Queue: pop one node, push its two children, and thanks to the FIFO order the whole tree gets visited level by level.
As for why the result is stored in a hashMap, that is important foreshadowing, explained further below.

//assign the paths (the codes)
    private void setRoad() throws IOException {
        Queue<Node> queue = new Queue<>();
        queue.enqueue(root);
        root.road = "";
        while (!queue.isEmpty()){
            if(queue.peak().left != null) {
                queue.enqueue(queue.peak().left);
                queue.peak().left.road = queue.peak().road + '0';
            }
            if(queue.peak().right != null){
                queue.enqueue(queue.peak().right);
                queue.peak().right.road = queue.peak().road + '1';
            }else{
                //no children: this is a leaf, i.e. a real character
                arrayList.add(queue.peak());
            }
            queue.dequeue();
        }
        //build readMap (code -> character)
        for (int i = 0; i < arrayList.size(); i++) {
            readMap.put(arrayList.get(i).road,arrayList.get(i).content);
        }
    }
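
For comparison, here is a hedged recursive sketch that produces the same codes with a depth-first walk, reusing the existing Node fields, readMap and arrayList (setRoadRecursive is an alternative I am adding for illustration; the project itself uses the level-order version above). It would be called as setRoadRecursive(root, "").

```java
// Recursive alternative: assign each node's code depth-first instead of level by level.
private void setRoadRecursive(Node node, String road) {
    if (node == null) {
        return;
    }
    node.road = road;
    if (node.left == null && node.right == null) {
        readMap.put(node.road, node.content);   // leaf node: record code -> character
        arrayList.add(node);                    // collect the leaf, as the level-order version does
        return;
    }
    setRoadRecursive(node.left, road + '0');
    setRoadRecursive(node.right, road + '1');
}
```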

4. Encode the text with hashMap<Character, String> to get a 0/1 string:

This part is relatively simple: traverse the text we read, replace each character with its corresponding 0/1 code from the hashMap, and concatenate everything into a new string. Keep in mind that 1 byte = 8 bits, so we have to pad the string to a multiple of 8, otherwise converting it to bytes and writing it to the file will go wrong. After padding, the number of padding bits also has to be prepended to the string.

 		//store the content
        char[] chars = comContent.toCharArray();
        //clear comContent first; it will hold the padding count plus the binary string
        /**
         * This is the most time-consuming part of compression: computing the char's hashCode
         * and fetching the code from the map is repeated a huge number of times
         */
        //remember to clear it!
        comContent = "";
        StringBuilder builder = new StringBuilder();
        for (char cc:chars) {
                //see the source of hashMap.get: its return value is V
                builder.append(hashMap.get(cc));
        }

        //append the padding bits
        int num = 8 - builder.length() % 8;
        for (int i = 0; i < num; i++) {
            builder.append('0');
        }
        //prepend how many bits were padded, stored directly as a digit
        byte b = (byte) num;
        //note: even with StringBuilder a heap overflow showed up on very large files (see the capacity section at the end)
        builder.insert(0,String.valueOf(b));
        comContent = builder.toString();

It is worth noting: comContent is the string obtained from initSet(wait) (the method that builds the initial nodes). Remember to convert it into a char[] array first, and remember to clear it before it is reused to hold the encoded string.
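
A quick hedged illustration of the padding rule, with toy values that are not from the project: if the encoded bits are "10100" (5 bits), three '0's are appended to reach 8 bits and the digit 3 is prepended, giving "310100000".

```java
// Toy illustration of the padding step (illustrative input, same logic as above).
StringBuilder builder = new StringBuilder("10100");   // 5 encoded bits
int num = 8 - builder.length() % 8;                   // 3 padding bits needed
for (int i = 0; i < num; i++) {
    builder.append('0');
}
builder.insert(0, String.valueOf((byte) num));        // result: "310100000"
System.out.println(builder);                          // prints 310100000
```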

5. Convert the 0/1 string into a byte[] and write it to the file (note that the hashMap<code, character> is written too):

The main point here is how the 0/1 string is turned into bytes: when a '0' is read, the byte is shifted one bit to the left; when a '1' is read, it is shifted left and 1 is added. After reading eight characters we have one byte value, which goes into the byte[].
The first element of our byte[] stores how many bits of the last byte are valid, which the decompression step needs.
Once the text has been converted into a byte[] array, it meets the requirements for writing to a file.

	//the method that actually writes to the file
    private void outputStream(String comContent,File after) throws IOException {
        //get the characters of the string that will be written
        char[] chars = comContent.toCharArray();
        //work out how many bytes are needed; index 0 stores num
        byte[] bytes = new byte[(chars.length - 1) / 8 + 1];
        //how many bits of the last byte are valid
        int num = 8 - ((int) chars[0] - 48);
        //the (byte) cast keeps only the last byte of the int num
        bytes[0] = (byte) (num & 255);
        //index 0 is taken by that count, so the real data starts at index 1
        for (int i = 1; i < bytes.length; i++) {
            byte b = 0;
            int bb = i - 1;
            for (int j = 1; j <= 8; j++) {
                if(chars[bb * 8 + j] == '0'){
                    b = (byte) (b << 1);
                }else{
                    //the parentheses matter: '+' binds tighter than '<<'
                    b = (byte) ((b << 1) + 1);
                }
                bytes[i] = b;
            }
        }
        //time to write
        FileOutputStream fop = new FileOutputStream(after);
        /**
         * Worth studying in detail:
         * readMap is written with ObjectOutputStream, which can write out any type,
         * but note that the type has to implement java.io.Serializable so it can be serialized
         */
        ObjectOutputStream foop = new ObjectOutputStream(fop);
        //write the map object (calling foop.flush() here, before the raw bytes, would be the safer order)
        foop.writeObject(readMap);
        fop.write(bytes);
    }

Here we need to emphasize the foreshadowing from step 3: because the code assignment is different for every file, decompression needs a decoding table to consult, and that is exactly the hashMap we saved. Without that table, picture the face of someone who has just intercepted a telegram in World War II: completely bewildered, with no idea what it says. The decoding table is written with ObjectOutputStream (both object streams are built on top of the file I/O streams); it is powerful and can write any type that implements the Serializable interface.
If you are curious about it, go and look it up.

Decompression method:

1. Convert the byte[] back into a 0/1 string:

Note that each byte is shifted from bit 7 down to bit 0, so the 0/1 characters come out in the right order; work through it yourself to see why.

for (int i = 1; i < bytes.length; i++) {
            //convert each byte into eight '0'/'1' characters, highest bit first
            for (int j = 7; j >= 0 ; j--) {
                int num = bytes[i];
                if(((num >> j) & 1) == 0){
                    codeBulider.append('0');
                }
                else{
                    codeBulider.append('1');
                }
            }
        }

2. Using the hashMap<code, character> that was read back, turn the 0/1 string into the real content String:

First read back our hashMap<code, character> decoding table:

        FileInputStream fip = new FileInputStream(wait);
        // the Object input/output streams are both built on top of the file streams
        ObjectInputStream fiop = new ObjectInputStream(fip);

Then start decoding: use sinBuilder to splice the 0/1 characters one at a time and check whether that key (code) exists in the map; as soon as it does, look up the value (the character) through the key, append it to contentBuilder, and clear sinBuilder. When the whole string has been scanned, the decoding is complete.

for (char cc:chars) {
            sinBuilder.append(cc);
            if(readMap.containsKey(sinBuilder.toString())){
                contentBuilder.append(readMap.get(sinBuilder.toString()));
                sinBuilder.delete(0, sinBuilder.length());
            }
        }
        content = contentBuilder.toString();

3. Write to the decompressed file:

Remember to use the String.getBytes() method when writing a String to the output stream.

		FileOutputStream fop = new FileOutputStream(after);
        fop.write(content.getBytes());
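
A hedged aside (a suggestion of mine, not something the project does): getBytes() and new String(byte[]) both use the platform default charset, so pinning an explicit charset on both the compression and decompression side would make the tool behave the same on every machine.

```java
// Sketch: pin the charset so writing and reading always agree.
fop.write(content.getBytes(java.nio.charset.StandardCharsets.UTF_8));
// ...and on the reading side: new String(buf, java.nio.charset.StandardCharsets.UTF_8)
```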

Interface design:

Appearance:

It is nothing more than creating a frame and a few components; the code is in the source listing and is very simple.

Creating the ActionListener:

We can use an anonymous inner listener class, which lets us use the surrounding data directly instead of passing references around.
For example, here we can read the text of the text fields directly, without first obtaining a reference to them elsewhere.
Note: there is a semicolon after the inner listener class, because the whole thing is really one long assignment statement.

ActionListener ac = new ActionListener() {
            @Override
            public void actionPerformed(ActionEvent e) {
                Color tuWhite = new Color(238,238,238);
                g.setColor(tuWhite);
                g.fillRect(300,550,400,400);
                g.drawImage(starBuf,300,550,400,400,null);
                long start = System.currentTimeMillis();
                String btname = e.getActionCommand();
                File wait = new File(jt1.getText());
                File after = new File(jt2.getText());
                if(btname.equals("压缩")){
                    try {
                        hf.copCode(wait,after);
                    } catch (IOException ex) {
                        ex.printStackTrace();
                    }
                }else{
                    try {
                        hf.unPack(wait,after);
                    } catch (IOException ex) {
                        ex.printStackTrace();
                    } catch (ClassNotFoundException ex) {
                        ex.printStackTrace();
                    }
                }
                long end = System.currentTimeMillis();
                //divide by 1000.0 so the elapsed time keeps its fractional part
                double time = (end - start) / 1000.0;
                System.out.println(btname + ":  "+ time + "  S");
                g.setColor(tuWhite);
                g.fillRect(300,550,400,400);
                g.drawImage(finBuf,300,550,400,400,null);
            }
        };

Adding a running-status picture:

When a button is clicked we draw the loading picture, and once the work is finished we draw the success picture; all of that code is shown above.

Optimization

Boosting speed

If you use naive string splicing you will find the process unbearably slow, to the point of being unusable. Here are the solutions I came up with.

1. Finding the minimum weight:

Solution: a low-level sort such as bubble sort is inefficient and clearly unsuitable when there are many characters. You can use quick sort, or, as I did, simply scan for the index of the minimum value, which already improves the speed a great deal (a min-heap would work as well; see the sketch below).
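
As a hedged alternative of my own (the project itself keeps the linear scan), java.util.PriorityQueue keeps the smallest-weight node at its head, so each extraction costs O(log n) instead of a full pass over the list.

```java
// Sketch, meant to live inside Huffman_Compress and reuse its Node class, arrayList and root fields:
// build the Huffman tree with a min-heap ordered by the node weight (num).
private void setTreeWithHeap() {
    java.util.PriorityQueue<Node> heap =
            new java.util.PriorityQueue<>(java.util.Comparator.comparingInt((Node n) -> n.num));
    heap.addAll(arrayList);
    while (heap.size() > 1) {
        Node bro1 = heap.poll();                              // smallest weight
        Node bro2 = heap.poll();                              // second smallest
        heap.add(new Node(bro1, bro2, '0', bro1.num + bro2.num, null));
    }
    root = heap.poll();                                       // the finished tree
    arrayList.clear();                                        // leave the list empty, like setTree()
}
```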

2. Time-consuming string splicing:

Solution: if you splice with the string "+" operator, think about how much time its underlying mechanism wastes: every concatenation traverses and copies the existing content before appending, so building the result piece by piece costs O(n^2) overall. Instead, create a single StringBuilder, append all the content into it, and convert it to a String once at the end.
The effect is dramatic: it used to take more than 200 seconds for a file of less than 1 MB, and now a 100 MB file takes only about 6 seconds.
If you are interested, take a look at the append method of StringBuilder.

Boosting capacity

StringBuilder has a limit

When compressing a 300 MB text, a heap-overflow exception is thrown. Looking into the source of StringBuilder, its backing store is a char[] array whose length is bounded by the int range, so with too much text it runs out of room and the heap overflows.
Solution: we can create an array of StringBuilders, decide how many characters each one holds before moving on to the next, and splice them together at the end, which expands the capacity; see the sketch below.
This is just a personal idea and I have not finished it yet; I hope someone ambitious will take it on.
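
Here is a minimal, untested sketch of that idea (my own reading of the suggestion, not the project's code): a small wrapper that keeps a list of StringBuilders and starts a new one whenever the current chunk is full, then writes the chunks out one by one instead of building a single giant String.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

// Sketch: a chunked builder that rolls over to a fresh StringBuilder when a chunk fills up.
public class ChunkedBuilder {
    private static final int CHUNK_LIMIT = 64 * 1024 * 1024;   // per-chunk size, chosen arbitrarily
    private final List<StringBuilder> chunks = new ArrayList<>();
    private StringBuilder current = new StringBuilder();

    public ChunkedBuilder() {
        chunks.add(current);
    }

    public void append(String s) {
        if (current.length() + s.length() > CHUNK_LIMIT) {      // current chunk would overflow
            current = new StringBuilder();
            chunks.add(current);
        }
        current.append(s);
    }

    // Writing chunk by chunk avoids ever materializing one huge String.
    public void writeTo(OutputStream out) throws IOException {
        for (StringBuilder chunk : chunks) {
            out.write(chunk.toString().getBytes());
        }
    }
}
```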

Packaged into a jar file: as shown in the figure

jar packaging process

Source code (with all code):

1. Interface

package fcj1028;

import javax.imageio.ImageIO;
import javax.swing.*;
import java.awt.*;
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;

public class Ui extends JFrame {
    public void ui() throws IOException {
        Huffman_Compress hf = new Huffman_Compress();
        this.setTitle("压缩软件");
        this.setSize(1000,1000);
        this.setDefaultCloseOperation(WindowConstants.EXIT_ON_CLOSE);
        this.setLocationRelativeTo(null);
        JLabel jl1 = new JLabel("待处理文件:");
        jl1.setFont(new java.awt.Font("Dialog", 1, 30));
        JLabel jl2 = new JLabel("处理后文件:");
        jl2.setFont(new java.awt.Font("Dialog", 1, 30));
        JTextField jt1 = new JTextField();
        Dimension dim1 = new Dimension(800,100);
        Dimension dim2 = new Dimension(400,150);
        jt1.setFont(new java.awt.Font("Dialog", 1, 30));
        JTextField jt2 = new JTextField();
        jt2.setFont(new java.awt.Font("Dialog", 1, 30));
        JButton jb1 = new JButton("压缩");
        jb1.setFont(new java.awt.Font("Dialog", 1, 30));
        JButton jb2 = new JButton("解压");
        jb2.setFont(new java.awt.Font("Dialog", 1, 30));
        jt1.setPreferredSize(dim1);
        jt2.setPreferredSize(dim1);
        jb1.setPreferredSize(dim2);
        jb2.setPreferredSize(dim2);
        this.add(jl1);
        this.add(jt1);
        this.add(jl2);
        this.add(jt2);
        this.add(jb1);
        this.add(jb2);
        this.setLayout(new FlowLayout());
        this.setVisible(true);
        //先可视化,再得到画笔
        Graphics g = this.getGraphics();
        File starPic = new File("C:\\Users\\27259\\Pictures\\java_pic\\9df86687_E848203_9ea3a43f-removebg-preview.png");
        BufferedImage starBuf = ImageIO.read(starPic);
        File finPic = new File("C:\\Users\\27259\\Pictures\\java_pic\\R-C-removebg-preview(1).png");
        BufferedImage finBuf = ImageIO.read(finPic);
        ActionListener ac = new ActionListener() {
            @Override
            public void actionPerformed(ActionEvent e) {
                Color tuWhite = new Color(238,238,238);
                g.setColor(tuWhite);
                g.fillRect(300,550,400,400);
                g.drawImage(starBuf,300,550,400,400,null);
                long start = System.currentTimeMillis();
                String btname = e.getActionCommand();
                File wait = new File(jt1.getText());
                File after = new File(jt2.getText());
                if(btname.equals("压缩")){
                    try {
                        hf.copCode(wait,after);
                    } catch (IOException ex) {
                        ex.printStackTrace();
                    }
                }else{
                    try {
                        hf.unPack(wait,after);
                    } catch (IOException ex) {
                        ex.printStackTrace();
                    } catch (ClassNotFoundException ex) {
                        ex.printStackTrace();
                    }
                }
                long end = System.currentTimeMillis();
                double time = (end - start) / 1000.0;   //1000.0 keeps the fractional seconds
                System.out.println(btname + ":  "+ time + "  S");
                g.setColor(tuWhite);
                g.fillRect(300,550,400,400);
                g.drawImage(finBuf,300,550,400,400,null);
            }
        };
        jb1.addActionListener(ac);
        jb2.addActionListener(ac);
    }

    public static void main(String[] args) throws IOException {
        new Ui().ui();
    }
}


2. Compression software

package fcj1028;

import Algorithm.Linear.Queue;

import java.io.*;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class Huffman_Compress {
    //写到压缩文件里,方便之后,文件读出内容(避免程序结束,所存hashMap消失)
    private HashMap<String,Character> readMap = new HashMap<>();
    //根节点,便于寻找
    private Node root;
    /**哈夫曼树:a.先对节点排序保存到List中
    b.取出List中最小的两个节点,让它们的权值累加,保存新的父节点中
    c.让最小两个节点作为左右子树,把父节点添加到List中重新排序
    d.直到List只有一个节点
   3.设置叶子节点的编码,往左编码为1 往右编码为0
    统计每个叶子节点对应的编码
     */
    //用途:节省空间存储内容
    //原理:使用路径编码即可代表字符,而且越常用字符编码越少
    //用来存节点
    private ArrayList<Node> arrayList = new ArrayList<Node>();

    //内部节点类
    public class Node{
        private Node left;
        private Node right;
        //出现频率(权值)
        private int num;
        //记载路径
        private String road;
        //记载内容
        private char content;
        public Node(Node left, Node right,char content, int num,String road) {
            this.left = left;
            this.right = right;
            this.content = content;
            this.num = num;
            this.road = null;
        }
    }
    //获取权值和内容,构建初始节点
    private String initSet(File wait) throws IOException {
        String str = readWaitCop(wait);
        HashMap<Character,Integer> hashMap = new HashMap<Character,Integer>();
        char[] chars = str.toCharArray();
        for (char cc:chars) {
            String sinStr = String.valueOf(cc);
            if(hashMap.containsKey(cc)){
                hashMap.put(cc, hashMap.get(cc) + 1);
            }else{
                hashMap.put(cc,1);
            }
        }
        Set<Map.Entry<Character, Integer>> entrys = hashMap.entrySet();
        for(Map.Entry<Character, Integer> entry:entrys) {
            Node node = new Node(null,null, entry.getKey(), entry.getValue(), null);
            arrayList.add(node);
        }
        return str;
    }
    //读取待压缩文件
    private String readWaitCop(File wait) throws IOException {
        //读取文件内容
        FileInputStream fip = new FileInputStream(wait);
        /**
         * V1.0:一开始的想法是,得到的是byte【】数组,转换为字符串
         * 其中借用了fip.available(),可以返回文件的字节数
         * 但是问题在于,这样的话byte[]数组并不能转换成字符串内容,反而得到的是其长度
         * 原因:当时也忘记了用fip.read(bytes)存放内容,所以读出长度十分正常
         * 故,我们的想法可以是用将每个字节转换为char。
         * V2.0:由于数字和中文编码并不一样,所以这个方法也要淘汰
         * V3.0:在得到bytes[]后,可以使用String()方法,规定好长度自动会帮我们转型
         * 值得注意的是:String.valueOf()会得到乱码
         */
        byte[] buf = new byte[fip.available()];
        //**重中之重,一定要先读文本内容,填充这个byte数组,不然出大乱子——————大冤种的劝告 ~_~。
        /**
         * 在文件读写里:string 和 byte 可以互换,达到input/output都成为可阅读文本
         * 写文件:fop.write(String.getBytes());
         * 读文件: fip.read(bytes[]);
         *         new String(bytes[])
         */
        fip.read(buf);
        return new String(buf);
    }
    //搭建树
    private void setTree(){
        while(arrayList.size() != 1){
            //最小权值结点
            Node bro1 = Min();
            Node bro2 = Min();
            Node father = new Node(bro1,bro2,'0',bro1.num + bro2.num,null);
            arrayList.add(father);
        }
        root = arrayList.get(0);
        arrayList.remove(0);
    }
    //找到最小权值
    private Node Min(){
        int minIndex = 0;
        for (int i = 0; i < arrayList.size(); i++) {
            if(arrayList.get(minIndex).num > arrayList.get(i).num){
                minIndex = i;
            }
        }
        Node node = arrayList.get(minIndex);
        arrayList.remove(minIndex);
        return node;
    }
    //得到路径
    /**
     * 首先表扬一下自己,通过不断输出语句还是找到了bug
     * 这里的问题并非是树连接失败,而是我(nt)的忘了把队列里的节点叉出去,导致死循环
     */
    private void setRoad() throws IOException {
        Queue<Node> queue = new Queue<>();
        queue.enqueue(root);
        root.road = "";
        while (!queue.isEmpty()){
            if(queue.peak().left != null) {
                queue.enqueue(queue.peak().left);
                queue.peak().left.road = queue.peak().road + '0';
            }
            if(queue.peak().right != null){
                queue.enqueue(queue.peak().right);
                queue.peak().right.road = queue.peak().road + '1';
            }else{
                arrayList.add(queue.peak());
            }
            queue.dequeue();
        }
        //得到readMap
        for (int i = 0; i < arrayList.size(); i++) {
            readMap.put(arrayList.get(i).road,arrayList.get(i).content);
        }
    }
    //写入到文件的具体方法
    private void outputStream(String comContent,File after) throws IOException {
        //得到要写入字符串的字符
        char[] chars = comContent.toCharArray();
        //得到需要写多少byte,初始位置存num
        byte[] bytes = new byte[(chars.length - 1) / 8 + 1];
        //得到最后一个字节中的几位有效
        int num = 8 - ((int) chars[0] - 48);
        //(byte)转型,可以得到int num的最后一个字节
        bytes[0] = (byte) (num & 255);
        //第一位是存储需要丢弃几个数字,所以可以从下标1开始
        for (int i = 1; i < bytes.length; i++) {
            byte b = 0;
            /**
             * V1.0:使用byte存储,思路是想用位运算使得效率更高,但是发现不是想要的效果,存入的都是0
             * 原因:暂时还不清楚,可去看这篇文章
             * https://blog.csdn.net/weixin_39775428/article/details/114176698
             * 所以,现在还是用 * 2.
             */
            int bb = i - 1;
            for (int j = 1; j <= 8; j++) {
                if(chars[bb * 8 + j] == '0'){
                    b = (byte) (b * 2);
                }else{
                    b = (byte) (b * 2 + 1);
                }
                /**
                 * 值得注意:byte表示范围为 -128~127,一旦超出127,会自动转换
                 * 但是读取时就需要主要判断正负,得到第一位了,问题在于补码中的0,存在 10000000,00000000两种情况,需要特殊判断
                 */
                bytes[i] = b;
            }
        }
        //开写开写
        FileOutputStream fop = new FileOutputStream(after);
        /**
         * 可以去仔细研究:::
         * 写入readMap,只能用ObjectInputStream,写入任何类型
         * 但要注意,该类型要实现java.io.Serializable,使其“序列化”
         */
        ObjectOutputStream foop = new ObjectOutputStream(fop);
        //写入任何类型
        foop.writeObject(readMap);
        fop.write(bytes);
    }
    //得到压缩后的编码(除了补足八位,构成byte存入),还需要记录补了几个),并写入到文件
    public void copCode(File wait,File after) throws IOException {
        String comContent = initSet(wait);
        setTree();
        setRoad();
        //利用HashMap存好字符的数据编码,即路径
        HashMap<Character,String> hashMap = new HashMap<>();
        for (Node node:arrayList) {
            hashMap.put(node.content, node.road);
        }
        //存入内容
        char[] chars = comContent.toCharArray();
        //先将comContent清空,得到有效位数+二进制字符串
        /**
         * 这是压缩时最耗时间的部分,可以考虑换成string来算,因为这个比char的hashcode算的快,
         * 因为算hashCode从map取值,重复过多了
         */
        //注意清空啊,嘚吧!
        comContent = "";
        /**
         *     神之一手 ———— 郝为高
         *     由于每次字符串的连接,都会先创建 StringBuilder builder对象,浪费大量时间,不如直接一步到位,先创建出来
         */
        StringBuilder builder = new StringBuilder();
        for (char cc:chars) {
                //hashMap.get可查看源码,返回值为V;
                builder.append(hashMap.get(cc));
        }

        //存入补足编码
        int num = 8 - builder.length() % 8;
        for (int i = 0; i < num; i++) {
            builder.append('0');
        }
        //存入补的位数,直接存数字,eg:……+"num"。
        byte b = (byte) num;
        //运用 StringBuilder ,出现了heap溢出
        builder.insert(0,String.valueOf(b));
        comContent = builder.toString();
        builder.delete(0,builder.length());
        outputStream(comContent,after);
    }

    //解压压缩后的文本
    public void unPack(File wait,File after) throws IOException, ClassNotFoundException {
        //开始读出readMap(sb,你自己要注意你是用的那个文件啊啊啊啊啊!!!!!)
        FileInputStream fip = new FileInputStream(wait);
        // Object的输入输出流都是基于file输入输出流
        ObjectInputStream fiop = new ObjectInputStream(fip);
        HashMap<String,Character> readMap = (HashMap<String, Character>) fiop.readObject();
        //将内容都写到byte[]数组里
        byte[] bytes = new byte[fip.available()];
        fip.read(bytes);
        /**
         *     神之一手 ———— 郝为高
         *     由于每次字符串的连接,都会先创建 StringBuilder builder对象,浪费大量时间,不如直接一步到位,先创建出来
         */
        //存储编码后的0/1串
        String copCode = "";
        StringBuilder codeBulider = new StringBuilder();
        for (int i = 1; i < bytes.length; i++) {
            //将byte转换为字符串
            for (int j = 7; j >= 0 ; j--) {
                int num = bytes[i];
                if(((num >> j) & 1) == 0){
                    codeBulider.append('0');
                }
                else{
                    codeBulider.append('1');
                }
            }
        }
        copCode = codeBulider.toString();
        //删除补足‘0’字符串
        //the last byte only holds bytes[0] valid bits, so drop the trailing (8 - bytes[0]) padding characters
        copCode = copCode.substring(0,copCode.length() - 8 + bytes[0]);
        //为了提高效率的到str的char[]数组
        char[] chars = copCode.toCharArray();
        //得到全体字符
        String content = "";
        /**
         *     神之一手 ———— 郝为高
         *     由于每次字符串的连接,都会先创建 StringBuilder builder对象,浪费大量时间,不如直接一步到位,先创建出来
         */
        StringBuilder sinBuilder = new StringBuilder();
        StringBuilder contentBuilder = new StringBuilder();
        for (char cc:chars) {
            sinBuilder.append(cc);
            if(readMap.containsKey(sinBuilder.toString())){
                contentBuilder.append(readMap.get(sinBuilder.toString()));
                sinBuilder.delete(0, sinBuilder.length());
            }
        }
        content = contentBuilder.toString();
        FileOutputStream fop = new FileOutputStream(after);
        fop.write(content.getBytes());
    }
}


Origin blog.csdn.net/AkinanCZ/article/details/127873236