【CodeForces】700 D. Huffman Coding on Segment

【Title】D. Huffman Coding on Segment

【Question】Given an array a of n numbers and m queries, each query [l,r] asks for the minimum total length of a Huffman encoding of the subarray a[l..r]. 1<=n,m,ai<=10^5.

[Algorithm] Huffman tree + Mo's algorithm + sqrt decomposition (blocking)

[Huffman tree] A Huffman tree is also known as an optimal binary tree. Given n weights, it is the binary tree with n leaves carrying those weights whose weighted path length is minimal.

The weighted path length (WPL) is the sum, over all leaves, of depth * weight.

Construction method: repeatedly take the two roots with the smallest weights and make them the left and right subtrees (smaller on the left, larger on the right) of a new root whose weight is the sum of the two. Repeat until only one tree is left. (This is the same process as the classic "merging fruit piles" problem.)

The WPL of a Huffman tree equals the sum of the weights of all nodes minus the sum of the original values (the root weight). Equivalently, it is the sum of the weights of all non-leaf nodes, or the sum of the weights of all non-root nodes.
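The construction and the identity above can be sketched in C++ as follows. This is a minimal illustration rather than code from the original solution, and the function name huffmanWpl is made up for the example. It never builds the tree explicitly: it only accumulates the weight of every merged (non-leaf) node, which equals the WPL.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Returns the WPL of a Huffman tree over the given leaf weights.
// A single leaf needs no merge, so the result is 0 in that case.
long long huffmanWpl(const std::vector<long long>& weights) {
    assert(!weights.empty());
    // Min-heap holding the weights of the current roots.
    std::priority_queue<long long, std::vector<long long>,
                        std::greater<long long>> roots(weights.begin(), weights.end());
    long long wpl = 0;
    while (roots.size() > 1) {
        long long a = roots.top(); roots.pop();  // smallest root
        long long b = roots.top(); roots.pop();  // second smallest root
        wpl += a + b;       // weight of the new non-leaf node
        roots.push(a + b);  // the merged tree becomes a root again
    }
    return wpl;
}
```

For the weights {1, 2, 3}, the merges create nodes of weight 3 and 6, so the function returns 9, which matches depth * weight summed over the leaves (1*2 + 2*2 + 3*1).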

Huffman coding: starting from the root, label each left edge 0 and each right edge 1. The code of a character is the bit string read along the path from the root to its leaf.

A Huffman code is a binary prefix code that minimizes the total length of the encoded message. It is obtained by building a Huffman tree with the occurrence counts of the n characters as the leaf weights.

The total length of the Huffman-encoded message is therefore the WPL of that tree.

[Solution] For an interval, let f[x] be the number of occurrences of the value x, and let g[c] be the number of distinct values that occur exactly c times. Both arrays can be maintained with Mo's algorithm.
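Maintaining f and g while the window moves could look like the sketch below. It is only an illustration under assumptions: the names Query, FreqTable and runMo are hypothetical, the bounds are 0-based and inclusive (the problem itself uses 1-based indices), and the per-query work is left to a callback.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

struct Query { int l, r, id; };  // inclusive 0-based bounds (assumption of this sketch)

struct FreqTable {
    std::vector<int> f;  // f[x]: occurrences of value x in the current window
    std::vector<int> g;  // g[c]: number of distinct values occurring exactly c times (c >= 1)
    FreqTable(int maxValue, int n) : f(maxValue + 1, 0), g(n + 2, 0) {}
    void add(int x) {
        if (f[x] > 0) --g[f[x]];
        ++f[x];
        ++g[f[x]];
    }
    void remove(int x) {
        assert(f[x] > 0);
        --g[f[x]];
        --f[x];
        if (f[x] > 0) ++g[f[x]];
    }
};

// Visits the queries in Mo order and calls onQuery(id, table) once the
// window is exactly [l, r] of that query.
void runMo(const std::vector<int>& a, std::vector<Query> qs, int maxValue,
           const std::function<void(int, const FreqTable&)>& onQuery) {
    const int n = static_cast<int>(a.size());
    const int block = std::max(1, static_cast<int>(std::sqrt(static_cast<double>(n))));
    // Sort by block of the left end, then by right end.
    std::sort(qs.begin(), qs.end(), [block](const Query& x, const Query& y) {
        const int bx = x.l / block, by = y.l / block;
        if (bx != by) return bx < by;
        return x.r < y.r;
    });
    FreqTable table(maxValue, n);
    int curL = 0, curR = -1;  // empty window
    for (const Query& q : qs) {
        // Grow first, then shrink, so counts never go negative.
        while (curR < q.r) table.add(a[++curR]);
        while (curL > q.l) table.add(a[--curL]);
        while (curR > q.r) table.remove(a[curR--]);
        while (curL < q.l) table.remove(a[curL++]);
        onQuery(q.id, table);
    }
}
```

Each add or remove is O(1), so the window maintenance costs O(n√n) in total when the number of queries is of the same order as n.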

For each query, the naive approach is to build a Huffman tree with the occurrence counts as leaf weights, but this is too slow. Let S=sqrt(n).

Values that occur more than S times: there are at most n/S = S of them, so they can be handled by the naive heap-based construction in O(S log S).

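Values that occur at most S times: the weights are small, so they can be kept in buckets (a copy of g[1..S]) and merged in batches, scanning the weights in increasing order. All nodes with the same weight are paired up at once, and a single leftover node is merged with the next smallest node. This takes O(S). As soon as a merged weight exceeds S, the node is handed over to the heap of the previous case.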
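The batch processing of small weights can be sketched as below. This is one possible way to implement the idea, not necessarily how the original solution does it; the function name huffmanFromBuckets and its parameters are hypothetical. It assumes bucket[c] holds the number of leaves of weight c for 1 <= c <= S (index 0 is unused), and that large lists the leaf weights above S.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// bucket[c], 1 <= c <= S with S = bucket.size() - 1: number of leaves of weight c.
// large: leaf weights greater than S. Returns the WPL of the Huffman tree.
long long huffmanFromBuckets(std::vector<long long> bucket,
                             const std::vector<long long>& large) {
    assert(!bucket.empty());
    const long long S = static_cast<long long>(bucket.size()) - 1;
    std::priority_queue<long long, std::vector<long long>,
                        std::greater<long long>> heap(large.begin(), large.end());
    long long wpl = 0;
    long long carry = 0;  // weight of one unpaired node from a smaller bucket, 0 if none

    // Stores `copies` merged nodes of weight w: in a bucket if still small, else in the heap.
    auto place = [&](long long w, long long copies) {
        if (w <= S) {
            bucket[w] += copies;
        } else {
            for (long long k = 0; k < copies; ++k) heap.push(w);
        }
    };

    for (long long w = 1; w <= S; ++w) {
        if (bucket[w] == 0) continue;
        if (carry != 0) {
            // The leftover node and one node of weight w are the two smallest.
            wpl += carry + w;
            place(carry + w, 1);
            --bucket[w];
            carry = 0;
        }
        const long long pairs = bucket[w] / 2;  // merge equal weights in one batch
        wpl += pairs * 2 * w;
        place(2 * w, pairs);
        if (bucket[w] % 2 == 1) carry = w;
    }
    if (carry != 0) heap.push(carry);

    // Everything that is left has weight > S (plus possibly the leftover): use the heap.
    while (heap.size() > 1) {
        long long a = heap.top(); heap.pop();
        long long b = heap.top(); heap.pop();
        wpl += a + b;
        heap.push(a + b);
    }
    return wpl;
}
```

For example, with S = 2 and occurrence counts 4, 2, 1 (bucket = {0, 1, 1}, large = {4}), the scan merges the weights 1 and 2 into a node of weight 3, the heap then merges 3 and 4, and the result is 3 + 7 = 10.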

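Mo's algorithm costs O(n√n) in total for moving the interval, and each query costs O(S) for the buckets plus O(S log S) for the heap. The total complexity is therefore O(n√n), ignoring the logarithmic factor of the heap.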

 
