Paddle 2.0 + CPM-LM: Let AI Help You Write Articles

Introduction

Quick experience

  • You can quickly try text-continuation generation on the Baidu AI Studio platform: project link

Generation results

  • As usual, let's first look at how the model continues general text

Input: "Homeland China" is Mr. Fei Xiaotong's work discussing Chinese social structure from a macro perspective on the basis of community research,
Output: "Homeland China" is Mr. Fei Xiaotong's study of Chinese society from a macro perspective on the basis of community research. His works are part of the entire social theory system. From the perspective of macroscopic anatomy, he selected ten typical examples of Chinese society, taking rural China as the main body, and conveying a broad concept and idea. This is its characteristic. It can resonate with many people.
Output: "Homeland China" is Mr. Fei Xiaotong's work on the study of Chinese social structure from a macro perspective on the basis of community research. The research results of the three topics of "Traditional Chinese Society". "Chinese Peasant Survey" (formerly known as "Chinese Rural Social Survey") is an important work by Mr. Fei Xiaotong on the basis of "rural research" and discusses the problems of Chinese peasants from a sociological perspective. gaps in sociology research.
Output: "Homeland China" is a book published in 1984 by Mr. Fei Xiaotong who explored the social structure of China from a macro perspective on the basis of community research. The "Chinese social structure" discussed in the book has two meanings, one is the economic structure, and the other is the political structure. Mr. Fei Xiaotong believes that "the change of Chinese social structure is the result of the change of political structure." Specifically, Mr. Fei Xiaotong divides the various structures of Chinese society into three types: "division", "layer" and "level". class.

Input: enter the beginning of the text here,
Output: Enter the beginning of the text here, and then press the space bar to start reading.
Output: Enter the beginning of the text here, then enter the prefix of the replacement text in the "Find and Replace" dialog box, and then use the IF command in the "Replace" dialog box to select the characters to be replaced.
Output: Enter the beginning of the text here, and then enter the next character, you can get a "Hello World!" text.

Input: Fang Ping led the crowd out of the city on horseback, and the remaining snow dotted the originally yellowed ground. Dressed in black, he stood out especially among a group of armored soldiers.
Output: Fang Ping led the crowd out of the city on horseback, and the remaining snow dotted the originally yellowed ground. Dressed in black, he stood out especially among a group of armored soldiers. He hesitated with his hands behind his back, the military advisers who once led the army in the Copper Palace and swept the world are now just a few arrogant pawns, and now he, even one of the entourages around him is a high position even if he doesn't know martial arts. Hengdao immediately, what did he get in exchange? He didn't know, he was a little shaky when he walked today.
Output: Fang Ping led the crowd out of the city on horseback, and the remaining snow dotted the originally yellowed ground. Dressed in black, he stood out especially among a group of armored soldiers. "Who is here?" "I am Yang Ping, the commander of the Black Mountain Army, and I have come to investigate the enemy's situation by holy order.

Input: After Daiyu was rejected from Yihongyuan on a stormy night,
Output: After Daiyu was rejected from Yihongyuan on a stormy night, she went to join Liu Xianglian in Jia's mansion. Although they had never met, because they were old friends, Daiyu had an indescribable affection for Liu Xianglian. Once, Daiyu passed by the window and suddenly saw a small red bean growing on Liu Xianglian's willow tree. She picked it off and wanted to give it to Liu Xianglian, but Liu Xianglian refused.
Output: Daiyu went to Rong Mansion after being rejected by Yihong Courtyard on a stormy night, and she was dismissed by Mrs. Xing after gossiping with Jia Mu. Daiyu returned to her room, thinking of her mother's "it's just my stuff, because it's not in my hands, who can I blame", so she took a broom to clean the room, trying to clean it up, but she remembered Yi Regarding the matter in the Red Courtyard, he made up his mind, "Let's make do with it first."

  • The samples above look decent, but of course I picked the good ones to display.
  • Sampling-based decoding does make the generated text more diverse and vivid, closer to human writing.
  • But because sampling is not very controllable, it often produces strange text, like the following:

Input: After Daiyu was rejected from Yihong Courtyard on a stormy night,
Output: Daiyu held a grudge against Baoyu after being rejected from Yihong Courtyard on a stormy night. What ingredients in venison balls can be inferred to be "toxic"? The muscle attack/hemolysis/hormone used...Can these ingredients be inferred to be toxic? How can liberal arts students understand these problems... These ingredients seem to need a small amount, but It's really poisonous, how did it kill you? I just read it on Weibo. There are no typical meatballs (whether there are any balls, I doubt...ask Kusaka, right), most of them are sugar

Input: Fang Ping led the crowd out of the city on horseback, and the remaining snow dotted the originally yellowed ground. Dressed in black, he stood out especially among a group of armored soldiers.
Output: Fang Ping led the crowd out of the city on horseback, and the remaining snow dotted the originally yellowed ground. Dressed in black, he stood out especially among a group of armored soldiers. "Who else?" Liu Guanzhang and the others stood around Can Xue's direction, "I have another question.
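
These derailments are a direct consequence of sampling: instead of always taking the most likely next token (greedy decoding), each next-token ID is drawn at random from the softmax distribution. A toy NumPy sketch of the difference (the 5-token vocabulary and logits are made up for illustration):

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array of logits.
    e = np.exp(x - np.max(x))
    return e / e.sum()

# Toy next-token distribution over a 5-token vocabulary.
logits = np.array([2.0, 1.5, 1.0, 0.5, 0.1])
probs = softmax(logits)

rng = np.random.default_rng(0)

# Greedy decoding always picks the same (most likely) token...
greedy = [int(np.argmax(probs)) for _ in range(5)]   # always [0, 0, 0, 0, 0]

# ...while sampling picks different tokens on different draws,
# which is where both the diversity and the derailments come from.
sampled = [int(rng.choice(len(probs), p=probs)) for _ in range(5)]
```

Top-k and top-p filtering, shown in the next section, are a compromise: they keep the randomness but restrict it to the most plausible tokens.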

Decoding methods

  • Two decoding strategies are used here: top-k filtering and nucleus (top-p) filtering
  • The reference code is the decoding function from the GPT-2 Chinese project
import torch
import torch.nn.functional as F


def top_k_top_p_filtering(logits, top_k=0, top_p=0.0, filter_value=-float('Inf')):
    """ Filter a distribution of logits using top-k and/or nucleus (top-p) filtering
        Args:
            logits: logits distribution shape (vocabulary size)
            top_k > 0: keep only top k tokens with highest probability (top-k filtering).
            top_p > 0.0: keep the top tokens with cumulative probability >= top_p (nucleus filtering).
                Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751)
        From: https://gist.github.com/thomwolf/1a5a29f6962089e871b94cbd09daf317
    """
    assert logits.dim() == 1  # batch size 1 for now - could be updated for more but the code would be less clear
    top_k = min(top_k, logits.size(-1))  # Safety check
    if top_k > 0:
        # Remove all tokens with a probability less than the last token of the top-k
        indices_to_remove = logits < torch.topk(logits, top_k)[0][..., -1, None]
        logits[indices_to_remove] = filter_value

    if top_p > 0.0:
        sorted_logits, sorted_indices = torch.sort(logits, descending=True)
        cumulative_probs = torch.cumsum(F.softmax(sorted_logits, dim=-1), dim=-1)

        # Remove tokens with cumulative probability above the threshold
        sorted_indices_to_remove = cumulative_probs > top_p
        # Shift the indices to the right to keep also the first token above the threshold
        sorted_indices_to_remove[..., 1:] = sorted_indices_to_remove[..., :-1].clone()
        sorted_indices_to_remove[..., 0] = 0

        indices_to_remove = sorted_indices[sorted_indices_to_remove]
        logits[indices_to_remove] = filter_value
    return logits
  • Some of these operations do not seem to work directly on Paddle tensors, so NumPy is used as an intermediary (or perhaps I am just not skilled enough to find a pure-Paddle way)
  • Below is my own implementation in Paddle 2.0, for reference only
    import numpy as np
    import paddle


    def top_k_top_p_filtering(logits, top_k=0, top_p=1.0, filter_value=-float('Inf')):
        """ Filter a distribution of logits using top-k and/or nucleus (top-p) filtering
            Args:
                logits: logits distribution shape (vocabulary size)
                top_k > 0: keep only top k tokens with highest probability (top-k filtering).
                top_p > 0.0: keep the top tokens with cumulative probability >= top_p (nucleus filtering).
                    Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751)
            From: https://gist.github.com/thomwolf/1a5a29f6962089e871b94cbd09daf317
        """
        top_k = min(top_k, logits.shape[-1])  # Safety check
        logits_np = logits.numpy()
        if top_k > 0:
            # Remove all tokens with a probability less than the last token of the top-k
            indices_to_remove = logits_np < np.sort(logits_np)[-top_k]
            logits_np[indices_to_remove] = filter_value

        if top_p < 1.0:
            sorted_logits = paddle.sort(logits, descending=True)
            sorted_indices = paddle.argsort(logits, descending=True).numpy()
            cumulative_probs = paddle.cumsum(paddle.nn.functional.softmax(sorted_logits, axis=-1), axis=-1).numpy()

            # Remove tokens with cumulative probability above the threshold
            sorted_indices_to_remove = cumulative_probs > top_p
            # Shift the indices to the right to keep also the first token above the threshold
            sorted_indices_to_remove[..., 1:] = sorted_indices_to_remove[..., :-1].copy()
            sorted_indices_to_remove[..., 0] = 0

            indices_to_remove = sorted_indices[sorted_indices_to_remove]
            logits_np[indices_to_remove] = filter_value

        return paddle.to_tensor(logits_np)
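
Neither snippet above shows how the function is actually called during decoding. Here is a single decoding step sketched in framework-free NumPy (the function name `filter_logits`, the variable names, and the toy logits are mine, not from the original project): filter the logits, renormalize with softmax, then sample the next token ID.

```python
import numpy as np

def filter_logits(logits, top_k=0, top_p=1.0, filter_value=-np.inf):
    """NumPy sketch of combined top-k / top-p (nucleus) filtering."""
    logits = logits.copy()
    if top_k > 0:
        kth = np.sort(logits)[-top_k]            # k-th largest logit
        logits[logits < kth] = filter_value
    if top_p < 1.0:
        order = np.argsort(logits)[::-1]         # indices, descending by logit
        probs = np.exp(logits[order] - logits[order].max())
        probs /= probs.sum()
        cum = np.cumsum(probs)
        remove = cum > top_p
        remove[1:] = remove[:-1].copy()          # keep the first token above the threshold
        remove[0] = False
        logits[order[remove]] = filter_value
    return logits

rng = np.random.default_rng(42)
logits = np.array([4.0, 3.0, 2.0, 1.0, 0.0])     # toy next-token logits

# One decoding step: filter, renormalize, sample.
filtered = filter_logits(logits, top_k=3, top_p=0.9)
probs = np.exp(filtered - filtered.max())
probs /= probs.sum()
next_token = int(rng.choice(len(probs), p=probs))
```

In a real generation loop this step repeats: the sampled token is appended to the input and the model is run again to produce the next logits.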

Summary

  • CPM-LM can generate text of decent quality without any fine-tuning
  • But it is only decent, far from excellent, and it derails from time to time
  • Moreover, the model is so large that long text cannot be generated directly; even with 32 GB of GPU memory it can barely manage about 200 tokens
  • Looking forward to future models with more natural generation

Origin blog.csdn.net/jm_12138/article/details/111599530