Common Algorithm Implementations (3): Recursive Sentence Generation (Template Matching, func_recursive)

1. Overview (generating synonymous sentences from templates)

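         This post covers recursive sentence generation (template matching, func_recursive). In computer science, a function Func that calls itself, directly or indirectly, is called a recursive function.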
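
         To make the "directly or indirectly" part concrete, here is a minimal sketch. The function names (factorial, is_even, is_odd) are illustrative only and are not part of the original code.

```python
def factorial(n):
    """Direct recursion: factorial calls itself."""
    return 1 if n <= 1 else n * factorial(n - 1)


def is_even(n):
    """Indirect recursion: is_even calls is_odd, which calls is_even again."""
    return True if n == 0 else is_odd(n - 1)


def is_odd(n):
    """Second half of the indirect pair (expects a non-negative integer)."""
    return False if n == 0 else is_even(n - 1)


print(factorial(5))  # 120
print(is_even(10))   # True
```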

         Recursive functions can be a little tricky to implement and to understand, but it is fun once they run successfully.

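         In Python, every recursive call takes a frame on the call stack, so recursion depth is limited: by default, CPython raises RecursionError once the depth exceeds about 1000 (the value returned by sys.getrecursionlimit()).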
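
         The limit can be observed with a small experiment. This is only a sketch: the helper name depth is made up for illustration, and the exact limit depends on the interpreter and its settings.

```python
import sys


def depth(n):
    """Recurse n levels deep and return the depth that was reached."""
    if n == 0:
        return 0
    return 1 + depth(n - 1)


limit = sys.getrecursionlimit()  # 1000 by default in CPython

try:
    depth(limit + 100)  # deliberately deeper than the limit
    hit_limit = False
except RecursionError:
    hit_limit = True

print(limit, hit_limit)
```

         sys.setrecursionlimit() can raise the limit, but for the templates in this post the recursion depth equals the number of slots in a template, so the default is far from being reached.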

         GitHub repository:

              https://github.com/yongzhuo/Tookit-Sihui/blob/master/tookit_sihui/ml_common/func_recursive/func_recursive.py

2. Implementation (generating sentences from common templates)

            What is implemented: generating sentences from a prepared template in which some slots offer several synonyms.
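
            A template can be viewed as a list of slots, each holding interchangeable candidates, and one sentence is produced for every way of picking one candidate per slot. The number of generated sentences is therefore the product of the slot sizes. The sketch below checks this; count_sentences is a hypothetical helper, not part of the original code, and math.prod needs Python 3.8 or later.

```python
import math


def count_sentences(slots):
    """Number of sentences a template yields: the product of its slot sizes."""
    return math.prod(len(candidates) for candidates in slots)


# One fixed word, then two slots with three synonyms each
slots = [["你"], ["喜欢", "喜爱", "爱"], ["虾米", "啥子", "什么"]]
print(count_sentences(slots))  # 1 * 3 * 3 = 9
```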

# -*- coding: UTF-8 -*-
# !/usr/bin/python
# @time     :2019/6/28 15:49
# @author   :Mo
# @function :recursive generation of sentences, template matching with a recursive function


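def FuncRecursive(len_curr=0, sen_odd=None, sen_curr=None):
    """
      Recursive function: turns a list such as [['1'], ['1', '2'], ['1']] into ['111', '121']
    :param len_curr: int, number of slots that still have to be appended (remaining recursion steps)
    :param sen_odd: list, the remaining slots, e.g. [['是', '是不是'], ['喜欢', '喜爱', '爱'], ['米饭']]
    :param sen_curr: list, the partial sentences built so far, e.g. ['你']
    :return: list, the generated synonymous sentences, e.g. ['你是喜欢米饭', '你是不是喜欢米饭', '你是不是爱米饭']
    """
    # Use None instead of mutable default arguments, which would be shared between calls
    sen_odd = [] if sen_odd is None else sen_odd
    sen_curr = [] if sen_curr is None else sen_curr
    syn_sentences = []
    len_curr = len_curr - 1
    if len_curr == -1:  # no slots left: the partial sentences are complete
        return sen_curr
    # Append every candidate of the next slot to every partial sentence
    for sen_odd_one in sen_odd[0]:
        for syn_one in sen_curr:
            syn_sentences.append(syn_one + sen_odd_one)
    # Recurse on the remaining slots with the extended sentences
    syn_sentences = FuncRecursive(len_curr=len_curr,
                                  sen_odd=sen_odd[1:],
                                  sen_curr=syn_sentences)
    return syn_sentences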


def gen_syn_sentences(org_data):
    """
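        Generate synonymous sentences from templates
    :param org_data: list, list of rules (template strings), e.g. ["[你][喜欢|喜爱|爱][虾米|啥子|什么]"]
    :return: list, all generated sentences
    """
    # Parse each template into a list of slots; each slot is a list of candidate words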
    sentences_pre = []
    for org_sen in org_data:
        org_sen_sp = org_sen.split("][")
        sentences_add = []
        for words in org_sen_sp:
            words_sp = words.split("|")
            words_sp = [word.replace("]", "").replace("[", "") for word in words_sp]
            sentences_add.append(words_sp)
        sentences_pre.append(sentences_add)

    # Generate the sentences recursively
    sentences_syn = []
    for sen_rule in sentences_pre:
        len_sen_rule = len(sen_rule)
        if len_sen_rule == 1:  # a single slot needs no recursion
            sentences_syn = sentences_syn + sen_rule[0]
        else:
            sentences_syn = sentences_syn + FuncRecursive(len_curr=len_sen_rule-1,
                                                          sen_odd=sen_rule[1:],
                                                          sen_curr=sen_rule[0])
    return sentences_syn



if __name__ == "__main__":
    org_data = ["[你][喜欢|喜爱|爱][虾米|啥子|什么]", "[1|11][2|22][3|33][44|444]", "大漠帝国"]
    syn_sentences = gen_syn_sentences(org_data)
    # syn_sentences = sorted(syn_sentences)
    print(syn_sentences)
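
For comparison, the same expansion can be sketched without recursion using itertools.product from the standard library. This is an alternative, not the method in the repository: parse_template and expand_template are hypothetical names, and the regular expression assumes well-formed templates of the form [a|b][c] or plain text without brackets.

```python
import itertools
import re


def parse_template(template):
    """Split '[a|b][c]' into [['a', 'b'], ['c']]; plain text becomes one slot."""
    slots = re.findall(r"\[([^\[\]]*)\]", template)
    if not slots:  # no brackets: the whole string is a single fixed sentence
        return [[template]]
    return [slot.split("|") for slot in slots]


def expand_template(template):
    """Return every sentence obtained by picking one candidate per slot."""
    slots = parse_template(template)
    return ["".join(choice) for choice in itertools.product(*slots)]


print(expand_template("[你][喜欢|喜爱][虾米|什么]"))
# ['你喜欢虾米', '你喜欢什么', '你喜爱虾米', '你喜爱什么']
```

Both approaches produce the same set of sentences, but in a different order: itertools.product varies the last slot fastest, while the recursive version above varies the earlier slots fastest.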

Hope this helps!


Reposted from blog.csdn.net/rensihui/article/details/94589739