3.6 The specific implementation of the Markov chain algorithm (in the C++ environment)

3.5 The specific implementation of the Markov chain algorithm (in the Java environment)

    I am not a Java programmer and know only a little Java, so I will not attempt this version. If an expert is willing to write one, contributions are welcome, and I will repost them.

3.6 The specific implementation of the Markov chain algorithm (in the C++ environment)

    Because C++ is almost a superset of C, C++ can be written in a C style as long as you observe a few conventions and rules. In fact, in the previous article, all the input and output I used were the basic C++ input/output streams. A more idiomatic C++ approach is to define classes and create the objects the program needs, which hides many implementation details. Here we go a step further and use the C++ STL (Standard Template Library). It provides many ready-made mechanisms and, more importantly, ISO has made it part of the C++ language definition, so we can use it with confidence.

    This raises the question of why you should learn C when you have C++, and why even my first implementation was written in C. Personally, I find C more fun, because every operation is visible to you. I like to compare C to a double-edged sword, and I like that kind of sharpness (perhaps I have read too much fantasy). It may also be because C was the language taught in my first programming class at school; I can still remember the Mandelbrot set I wrote with C and OpenCV 1.0...

    The STL provides many container classes, such as vectors, linked lists, and sets, along with many basic algorithms for searching, sorting, insertion, deletion, and so on. Thanks to C++ templates, each STL algorithm can be applied to many different container classes. The elements of a container can be of a user-defined type or a built-in type. The containers are all written as C++ templates that are instantiated for specific types. For example, the STL vector container can be instantiated as vector<int>, vector<string>, and so on. All vector operations, including the standard sorting algorithms, can be applied directly to these types.
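    To make the point about templates concrete, here is a minimal sketch. The helper name sorted_copy is my own invention for illustration and is not part of the Markov program; it only shows that the same std::sort call works for vector<int> and vector<string> alike.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (not from the original program): returns a sorted copy
// of any vector whose element type supports operator<.
template <typename T>
std::vector<T> sorted_copy(std::vector<T> v)
{
	std::sort(v.begin(), v.end());   // the same algorithm serves every element type
	return v;
}
```

    Calling sorted_copy on a vector<int> and on a vector<string> instantiates the template twice; no code has to be written per element type.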

    In the STL, in addition to the vector container (which is similar to Java's Vector class), there is also a deque container class. A deque (pronounced "deck") is a double-ended queue that does exactly what we need for prefix operations: it can hold NPREF elements, drop the first element, and add a new one at the end, each in O(1) time. The STL deque is actually more general than we need, since it allows pushing and popping at both ends, but its performance guarantees are the reason we chose it.
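    The sliding-window behavior described above can be sketched as follows. The function name slide and its limit parameter are assumptions made for this illustration; the real program does the same thing inside its add function with NPREF as the limit.

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical helper: keep at most `limit` words in the window by dropping
// the oldest words from the front before appending the new one at the back.
// Assumes limit >= 1.
void slide(std::deque<std::string> &window, const std::string &word, std::size_t limit)
{
	while (!window.empty() && window.size() >= limit)
		window.pop_front();          // constant-time removal at the front
	window.push_back(word);          // constant-time insertion at the back
}
```

    Feeding the words "a", "b", "c" through slide with a limit of 2 leaves the window holding "b" and "c".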

    The STL also provides a map container, which is implemented internally with a balanced tree. A map stores (key, value) pairs, and its implementation guarantees that retrieving the value for any key takes O(log n) time. Although this is not as efficient as a hash table, the advantage is that we do not have to write much code (though I personally still prefer hash tables).
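    As a small sketch of such a lookup, the code below uses a deque of strings as the map key. This works because deque supplies a lexicographic operator<, which is all that map requires of its keys. The names Key, Table, and suffix_count are hypothetical and chosen only for this example.

```cpp
#include <cstddef>
#include <deque>
#include <map>
#include <string>
#include <vector>

typedef std::deque<std::string> Key;
typedef std::map<Key, std::vector<std::string>> Table;

// Hypothetical helper: number of values stored under a key. It uses find(),
// an O(log n) lookup that does not insert anything when the key is missing.
std::size_t suffix_count(const Table &tab, const Key &k)
{
	Table::const_iterator it = tab.find(k);
	return it == tab.end() ? 0 : it->second.size();
}
```

    A key that was never stored simply yields 0, and the table keeps its size.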

    With these powerful tools in hand, we can write the code. First, the declarations:

typedef deque<string> Prefix;
map<Prefix, vector<string>> statetab;

    Also don't forget:

using namespace std;

    The STL provides a deque template; the notation deque<string> specializes it to a deque whose elements are strings. Because this type appears several times in the program, we give it the name Prefix with a typedef. The map type that stores the prefixes and suffixes occurs only once in the program, so we do not give it a name. The declaration creates a variable statetab of that map type, which maps each prefix to a vector of suffixes.

    In the whole program, the add function is probably the part that is hardest to understand.

void add(Prefix &prefix, const string &s)
{
	/* Once the prefix holds NPREF words, record s as a suffix of it and drop the oldest word */
	if (prefix.size() == NPREF)
	{
		statetab[prefix].push_back(s);
		prefix.pop_front();
	}
	prefix.push_back(s);
}

    These few simple statements do a great deal. The map container overloads the subscript operator ([]) so that it performs a lookup. The expression statetab[prefix] looks up prefix as the key in statetab and returns a reference to the entry found; if no vector exists for that key yet, the operation creates a new, empty one. The push_back member functions of vector and deque append a new string to the end of the vector or deque, and pop_front removes the first element from the deque.
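    To see what add builds, here is a small sketch that feeds a fixed word list through the same logic. Passing the table as a parameter (instead of using the global statetab) and the helper table_for are my own assumptions, made so the example has no global state.

```cpp
#include <cstddef>
#include <deque>
#include <map>
#include <string>
#include <vector>

typedef std::deque<std::string> Prefix;
typedef std::map<Prefix, std::vector<std::string>> StateTab;

const std::size_t NPREF = 2;

// Same logic as add() above, but the table is passed in explicitly
// (an assumption made for this sketch).
void add(StateTab &tab, Prefix &prefix, const std::string &s)
{
	if (prefix.size() == NPREF)
	{
		tab[prefix].push_back(s);   // operator[] creates the vector on first use
		prefix.pop_front();
	}
	prefix.push_back(s);
}

// Hypothetical helper: run a fixed list of words through add().
StateTab table_for(const std::vector<std::string> &words)
{
	StateTab tab;
	Prefix prefix;
	for (std::size_t i = 0; i < words.size(); i++)
		add(tab, prefix, words[i]);
	return tab;
}
```

    For the words a b c a b d, the table ends up with three prefixes: (a, b) maps to the suffixes c and d, (b, c) maps to a, and (c, a) maps to b.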

    This approach is simple to implement (of course I still prefer C), but it runs far slower than the C version, although it is not the slowest.

    I probably like C because I like to build the basic things step by step.

    Coding time

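#include "stdafx.h" /* Visual Studio precompiled header; remove this line with other compilers */
#include <vector>
#include <string>
#include <iostream>
#include <deque>
#include <cstdio>
#include <cstdlib> /* rand */
#include <map>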

using namespace std;

/* A deque holds the NPREF prefix words; the first word can be dropped and a new one added at the end */
typedef deque<string> Prefix;
/* The map stores (prefix, suffix list) pairs */
map<Prefix, vector<string>> statetab;

enum
{
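	NPREF = 2, /* Number of words in a prefix */
	NHASH = 99999, /* Hash table size (left over from the C version, unused here) */
	MAXGEN = 10000, /* Maximum number of words generated */
	MULTIPLIER = 31, /* Hash multiplier (left over from the C version, unused here) */
	BUFSIZE = 100 /* Unused here */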
};
string NONWORD = "\n";
int go_on = 1;

void build(Prefix &prefix, istream &in);
void add(Prefix &prefix, const string &s);
void generate(int nwords);

int main()
{
	int nwords = MAXGEN;
	Prefix prefix;
	for (int i = 0; i < NPREF; i++)
	{
		add(prefix, NONWORD);
	}
	build(prefix, cin);
	add(prefix, NONWORD);
	generate(nwords);
    return 0;
}

void build(Prefix &prefix, istream &in)
{
	string buf;
	/* Read the next word into buf (it becomes the suffix of the current prefix) */
	while (in >> buf)
	{
		add(prefix, buf);
		printf("Whether to continue: 1(Yes) 0(No)");
		scanf("%d", &go_on);
		if (!go_on)
		{
			break;
		}
	}
}

void add(Prefix &prefix, const string &s)
{
	/* Once the prefix holds NPREF words, record s as a suffix of it and drop the oldest word */
	if (prefix.size() == NPREF)
	{
		statetab[prefix].push_back(s);
		prefix.pop_front();
	}
	prefix.push_back(s);
}

void generate(int nwords)
{
	Prefix prefix;
	int i;
	for (i = 0; i < NPREF; i++)
	{
		add(prefix, NONWORD);
	}
	for (i = 0; i < nwords; i++)
	{
		vector<string> &suf = statetab[prefix];
		const string &w = suf[rand() % suf.size()];
		if (w == NONWORD)
		{
			break;
		}
		cout << w << endl;
		prefix.pop_front();
		prefix.push_back(w);
	}
}
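
    One note on the random choice in generate: rand() is never seeded with srand(), so every run picks the same sequence of suffixes. As a possible alternative (a different technique from the one used above, not what the program does), the selection could be sketched with the C++11 <random> header. The helper name pick is hypothetical, and it assumes the suffix vector is not empty.

```cpp
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Hypothetical helper: choose one suffix uniformly at random.
// Precondition (assumed): suf is not empty.
const std::string &pick(const std::vector<std::string> &suf, std::mt19937 &rng)
{
	std::uniform_int_distribution<std::size_t> dist(0, suf.size() - 1);
	return suf[dist(rng)];
}
```

    The caller would create one std::mt19937 (seeded, for example, from std::random_device) and pass it to every call, so that different runs produce different text.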

Postscript:

    I have no experience at all with Awk and Perl, so I will not explain those versions here.

    For all these languages, The Practice of Programming gives the run time of each version of the code, as we can see in the figure below: the C code runs much faster than the others. Of course, CPUs have become much faster over time, but these timings are still a useful reference.


