["Game Engine Architecture" Extraction Summary] (2) Game Support System


foreword

Every game needs an underlying support system to handle routine but critical tasks. For example:

  • Starting and stopping the engine
  • Accessing the file system(s)
  • Accessing different types of assets (meshes, textures, animations, audio)
  • Providing debugging tools for the game team

This article discusses the underlying support systems found in most game engines.


Subsystem startup and termination

A game engine is a complex piece of software composed of multiple subsystems. When the engine starts, each subsystem must be configured and initialized in turn, and the dependencies between subsystems implicitly determine the order in which they must start up.

Unfortunately, C++'s static initialization rules give us no such control. In C++, global and static objects are constructed before main() is entered, but the order in which they are constructed across translation units is unspecified. Likewise, we cannot control the order in which they are destructed after main() returns.

To deal with this problem, we can use a small C++ trick: a static variable declared inside a function is not constructed before main(), but on the first call to that function. By controlling the order of first calls, we control the order of construction.

To control the order of destruction, a simple and brute-force approach is to define explicit startup and shutdown functions for each class and call them manually, in place of the native constructors and destructors.

Here is a typical implementation:

class RenderManager
{
public:
	// Constructed on first call
	static RenderManager& get()
	{
		static RenderManager sSingleton;
		return sSingleton;
	}

	RenderManager()
	{
		// Do nothing
	}

	~RenderManager()
	{
		// Do nothing
	}

	void startUp()
	{
		// Manual startup:
		// start up the managers we depend on first
		VideoManager::get();

		// now start up the render manager
		// ... ::get()
	}

	void shutDown()
	{
		// Shut down the manager
	}
};

memory management

Code efficiency is the eternal pursuit of game programmers. The performance of any piece of software is affected not only by the choice of algorithm and how well that algorithm is coded, but also by how its data is organized and stored, that is, by how the program uses memory (RAM). Memory affects performance in two main ways:

  • Dynamic memory allocation with malloc() or new is a very slow operation. The best way to improve performance is to avoid dynamic allocation entirely, or to use a custom memory allocator.
  • By the principle of locality, placing data in small, contiguous blocks of memory is far more efficient than spreading it across a wide range of memory addresses.

This section will introduce how to optimize memory usage from the above two aspects.

Optimize dynamic memory allocation

Dynamic allocation of memory through the malloc()/free() or new/delete operators is called "heap allocation". The inefficiency of heap allocation comes mainly from two sources:

  1. A heap allocator is general-purpose infrastructure: it must satisfy allocation requests of any size, from one byte to a gigabyte, and this generality requires a great deal of management overhead.
  2. On most operating systems, a heap allocation can involve a switch from user mode to kernel mode and back again, and these context switches are expensive.

So a common rule of thumb in game development is: keep heap allocations to a minimum, and never allocate from the heap inside a tight loop.

Of course, no game engine can avoid heap allocation entirely, so most engines implement one or more custom allocators. A custom allocator can outperform the operating system's allocator for two reasons. First, it satisfies allocation requests out of pre-allocated memory (which is itself obtained with a few large heap allocations up front), thereby avoiding context switches. Second, by making assumptions about usage patterns, a custom allocator can be made far more efficient than a generic heap allocator. The following common custom allocators are introduced below:

  • stack-based allocator
  • pool allocator
  • Allocator with alignment
  • Single-frame and double-buffered memory allocators

stack-based allocator

Many games allocate memory in a stack-like fashion: when a new game level is loaded, memory is allocated for it; once loading completes, little or no dynamic allocation takes place; and when the level ends, all of its memory is freed at once. For this pattern of allocation, a stack is a natural fit.

The stack allocator is very easy to implement. We use new to obtain a large block of contiguous memory up front, and maintain a pointer to the top of the stack: everything above the pointer is unallocated, everything below it is allocated. Each allocation request simply moves the pointer up by the requested number of bytes, and freeing the most recently allocated block simply moves the pointer back down.

As the above description suggests, memory in a stack allocator must be freed in the reverse order of allocation. The way to enforce this restriction is also very simple: we do not allow individual memory blocks to be freed at all. Instead, we provide two functions: the first marks the current top of the stack, and the second rolls the pointer back to a previously marked location.
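The marker-and-rollback scheme described above can be sketched in a few dozen lines. This is a minimal illustration, not the book's implementation: the names StackAllocator, getMarker and freeToMarker are my own, and alignment handling is omitted for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Minimal stack allocator sketch. One large block is obtained up front;
// a "top" offset moves up on each allocation. Individual frees are not
// allowed -- instead, getMarker() records the current top, and
// freeToMarker() rolls the stack back to a recorded position, freeing
// everything allocated after it in one shot.
class StackAllocator {
public:
    typedef std::size_t Marker;

    explicit StackAllocator(std::size_t totalBytes)
        : m_base(static_cast<std::uint8_t*>(std::malloc(totalBytes))),
          m_size(totalBytes), m_top(0) {}

    ~StackAllocator() { std::free(m_base); }

    // Allocate by bumping the top offset; returns nullptr when full.
    void* alloc(std::size_t bytes) {
        if (m_top + bytes > m_size) return nullptr;
        void* p = m_base + m_top;
        m_top += bytes;
        return p;
    }

    Marker getMarker() const { return m_top; }

    void freeToMarker(Marker m) { m_top = m; }

    void clear() { m_top = 0; }

private:
    std::uint8_t* m_base;
    std::size_t   m_size;
    std::size_t   m_top;
};
```

Note that freeToMarker releases an entire range of allocations with a single assignment, which is exactly why this allocator is so cheap.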

Building on this, some games use an even more flexible double-ended stack allocator: a single block of memory serves two stack allocators, which allocate from the two ends toward the middle. In the arcade game Hydro Thunder, the developers used the bottom stack for level loading and the top stack for temporary allocations, a design that achieved excellent performance.

pool allocator

When we need to allocate a large number of small blocks of memory of the same size in game programming, the pool allocator is the perfect choice.

The pool allocator works as follows: it first pre-allocates a chunk of memory whose size is an exact multiple of the size of the element to be allocated. This large block is then cut into element-sized small blocks, which are threaded onto a linked list (the free list). When the pool allocator receives an allocation request, it pops the next block off the list; to free an element, it simply pushes the block back onto the list.

In this way, both allocation and deallocation require only a few pointer operations.
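The free-list trick above can be sketched as follows. This is an illustrative implementation, not the book's: the key idea is that a free element's own first bytes are reused to store the pointer to the next free element, so the free list costs no extra memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Minimal pool allocator sketch. A single block is carved into equally
// sized elements, threaded into an intrusive free list: each free
// element's first bytes store a pointer to the next free element.
class PoolAllocator {
public:
    PoolAllocator(std::size_t elementSize, std::size_t elementCount)
        // Each element must be able to hold a next-pointer while free.
        : m_elementSize(elementSize < sizeof(void*) ? sizeof(void*) : elementSize) {
        m_pool = std::malloc(m_elementSize * elementCount);
        m_freeList = nullptr;
        // Thread every element onto the free list.
        char* p = static_cast<char*>(m_pool);
        for (std::size_t i = 0; i < elementCount; ++i)
            push(p + i * m_elementSize);
    }

    ~PoolAllocator() { std::free(m_pool); }

    // Pop the head of the free list: O(1), a couple of pointer ops.
    void* alloc() {
        if (!m_freeList) return nullptr;
        void* p = m_freeList;
        m_freeList = *static_cast<void**>(m_freeList);
        return p;
    }

    // Push the element back onto the free list: also O(1).
    void free(void* p) { push(p); }

private:
    void push(void* p) {
        *static_cast<void**>(p) = m_freeList;
        m_freeList = p;
    }

    void*       m_pool;
    void*       m_freeList;
    std::size_t m_elementSize;
};
```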

Single-frame and double-buffered memory allocators

In the game loop, we often need to allocate temporary data that is discarded either at the end of the current loop iteration or at the end of the next one. Many game engines support these two allocation patterns with what are called the single-frame allocator and the double-buffered allocator.

The single-frame allocator, as the name implies, hands out memory that is only valid within a single frame. We reserve a block of memory and manage it with the stack allocator described earlier. At the beginning of each frame, the stack's top pointer is reset to the low end of the memory block, and the cycle repeats.

A double-buffered allocator creates two single-frame allocators of the same size and uses them alternately, one per frame. In this way, during frame i + 1 we can still safely read memory that was allocated during frame i (which may hold data generated in frame i). This is very useful for caching the results of asynchronous processing.
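The alternating scheme can be sketched like this. It is a minimal illustration under my own naming (swapBuffers, SingleFrameAllocator), not the book's code; the essential rule is that swapBuffers() is called once per frame and clears only the buffer about to be written, so last frame's allocations survive one extra frame.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Trivial bump allocator used as one frame's buffer.
class SingleFrameAllocator {
public:
    explicit SingleFrameAllocator(std::size_t bytes)
        : m_base(static_cast<std::uint8_t*>(std::malloc(bytes))),
          m_size(bytes), m_top(0) {}
    ~SingleFrameAllocator() { std::free(m_base); }
    void* alloc(std::size_t bytes) {
        if (m_top + bytes > m_size) return nullptr;
        void* p = m_base + m_top;
        m_top += bytes;
        return p;
    }
    void clear() { m_top = 0; }   // reset at the start of a frame
private:
    std::uint8_t* m_base;
    std::size_t   m_size, m_top;
};

// Two single-frame buffers used alternately. Memory allocated during
// frame i remains readable through frame i + 1.
class DoubleBufferedAllocator {
public:
    explicit DoubleBufferedAllocator(std::size_t bytesPerBuffer)
        : m_bufferA(bytesPerBuffer), m_bufferB(bytesPerBuffer), m_current(0) {}

    // Call once at the start of every frame: flip to the other buffer
    // and wipe it. The previous frame's buffer is left untouched.
    void swapBuffers() {
        m_current = 1 - m_current;
        active().clear();
    }

    void* alloc(std::size_t bytes) { return active().alloc(bytes); }

private:
    SingleFrameAllocator& active() { return m_current == 0 ? m_bufferA : m_bufferB; }
    SingleFrameAllocator m_bufferA, m_bufferB;
    int m_current;
};
```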

memory fragmentation

Another problem with dynamic memory allocation is that memory becomes fragmented over time. I won't go into the concept of memory fragmentation in detail here. The trouble it causes is that an allocation request can fail even when enough total memory is free, because each allocation must be a contiguous block. Fragmentation is not a big problem on operating systems that support virtual memory, which makes physically scattered pages appear contiguous to the application; but because of its overhead, most game consoles do not use virtual memory.

Regarding how to solve this problem, the first thing to note is that neither the stack allocator nor the pool allocator described above suffers from fragmentation. For the general case, there is another approach: defragmentation and relocation.

The defragmentation process itself is intuitive and easy to understand: allocated blocks are shifted downward to "compact" the memory and close the holes between them. But here is the really tricky part: once we move allocated memory blocks, every pointer into those blocks becomes invalid. We must update each such pointer to point at the block's new address, an operation called "relocation".

Unfortunately, in C++ we cannot search for all pointers into a given address range. So, if we want to support defragmentation in a game engine, we have to abandon raw pointers in favor of something easier to fix up during relocation, such as smart pointers or handles.

A smart pointer here is a small class that contains a pointer and behaves almost exactly like one. Because it is implemented as a class, we can link all smart pointers into a global linked list; when a block of memory is moved, we scan the list and update every pointer into that block.

A handle is simply an index into a handle table, each element of which stores a pointer. When memory needs to be moved, we scan the handle table and patch the affected pointers. Note that the handles themselves must never be relocated.
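The handle-table idea can be sketched as follows. This is a deliberately stripped-down illustration with invented names (HandleTable, acquire, resolve, relocate); a real engine would recycle slots and tag each with a generation counter to detect stale handles.

```cpp
#include <cassert>
#include <cstddef>

// Sketch of a handle table: a handle is just an index into a table of
// raw pointers. Game code stores handles, never pointers, so the
// relocation system only has to patch the table entries -- the handles
// held by game code stay valid across a defragmentation pass.
const std::size_t kMaxHandles = 1024;

struct HandleTable {
    void*       entries[kMaxHandles];
    std::size_t nextFree = 0;

    // Hand out the next free slot for a newly allocated block.
    std::size_t acquire(void* p) {
        entries[nextFree] = p;
        return nextFree++;
    }

    // Dereference a handle to the block's current address.
    void* resolve(std::size_t handle) const { return entries[handle]; }

    // Called by the defragmenter after moving a block.
    void relocate(std::size_t handle, void* newAddress) {
        entries[handle] = newAddress;
    }
};
```

The extra indirection on every access is the price paid for being able to move memory freely.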

Another difficulty with relocation is that some third-party libraries may use neither handles nor smart pointers. The common practice is to set aside a region of memory for these libraries that does not participate in relocation. As long as such blocks are few and small, the relocation system can still achieve good efficiency.


containers

Game engines use a variety of collection-type data structures, called "containers" or "collections". Here I give only a rough introduction; detailed usage is not the focus of this series of articles (it may be covered in the [C++ Summary and Extraction] series).

The job of every container type is to hold and manage a collection of elements, but containers vary widely in their details and how they operate. Common types of containers include, but are certainly not limited to, the following:

  • Array: A collection of elements stored contiguously in memory, supporting random access. The length of an array is fixed at compile time.
  • Dynamic array: An array whose length can change dynamically at runtime.
  • Linked list: Elements are stored non-contiguously in memory and random access is not supported, but insertion and deletion are efficient.
  • Stack: Elements are added and removed in last-in, first-out (LIFO) order.
  • Queue: Elements are added and removed in first-in, first-out (FIFO) order.
  • Double-ended queue (deque): A queue that supports insertion and deletion at both ends.
  • Priority queue: Elements are ordered by priority rather than by insertion time; generally implemented with a heap.
  • Dictionary: A table of key-value pairs that can efficiently look up a value by its key.
  • Set: A collection that contains no duplicate elements.

Alongside containers, another important concept to discuss is the iterator. An iterator is a small class that acts like a pointer: it knows how to efficiently visit the elements of a particular container. The following examples access a container first with raw pointers, then with iterators:

void processArray(int container[], int numElements)
{
	int* pBegin = &container[0];
	int* pEnd = &container[numElements];
	for (int* p = pBegin; p != pEnd; p++)
	{
		int element = *p;
		// process element...
	}
}

void processList(std::list<int>& container)
{
	std::list<int>::iterator pBegin = container.begin();
	std::list<int>::iterator pEnd = container.end();

	for (std::list<int>::iterator p = pBegin; p != pEnd; p++)
	{
		int element = *p;
		// process element...
	}
}

The reader may have noticed the post-increment p++ used in the loops above. In a for loop's update expression, pre- and post-increment are exactly equivalent, but this small difference can matter for optimization. ++p increments the operand first and then returns the modified value; p++ returns the original, unmodified value and then increments. This means ++p introduces a "data dependence" in the code: the incremented value must be ready before it can be used. On a deeply pipelined CPU, this data dependence may cause a "pipeline stall". p++ has no such problem, so the rule of thumb for game programmers is to prefer post-increment, unless the operand is of class type (where post-increment forces a copy of the object).
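The class-type caveat is easy to see from the canonical pair of operator++ overloads. The Counter type below is an illustrative stand-in for an iterator class: the post-increment form must construct and return a temporary copy of the object, which is the cost the parenthetical above warns about.

```cpp
#include <cassert>

// Illustrative counter class showing the two operator++ overloads.
// For built-in types, ++p and p++ are equivalent in a for loop's
// update expression; for class types, post-increment must copy.
struct Counter {
    int value = 0;

    Counter& operator++() {        // pre-increment: modify, return self
        ++value;
        return *this;
    }

    Counter operator++(int) {      // post-increment: copy, modify, return the copy
        Counter old = *this;
        ++value;
        return old;
    }
};
```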


Summary

This article introduced the underlying support systems found in most game engines, including subsystem startup and termination, memory management, and the use of containers. This series of articles only extracts and summarizes the content the author finds most interesting in the book "Game Engine Architecture"; readers who want to learn more are encouraged to read the original book.

Origin blog.csdn.net/qq_35595611/article/details/126486242