Discussion of the low-code platform MetaStore metadata cache | JD Cloud technical team

Background and needs

As mentioned before, our model-driven implementation chose the interpretive approach, which requires the model's metadata and processes the logic dynamically after receiving a request.

In addition, the application's general capabilities include page DSL queries, menu queries, and so on.

Moreover, once triggers and user-defined APIs were added later, their metadata also needed query services.

So we need a metadata module that provides two basic functions: loading metadata and serving metadata queries.



Special note: at the beginning we supported two sources, local and remote. We later removed the remote logic to avoid network-isolation problems in standalone deployments.

The metadata handled in the first iteration includes models, page DSLs, and menus; triggers, user-defined APIs, interceptors, and so on were added later. Today we will discuss the design and implementation of that first iteration.

The requirement for model metadata is to cache a batch of model metadata and retrieve a model's details by model name.

The requirement for page DSLs is to cache a batch of page DSLs and retrieve one by dslId.

The menu requirement is simpler: cache the menu list and return it.

Those are the functional requirements; now the non-functional requirements:

  • Performance: metadata queries are extremely frequent, so high performance must be guaranteed. Caching is the usual answer, and it is one of this module's core values.
  • Accuracy: the data obtained from MetaStore must be correct, with no discrepancies.
  • Extensibility: first, metadata may need to be queried by more than just ID; second, when other metadata stores are added later, the changes required should be small.

First version design

Design ideas

The requirements are settled, so let's start the formal design with a concrete scenario: model metadata.

For high performance, caching is a must. Two caching approaches are common in development: remote and local.

Remote caching usually means NoSQL middleware such as Redis or Memcached, which is clearly unsuitable in this scenario.

The best fit here is an in-memory cache. The query logic is simple, a lookup by name or id, so a map data structure can be used directly.

The metadata is loaded at application startup and never changes afterwards (this will need refactoring later for hot deployment). By hooking into Spring's startup mechanism, thread safety is not a concern and a plain HashMap suffices; the JVM's happens-before guarantees cover the safe publication.

At this point, we have determined the cache data structure and interface:



It contains an internal cache variable of HashMap type, an internal method that loads the data, and one external interface method, getByKey. The responsibilities are cohesive and the class design is sound. Let's look at the concrete logic below.

Detailed logic

getByKey reads from the cache variable using the map's get method directly; nothing more to say (there is a pitfall here, explained below). Let's focus on the load method.



Main logic:

1. Read the specified directory, taken from configuration or falling back to the default "models", and list all the files in it.
2. Process each file in a loop:
   1. Read the file content (JSON format).
   2. Convert the JSON into a Model object.
   3. Put the Model object into the cache, with modelName as the key and the Model object as the value.
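The steps above can be sketched in Java. The class name follows the article; for brevity, the Model type and the fastjson parsing are replaced by the raw JSON string, so this is an illustrative sketch rather than the production code.

```java
import java.nio.file.*;
import java.util.HashMap;
import java.util.Map;

// First-version sketch: load model files once at startup, then serve
// lookups from an in-memory HashMap. The real code parses each JSON
// file into a Model object with fastjson; here the raw JSON stands in.
class ModelMetaStore {
    private final Map<String, String> cache = new HashMap<>();

    // 1. Read the configured directory (default "models").
    // 2. For each file: read the content, derive modelName from the
    //    file name, and put the entry into the cache.
    void load(Path modelsDir) throws Exception {
        try (DirectoryStream<Path> files =
                Files.newDirectoryStream(modelsDir, "*.json")) {
            for (Path file : files) {
                String json = Files.readString(file);
                String modelName = file.getFileName().toString()
                        .replaceFirst("\\.json$", "");
                cache.put(modelName, json);
            }
        }
    }

    public String getByKey(String modelName) {
        return cache.get(modelName);
    }
}
```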

Abstraction

Once the concrete plan is settled, the logic described above is not complicated to implement; it could even be called simple.

But looking at the whole picture, the metadata-caching logic for the model and the page DSL is strikingly similar!

Looking back at the flow chart above, the purple parts are identical; the only differences are the two pieces in red: "read the specified directory" and "parse into objects". We can extract the common logic into an abstract parent class and let subclasses inherit it, a classic code-reuse-through-inheritance scenario and a textbook application of the template method pattern.

Briefly, the template method pattern defines an algorithm skeleton in a method and defers certain steps to subclasses, allowing subclasses to redefine those steps without changing the algorithm's overall structure.

In our scenario, the algorithm skeleton is the overall data-loading flow plus the metadata lookup method, and the subclasses (ModelMetaStore and PageDataStore) implement two extension points in the skeleton: "read the specified directory" and "parse into an object".

After refactoring the first-version design, the class diagram is as follows:



Explanation:

  • Model data files and page DSL files are placed in their own directories, models and dsls respectively. Both directory names are defaults and can be customized.
  • Files in the model data directory are all JSON, and each file name is a model name.
  • Files in the page DSL directory are also JSON, and each file name is a page id.
  • An abstract parent class AbstractDataStore<R> is extracted; its generic parameter is the subclass's entity type. It contains the core logic and skeleton: an internal map-typed cache variable, a private load function for loading data, two abstract methods for subclasses to implement, directoryName (return the directory holding the files) and parse (turn file content into a single object), and finally a public method that fetches metadata by key.
  • ModelMetaStore and PageDataStore extend AbstractDataStore with generic parameters ModelEntity and PageEntity respectively, implementing the two abstract methods above. Both are trivial: directoryName simply returns the store's own directory, and parse uses fastjson to turn the string into the target object.
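The class diagram above can be sketched as follows. Class and method names follow the article, but file I/O is replaced by an in-memory map and the entity types (ModelEntity, PageEntity) are stood in by plain strings, so the parse implementations are illustrative only.

```java
import java.util.HashMap;
import java.util.Map;

// Template-method sketch: the skeleton (load + getByKey) lives in the
// parent; subclasses supply only the two extension points.
abstract class AbstractDataStore<R> {
    private final Map<String, R> cache = new HashMap<>();

    // Extension point 1: which directory holds this store's files.
    protected abstract String directoryName();
    // Extension point 2: parse one file's content into the entity type.
    protected abstract R parse(String fileContent);

    // Algorithm skeleton. In the real code the keys come from file
    // names under directoryName(); here they are passed in directly.
    void load(Map<String, String> filesByName) {
        filesByName.forEach((name, content) -> cache.put(name, parse(content)));
    }

    public R getByKey(String key) {
        return cache.get(key);
    }
}

// The real subclasses use fastjson to build ModelEntity/PageEntity;
// these illustrative ones just pass the raw string through.
class ModelMetaStore extends AbstractDataStore<String> {
    protected String directoryName() { return "models"; }
    protected String parse(String fileContent) { return fileContent; }
}

class PageDataStore extends AbstractDataStore<String> {
    protected String directoryName() { return "dsls"; }
    protected String parse(String fileContent) { return fileContent; }
}
```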

Composition

The metadata caches for the model and the page are now designed; the menu still needs one.

The menu's metadata structure is very different from the two above: each application has only one menu metadata file, so the template pattern cannot be used and its logic must be handled separately.

In fact, the menu's metadata-loading logic partially overlaps with the other two (more precisely, with the abstract class): reading a file's content, which involves checking whether the file is empty, reading the content string, handling IO errors, and so on.

This logic could be copied into the menu's metadata cache class, but that violates the DRY principle: don't repeat yourself.

So how do we eliminate the duplicated code? There are two options.

Option one: inheritance.

Add another abstract class above AbstractDataStore<R> containing only one method: read file content. The new menu metadata cache also inherits from this abstract class, and the file-reading method moves to the top layer. The class diagram is as follows:

This option has two problems:

  1. The inheritance hierarchy becomes too deep and the logic gets muddled.
  2. The top-level abstract class is hard to name.

Having AbstractDataStore and MenuDataCache inherit from LoadFileData works at the code level but is not logically sound: inheritance expresses an is-a relationship, so LoadFileData would need a name that fits that relationship, and a good one is hard to come up with.

Option two: composition.

This option is more reasonable. Object-oriented design has a well-known principle: prefer composition over inheritance.

The file-reading logic can be split out into a new helper class, DataLoadSupport, which both AbstractDataStore and MenuDataCache use, eliminating the duplicated code.
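A minimal sketch of the composition approach. The helper's method name readFileContent and its validation/exceptions are assumptions for illustration, not the production code.

```java
import java.nio.file.*;

// Composition: the shared file-reading logic lives in a helper class
// that both AbstractDataStore and MenuDataCache *hold*, instead of
// forcing another inheritance layer above them.
class DataLoadSupport {
    // Read one file's content, guarding against missing/empty files.
    String readFileContent(Path file) throws Exception {
        if (!Files.exists(file)) {
            throw new IllegalStateException("metadata file missing: " + file);
        }
        String content = Files.readString(file);
        if (content.isBlank()) {
            throw new IllegalStateException("metadata file empty: " + file);
        }
        return content;
    }
}

// The menu cache now *has* a loader rather than *is* one.
class MenuDataCache {
    private final DataLoadSupport support = new DataLoadSupport();
    private String menuJson;

    void load(Path menuFile) throws Exception {
        menuJson = support.readFileContent(menuFile);
    }

    public String getMenus() { return menuJson; }
}
```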





Initial load

The previous steps cover almost all the logic, but the data-load trigger is still missing: loading must fire when the service starts.

We use Spring's ApplicationListener, but one detail needs deciding: should every class implement it separately, or should there be a single unified implementation?

To unify the logic, and because other services will need startup loading later, I chose the unified implementation. The specific logic and class diagram are as follows:

Specific logic:

  • Add a new interface, StartLoadListener: any class whose work must be triggered at startup implements it.
  • AbstractDataStore and MenuDataCache implement StartLoadListener, marking that they must run at application startup; their load methods fire once the application has started.
  • Add a StartLoadListenerManager class into which every object implementing StartLoadListener is auto-injected. It also implements Spring's ApplicationListener interface, so it is triggered after Spring starts; it then calls startLoad on every StartLoadListener, completing all the work that must run at startup.
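The wiring above can be sketched as follows. To keep the sketch runnable without Spring, the manager's trigger is a plain method and the listener list is passed in by hand; in the real code the trigger body lives in ApplicationListener's onApplicationEvent and Spring injects the listeners.

```java
import java.util.ArrayList;
import java.util.List;

// Any class that must do work at startup implements this interface.
interface StartLoadListener {
    void startLoad();
}

// In the real code this class implements Spring's
// ApplicationListener<ContextRefreshedEvent> and the listener list is
// @Autowired; here the wiring is manual so the sketch runs standalone.
class StartLoadListenerManager {
    private final List<StartLoadListener> listeners;

    StartLoadListenerManager(List<StartLoadListener> listeners) {
        this.listeners = listeners;
    }

    // Stand-in for onApplicationEvent(...): fire every startup hook.
    void onApplicationStarted() {
        for (StartLoadListener l : listeners) {
            l.startLoad();
        }
    }
}
```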

Class Diagram:



Second version design

As mentioned at the start of the article, fetching a value from the map by key and returning it directly out of the metadata cache has a problem. I was aware of it when development finished but did not deal with it, until one day the mine really exploded.

First, how the problem unfolded. The initial features were simple, and the applications generated by the low-code platform ran well. After iterating several versions, permission capabilities were added: simple data permissions and menu permissions.

The processing logic of menu permissions is roughly as follows:

  • Get the cached data directly from the menu cache.
  • If the menu list is empty, return an empty list to the client and end the menu query.
  • Query the set of menu codes the user is permitted to see; the data source is the in-house permission system. In the early stage this rather crude code was coupled into the menu-permission logic, but that is not today's focus; a later article will cover the permission module's refactoring.
  • If the permitted menu-code set is empty, return an empty list to the client and end the menu query.
  • Recursively walk the cached menu: if a node's code is in the set returned by the permission system, recurse into its child list (the menu is a tree); otherwise delete the node directly and return to the parent level.
  • Repeat until every menu node has been checked.
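The filtering steps above can be sketched as follows (node fields and class names are illustrative). Note that this version mutates the list it is handed, which is precisely what goes wrong when that list is the shared cache, as discussed below.

```java
import java.util.*;

// Minimal tree node for the menu (illustrative shape).
class MenuNode {
    String code;
    List<MenuNode> children = new ArrayList<>();
    MenuNode(String code) { this.code = code; }
}

class MenuPermissionFilter {
    // Keep a node only if its code is permitted; recurse into children,
    // otherwise drop the whole subtree and return to the parent level.
    static void filter(List<MenuNode> menus, Set<String> permittedCodes) {
        Iterator<MenuNode> it = menus.iterator();
        while (it.hasNext()) {
            MenuNode node = it.next();
            if (permittedCodes.contains(node.code)) {
                filter(node.children, permittedCodes); // recurse into subtree
            } else {
                it.remove(); // delete the unpermitted node in place
            }
        }
    }
}
```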





The logic is not especially complicated. During development we focused on the recursion and the permission-system queries; local tests passed, and the test environment showed no problems either.

But a final pre-launch test uncovered a bug: after switching accounts, the menu was wrong, and menus the user did have permission for were missing!

While locating the problem we first checked the permission system's response: fine. We then pulled the data locally and ran a mock test: also fine. Only under the debugger did the real cause surface, the very issue we had noticed but never fixed: the menu list operated on by the menu-permission logic and the menu list cached in the menu metadata were the same object!

After one user's permission filtering removed some nodes, the next user to fetch the menu got metadata that had already been mutilated (nodes removed during permission processing).
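A minimal reproduction of the failure mode, with illustrative names: the cache hands out its internal list, so one caller's permission filtering corrupts what every later caller sees.

```java
import java.util.*;

// The bug in miniature: getMenus() returns the cached object itself,
// not a copy, so callers share (and can mutate) the cache's state.
class BuggyMenuCache {
    private final List<String> menus =
            new ArrayList<>(List.of("home", "admin", "reports"));

    List<String> getMenus() { return menus; } // returns the shared object
}
```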

Metadata storage is actually close to the prototype pattern. Its definition: when an object is expensive to create and different objects of the same class differ little (most fields are the same), new objects can be created by copying existing ones, saving creation time.

My implementation only cached and never copied, so the same object was manipulated in multiple scenarios, which ultimately corrupted and confused the data.

Copying falls into two categories:

  • Shallow copy: copies primitive fields (such as int and long) and the memory addresses of referenced objects; the referenced objects themselves are not copied recursively.
  • Deep copy: copies not only the references but the referenced data itself, recursively.

We clearly had to choose deep copy, but by the time the problem surfaced several kinds of metadata were already stored, and the menu data in particular has a complex tree structure, so a full deep copy could not be finished quickly. Off-the-shelf tools usually copy only one level and cannot deep-copy automatically.

The fix at the time was crude and confined to menu-permission handling: define a new menu list; for each permitted menu, deep-copy the current node and add it to the list; when recursing into child nodes, again define a new list, set it as the parent's children, put the permitted child copies into it, and repeat these steps recursively.

The surface bug was fixed, but the underlying problem was only covered up. Staying with the menu example: I handled it in the permission service, but any other place that uses the cached menu would hit the same missing-data error.

More serious still, triggers expose an interface for obtaining model metadata. If a developer modifies the model metadata after obtaining it, the errors are worse and harder to troubleshoot, because changes to the model metadata corrupt the model's common interface logic. So a real deep-copy facility had to be implemented.

There are generally two ways to implement deep copy.

The first is serialize-then-deserialize, for example serializing the data to JSON and deserializing it back. This has two problems: (1) polymorphism across inherited subclasses breaks easily; (2) serialization and deserialization carry a performance cost. So it does not fit here.

The second is recursive processing: on encountering Collection, Map, or JavaBean types, deep-copy them recursively. The development cost is slightly higher and heavy recursion needs special care. In addition, some special beans cannot be copied, for example those whose fields are final or can only be set through a constructor.

We chose the second method. It is purely algorithmic, so I will not go into detail, but I did hit one pit: I habitually return Collections.emptyList() when there is no data and an empty list is needed, and the deep copy fails when it tries to create such an object through its constructor. The fix: when the list or map is not a common mutable type, replace it with a plain ArrayList or HashMap.
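A sketch of the recursive approach, covering the collection cases only (JavaBean copying via reflection is omitted). Always materializing results as ArrayList/HashMap is what sidesteps the Collections.emptyList() pitfall.

```java
import java.util.*;

class DeepCopy {
    // Recursively copy Lists and Maps; other values are assumed
    // immutable here (the real code also reflects over JavaBeans).
    static Object copy(Object value) {
        if (value instanceof List<?> list) {
            List<Object> out = new ArrayList<>();      // always a mutable ArrayList
            for (Object item : list) out.add(copy(item));
            return out;
        }
        if (value instanceof Map<?, ?> map) {
            Map<Object, Object> out = new HashMap<>(); // always a mutable HashMap
            map.forEach((k, v) -> out.put(k, copy(v)));
            return out;
        }
        // Strings and boxed primitives are immutable: safe to share.
        return value;
    }
}
```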



There is actually another defense against data modification: immutable objects. An immutable object cannot be modified after it is created: it has no setters and never exposes its internal variables. Note that internal variables must also be protected, either immutable themselves or handed out as deep copies.
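A sketch of the immutable-object alternative, with illustrative names: no setters, a defensive copy on the way in, and an unmodifiable view on the way out, so callers cannot corrupt cached state.

```java
import java.util.List;

// Immutable menu node: fields are final, there are no setters, and the
// child list is defensively copied and unmodifiable.
final class ImmutableMenu {
    private final String code;
    private final List<ImmutableMenu> children;

    ImmutableMenu(String code, List<ImmutableMenu> children) {
        this.code = code;
        this.children = List.copyOf(children); // defensive, unmodifiable copy
    }

    public String getCode() { return code; }
    public List<ImmutableMenu> getChildren() { return children; }
}
```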

Author: Wu Ziliang, JD Technology

Source: JD Cloud Developer Community. Please credit the source when reprinting.

Origin: my.oschina.net/u/4090830/blog/10120058