A Mechanical Engineer's Take on Data Modeling and Domain Modeling

Foreword: The jBeanBox project has recently wrapped up. I am rather pleased with the idea of using Java initialization blocks to replace Spring's XML configuration. With this hammer in my hand, everything starts to look like a nail, and sure enough I now have my eye on Hibernate. As one of the three SSH brothers (Struts, Spring, Hibernate), it follows Spring's lead and is configured through XML or annotations, so its configuration is likewise fixed and cannot be generated or modified at runtime. Fine, that is the nail. I have even thought of a name for the new project: jSQLBox, intended as a replacement for Hibernate. While I am at it, does Hibernate have any other problems? I searched and searched, and apart from its being too complicated I could not think of any. It would be ridiculous to reinvent the wheel just to give Hibernate one more configuration method. No, some faults must be found, and if there is no disease, one must be diagnosed anyway. So I looked left and right, and whether I was dazzled or simply stared for too long, the more I looked, the more Hibernate seemed an example of over-design. Thus this rant was born. Hardcore Hibernate fans need not read on: the purpose of this article is to talk Hibernate down in order to talk up jSQLBox, which has not even been written yet.

Before we start, let's clarify the difference between the terms "domain logic" and "domain model". "Domain logic" refers to the internal business rules of an enterprise and has nothing to do with whether software is used at all. There should be no doubt about this. "Domain model", taken literally, ought to mean the business model of an enterprise, again with nothing to do with software terminology. In Martin Fowler's "Patterns of Enterprise Application Architecture", however, the term takes on another meaning: a concept based on the object-oriented model, in which the model itself contains both data and behavior. That corresponds exactly to the object model of software engineering, so "domain model" is often treated as equivalent to "object model" and is implemented with object-oriented software. The "domain model" is not the focus of this article, but my view is that enterprise applications are precisely where the "domain model" does not work. This is a personal opinion. If you disagree, please hold your bricks for a moment and hear my reasons:
1) When a problem is so complicated that it cannot be handled with tables, it has to be tackled with an object-oriented model. Such problems are usually so complex or lengthy that they exceed what people can understand and manipulate, which is why they suit computers. Enterprise logic, by contrast, usually deals with people, money, and goods. It is simple, and the business experts inside the enterprise can understand and explain it. Using a complex model to solve an essentially simple problem only complicates the problem. Business experts generally have no notion of objects or inheritance. What they carry in their heads is tables, the corporate accountant being the typical example. Talking to them about objects is like playing the lute to a cow. When in a country, speak that country's language, and communication becomes easy.
2) Object-oriented models built to solve complex problems usually do not involve concurrent access, because the complexity of the business itself makes concurrent programming extremely difficult. In other words, as the problem to be solved grows more complex, the constraints on concurrency grow tighter, and the result looks less and less like an enterprise application.
3) An object model is usually a tree or graph structure that must be saved in a special format, and it is hard to store in a uniform way. Office documents, 3D graphics, circuit layout software, and web documents each have their own proprietary file format. There are general formats such as XML, but XML is plainly a tree structure, not a table structure that can be dropped into a database. In rare cases the object model itself is the persistence layer: a circuit board layout, for example, can be printed and used directly in production. When an object model can be mapped to database tables by an O/R tool, that only shows the model is too simple, so simple that a few tables could have expressed it directly.
4) Object-oriented models appeared only with the development of computer technology. Their most common use is to build, with a computer's help, complex models that could not be built by hand, and in that work the modeling is often the whole job or its focus. In enterprise applications the model itself is not the point, because enterprise logic existed long before object orientation was proposed. Software merely expresses and organizes that logic so that it runs better and earns more.
To sum up, using object-oriented tools for enterprise applications is likely a case of killing a chicken with an ox cleaver. Can an ox cleaver kill a chicken? Of course, but mastering the cleaver is hard, and getting a whole development team to master it is harder still.
Before "domain model" and "domain modeling" appeared, modeling was still necessary before work on enterprise software began. The model usually took the form of interrelated tables, supplemented by text, graphics, and lines expressing the relationships between the tables. This is the traditional ER model, commonly called "data modeling". This article uses Figure 1 to illustrate the advantages of building the enterprise logic model on data, that is, on tables. (See p1.png, because the following discussion is based on this diagram.)



This is a partial screenshot of the database model from a small MRP program I used at my company two years ago. It is based on the ER diagram, with a slight difference: the usual key association lines are gone, and CRUD methods do not appear either, because they are not only child's play but would also obscure the real enterprise logic. Instead, lines that reflect the business logic have been added (the original drawing is all black lines; they were bolded and colored for this example). Each line represents a constraint or driving relationship between fields. I call each one a domain logic line, because it has nothing to do with programming languages or user interfaces; it reflects only business logic and is relatively stable. This example is fairly simple: it has only data-driven lines, with no constraint or validation lines and no roles. (Data can drive roles, and roles can drive data. For example, a flag field drives an auditor role, the auditor drives the field after auditing, the field then drives the auditor role again, and so on. A date can also serve as a driving source.)
Only one business operation is explained here: goods receipt and purchase order write-off.
This is a small company and the business is simple: when goods arrive, the warehouse keeper gives them a quick check, then immediately puts them into stock and writes them off against the order. The following demonstrates receiving one piece of the part numbered "001". The pseudo-script below runs inside a single transaction and has many steps, each corresponding to one colored line:
Step 1 (red text): The order receipt table (POReceiveing) records 1 piece received (ReceiveQTY = 1)
Step 2 (blue line): Add the receipt to the quantity received in the order details (PODetail) (Received = Received + ReceivedQTY)
Step 3 (purple line): Update the undelivered quantity = order quantity - received quantity (BackOrder = POQty - Received)
Step 4 (blue line): Update the backorders in the inventory table (Part) (PendingPOs = select sum(BackOrder) from PODetail where PartID = "001")
Step 5 (green line): Update available stock = current stock - reservations + backorders (StockAvailible = TotalCurrentStock - StockOnHold + PendingPOs)
Step 6 (orange line): Update stock shortage = safety stock - available stock (Shortage = SafetyStockLevel - StockAvailible)
Step 7 (red line): Change the current actual stock (TotalCurrentStock = TotalCurrentStock + ReceivedQTY)
Step 8 (yellow and green lines): Add a stock change record (OldQTY = previous stock, NewQTY = TotalCurrentStock, change quantity ChangedQTY = NewQTY - OldQTY)
Step 9 (green line): Same as step 5
Step 10 (orange line): Same as step 6
The above is a complete goods receipt and order write-off operation, and a typical (notorious) transaction script: the UI calls the Service, and the Service performs the 10 steps directly against the database, wrapped in one transaction, as shown in Figure 2.
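To make the ten steps concrete, here is a minimal sketch of them as a single transaction script. It is an illustration only, not the code of the original program: Python and its built-in sqlite3 module stand in for the real environment, the `PODetailID` column and the `StockChange` table are hypothetical names invented for the sketch, and the starting quantities in `demo()` are made up.

```python
import sqlite3

# Table and column names follow the steps above. PODetailID and the
# StockChange table are hypothetical names added for this sketch.
SCHEMA = """
CREATE TABLE Part (
    PartID TEXT PRIMARY KEY,
    TotalCurrentStock INTEGER, StockOnHold INTEGER, PendingPOs INTEGER,
    StockAvailible INTEGER, SafetyStockLevel INTEGER, Shortage INTEGER);
CREATE TABLE PODetail (
    PODetailID INTEGER PRIMARY KEY, PartID TEXT,
    POQty INTEGER, Received INTEGER, BackOrder INTEGER);
CREATE TABLE POReceiveing (PODetailID INTEGER, PartID TEXT, ReceiveQTY INTEGER);
CREATE TABLE StockChange (PartID TEXT, OldQTY INTEGER, NewQTY INTEGER, ChangedQTY INTEGER);
"""


def refresh_availability(db, part_id):
    """Steps 5-6 (repeated as 9-10): the green and orange lines."""
    db.execute(
        "UPDATE Part SET StockAvailible = TotalCurrentStock - StockOnHold + PendingPOs "
        "WHERE PartID = ?", (part_id,))
    db.execute(
        "UPDATE Part SET Shortage = SafetyStockLevel - StockAvailible WHERE PartID = ?",
        (part_id,))


def receive_goods(db, po_detail_id, part_id, qty):
    """The whole receipt as one transaction script."""
    with db:  # commit if every step succeeds, roll back if any step raises
        # Step 1: record the receipt.
        db.execute("INSERT INTO POReceiveing VALUES (?, ?, ?)",
                   (po_detail_id, part_id, qty))
        # Step 2: add the receipt to the order line.
        db.execute("UPDATE PODetail SET Received = Received + ? WHERE PODetailID = ?",
                   (qty, po_detail_id))
        # Step 3: recompute the undelivered quantity.
        db.execute("UPDATE PODetail SET BackOrder = POQty - Received WHERE PODetailID = ?",
                   (po_detail_id,))
        # Step 4: sum the backorders of all order lines for this part.
        db.execute(
            "UPDATE Part SET PendingPOs = "
            "(SELECT COALESCE(SUM(BackOrder), 0) FROM PODetail WHERE PartID = ?) "
            "WHERE PartID = ?", (part_id, part_id))
        refresh_availability(db, part_id)  # steps 5-6
        # Step 7: change the current stock.
        old_qty = db.execute("SELECT TotalCurrentStock FROM Part WHERE PartID = ?",
                             (part_id,)).fetchone()[0]
        new_qty = old_qty + qty
        db.execute("UPDATE Part SET TotalCurrentStock = ? WHERE PartID = ?",
                   (new_qty, part_id))
        # Step 8: log the stock change.
        db.execute("INSERT INTO StockChange VALUES (?, ?, ?, ?)",
                   (part_id, old_qty, new_qty, new_qty - old_qty))
        refresh_availability(db, part_id)  # steps 9-10


def demo():
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    # Made-up starting state: 10 in stock, 2 on hold, 5 on order, safety level 20.
    db.execute("INSERT INTO Part VALUES ('001', 10, 2, 5, 13, 20, 7)")
    db.execute("INSERT INTO PODetail VALUES (1, '001', 5, 0, 5)")
    db.commit()
    receive_goods(db, 1, "001", 1)
    part = db.execute(
        "SELECT TotalCurrentStock, PendingPOs, StockAvailible, Shortage "
        "FROM Part WHERE PartID = '001'").fetchone()
    log = db.execute("SELECT * FROM StockChange").fetchall()
    return part, log
```

Each statement maps to one colored line, which is the one-to-one correspondence between domain logic and script that the pattern relies on. The sketch also shows the repeated work of steps 9 and 10: `refresh_availability` runs twice in the same transaction.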



The advantages of data modeling plus the transaction script pattern are:
1) In implementation, domain logic and transaction scripts correspond one to one. The technology is simple and mature, which makes it easy to turn domain logic into actual code and unit tests. Development is fast and the project success rate is high. Even with a poor database design (in Figure 1, for example, the inventory table and the part number table are the same table, which also carries the heavy burden of demand analysis) and strange customer requirements (in my last project, goods were shipped first and only afterwards booked into the warehouse from the assembly line), the project is unlikely to fail (external factors aside) as long as every domain logic line has been drawn with none missing, the logic holds together, and there are no business logic errors such as circular calls.
2) There is no lofty theory, just tables, which makes communication with business people easy. Tables and lines are a language that business people understand. The manager of our mechanical design department (the person who asked for the MRP in the first place) printed the full version of Figure 1 on A3 paper, stuck it on the wall to study, and understood the whole process.
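The "circular call" error mentioned in point 1 can be checked mechanically if the domain logic lines are written down as data. The following is only a sketch under that assumption: each line is a (driving field, driven field) pair taken from the ten steps of the example, and Python's standard `graphlib` module (Python 3.9 or later) supplies the ordering and the loop detection. The function names are made up for illustration.

```python
from graphlib import CycleError, TopologicalSorter

# Each domain logic line as (driving field, driven field), read off the example steps.
LINES = [
    ("POReceiveing.ReceiveQTY", "PODetail.Received"),
    ("PODetail.Received", "PODetail.BackOrder"),
    ("PODetail.BackOrder", "Part.PendingPOs"),
    ("Part.PendingPOs", "Part.StockAvailible"),
    ("Part.StockAvailible", "Part.Shortage"),
    ("POReceiveing.ReceiveQTY", "Part.TotalCurrentStock"),
    ("Part.TotalCurrentStock", "Part.StockAvailible"),
]


def update_order(lines):
    """Return the fields in an order where every driver precedes what it drives."""
    predecessors = {}
    for source, target in lines:
        predecessors.setdefault(target, set()).add(source)
    # static_order() raises CycleError if the lines form a loop.
    return list(TopologicalSorter(predecessors).static_order())


def has_loop(lines):
    """True if following the lines can lead back to the starting field."""
    try:
        update_order(lines)
        return False
    except CycleError:
        return True
```

`update_order(LINES)` gives one valid sequence in which the fields can be recomputed, and `has_loop` flags a drawing in which, say, `Part.Shortage` was wired back to `POReceiveing.ReceiveQTY`.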

The disadvantages of the transaction script pattern are:
1) The Service is too heavy. All logic sits in the same transaction script, and scripts for different requirements may contain duplicate code, which hurts maintenance.
2) The database may be read and written repeatedly, which increases its load.
3) Unit tests must depend on the database.
4) When the domain logic is very complex, or the database designer does not know the business well, or the tables and fields are poorly designed, the lines in the diagram above can become numerous and messy. At the other extreme, programmers who cannot use a drawing tool at all will struggle to grasp complex domain logic. In fact, many current transaction script projects probably have no domain logic model like Figure 1 at all. The team gets the requirements and starts coding stark naked, which is simply playing a joke on the maintainers. For a mechanical engineer, the first step of a design is the general assembly sketch, and the drawings are also filed under the general assembly drawing. If I designed an instrument without a general assembly drawing, I would be packing my things and going home the next day. So if I were a software project manager, the first thing I would do is make the programmers learn AutoCAD. Why can't software developers learn from mechanical engineering and draw a general assembly drawing? If it is too complicated for A3 paper, use A2; if that is still not enough, there are A1 and A0 sheets. Google "domain model diagram" and most of what comes up is a few lonely little boxes, sparsely joined by one or two thin lines for one-to-many, many-to-one, and the like. To a mechanical engineer's eye, such a diagram is too simple, like a child's toy, and has no practical value. It does not reflect the complex assembly relationships of a multi-table business, and it is basically the same as an ER diagram with only thin key association lines. UML has many kinds of diagram, but none of them shows the assembly details between modules the way a mechanical general assembly drawing does.

Let's analyze these four shortcomings of transaction scripts one by one:
1) All of the Service's logic sits in one transaction script, and there is duplicate code across different scripts. If the business itself is duplicated, the logic diagram will show two lines in the same position. Once it is confirmed that the two lines represent the same business logic (AutoCAD zooms steplessly, so small text or graphic notes can be added on a line), simply delete one of them. Duplicated low-level code, such as data access code, can go into a common subroutine. The script in the example has 10 steps, which basically correspond to 10 subroutine calls. Logic that is always bound together in the business can be packaged together. In the example, a change to the current stock always triggers the green, orange, yellow, and cyan lines, so those four subroutines can be merged into one, and the total number of scripts in the Service goes down. The example was written in Delphi, and this is what was done in the actual code. For Java the principle is the same, except that the common subroutines are usually moved into a singleton Service class so that Spring's declarative transactions can be used. A singleton is equivalent to a set of global static methods: it cannot keep state in fields of its own, and everything is passed in through method parameters. The only purpose of configuring it as a singleton is to let Spring generate a proxy for it at creation time and thereby implement declarative transactions. With plain static methods it is hard to know the method name at runtime, so unless preprocessing is done at compile time, transactions cannot be hooked onto method names. In a program built on transaction scripts, methods represent business logic and are first-class citizens. It does not matter which class a method is placed in, because in the general assembly drawing the line representing that piece of business always sits in the same fixed position. Enterprise logic that is likely to change can be placed in a separate service. Fixed logic does not need refactoring. It can be written as a dataset attribute method or a service sub-method that is triggered automatically whenever other business logic accesses the dataset attribute or the service, which reduces the number of repeated scripts in the transaction. With these optimizations the 10-step script can be reduced to two scripts, with the rest triggered automatically. Of course, this is the ideal case, in which the linkage between services is fixed. In practice, whether automatic triggering is permissible depends on how loosely the services themselves are coupled.
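As a rough illustration of the automatic triggering described in point 1, the fixed linkage can be hung on attribute setters so that the Service script only has to touch the two driving fields. This is a sketch in plain Python with in-memory data, not the Delphi or Spring code discussed above; the `Part` class, its `changes` list, and the dictionary layout of the order lines are all assumptions made for the example.

```python
class Part:
    """In-memory stand-in for one row of the Part table (hypothetical class)."""

    def __init__(self, stock, on_hold, pending_pos, safety_level):
        self.StockOnHold = on_hold
        self.SafetyStockLevel = safety_level
        self.changes = []  # stock change records: (OldQTY, NewQTY, ChangedQTY)
        self._stock = stock
        self._pending = pending_pos
        self._refresh()

    def _refresh(self):
        # Green and orange lines: always recomputed together.
        self.StockAvailible = self._stock - self.StockOnHold + self._pending
        self.Shortage = self.SafetyStockLevel - self.StockAvailible

    @property
    def TotalCurrentStock(self):
        return self._stock

    @TotalCurrentStock.setter
    def TotalCurrentStock(self, new_qty):
        # Fixed linkage: log the change, then refresh the derived fields.
        self.changes.append((self._stock, new_qty, new_qty - self._stock))
        self._stock = new_qty
        self._refresh()

    @property
    def PendingPOs(self):
        return self._pending

    @PendingPOs.setter
    def PendingPOs(self, value):
        self._pending = value
        self._refresh()


def receive_goods(part, po_lines, line_index, qty):
    """The Service script: only the driving fields are set explicitly."""
    line = po_lines[line_index]
    line["Received"] += qty
    line["BackOrder"] = line["POQty"] - line["Received"]
    part.PendingPOs = sum(l["BackOrder"] for l in po_lines)  # triggers the refresh
    part.TotalCurrentStock += qty  # triggers the change log and the refresh
```

The trade-off is the one noted above: this only works while the linkage really is fixed. As soon as some requirement needs the stock changed without the refresh, the setter has to be taken apart again.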
2) The database may be read and written repeatedly (in the example, steps 9 and 10 repeat steps 5 and 6), which increases its load (by somewhere between 2 and 10 times, I would guess). This is a real difficulty and not easy to overcome. If it is not serious it can be ignored, because it is a performance problem, and today's computers are fast enough that performance is usually not an issue. If it really does become a problem, it can be solved with money, and problems that money can solve are not problems for the software industry. If money is tight and one has to be realistic, there are two options. One is to maintain a business data cache by hand inside the transaction script, but that creates a serious problem: the subroutines end up depending on the script above them. The other is to use an ORM or Active Record tool that, like Hibernate, has a first-level cache and dirty checking, so that database access within one transaction goes to the first-level cache first, giving transparent persistence across multiple subroutines. (If I could keep only one of Hibernate's advantages, I would vote for transparent persistence, whose essence is to reduce pressure on the database.) If no suitable transparent persistence tool can be found, Hibernate can be used for the time being. Personally, though, I am not fond of Hibernate: it is too complicated, and it is easily misused for object-oriented modeling, which goes against the database-oriented modeling this article advocates. Object-oriented modeling is fashionable, yet people overlook that an object's data structure may not match the table structure. Most users of object-oriented software never take the time to think about how their objects are stored. Hibernate alone is the exception: its user must not only think about storage but be an expert in it, because this is an enterprise application. One needs to know not only which object attribute each table field corresponds to, but also which business attribute each field corresponds to, and to call native SQL when necessary.
The jSQLBox tool I have in mind should be very similar to Hibernate, keeping most of its functions but removing object associations. It only needs to be database-oriented, not object-oriented. The design is based on ActiveRecord, with the Dao becoming an attribute of the ActiveRecord. The Session (equivalent to a database connection) is injected into the Dao by the declarative transactions of Spring or jBeanBox (pure advertising here), so there is no longer any need to obtain and inject the DAO and Session explicitly, and the CRUD methods of an ActiveRecord class can be called directly in a script, which makes programming convenient and efficient. There are already many ActiveRecord tools with functions like those described above, but even leaving object associations aside, overall they are not as easy to use as Hibernate. Their main shortcomings are no support for refactoring of setter method names, no transparent persistence, and native SQL that is either poorly wrapped or replaced by yet another reinvented SQL syntax. Of course, Hibernate's dominance, which leaves them unpopular and short of developers, is also a factor. By the way, both the object-oriented object model and the data-oriented table model based on ActiveRecord can simulate complex data structures, and that is the basis on which they solve complex problems. Database tables can express graphs and trees too. If the object structure and the database structure do not match, the mismatch becomes a source of trouble. But if the objects and the database structure are designed to be exactly the same, then what do I need your objects for? Look around the corner and, oh, domain modeling and database modeling have drawn two identical model diagrams!
Unlike Hibernate's POJOs, the ActiveRecord pattern uses up Java's precious single inheritance, but an ActiveRecord is itself part of the database. Statically, the database can express data of any structure. Dynamically, the business logic lines represent methods. Database-oriented modeling therefore already has two programming elements, data structures and methods, just like an object model, so with ActiveRecord there is no need to model everything again with objects. Using object associations gives a little more syntactic sugar (an associated object does not have to be loaded by hand once the association is set), but the business model is essentially the same (and has to be). As for "encapsulation", another element of object orientation, it is a joke in enterprise applications. Enterprise logic is a synonym for "logic that makes no sense but exists, and so must be accepted". Any two unrelated attributes may turn out to be related in the next requirement change: the unusual requirement that the part number and the phone number be changed before every shipment (an underground factory?), or the bizarre need to calculate what the factory director's nephew earned in this month last year. So forget about encapsulation. Of course, firing the nephew would be a neat way to get a clean model if the decision were yours, but you are just a coder, too embarrassed to tell the factory director "fire your nephew, it will make my programming easier". Better to implement these weird requirements honestly and simply treat every field of every table as public.
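To show what a first-level cache with dirty checking buys inside one transaction, here is a toy sketch. It is emphatically not how Hibernate is implemented, and it is not jSQLBox either (which, as said, does not exist yet); the `UnitOfWork` class, its counters, and the two little subroutines are invented for the illustration, with Python's built-in sqlite3 as the database.

```python
import sqlite3


class UnitOfWork:
    """Toy first-level cache: one row cache per transaction, flushed on commit."""

    def __init__(self, db):
        self.db = db
        self.rows = {}    # PartID -> current field values
        self.clean = {}   # PartID -> snapshot taken when the row was loaded
        self.reads = 0    # number of SELECTs actually sent to the database
        self.writes = 0   # number of UPDATEs actually sent to the database

    def part(self, part_id):
        """Return the cached row, hitting the database only on the first access."""
        if part_id not in self.rows:
            stock, hold = self.db.execute(
                "SELECT TotalCurrentStock, StockOnHold FROM Part WHERE PartID = ?",
                (part_id,)).fetchone()
            self.reads += 1
            self.rows[part_id] = {"TotalCurrentStock": stock, "StockOnHold": hold}
            self.clean[part_id] = dict(self.rows[part_id])
        return self.rows[part_id]

    def commit(self):
        """Write back only the rows that differ from their snapshot (dirty check)."""
        with self.db:
            for part_id, row in self.rows.items():
                if row != self.clean[part_id]:
                    self.db.execute(
                        "UPDATE Part SET TotalCurrentStock = ?, StockOnHold = ? "
                        "WHERE PartID = ?",
                        (row["TotalCurrentStock"], row["StockOnHold"], part_id))
                    self.writes += 1
        self.clean = {k: dict(v) for k, v in self.rows.items()}


# Two independent subroutines; neither knows whether the row is already loaded.
def add_stock(uow, part_id, qty):
    uow.part(part_id)["TotalCurrentStock"] += qty


def hold_stock(uow, part_id, qty):
    uow.part(part_id)["StockOnHold"] += qty


def demo():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE Part (PartID TEXT PRIMARY KEY, "
               "TotalCurrentStock INTEGER, StockOnHold INTEGER)")
    db.execute("INSERT INTO Part VALUES ('001', 10, 2)")
    db.commit()
    uow = UnitOfWork(db)
    add_stock(uow, "001", 1)
    hold_stock(uow, "001", 3)
    uow.commit()
    row = db.execute("SELECT TotalCurrentStock, StockOnHold FROM Part "
                     "WHERE PartID = '001'").fetchone()
    return row, uow.reads, uow.writes
```

Both subroutines modify the same row, yet the database sees one read and one write. The subroutines depend only on the cache object handed to them, not on the script that called them, which is the difference from a hand-maintained cache inside the script.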
3) Unit tests must depend on the database. Emptying the database and loading data for every test may be a little tedious, but I personally think it is a small problem that can be tolerated and ignored. Since the problem exists, though, I will mention one thing in passing: if a cross-database tool (such as Hibernate or the imaginary jSQLBox) is used to load the data, the test code can be made independent of any specific database (native SQL excepted, though native SQL is itself fairly portable). Mocking the database seems to be an anti-pattern.

4) When the domain logic is very complex, a domain logic diagram like Figure 1 becomes very large (an A0 sheet may be needed) and complex, making it hard to read and to check for errors. I have a few ideas, or rather guesses, for this problem. They are guesses because drawing is my day job and software is purely a sideline, so I have never met a truly complex business model:
The first possibility: the designer of the domain logic diagram is the problem, and it can be fixed by bringing in a database designer who knows the business better.
The second possibility: domain logic that is both very large (many database tables) and very complex (relationships between tables crisscrossing like a spider's web, see Figure 3) either does not exist at all or is simply a badly conceived business design. An enterprise model deals with people, and logic that is too complicated can neither be designed by the designers nor understood by its users. So there is no need to worry that the domain logic diagram will be too complicated to design or read. Good business logic, once drawn, must also show high cohesion and low coupling, as in Figure 4.



This is no coincidence. It has to do with how the human brain handles problems. The brain is not used to dealing with a jumble of information, and human society long ago grouped closely related things together to make them manageable. This is a basic reason society functions, an existing state of affairs whose fruits we simply need to enjoy. (Many ERP implementations fail because, although the model is perfect, it is too complicated and exceeds what users can absorb in the limited training time, and because everything is so tightly interlocked that one department going down brings down the whole system.) Remember one key point (important things bear repeating): of all categories of software, enterprise applications are the simple, lowly kind. They must be simple enough for business people to specify and use, and simple enough for programmers who are laymen in the business to understand quickly.
The third possibility: enterprise applications with many tables and many complex connections between them really do exist (though I doubt it). In that case one can borrow from integrated circuit layout and use something like PCB auto-routing software (now that is truly object-oriented) in place of manual routing in AutoCAD. The lines are laid out at regular spacing and annotations can be attached to them, which helps designers and readers understand the diagram. The routing is the key point here. Tools such as Rose or PowerDesigner may or may not already have this feature; I am not very familiar with them. It is even possible to have software generate the names of all subroutines and unit test procedures automatically and so rough out a draft of the program from the overall structure, because a model based on database table design is down to earth: a table in the model equals a database table, and a line equals a method name (a driving method, a constraint method, and so on). I would also guess that the larger the scale and the more lines in the structure diagram, the simpler each individual method is. The reason is the same as above: this matches the way people deal with things. Chips with thousands upon thousands of transistors are a familiar case: the more complex the chip, the fewer of them there are, and the simpler the chip, the more.
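The idea of generating subroutine and unit test names from the drawing can be tried out in a few lines, assuming the lines of the diagram have been exported as data. The export format, the naming scheme, and the function names below are all hypothetical; no existing tool is being described.

```python
# Each line as (kind, driving field, driven field); a hypothetical export format.
LINES = [
    ("drive", "POReceiveing.ReceiveQTY", "PODetail.Received"),
    ("drive", "PODetail.Received", "PODetail.BackOrder"),
    ("drive", "PODetail.BackOrder", "Part.PendingPOs"),
]


def method_name(kind, source, target):
    """One method name per line, built from the line's kind and its two ends."""
    return f"{kind}_{target}_from_{source}".replace(".", "_")


def skeleton(lines):
    """Emit an empty subroutine and an empty unit test for every line."""
    out = []
    for kind, source, target in lines:
        name = method_name(kind, source, target)
        out.append(f"def {name}(db):  # {source} -> {target}")
        out.append("    raise NotImplementedError")
        out.append(f"def test_{name}():")
        out.append("    raise NotImplementedError")
    return "\n".join(out)
```

Calling `print(skeleton(LINES))` produces a stub file with one subroutine and one test per line, which is about as far as a "draft of the program from the overall structure" can go without the business rules themselves.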

To sum up, transaction scripts based on data modeling have many advantages and also disadvantages, but there are ways to improve on the disadvantages. Compared with domain modeling, which is riddled with problems, some of which even the experts cannot explain, data modeling plus transaction scripts should without doubt be the first choice for programmers whose primary goal is getting the project done. As for maintainability, if the code and the lines of the data model diagram are always kept roughly in step, maintainability is also very good, no matter how complex the system or how many lines of code. Even if you copy and paste 10,000 lines of code a day, as long as the concepts are clear it is hard to go wrong; maintainability is not necessarily related to line count. In addition, data modeling does not completely exclude objects. In some local situations, such as BOM explosion, the BOM table can be read into memory and converted into an object tree for convenient manipulation. That is local use; the system as a whole does not need to be turned into objects, and data modeling remains the mainstay. In fact an ActiveRecord is also an object, just an anemic one that corresponds one to one with a row of a database table. Even if business logic methods are written into it to await automatic triggering, it is still anemic, because a logic method belongs only to the general assembly drawing or a sub-assembly drawing, not to any one table, so be prepared to delete, modify, or move methods at any time. There is only one reason: this is an unreasonable enterprise application. On the other hand, data modeling and object modeling are connected, and data modeling can be regarded as the first version of object modeling.
If a method is tightly bound to a model and can be written as a method of that model, or if several models are closely related and have no attributes directly tied to the outside world, so that they can be packaged into one object for management, then data modeling evolves naturally into object modeling. Conversely, as business requirements grow more complex and unreasonable, the private fields of an object model are forced into association with the outside world, and it may degenerate into a simple anemic data model. Familiarity with the business and the ability to predict how it may change are the key here. If you are unfamiliar with the business, then to improve the project's chance of success you can start with data modeling and distill the business rules on paper in the course of modeling and optimization (and even of running prototypes and collecting feedback). The correct business model does not fall from the sky; it comes from step-by-step refinement of the general assembly drawing. If, however, you understand the business clearly and are proficient in object-oriented techniques, a high-cohesion, low-coupling business logic diagram may form naturally in your head, and you can then use an object model from the outset and refine it from the top down. The drawbacks are: 1) this demands an extremely high level of business knowledge; 2) without the help of a general assembly drawing, it is hard not to miss some piece of complex enterprise logic; 3) the object model may end up too far from the database model structure, and once the business turns unreasonable it is quite difficult to degrade it into an anemic model.
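The local use of objects mentioned above, reading the BOM table into memory as a tree for the explosion calculation, might look roughly like this. The rows, part names, and quantities are invented, the `Node` class and function names are hypothetical, and the sketch assumes the BOM contains no cycles.

```python
from collections import defaultdict

# Rows as they might come from a BOM table: (parent part, child part, qty per parent).
BOM_ROWS = [
    ("bike", "wheel", 2),
    ("bike", "frame", 1),
    ("wheel", "spoke", 32),
    ("wheel", "rim", 1),
]


class Node:
    """One part in the in-memory BOM tree."""

    def __init__(self, part_id):
        self.part_id = part_id
        self.children = []  # list of (Node, quantity per parent)


def build_tree(rows, root_id):
    """Turn the flat table rows into an object tree rooted at root_id."""
    by_parent = defaultdict(list)
    for parent, child, qty in rows:
        by_parent[parent].append((child, qty))

    def build(part_id):
        node = Node(part_id)
        for child_id, qty in by_parent.get(part_id, []):
            node.children.append((build(child_id), qty))
        return node

    return build(root_id)


def explode(node, qty=1, totals=None):
    """Total quantity of each bottom-level part needed to make qty of node."""
    totals = {} if totals is None else totals
    for child, per_parent in node.children:
        need = qty * per_parent
        if child.children:
            explode(child, need, totals)  # sub-assembly: keep descending
        else:
            totals[child.part_id] = totals.get(child.part_id, 0) + need
    return totals
```

For example, `explode(build_tree(BOM_ROWS, "bike"), 3)` adds up the spokes, rims, and frames needed for three bikes. The tree lives only for the duration of the calculation; the tables stay the system of record.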

The above is purely personal opinion. If you disagree, please refute it with an example. Which enterprise application cannot have its business logic expressed with data modeling (plus domain logic lines) and must use domain modeling instead? Or which one, once modeled with domain modeling, cannot be expressed with data modeling? Or which one is modeled and completed much faster with domain modeling than with data modeling? (Let's leave maintainability aside here, because most current transaction script projects lack the general assembly drawing I am talking about. I invented it only two years ago, and any resemblance is purely coincidental.)

Origin http://10.200.1.11:23101/article/api/json?id=326646813&siteId=291194637