The Velocity template engine parses a template into an AST and then traverses that tree to render the entire page. For some complex pages this is relatively inefficient. By keeping the standard Velocity syntax and reimplementing its rendering mechanism, the template development model stays unchanged while template efficiency improves. This template engine is named Sketch.
The real-world problem
Taobao uses Velocity. At the scale of a large site, even a 10% improvement in one area makes a significant difference to the whole system. Using Velocity, however, raises several problems:
- The page output is relatively large, about 80KB on average, and most of the time is spent in out.print
- CPU pressure is high: about 80% under load testing, and a profiling tool showed that template rendering takes more than 60% of CPU time
- Templates contain a large number of variable and method calls, many of which do not even need dynamic resolution
- Rendering a template generates many temporary objects, which weighs heavily on JVM GC and causes the system to GC frequently
- Page templates contain many whitespace characters, which waste network traffic
These are the reasons Velocity's efficiency stays low when rendering templates.
The theoretical basis for optimization
The inverted triangle of programming languages
The level of a programming language and its execution efficiency form an inverted triangle: the higher-level the language, the lower its efficiency tends to be.
Reducing data-structure abstraction
Algorithms + data structures = programs: the algorithm is the process and the data structure is the carrier. Reducing abstraction means implementing ourselves what would otherwise be reached through an abstracted lower-level interface, which lowers the degree of encapsulation in the program and so improves performance.
Making the simple program more complicated
Querying the database with plain JDBC is more efficient than going through iBATIS as the data layer.
Reducing the cost of translation
Programs have a translation problem, namely encoding and decoding. Cutting down on this translation reduces errors and improves efficiency.
Turning what changes into constants
Content that does not actually change should be turned into constants; dynamic web pages contain many static parts.
Ideas for implementing an efficient template engine
The overall design of Sketch is divided into two parts: a compile-time environment and a runtime environment. The runtime environment renders templates into HTML; the compile-time environment compiles templates into Java classes.
How vm templates are compiled
One goal of optimizing Velocity templates is to change interpreted execution of a template into compiled execution. In Velocity, the syntax in a vm file is ultimately parsed into a syntax tree, and the output is rendered by executing that syntax tree.
Sketch still follows Velocity in parsing a vm template into an AST, but it rewrites the rendering rules of that tree: each syntax node is redefined to generate the corresponding Java code rather than to render output.
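As a rough illustration of compiled execution, a template whose text is `Hello $name!` might end up as a Java class like the one below. This is only a sketch: the class name HelloTemplate, the render signature, the context map and the handling of undefined references are all assumptions made for the example, not Sketch's actual generated code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical output of compiling the template text "Hello $name!".
// Each syntax node has become a plain Java statement, so no syntax tree
// is walked at render time.
final class HelloTemplate {

    static void render(Map<String, Object> context, StringBuilder out) {
        out.append("Hello ");                  // static text node
        Object name = context.get("name");     // reference node: $name
        // Assumption: an undefined reference is echoed back as written.
        out.append(name == null ? "$name" : name.toString());
        out.append("!");                       // static text node
    }

    public static void main(String[] args) {
        Map<String, Object> context = new HashMap<>();
        context.put("name", "Sketch");
        StringBuilder out = new StringBuilder();
        render(context, out);
        System.out.println(out);
    }
}
```

The point of the design is that the cost of parsing and tree traversal is paid once, when the class is generated, instead of on every request.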
Method calls without reflection
Velocity's approach is to find the Java object corresponding to the variable $exampleDO, then check whether that object has a getItemList() method. If the method exists, Velocity calls its invoke method and obtains the result reflectively. Invoking methods through reflection is time-consuming.
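The two call styles can be contrasted in a minimal sketch. The ExampleDO class here is a hypothetical stand-in for whatever object is bound to $exampleDO, and the direct version assumes the variable's type is known when the template is compiled; how Sketch actually obtains that type is not described here.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.List;

// Hypothetical data object standing in for the object bound to $exampleDO.
final class ExampleDO {
    private final List<String> itemList = Arrays.asList("a", "b");

    public List<String> getItemList() {
        return itemList;
    }
}

final class CallStyles {

    // Interpreter style: look the method up by name, then invoke it reflectively.
    static Object viaReflection(Object target, String methodName) {
        try {
            Method method = target.getClass().getMethod(methodName);
            return method.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot call " + methodName, e);
        }
    }

    // Compiled style: the generated code casts to the known type and calls
    // the method directly, with no lookup and no Method.invoke.
    static Object direct(Object target) {
        return ((ExampleDO) target).getItemList();
    }

    public static void main(String[] args) {
        ExampleDO exampleDO = new ExampleDO();
        System.out.println(viaReflection(exampleDO, "getItemList"));
        System.out.println(direct(exampleDO));
    }
}
```

Both methods return the same list; the difference lies only in how the call is dispatched.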
Converting character output to byte output
A static string is written directly as out.write(_S0), where _S0 is the byte array for a string in the vm template. The string is converted into a byte array once, when the template class is initialized. Character encoding is very time-consuming, so if static strings are encoded in advance, no encoding time is spent when they are finally written to the socket stream, which improves efficiency. Practical tests confirm that this helps performance.
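A minimal sketch of this idea follows. Only the name _S0 and the out.write(_S0) call come from the text above; the class name StaticBytesTemplate, the second constant _S1, the UTF-8 charset and the renderToBytes helper are assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical compiled template whose static text is encoded once,
// when the class is initialized, instead of on every render.
final class StaticBytesTemplate {

    private static final byte[] _S0 = "<html><body>".getBytes(StandardCharsets.UTF_8);
    private static final byte[] _S1 = "</body></html>".getBytes(StandardCharsets.UTF_8);

    static void render(String dynamicPart, OutputStream out) throws IOException {
        out.write(_S0);                                          // no encoding at render time
        out.write(dynamicPart.getBytes(StandardCharsets.UTF_8)); // only dynamic text is encoded
        out.write(_S1);
    }

    // Convenience wrapper for trying the template without a socket.
    static byte[] renderToBytes(String dynamicPart) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            render(dynamicPart, buffer);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(new String(renderToBytes("hi"), StandardCharsets.UTF_8));
    }
}
```

This also fits the earlier principle of turning unchanging content into constants: the static parts of the page become byte-array constants of the template class.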
Optimization results
Converting char to byte
After converting char output to byte output, byte-stream output turned out to be about 100% faster than character output, which shows how time-consuming character encoding is.
Execution without reflection
System performance improves by nearly 50%.
Other optimization methods
- Remove excess non-Chinese whitespace from the page output.
- Compress TABs and line breaks.
- Merge identical data, and avoid outputting the same data repeatedly inside loops.
- Render asynchronously: extract some static content and render it asynchronously.
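The whitespace and line-break compression in the list above could look roughly like the sketch below. The class name WhitespaceCompressor and the exact rules (one newline per run of line breaks, one space per run of spaces or TABs) are assumptions. Java's default `\s` does not match the full-width Chinese space U+3000, so that character is left alone. The sketch makes no attempt to protect whitespace-sensitive content such as `<pre>` blocks.

```java
// Hypothetical helper applied to the static text of a template.
final class WhitespaceCompressor {

    static String compress(String html) {
        return html
                // A line break plus any surrounding ASCII whitespace becomes one newline.
                .replaceAll("[ \\t]*\\r?\\n\\s*", "\n")
                // Any remaining run of spaces or TABs becomes one space.
                .replaceAll("[ \\t]{2,}", " ");
    }

    public static void main(String[] args) {
        System.out.println(compress("<div>\n\t\t<p>a   b</p>\n\n</div>"));
    }
}
```

Because the static text of a template does not change between requests, this kind of compression would only need to run once per template rather than once per page view.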