MapReduce is a programming model for parallel operations on large datasets (typically larger than 1 TB).
Thrift is a software framework for developing scalable, cross-language services. It combines a software stack with a code-generation engine to build seamless, efficient services that work across C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, and OCaml.
In a sense, both WebService and REST are implementations of RPC. So how did RPC develop? This article gives a brief summary of RPC, drawing on Wikipedia.
RPC (Remote Procedure Call) is a form of Inter-Process Communication (IPC), generally referring to communication between processes on different machines. In older procedural languages such as C, an RPC is a call to a "subroutine" on the server side, hence the name "procedure call". After the rise of OOP, RPC is also called Remote Method Invocation (RMI), or simply remote invocation.
An RPC can be synchronous or asynchronous. In synchronous mode, the client sends a request to the server and blocks while waiting; the server executes the subroutine and sends back a response; the client then continues executing. Asynchronous mode works more like an XMLHttpRequest (XHR) call in a browser.
The RPC call sequence (the term "stub" is likely borrowed from Java RMI):
- The client sends a request (call) to the client stub.
- The client stub packs the request parameters (marshalling) and issues a system call; the OS sends the message to the server.
- On receiving the message, the server passes it to the server stub, which unpacks it (unmarshalling).
- The server stub calls the subroutine on the server side; after processing, the result is sent back to the client the same way.
Note: the server stub is also called a skeleton.
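The call sequence above can be sketched in a few lines. This is a hypothetical, in-process illustration: a real RPC would send the marshalled bytes over a socket, but here the "network" is just a byte string handed from the client stub to the server stub, with JSON standing in for the wire format.

```python
import json

def subroutine_add(a, b):
    """The 'subroutine' living on the server side."""
    return a + b

# Server-side dispatch table: procedure name -> callable (the skeleton's registry).
PROCEDURES = {"add": subroutine_add}

def server_stub(message):
    """Server stub (skeleton): unmarshal, dispatch, marshal the result."""
    call = json.loads(message.decode())                   # unmarshalling
    result = PROCEDURES[call["proc"]](*call["args"])      # call the subroutine
    return json.dumps({"result": result}).encode()        # marshal the reply

def client_stub(proc_name, *args):
    """Client stub: marshal the call, 'send' it, unmarshal the reply."""
    request = json.dumps({"proc": proc_name, "args": args}).encode()  # marshalling
    response = server_stub(request)   # stands in for the OS send + network hop
    return json.loads(response.decode())["result"]        # unmarshalling

print(client_stub("add", 2, 3))  # → 5
```

From the caller's point of view, `client_stub("add", 2, 3)` looks like an ordinary local call; the marshalling and the (simulated) network hop are hidden inside the stubs, which is exactly the point of the pattern.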
What is a stub?
A stub is a piece of code that converts the parameters passed during an RPC. Its work includes handling byte-order (big-endian vs. little-endian) differences between operating systems. By convention, the client-side piece is called the stub and the server-side piece the skeleton.
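The byte-order problem the stub solves can be shown concretely. This sketch assumes a wire format that fixes big-endian ("network byte order") for a 32-bit integer, using Python's `struct` module:

```python
import struct

# A 32-bit integer parameter; 0x01000002 has distinct high and low bytes,
# so byte order is visible in the packed form.
value = (1 << 24) | 2

# Marshal: the sending stub always packs big-endian (">I") for the wire,
# regardless of the host CPU's native byte order.
wire = struct.pack(">I", value)
assert wire == b"\x01\x00\x00\x02"

# Unmarshal: the receiving stub unpacks with the same fixed order, so a
# little-endian host and a big-endian host agree on the value.
(decoded,) = struct.unpack(">I", wire)
print(decoded == value)  # → True
```

Because both stubs agree on the wire's byte order, neither side needs to know what CPU the other is running on.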
How stubs are produced: 1) written by hand, which is tedious; 2) generated automatically, using an IDL (Interface Description Language) to define the client/server interface.
The interaction is usually standardized with an IDL; for example, Sun RPC's rpcgen tool generates stub code from an IDL definition.
RPC-related implementations
- Java RMI
- XML-RPC: XML over HTTP for calls between machines
- JSON-RPC
- SOAP, an evolution of XML-RPC
- Facebook Thrift
- CORBA
- AMF (Adobe Flex)
- Libevent, a framework for building RPC servers and clients
- WCF, from Microsoft
- .NET Remoting, gradually replaced by WCF
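To make one of these concrete, the messages in a JSON-RPC 2.0 exchange can be built by hand; the method name `subtract`, its parameters, and the request id below are illustrative. In practice these bytes would travel over HTTP or a raw socket rather than a direct function call.

```python
import json

# An illustrative JSON-RPC 2.0 request: call "subtract" with [42, 23].
request = {
    "jsonrpc": "2.0",
    "method": "subtract",
    "params": [42, 23],
    "id": 1,
}

def handle(raw):
    """A minimal server-side handler supporting just this one method."""
    req = json.loads(raw)
    assert req["method"] == "subtract"
    a, b = req["params"]
    # The response echoes the request id so the client can match replies to calls.
    return json.dumps({"jsonrpc": "2.0", "result": a - b, "id": req["id"]})

reply = json.loads(handle(json.dumps(request)))
print(reply["result"])  # → 19
```

The `id` field is what lets JSON-RPC work asynchronously: a client can have several requests in flight and pair each response with its call.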
Most companies are familiar with the concept of big data, but few people understand the technology behind it. On one hand, much of the software in use is simple and specialized, and operators only need to know how to use it. With a business-intelligence tool such as FineBI, for example, the operator does not need to understand the internals, or even how to model the data, which saves communication time during project implementation and brings more benefit to the enterprise.
On the other hand, many managers are not versed in the technology, and understanding it has little practical effect for them; but for those who work with this software regularly, understanding the technology behind it is a real help in their work. So, what technologies are used behind big data?
1. NoSQL databases
In today's environment, new technologies do not take long to be adopted; many are in wide use within a month of appearing. Broadly speaking, "NoSQL" itself covers many technologies. They target areas where relational database engines run into limits, such as large-scale indexing, streaming media, and high-traffic web services, and it is in these areas that NoSQL databases are used most.
2. Hadoop MapReduce
This technology can handle the challenges posed by big-data analysis; it is not only widely applied but also has unique strengths in processing. Many companies consider data platforms built on Hadoop MapReduce to be the best available, and the technology can indeed bring unexpected benefits to an enterprise.
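The MapReduce model itself is simple enough to sketch in a few lines. This is a toy, single-process illustration of the classic word-count pattern, not how Hadoop is actually implemented: each document is mapped to (word, 1) pairs, the pairs are shuffled (grouped) by key, and each group is reduced by summing.

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word, 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine each key's values; for word count, just sum them."""
    return {key: sum(values) for key, values in groups.items()}

documents = ["big data big deal", "data platform"]
counts = reduce_phase(shuffle(chain.from_iterable(map(map_phase, documents))))
print(counts["big"], counts["data"])  # → 2 2
```

The point of the model is that `map_phase` runs independently per document and `reduce_phase` independently per key, so both phases parallelize across machines with only the shuffle in between.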
3. In-memory analytics
Memory was expensive when it first appeared, but as the technology advanced, capacities grew and prices fell again and again, while performance kept climbing. This is why in-memory computing has become so popular.
Moreover, practitioners note that low-cost memory in big-data centers brings real-time, high-efficiency advantages and can improve big-data insight, giving enterprises better data analysis and mining.
4. Integrated appliances
Business intelligence and big-data analysis only took off after data warehouse appliances emerged. Using data warehouse technology to strengthen their competitive advantage and stay ahead of rivals has pleased many companies. Integrated appliances offer many other capabilities as well; the one enterprises use most is their ability to augment traditional database systems. They have thus become an important tool for enterprises coping with data challenges, which is why this technology attracts so much attention.