The Evolution of Integrated Lake-Warehouse Storage Architecture

Table of contents

01 Evolution of the integrated lake-warehouse storage architecture

1. Evolution stages of storage architecture

2. Characteristics of HDFS

3. Characteristics of object storage

02 Comparison of different types of storage systems

1. Achilles' heel of HDFS - NameNode

2. Achilles' heel of object storage - metadata

3. Object storage metadata performance and API limitations

03 Exploring future storage options for the lake-warehouse architecture

1. Key technical points

2. JuiceFS

04 The integrated lake-warehouse architecture in practice on JuiceFS

1. Integrated lake-warehouse architecture

2. Metadata performance comparison

3. Data query performance comparison

05 Q&A

Q1: Does the fact that HDFS is not a storage-computing separation architecture mean that it will be eliminated?

Q2: S3's flat KV metadata design has inherent flaws; does that mean object storage will be eliminated too?

Q3: Which scenarios suit storage-computing separation, and which suit storage-computing coupling?

Q4: What are the differences between JuiceFS and Alluxio?

Q5: Is the storage-computing separation architecture less efficient than the integrated architecture?



01

Evolution of the integrated lake-warehouse storage architecture

1. Evolution stages of storage architecture

The evolution of big data storage systems can be divided into two stages: the data center era and the cloud computing era.

The first stage dates back to the era when Hadoop was born. Systems in this era ran mainly in self-managed data centers, and HDFS was essentially the only storage option.

With the popularization and development of cloud computing, object storage has gradually become the mainstream storage solution for enterprises. In the data lake architecture especially, object storage has become a popular underlying storage choice thanks to its high scalability and support for diverse data types. We will review and compare the architectures of HDFS and object storage, discuss their respective advantages, disadvantages, and development trends, and also discuss how a cloud-native data lake storage architecture should be designed.

If you analyze the architectures of HDFS and object storage at their essence, you will find that they are two completely different storage systems. What characteristics should the storage system of the cloud-native era have in order to provide a storage base for integrated big data lakes and warehouses? We will discuss this later.

2. Characteristics of HDFS

HDFS originated from GFS (Google File System) and was officially released in 2006. Its characteristics include independent metadata storage (the NameNode), multiple replicas, and storage-computing coupling. The overall HDFS design is considerably friendlier to large files than to small ones. In terms of scale, a single namespace generally handles on the order of 100 million files; storing more data than that requires architectural adjustments for scalability.

3. Characteristics of object storage

Object storage, in the form of S3, was also released in 2006, but its original goal was to store massive amounts of unstructured data, not to serve the big data ecosystem, so its architecture is completely different from that of HDFS. Its main characteristics are low storage cost and sufficiently high data durability. Its API is based on the HTTP protocol, and its metadata is a flat KV structure, a design that causes some problems in big data scenarios. Like HDFS, it does not support modifying data in place. In terms of consistency, object storage is eventually consistent, although some cloud vendors or certain interfaces can achieve strong consistency.

02

Comparison of different types of storage systems

Let's compare the characteristics of HDFS and object storage across the dimensions mentioned above.

First of all, storage scale directly determines how much business data the data platform, or the entire big data system, can support. Compared with HDFS, object storage can reach a much larger scale. In daily data tasks such as ETL or ad-hoc queries, metadata operation performance affects overall task efficiency, and here object storage is far weaker than HDFS. Operation and maintenance complexity is also a key factor in storage selection: the maintenance cost of HDFS is much higher than that of object storage, including not only labor but also the additional components needed to ensure high availability and scalability.

1. Achilles' heel of HDFS - NameNode

The biggest bottleneck of HDFS is the NameNode. The NameNode was originally designed to imitate the GFS Master, which is a single-point design, so HDFS inherits the same problem.

To address the scalability of the single-point NameNode, the community has made several attempts, such as the ViewFs + Federation solution and, later, RBF (Router-Based Federation). These essentially solve the scalability limit of the NameNode within a single namespace by scaling horizontally across multiple clusters, so that the storage capacity of an HDFS deployment is not severely constrained.

Beyond storage scale, another problem is high availability. Within a single cluster the NameNode is a single point of failure, which is why the Standby NameNode and, later, the JournalNode were introduced. Ultimately, all of these components and designs exist to keep the HDFS storage cluster highly available, so that the entire big data cluster does not become unavailable the moment the active NameNode goes down.

2. Achilles' heel of object storage - metadata

The problem with object storage is also related to metadata. As mentioned earlier, the metadata of object storage is a flat KV structure. The foo directory in the figure above has many subdirectories and files, but from the perspective of object storage it is not a directory at all. A traditional file system presents a tree-shaped directory structure; object storage has no directory tree in its design, only a one-dimensional flat KV structure in which the "/" separator is used to simulate a directory tree. The sketch below shows how one level of that simulated tree is listed.
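
To make the flat KV structure concrete, here is a minimal Python sketch using boto3 against a hypothetical bucket named "my-bucket"; the key names are assumptions for illustration.

```python
import boto3

# The bucket holds only flat keys such as "foo/a.txt" and "foo/sub/b.txt";
# "foo/" is a shared key prefix, not a real directory.
s3 = boto3.client("s3")

# Listing with Delimiter="/" is how clients simulate one level of a
# directory tree on top of the one-dimensional key space.
resp = s3.list_objects_v2(Bucket="my-bucket", Prefix="foo/", Delimiter="/")

for obj in resp.get("Contents", []):       # "files" directly under foo/
    print("file:", obj["Key"])
for cp in resp.get("CommonPrefixes", []):  # simulated "subdirectories"
    print("dir:", cp["Prefix"])
```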

If you want to rename a directory in object storage, for example rename the foo directory to bar, the implementation involves several steps: first, recursively copy the data, which for a deep directory means copying the entire directory's contents to new locations prefixed with bar; second, update some internal indexes; finally, delete all the data under the original foo prefix.

Although the whole process looks like a single rename operation from the outside, it is split into several large steps inside the object store, and none of them is particularly lightweight, especially the first and third.

A natural question follows: during this process, especially when renaming a directory with a large amount of data, how is consistency guaranteed? In fact, object storage cannot guarantee it. Many community components try to solve this, or to optimize on top of the object storage interfaces, but they can only reduce the chance of inconsistency, not eliminate it, because design problems in the object storage architecture cannot simply be bypassed by outer-layer components. The sketch below shows why the rename is so heavyweight.
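
Here is a minimal boto3 sketch, again assuming a bucket named "my-bucket", of what "rename foo/ to bar/" actually costs. There is no atomic directory rename: every object is copied and then deleted, so a failure midway leaves the bucket half-renamed.

```python
import boto3

s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")

for page in paginator.paginate(Bucket="my-bucket", Prefix="foo/"):
    for obj in page.get("Contents", []):
        old_key = obj["Key"]
        new_key = "bar/" + old_key[len("foo/"):]
        # Step 1: recursively copy every object to the bar/ prefix;
        # cost is proportional to data volume. (Objects over 5 GB would
        # need multipart copy, omitted in this sketch.)
        s3.copy_object(
            Bucket="my-bucket",
            Key=new_key,
            CopySource={"Bucket": "my-bucket", "Key": old_key},
        )
        # Step 2 (updating internal indexes) happens inside the service.
        # Step 3: delete everything under the original foo/ prefix.
        s3.delete_object(Bucket="my-bucket", Key=old_key)
```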

3. Object storage metadata performance and API limitations

Among data lake components, both Hudi and Iceberg have called out problems with object storage. For example, the table above appears in the Hudi documentation: when listing a directory on object storage, latency grows as the number of files in the directory increases, which is why the Hudi community designed the Metadata Table; a hedged PySpark sketch of enabling it follows.
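
As an illustration (not taken from the talk), enabling Hudi's Metadata Table from PySpark is a single write option; the path, table name, and DataFrame below are placeholders, and a real job would also set record key and precombine options.

```python
# Assumes a SparkSession with the Hudi bundle on the classpath and an
# existing DataFrame `df`; names and paths are placeholders.
hudi_options = {
    "hoodie.table.name": "events",
    # Serve file listings from Hudi's internal Metadata Table instead of
    # issuing slow LIST calls against the object store.
    "hoodie.metadata.enable": "true",
}

(df.write.format("hudi")
    .options(**hudi_options)
    .mode("append")
    .save("s3a://my-bucket/warehouse/events"))
```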

Iceberg's documentation also mentions that object storage imposes API QPS limits on a given fixed key prefix, which affects frequently accessed directories and large data sets: once the QPS limit is hit, task stability suffers directly. This limit is especially easy to hit as your data platform grows. Iceberg therefore provides ObjectStorageLocationProvider to mitigate the problem. It is essentially a workaround that prepends a random hash to each key, spreading the prefix-based QPS limits across many different prefixes so that the limit is not reached as quickly; the sketch below illustrates the idea.
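
The idea behind ObjectStorageLocationProvider can be sketched in a few lines; the layout below is illustrative, not Iceberg's exact key format.

```python
import hashlib

def hashed_location(table_location: str, file_name: str) -> str:
    """Prepend a short hash so hot keys spread across many prefixes."""
    digest = hashlib.sha256(file_name.encode()).hexdigest()[:8]
    return f"{table_location}/{digest}/{file_name}"

# Two files that would otherwise share one rate-limited prefix now land
# under different prefixes, so per-prefix QPS limits are hit much later.
print(hashed_location("s3://my-bucket/db/events", "part-00001.parquet"))
print(hashed_location("s3://my-bucket/db/events", "part-00002.parquet"))
```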

03

Exploring future storage options for the lake-warehouse architecture

Suppose we started designing from scratch today, building on the current cloud computing and cloud-native environment a storage system suited to both the present and the future. What capabilities should it have?

1. Key technical points

Here are some points that are important for most scenarios:

First and foremost is scalability. As big data platform architecture keeps evolving from data warehouses to data lakes to lake-warehouses, the requirements on storage capacity keep growing: the system must not only meet the storage needs of a traditional data warehouse but also cover more and more businesses. Data currently managed in different storage systems may need to be stored and managed in one place in the future to avoid data silos.

The second very important point is high availability. The entire system must be highly available; scalability without high availability cannot meet production needs.

The third point is performance. Storing large volumes of data must not come at the expense of performance; certain optimization strategies are needed to guarantee both metadata performance and data read/write performance.

Next is making the most of the cloud's strengths. The biggest characteristic of the cloud is elasticity, for both computing and storage. If the storage architecture cannot adapt to the cloud's elastic scaling, and instead carries over the old data center storage design, the cloud's characteristics cannot be fully exploited.

The cloud is naturally a storage-computing-separated environment. How to separate the computing and storage components across the whole ecosystem, while preserving overall computing efficiency and scalability after the separation, is something the underlying storage system must consider.

Small file management is more of a data lake concern. Traditional big data platforms mostly deal with large files, but the small file problem will become increasingly prominent as data from AI and other fields comes in.

Each of the key technical points above has corresponding solutions, as shown in the figure above.

2. JuiceFS

JuiceFS, introduced here today, is positioned as a strongly consistent distributed file system; its design goal can be understood as a complete replacement for HDFS.

The overall architecture of JuiceFS, much like HDFS, is divided into three parts: metadata storage, data storage, and client interfaces. In practice, though, each part has some different design choices.

First, metadata in JuiceFS is a pluggable engine design: different databases, such as Redis, MySQL, or TiKV, can be chosen for different scenarios. You can pick the database best suited to your business scenario and needs as the JuiceFS metadata engine; as the sketch below shows, switching engines is just a matter of the metadata URL.
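
As a sketch of the pluggable design, creating a volume only changes the metadata URL passed to `juicefs format`. The bucket, credentials, and volume name below are placeholders, and the commands assume the juicefs CLI is installed; URL formats follow the JuiceFS docs.

```python
import subprocess

# Candidate metadata engines; only the URL changes.
META_URLS = {
    "redis": "redis://localhost:6379/1",                    # low latency
    "mysql": "mysql://user:pass@(localhost:3306)/juicefs",  # easy to operate
    "tikv":  "tikv://localhost:2379/myjfs",                 # horizontal scaling
}

# Format a volume named "myjfs" backed by S3, with Redis as the metadata
# engine; swap META_URLS["redis"] for another engine as needed.
subprocess.run([
    "juicefs", "format",
    "--storage", "s3",
    "--bucket", "https://my-bucket.s3.amazonaws.com",
    META_URLS["redis"],
    "myjfs",
], check=True)
```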

For data storage, JuiceFS is mainly based on object storage, which maximizes the use of storage resources on the cloud. All data lands in object storage, but it is not written there naively: data written through the JuiceFS client goes through format processing (similar to the block design of HDFS DataNodes). JuiceFS also avoids depending on the object store's own metadata, which, as discussed earlier, has many consistency and performance problems; instead, all metadata requests are served by its own independent metadata storage. A conceptual sketch of the format processing follows.
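
This is a conceptual sketch only, using JuiceFS's documented defaults of 64 MiB chunks and 4 MiB blocks; the key layout is illustrative, not the client's actual naming scheme.

```python
CHUNK_SIZE = 64 << 20  # 64 MiB logical chunks (JuiceFS documented default)
BLOCK_SIZE = 4 << 20   # 4 MiB physical objects uploaded to object storage

def object_keys(volume: str, inode: int, file_size: int):
    """Yield illustrative object-store keys for a file of file_size bytes."""
    n_chunks = (file_size + CHUNK_SIZE - 1) // CHUNK_SIZE
    for c in range(n_chunks):
        chunk_len = min(CHUNK_SIZE, file_size - c * CHUNK_SIZE)
        n_blocks = (chunk_len + BLOCK_SIZE - 1) // BLOCK_SIZE
        for b in range(n_blocks):
            # The keys carry no directory semantics; the whole directory
            # tree and file attributes live in the metadata engine.
            yield f"{volume}/chunks/{inode}/{c}/{b}"

# A 130 MiB file maps to 3 chunks and 33 blocks in this model.
print(sum(1 for _ in object_keys("myjfs", inode=42, file_size=130 << 20)))
```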

The metadata engine scales horizontally and can easily store massive numbers of files (on the order of tens of billions), whether large or small. JuiceFS users have indeed reached the tens-of-billions scale in production, which demonstrates the system's excellent scalability.

Caching is a very important feature, especially in storage-computing-separated scenarios. If you upgrade to a storage-computing-separated architecture, you must consider caching, both metadata caching and data caching; a mount sketch with caching enabled follows.
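
A hedged sketch of turning on local caching when mounting; the paths and cache size are placeholders (--cache-size is in MiB), and the flags follow the juicefs CLI documentation.

```python
import subprocess

# Mount the volume with a local disk cache so hot data is read at local
# I/O speed even though the authoritative copy lives in object storage.
subprocess.run([
    "juicefs", "mount",
    "--cache-dir", "/var/jfsCache",  # local directory used as block cache
    "--cache-size", "102400",        # cache capacity in MiB (here 100 GiB)
    "redis://localhost:6379/1",      # metadata engine chosen at format time
    "/mnt/jfs",
], check=True)
```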

04

The integrated lake-warehouse architecture in practice on JuiceFS

1. Integrated lake-warehouse architecture

This is the overall architecture diagram. The bottom layer contains not only traditional structured data but also a lot of semi-structured and unstructured data. The storage layer is a unified storage base provided by JuiceFS combined with object storage.

One layer up is the data management layer, such as Delta Lake, Hudi, or Iceberg, which manages the data warehouse on top of the underlying storage system: JuiceFS, HDFS, or object storage.

On top of that, various query engines, BI tools, and computing engines connect to the different management-layer components.

2. Metadata performance comparison

The figure above compares object storage (represented by OSS) with HDFS and JuiceFS. On the left is metadata latency, where lower is better; on the right is throughput, where higher is better.

Look first at the blue object storage bars: compared with the other two systems, its latency varies greatly across different metadata operations. In particular, the Rename operation discussed above is dramatically slower on object storage than on the other two. HDFS and JuiceFS are architecturally similar, both having independent metadata storage, so both achieve good latency and throughput; in the figure on the right, JuiceFS even beats HDFS on throughput.

3. Data query performance comparison

The figure above compares data query performance. On the left is a TPC-DS query test run with Spark on object storage and JuiceFS; on the right is a test run with Presto on HDFS and JuiceFS.

As the left figure shows, JuiceFS can roughly double query performance compared with using object storage directly, and the improvement is even more obvious on relatively complex queries (such as query 9).

As the right figure shows, once JuiceFS is fully cached, the storage-computing-separated architecture can match HDFS query performance.

Overall, then, whether in metadata performance tests or TPC-DS-based query tests, JuiceFS achieves performance on par with HDFS and holds a significant advantage over object storage.

05

Q&A

Q1: Does the fact that HDFS is not a storage-computing separation architecture mean that it will be eliminated?

A1: I think HDFS will definitely not be eliminated in the short to medium term, because it is still a de facto standard in the big data ecosystem. But once big data platforms move to the cloud, everyone rethinks how to build the platform's storage system there. Some companies simply lift and shift their entire data center infrastructure to the cloud. The cost of doing so may actually be higher than staying in the data center; the overall architecture is at least proven, but it defeats the original purpose of going to the cloud because it cannot exploit the cloud's characteristics. If you move an HDFS-based storage-computing-coupled architecture to the cloud as-is, you get none of the cloud's benefits. This is why many companies choose object storage or a new-generation storage system like JuiceFS.

Q2: S3's flat KV metadata design has inherent flaws; does that mean object storage will be eliminated too?

A2: It is a matter of positioning. S3 was not designed for the big data ecosystem; its positioning is very clear: store massive unstructured data, scale easily, be reliable enough that data is never lost, and keep storage cost low enough (which is why it uses technologies like erasure coding). These are S3's strengths, and wherever those requirements apply it fits very well. What makes scenarios like big data and AI special is that they demand more than storage scale: performance requirements, overall impact on the business, and so on. We believe S3, and object storage in general, cannot meet these more demanding requirements on its own, so new designs or new systems are needed, built as further improvements and upgrades on top of object storage. S3 itself is certainly not going to be eliminated.

Q3: Which scenarios suit storage-computing separation, and which suit storage-computing coupling?

A3: Generally speaking, choosing between storage-computing separation and storage-computing coupling depends mostly on your current overall architecture. If you are building an entire data platform or AI platform on the cloud, you will naturally want to make the most of cloud resources, and maximizing the elasticity of those resources means moving toward storage-computing separation. Even if you start with a coupled architecture, in the long run it still needs to move toward separation. If there is no need to go to the cloud and you run your own private infrastructure in a data center, there is not much difference between separating and coupling; the bigger question is how to exploit the cloud's advantages once you adopt a storage-computing separation architecture.

Q4: What are the differences between JuiceFS and Alluxio?

A4: This is a good question. The official JuiceFS documentation has a dedicated article comparing JuiceFS and Alluxio; if you are interested, take a look. In short, JuiceFS is designed and positioned as a distributed file system, benchmarked against traditional distributed file systems such as HDFS. My understanding is that Alluxio is positioned more as a cache layer, whose goal is not to provide complete file system features: for example, it does not need to be fully POSIX-compatible, nor to provide features for non-big-data scenarios. JuiceFS, positioned as a distributed file system, must consider the many things a file system has to consider. Alluxio and JuiceFS may overlap in some features (such as caching), but that does not mean the two projects share the same design direction.

Q5: Is the storage-computing separation architecture less efficient than the integrated architecture?

A5: If you simply separate storage and computing without any optimization, efficiency will definitely drop. Even though network hardware has advanced considerably, it may still not match local I/O. So after separating storage and computing, you must consider how to further optimize performance, trying not to degrade the business or hurt its day-to-day metrics.
