Clustered NAS technology architecture

Reprinted from: https://blog.csdn.net/liuaigui/article/details/6422700

1 What is clustered NAS?
A cluster is a loosely coupled collection of computing nodes that work together to provide a service. Clusters are commonly divided into high-performance clusters (HPC), high-availability clusters (HAC), and load-balancing clusters (LBC). Clustered NAS refers to multiple cooperating nodes (commonly called NAS heads) that together provide a high-performance, highly available, or load-balanced NAS (NFS/CIFS) service.

Unstructured data is growing rapidly; an IDC research report projected that by 2012 unstructured data would account for more than 80% of all stored data. Clustered NAS is a scale-out storage architecture whose capacity and performance scale linearly, and it has been accepted by the global market. EMC's acquisition of Isilon, HP's acquisition of IBRIX, and Dell's acquisition of Exanet, together with IBM's launch of SONAS and NetApp's release of Data ONTAP 8, show that clustered NAS has become one of the mainstream storage technologies. In China there are also clustered NAS solutions such as UIT UFS, LoongStore, CZSS, and YFS. Clustered NAS has huge market potential and will gradually see wide use in high-performance computing (HPC), broadcasting and IPTV, video surveillance, and cloud storage.

2 Three mainstream clustered NAS architectures
Viewed as a whole, a clustered NAS system consists of a storage subsystem, the NAS cluster (the NAS heads), clients, and the network. The storage subsystem may be a storage area network (SAN), direct-attached storage (DAS), or object storage devices (OSD). With SAN and DAS, the back-end storage media must be managed by a storage cluster, and a SAN file system or cluster file system presents a standard file access interface to the NAS cluster. In the OSD-based architecture, the NAS cluster manages the metadata while clients exchange data directly with the OSD devices; this is parallel NAS, i.e. pNFS/NFSv4.1. The NAS cluster acts as an NFS/CIFS gateway that provides standard file-level NAS service to clients. In the SAN and DAS architectures the NAS cluster handles both metadata and I/O data access, whereas in the OSD architecture it handles metadata access only. Depending on the back-end storage subsystem, clustered NAS can therefore be divided into three architectures: SAN shared storage, cluster file system, and pNFS/NFSv4.1.

(1) SAN shared storage architecture
This architecture (Figure 1) uses a SAN as the back-end storage. All NAS cluster nodes are connected to the SAN over optical fiber and share all storage devices. The SAN is usually managed by a parallel file system, which exports a POSIX interface to the NAS cluster. A SAN parallel file system typically needs a metadata controller (MDC), which may be a dedicated server or may be fully distributed across the SAN clients. The NAS cluster nodes install the SAN file system client to gain concurrent access to the shared storage, and then run NFS/CIFS services for the clients. The front end here is an Ethernet network, and the back-end storage connection is the SAN.

Figure 1 Clustered NAS architecture with SAN shared storage

Because the back end is a high-performance SAN, this kind of clustered NAS can deliver stable, high bandwidth and IOPS, and storage capacity and performance can be scaled independently by adding disk arrays or NAS cluster nodes. A client can connect directly to a specific NAS cluster node, with cluster management software providing high availability; alternatively, DNS or LVS can provide load balancing and high availability, with clients connecting through a virtual IP. SAN storage networks and parallel file systems are expensive, so the main drawback of this architecture is high cost; it also inherits the weaknesses of SAN storage, such as complex deployment and management and limited scalability. Typical examples of this architecture are IBM SONAS (Figure 2) and Symantec FileStore.

Figure 2 SONAS
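
The load-balancing idea described above, where clients are spread across NAS heads and a failed head is skipped, can be illustrated with a minimal Python sketch. This is not how DNS round-robin or LVS is implemented; the class name NasHeadPool, its methods, and the host names are hypothetical and exist only for illustration.

```python
from itertools import cycle


class NasHeadPool:
    """Hypothetical round-robin pool of NAS head addresses (illustration only)."""

    def __init__(self, heads):
        self.heads = list(heads)
        self.healthy = set(self.heads)
        self._ring = cycle(self.heads)

    def mark_down(self, head):
        # A real deployment would learn this from a health check or heartbeat.
        self.healthy.discard(head)

    def mark_up(self, head):
        if head in self.heads:
            self.healthy.add(head)

    def pick(self):
        """Return the next healthy head in round-robin order."""
        if not self.healthy:
            raise RuntimeError("no healthy NAS head available")
        # At most one full turn of the ring is needed to find a healthy head.
        for _ in range(len(self.heads)):
            head = next(self._ring)
            if head in self.healthy:
                return head
        raise RuntimeError("no healthy NAS head available")


pool = NasHeadPool(["nas1", "nas2", "nas3"])
first_round = [pool.pick() for _ in range(3)]   # each head is used once
pool.mark_down("nas2")                          # simulate a failed head
after_failure = [pool.pick() for _ in range(4)]
```

After nas2 is marked down, new connections are handed only to the remaining heads, which is the behavior a virtual IP with health checking aims for.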

(2) Cluster file system architecture
This architecture (Figure 3) uses DAS as the back-end storage. Each storage server has its own directly attached storage, typically a group of SATA disks. The cluster file system manages the physically distributed storage as a whole and presents it as a file system with a single namespace. In effect, the cluster file system combines the functions of RAID, volume management, and file system. Most current cluster file systems need a dedicated metadata server or a distributed metadata service cluster to provide metadata control and the unified namespace; there are exceptions, such as GlusterFS, whose architecture has no metadata service. The NAS cluster nodes install the cluster file system client to access the global storage space, and run NFS/CIFS services to serve external clients. The NAS cluster usually runs on the same physical nodes as the metadata service cluster or the storage node cluster, which reduces the number of physical nodes deployed but has some impact on performance. Unlike the SAN architecture, the cluster file system may share the TCP/IP network with the NAS service, so the two can interfere with each other and cause I/O performance jitter. Some cluster file systems, such as Isilon, interconnect storage nodes over InfiniBand, which removes this interference and keeps performance stable.

Figure 3 Clustered NAS architecture with a cluster file system
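
To make the idea of a cluster file system without a metadata service more concrete, here is a minimal Python sketch in which every client computes a file's location from its path alone, so no central lookup is needed. This is a simplified illustration and not GlusterFS's actual placement algorithm; the function name locate and the node names are assumptions.

```python
import hashlib


def locate(path, nodes):
    """Return the storage node responsible for `path`, derived from the path alone."""
    digest = hashlib.sha1(path.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


nodes = ["store1", "store2", "store3", "store4"]   # hypothetical storage servers
paths = [f"/video/clip{i:04d}.mp4" for i in range(1000)]

# Every client that runs the same function gets the same answer,
# so no metadata server has to be asked where a file lives.
placement = {p: locate(p, nodes) for p in paths}
used_nodes = set(placement.values())
```

A plain modulo like this remaps most files when a node is added, so a real system would need a more careful scheme; the sketch only shows why a lookup service is not strictly required.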

In this architecture, clustered NAS is expanded by adding storage nodes, which usually increases storage capacity and performance at the same time; many systems achieve near-linear scaling. Clients access the clustered NAS in the same way as in the first architecture, and load balancing and availability can be handled similarly. Because the servers and storage media can be inexpensive commodity hardware, this architecture has a large cost advantage and can grow to a large scale. However, such hardware fails easily, and a failed disk or server can make some data unavailable, so an HA mechanism is needed to keep servers available and replication is needed to keep data available, which tends to reduce system performance and storage utilization. In addition, because there are many server nodes, this architecture is less suited to being packaged as a product and may be better suited to delivery as a storage solution. Typical examples of this architecture include EMC Isilon, LoongStore, CZSS, YFS, and GlusterFS (Figure 4).

Figure 4 GlusterFS architecture
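
The cost of replication in storage utilization mentioned above is simple arithmetic, sketched here in Python. The node counts and disk sizes are made-up example values, and the function names are hypothetical; full-copy replication is assumed, with no erasure coding or file system overhead.

```python
def usable_capacity_tb(nodes, disks_per_node, disk_tb, replicas):
    """Usable capacity when every piece of data is stored `replicas` times."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    raw_tb = nodes * disks_per_node * disk_tb
    return raw_tb / replicas


def utilization(replicas):
    """Fraction of raw capacity that holds unique data."""
    return 1 / replicas


# Example values only: 10 storage nodes, 12 disks each, 2 TB per disk.
two_copies = usable_capacity_tb(10, 12, 2, replicas=2)
three_copies = usable_capacity_tb(10, 12, 2, replicas=3)
```

With 240 TB of raw disk in this example, two copies leave half of it usable and three copies leave a third, which is the storage utilization penalty the text refers to.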

(3) pNFS / NFSv4.1 architecture
This architecture (Figure 5) is in fact parallel NAS, i.e. pNFS/NFSv4.1, standardized in RFC 5661, which was approved in January 2010. Its back-end storage uses object storage devices (OSD), and it supports multiple data access protocols (FC/NFS/OSD). Clients read and write data by talking directly to the OSD devices, unlike the two architectures above, in which data must be relayed through the NAS cluster. Here the NAS cluster serves only as the metadata service, while I/O data is handled by the OSDs, so metadata and data are separated. This architecture is closer to a native parallel file system: the system architecture is simpler, performance is greatly improved, and scalability is very good.

Figure 5 pNFS / NFSv4.1 clustered NAS architecture
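
The separation of metadata and data described above can be sketched as a small in-memory Python simulation: the client asks the metadata server only for a layout, then reads the stripes directly from the data servers in parallel. This is not the NFSv4.1 wire protocol; the classes, the tiny stripe size, and the round-robin striping are assumptions chosen for illustration, loosely modeled on the idea behind the LAYOUTGET operation.

```python
from concurrent.futures import ThreadPoolExecutor

STRIPE = 4  # bytes per stripe unit; tiny so the example is easy to follow


class DataServer:
    """Stands in for an OSD: stores stripe units, knows nothing about files."""

    def __init__(self):
        self.chunks = {}  # (path, stripe_index) -> bytes

    def write(self, path, index, data):
        self.chunks[(path, index)] = data

    def read(self, path, index):
        return self.chunks[(path, index)]


class MetadataServer:
    """Holds only file sizes and layouts, never file data."""

    def __init__(self, data_servers):
        self.data_servers = data_servers
        self.sizes = {}

    def create(self, path, size):
        self.sizes[path] = size

    def get_layout(self, path):
        # Round-robin striping: stripe i lives on data server i mod N.
        count = (self.sizes[path] + STRIPE - 1) // STRIPE
        n = len(self.data_servers)
        return [(i, self.data_servers[i % n]) for i in range(count)]


def write_file(mds, path, data):
    mds.create(path, len(data))
    for index, server in mds.get_layout(path):
        server.write(path, index, data[index * STRIPE:(index + 1) * STRIPE])


def read_file(mds, path):
    layout = mds.get_layout(path)  # the only metadata round trip
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Data comes straight from the data servers, in parallel;
        # map() returns results in layout order.
        parts = pool.map(lambda item: item[1].read(path, item[0]), layout)
        return b"".join(parts)


servers = [DataServer() for _ in range(3)]
mds = MetadataServer(servers)
payload = b"hello parallel NAS"
write_file(mds, "/demo.txt", payload)
result = read_file(mds, "/demo.txt")
```

In the gateway-style architectures (1) and (2), every byte would instead pass through a NAS head; here the metadata server is off the data path.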

Clearly this architecture is fundamentally different from the first two. pNFS uses a metadata cluster to remove the single point of failure and performance bottleneck of conventional NAS, and separating metadata from data addresses performance and scalability. This is true parallel NAS, and pNFS is the real future of clustered NAS. However, the pNFS standard was approved only about a year ago and there is not yet a mature product implementation, and OSD storage devices, despite years of development, have not been widely accepted or adopted. PanFS from Panasas (Figure 6) is probably the closest to this clustered NAS architecture; Panasas is, after all, one of the main authors of the pNFS standard. Many companies, such as BlueArc, are currently developing pNFS storage products, and we predict that products will reach the market by 2012.

Figure 6 PanFS architecture

3 Open source solutions
The clustered NAS products and solutions mentioned above are mostly commercial implementations and relatively expensive. Some users may want to build clustered NAS from open source software; is there such a solution? The core of clustered NAS is the underlying parallel file system, cluster file system, or pNFS protocol. The following briefly describes open source support for and implementations of clustered NAS.
(1) SAN shared storage architecture: Red Hat GFS is an open source SAN shared file system that also supports DAS connections; integrating it with NFS/Samba services yields a clustered NAS.
(2) Cluster file system architecture: Lustre, Gluster, PVFS2, and Ceph are all excellent cluster file systems, and Gluster is itself a complete clustered NAS system. Following Gluster's approach, a cluster file system can provide NAS service through an NFS/Samba gateway to form a clustered NAS.
(3) pNFS / NFSv4.1 architecture: pNFS source code has been integrated into the Linux kernel, but it is still experimental. There are also very few open source OSD implementations; GFS2 supports pNFS. Users who like to try new things may give it a try, but should be cautious about using it in production.


Origin blog.csdn.net/happygirl_wxq/article/details/105339804