MinIO: Introduction, Deployment, and Spring Boot Integration

I recently joined a project team that uses MinIO for file storage. To get up to speed, I read what I could find online and studied how the project uses MinIO, covering its features, application scenarios, storage architecture, and basic concepts. I then deployed a MinIO service locally and integrated it into a Spring Boot project. These notes record that process for my own later study and as a reference for others. The article inevitably has omissions, and corrections from readers are welcome!

1. MinIO basic information

MinIO is an open source object storage service written in Go. It was originally released under the Apache License v2.0 (releases since 2021 are licensed under GNU AGPLv3). It is suited to storing large volumes of unstructured data such as pictures, videos, log files, backup data, and container or virtual machine images. A single object can be any size, from a few KB up to a maximum of 5 TB. MinIO ships with a web console, and it can be used to build a cloud storage service compatible with the S3 protocol. It is much lighter than the Hadoop HDFS distributed storage service and supports single-node deployment.

Object storage:
OSS (Object Storage Service) is a massive, secure, low-cost, and highly reliable cloud storage service suitable for storing any type of file. Capacity and processing power scale elastically, and multiple storage types are available so that storage costs can be optimized.

2. MinIO Features

1) High-performance
MinIO describes itself as a leading object storage pioneer with millions of users worldwide. On standard hardware, its published read and write speeds reach 183 GB/s and 171 GB/s respectively.
Object storage can serve as the primary storage layer for complex workloads such as Spark, Presto, TensorFlow, and H2O.ai, and can replace Hadoop HDFS.
MinIO is used as primary storage for cloud-native applications, which need higher throughput and lower latency than traditional object storage provides. These are performance levels that MinIO can deliver.

2) Scalability
MinIO applies the hard-won lessons of web-scale companies to bring a simple scaling model to object storage, a philosophy the project calls "Scale Simple". Scaling starts with a single cluster, which can be federated with other MinIO clusters to create a global namespace that spans multiple data centers if needed. The namespace grows by adding more clusters and more racks until the target is reached.

3) Cloud-native support
MinIO was built from scratch over the past four years as cloud-native software. It follows cloud-native architecture and build practices and adopts current cloud computing technologies and concepts, including containers and Kubernetes orchestration, microservices, and multi-tenancy. This makes object storage more Kubernetes-friendly.

4) Compatible with Amazon S3
Amazon's S3 API is the de facto standard protocol for object storage and is recognized worldwide. MinIO adopted S3 compatibility very early and was the first product to support S3 Select. MinIO takes pride in the breadth of its compatibility, which has been recognized by more than 750 organizations, including Microsoft Azure, that use MinIO's S3 gateway. MinIO states that this number exceeds the total for all similar products combined.

5) Simplicity
Minimalism is MinIO's guiding design principle. Simplicity reduces the chance of error, increases uptime, improves reliability, and underpins performance. MinIO can be installed and configured in minutes by downloading a single binary and running it. The number of configuration options and variants is kept to a minimum, which brings the probability of a failed configuration close to zero. Upgrades are performed with a single command and complete without interruption or downtime, which lowers the total cost of use and operations.

3. Application scenarios

MinIO can serve not only as a private cloud object storage service but also as a gateway layer for cloud object storage, connecting seamlessly to Amazon S3 or Microsoft Azure.

4. Storage Architecture

MinIO provides a storage architecture for each kind of application scenario:

4.1 Single host, single hard disk mode

In this mode, MinIO runs on a single server and stores its data on a single disk. This mode has a single point of failure and is mainly used for development and testing. The start command is:

minio --config-dir ~/tenant1 server --address :9001 /disk1/data/tenant1

4.2 Single host, multiple hard disk mode

In this mode, MinIO runs on a single server, but the data is spread across multiple disks (at least 4), which provides data protection. The start command is:

minio --config-dir ~/tenant1 server --address :9001 /disk1/data/tenant1 /disk2/data/tena

4.3 Multi-host, multi-hard disk mode (distributed)

This is the most commonly used architecture for MinIO services. The servers share one access_key and secret_key, the service runs on multiple (2-32) servers, and the data is spread across multiple disks (at least 4, with no upper limit). This provides a fairly powerful data redundancy mechanism (Reed-Solomon erasure coding).

export MINIO_ACCESS_KEY=<TENANT1_ACCESS_KEY>
export MINIO_SECRET_KEY=<TENANT1_SECRET_KEY>
minio --config-dir ~/tenant1 server --address :9001 http://192.168.10.11/data/tenant1 ht

Distributed advantages:
In the big data field, the usual design approach is decentralized and distributed. MinIO's distributed mode helps you build a highly available object storage service, and you can use the storage devices regardless of their physical location.
1) Data protection
Distributed MinIO uses erasure coding to protect against multiple node failures and bit rot.
Distributed MinIO requires at least 4 hard disks, and running in distributed mode enables erasure coding automatically.
2) High availability
A stand-alone MinIO service is a single point of failure. By contrast, in a distributed MinIO deployment with N hard disks, your data is safe as long as N/2 hard disks are online.
However, you need at least N/2+1 hard disks online to create new objects.

For example, in a 16-node MinIO cluster with 16 hard disks per node, the cluster remains readable even if 8 servers go down, but at least 9 servers must be online to write data.

Note that you can combine different nodes and several disks per node, as long as you respect the constraints of distributed Minio.
For example, you can use 2 nodes with 4 hard drives each, or you can use 4 nodes with 2 hard drives each, and so on.

3) Consistency
In both distributed and stand-alone mode, all MinIO read and write operations strictly follow the read-after-write consistency model.
4) High data reliability
MinIO combines erasure coding with bit rot protection against data corruption, so its data reliability is high.
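
The availability rule above (N/2 disks online to read, N/2+1 to write) can be checked with a little arithmetic. The following is a minimal sketch, not MinIO code: the class QuorumSketch and its method names are hypothetical, and it applies the rule to the whole cluster exactly as the article states it.

```java
// Sketch of the N/2 read and N/2+1 write rule described above.
// QuorumSketch and its methods are hypothetical names, not part of MinIO.
class QuorumSketch {

    // Minimum number of online disks needed to read existing objects.
    static int readQuorum(int totalDisks) {
        return totalDisks / 2;
    }

    // Minimum number of online disks needed to create new objects.
    static int writeQuorum(int totalDisks) {
        return totalDisks / 2 + 1;
    }

    static boolean canRead(int totalDisks, int onlineDisks) {
        return onlineDisks >= readQuorum(totalDisks);
    }

    static boolean canWrite(int totalDisks, int onlineDisks) {
        return onlineDisks >= writeQuorum(totalDisks);
    }

    public static void main(String[] args) {
        int nodes = 16;
        int disksPerNode = 16;
        int total = nodes * disksPerNode;   // 256 disks in the cluster
        int online = 8 * disksPerNode;      // 8 of the 16 servers are still up

        System.out.println("read ok:  " + canRead(total, online));   // true
        System.out.println("write ok: " + canWrite(total, online));  // false
        System.out.println("write ok with 9 servers: "
                + canWrite(total, 9 * disksPerNode));                // true
    }
}
```

With 8 servers left, 128 of 256 disks are online, which meets the read threshold of 128 but not the write threshold of 129. This matches the 16-node example in the text.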

5. Basic concepts

1) Object: the basic unit stored in MinIO, such as a file, a byte stream, or any other data.

2) Bucket: Logical space used to store Objects. The data in each bucket is isolated from each other. For the client, it is equivalent to a top-level folder for storing files.

3) Drive: the disk that stores data, which is passed in as a parameter when MinIO starts. All object data in Minio will be stored in Drive.

4) Set: a collection of Drives. A distributed deployment is automatically divided into one or more Sets according to the cluster size, and the Drives in each Set are placed in different locations. (For example, {1…64} is divided into 4 Sets of 16 Drives each.)

An object is stored on one Set.
A cluster is divided into multiple Sets.
The number of Drives in a Set is fixed. By default, the system calculates it automatically from the cluster size.
The Drives in a Set are spread across different nodes as far as possible.

The relationship between Set and Drive:
Set and Drive are the two most important concepts in MinIO. Every object is ultimately stored on a Set.
A node can contain multiple hard disks. A Drive is a storage unit within a node and can simply be thought of as one hard disk. A Set is a collection of Drives that spans nodes.

5) How MinIO writes an object:
MinIO encodes the original data into N parts, where N is the number of Drives in a Set. Later references to N have this meaning.
After the object is encoded into N parts, each part is written to its corresponding Drive, so one object is stored across the entire Set.
A cluster contains multiple Sets. To choose the Set that stores an object, MinIO hashes the object name and maps the result to exactly one Set. In theory, this distributes data evenly across all Sets.

In practice, the observed data distribution is also very even. The number of Drives in a Set is calculated automatically from the cluster size, although you can also configure it yourself.

When assigning Drives to a Set, the system places them on as many different nodes as possible to ensure reliability.
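
The idea of dividing Drives into Sets and hashing an object name to one Set can be sketched in a few lines. This is an illustration only: the class SetPlacementSketch is hypothetical, and CRC32 is a stand-in hash chosen because it is in the Java standard library. It is not claimed to be the hash function MinIO uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Illustrative sketch of mapping object names to Sets.
// SetPlacementSketch is a hypothetical name; CRC32 is an assumed stand-in hash.
class SetPlacementSketch {

    // Number of Sets in a cluster when every Set has the same fixed size.
    static int setCount(int totalDrives, int drivesPerSet) {
        if (drivesPerSet <= 0 || totalDrives % drivesPerSet != 0) {
            throw new IllegalArgumentException("drives must divide evenly into sets");
        }
        return totalDrives / drivesPerSet;
    }

    // Hash the object name and map it to exactly one Set index in [0, setCount).
    static int setIndexFor(String objectName, int setCount) {
        CRC32 crc = new CRC32();
        crc.update(objectName.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % setCount);
    }

    public static void main(String[] args) {
        int sets = setCount(64, 16); // 64 Drives split into 4 Sets of 16
        String[] names = {"photos/a.png", "logs/app.log", "backup.tar"};
        for (String name : names) {
            System.out.println(name + " -> set " + setIndexFor(name, sets));
        }
    }
}
```

Because the index depends only on the object name and the number of Sets, the same name always lands on the same Set, which is what lets a later read find the object without a central lookup table.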

6. Deployment

MinIO supports stand-alone, multi-tenant, and distributed deployment. It supports both plain file storage and erasure-coded storage. In a stand-alone deployment, you can use the MinIO client tool to make backups.

6.1 Binary deployment

Deployment environment: Ubuntu 20.04.2 LTS
System architecture: amd64 (check with the uname -a or arch command; note that x86_64, x64, and AMD64 are essentially the same thing)

Use the following command to run a standalone MinIO server on a Linux host running 64-bit Intel/AMD architecture. Replace /data with the path to the drive or directory where you want MinIO to store the data.

wget https://dl.min.io/server/minio/release/linux-amd64/minio
chmod +x minio
./minio server /data

Binaries for the supported architectures, including 64-bit Intel/AMD, 64-bit ARM, 64-bit PowerPC LE (ppc64le), and IBM Z-Series (S390X), can be found under https://dl.min.io/server/minio/release/

The --console-address ":9001" parameter sets the port for browser access to the console.

6.2 Docker deployment

1) Pull the image

docker pull minio/minio

2) Run the MinIO image:

docker run -p 9000:9000 -p 9001:9001 --name minio \
  -v /etc/localtime:/etc/localtime \
  -v /data/minio/data:/data \
  -v /data/minio/config:/root/.minio \
  -d minio/minio server /data --console-address ":9001"

6.3 Console Access Settings

Open http://192.168.109.130:44561/ in a browser and log in with the account and password minioadmin/minioadmin.

1) Create a bucket: after logging in, first click the "+" button in the lower right corner to create a bucket (type the name and press Enter), then upload files to that bucket. Here I created a bucket named test and uploaded a picture.

2) Share a file: each uploaded file has a share button in the file list. Clicking it generates an access URL for the file and lets you set how long the link stays valid. The maximum validity is 7 days and the smallest unit is minutes. After the link expires, accessing the picture returns an expiry message.
3) Bucket access policy
A bucket can have one of three access policies: public, custom, or private.
public: resources can be accessed directly without any authentication
private: no operation is allowed without authorization
custom: access is governed by custom Access Rules, each set to readonly, writeonly, or readwrite

After a custom rule is added, the Access Policy is automatically set to custom; after all custom rules are deleted, it is automatically set back to private.
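
If share links are generated from application code rather than the console, it can help to validate the requested validity period against the limits described above (whole minutes, at most 7 days) before asking the server for a URL. The sketch below is a hypothetical helper written for this article, not part of the MinIO SDK; it only checks the duration and does not generate a link.

```java
import java.time.Duration;

// Hypothetical helper that checks a share-link validity period against the
// limits described above: a positive number of whole minutes, at most 7 days.
class ShareLinkExpirySketch {

    static final Duration MAX_EXPIRY = Duration.ofDays(7);

    static boolean isValidExpiry(Duration expiry) {
        if (expiry.isZero() || expiry.isNegative()) {
            return false;
        }
        // The smallest unit is minutes, so reject fractions of a minute.
        boolean wholeMinutes = expiry.getNano() == 0 && expiry.getSeconds() % 60 == 0;
        return wholeMinutes && expiry.compareTo(MAX_EXPIRY) <= 0;
    }

    public static void main(String[] args) {
        System.out.println(isValidExpiry(Duration.ofMinutes(30))); // true
        System.out.println(isValidExpiry(Duration.ofDays(7)));     // true
        System.out.println(isValidExpiry(Duration.ofDays(8)));     // false
        System.out.println(isValidExpiry(Duration.ofSeconds(90))); // false
    }
}
```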


7. Spring Boot integration

7.1 Add dependencies

<dependency>
    <groupId>io.minio</groupId>
    <artifactId>minio</artifactId>
    <version>7.0.2</version>
</dependency>

<dependency>
    <groupId>cn.hutool</groupId>
    <artifactId>hutool-all</artifactId>
    <version>5.6.6</version>
</dependency>

7.2 Add configuration

platform:
  oss:
    endpoint: http://192.168.109.130:9000
    accessKeyId: minioadmin
    accessKeySecret: minioadmin
    bucketName: tduck-cloud
    domain: http://192.168.109.130:9000/tduck-cloud

7.3 Code Integration

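import lombok.Data;
import lombok.extern.slf4j.Slf4j;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.stereotype.Component;

// Binds the platform.oss.* properties from the configuration above.
@Data
@Component
@Slf4j
@ConfigurationProperties(prefix = "platform.oss")
public class OssStorageConfig {

    /**
     * OSS type.
     * See OssTypeEnum.java
     */
    private OssTypeEnum ossType;

    /**
     * Service endpoint (for example the Alibaba Cloud OSS endpoint or the MinIO address)
     */
    private String endpoint;

    /**
     * accessKeyId
     */
    private String accessKeyId;

    /**
     * accessKeySecret
     */
    private String accessKeySecret;

    /**
     * Bucket name
     */
    private String bucketName;

    /**
     * Domain used for preview URLs
     */
    private String domain;

    /**
     * Directory where files are stored when using local storage
     */
    private String uploadFolder;

    /**
     * Access path pattern for files in local storage
     */
    private String accessPathPattern;
}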


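import io.minio.MinioClient;
import io.minio.PutObjectOptions;
import io.minio.errors.InvalidEndpointException;
import io.minio.errors.InvalidPortException;
import org.springframework.stereotype.Component;

import java.io.ByteArrayInputStream;
import java.io.InputStream;

// MimeTypeEnum and StorageException are project classes that are not shown in this article.
// In the original project these methods override a storage abstraction that is also not shown,
// so the @Override annotations are omitted here to keep the class compilable on its own.
@Component
public class MIniOStorageService {

    private final OssStorageConfig config;
    private MinioClient client;

    public MIniOStorageService(OssStorageConfig config) {
        this.config = config;
        // Initialize the MinIO client
        init();
    }

    private void init() {
        try {
            // The last argument (secure = false) means the endpoint is accessed over plain HTTP
            client = new MinioClient(config.getEndpoint(), config.getAccessKeyId(), config.getAccessKeySecret(), false);
        } catch (InvalidEndpointException | InvalidPortException e) {
            // Fail fast instead of leaving the client null
            throw new IllegalStateException("Invalid MinIO endpoint configuration", e);
        }
    }

    // Upload from a stream and return the access URL of the stored object
    public String upload(InputStream inputStream, String path) {
        try {
            // Note: available() is only an estimate of the remaining bytes for some stream types
            PutObjectOptions poo = new PutObjectOptions(inputStream.available(), -1);
            poo.setContentType(MimeTypeEnum.getContentType(path));
            client.putObject(config.getBucketName(), path, inputStream, poo);
        } catch (Exception e) {
            throw new StorageException("File upload failed, please check the configuration", e);
        }
        return config.getDomain() + "/" + path;
    }

    // Upload from a byte array and return the access URL of the stored object
    public String upload(byte[] data, String path) {
        try {
            PutObjectOptions poo = new PutObjectOptions(data.length, -1);
            poo.setContentType(MimeTypeEnum.getContentType(path));
            client.putObject(config.getBucketName(), path, new ByteArrayInputStream(data), poo);
        } catch (Exception e) {
            throw new StorageException("File upload failed, please check the configuration", e);
        }
        return config.getDomain() + "/" + path;
    }

    // Delete the object stored at the given path in the configured bucket
    public void delete(String path) {
        try {
            client.removeObject(config.getBucketName(), path);
        } catch (Exception e) {
            throw new StorageException("File deletion failed", e);
        }
    }
}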


    @Autowired
    private MIniOStorageService mIniOStorageService;

    /**
     * Upload a user file.
     * <p>
     * The user ID is hashed with MD5 so that all files of one user are stored under the same directory.
     *
     * @param file   the uploaded file
     * @param userId the ID of the current user
     * @return the access URL of the uploaded file
     * @throws IOException if the uploaded file cannot be read
     */
    @PostMapping("/user/file/upload")
    public Result<String> uploadUserFile(@RequestParam("file") MultipartFile file, @RequestAttribute Long userId) throws IOException {
        // Object path: <md5(userId)>/<uuid>.<extension>
        String path = new StringBuilder(SecureUtil.md5(String.valueOf(userId)))
                .append(CharUtil.SLASH)
                .append(IdUtil.simpleUUID())
                .append(CharUtil.DOT)
                .append(FileUtil.extName(file.getOriginalFilename())).toString();
        String url = mIniOStorageService.upload(file.getInputStream(), path);
        return Result.success(url);
    }
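
The upload endpoint builds the object path as md5(userId)/uuid.extension with Hutool helpers, and the service returns domain + "/" + path as the URL. To make that naming scheme concrete, here is a minimal sketch that reproduces it with only the Java standard library. The class ObjectPathSketch and its method names are hypothetical, and the extension handling is simplified compared with Hutool's FileUtil.extName (it assumes a plain file name without directory separators).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

// Hypothetical stdlib-only sketch of the object naming scheme used above.
class ObjectPathSketch {

    // Lowercase hex MD5 of the text, used as the per-user directory name.
    static String md5Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // Text after the last dot, or an empty string if there is no dot.
    static String extensionOf(String fileName) {
        int dot = fileName == null ? -1 : fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1);
    }

    // <md5(userId)>/<uuid without dashes>.<extension>
    static String buildPath(long userId, String originalFilename) {
        String uuid = UUID.randomUUID().toString().replace("-", "");
        return md5Hex(String.valueOf(userId)) + "/" + uuid + "." + extensionOf(originalFilename);
    }

    // Same concatenation the storage service uses for the returned URL.
    static String buildUrl(String domain, String path) {
        return domain + "/" + path;
    }

    public static void main(String[] args) {
        String path = buildPath(1L, "photo.png");
        System.out.println(buildUrl("http://192.168.109.130:9000/tduck-cloud", path));
    }
}
```

Hashing the user ID keeps all of one user's files under a single prefix without exposing the raw ID in URLs, and the random UUID avoids collisions when two uploads share the same original file name.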


Origin blog.csdn.net/shy871/article/details/121967656