[Cloud Native] Hadoop HA on k8s environment deployment

I. Overview

Before Hadoop 2.0.0, a cluster had only one NameNode, which was a single point of failure: if the NameNode machine went down, the entire cluster was unusable until the NameNode was restarted. Planned maintenance likewise required taking the whole cluster offline, so 24/7 availability was impossible. Hadoop 2.0 and later versions therefore added a NameNode high-availability (HA) mechanism. This article covers deploying Hadoop HA on a k8s environment.

For a non-HA deployment on k8s, see my article: [Cloud Native] Hadoop on k8s environment deployment
For HA outside of k8s, see my article: Big Data Hadoop - Hadoop 3.3.4 HA (High Availability) Principle and Implementation (QJM)

[Figure: HDFS HA architecture]
[Figure: YARN HA architecture]

II. Start deployment

The orchestration below is a transformation of the non-HA setup; if you are not familiar with it, read the article linked above first.

1) Add JournalNode orchestration

1. Controller (StatefulSet)

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: {{ include "hadoop.fullname" . }}-hdfs-jn
  annotations:
    checksum/config: {{ include (print $.Template.BasePath "/hadoop-configmap.yaml") . | sha256sum }}
  labels:
    app.kubernetes.io/name: {{ include "hadoop.name" . }}
    helm.sh/chart: {{ include "hadoop.chart" . }}
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/component: hdfs-jn
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ include "hadoop.name" . }}
      app.kubernetes.io/instance: {{ .Release.Name }}
      app.kubernetes.io/component: hdfs-jn
  serviceName: {{ include "hadoop.fullname" . }}-hdfs-jn
  replicas: {{ .Values.hdfs.journalNode.replicas }}
  template:
    metadata:
      labels:
        app.kubernetes.io/name: {{ include "hadoop.name" . }}
        app.kubernetes.io/instance: {{ .Release.Name }}
        app.kubernetes.io/component: hdfs-jn
    spec:
      affinity:
        podAntiAffinity:
        {{- if eq .Values.antiAffinity "hard" }}
          requiredDuringSchedulingIgnoredDuringExecution:
          - topologyKey: "kubernetes.io/hostname"
            labelSelector:
              matchLabels:
                app.kubernetes.io/name: {{ include "hadoop.name" . }}
                app.kubernetes.io/instance: {{ .Release.Name }}
                app.kubernetes.io/component: hdfs-jn
        {{- else if eq .Values.antiAffinity "soft" }}
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 5
            podAffinityTerm:
              topologyKey: "kubernetes.io/hostname"
              labelSelector:
                matchLabels:
                  app.kubernetes.io/name: {{ include "hadoop.name" . }}
                  app.kubernetes.io/instance: {{ .Release.Name }}
                  app.kubernetes.io/component: hdfs-jn
        {{- end }}
      terminationGracePeriodSeconds: 0
      containers:
      - name: hdfs-jn
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        imagePullPolicy: {{ .Values.image.pullPolicy | quote }}
        command:
           - "/bin/bash"
           - "/opt/apache/tmp/hadoop-config/bootstrap.sh"
           - "-d"
        resources:
{{ toYaml .Values.hdfs.journalNode.resources | indent 10 }}
        readinessProbe:
          tcpSocket:
            port: 8485
          initialDelaySeconds: 10
          timeoutSeconds: 2
        livenessProbe:
          tcpSocket:
            port: 8485
          initialDelaySeconds: 10
          timeoutSeconds: 2
        volumeMounts:
        - name: hadoop-config
          mountPath: /opt/apache/tmp/hadoop-config
        {{- range .Values.persistence.journalNode.volumes }}
        - name: {{ .name }}
          mountPath: {{ .mountPath }}
        {{- end }}
        securityContext:
          runAsUser: {{ .Values.securityContext.runAsUser }}
          privileged: {{ .Values.securityContext.privileged }}
      volumes:
      - name: hadoop-config
        configMap:
          name: {{ include "hadoop.fullname" . }}
      {{- /* fall back to emptyDir volumes when persistence is disabled */}}
      {{- if not .Values.persistence.journalNode.enabled }}
      {{- range .Values.persistence.journalNode.volumes }}
      - name: {{ .name }}
        emptyDir: {}
      {{- end }}
      {{- end }}
{{- if .Values.persistence.journalNode.enabled }}
  volumeClaimTemplates:
  {{- range .Values.persistence.journalNode.volumes }}
  - metadata:
      name: {{ .name }}
      labels:
        app.kubernetes.io/name: {{ include "hadoop.name" $ }}
        helm.sh/chart: {{ include "hadoop.chart" $ }}
        app.kubernetes.io/instance: {{ $.Release.Name }}
        app.kubernetes.io/component: hdfs-jn
    spec:
      accessModes:
      - {{ $.Values.persistence.journalNode.accessMode | quote }}
      resources:
        requests:
          storage: {{ $.Values.persistence.journalNode.size | quote }}
    {{- if $.Values.persistence.journalNode.storageClass }}
    {{- if (eq "-" $.Values.persistence.journalNode.storageClass) }}
      storageClassName: ""
    {{- else }}
      storageClassName: "{{ $.Values.persistence.journalNode.storageClass }}"
    {{- end }}
    {{- end }}
  {{- end }}
{{- end }}

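Before installing, it is worth checking that the new template renders cleanly. A quick sketch (the filename templates/hdfs-jn-statefulset.yaml is my assumption; use whatever name you saved the template under):

# Lint the chart, then render only the JournalNode statefulset to eyeball the output
helm lint ./hadoop
helm template hadoop-ha ./hadoop -s templates/hdfs-jn-statefulset.yaml | head -n 40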

2. Service

# A headless service to create DNS records
apiVersion: v1
kind: Service
metadata:
  name: {{ include "hadoop.fullname" . }}-hdfs-jn
  labels:
    app.kubernetes.io/name: {{ include "hadoop.name" . }}
    helm.sh/chart: {{ include "hadoop.chart" . }}
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/component: hdfs-jn
spec:
  ports:
  - name: jn
    port: {{ .Values.service.journalNode.ports.jn }}
    protocol: TCP
    {{- if and (eq .Values.service.journalNode.type "NodePort") .Values.service.journalNode.nodePorts.jn }}
    nodePort: {{ .Values.service.journalNode.nodePorts.jn }}
    {{- end }}
  type: {{ .Values.service.journalNode.type }}
  selector:
    app.kubernetes.io/name: {{ include "hadoop.name" . }}
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/component: hdfs-jn

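One thing to watch: the comment above calls this a headless service, but the spec as written assigns a normal ClusterIP. Kubernetes only creates per-pod DNS records for a StatefulSet when its governing service is headless, and the qjournal:// URI in the HA config typically addresses each JournalNode pod individually. If your configmap relies on pod DNS names, a headless variant would look like this sketch (my assumption, not shown in the original chart):

spec:
  clusterIP: None   # headless: gives each JN pod a stable DNS record
  ports:
  - name: jn
    port: 8485
    protocol: TCP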

2) Modify the configuration

1. Modify values.yaml

image:
  repository: myharbor.com/bigdata/hadoop
  tag: 3.3.2
  pullPolicy: IfNotPresent

# The version of the hadoop libraries being used in the image.
hadoopVersion: 3.3.2
logLevel: INFO

# Select antiAffinity as either hard or soft, default is soft
antiAffinity: "soft"

hdfs:
  nameNode:
    replicas: 2
    pdbMinAvailable: 1

    resources:
      requests:
        memory: "256Mi"
        cpu: "10m"
      limits:
        memory: "2048Mi"
        cpu: "1000m"

  dataNode:
    # Will be used as dfs.datanode.hostname
    # You still need to set up services + ingress for every DN
    # Datanodes will expect to
    externalHostname: example.com
    externalDataPortRangeStart: 9866
    externalHTTPPortRangeStart: 9864

    replicas: 3

    pdbMinAvailable: 1

    resources:
      requests:
        memory: "256Mi"
        cpu: "10m"
      limits:
        memory: "2048Mi"
        cpu: "1000m"

  webhdfs:
    enabled: true

  journalNode:
    replicas: 3
    pdbMinAvailable: 1

    resources:
      requests:
        memory: "256Mi"
        cpu: "10m"
      limits:
        memory: "2048Mi"
        cpu: "1000m"

yarn:
  resourceManager:
    pdbMinAvailable: 1
    replicas: 2

    resources:
      requests:
        memory: "256Mi"
        cpu: "10m"
      limits:
        memory: "2048Mi"
        cpu: "2000m"

  nodeManager:
    pdbMinAvailable: 1

    # The number of YARN NodeManager instances.
    replicas: 1

    # Create statefulsets in parallel (K8S 1.7+)
    parallelCreate: false

    # CPU and memory resources allocated to each node manager pod.
    # This should be tuned to fit your workload.
    resources:
      requests:
        memory: "256Mi"
        cpu: "500m"
      limits:
        memory: "2048Mi"
        cpu: "1000m"

persistence:
  nameNode:
    enabled: true
    storageClass: "hadoop-ha-nn-local-storage"
    accessMode: ReadWriteOnce
    size: 1Gi
    local:
    - name: hadoop-ha-nn-0
      host: "local-168-182-110"
      path: "/opt/bigdata/servers/hadoop-ha/nn/data/data1"
    - name: hadoop-ha-nn-1
      host: "local-168-182-111"
      path: "/opt/bigdata/servers/hadoop-ha/nn/data/data1"

  dataNode:
    enabled: true
    enabledStorageClass: false
    storageClass: "hadoop-ha-dn-local-storage"
    accessMode: ReadWriteOnce
    size: 1Gi
    local:
    - name: hadoop-ha-dn-0
      host: "local-168-182-110"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data1"
    - name: hadoop-ha-dn-1
      host: "local-168-182-110"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data2"
    - name: hadoop-ha-dn-2
      host: "local-168-182-110"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data3"
    - name: hadoop-ha-dn-3
      host: "local-168-182-111"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data1"
    - name: hadoop-ha-dn-4
      host: "local-168-182-111"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data2"
    - name: hadoop-ha-dn-5
      host: "local-168-182-111"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data3"
    - name: hadoop-ha-dn-6
      host: "local-168-182-112"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data1"
    - name: hadoop-ha-dn-7
      host: "local-168-182-112"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data2"
    - name: hadoop-ha-dn-8
      host: "local-168-182-112"
      path: "/opt/bigdata/servers/hadoop-ha/dn/data/data3"
    volumes:
    - name: dfs1
      mountPath: /opt/apache/hdfs/datanode1
      hostPath: /opt/bigdata/servers/hadoop-ha/dn/data/data1
    - name: dfs2
      mountPath: /opt/apache/hdfs/datanode2
      hostPath: /opt/bigdata/servers/hadoop-ha/dn/data/data2
    - name: dfs3
      mountPath: /opt/apache/hdfs/datanode3
      hostPath: /opt/bigdata/servers/hadoop-ha/dn/data/data3

  journalNode:
    enabled: true
    storageClass: "hadoop-ha-jn-local-storage"
    accessMode: ReadWriteOnce
    size: 1Gi
    local:
    - name: hadoop-ha-jn-0
      host: "local-168-182-110"
      path: "/opt/bigdata/servers/hadoop-ha/jn/data/data1"
    - name: hadoop-ha-jn-1
      host: "local-168-182-111"
      path: "/opt/bigdata/servers/hadoop-ha/jn/data/data1"
    - name: hadoop-ha-jn-2
      host: "local-168-182-112"
      path: "/opt/bigdata/servers/hadoop-ha/jn/data/data1"
    volumes:
    - name: jn
      mountPath: /opt/apache/hdfs/journalnode

service:
  nameNode:
    type: NodePort
    ports:
      dfs: 9000
      webhdfs: 9870
    nodePorts:
      dfs: 30900
      webhdfs: 30870
  dataNode:
    type: NodePort
    ports:
      webhdfs: 9864
    nodePorts:
      webhdfs: 30864
  resourceManager:
    type: NodePort
    ports:
      web: 8088
    nodePorts:
      web: 30088
  journalNode:
    type: ClusterIP
    ports:
      jn: 8485
    nodePorts:
      jn: ""


securityContext:
  runAsUser: 9999
  privileged: true

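The local: entries above bind each role to a host path through a local StorageClass. For orientation, each entry corresponds to a local PersistentVolume pinned to its node; a hand-written sketch of the first JournalNode entry (the names and path match the values above, the rest is my assumption about how the chart renders them):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: hadoop-ha-jn-local-storage
provisioner: kubernetes.io/no-provisioner   # local volumes are pre-created by hand
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: hadoop-ha-jn-0
spec:
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteOnce
  storageClassName: hadoop-ha-jn-local-storage
  local:
    path: /opt/bigdata/servers/hadoop-ha/jn/data/data1
  nodeAffinity:   # pin the PV to the node that owns the directory
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - local-168-182-110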

2. Modify hadoop/templates/hadoop-configmap.yaml

The modifications are extensive, so they are not reproduced here; the git download address is given at the bottom.
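For reference, the heart of the change is the set of QJM HA properties in core-site.xml / hdfs-site.xml. A minimal sketch, assuming a nameservice id myhdfs, NameNode ids nn1/nn2, and the in-cluster pod DNS names implied by the statefulsets above (all ids and hostnames here are placeholders; check the actual configmap in the repo):

<!-- core-site.xml: clients address the nameservice, not a single NameNode -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://myhdfs</value>
</property>

<!-- hdfs-site.xml -->
<property>
  <name>dfs.nameservices</name>
  <value>myhdfs</value>
</property>
<property>
  <name>dfs.ha.namenodes.myhdfs</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.myhdfs.nn1</name>
  <value>hadoop-ha-hadoop-hdfs-nn-0.hadoop-ha-hadoop-hdfs-nn:9000</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.myhdfs.nn2</name>
  <value>hadoop-ha-hadoop-hdfs-nn-1.hadoop-ha-hadoop-hdfs-nn:9000</value>
</property>
<!-- the JournalNodes created above hold the shared edit log -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://hadoop-ha-hadoop-hdfs-jn-0.hadoop-ha-hadoop-hdfs-jn:8485;hadoop-ha-hadoop-hdfs-jn-1.hadoop-ha-hadoop-hdfs-jn:8485;hadoop-ha-hadoop-hdfs-jn-2.hadoop-ha-hadoop-hdfs-jn:8485/myhdfs</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.myhdfs</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- ssh fencing rarely works inside pods; a no-op shell fence is common -->
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>shell(/bin/true)</value>
</property>
<!-- only if automatic failover (ZKFC) is enabled; requires a ZooKeeper quorum -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>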

3) Start the installation

# Create the storage directories (on every node that hosts local PVs)
mkdir -p /opt/bigdata/servers/hadoop-ha/{nn,dn,jn}/data/data{1..3}
chmod -R 777 /opt/bigdata/servers/hadoop-ha/{nn,dn,jn}

helm install hadoop-ha ./hadoop -n hadoop-ha --create-namespace
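The chart's bootstrap.sh (part of the configmap) presumably automates the one-time HA initialization. For orientation, done by hand against a fresh cluster the sequence is roughly (adapt to the actual scripts in the repo):

# 1. JournalNodes must be up first (the statefulset handles this)
# 2. Format and start the first NameNode
hdfs namenode -format
hdfs --daemon start namenode
# 3. On the second NameNode, copy the metadata from the first, then start it
hdfs namenode -bootstrapStandby
hdfs --daemon start namenode
# 4. Only if automatic failover (ZKFC) is enabled:
hdfs zkfc -formatZK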

Check

kubectl get pods,svc -n hadoop-ha -owide


HDFS WEB-nn1: http://192.168.182.110:31870/dfshealth.html#tab-overview
HDFS WEB-nn2: http://192.168.182.110:31871/dfshealth.html#tab-overview
YARN WEB-rm1: http://192.168.182.110:31088/cluster/cluster
YARN WEB-rm2: http://192.168.182.110:31089/cluster/cluster

4) Test verification

kubectl exec -it hadoop-ha-hadoop-hdfs-nn-0 -n hadoop-ha -- bash

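Inside the NameNode pod you can confirm that exactly one NameNode is active and that writes go through the nameservice. A few checks (the ids nn1/nn2 follow the config sketch above and are my assumption; use the ids from your configmap):

hdfs haadmin -getAllServiceState          # active/standby state of all NameNodes
hdfs haadmin -getServiceState nn1         # state of a single NameNode
hdfs dfs -mkdir -p /tmp/ha-test && hdfs dfs -ls /tmp
yarn rmadmin -getAllServiceState          # same check for the ResourceManagers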

5) Uninstall

helm uninstall hadoop-ha -n hadoop-ha

kubectl delete pod -n hadoop-ha `kubectl get pod -n hadoop-ha|awk 'NR>1{print $1}'` --force
kubectl patch ns hadoop-ha -p '{"metadata":{"finalizers":null}}'
kubectl delete ns hadoop-ha --force

rm -fr /opt/bigdata/servers/hadoop-ha/{nn,dn,jn}/data/data{1..3}/*

git download address: gitee.com/hadoop-bigd…

That wraps up Hadoop HA deployment on k8s. The descriptions here are brief; if you have any questions, leave me a message. Some parts may still be imperfect and will be improved over time, and other services will be added on top of this foundation. More [big data + cloud native] articles are on the way, so stay tuned~

Origin juejin.im/post/7147715998974640141