MooseFS cluster failover operation and maintenance

Batch operations across the cluster use pssh; the `prun` alias below wraps it:

prun: aliased to pssh -O StrictHostKeyChecking=no -t 0 -p 90 -h hosts -l work -o out -e err
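The `-h hosts` argument is a plain text file with one `[user@]host[:port]` entry per line. A hypothetical example (these addresses are illustrative, not the real host list):

```shell
# pssh hosts file: one target per line; the user defaults to -l work from the alias
cat > hosts <<'EOF'
10.46.19.20
10.46.19.21
10.46.19.22
EOF
```

In the alias, `-O StrictHostKeyChecking=no` accepts new host keys without prompting, `-t 0` disables the per-host timeout, `-p 90` runs up to 90 connections in parallel, and `-o out`/`-e err` collect each host's stdout/stderr.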

What to do when the master hangs

If the machine can still be booted, simply restore it.
If the machine is completely dead and you need a new master quickly:

Recover using one of the metalogger servers. Since every machine is co-deployed and runs a metalogger, any of them will do.

Suppose the master 10.46.17.17 has died and 10.46.19.20 is to become the new master. On the gzns machine, edit mfschunkserver.cfg and mfsmetalogger.cfg under /home/work/pssh/nfs/mfs_conf so that they point at the new master:

MASTER_HOST = 10.46.19.20

SSH into the gzns machine and push the two modified configs to the whole cluster in batch:

prun "cd /home/work/.jumbo/etc/mfs && rm mfschunkserver.cfg; wget ftp://**/home/work/pssh/nfs/mfs_conf/mfschunkserver.cfg"
prun "cd /home/work/.jumbo/etc/mfs && rm mfsmetalogger.cfg; wget ftp://**/home/work/pssh/nfs/mfs_conf/mfsmetalogger.cfg"

On 10.46.19.20, check that mfsmaster.cfg and mfsexports.cfg exist and are configured correctly. If not, re-push them (or fix that one machine by hand):

prun "cd /home/work/.jumbo/etc/mfs && rm mfsmaster.cfg; wget ftp://**/home/work/pssh/nfs/mfs_conf/mfsmaster.cfg"
prun "cd /home/work/.jumbo/etc/mfs && rm mfsexports.cfg; wget ftp://**/home/work/pssh/nfs/mfs_conf/mfsexports.cfg"

Run mfsmaster -a to replay and consolidate the metadata changelogs (automatic recovery mode).

Restart the chunkservers and metaloggers. If you don't, clients may be unable to read existing files and can only write new ones.

prun "ps aux | grep mfschunkserver | grep /home/work/.jumbo/sbin/ | awk '{print \$2}' | xargs kill -9"
prun "ps aux | grep mfsmetalogger | grep /home/work/.jumbo/sbin/ | awk '{print \$2}' | xargs kill -9"
prun "/home/work/.jumbo/sbin/mfsmetalogger start"
prun "/home/work/.jumbo/sbin/mfschunkserver start"

The batch kill commands above report [FAILURE], but it is harmless; they have in fact succeeded.
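A likely reason for the spurious [FAILURE]: the remote shell that pssh spawns carries the whole pipeline text in its own command line, so it contains both `mfschunkserver` and `/home/work/.jumbo/sbin/`, matches both greps, and gets `kill -9`ed along with the daemon. The SSH session then exits abnormally even though the real processes were already killed. A simulation with fake `ps aux` lines (PIDs hypothetical):

```shell
# Two fake `ps aux` lines: the real daemon (1234) and the wrapper shell (5678)
# whose command line quotes the pipeline itself, so it matches both greps.
printf '%s\n' \
  'work 1234 0.1 /home/work/.jumbo/sbin/mfschunkserver start' \
  'work 5678 0.0 bash -c ps aux | grep mfschunkserver | grep /home/work/.jumbo/sbin/' \
  | grep mfschunkserver | grep /home/work/.jumbo/sbin/ | awk '{print $2}'
# prints both 1234 and 5678 - the kill also hits the wrapper shell
```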

On each client, unmount the FUSE mount with fusermount -u. If the unmount fails, check the client for an mfsmount process and kill it.

fusermount -u /mnt/mfs

Mount the MFS service from the new master:

mfsmount -H 10.46.19.20 /mnt/mfs

When a machine co-deployed with chunkserver and metalogger hangs

Reference: https://www.cnblogs.com/bugutian/p/6869278.html
Data without a replica is definitely at risk: if every disk holding it fails, loss is unavoidable. MooseFS has a replica-balancing strategy and will automatically move the number of copies toward the configured goal.

You don't need to worry about the .mfs chunk files stored on a chunkserver: within about 2 minutes they are automatically replicated to other chunkservers.

When a chunkserver loses its connection, MFS replicates the affected chunks on other servers up to the configured copy count. Once that chunkserver recovers, the extra copies are automatically deleted until the count falls back to the configured goal.

To tear down the test cluster's MFS service and rebuild from scratch (never do this once the cluster is in stable operation):

prun "rm -rf /home/work/.jumbo/var/mfs/*"
prun "rm -rf /home/disk1/mfs/*"
prun "rm -rf /home/disk2/mfs/*"
prun "rm -rf /home/disk3/mfs/*"

If mfsmaster fails with "can't find metadata.mfs"

loading metadata ...
can't find metadata.mfs - try using option '-a'
init: metadata manager failed !!!

In /home/work/.jumbo/var/mfs, copy metadata.mfs.empty to metadata.mfs (if the template was deleted too, you can reinstall MooseFS or recreate it by hand). Its content is:

MFSM NEW
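A minimal sketch of recreating the file by hand, assuming (as stated above) the empty template contains only the line `MFSM NEW`; run it in /home/work/.jumbo/var/mfs on the master:

```shell
# Recreate metadata.mfs when metadata.mfs.empty is also gone
# (content per this doc; prefer copying the shipped template when it exists)
printf 'MFSM NEW\n' > metadata.mfs
```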

Then restart mfsmaster normally:

/home/work/.jumbo/sbin/mfsmaster start

If chunkserver refuses to start because of .metaid

hdd space manager: chunkserver without meta id shouldn't use drive with defined meta id (file: '/home/disk1/mfs/.metaid') - use '!' in drive definition to ignore this (dangerous)

Delete the .metaid files first, then start the chunkservers:

prun "rm /home/disk1/mfs/.metaid"
prun "rm /home/disk2/mfs/.metaid"
prun "rm /home/disk3/mfs/.metaid"

Recover accidentally deleted files

Commands on the client to view and set the trash retention time:

mfsgettrashtime # view the trash retention time
mfssettrashtime # set the trash retention time

The default is 86,400 seconds.
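For example, keeping deleted files for a week instead of a day (mfssettrashtime takes a value in seconds and `-r` applies it recursively; the mount path here is hypothetical):

```shell
# 7 days expressed in seconds for mfssettrashtime
week=$((7 * 24 * 3600))
echo "$week"   # 604800
# mfssettrashtime -r "$week" /mnt/mfs/important   # hypothetical path
```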

To recover files, mount the MFSMETA filesystem using the -m parameter:

mfsmount -H **.**.**.** -m /mnt/mfsmeta

**.**.**.** is the master machine's address.
Inside the /mnt/mfsmeta directory you will see:

  • The trash directory (which still holds the information of recoverable deleted files) and trash/undel (moving an entry into this directory is equivalent to recovering it).
  • The reserved directory, which holds files that have been deleted but are still held open. Once the user closes them, the entries under reserved are removed.

To recover a file, move its entry from the trash directory into trash/undel. If a new file with the same name already exists at the same path, the recovery fails.
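A simulated sketch of that move (real paths live under /mnt/mfsmeta; the entry name here is hypothetical, and MooseFS encodes the file's original path into the entry name with `|` separators):

```shell
# Simulate the MFSMETA layout locally; on a real system these are in /mnt/mfsmeta
mkdir -p trash/undel
: > 'trash/0000002A|data|report.txt'     # hypothetical deleted-file entry
# Moving the entry into trash/undel restores the file to its original path
mv 'trash/0000002A|data|report.txt' trash/undel/
```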

Read and write speed test

Increasing the number of copies does not affect the write rate.

When the goal is 2 or more, the client sends the chunk data to one chunkserver, which then forwards it on to the next chunkserver. The client therefore never sends multiple copies itself, and all copies are written essentially simultaneously.

Writing a 1.5 GB directory tree containing many small files took 2m14s, about 11 MB/s. Deleting it took 42 s, about 37 MB/s.

Writing a single 429 MB file took 3.89 s, about 110 MB/s.
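The quoted rates check out arithmetically (taking 1.5 GB ≈ 1536 MB, 2m14s = 134 s, and 42 s for the delete):

```shell
# Recompute the throughput figures from the raw numbers above
awk 'BEGIN {
  printf "small-file write:  %.0f MB/s\n", 1536 / 134   # ~11 MB/s
  printf "small-file delete: %.0f MB/s\n", 1536 / 42    # ~37 MB/s
  printf "large-file write:  %.0f MB/s\n", 429 / 3.89   # ~110 MB/s
}'
```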


Origin: www.cnblogs.com/xrszff/p/10960200.html