Ceph PG States in Detail
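As a quick reference before the state-by-state notes, these standard ceph CLI commands show which PGs are currently in which state; the state names passed to dump_stuck and pg ls below are only examples:

ceph pg stat

ceph health detail

ceph pg dump_stuck inactive unclean stale undersized

ceph pg ls incomplete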

Creating

Peering

Activating

Active

Backfilling

Backfill-toofull

Backfill-wait

Incomplete

Inconsistent

Peered

Recovering

Recovering-wait

Remapped

Scrubbing

Unactive

Unclean

Stale

Undersized

Down
  
  Creating
  
Meaning: the PG is being created.

Cause: appears while a pool is being created and its PGs are brought up according to the configured pg_num. This is a normal state.

Consequences: none.

Solution: none needed; this is a normal state.
  
  Peering
  
Meaning: the OSDs holding the PG's copies are negotiating with each other to agree on the state of its objects and metadata.

Cause: when a PG is created, the OSDs storing its replicas peer with each other to bring the status of the object and metadata copies into agreement. This is a normal state.

Consequences: none.

Solution: none needed; this is a normal state.
  
  Activating
  
Meaning: the PG has finished peering, the peering result is being persisted, and the PG is waiting for all replicas to synchronize before it tries to enter the active state.

Cause: the PG is about to enter the active state.

Consequences: if the PG is stuck in this state for a long time, it cannot serve reads or writes, which affects the availability of the whole pool.

Solution (example commands follow this list):

stop all OSDs hosting the affected PG

export the PG data with ceph-objectstore-tool as a backup

use ceph-objectstore-tool to remove the empty PG from the primary OSD (do not delete it manually)

import the data back with ceph-objectstore-tool

manually restore ceph ownership of the PG directory

finally restart the OSDs
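A minimal sketch of that procedure for one OSD, assuming the usual /var/lib/ceph/osd data path; <id> and <pgid> are placeholders, and the exact ceph-objectstore-tool flags vary between Ceph releases:

systemctl stop ceph-osd@<id>

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> --pgid <pgid> --op export --file /tmp/<pgid>.export

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> --pgid <pgid> --op remove --force

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> --pgid <pgid> --op import --file /tmp/<pgid>.export

chown -R ceph:ceph /var/lib/ceph/osd/ceph-<id>

systemctl start ceph-osd@<id>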
  
  Active
  
Meaning: the PG is active and can serve reads and writes.

Cause: normal operation.

Consequences: none.

Solution: none needed; this is a normal state.
  
  Backfilling
  
Meaning: backfill is in progress.

Cause: usually an OSD has gone offline (no heartbeat response for more than 5 minutes) and Ceph has picked a new OSD to replace it, so a full copy of the data is being made.

Consequences: this state generally means an OSD has been judged down or taken offline.

Solution: in most cases Ceph completes the backfill automatically; if it cannot, the PG enters the backfill-toofull state.
  
  Backfill-toofull
  
Meaning: backfill is suspended.

Cause: usually the OSDs that should receive the backfill do not have enough free capacity.

Consequences: the pool cannot be written to; reads and writes hang.

Solution: check OSD capacity for serious imbalance and manually evacuate data from over-full OSDs (reweight, as sketched after the emergency procedure below); if the cluster is near full, add physical capacity as soon as possible.

Emergency procedure (a temporary workaround; the proper fix is to add OSDs and capacity):
  
pause OSD writes:

ceph osd pause

tell the mons and OSDs to raise the full threshold:

ceph tell mon.* injectargs "--mon-osd-full-ratio 0.96"

ceph tell osd.* injectargs "--mon-osd-full-ratio 0.96"

raise the PG full threshold:

ceph pg set_full_ratio 0.96

re-enable OSD writes:

ceph osd unpause
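For the reweight step mentioned in the solution above, a rough sketch; the OSD id 3 and the weight 0.85 are only illustrative values:

ceph osd df tree

ceph osd reweight 3 0.85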
  
Backfill-wait

Meaning: the PG is waiting to start the backfill operation.

Cause: an OSD went down (the author did not capture this state first-hand; it probably passes too quickly to observe).

Consequences: in theory the PG will next enter the backfilling state and begin backfilling data.

Solution: a normal state that backfill has to pass through; no special attention needed.
  
  Incomplete
  
Meaning: during peering it was found that the replicas could not agree on the state of the data.

Cause: when the authoritative log was chosen for the PG, the authoritative log could not be completed, or the completed authoritative log was not logically consistent when compared with the local log.

Consequences: usually the PG cannot be created and gets stuck in the creating+incomplete state, which makes the pool unusable.

Solution: first make sure osd_allow_recovery_below_min_size is true, that the replica count is reasonable, and that the number of OSDs the crushmap can select is consistent with the pool's configuration. If all of that is normal, try the following recovery procedure (example commands follow this list):

stop all OSDs holding the incomplete PG

use ceph-objectstore-tool to mark the PG complete on each of them

then restart the OSDs
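A minimal sketch of that procedure for one OSD; <id> and <pgid> are placeholders:

systemctl stop ceph-osd@<id>

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> --pgid <pgid> --op mark-complete

systemctl start ceph-osd@<id>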
  
Inconsistent

Meaning: the data replicas are inconsistent with each other.

Cause: one replica of the data has gone missing for an unknown reason.

Consequences: replica inconsistency reduces data safety.

Solution: use ceph pg repair to repair the data; in normal circumstances the PG recovers. If it does not, raise osd_max_scrubs on the three OSDs holding the replicas, run ceph pg repair again, and finally set osd_max_scrubs back to 1 (example commands below).
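A sketch of that sequence; <pgid> and <id> are placeholders, and 3 is only an illustrative value for osd_max_scrubs:

ceph pg repair <pgid>

ceph tell osd.<id> injectargs "--osd-max-scrubs 3"

ceph pg repair <pgid>

ceph tell osd.<id> injectargs "--osd-max-scrubs 1"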
  
  Peered
  
Meaning: the PG has peered but cannot find enough replicas to serve reads and writes (even min_size cannot be met).

Cause: several OSDs are down, so the number of currently active replicas is below min_size and reads and writes are blocked.

Consequences: the PG is unusable and the pool cannot even serve routine IO.

Solution: in a healthy cluster, an OSD that has been down for more than five minutes is automatically remapped and repaired. To clear the state quickly there are two options (example commands below):

1. try to start the replica OSD so it rejoins the cluster; the peered state will disappear on its own

2. mark the lost OSD out; Ceph will then automatically enter the repair state
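A sketch of the two options; the OSD id is a placeholder:

systemctl start ceph-osd@<id>

ceph osd out <id>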
  
  Recovering
  
Meaning: recovery is in progress.

Cause: when an OSD goes down, the replicas it holds fall behind the other copies in the placement group; when that OSD comes back up, its contents must be updated to the current state, and the PG enters recovering.

Consequences: recovery is not always trivial, because a single hardware failure may involve multiple OSDs. For example, a failed rack or machine-room network switch can leave the OSDs on several hosts behind the cluster's current state, and each of them must recover once the fault is resolved.

Solution: this state means the PG is recovering automatically; just wait for the recovery to complete.
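A couple of standard commands for watching recovery progress while waiting:

ceph -s

ceph pg ls recovering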
  
  Recovering-wait
  
Meaning: the PG is waiting for its recovery resource reservation.

Cause: the PG is queued for recovery.

Consequences: in theory the PG will next enter the recovering state and begin data recovery.

Solution: a normal state on the way to recovery.
  
  Remapped
  
Meaning: the PG has been remapped.

Cause: when the Acting set of a PG changes, data is migrated from the old acting set to the new one. This can take a relatively long time, and until the migration finishes the new primary OSD cannot serve requests, so the new primary asks the old primary to keep serving until the PG migration is complete. Once the data migration is done, the new primary OSD takes over and accepts requests.

Consequences: if the PG cannot be remapped, its data cannot be migrated and data may be lost.

Solution: when an OSD goes down or the cluster is expanded, the PGs on that OSD are reassigned to other OSDs by the CRUSH algorithm and remapped to them.

While in the remapped state, the PG's current Acting set and Up set are inconsistent.

Client IO can still be served normally.
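To see the Up set and Acting set of a particular PG, the standard commands are (the PG id is a placeholder):

ceph pg map <pgid>

ceph pg <pgid> query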
  
  Scrubbing
  
Meaning: scrubbing is in progress.

Cause: the PG is being checked for inconsistencies.

Consequences: IO performance degrades while scrubbing runs.

Solution: depending on the needs of the environment, disable the feature or reduce the self-check frequency (example commands below).
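A sketch of both options; the interval value (in seconds) is only illustrative:

ceph osd set noscrub

ceph osd set nodeep-scrub

ceph tell osd.* injectargs "--osd-deep-scrub-interval 1209600"

and to re-enable scrubbing later:

ceph osd unset noscrub

ceph osd unset nodeep-scrub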
  
  Unactive
  
Meaning: inactive state; the PG cannot process read or write requests.

Cause: the PG has not reached the active state for a long time, so read and write requests cannot be executed.

Consequences: the PG cannot serve reads or writes.

Solution: wait for the OSDs to update their data to the latest state.
  
  Unclean
  
Meaning: unclean state; the PG has not recovered from a previous failure.

Cause: some objects in the placement group do not have the desired number of replicas; they should be recovered.

Consequences: data safety is reduced.

Solution: a recovery operation usually has to be performed.
  
  Stale
  
Meaning: stale state; the PG has not received status updates from its OSDs.

Cause: usually caused by an OSD going down; under normal circumstances it appears together with the subsequent peering state.

Simulation: manually stop an OSD with systemctl stop ceph-osd@0 and watch ceph -s; for a short time (before peering) the PG will show the special stale+clean+active state.

Effects: a warning sign, often meaning an OSD or a whole node has dropped off the network or failed.

Solution: in general, just wait for peering to complete.
  
  Undersized
  
Meaning: the number of replicas is too small.

Cause: the PG's current number of replicas is smaller than the replica count configured on the storage pool. This usually happens when an OSD service goes down.

Consequences: data availability is reduced.

Solution: adjust the replica count of the pool the PG belongs to if needed (the default pool min_size of 1 is not recommended to change), or simply wait for the OSD service to come back up. Example commands follow.
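A sketch of checking and adjusting the pool's replica settings; the pool name and the size value 3 are placeholders:

ceph osd pool get <pool> size

ceph osd pool get <pool> min_size

ceph osd pool set <pool> size 3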
  
  
Down

Meaning: the PG is down.

Cause: an OSD holding an authoritative copy of the placement group's data is down; you must wait for it to come back up, or mark it as lost, before the PG can continue.

Consequences: during this time the PG cannot serve client read or write IO; IO hangs.

Solution: bring the OSD service back up, or as a last resort mark the lost OSD as lost (example commands below).
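A sketch of the two options; the OSD id is a placeholder, and ceph osd lost is destructive (it can discard data), so it should only be used when the OSD truly cannot be recovered:

systemctl start ceph-osd@<id>

ceph osd lost <id> --yes-i-really-mean-it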
