Creating
Peering
Activating
Active
Backfilling
Backfill-toofull
Backfill-wait
Incomplete
Inconsistent
Peered
Recovering
Recovery-wait
Remapped
Scrubbing
Inactive
Unclean
Stale
Undersized
Down
Creating
Meaning: the PGs are being created.
Cause: when a pool is created, its PGs are created according to the configured PG number. This is a normal state.
Consequences: none.
Solution: none needed; this is a normal state.
Peering
Meaning: the PGs are interconnecting and reaching agreement on the state of their objects and metadata.
Cause: when a PG is created, the OSDs holding its replicas interconnect so that the replicas can agree on the status of every object and its metadata. This is a normal state.
Consequences: none.
Solution: none needed; this is a normal state.
Activating
Meaning: after the peering process completes, the PG persists the peering result and waits for all replicas to synchronize, then tries to enter the active state.
Cause: the PG is preparing to enter the active state.
Consequences: if the PG stays stuck in this state for a long time, it cannot serve reads or writes, which affects the availability of the whole pool.
Solution:
    stop all OSDs hosting the PG
    back up the PG data with ceph-objectstore-tool
    use ceph-objectstore-tool to delete the empty PG on the primary OSD (do not delete it manually)
    import the data again with ceph-objectstore-tool
    manually give the PG directory the correct ceph ownership and permissions
    finally restart the OSDs
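A minimal sketch of the steps above, assuming the stuck PG is `1.0` and its primary is `osd.0` with the default data path; the PG id, OSD id, and paths are placeholders to adjust for your cluster.

```shell
# 1. Stop the OSD hosting the PG
systemctl stop ceph-osd@0

# 2. Export (back up) the PG with ceph-objectstore-tool
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
    --pgid 1.0 --op export --file /root/pg1.0.backup

# 3. Remove the empty PG from the primary (do not delete the directory by hand)
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
    --pgid 1.0 --op remove --force

# 4. Import the backed-up data again
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
    --pgid 1.0 --op import --file /root/pg1.0.backup

# 5. Restore ownership so the daemon can read the store, then restart the OSD
chown -R ceph:ceph /var/lib/ceph/osd/ceph-0
systemctl start ceph-osd@0
```

Repeat the export/remove/import steps on each OSD hosting the PG before restarting them.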
Active
Meaning: the PG is in the active state and can serve reads and writes.
Cause: normal operation.
Consequences: none.
Solution: none needed; this is a normal state.
Backfilling
Meaning: the PG is backfilling.
Cause: usually an OSD has gone offline (no heartbeat response for more than 5 minutes), so Ceph picks a new OSD to replace it and performs a full copy of the data.
Consequences: this state generally means an OSD has crashed or gone offline.
Solution: in most cases Ceph completes the backfill automatically; if the backfill cannot complete, the PG enters the backfill-toofull state.
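A few read-only commands for watching a backfill make progress; these are standard Ceph CLI calls, safe to run at any time.

```shell
ceph -s                      # cluster summary, shows backfilling PG counts
ceph pg dump_stuck unclean   # list PGs that are not yet clean
ceph osd df tree             # per-OSD utilization, to see where data is moving
```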
Backfill-toofull
Meaning: backfilling is suspended.
Cause: usually the OSD chosen to backfill the missing OSD does not have enough free capacity.
Consequences: the pool cannot be written to; reads and writes hang.
Solution: check the OSD capacities for serious imbalance and manually evacuate (reweight) the overfull OSDs; if the cluster is near full, expand the physical capacity as soon as possible.
Emergency expansion procedure (a temporary workaround; the best fix is to add OSDs and capacity):
    suspend OSD writes:
        ceph osd pause
    tell the mons and OSDs to raise the full threshold:
        ceph tell mon.* injectargs "--mon-osd-full-ratio 0.96"
        ceph tell osd.* injectargs "--mon-osd-full-ratio 0.96"
    tell the PGs to raise the full threshold:
        ceph pg set_full_ratio 0.96
    re-enable OSD writes:
        ceph osd unpause
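A hypothetical example of the reweight-based evacuation mentioned above, assuming `osd.3` is the overfull OSD; pick the real id and a suitable weight from `ceph osd df` output.

```shell
ceph osd df                  # find the OSD(s) with the highest %USE
ceph osd reweight 3 0.8      # lower osd.3's weight so data migrates off it
ceph -s                      # watch the resulting recovery/backfill finish
```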
Backfill-wait
Meaning: the PG is waiting to begin the backfill operation.
Cause: an OSD went down (this state passes quickly and may be hard to observe directly).
Consequences: in theory the PG will next enter the backfilling state to backfill its data.
Solution: this is a normal state that a backfill must pass through; no special attention needed.
Incomplete
Meaning: during peering, the replicas could not agree on the state of the data.
Cause: when the authoritative log was chosen, it could not be completed, or the completed authoritative log is logically inconsistent with the local state.
Consequences: usually the PG cannot be created and gets stuck in the creating+incomplete state, which makes the pool unusable.
Solution: first make sure osd_allow_recovery_below_min_size is true, that the number of replicas is reasonable, and that the number of OSDs selected by the crushmap is consistent with the pool configuration. If all of these are normal, try the following recovery procedure:
    stop the OSDs of every PG that is incomplete
    use ceph-objectstore-tool to mark each of them complete
    then restart the OSDs
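A minimal sketch of the mark-complete procedure above, assuming the incomplete PG is `2.1a` and lives on `osd.1`; substitute your real PG id, OSD id, and data path.

```shell
systemctl stop ceph-osd@1
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1 \
    --pgid 2.1a --op mark-complete
systemctl start ceph-osd@1
```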
Inconsistent
Meaning: the replicas of the data are inconsistent.
Cause: a replica of the data is missing for an unknown reason.
Consequences: the replica inconsistency reduces data safety.
Solution: use ceph pg repair to recover the data; under normal circumstances this restores the PG. If it cannot recover, raise osd_max_scrubs on the OSDs holding the three replicas, run ceph pg repair again, and finally set osd_max_scrubs back to 1.
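A hypothetical repair sequence following the steps above, assuming the inconsistent PG is `3.2f` and its replicas live on `osd.0`, `osd.1`, and `osd.2`; adjust the ids to your cluster.

```shell
ceph pg repair 3.2f

# If the repair does not converge, temporarily raise osd_max_scrubs on the
# replica OSDs, repair again, then restore the default of 1.
ceph tell osd.0 injectargs '--osd-max-scrubs 3'
ceph tell osd.1 injectargs '--osd-max-scrubs 3'
ceph tell osd.2 injectargs '--osd-max-scrubs 3'
ceph pg repair 3.2f
ceph tell osd.0 injectargs '--osd-max-scrubs 1'
ceph tell osd.1 injectargs '--osd-max-scrubs 1'
ceph tell osd.2 injectargs '--osd-max-scrubs 1'
```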
Peered
Meaning: the PG has peered but cannot find enough replicas to serve reads and writes (not even min_size can be satisfied).
Cause: multiple OSDs are down, so the number of currently active replicas is below min_size and reads and writes are locked out.
Consequences: the PG is unusable; even routine pool IO fails.
Solution: in a healthy cluster, an OSD that has been down for more than five minutes is automatically remapped and repaired. To fix the state quickly there are two options:
    1. try to start the replica OSDs and rejoin them to the cluster; the peered state will disappear on its own
    2. mark the lost OSDs out; Ceph will then automatically enter the repair state
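A hypothetical illustration of the two options above, assuming `osd.2` is the OSD that is down; use `ceph osd tree` to find the real id.

```shell
# Option 1: bring the replica OSD back
systemctl start ceph-osd@2

# Option 2: if the OSD cannot be revived, mark it out so Ceph repairs around it
ceph osd out 2
```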
Recovering
Meaning: the PG is recovering.
Cause: when an OSD goes down, the PGs it holds fall behind the replicas on the other OSDs in their placement groups; when the OSD comes back up, the content of those placement groups must be updated to the current state.
Consequences: recovery is not always trivial, because a hardware failure may take down several OSDs at once. For example, a failed network switch in a rack or room can leave the OSDs on several hosts behind the cluster state, and each of them must recover afterwards.
Solution: this state means the PGs are recovering automatically; just wait for the recovery to complete.
Recovery-wait
Meaning: the PG is waiting for a recovery resource reservation.
Cause: the PG is queued for recovery.
Consequences: in theory the PG will next enter the recovering state to recover its data.
Solution: this is a normal state during recovery; no action needed.
Remapped
Meaning: the PG is being remapped.
Cause: when the acting set of a PG changes, data migrates from the old set to the new one. This can take a relatively long time, and until the migration completes the new primary OSD cannot serve requests. Instead, the new primary asks the old primary to keep serving requests until the PG migration finishes. Once the data migration is complete, the new primary OSD takes over and accepts requests.
Consequences: if the remapping cannot complete, the data cannot be migrated and may be lost.
Solution: when OSDs are added, or when an OSD hosting the PG hangs, the CRUSH algorithm reassigns the PG to new OSDs and the PG is remapped to them.
While a PG is remapped, its acting set and up set are inconsistent.
Client IO can still read and write normally.
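Two read-only commands for comparing the up set and acting set of a remapped PG, assuming the PG id is `4.7`; take the real id from `ceph pg dump`.

```shell
ceph pg map 4.7        # shows the up set and acting set for the PG
ceph pg 4.7 query      # detailed state, including recovery/backfill progress
```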
Scrubbing
Meaning: the PG is being scrubbed.
Cause: the PG is running a consistency check.
Consequences: IO performance degrades while the scrub runs.
Solution: depending on the needs of the environment, turn the feature off or reduce the scrub frequency.
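A sketch of temporarily turning scrubbing off cluster-wide, as suggested above; these flags are standard but remember to unset them, since long-unscrubbed data goes unchecked.

```shell
ceph osd set noscrub        # temporarily disable shallow scrubs
ceph osd set nodeep-scrub   # temporarily disable deep scrubs
# ...later, re-enable them:
ceph osd unset noscrub
ceph osd unset nodeep-scrub
```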
Inactive
Meaning: the PG is inactive and cannot process read or write requests.
Cause: the PG has not reached the active state for a long time, so read and write requests cannot be executed.
Consequences: the PG cannot serve reads or writes.
Solution: wait for the OSDs to update their data to the latest state.
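Two read-only commands for finding PGs stuck in this state:

```shell
ceph pg dump_stuck inactive   # list PGs stuck inactive
ceph health detail            # shows which PGs are affected and why
```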
Unclean
Meaning: the PG is unclean and cannot recover from a failure on its own.
Cause: some objects in the placement group do not have the desired number of replicas and should be recovered.
Consequences: data safety decreases.
Solution: usually a recovery operation has to be performed.
Stale
Meaning: the PG is stale; its OSDs have not reported any updates for it.
Cause: usually caused by an OSD hanging; under normal circumstances it appears together with the peering state.
Simulation: manually stop an OSD with systemctl stop ceph-osd@0 and watch ceph -s; for a short time (before peering) the PG will show the special stale+clean+active status.
Consequences: a warning sign; it usually means an OSD is abnormal or a node has gone off the network.
Solution: in general, just wait for peering to complete.
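The simulation above can be run and observed with:

```shell
systemctl stop ceph-osd@0   # stop one OSD
ceph -s                     # briefly shows stale PGs before peering kicks in
ceph pg dump_stuck stale    # list PGs currently stuck stale
```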
Undersized
Meaning: the PG has too few replicas.
Cause: the number of replicas of the PG is less than the replica count configured on the storage pool, usually because an OSD service is down.
Consequences: data availability is reduced.
Solution: adjust the replica count of the pool the PG belongs to. The default min_size of a pool is 1 and adjusting it is not recommended; otherwise, wait for the OSD service to come back up.
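A few read-only checks for this state, assuming a pool named `rbd` as a placeholder; use `ceph osd pool ls` for the real pool names.

```shell
ceph osd pool get rbd size      # configured replica count
ceph osd pool get rbd min_size  # minimum replicas needed to serve IO
ceph osd tree                   # find the OSD that is down
```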
Down
Meaning: the PG is down.
Cause: the OSD holding an authoritative replica of the placement group is down; you must wait for it to boot, or mark it as lost, before IO can continue.
Consequences: during this time the PG cannot serve client read or write IO; IO hangs.
Solution: bring the OSD service back up.
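A hypothetical way to handle a down PG, assuming `osd.5` holds the authoritative replica; find the real id with `ceph pg <pgid> query`.

```shell
# Preferred: bring the OSD back
systemctl start ceph-osd@5

# Last resort, if the OSD is unrecoverable: mark it lost so IO can continue
# (this may lose any data that existed only on that OSD)
ceph osd lost 5 --yes-i-really-mean-it
```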
Ceph PG States in Detail
Origin www.cnblogs.com/qwangxiao/p/11453866.html