Starting Up and Shutting Down Your Apache Geode System

 

        Determine proper startup and shutdown procedures and write your startup and shutdown scripts.

 

Well-designed startup and shutdown procedures can speed up startup and protect your data. The processes you need to start and stop include server and locator processes as well as your other Geode applications, including clients. The procedures you use depend in part on your system configuration and the dependencies between your system processes.

 

      Use the guidelines below to create startup and shutdown procedures and scripts. Some of these instructions use gfsh (Geode SHell).

 

Start Your System

When you start your Geode system, bring up its members in a defined order.

 

Start a distributed system before you start its client applications. In each distributed system, follow these guidelines for member startup:

  • Start the locators first. See Running the Geode Locator Process for an example of a locator startup command.
  • Start cache servers before the rest of your processes, unless the implementation requires other processes to start before them. See Running the Geode Server Process for an example of a server startup command.
  • If your distributed system uses both persistent replicated regions and non-persistent replicated regions, start all members with persistent replicated regions in parallel before you start the non-persistent regions. That way, persistent members do not delay their startup waiting for other persistent members that hold later data.
  • For a system that includes persistent regions, see Start Up and Shut Down with Disk Stores.
  • If you are running a producer process and a consumer or listener process, start the consumer first. This ensures that consumers and listeners do not miss any notifications or updates.
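  • If you start your locators and peer members all at once, you can set the locator-wait-time property (in seconds) at process startup. This timeout allows peers to wait for the locators to finish starting up before they attempt to join the distributed system. A process that has been configured to wait for a locator logs an info-level message such as:

    GemFire startup was unable to contact a locator. Waiting for one to start.
    Configured locators are frodo[12345],pippin[12345].

    The process then pauses briefly and retries until it either connects or the time specified in locator-wait-time has elapsed. By default, locator-wait-time is set to 0, meaning that a process that cannot connect to a locator at startup throws an exception.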
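The ordering above can be sketched as a small startup script. This is only an illustration: the member names, ports, the 120-second wait, and the DRY_RUN switch are assumptions, not part of the original procedure. locator-wait-time is passed here as a gemfire.-prefixed Java system property through the --J option.

```shell
#!/bin/sh
# Startup sketch. Member names, ports, and the 120-second wait are assumptions.
# DRY_RUN=1 (the default here) prints each gfsh command instead of running it.
DRY_RUN="${DRY_RUN:-1}"
LOCATORS="localhost[10334]"   # assumed locator host[port]

run_gfsh() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "gfsh $*"
  else
    gfsh "$@"
  fi
}

start_cluster() {
  # 1. Start the locator first.
  run_gfsh start locator --name=locator1 --port=10334

  # 2. Start the servers in parallel. locator-wait-time (seconds) lets a
  #    server retry instead of failing if no locator is reachable yet.
  run_gfsh start server --name=server1 --server-port=40404 \
    --locators="$LOCATORS" --J=-Dgemfire.locator-wait-time=120 &
  run_gfsh start server --name=server2 --server-port=40405 \
    --locators="$LOCATORS" --J=-Dgemfire.locator-wait-time=120 &
  wait   # block until both server start commands have returned

  # 3. Consumers and listeners, then producers and clients, would start here.
}

start_cluster
```

Run it with DRY_RUN=0 to execute the commands for real. The dry-run output drops shell quoting, so treat it as a preview rather than something to paste back into a shell.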

Note: You can optionally override the default timeout period for shutting down individual processes. This override setting must be specified during member startup. See Shutting Down the System for details.

 

Start Up After Losing Data on Disk

This information pertains to catastrophic loss of Geode disk store files. If you lose disk store files, your next startup may hang, waiting for the lost disk stores to come back online. If your system hangs at startup, use the gfsh show missing-disk-stores command to list the missing disk stores and, if necessary, revoke them so that startup can complete. You must use the disk store ID to revoke a disk store. These are the two commands:

gfsh>show missing-disk-stores

Disk Store ID                        |   Host    |               Directory
------------------------------------ | --------- | -------------------------------------
60399215-532b-406f-b81f-9b5bd8d1b55a | excalibur | /usr/local/gemfire/deploy/disk_store1

gfsh>revoke missing-disk-store --id=60399215-532b-406f-b81f-9b5bd8d1b55a

Note: These gfsh commands require that you are connected to the distributed system through a JMX manager.
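Because both commands need a JMX manager connection, a script has to connect first. The sketch below chains the commands with gfsh's -e option. The locator address and the DRY_RUN switch are assumptions for illustration.

```shell
#!/bin/sh
# Disk store recovery sketch. The locator address is an assumption.
# DRY_RUN=1 (the default here) prints each gfsh command instead of running it.
DRY_RUN="${DRY_RUN:-1}"
LOCATOR="localhost[10334]"

run_gfsh() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "gfsh $*"
  else
    gfsh "$@"
  fi
}

# Connect through the locator, then list the disk stores the system is waiting for.
list_missing_disk_stores() {
  run_gfsh -e "connect --locator=$LOCATOR" -e "show missing-disk-stores"
}

# $1 is a value from the "Disk Store ID" column of the listing.
revoke_disk_store() {
  run_gfsh -e "connect --locator=$LOCATOR" -e "revoke missing-disk-store --id=$1"
}

list_missing_disk_stores
```

Revoking is deliberately a separate function that takes an ID copied from the listing, so that a disk store is never revoked without being inspected first.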

 

Shut Down the System

Shut down your Geode system either by using the gfsh shutdown command or by shutting down individual members one at a time.

 

 

Use the shutdown Command

If you are using persistent regions (members that persist data to disk), you should use the gfsh shutdown command to stop the running system in an orderly fashion. This command synchronizes persistent partitioned regions before shutting down, which makes the next startup of the distributed system as efficient as possible.

If possible, all members should be running before you shut them down so that synchronization can take place. Use the following gfsh command to shut down the system:

 

gfsh>shutdown
By default, the shutdown command shuts down only the data nodes. To shut down all nodes, including locators, specify the --include-locators=true parameter. For example:

 

 

gfsh>shutdown --include-locators=true
 

 

This shuts down the locators one by one, shutting down the manager last.

To shut down all data members after a grace period, specify the --time-out option (in seconds):

 

gfsh>shutdown --time-out=60
To shut down all members, including locators, after a grace period, specify the --time-out option (in seconds):

 

gfsh>shutdown --include-locators=true --time-out=60
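For a shutdown script, the same options can be assembled non-interactively. The sketch below is an illustration under assumptions: the locator address, the 60-second default, and the DRY_RUN switch are not from the original text, and you should confirm how gfsh handles its shutdown confirmation prompt when run with -e in your version.

```shell
#!/bin/sh
# Shutdown sketch. The locator address and default timeout are assumptions.
# DRY_RUN=1 (the default here) prints each gfsh command instead of running it.
DRY_RUN="${DRY_RUN:-1}"
LOCATOR="localhost[10334]"

run_gfsh() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "gfsh $*"
  else
    gfsh "$@"
  fi
}

# $1: grace period in seconds (assumed default: 60)
# $2: "true" to also shut down locators (default: data members only)
shutdown_cluster() {
  cmd="shutdown --time-out=${1:-60}"
  if [ "${2:-false}" = "true" ]; then
    cmd="$cmd --include-locators=true"
  fi
  # shutdown needs a JMX manager connection, so connect first.
  run_gfsh -e "connect --locator=$LOCATOR" -e "$cmd"
}

shutdown_cluster 60 true
```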
 

Shut Down System Members Individually

If you are not using persistent regions, you can shut down the system by shutting down the members in the reverse of the order in which they were started. (See "Start Your System" for the recommended member startup sequence.)

Shut down the distributed system members according to member type. For example, use the following mechanisms to shut down members:

  • Use the appropriate mechanism to shut down any Geode-connected client applications that are running in the distributed system.
  • Shut down all cache servers. To shut down a server, use the following gfsh command:

 

gfsh>stop server --name=<...>
 or

 

 

gfsh>stop server --dir=<server_working_dir>
  • Shut down any locators. To shut down a locator, use the following gfsh command:

 

 

gfsh>stop locator --name=<...>
or

 

 

gfsh>stop locator --dir=<locator_working_dir>
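The reverse-order rule can be sketched as a script that takes the members' working directories in start order and stops them last-to-first, servers before locators. The directory names and the DRY_RUN switch are assumptions for illustration. Clients are assumed to have been stopped already by their own mechanism.

```shell
#!/bin/sh
# Reverse-order stop sketch. Directory names are assumptions.
# DRY_RUN=1 (the default here) prints each gfsh command instead of running it.
DRY_RUN="${DRY_RUN:-1}"
SERVER_DIRS="server1 server2"   # server working directories, in start order
LOCATOR_DIRS="locator1"         # locator working directories, in start order

run_gfsh() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "gfsh $*"
  else
    gfsh "$@"
  fi
}

# Print the arguments in reverse order, separated by spaces.
reversed() {
  out=""
  for item in "$@"; do
    out="$item${out:+ $out}"
  done
  echo "$out"
}

stop_cluster() {
  # Servers first, last-started first.
  for dir in $(reversed $SERVER_DIRS); do
    run_gfsh stop server --dir="$dir"
  done
  # Locators last.
  for dir in $(reversed $LOCATOR_DIRS); do
    run_gfsh stop locator --dir="$dir"
  done
}

stop_cluster
```

The sketch uses --dir rather than --name so that each stop command can act on a local member directory.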
 

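Option for System Member Shutdown Behavior

The DISCONNECT_WAIT command-line argument sets the maximum time for each individual step in the shutdown process. If any step takes longer than the specified amount, it is forced to end. Each operation is given this grace period, so the total time a cache member takes to shut down depends on the number of operations and the DISCONNECT_WAIT setting. During the shutdown process, Geode produces messages such as: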

Disconnect listener still running

 

The DISCONNECT_WAIT default is 10000 milliseconds.

To change it, set this system property on the Java command line when the member starts. For example:

gfsh>start server --J=-DDistributionManager.DISCONNECT_WAIT=<milliseconds>

Each process can have a different DISCONNECT_WAIT setting.
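To get a feel for how the setting adds up, the sketch below computes a rough upper bound on the time a member could spend waiting during shutdown. It assumes the steps run one after another and that each may use its full grace period. The step count of 3 is a made-up figure, not a Geode constant.

```shell
#!/bin/sh
# Rough upper bound on shutdown wait time, under the assumptions stated above.
# $1: number of shutdown steps (hypothetical count for illustration)
# $2: DISCONNECT_WAIT in milliseconds
max_shutdown_wait_ms() {
  echo $(( $1 * $2 ))
}

# Three steps at the default DISCONNECT_WAIT of 10000 ms.
max_shutdown_wait_ms 3 10000   # prints 30000
```

A member whose listeners need longer to finish could be started with a larger value, for example --J=-DDistributionManager.DISCONNECT_WAIT=20000, which doubles this bound.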

 

 

 

 
