MongoDB Operation and Maintenance Optimization Series (2)

An MQ consumer with 5 threads open processes roughly the following volume of data in an average day:

Main table: 249733 operations (upsert + $inc)

Child table: 1732389 operations (insert)

At this level of write volume, database inserts began to lag: the delay exceeded 5 minutes and kept growing...

Running jstack <pid> to check the state of the MQ consumer threads, all five looked like this:

"main" prio=10 tid=0x00007f38ac009000 nid=0x1359 runnable [0x00007f38b3723000]
   java.lang.Thread.State: RUNNABLE
	at java.net.PlainSocketImpl.socketAccept(Native Method)
	at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
	at java.net.ServerSocket.implAccept(ServerSocket.java:530)
	at java.net.ServerSocket.accept(ServerSocket.java:498)
	at org.apache.catalina.core.StandardServer.await(StandardServer.java:470)
	at org.apache.catalina.startup.Catalina.await(Catalina.java:781)
	at org.apache.catalina.startup.Catalina.start(Catalina.java:727)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:294)
	at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:428)

Note this line: java.lang.Thread.State: RUNNABLE

It means the thread is actively running; with all 5 threads in this state, the consumers are flat-out busy.

At this point the root cause was not yet confirmed, but there were three candidate solutions, which could also be pursued in parallel:

1. Add MongoDB indexes

2. Increase the number of consumer threads (5 -> 15)

3. Make the handler inside the message onMessage callback asynchronous (i.e., hand the work off to a thread pool)

Of the three options, option 1 was the fastest to implement, so we went with it first. To verify whether missing indexes were the cause, we enabled the MongoDB slow query log; the command is as follows:

db.setProfilingLevel(1, 50);

The first parameter is the profiling level:

0 - profiling disabled

1 - log slow operations (threshold defaults to 100 ms)

2 - log all operations

The second parameter is the slow-operation threshold n, in milliseconds.
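
To confirm the setting took effect, you can check the profiler status from the shell:

// Returns the current profiling level and slow threshold,
// e.g. { "was" : 1, "slowms" : 50 }
db.getProfilingStatus()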

Once profiling is enabled, MongoDB records the profiled operations in a system collection under the database:

system.profile
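
To pull the most recent slow entries out of it, you can query the collection directly; a minimal sketch (the millis filter mirrors the 50 ms threshold set above):

// Latest profiled operations slower than 50 ms, newest first
db.system.profile.find({ millis: { $gt: 50 } }).sort({ ts: -1 }).limit(5).pretty()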

The structure of an entry in this collection looks like this:

{
    "op" : "update",
    "ns" : "chae_prod.oma_osoa_link_total",
    "query" : {
        "$and" : [
            {
                "nsCode" : {
                    "$eq" : "lyf"
                }
            },
            {
                "envCode" : {
                    "$eq" : "prod"
                }
            },
            {
                "minuteTime" : {
                    "$eq" : "2017-06-29 10:09"
                }
            },
            {
                "clientServiceName" : {
                    "$eq" : "back-order-web"
                }
            },
            {
                "serviceName" : {
                    "$eq" : "basics-stock-service"
                }
            }
        ]
    },
    "updateobj" : {
        "$set" : {
            "clientServiceName" : "back-order-web",
            "nsCode" : "lyf",
            "envCode" : "prod",
            "minuteTime" : "2017-06-29 10:09",
            "belongDate" : "2017-06-29",
            "serviceName" : "basics-stock-service"
        },
        "$inc" : {
            "successTotalNumber" : 1,
            "totalNumber" : 1,
            "failedTotalNumber" : 0
        }
    },
    "keysExamined" : 0,
    "docsExamined" : 213026,
    "nMatched" : 1,
    "nModified" : 1,
    "keyUpdates" : 0,
    "writeConflicts" : 0,
    "numYield" : 1669,
    "locks" : {
        "Global" : {
            "acquireCount" : {
                "r" : NumberLong(1670),
                "w" : NumberLong(1670)
            }
        },
        "Database" : {
            "acquireCount" : {
                "w" : NumberLong(1670)
            }
        },
        "Collection" : {
            "acquireCount" : {
                "w" : NumberLong(1670)
            }
        }
    },
    "millis" : 631,
    "execStats" : {},
    "ts" : ISODate("2017-06-29T10:19:58.387+08:00"),
    "client" : "10.10.254.15",
    "allUsers" : [],
    "user" : ""
}

The millis field indicates the execution time: 631 ms for a single update, which is very high. Note also keysExamined: 0 versus docsExamined: 213026; with no usable index, every update scans the whole collection.

OK, so we quickly moved on to adding the indexes.

On the main table we added a compound index covering the upsert query (this is also the front-end query index):

{
    "nsCode" : -1,
    "envCode" : -1,
    "minuteTime" : -1,
    "clientServiceName" : -1,
    "serviceName" : -1
}
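
The corresponding shell command is sketched below; the collection name oma_osoa_link_total is taken from the ns field of the profile entry above, and background: true is an assumption to avoid blocking writes while the index builds:

// Compound index matching the main table's upsert/front-end query pattern.
// background: true is assumed so the build does not block other operations.
db.oma_osoa_link_total.createIndex(
    { nsCode: -1, envCode: -1, minuteTime: -1, clientServiceName: -1, serviceName: -1 },
    { background: true }
)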

On the child table we added the front-end query index:

{
    "nsCode" : -1,
    "envCode" : -1,
    "endTime" : -1,
    "clientServiceName" : -1,
    "serviceName" : -1
}
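
A similar call works for the child table; its real name is not given in this post, so oma_osoa_link_detail below is a hypothetical placeholder:

// Hypothetical collection name; substitute your actual child (detail) table.
db.oma_osoa_link_detail.createIndex(
    { nsCode: -1, envCode: -1, endTime: -1, clientServiceName: -1, serviceName: -1 },
    { background: true }
)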

After the indexes were added, we queried the data again: the latest records were now being written on time, and the data for the current minute was correct.
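
To confirm the new index is actually used, the query can be run through explain; the filter values below are taken from the profile entry shown earlier:

// executionStats should now show an IXSCAN stage with a small docsExamined,
// instead of the earlier full scan (docsExamined: 213026).
db.oma_osoa_link_total.find({
    nsCode: "lyf",
    envCode: "prod",
    minuteTime: "2017-06-29 10:09",
    clientServiceName: "back-order-web",
    serviceName: "basics-stock-service"
}).explain("executionStats")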

Let's look at the thread stacks again:

"pool-2-thread-1" daemon prio=10 tid=0x00007f385ce3c800 nid=0x1370 waiting on condition [0x00007f38a14f7000]
   java.lang.Thread.State: TIMED_WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000000c6bd5740> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
	at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1090)
	at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

The TIMED_WAITING in "TIMED_WAITING (parking)" means the thread is in a waiting state, but with a time limit: it exits the waiting state automatically once the specified time elapses. The "parking" part means the thread has been suspended (parked).

In other words, our consumer threads are now parked and relatively idle, waiting for the next task from the pool's queue.

After observing for a while, we found no further problems, so we did not go ahead with options 2 and 3.

To be on the safe side, we added an option 4:

Option 4: Delete historical data

There are two ways to implement option 4. One is to use MongoDB's built-in TTL indexes; the syntax is as follows (adapt it to your own collections):

db.log_events.createIndex({ "createdAt": 1 }, { expireAfterSeconds: 180 })  // expire documents 180 seconds after createdAt

The expireAfterSeconds option is in seconds: here documents are removed 180 seconds (3 minutes) after their createdAt time. MongoDB's TTL monitor runs in the background about once a minute, so deletion is not instantaneous.

Since we had only allocated 1 GB of memory to MongoDB by default, we also increased the MongoDB memory to 2 GB.
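
One way to set this, assuming the WiredTiger storage engine, is the cache size option in mongod.conf (a sketch; mongod must be restarted for it to take effect):

# mongod.conf: cap the WiredTiger cache at 2 GB
storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 2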

The other is to write your own job that deletes historical data on a schedule; a minimal sketch follows.
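
For example, a scheduled job could run something like the following; belongDate comes from the document fields in the profile entry above, and the cutoff value is purely illustrative:

// Delete main-table documents older than an (illustrative) cutoff date.
// Note: without an index on belongDate this will scan the collection.
db.oma_osoa_link_total.remove({ belongDate: { $lt: "2017-05-29" } })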

Well, the program optimization ends here. Looking forward to sharing more MongoDB operations and maintenance optimizations with you.
