Cassandra throws OutOfMemory; how to tune?

Honza :

I've built a new cluster with Cassandra 3.11.3, filled it with data from the old cluster running Cassandra 2.0.14, and a few hours after switching the application to the new cluster I got memory errors:

INFO  [IndexSummaryManager:1] 2019-05-07 23:44:42,090 IndexSummaryRedistribution.java:77 - Redistributing index summaries
INFO  [ReadStage-1] 2019-05-07 23:46:48,394 NoSpamLogger.java:91 - Maximum memory usage reached (536870912), cannot allocate chunk of 1048576
INFO  [ReadStage-1] 2019-05-08 00:01:48,613 NoSpamLogger.java:91 - Maximum memory usage reached (536870912), cannot allocate chunk of 1048576
INFO  [ReadStage-1] 2019-05-08 00:16:48,622 NoSpamLogger.java:91 - Maximum memory usage reached (536870912), cannot allocate chunk of 1048576
INFO  [ReadStage-1] 2019-05-08 00:31:48,778 NoSpamLogger.java:91 - Maximum memory usage reached (536870912), cannot allocate chunk of 1048576
INFO  [IndexSummaryManager:1] 2019-05-08 00:44:42,761 IndexSummaryRedistribution.java:77 - Redistributing index summaries
INFO  [ReadStage-1] 2019-05-08 00:46:48,860 NoSpamLogger.java:91 - Maximum memory usage reached (536870912), cannot allocate chunk of 1048576
INFO  [ReadStage-1] 2019-05-08 01:01:48,863 NoSpamLogger.java:91 - Maximum memory usage reached (536870912), cannot allocate chunk of 1048576
INFO  [ReadStage-1] 2019-05-08 01:16:48,881 NoSpamLogger.java:91 - Maximum memory usage reached (536870912), cannot allocate chunk of 1048576
INFO  [ReadStage-2] 2019-05-08 01:31:49,135 NoSpamLogger.java:91 - Maximum memory usage reached (536870912), cannot allocate chunk of 1048576
INFO  [ReadStage-1] 2019-05-08 01:41:49,962 MonitoringTask.java:93 - Scheduling monitoring task with report interval of 5000 ms, max operations 50
INFO  [ScheduledTasks:1] 2019-05-08 01:41:54,972 NoSpamLogger.java:91 - Some operations were slow, details available at debug level (debug.log)
INFO  [IndexSummaryManager:1] 2019-05-08 01:44:43,408 IndexSummaryRedistribution.java:77 - Redistributing index summaries
INFO  [ReadStage-1] 2019-05-08 01:46:49,502 NoSpamLogger.java:91 - Maximum memory usage reached (536870912), cannot allocate chunk of 1048576
INFO  [ScheduledTasks:1] 2019-05-08 01:50:55,000 NoSpamLogger.java:91 - Some operations were slow, details available at debug level (debug.log)
ERROR [ReadStage-3] 2019-05-08 01:57:09,375 JVMStabilityInspector.java:74 - OutOfMemory error letting the JVM handle the error:
java.lang.OutOfMemoryError: null
        at java.lang.ClassLoader.defineClass1(Native Method) ~[na:1.8.0_181]
        at java.lang.ClassLoader.defineClass(ClassLoader.java:763) ~[na:1.8.0_181]
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) ~[na:1.8.0_181]
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:468) ~[na:1.8.0_181]
        at java.net.URLClassLoader.access$100(URLClassLoader.java:74) ~[na:1.8.0_181]
        at java.net.URLClassLoader$1.run(URLClassLoader.java:369) ~[na:1.8.0_181]
        at java.net.URLClassLoader$1.run(URLClassLoader.java:363) ~[na:1.8.0_181]
        at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_181]
        at java.net.URLClassLoader.findClass(URLClassLoader.java:362) ~[na:1.8.0_181]
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[na:1.8.0_181]
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) ~[na:1.8.0_181]
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[na:1.8.0_181]
        at org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:243) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:214) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.io.util.RandomAccessReader.seek(RandomAccessReader.java:207) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.io.util.FileHandle.createReader(FileHandle.java:150) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.io.sstable.format.SSTableReader.getFileDataInput(SSTableReader.java:1807) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.columniterator.AbstractSSTableIterator.<init>(AbstractSSTableIterator.java:103) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.columniterator.SSTableIterator.<init>(SSTableIterator.java:49) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.io.sstable.format.big.BigTableReader.iterator(BigTableReader.java:72) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.io.sstable.format.big.BigTableReader.iterator(BigTableReader.java:65) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.initializeIterator(UnfilteredRowIteratorWithLowerBound.java:107) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.maybeInit(LazilyInitializedUnfilteredRowIterator.java:48) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.getPartitionIndexLowerBound(UnfilteredRowIteratorWithLowerBound.java:191) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.lowerBound(UnfilteredRowIteratorWithLowerBound.java:88) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.lowerBound(UnfilteredRowIteratorWithLowerBound.java:47) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.utils.MergeIterator$Candidate.<init>(MergeIterator.java:362) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(MergeIterator.java:147) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:44) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.<init>(UnfilteredRowIterators.java:406) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.create(UnfilteredRowIterators.java:422) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.access$000(UnfilteredRowIterators.java:385) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.rows.UnfilteredRowIterators.merge(UnfilteredRowIterators.java:121) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.SinglePartitionReadCommand.withSSTablesIterated(SinglePartitionReadCommand.java:848) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndDiskInternal(SinglePartitionReadCommand.java:794) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndDisk(SinglePartitionReadCommand.java:669) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.SinglePartitionReadCommand.queryStorage(SinglePartitionReadCommand.java:503) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.ReadCommand.executeLocally(ReadCommand.java:422) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1887) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2597) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_181]
        at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.11.3.jar:3.11.3]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
ERROR [ReadStage-3] 2019-05-08 01:57:09,378 JVMStabilityInspector.java:74 - OutOfMemory error letting the JVM handle the error:
java.lang.OutOfMemoryError: null
        at java.lang.ClassLoader.defineClass1(Native Method) ~[na:1.8.0_181]
        at java.lang.ClassLoader.defineClass(ClassLoader.java:763) ~[na:1.8.0_181]
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) ~[na:1.8.0_181]
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:468) ~[na:1.8.0_181]
        at java.net.URLClassLoader.access$100(URLClassLoader.java:74) ~[na:1.8.0_181]
        at java.net.URLClassLoader$1.run(URLClassLoader.java:369) ~[na:1.8.0_181]
        at java.net.URLClassLoader$1.run(URLClassLoader.java:363) ~[na:1.8.0_181]
        at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_181]
        at java.net.URLClassLoader.findClass(URLClassLoader.java:362) ~[na:1.8.0_181]
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[na:1.8.0_181]
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) ~[na:1.8.0_181]
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[na:1.8.0_181]
        at org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:243) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:214) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.io.util.RandomAccessReader.seek(RandomAccessReader.java:207) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.io.util.FileHandle.createReader(FileHandle.java:150) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.io.sstable.format.SSTableReader.getFileDataInput(SSTableReader.java:1807) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.columniterator.AbstractSSTableIterator.<init>(AbstractSSTableIterator.java:103) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.columniterator.SSTableIterator.<init>(SSTableIterator.java:49) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.io.sstable.format.big.BigTableReader.iterator(BigTableReader.java:72) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.io.sstable.format.big.BigTableReader.iterator(BigTableReader.java:65) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.initializeIterator(UnfilteredRowIteratorWithLowerBound.java:107) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.maybeInit(LazilyInitializedUnfilteredRowIterator.java:48) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.getPartitionIndexLowerBound(UnfilteredRowIteratorWithLowerBound.java:191) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.lowerBound(UnfilteredRowIteratorWithLowerBound.java:88) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.lowerBound(UnfilteredRowIteratorWithLowerBound.java:47) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.utils.MergeIterator$Candidate.<init>(MergeIterator.java:362) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(MergeIterator.java:147) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:44) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.<init>(UnfilteredRowIterators.java:406) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.create(UnfilteredRowIterators.java:422) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.access$000(UnfilteredRowIterators.java:385) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.rows.UnfilteredRowIterators.merge(UnfilteredRowIterators.java:121) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.SinglePartitionReadCommand.withSSTablesIterated(SinglePartitionReadCommand.java:848) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndDiskInternal(SinglePartitionReadCommand.java:794) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndDisk(SinglePartitionReadCommand.java:669) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.SinglePartitionReadCommand.queryStorage(SinglePartitionReadCommand.java:503) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.ReadCommand.executeLocally(ReadCommand.java:422) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1887) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2597) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_181]
        at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]

Now, I seriously hope that Cassandra 3.11.3 can handle the same load as 2.0.14 with twice as much memory (the previous cluster had 4 GB of RAM, this one has 7.5 GB), so I assume I've set something up incorrectly. Unless the mistake was starting to use vnodes.

The command line, which includes the memory settings, is:

java -Xloggc:/var/log/cassandra/gc.log -XX:+PrintHeapAtGC -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=50 -XX:GCLogFileSize=100M -ea -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003 -XX:+AlwaysPreTouch -XX:-UseBiasedLocking -XX:+UseTLAB -XX:+ResizeTLAB -XX:+UseNUMA -XX:+PerfDisableSharedMem -Djava.net.preferIPv4Stack=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSWaitDuration=10000 -XX:+CMSParallelInitialMarkEnabled -XX:+CMSEdenChunksRecordAlways -XX:+CMSClassUnloadingEnabled -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -XX:+PrintPromotionFailure -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M -Xms5120M -Xmx5120M -Xmn800M -XX:CompileCommandFile=/etc/cassandra/hotspot_compiler -javaagent:/usr/share/cassandra/lib/jamm-0.3.0.jar -Dcassandra.jmx.local.port=7199 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.password.file=/etc/cassandra/jmxremote.password -Djava.library.path=/usr/share/cassandra/lib/sigar-bin -Dcassandra.libjemalloc=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 -XX:OnOutOfMemoryError=kill -9 %p -Dlogback.configurationFile=logback.xml -Dcassandra.logdir=/var/log/cassandra -Dcassandra.storagedir=/var/lib/cassandra -Dcassandra-pidfile=/var/run/cassandra/cassandra.pid -cp 
/etc/cassandra:/usr/share/cassandra/lib/HdrHistogram-2.1.9.jar:/usr/share/cassandra/lib/ST4-4.0.8.jar:/usr/share/cassandra/lib/airline-0.6.jar:/usr/share/cassandra/lib/antlr-runtime-3.5.2.jar:/usr/share/cassandra/lib/asm-5.0.4.jar:/usr/share/cassandra/lib/caffeine-2.2.6.jar:/usr/share/cassandra/lib/cassandra-driver-core-3.0.1-shaded.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.9.jar:/usr/share/cassandra/lib/commons-lang3-3.1.jar:/usr/share/cassandra/lib/commons-math3-3.2.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrent-trees-2.4.0.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.4.jar:/usr/share/cassandra/lib/disruptor-3.0.1.jar:/usr/share/cassandra/lib/ecj-4.4.2.jar:/usr/share/cassandra/lib/guava-18.0.jar:/usr/share/cassandra/lib/high-scale-lib-1.0.6.jar:/usr/share/cassandra/lib/hppc-0.5.4.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.13.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.13.jar:/usr/share/cassandra/lib/jamm-0.3.0.jar:/usr/share/cassandra/lib/javax.inject.jar:/usr/share/cassandra/lib/jbcrypt-0.3m.jar:/usr/share/cassandra/lib/jcl-over-slf4j-1.7.7.jar:/usr/share/cassandra/lib/jctools-core-1.2.1.jar:/usr/share/cassandra/lib/jflex-1.6.0.jar:/usr/share/cassandra/lib/jna-4.2.2.jar:/usr/share/cassandra/lib/joda-time-2.4.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/jstackjunit-0.0.1.jar:/usr/share/cassandra/lib/libthrift-0.9.2.jar:/usr/share/cassandra/lib/log4j-over-slf4j-1.7.7.jar:/usr/share/cassandra/lib/logback-classic-1.1.3.jar:/usr/share/cassandra/lib/logback-core-1.1.3.jar:/usr/share/cassandra/lib/lz4-1.3.0.jar:/usr/share/cassandra/lib/metrics-core-3.1.5.jar:/usr/share/cassandra/lib/metrics-jvm-3.1.5.jar:/usr/share/cassandra/lib/metrics-logback-3.1.5.jar:/usr/share/cassandra/lib/netty-all-4.0.44.Final.jar:/usr/share/cassandra/lib/ohc-core-0.4.4.jar:/usr/share/cassandra/lib/ohc-core-j8-0.4.4.jar:/usr/share/cassandra/lib/report
er-config-base-3.0.3.jar:/usr/share/cassandra/lib/reporter-config3-3.0.3.jar:/usr/share/cassandra/lib/sigar-1.6.4.jar:/usr/share/cassandra/lib/slf4j-api-1.7.7.jar:/usr/share/cassandra/lib/snakeyaml-1.11.jar:/usr/share/cassandra/lib/snappy-java-1.1.1.7.jar:/usr/share/cassandra/lib/snowball-stemmer-1.3.0.581.1.jar:/usr/share/cassandra/lib/stream-2.5.2.jar:/usr/share/cassandra/lib/thrift-server-0.3.7.jar:/usr/share/cassandra/apache-cassandra-3.11.3.jar:/usr/share/cassandra/apache-cassandra-thrift-3.11.3.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/cassandra/stress.jar: -XX:HeapDumpPath=/var/lib/cassandra/java_1557299542.hprof -XX:ErrorFile=/var/lib/cassandra/hs_err_1557299542.log org.apache.cassandra.service.CassandraDaemon

nodetool info now (after restart and switching the apps back) reports:

ID                     : d1f45a7e-0344-4806-894e-20c2ad4a2915
Gossip active          : true
Thrift active          : true
Native Transport active: true
Load                   : 189.55 GiB
Generation No          : 1557299590
Uptime (seconds)       : 4433
Heap Memory (MB)       : 1284.83 / 5040.00
Off Heap Memory (MB)   : 56.34
Data Center            : dc1
Rack                   : rack1
Exceptions             : 0
Key Cache              : entries 52231, size 4.39 MiB, capacity 100 MiB, 212159 hits, 440743 requests, 0.481 recent hit rate, 14400 save period in seconds
Row Cache              : entries 0, size 0 bytes, capacity 0 bytes, 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds
Counter Cache          : entries 0, size 0 bytes, capacity 50 MiB, 0 hits, 0 requests, NaN recent hit rate, 7200 save period in seconds
Chunk Cache            : entries 6672, size 417.06 MiB, capacity 480 MiB, 1053622 misses, 1313584 requests, 0.198 recent hit rate, 180.246 microseconds miss latency
Percent Repaired       : 99.93543489030424%
Token                  : (invoke with -T/--tokens to see all 256 tokens)

The biggest table doesn't seem to have overly large partitions:

 nodetool cfhistograms main1 articles
main1/articles histograms
Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                              (micros)          (micros)           (bytes)
50%             1.00             29.52           1358.10              3311                12
75%             1.00             61.21           1955.67              4768                12
95%             1.00             88.15           4866.32              9887                14
98%             2.00             88.15           7007.51             14237                14
99%             2.00             88.15           8409.01             17084                24
Min             1.00             17.09             61.22               259                 8
Max             3.00             88.15          36157.19           5839588                35

Is there any other info I should provide? Oh, and the cluster is small, just two machines.

Chris Lohfink :

A 5 GB heap (-Xmx5120M) is below the minimum recommended heap size of 8 GB. While you can probably tune it to work, the database's defaults are not designed for a heap that small. Probably the fastest and most likely way to help is to set concurrent_reads and concurrent_writes to 8 or 4, then set concurrent_compactors and concurrent_validations (if you are on a more recent version) to 1.
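To check which of these settings a node currently has above the suggested values, something like the following rough Python sketch could be used. It only looks at plain `key: number` lines in the text of cassandra.yaml. The helper names (`read_settings`, `report`) and the sample excerpt are made up for illustration; the excerpt is not the poster's actual configuration, and 8 is simply the higher of the two values suggested for reads and writes.

```python
import re

# Values suggested in the answer above (8 chosen from "8 or 4" for reads/writes).
SUGGESTED = {
    "concurrent_reads": 8,
    "concurrent_writes": 8,
    "concurrent_compactors": 1,
    "concurrent_validations": 1,
}

def read_settings(yaml_text):
    """Return {key: int} for the keys of interest that are set and not commented out."""
    found = {}
    for line in yaml_text.splitlines():
        match = re.match(r"^(\w+):\s*(\d+)\s*(?:#.*)?$", line)
        if match and match.group(1) in SUGGESTED:
            found[match.group(1)] = int(match.group(2))
    return found

def report(yaml_text):
    """List (key, current, suggested) for settings that exceed the suggested value."""
    current = read_settings(yaml_text)
    return [(key, current[key], want)
            for key, want in SUGGESTED.items()
            if key in current and current[key] > want]

# Hypothetical excerpt for illustration only.
SAMPLE = """\
concurrent_reads: 32
concurrent_writes: 32
#concurrent_compactors: 1
"""

if __name__ == "__main__":
    for key, cur, want in report(SAMPLE):
        print(f"{key}: {cur} -> consider {want}")
```

Keys that are commented out or missing are skipped, so the effective default for those still has to be checked against the documentation for the installed version.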

If you set -XX:+HeapDumpOnOutOfMemoryError and -XX:HeapDumpPath=/somewhere/writable, then on the OOM the JVM will dump the heap, and you can open the dump and see what the cause is. If you get one, you can post a heap histogram (using jmap on the dump) and people may be able to give more direction, but it may require actually poking around with a tool like YourKit or MAT to know the exact cause.
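Before posting a histogram, it can help to boil it down to the biggest consumers. The Python sketch below assumes text in the layout printed by `jmap -histo` (rank, instance count, bytes, class name) and returns the top classes by bytes with their share of the total. The function name `top_classes` and the sample numbers are invented for illustration and are not taken from the poster's node.

```python
import re

# One data row of histogram output: "rank:  #instances  #bytes  class name".
ROW = re.compile(r"^\s*\d+:\s+(\d+)\s+(\d+)\s+(\S+)")

def top_classes(histo_text, n=10):
    """Return [(class_name, bytes, share_of_total)] for the n largest classes."""
    rows = []
    for line in histo_text.splitlines():
        match = ROW.match(line)
        if match:  # header, separator and "Total" lines do not match
            rows.append((match.group(3), int(match.group(2))))
    total = sum(size for _, size in rows)
    rows.sort(key=lambda row: row[1], reverse=True)
    return [(name, size, size / total) for name, size in rows[:n]]

# Hypothetical histogram for illustration only.
SAMPLE = """\
 num     #instances         #bytes  class name
----------------------------------------------
   1:        100000       52428800  [B
   2:        200000        9600000  java.nio.HeapByteBuffer
   3:         50000        1200000  java.lang.String
Total        350000       63228800
"""

if __name__ == "__main__":
    for name, size, share in top_classes(SAMPLE, n=2):
        print(f"{name:30s} {size / 2**20:8.1f} MiB  {share:6.1%}")
```

A class such as `[B` (byte arrays) dominating the list only says where the bytes are, not who holds them; finding the owner is where a tool like MAT or YourKit comes in.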

If that's a dead end, you can use async-profiler

sudo -u [cassandra user] profiler.sh -e alloc -d 60 [cassandra pid]

or sjk

sudo -u [cassandra user] java -jar sjk.jar hh -p [cassandra pid] --dead-young

to get an idea of what objects are getting generated over time.
