
Making sense out of BDB JE fast stats


Refer to EnvironmentStats for a rundown of all the information that gets dumped. StatsConfig has a setClear() method, which lets you get 'deltas' of values since the last call instead of absolute values. I am going to cover the fast stat metrics that relate to analysing issues in your "fast path".
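If you end up collecting absolute values rather than relying on setClear(), the same 'deltas' can be computed by hand. Here is a minimal sketch, assuming you have already copied the counters you care about into a map keyed by stat name. StatDeltas and update() are hypothetical names of mine, and nothing here calls the JE API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of JE: turns absolute counter
// snapshots into per-interval deltas.
class StatDeltas {
    private final Map<String, Long> previous = new HashMap<>();

    // Returns (current - previous) for each counter, then remembers
    // the current snapshot. A counter seen for the first time is
    // reported as its full value.
    Map<String, Long> update(Map<String, Long> current) {
        Map<String, Long> deltas = new HashMap<>();
        for (Map.Entry<String, Long> e : current.entrySet()) {
            long before = previous.getOrDefault(e.getKey(), 0L);
            deltas.put(e.getKey(), e.getValue() - before);
        }
        previous.clear();
        previous.putAll(current);
        return deltas;
    }
}
```

Call update() once per sampling interval with the latest snapshot; the returned map is what you would chart.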


Cache Usage & IO:


cacheTotalBytes/sharedCacheTotalBytes - gives you how much of your shared BDB cache each of your environments uses. If you have an env that has higher latency due to disk, try tweaking the cache size.

nSequentialWriteBytes/nRandomWriteBytes & nSequentialReadBytes/nRandomReadBytes are very helpful in determining how much IO your app is actually generating. Compare this to sar/iostat output to see how much headroom you have. This is especially useful for SSDs, which have more headroom than spinning disks, where you run out of IOPS pretty soon and it is all cache sizing after that.
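To line these counters up against iostat, it helps to turn them into a rate and a random-vs-sequential split. A rough sketch, assuming the byte counts are deltas over a known sampling interval; IoProfile and its methods are hypothetical names of mine, not anything from JE.

```java
// Hypothetical helper, not part of JE: summarises the IO byte
// counters for one sampling interval.
class IoProfile {
    // Total bytes per second moved during the interval.
    static double bytesPerSecond(long seqBytes, long randBytes, long intervalMillis) {
        if (intervalMillis <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        return (seqBytes + randBytes) * 1000.0 / intervalMillis;
    }

    // Fraction of the bytes that were random IO; 0.0 when there was no IO.
    static double randomFraction(long seqBytes, long randBytes) {
        long total = seqBytes + randBytes;
        return total == 0 ? 0.0 : (double) randBytes / total;
    }
}
```

Run it once for the write pair and once for the read pair. A high random fraction is the number to worry about on spinning disks.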


Contention:

nBINsFetchMiss/nBINsFetch - gives you an estimate of the proportion of fetches where the requested BIN node was not found in memory. Higher values are bad, since you will incur contention due to latching of the parent IN node of the BIN while the BIN is fetched into memory. Keep an eye on nAcquiresNoWaiters and nAcquiresWithContention, although btreeRelatchesRequired seems to be more directly related to the problem I described.


Increase the cache size accordingly to keep the nBINsFetchMiss/nBINsFetch ratio down to 10% or so.
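The ratio check itself is trivial, but worth writing down so it can be wired into an alert. A minimal sketch, assuming you pass in the two counters for the same sampling interval; BinFetchMiss and its method names are hypothetical, and the 10% threshold is just the rule of thumb from above.

```java
// Hypothetical helper, not part of JE: computes the BIN fetch miss ratio.
class BinFetchMiss {
    // Misses divided by total fetches; 0.0 when no BINs were fetched.
    static double missRatio(long nBINsFetchMiss, long nBINsFetch) {
        return nBINsFetch == 0 ? 0.0 : (double) nBINsFetchMiss / nBINsFetch;
    }

    // True when the miss ratio is above the given threshold (e.g. 0.10).
    static boolean needsMoreCache(long nBINsFetchMiss, long nBINsFetch, double threshold) {
        return missRatio(nBINsFetchMiss, nBINsFetch) > threshold;
    }
}
```

For example, 250 misses out of 1000 fetches is a ratio of 0.25, which is above 0.10 and suggests growing the cache.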


Cache eviction:
nEvictPasses indicates evictor activity. From the code, every operation does CRITICAL eviction, i.e. it releases just enough to maintain the budget. requiredEvictBytes will tell you how much it has to evict each time.

nNodesEvicted is usually proportional to nCacheMiss and directly affects GC. Make sure your garbage collection throughput can keep up with the eviction rate.


If nRootNodesEvicted has some reasonable value, then your cache is seriously small. Similarly, keep an eye on nBINsEvicted[Critical] and nUpperINsEvicted[Critical]; if their proportion is high w.r.t. nNodesEvicted, once again you have a small cache.
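These two checks can be sketched as follows. Everything here is an assumption for illustration: EvictionCheck and its methods are hypothetical names, and both thresholds are parameters you have to pick yourself, since I do not have hard numbers for what counts as "reasonable" or "high".

```java
// Hypothetical helper, not part of JE: flags eviction patterns that
// suggest an undersized cache.
class EvictionCheck {
    // Share of evicted nodes that were index nodes (BINs plus upper INs);
    // 0.0 when nothing was evicted.
    static double indexNodeShare(long nBINsEvicted, long nUpperINsEvicted, long nNodesEvicted) {
        if (nNodesEvicted == 0) {
            return 0.0;
        }
        return (double) (nBINsEvicted + nUpperINsEvicted) / nNodesEvicted;
    }

    // True when root evictions reach rootThreshold, or when the index
    // node share exceeds shareThreshold. Both thresholds are caller-chosen.
    static boolean cacheLooksTooSmall(long nRootNodesEvicted, long rootThreshold,
                                      double indexNodeShare, double shareThreshold) {
        return nRootNodesEvicted >= rootThreshold || indexNodeShare > shareThreshold;
    }
}
```

Feed it deltas for one interval rather than lifetime totals, otherwise an old burst of evictions will keep tripping the check.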

Use CacheMode.EVICT_LN (as mentioned in the JE FAQ) to leverage the OS page cache for caching your data, while you use the JVM heap for index nodes alone. This should give the best average-case performance.

Cleaner activity:

The stats listed below are useful in understanding the cleaner behaviour. If you overlay them with the cache misses and the application latencies in a chart, you can see how much impact they have. From the code, it seems like the cleaner will contend with application threads for cache eviction. Hence, watch for contention during high cleaner activity.

cleanerBackLog
fileDeletionBacklog
nCleanerDeletions
nCleanerEntriesRead
nCleanerRuns


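Tuning them is basically a trade-off that depends on how much disk you have: cleaning more aggressively keeps disk usage lower at the cost of more cleaner activity. I am not going to get into the details of this now; I will come back to it as I learn more.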


More to come, as I understand which correlations make sense in practice.
