

Showing posts from 2012

Improving BDB JE Storage for Voldemort

I am writing this blog post mainly to share my experiences with improving the BDB storage engine for use in Voldemort, and also to throw light on how this relates to our VaaS goal. As you drill into the details, I hope you get a clear idea of the effort that has gone into making BDB JE work. I have tried to include pointers wherever possible. Note: all of this is written in the context of SSDs. Hence, you will often find me ignoring IOPS completely and focusing on memory.

BDB GC Tuning on SSD

The post on the engineering blog goes into great detail about the GC issues we faced when migrating to SSDs. The article mentions pushing data off the heap to play nicely with GC. This is done by setting the EVICT_LN cache mode for online traffic and the EVICT_BIN cache mode for cursors. However, to achieve this, we need a version of JE higher than 4.0.117. We first evaluated BDB 5, which requires a non-backwards-compatible conversion of existing data. But we hit issues in the

Memory allocation speed check

Traditionally, in high-performance systems, repeatedly allocating and deallocating memory (i.e., a malloc/free cycle) has been found to be costly. Hence, people resorted to building their own memory pools on top of the OS, dealing with fragmentation, free-list maintenance, etc. One of the popular techniques for doing this is the slab allocator. This post is a reality check on the cost of explicitly doing an alloc() and free() cycle, given that most popular OSes, specifically Linux, have gotten better at memory allocation recently. Along the way, I will also compare JVM memory allocations (which should be faster, since we pay a premium for freeing memory via garbage collection). So, all set, here we go. The following is a comparison of native C allocations, Java JVM-based allocation, and Java direct buffer allocation. For each of them we measure the following. Allocation/free rate (rate/s): This gives you an upper bound on the single threaded throughput of your
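To make the native side of this comparison concrete, here is a minimal sketch of how a single-threaded malloc/free rate could be measured in C. It is not the benchmark used in the post: the function name alloc_free_rate, the use of clock() for timing, and the volatile write that keeps the compiler from eliding the allocation are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper (not from the original post): runs `iterations`
 * malloc/free cycles of `block_size` bytes on one thread and returns the
 * observed cycles per second of CPU time, or 0.0 if the run was too short
 * for clock() to measure or an allocation failed. */
double alloc_free_rate(size_t block_size, long iterations) {
    assert(block_size > 0 && iterations > 0);

    clock_t start = clock();
    for (long i = 0; i < iterations; i++) {
        /* volatile keeps the compiler from optimizing the pair away */
        volatile char *p = malloc(block_size);
        if (p == NULL) {
            return 0.0;          /* out of memory: report no measurement */
        }
        p[0] = (char)i;          /* touch the block so it is really used */
        free((void *)p);
    }
    clock_t end = clock();

    double seconds = (double)(end - start) / CLOCKS_PER_SEC;
    return seconds > 0.0 ? (double)iterations / seconds : 0.0;
}
```

Calling alloc_free_rate(64, 10000000) from main and printing the result gives a rough rate for 64-byte blocks. The number depends heavily on the allocator, the block size, and the machine, so treat it as a sanity check rather than a benchmark.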

C - Ritchie & Kernighan - Part 3

Chapter 7 - Input and Output

'prog <infile' causes prog to read characters from infile instead of from the keyboard.
'otherprog | prog' runs the two programs otherprog and prog, and pipes the standard output of otherprog into the standard input of prog.
'prog >outfile' writes the standard output of prog to outfile instead of the terminal.
These redirections work unchanged for programs that use getchar and putchar.

In a printf format string, each conversion specification begins with a % and ends with a conversion character. Between the % and the conversion character there may be, in order:
A minus sign, which specifies left adjustment of the converted argument.
A number that specifies the minimum field width. The converted argument will be printed in a field at least this wide. If necessary it will be padded on the left (or right, if left adjustment is called for) to make up the field width.
A period, which separates the field width from the precision.
A number, the precision, that specifies the maximu
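The minus sign, field width, and precision described above can be sketched with a short example. The helper names format_left and format_right are hypothetical, and snprintf is used instead of printf only so that the result can be inspected as a string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>   /* only needed if the caller compares results with strcmp */

/* Hypothetical helper: left-adjusted (minus sign), minimum field width 15,
 * precision 10 (at most 10 characters of s are printed). */
int format_left(char *out, size_t n, const char *s) {
    assert(out != NULL && n > 0 && s != NULL);
    return snprintf(out, n, ":%-15.10s:", s);
}

/* Hypothetical helper: same width and precision, but right-adjusted,
 * so the padding goes on the left. */
int format_right(char *out, size_t n, const char *s) {
    assert(out != NULL && n > 0 && s != NULL);
    return snprintf(out, n, ":%15.10s:", s);
}
```

With the argument "hello, world", format_left produces ":hello, wor     :" and format_right produces ":     hello, wor:". In both cases the precision cuts the string to 10 characters and the field width pads it to 15.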

C - Ritchie & Kernighan Part 2

Chapter 3 - Control Flow

Because the else part of an if-else is optional, there is an ambiguity when an else is omitted from a nested if sequence. This is resolved by associating the else with the closest previous else-less if.
The case labeled default is executed if none of the other cases are satisfied. A default is optional; if it isn't there and if none of the cases match, no action at all takes place. Cases and the default clause can occur in any order.
Because cases serve just as labels, after the code for one case is done, execution falls through to the next unless you take explicit action to escape.
The standard library provides a more elaborate function, strtol, for conversion of strings to long integers.
A pair of expressions separated by a comma is evaluated left to right, and the type and value of the result are the type and value of the right operand.
A % B equals A when 0 <= A < B.
The continue statement applies only to loops, not to switch. A continue
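A few of the notes above (switch fall-through, the comma operator, and strtol) can be checked with a small sketch. The function names fall_through_count, comma_value, and parse_long are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Counts how many case bodies run when the switch is entered at n.
 * Cases 1 and 2 have no break, so execution falls through to the next. */
int fall_through_count(int n) {
    int steps = 0;
    switch (n) {
    case 1:
        steps++;
        /* fall through */
    case 2:
        steps++;
        /* fall through */
    case 3:
        steps++;
        break;          /* explicit action to escape the switch */
    default:
        steps = -1;     /* runs only when no other case matches */
        break;
    }
    return steps;
}

/* The comma operator evaluates left to right; the result is the value
 * of the right operand, here x + 3 after x has been set to 2. */
int comma_value(void) {
    int x;
    int y = (x = 2, x + 3);
    return y;
}

/* strtol skips leading white space, accepts an optional sign, and stops
 * at the first character that is not part of a base-10 number. */
long parse_long(const char *s) {
    assert(s != NULL);
    return strtol(s, NULL, 10);
}
```

Entering the switch at 1 runs all three case bodies, entering at 3 runs only the last one, and any other value reaches default. comma_value() returns 5, and parse_long("  -42abc") returns -42.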