Today we’re going to talk about some performance considerations about web applications and the advantages of having a memory cache.
As noted in our previous Memcached post, for most applications, some data can be cached in a local cache, so you don’t have to query it every time a user enters your site, as the exchange rates example in my previous post.
For the exchange rates example, you could choose to store the values locally in several formats, you may choose to store the data in a temporary file in the local file system, a database, a memory cache, or whatever comes to your mind, so you don’t need to query the external service again for the rest of the day.
Sometimes, if you already have a DB running your application uses that, you may want to reuse that to store the temporal data, so you don’t need to install any additional applications, for example, memcached. As a developer, it may sometimes become a hard task to convince the sysadmin to install new software in the environment.
In this post, we are going to compare the performance of using a local DB as a cache (Postgre SQL) and a memory cache (Memcached). You may use these comparisons as an argument to move to a memory cache (I mention memcached, but there are more, some I mentioned in my previous post, but you can find more using your preferred search engine ? ).
For the test environment, we installed a local Postgre SQL sever and a Memcached server, both were running simultaneously during the tests.
For every test, we inserted a total of 10000 keys and then read them sequentially. After that, we compared the inserts/second and reads/second and made a graph for each test case.
The variables considered were: length of the key (10 to 250 characters), size of value object (10KB to 1MB), and the number of threads (1 to 40). This means the horizontal axis represents the variable, and the vertical axis shows the insertions or reads per second, depending on the case.
You can find the source code of the tests here. We didn’t develop the source to share it afterward, so it may be incomplete (the DB cache only has the put and get methods implemented, not the delete or flush) and a bit hard to read if you don’t speak Spanish. If you want to read it and have questions, ask in the comments.
In this test, the variable is the length of the key. The object size is fixed to 6kB. According to memcached, the limit length is 256 characters, so I wondered how this limit affects performance. You can see that when adding keys longer than 200 characters, the number of insertions/sec decreases significantly for memcached (red line).
The same tests running against the DB (blue line) show almost an endless number of insertions and reads per second, but far less than memcached. The peak insertions/second in the case of Memcached is 14700, whereas for the DB it is 350.
The reads per second for memcached is more than 3 times the amount of the DB.
In this case, the variable is the object size, from 10kB to 1MB. The key length is fixed to 100 characters.
It is very clear how the performance drops in both cases as the size of the inserted object increases. The biggest size inserted is 1MB, since this is the limit for Memcached.
This last case fixes the length of the key to 100 characters and the size to 6kB. The variable is the number of threads used to read the keys. Every one of these threads read all the 10000 keys stored previously in the cache. Only the read part is essential since the objects are inserted only once into the cache.
The reads/second limit for the DB is reached very fast. This is, using more than 4 threads, the reads/sec remain constant between 1500 and 2000.
For memcached, this limit is not reached, and adding more threads increases the number of reads/sec. We didn’t try with more than 40 threads, but a single memcached server can be shared 2 or more applications from a performance point of view.
In every case, according to the expected result, the memory cache always performs better than the DB, no matter the variable.
Even though memcached loses performance when using big objects or uses it in a normal range (e.g., objects smaller than 500kB), it will perform very nice, and more importantly, the memcached server can scale very well and also very easy. If the performance isn’t good enough, you can always easily add another memcached server in any server with some spare memory, and that’s it, the number of reads and writes per second will increase.
Please note, this comparison is not a formal benchmark between these caches because the tests add some overhead. It is developed in Java, and the Java VM has some time and memory overhead. A more formal benchmark would have to use a small footprint language for the client, for example, C or Lua. We used Java because my main point wasn’t to get an exact number of inserts a cache can provide, but to get an approximate improvement percentage when using a memory cache, instead of a DB.
We hope you enjoyed reading the post as much as we enjoyed writing it!
Please let us know if you have any comments or suggestions. We would also like you to share experiences using other memory caches; we are open to alternatives ? .