An in-memory database is a database that relies primarily on main memory for data storage, in contrast to databases that keep their data on disks or SSDs. In-memory databases are designed to attain minimal response times by eliminating the need to access disks. Because all data is stored and managed exclusively in main memory, it is at risk of being lost if the process or server fails. In-memory databases can still persist data to disk by recording each operation in a log or by taking snapshots.
In-memory databases are ideal for applications that require microsecond response times and that can see large traffic spikes at any time, such as gaming leaderboards, session stores, and real-time analytics.
Use cases :-
Real-time bidding
Real-time bidding refers to the buying and selling of online ad impressions. Usually the bid has to be made while the user is loading a webpage, within 100-120 milliseconds and sometimes as little as 50 milliseconds. During this time period, real-time bidding applications request bids from all buyers for the ad spot, select a winning bid based on multiple criteria, display the winning ad, and collect post-display information. In-memory databases are ideal choices for ingesting, processing, and analyzing this real-time data with submillisecond latency.
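As a rough sketch of the selection step, here is a toy auction that picks a winner from bids already ingested into memory. The second-price rule, the `run_auction` name, and the `dsp_*` buyer names are illustrative assumptions, not part of any particular platform:

```python
def run_auction(bids, floor_price):
    """Pick a winner from in-memory bids (a dict of buyer -> price).

    Assumes a second-price rule for illustration: the winner pays the
    runner-up's bid, or the floor price if there is no runner-up.
    """
    valid = [(price, buyer) for buyer, price in bids.items() if price >= floor_price]
    if not valid:
        return None                              # no bid cleared the floor
    valid.sort(reverse=True)                     # highest bid first
    winning_price, winner = valid[0]
    clearing = valid[1][0] if len(valid) > 1 else floor_price
    return winner, clearing

# Hypothetical buyers bidding on one impression:
bids = {"dsp_a": 2.40, "dsp_b": 3.10, "dsp_c": 1.90}
print(run_auction(bids, floor_price=2.00))       # ('dsp_b', 2.4)
```

Because the bid table lives entirely in RAM, the selection itself costs microseconds; the tight 50-120 ms budget is spent on the network round-trips, not the lookup.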
Gaming leaderboards
A relative gaming leaderboard shows a gamer's position relative to other players of a similar rank. A relative leaderboard helps build engagement among players while keeping gamers from becoming demotivated by comparison with only the top players. For a game with millions of players, in-memory databases can deliver sorting results quickly and keep the leaderboard updated in real time.
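The relative view can be sketched with plain in-memory structures. This toy `Leaderboard` class is purely illustrative: it re-sorts on every query, whereas a real engine (Redis sorted sets, for example) maintains an incrementally updated sorted structure so ranks are available in logarithmic time:

```python
class Leaderboard:
    """Minimal relative-leaderboard sketch, kept entirely in memory."""

    def __init__(self):
        self.scores = {}                        # player -> best score

    def submit(self, player, score):
        # Keep only the player's best score.
        self.scores[player] = max(score, self.scores.get(player, 0))

    def around(self, player, window=1):
        # Sort descending by score, then slice a window around the player.
        ranked = sorted(self.scores, key=self.scores.get, reverse=True)
        i = ranked.index(player)
        lo, hi = max(0, i - window), i + window + 1
        return [(p, self.scores[p]) for p in ranked[lo:hi]]

board = Leaderboard()
for name, pts in [("ann", 90), ("bob", 70), ("cat", 80), ("dan", 60)]:
    board.submit(name, pts)
print(board.around("cat"))   # [('ann', 90), ('cat', 80), ('bob', 70)]
```

The slice around the player's own rank is exactly the "relative" view described above: each gamer sees neighbors of similar skill rather than an unreachable global top 10.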
Caching
A cache is a high-speed data storage layer that stores a subset of data, typically transient in nature, so that future requests for that data are served faster than is possible by accessing the data's primary storage location. Caching allows you to efficiently reuse previously retrieved or computed data. The data in a cache is generally stored in fast-access hardware such as RAM (random-access memory) and may also be used together with a software component. A cache's primary purpose is to increase data retrieval performance by reducing the need to access the underlying, slower storage layer.
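A minimal in-memory cache with per-entry expiry can be sketched in a few lines. The `TTLCache` name and API are made up for illustration; real caches add eviction policies (LRU, LFU) and size limits on top of this:

```python
import time

class TTLCache:
    """Toy in-memory cache: each entry expires after a fixed TTL."""

    def __init__(self, ttl_seconds=60.0):
        self.ttl = ttl_seconds
        self.store = {}                      # key -> (value, expires_at)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None                      # miss: caller falls back to the slow store
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self.store[key]              # stale: evict and report a miss
            return None
        return value

    def put(self, key, value):
        self.store[key] = (value, time.monotonic() + self.ttl)

cache = TTLCache(ttl_seconds=0.05)
cache.put("user:42", {"name": "Ada"})
print(cache.get("user:42"))                  # fast hit straight from RAM
time.sleep(0.06)
print(cache.get("user:42"))                  # expired, so None: re-fetch from primary store
```

On a miss the application reads from the primary database and calls `put`, so hot keys are served from memory and the slow storage layer is touched only when data is cold or stale.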
What happens to the data when a machine running an in-memory database reboots or crashes? With just an in-memory database, there is no way out: the machine is down, and the data is lost.
Is it possible to combine the power of in-memory data storage with the durability of good old databases like MySQL or Postgres? Sure! Would it affect the performance? Surprisingly, it won't!
You may ask: how can in-memory storage be persistent? The trick here is that you still keep everything in memory, but you additionally persist each operation on disk in a transaction log. Look at the picture:
The first thing you may notice is that even though your fast and nice in-memory database now has persistence, queries don't slow down, because they still hit only main memory, just as they did before. Good news! :-) But what about updates? Each update (or let's call it a transaction) should not only be applied to memory but also persisted on disk. A slow disk. Is that a problem? Not really: transactions are only ever appended to the end of the log, and a sequential append is the fastest operation a disk can perform, so even a slow disk keeps up. Let's look at the picture:
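The transaction-log idea can be sketched in a few lines: keep a dict in memory, append every write to a log file before applying it, and replay the log on startup. The JSON-lines layout and the `DurableKV` name are assumptions made for this sketch, not any real engine's format:

```python
import json, os, tempfile

class DurableKV:
    """Toy in-memory key-value store with an append-only transaction log."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        if os.path.exists(log_path):         # recovery: replay the log into memory
            with open(log_path) as f:
                for line in f:
                    op = json.loads(line)
                    self.data[op["k"]] = op["v"]
        self.log = open(log_path, "a")

    def set(self, key, value):
        # Persist first: a sequential append, the cheapest kind of disk write.
        self.log.write(json.dumps({"k": key, "v": value}) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())          # force the bytes onto the disk
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)            # reads never touch the disk

    def close(self):
        self.log.close()

path = os.path.join(tempfile.mkdtemp(), "ops.log")
db = DurableKV(path)
db.set("answer", 42)
db.close()                                   # simulate the process dying

db2 = DurableKV(path)                        # restart: state rebuilt from the log
print(db2.get("answer"))                     # 42
```

A real engine would also take periodic snapshots so the log can be truncated and recovery stays fast, as mentioned at the start of this article.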
To summarize all that was said above about disks and in-memory databases:
- In-memory databases don’t use disk for non-change operations.
- In-memory databases do use disk for data change operations, but they use it in the fastest possible way.
Why wouldn't regular disk-based databases adopt the same techniques? Well, first, unlike in-memory databases, they need to read data from disk on each query (let's forget about caching for a minute; that is a topic for another article).
You never know what the next query will be, so queries generate a random-access workload on the disk, which, remember, is the worst-case scenario for disk usage. Second, disk-based databases need to persist changes in such a way that the changed data can be read back immediately, whereas in-memory databases usually don't read from disk at all, except for recovery at startup. So disk-based databases require specific data structures that avoid a full scan of the transaction log when reading the data set. One such data structure is the B/B+ tree. The flip side of this data structure is that a B/B+ tree must be modified on every change operation, which can itself generate a random write workload on the disk. While improving performance for read operations, B/B+ trees degrade it for write operations. There are a handful of database engines based on B/B+ trees, for example MySQL's InnoDB and the Postgres storage engine. There is also another data structure that is somewhat better for write workloads: the LSM tree. This modern data structure doesn't solve the problem of random reads, but it partially solves the problem of random writes. Examples of LSM-based engines are RocksDB, LevelDB, and Vinyl.
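The LSM idea can be sketched as an in-memory "memtable" that is flushed as an immutable sorted run when it fills up; flushing a whole sorted run is one sequential write, which is why LSM trees handle write workloads well. This toy deliberately omits the compaction, indexes, and bloom filters that real engines such as RocksDB rely on:

```python
class TinyLSM:
    """Toy LSM sketch: buffer writes in a memtable, flush sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.runs = []                        # immutable sorted runs, newest last
        self.limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.limit:
            # On a real disk this flush is one sequential write.
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:              # freshest data wins
            return self.memtable[key]
        for run in reversed(self.runs):       # then newest run to oldest
            for k, v in run:
                if k == key:
                    return v
        return None

db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)                  # memtable full: flushed to a sorted run
db.put("a", 3)                  # newer value shadows the flushed one
print(db.get("a"), db.get("b")) # 3 2
```

Note how a read may have to consult several runs: that is the random-read cost the text mentions, which real LSM engines mitigate with per-run indexes and bloom filters, and with background compaction that merges runs.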
References Used :-
- What Is an In-Memory Database?
- What an in-memory database is and how it persists data efficiently
Version :- 1.0.0