18 In Memory Database 2015

In-memory databases perform most of their operations on data held in computer memory. Many products support hybrid modes, where some data is stored in memory and other data on disk. For persistence, data is usually written back to disk at some point, although this too is becoming faster as SSDs are increasingly used. Performance is the big advantage of these databases, and as memory becomes cheaper and servers address a terabyte or more of memory, in-memory databases will become more widely used. SAP, IBM and Oracle all have in-memory databases and have not been explicitly listed below – they will mainly be of interest to larger organisations with investments in the respective technologies.

Aerospike, an open source NoSQL database, is a clustered and distributed database optimized for reliable, high performance in-memory use. Its primary key focus delivers scalable and robust clustering with no single point of failure. Rich APIs include memcached’s CAS interfaces, schemaless types, in-database operations such as increment, synchronous secondary indexes, efficient very large lists and maps, and expressive APIs for languages such as Node.js, Java, Scala, PHP, C, C#, Go, and more. By also supporting analytics queries, real-time MapReduce, and in-database computation with user-defined functions, Aerospike provides rich database primitives.

Its use of Flash and SSD technology brings Redis-like in-memory performance to persistent, large-scale datasets, while also supporting RAM datasets. Cloud integration provides Aerospike performance and robustness in high scale environments like Amazon EC2, Google Compute, Internap, GoGrid, and others.
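The check-and-set (CAS) idea behind the memcached-style interface mentioned above can be sketched in a few lines of Python. This is a toy, single-process model and not the Aerospike or memcached client API; the class CasStore and its methods set, gets and cas are names invented here for illustration.

```python
import threading


class CasStore:
    """Toy in-memory key-value store with check-and-set semantics (hypothetical)."""

    def __init__(self):
        self._data = {}                 # key -> (value, version)
        self._lock = threading.Lock()

    def set(self, key, value):
        """Unconditionally store a value and bump its version."""
        with self._lock:
            _, version = self._data.get(key, (None, 0))
            self._data[key] = (value, version + 1)

    def gets(self, key):
        """Return (value, version); the version acts as the CAS token."""
        with self._lock:
            return self._data[key]

    def cas(self, key, new_value, token):
        """Store new_value only if nobody has written since `token` was read."""
        with self._lock:
            _, version = self._data[key]
            if version != token:
                return False            # stale token: another write got in first
            self._data[key] = (new_value, version + 1)
            return True


store = CasStore()
store.set("counter", 10)
value, token = store.gets("counter")
first = store.cas("counter", value + 1, token)    # token is current, so this succeeds
second = store.cas("counter", value + 5, token)   # token is now stale, so this fails
```

A caller that sees a failed cas would normally re-read the value and retry, which gives optimistic concurrency without holding a lock across the read and the write.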

Altibase provides two versions of its database technology – Altibase HDB (hybrid database) and Altibase XDB (Extreme In-Memory Database). Altibase HDB combines in-memory with on-disk processing and can operate exclusively in either mode, or as a hybrid. It supports stored procedures, triggers, record level locking, multi-version concurrency control and a variety of database connectivity standards (ODBC, JDBC, .NET Provider, OLE DB, CLI). It is compatible with all standard interfaces and programming languages (Java, C/C++, Precompiler, PHP, Perl, etc.), and comes with an abundance of tools and utilities for ease of use and enhanced productivity. Altibase XDB has been clocked at 1.5 million TPS and operates exclusively in-memory. It supports Direct Attach mode for extreme speed and Direct Call Interface for the highest speeds. A broad range of SQL standards is supported, and it is compatible with the same standard interfaces and programming languages as HDB. As with the HDB version it supports triggers, stored procedures and record level locking, and is ACID compliant.

EXASolution, an in-memory columnar database, has repeatedly set performance records in the TPC-H benchmarks. It comes with a SQL interface, and supports SQL-based BI and data integration tools through ODBC, JDBC, MDX, and ADO.NET. There are many integration mechanisms, including support for programming languages (Java, R, Python, Lua), NoSQL technologies such as Hadoop, OLAP Cube technologies using MDX, SAP CRM integration, geospatial functions and a “Skyline” analytic function for multi-criterion decision making.
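A skyline query, in general terms, returns the rows that are not beaten on every criterion by some other row (the Pareto-optimal set). The Python sketch below illustrates that general idea only; it is not EXASOL syntax or its implementation, and the hotel data and the function names dominates and skyline are made up for the example. Lower values are assumed to be better for every criterion.

```python
def dominates(a, b):
    """True if a is at least as good as b on every criterion and strictly
    better on at least one (lower is better)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def skyline(points):
    """Return the points that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical hotels as (price, distance_to_beach_km).
hotels = [(50, 8.0), (80, 2.0), (120, 0.5), (90, 3.0), (60, 9.0)]
best = skyline(hotels)
# (90, 3.0) is dominated by (80, 2.0) and (60, 9.0) by (50, 8.0),
# so three candidates remain for the decision maker to weigh up.
```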

The eXtremeDB In-Memory Database System is McObject’s core product: a fast embedded database management system, designed for performance, with a strict memory-based architecture and direct data manipulation. Storing and manipulating data in exactly the form used by the application removes the overheads associated with caching and translation. Typical read and write accesses take a few microseconds or less. The engine is reentrant, allowing for multiple execution threads, and transactions support the ACID properties, assuring transaction and data integrity. It comes in several flavours including:

  • eXtremeDB In-Memory Database System – for real-time applications with a small footprint. Ideal for embedding into equipment such as set-top boxes, telecom equipment etc.
  • eXtremeDB Fusion combines on-disk and in-memory data storage in a single embedded database system, so developers can optimize applications for speed and persistence, while adopting the most cost-effective and physical space-conserving data storage.
  • eXtremeDB Financial Edition meets the demanding performance and reliability needs of capital markets systems (algorithmic trading, risk management, etc.). Specialized features include columnar data layout, to accelerate management of time series data (market data); a library of vector-based statistical functions; and GUI-based performance monitoring provided as an application programming interface (API).
  • eXtremeDB Cluster is McObject’s distributed real-time database system. It manages databases across multiple hardware nodes, enabling two or more servers to share the workload. It dramatically increases available net processing power, reduces system expansion costs, and delivers a more scalable and reliable database solution.
  • eXtremeDB High Availability (HA) Edition – High Availability enables deployment of two or more synchronized embedded databases within separate hardware instances using communication channels implemented over standard or proprietary protocols.
  • eXtremeSQL offers a high performance implementation of the popular SQL database programming language, as well as ODBC and JDBC support, for the eXtremeDB in-memory database. eXtremeSQL provides broad coverage of the SQL-89 standard, plus eXtremeDB-specific extensions including support for nearly all eXtremeDB data and query types.

Other variants are available for logging, 64 bit hardware, and kernel mode operation.

Pivotal GemFire stores all operational data compressed and in-memory to avoid disk I/O time lags. Nodes operate in a cluster, optimizing data distribution and processing. Data is durable through in-grid replication and persistent write-optimized disk stores. Pivotal GemFire clusters provide automatic fail-over to other nodes in the cluster in case of failures and support user defined object models in complex graphs, as well as documents in JSON format. Data can be accessed through native clients for Java, C++ and C#, as well as from applications supporting REST calls. APIs implemented include Java:Hashmap, Spring Data GemFire and Memcached. It also allows users to query data using Object Query Language (OQL). Custom procedures written in Java are stored and executed in the relevant nodes, where the pertinent data is stored. A reliable asynchronous database event framework provides publish and subscribe capabilities and callback functions for custom processing, along with support for continuous query. To optimize performance and system resource utilization, Pivotal GemFire is built to automate many administrative tasks – including self-healing of clusters when nodes join and distribution of data across nodes. Tools include a cluster status dashboard, offline performance analysis and command line interfaces to support automation scripting.

GPUdb is an object store that allows users to define any number of Sets, and populate these Sets with objects. A Set is analogous to a traditional database table where each column (GPUdb Set Attribute) can be one of the following types: int, double, string, or byte sequence. GPUdb then allows the user to perform calculations against those Sets. GPUdb also includes a number of geospatial functions that can be performed on any Set. The primary way to interact with GPUdb is through its HTTP endpoints. Client devices interact with GPUdb by constructing appropriately formatted Avro binary objects, or appropriately formatted JSON objects and then HTTP POST them to the desired endpoint that maps to a calculation capability. Currently GPUdb supports NVIDIA GPUs and Xeon Phi many-core devices.

H2 is an in-memory Java SQL database with a fast JDBC API and small footprint. It operates in embedded and server modes with clustering support. Other features include disk or in-memory databases, 2-phase-commit, cost based optimisation, and strong encryption.

Microsoft SQL Server 2014 includes the Hekaton technology for in-memory processing. This has boosted the performance of SQL Server by orders of magnitude in many instances, and particularly for in-memory analytics. This makes SQL Server a good solution for many applications including transaction processing, data warehousing, and analytics. Databases can be hosted in-house or using Azure services.

OrigoDB is an in-memory database that caters for a wide variety of domain specific models. The generic models include relational, document, key/value, graph, Xml and others. Bespoke models can be created if needed and commands and queries are written in C# with runtime access to the entire Mono/.NET class library. Persistence is achieved using write-ahead command logging with optional full model snapshots. On startup, the in-memory model is restored from the most recent snapshot followed by command replay. Commands and queries are processed by the kernel, the component responsible for atomicity, isolation and durability.
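The persistence scheme described above (log each command before applying it, take occasional snapshots, and on startup load the latest snapshot and replay the remaining commands) can be sketched as follows. OrigoDB itself is written for .NET, so this Python version is only an illustration of the pattern and not its API; the names Journal, Engine, apply_command and the JSON file layout are all assumptions made for the example.

```python
import json
import os
import tempfile


class Journal:
    """Append-only command log, one JSON object per line (hypothetical format)."""

    def __init__(self, path):
        self.path = path

    def append(self, command):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(command) + "\n")
            f.flush()
            os.fsync(f.fileno())        # make the command durable before applying it

    def replay(self):
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                yield json.loads(line)

    def truncate(self):
        with open(self.path, "w", encoding="utf-8"):
            pass


def apply_command(model, command):
    """Apply one command to the in-memory model (a plain dict here)."""
    if command["op"] == "set":
        model[command["key"]] = command["value"]
    elif command["op"] == "delete":
        model.pop(command["key"], None)


class Engine:
    def __init__(self, directory):
        self.snapshot_path = os.path.join(directory, "snapshot.json")
        self.journal = Journal(os.path.join(directory, "journal.log"))
        self.model = {}
        # Restore: load the latest snapshot, then replay commands logged after it.
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path, encoding="utf-8") as f:
                self.model = json.load(f)
        for command in self.journal.replay():
            apply_command(self.model, command)

    def execute(self, command):
        self.journal.append(command)    # write ahead of the in-memory change
        apply_command(self.model, command)

    def snapshot(self):
        # Not crash-safe: a real system would write and swap the snapshot atomically.
        with open(self.snapshot_path, "w", encoding="utf-8") as f:
            json.dump(self.model, f)
        self.journal.truncate()


workdir = tempfile.mkdtemp()
db = Engine(workdir)
db.execute({"op": "set", "key": "a", "value": 1})
db.snapshot()
db.execute({"op": "set", "key": "b", "value": 2})
db.execute({"op": "delete", "key": "a"})
restored = Engine(workdir)              # simulates a restart from disk
```

Logging commands rather than data pages keeps writes small and sequential, at the cost of a replay whose length grows until the next snapshot is taken.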

Redis is an open source, BSD licensed, advanced key-value cache and store. It is often referred to as a data structure server, since keys can contain strings, hashes, lists, sets, sorted sets, bitmaps and hyperloglogs, and clients can run atomic operations on these types, such as appending to a string; incrementing the value in a hash; pushing an element to a list; computing set intersection, union and difference; or getting the member with the highest ranking in a sorted set. In order to achieve its high levels of performance, Redis works with an in-memory dataset.

SolidDB is an in-memory database that supports ACID transactions, ANSI SQL, stored procedures and triggers. It allows both in-memory and disk based tables in the same database instance. It can be embedded inside packaged solutions and requires no administration, separate installation or configuration.

Microsoft SQL Server Compact 4.0 is a free, embedded database that software developers can use for building ASP.NET websites and Windows desktop applications. SQL Server Compact 4.0 has a small footprint and supports private deployment of its binaries within the application folder, easy application development in Visual Studio and WebMatrix, and seamless migration of schema and data to SQL Server.

SQLite is an in-process library that implements a self-contained, serverless, zero-configuration, transactional SQL database engine. The code for SQLite is in the public domain and is thus free for use for any purpose, commercial or private. SQLite is the most widely deployed database in the world. It has an optional in-memory mode of operation.
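As a brief illustration of the in-memory mode, Python's standard sqlite3 module opens a private in-memory database when given the special name ":memory:". The table name and rows below are invented for the example; the database disappears when the connection is closed.

```python
import sqlite3

# ":memory:" asks SQLite for a database that lives only in RAM.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT PRIMARY KEY, licence TEXT)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("SQLite", "public domain"), ("Redis", "BSD")],
)
conn.commit()

rows = conn.execute("SELECT name FROM product ORDER BY name").fetchall()
conn.close()    # the in-memory database is discarded here
```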

Tarantool is a Lua application server integrated with a database management system. It has a “fiber” model, which means that many applications can run simultaneously on a single thread, while the Tarantool server can run multiple threads for input-output and background maintenance. It integrates the LuaJIT (“Just In Time”) Lua compiler, Lua libraries for most common applications, and the Tarantool Database Server, which is an established NoSQL DBMS. Thus it serves all the purposes that have made Node.js and Twisted popular in other environments, with the additional twist that it has a data persistence level. It has two data engines: a 100% in-memory engine with optional persistence, and a 2-level disk-based B-tree engine for use with large data sets.
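The general idea behind a fiber model, several tasks sharing one thread by voluntarily handing control back to a scheduler, can be shown with Python generators. This is only an analogy and not how Tarantool implements its fibers; the worker and run functions are hypothetical names used for the sketch.

```python
from collections import deque


def worker(name, steps, log):
    """A cooperative task: does one unit of work, then yields control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield                       # hand control back to the scheduler


def run(fibers):
    """Round-robin scheduler running every fiber on the calling thread."""
    queue = deque(fibers)
    while queue:
        fiber = queue.popleft()
        try:
            next(fiber)             # resume the fiber until its next yield
            queue.append(fiber)     # not finished: put it at the back of the queue
        except StopIteration:
            pass                    # finished: drop it


log = []
run([worker("a", 2, log), worker("b", 3, log)])
# log now interleaves the two workers: a:0, b:0, a:1, b:1, b:2
```

Because a task only loses control at an explicit yield, code between yields runs without interruption from other tasks on that thread, which is why this style needs far less locking than preemptive threading.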

Software AG is no stranger to very fast database technology, and its Adabas DBMS often outperforms many competitors. So its acquisition of Terracotta comes as no surprise. The BigMemory Max in-memory database is also a pacesetter, and accommodates a hybrid mode leveraging SSD and flash technologies in addition to DRAM. It supports SQL to query in-memory data, and provides access to BigMemory data from multiple client platforms (Java, .NET/C# and C++). WAN data replication allows data to be synchronised across regions and features fault tolerance and fast restarts. The Terracotta Management Console™ provides a customisable Web dashboard for advanced monitoring and administration of Terracotta deployments.

Oracle TimesTen In-Memory Database (TimesTen) is a full-featured, memory-optimized, relational database with persistence and recoverability. It provides applications with the instant responsiveness and very high throughput required by database-intensive applications. Deployed in the application tier, TimesTen operates on databases that fit entirely in physical memory (RAM). Applications access the TimesTen database using standard SQL interfaces. For customers with existing application data residing on the Oracle Database, TimesTen is deployed as an in-memory cache database with automatic data synchronization between TimesTen and the Oracle Database.

UnQLite is an in-process software library which implements a self-contained, serverless, zero-configuration, transactional NoSQL database engine. UnQLite is a document store database similar to MongoDB, Redis, CouchDB etc., as well as a standard Key/Value store similar to BerkeleyDB, LevelDB, etc.

UnQLite is an embedded NoSQL (Key/Value store and Document-store) database engine. Unlike most other NoSQL databases, UnQLite does not have a separate server process. UnQLite reads and writes directly to ordinary disk files. A complete database, with multiple collections, is contained in a single disk file. The database file format is cross-platform: you can freely copy a database between 32-bit and 64-bit systems or between big-endian and little-endian architectures. It operates as a disk based or in-memory database.

VoltDB offers the speed and scale of NoSQL databases but with the ACID guarantees, relational data models, and transactional capability of traditional RDBMSs. VoltDB’s in-memory architecture is designed for performance. It eliminates the significant overhead of multi-threading and locking responsible for the poor performance of traditional RDBMSs that rely on disks. Community and paid for editions are available.