Embedded databases explained - Open Source by greenrobot

The exploding number of connected devices and growing data volumes, accelerated by the COVID-19 pandemic, are driving a shift towards decentralized Edge Computing, and with it the need for embedded database management systems continues to grow rapidly. Analysts expect the Embedded Database market to grow by 60% annually (CAGR) from 2022 to 2029.

What is an Embedded Database?

What is a database vs. a DBMS?

A “database” is an organized collection of (structured or unstructured) data, typically stored electronically in a computer system. The most common database operations are Create, Read, Update, Delete (CRUD). “Database Management System” (or DBMS) refers to the piece of software for storing and managing that data. However, the term “database” is often used loosely to refer to a DBMS, and you will find that most DBMSs use only the term “database” in their name and communication.

What does “embedded” mean in the database world?

The term “embedded” can mean two different things when used in the context of databases:

“Database for embedded systems” is a database specifically designed to be used in embedded systems. Embedded systems consist of a deeply integrated hardware / software combination, e.g. electronic control units (ECUs) or IoT devices. A database for such systems must
- have a small footprint,
- be optimised to run on highly restricted hardware, and
- be thrifty with resources, e.g. CPU, memory, and battery.
“Embedded database” means that the database is deeply integrated in the software / application. It is also referred to as an “embeddable database”, “embedded database management system” or “embedded DBMS”.
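
To make the second meaning concrete, here is a minimal sketch of a database that lives inside the application process. It uses SQLite through Python's standard `sqlite3` module purely as an illustration; the `readings` table and the `crud_demo` function are made-up names, not something taken from a particular product.

```python
import sqlite3


def crud_demo():
    """Run one Create, Read, Update, Delete cycle on an in-process database."""
    # The database engine runs inside this process: there is no server to
    # start and no network hop. ":memory:" keeps the demo self-contained;
    # a file path would persist the data on the device instead.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings (id INTEGER PRIMARY KEY, sensor TEXT, value REAL)"
    )

    # Create
    cur = conn.execute(
        "INSERT INTO readings (sensor, value) VALUES (?, ?)", ("temp", 21.5)
    )
    reading_id = cur.lastrowid

    # Read
    row = conn.execute(
        "SELECT sensor, value FROM readings WHERE id = ?", (reading_id,)
    ).fetchone()

    # Update
    conn.execute("UPDATE readings SET value = ? WHERE id = ?", (22.0, reading_id))
    updated = conn.execute(
        "SELECT value FROM readings WHERE id = ?", (reading_id,)
    ).fetchone()[0]

    # Delete
    conn.execute("DELETE FROM readings WHERE id = ?", (reading_id,))
    remaining = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]

    conn.commit()
    conn.close()
    return row, updated, remaining
```

The point of the sketch is the deployment model rather than the SQL: the application links the database as a library and calls it directly.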

A lot of confusion arises from these terms being used interchangeably: not all embedded databases are suitable for embedded systems, and databases used in embedded systems don’t need to be embedded in the software. On top of that, the phrase “embedded database” is often used for either of the two meanings. Here, we will only use the two terms as defined above.

Embedded Database vs Embedded System

What is an embedded system?

Embedded systems / embedded devices are everywhere nowadays. They are used in most industries, ranging from healthcare to manufacturing to automotive to games. Recall that an embedded system is, in essence, a (typically small) piece of hardware with integrated software. Typically, embedded systems are highly restricted (CPU, power, memory, …) and connected (WiFi, Bluetooth, ZigBee, …) devices. They also typically form part of a larger system, and each individual embedded system serves a small number of specific functions within it. All in all, embedded systems often form a complex distributed system. Examples of embedded systems: smartphones, controlling units, micro-controllers, cameras, smart watches, home appliances, ATMs, robots, sensors, medical devices, and many more.

Figure: Embedded systems can be found across verticals, from household items to aviation.

Embedded Database vs Database for Embedded Systems

A core constraint of embedded systems is often their limited computational power, so efficiency and a small footprint become vital for a DBMS. This gave rise to a new market of databases made specifically for embedded systems. Because they are lightweight and performant, embedded databases might work well in embedded systems. However, not all embedded databases are suitable for embedded devices. Features like local data storage and efficient synchronisation with the backend play a huge role in determining which databases work best in embedded systems.

A database that is both embedded in the application and works well in embedded systems is called an Edge Database. To clarify, an Edge Database is an embedded database optimised for resource efficiency on restricted, decentralised devices (typically embedded devices). Mobile databases, for example, are a type of Edge Database that supports mobile operating systems like Android and iOS.

New Edge Databases address the challenge of the rapidly growing volume of data produced by the growing number of embedded devices, in the professional / industrial world as well as the consumer world. Edge Databases help access decentralized edge data, making that data useful and creating value.

Why use an embedded database in an embedded system?

First of all, local data storage enabled by embedded databases is a big advantage for typical embedded systems, which often face connectivity issues (due to the device, use case, data protection needs, location, costs, …) or operate in real-time scenarios (e.g. on the factory floor, or in the car). With limited connectivity or real-time requirements, these systems often cannot rely on retrieving data from the cloud. Instead, a smart solution is to store data locally on the device and sync it with other parts of the system only when needed.

Embedded systems often deal with large amounts of data, while also having an unreliable or non-permanent connection. This can be imposed by the limitations of the system or done deliberately to save battery life. Thus, a suitable synchronisation solution should not only sync data every time there is a connection, but also do it efficiently. For example, differential sync works well: by only sending the changes to the server, it will help to avoid unnecessary energy use and also save network costs.
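
The idea behind differential sync can be sketched in a few lines. This is a toy model under stated assumptions, not how any particular sync product works: `LocalStore`, `sync`, and the plain dictionary standing in for the server are all hypothetical names, and conflict handling is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class LocalStore:
    """Toy on-device store that tracks what changed since the last sync."""
    data: dict = field(default_factory=dict)
    dirty: set = field(default_factory=set)      # keys written since last sync
    deleted: set = field(default_factory=set)    # keys removed since last sync

    def put(self, key, value):
        self.data[key] = value
        self.dirty.add(key)
        self.deleted.discard(key)

    def remove(self, key):
        if key in self.data:
            del self.data[key]
            self.dirty.discard(key)
            self.deleted.add(key)

    def pending_changes(self):
        """Return only the delta, never the full data set."""
        return {
            "upserts": {k: self.data[k] for k in sorted(self.dirty)},
            "deletes": sorted(self.deleted),
        }

    def mark_synced(self):
        self.dirty.clear()
        self.deleted.clear()


def sync(store, server, online):
    """Send pending changes if a connection exists; return records sent."""
    if not online:
        return 0  # keep working locally, changes stay queued
    changes = store.pending_changes()
    server.update(changes["upserts"])
    for key in changes["deletes"]:
        server.pop(key, None)
    store.mark_synced()
    return len(changes["upserts"]) + len(changes["deletes"])
```

While the device is offline, writes keep accumulating locally; once a connection is available, a single `sync` call transfers just the changed records, which is where the energy and network savings come from.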

The two most important features of databases in embedded systems are performance and reliability. A database used in embedded systems should perform well on devices with limited CPU and memory. This is why embedded databases might work well in embedded systems – they are largely designed to work in exactly such environments. Some of them are truly tiny, which means they thrive in small applications. While better performance helps to eliminate some of the risks, it does not help with sudden power failures. Therefore, a good data recovery procedure is also important.
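
One building block of such recovery is the atomic transaction: an interrupted write leaves either all of its changes or none of them. The sketch below illustrates this with SQLite via Python's `sqlite3` module. The `counters` table and both function names are made up for the example, and a Python exception is only a stand-in for a real power loss, after which an ACID database would typically roll back the incomplete transaction the next time it is opened.

```python
import sqlite3


def make_db():
    """Create a small in-memory database with two counters."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE counters (name TEXT PRIMARY KEY, value INTEGER)")
    conn.executemany("INSERT INTO counters VALUES (?, ?)", [("a", 100), ("b", 0)])
    conn.commit()
    return conn


def transfer(conn, fail_midway):
    """Move 10 units from 'a' to 'b' in one transaction; optionally fail midway."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if it raises.
        with conn:
            conn.execute("UPDATE counters SET value = value - 10 WHERE name = 'a'")
            if fail_midway:
                raise RuntimeError("simulated interruption")
            conn.execute("UPDATE counters SET value = value + 10 WHERE name = 'b'")
    except RuntimeError:
        pass  # the half-finished transaction has already been rolled back
    return dict(conn.execute("SELECT name, value FROM counters"))
```

With `fail_midway=True` the first update is undone, so the two counters never get out of step with each other.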

Let’s have a look at the features of embedded databases that make them a great choice for embedded systems.

Most relevant features when choosing an embedded database

High performance. In the embedded space, resources are often restricted and some devices might depend on a battery. The performance of the database thus impacts the viability of the application and/or the business case. Evaluating the performance of the database is therefore typically one of the most relevant criteria.
Small footprint. In the embedded space, storage space is often restricted and the size of the application and thus the embedded database does matter. Embedded databases can be smaller than 1 MB, which makes them particularly suitable for mobile and IoT devices with limited storage.
Reliability. Many embedded devices use battery power, so sudden power failures might happen. Therefore, the data management solution should be built to ensure that data is reliably persisted. This is a popular feature of embedded databases that are built with embedded systems in mind.
Scalability. The number of connected devices and data volumes are growing across use cases and industries. An efficient embedded database should therefore not only perform well and scale with large sets of data, but also support a large variety of devices and operating systems.
Ease of use and low maintenance. As with any database or developer tool, ease of use (typically resulting in less maintenance work) is highly important. Embedded databases can be SQL or NoSQL-based. Depending on your developer team’s skillset, a database with native language APIs in the programming language of choice is the easiest option. Also, since embedded databases are embedded directly in the application, they do not need administration and effectively manage themselves.
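
When evaluating performance, it helps to time the four CRUD operations separately on your own data. The harness below is a rough sketch of that approach using SQLite through Python's `sqlite3` module; the `items` table and the `time_crud` function are hypothetical names. Timings from an in-memory database on a development machine say little about a restricted device, so a meaningful comparison would run the same workload on the target hardware with on-disk storage.

```python
import sqlite3
import time


def time_crud(n=1000):
    """Time n creates, a full read, a full update and a full delete."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, payload TEXT)")
    timings = {}

    start = time.perf_counter()
    with conn:  # one transaction for the whole batch
        conn.executemany(
            "INSERT INTO items (id, payload) VALUES (?, ?)",
            ((i, f"item-{i}") for i in range(n)),
        )
    timings["create"] = time.perf_counter() - start

    start = time.perf_counter()
    rows = conn.execute("SELECT id, payload FROM items").fetchall()
    timings["read"] = time.perf_counter() - start

    start = time.perf_counter()
    with conn:
        conn.execute("UPDATE items SET payload = payload || '-v2'")
    timings["update"] = time.perf_counter() - start

    start = time.perf_counter()
    with conn:
        conn.execute("DELETE FROM items")
    timings["delete"] = time.perf_counter() - start

    remaining = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    conn.close()
    return timings, len(rows), remaining
```

The function returns the per-phase timings in seconds together with the number of rows read and the number left after deletion, so a run can be sanity-checked before its numbers are compared.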

Comparison matrix of the most popular Embedded Databases

When choosing an embedded database, look for factors such as ACID (atomicity, consistency, isolation, durability) compliance, CRUD performance, footprint, and data sync. This comparison matrix should help you:

Database solution	Primary model	Minimum footprint	Sync	Languages
SQLite	Relational	<1 MB	no	C/C++, Tcl, Python, Java, Go, MATLAB, PHP, and more
Mongo Realm	Object-oriented NoSQL database	5 MB+	Sync only via Mongo Cloud	Swift, Objective-C, Java, Kotlin, C#, and JavaScript
Berkeley DB	NoSQL database; key-value store	<2 MB	no	C++, C#, Java, Perl, PHP, Python, Ruby, Smalltalk and Tcl
LMDB	Key-value store	<1 MB	no	C++, Java, Python, Lua, Go, Ruby, Objective-C, JavaScript, C#, Perl, PHP, etc.
RocksDB	Key-value store		no	C++, C, Java, Python, Node.js, Go, PHP, Rust, and others
ObjectBox	Object-oriented NoSQL database	<1 MB	Offline Sync, on-premise Sync, Cloud Sync; P2P Sync is planned	C, C++, Java, Kotlin, Swift, Go, Flutter / Dart, Python
Couchbase Lite	NoSQL DB; document store	1-5 MB	Sync needs a Couchbase Server	Swift, Objective-C, Java (Android), Java (Non-Android), Kotlin, C#, JavaScript, C
UnQLite	NoSQL; document & key-value store	~1.5 MB	no	C/C++, Python
extremeDB	In-memory relational DB, hybrid persistence	<1 MB	no	C, C#, C++, Java, Lua, Python, Rust

Conclusion

When choosing a database for an embedded system, there are several factors that should be considered. Performance, reliability, maintenance and footprint are some of the leading indicators. On highly restricted devices, even a small difference in one of those parameters can have a significant impact. While building your own solution with a particular device in mind could work, tight schedules and the additional effort typically don’t justify this decision. This is why we recommend choosing a ready-made database such as ObjectBox, which is built not just with the specifics of embedded systems in mind, but also with an efficient data sync solution on top.

There are several embedded databases that perform well on embedded devices. Each has its own benefits and drawbacks, so it’s up to you to choose the right one for your use case. But if performance is especially important for you, ObjectBox outperforms all competitors across all CRUD operations (Create, Read, Update, Delete). ObjectBox is next-gen infrastructure software for Edge Computing. In addition to fast local data storage, ObjectBox empowers decentralised, secure data flows and combines flexible data management with on-device security. You can evaluate it for yourself by checking out the code on GitHub or the open source performance benchmarks.
