At the dawn of the millennium, somewhere in the depths of a company contracted by the United States Navy, D. Richard Hipp designed SQLite so that you could finally stick a database inside an application. Its original purpose was to power guided missile destroyers, which surely could have used a fast, lightweight database that ran locally instead of on a remote database server. After all, if there’s ever been a great offline-first use case, it’s being out at sea.
In the years since, as the SQLite spec was expanded and improved, a whole new generation of mobile developers have used it inside their apps too. It may be built on a 45-year-old language, and based off a 43-year-old standard, but today, it runs pretty much anywhere — from being the stock database in Django apps to running natively on Android and iOS devices. And its ubiquity means that many apps live with it, and all the compromises it demands.
SQLite made it possible for an app to interact with the database via function calls, but today, it’s pretty unlikely that developers are interacting with SQLite’s APIs directly. It’s still possible, and allows you to avoid bringing in thousands of objects into memory, but it’s awkward. After all, you still need to write SQL to deal with data stored in the database. That’s not exactly fun to do in a modern, object-oriented language like Kotlin or Swift.
Once you do create or receive data from SQLite, you still need to make it work for you in the language you’ve written your app in. That requires painstakingly mapping and serializing the data that SQLite returns into objects that you can do real work with, and then serializing any changes back into SQLite for storage.
Which is where object-relational mapping libraries — ORMs — come in handy. They do the hard work of translating SQLite’s functions into a set of APIs that feel native to the platform you’re developing in. For iOS, developers wrap SQLite with CoreData: Apple’s long-supported, long-maligned ORM. Consider the example of an RSS reader app, where the user wants to mark all her 10,000 articles as read. With CoreData, you need to bring everything into memory, eventually, marking it as read in a loop, and re-saving each object, one at a time. SQLite gives a sane way of doing it in a single call, but CoreData expects you to treat its object-oriented abstractions seriously, and introduces problems as a result.
The way all ORMs work is by serializing database values into in-memory objects. And by working with what’s in memory instead of what’s on disk, you can easily end up working with stale data. You can, using good practice and modern ORM features, avoid this being too pernicious a problem. But every once in a while, you’ll end up with an unexpected result.
It’s called the impedance mismatch. Your ORM has to make a copy of the data in the tabular SQLite database, and turn it into a graph of objects that you can work with. By itself, that sounds fine — but the moment you start working with data from multiple threads, any ACID guarantees your SQLite database makes will go out the window. Objects can be changed by one thread in ways that break consistency, and lead other threads to make changes to data that might be stale, or simply not exist anymore. When an object makes a change on another thread, there’s nothing in SQLite or any ORM to let other objects know that they must update or cause major problems.
There are solutions, but they do have tradeoffs. The first solution is to restore the “I” in “ACID”: isolation. By using multi-version concurrency control (MVCC), you can work on a snapshot of the database, and the database can do the work of resolving conflicts between many concurrent writers. That way, you can be sure your data won’t change underneath you. But that’s also a problem: it means your data won’t be responsive. To show what the user should see, you have to go back to the database to get the latest data and the latest version.
Inevitably, we confront a simple reality: SQLite and object-oriented languages do things differently from the ground up. They model the world in different ways, and so we incur costs by converting things between the two ways of dealing with data. And we may offer a modern, object-oriented way to help us manage what SQLite does, but sometimes its representation of data simply diverge from the database. And while we could just deal directly with SQLite, we’ll likely end up making our very own bad, ad-hoc ORM.