Misconceptions about databases in the Jamstack

If you think a relational database and Jamstack aren't compatible, you've come to the right place.

Lately, I've noticed a bit of confusion about where databases fit within the Jamstack ecosystem. This confusion often comes from misconceptions about how modern databases have evolved and about Jamstack itself. In this blog post, I aim to dispel those misconceptions and give insight into how you can use relational databases, like MySQL, in your Jamstack applications.

This post will often mention MySQL, but many of these ideas apply to other relational databases such as Postgres.

How did we get here?

Before we can work through the misconceptions, we need to understand what came before Jamstack. The short version is: the LAMP and MEAN stacks.

LAMP stands for Linux, Apache, MySQL, and PHP. This stack was very popular in the late 1990s and early 2000s and still lives on today. LAMP applications rely heavily on server-side rendering. With LAMP, you have to maintain and operate your servers yourself: you install the MySQL database on the server, apply security updates, and make your own performance improvements. Server-side rendering combined with these operational challenges causes a handful of issues around reliability, security, scaling, complexity, and cost. Also, during the height of LAMP's popularity, we started seeing more caching, load balancing, and CDN layers, which improved performance but increased complexity.

In the 2010s, some folks thought there could be a better way, so they pivoted to an opposite stack: the MEAN stack. MEAN stands for MongoDB, Express, Angular, and Node. MEAN applications rely heavily on client-side rendering, which means web pages are rendered directly in the browser. You still have to interact with servers on platforms like Google App Engine, Heroku, and other PaaS solutions, just not as deeply as with LAMP applications. At the same time, there was a substantial increase in the amount of JavaScript, JavaScript-based tooling, and other infrastructure to try to make things faster. MEAN still had the same issues as LAMP but with even more front-end performance issues, because all of the logic, data fetching, templating, and routing is handled by the client, often a browser, rather than the server. As a result, many MEAN applications became slow over time.

Understanding LAMP and MEAN gives you a better idea of how we got to Jamstack (formerly known as JAMstack) in 2015. The name stands for JavaScript, APIs, and Markup. People were frustrated with the performance challenges of MEAN and did not want to manage servers anymore. At the same time, many other technologies that Jamstack applications use today had gained popularity, such as serverless architectures, APIs, git workflows, and build tools.

Jamstack is really about compiling as much as you can upfront, putting these static pages and assets in a CDN, and pulling in data as needed. However, none of this is a new idea. It combines lessons learned from LAMP and MEAN that contribute to higher-performing web applications today.

If you want to learn more about the rise of Jamstack, I recommend Shawn Wang's The Rise of Jamstack talk.

But where do databases fit in? According to the 2021 Jamstack Community Survey, over one-third of respondents said a "headless database" was one of the third-party services they were using. This number will likely increase as applications that previously used older development stacks move to the Jamstack.

You will likely see more conversations around databases in the years to come. So, let's bust some misconceptions now!

The misconceptions of databases in Jamstack applications

#1: Jamstack sites cannot be database-driven

Initially, many Jamstack sites were seen as static sites, often backed by a CMS. But have you ever thought about what a CMS is? It's a database! A headless CMS and other APIs are often just another data storage mechanism.

Jamstack applications have been database-driven from the start. Everything from dynamic content, user-generated content, dashboards, and e-commerce functionality to SaaS applications involves databases.

In the transition from LAMP and MEAN stack to Jamstack, we decided that operating servers was no longer something we wanted to do. And I agree, it was a hassle. I still remember the first time I tried installing an Apache server just to call an API. It wasn't fun. It is not easy to simultaneously focus on the back-end infrastructure and the front-end performance. This requires specialized knowledge that many developers don't have time for when they just want to build web applications that pull information from different data sources.

This takes us back to the core of Jamstack. It is really about compiling as much as you can upfront and pulling in data as needed. This doesn't mean you only pull the data at build time. It can still be done dynamically when a page loads too. We already do this with the APIs we use in our Jamstack applications like Shopify for e-commerce, Stripe for payments, Cloudinary for images, Auth0 for authorization, and many more.

#2: Relational databases, like MySQL, cannot support high connection limits

Jamstack applications need a lot of database connections, but why?

Clients of relational databases like MySQL and Postgres connect to the database server over Transmission Control Protocol/Internet Protocol (TCP/IP) connections. This is different from API requests, which use HTTP (itself sent over TCP). Behind the scenes, every user of a Jamstack application ends up opening database connections, whether from the front-end or from a serverless function. The Vercel docs on databases explain this well:

"Serverless Functions are stateless and asynchronous. They are not designed for a persistent connection to a database. When a function is invoked, a connection to the database is opened. Upon completion, the connection is closed."

These short-lived TCP/IP connections can add up when a large number of users are using a Jamstack site. Some databases cannot handle this many connections, or they have lower limits depending on the plan. (Most of this article is unbiased about what database you should use, but in this case, I would be remiss not to mention that PlanetScale can handle tens of thousands of connections out of the box. On the free tier today, you get 1,000 connections at a time.) With many relational databases, you can implement connection pooling, where a fixed set of connections is shared between requests, to prevent the database from overloading while still using a relational database. Some tools also help you with connection management and pooling, like Prisma's Data Proxy.
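To make the idea of connection pooling more concrete, here is a minimal sketch of a pool that caps the number of open connections and hands them out to callers in turn. It is an illustration under assumptions, not how any particular driver or proxy implements pooling: `Connection`, `openConnection`, `Pool`, and `demo` are hypothetical names, and the "connection" is an in-memory stand-in rather than a real TCP connection.

```typescript
// Minimal connection pool sketch. `Connection` and `openConnection` are
// hypothetical stand-ins, not the API of any real database driver.
interface Connection {
  id: number;
  query(sql: string): Promise<string>;
}

let opened = 0; // counts how many connections were actually opened

function openConnection(): Connection {
  const id = ++opened;
  return {
    id,
    // Pretend to run a query; a real driver would talk to the server over TCP.
    query: async (sql: string) => `conn ${id} ran: ${sql}`,
  };
}

class Pool {
  private readonly max: number;
  private total = 0;
  private idle: Connection[] = [];
  private waiters: Array<(conn: Connection) => void> = [];

  constructor(max: number) {
    this.max = max;
  }

  // Hand out an idle connection, open a new one if under the limit,
  // otherwise wait until another caller releases one.
  private acquire(): Promise<Connection> {
    const conn = this.idle.pop();
    if (conn) return Promise.resolve(conn);
    if (this.total < this.max) {
      this.total++;
      return Promise.resolve(openConnection());
    }
    return new Promise<Connection>((resolve) => this.waiters.push(resolve));
  }

  // Pass the connection straight to the next waiter, or park it as idle.
  private release(conn: Connection): void {
    const next = this.waiters.shift();
    if (next) next(conn);
    else this.idle.push(conn);
  }

  // Borrow a connection for one unit of work and always give it back.
  async use<T>(work: (conn: Connection) => Promise<T>): Promise<T> {
    const conn = await this.acquire();
    try {
      return await work(conn);
    } finally {
      this.release(conn);
    }
  }
}

// Usage: 50 concurrent "requests" share at most 5 connections.
async function demo(): Promise<number> {
  const pool = new Pool(5);
  const requests = Array.from({ length: 50 }, (_, i) =>
    pool.use((conn) => conn.query(`SELECT ${i}`))
  );
  await Promise.all(requests);
  return opened; // number of connections the pool had to open
}
```

The database only ever sees as many connections as the pool allows, no matter how many requests arrive at once. In practice you would rely on the pooling built into your driver, ORM, or database platform rather than writing your own.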

#3: You can only connect to your database via a query API

While a core principle of Jamstack is that it doesn't require a web server, this doesn't mean you can't communicate with a hosted database server, such as a database as a service. It's easy to see where the idea that your database has to be accessible via an API comes from: the "A" in Jamstack. But the reality is that there are still ways to connect to a database that do not go through a query API provided by your database provider. You can use either a database client, like mysql2, or an object-relational mapping (ORM) tool, like Prisma or Sequelize, that also provides a database client.

For example, in Next.js, a common framework used in Jamstack applications, you can fetch data from a database:

At build time with getStaticProps
At request time with getServerSideProps
Using API routes (e.g., inside a /pages/api/…) – often implemented as a serverless function
Or by separating the back-end and using a stand-alone server
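The first two options in the list above can be sketched roughly as follows. This is a sketch under assumptions rather than a complete page: `Product`, `fakeRows`, and `queryProducts` are hypothetical stand-ins for a real table and a real database client, and the functions are shown without the `export` keyword and without Next.js types so the snippet runs on its own.

```typescript
// Sketch only: `Product`, `fakeRows`, and `queryProducts` are hypothetical
// stand-ins for a real table and a real client (e.g. mysql2 or Prisma).
type Product = { id: number; name: string; inStock: boolean };

const fakeRows: Product[] = [
  { id: 1, name: "Keyboard", inStock: true },
  { id: 2, name: "Monitor", inStock: false },
];

// Stand-in for a query such as `SELECT id, name, in_stock FROM products`.
async function queryProducts(onlyInStock: boolean): Promise<Product[]> {
  return fakeRows.filter((p) => !onlyInStock || p.inStock);
}

// Build time: Next.js calls getStaticProps while generating the page, so the
// query runs during the build rather than for every visitor. In a real page
// file under pages/, this function must be exported.
async function getStaticProps() {
  const products = await queryProducts(false);
  return { props: { products } };
}

// Request time: Next.js calls getServerSideProps on every request, so the
// page reflects the rows as they are at that moment. In a real page file,
// this function must be exported as well.
async function getServerSideProps() {
  const products = await queryProducts(true);
  return { props: { products } };
}
```

In both cases the returned `props` object is what the page component receives. The only difference is when the database is queried.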

Where you access your database depends on your use case. It depends on where you want to render or store the data, the type of data, and where you want database logic to live. It might make sense to create a serverless API function for modularization and reuse in some cases. It can reduce the complexity of your client code and increase portability because it is easier to reason about different parts of the application when broken up. You can easily swap out and change parts of the application. For simplicity, you can also treat the serverless API function the same way you treat other third-party APIs in your application.
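As a rough illustration of keeping database logic behind a serverless API function, the sketch below pairs a Next.js-style API route handler with client code that calls it like any other HTTP API. Everything specific here is an assumption for illustration: the `/api/products` route, the `listProducts` stand-in query, and the minimal `ApiRequest`, `ApiResponse`, and `Fetcher` types, which take the place of the request and response types a real Next.js project would import from the `next` package.

```typescript
// Hypothetical minimal shapes; a real Next.js project would use the
// NextApiRequest and NextApiResponse types from the "next" package.
type ApiRequest = { method?: string };
type ApiResponse = {
  status(code: number): ApiResponse;
  json(body: unknown): void;
};

type Product = { id: number; name: string };

// Stand-in for a real database query made with a client or an ORM.
async function listProducts(): Promise<Product[]> {
  return [{ id: 1, name: "Keyboard" }];
}

// Would be the default export of a hypothetical file such as
// pages/api/products.ts. All database access stays on this side.
async function handler(req: ApiRequest, res: ApiResponse): Promise<void> {
  if (req.method !== "GET") {
    res.status(405).json({ error: "Method not allowed" });
    return;
  }
  res.status(200).json({ products: await listProducts() });
}

// The part of `fetch` this sketch relies on, so it can be swapped in tests.
type Fetcher = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Client side: the route is consumed like any third-party HTTP API, with no
// knowledge of which database (if any) sits behind it.
async function loadProducts(fetcher: Fetcher): Promise<Product[]> {
  const response = await fetcher("/api/products");
  if (!response.ok) throw new Error(`Request failed: ${response.status}`);
  const body = (await response.json()) as { products: Product[] };
  return body.products;
}
```

In the browser you would call `loadProducts(fetch)`. Because the client only depends on the route's JSON shape, the query, the client library, or even the database itself can change without touching the front-end code.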

Conclusion

This brings us back to where we started: The "A" in Jamstack. The "A" does not mean a third-party API is the only way data can get pulled into a Jamstack application. As Jamstack.org says, "the ability to leverage domain experts who offer their products and service via APIs has allowed teams to build far more complex applications than if they were to take on the risk and burden of such capabilities themselves." Similarly, you should probably not burden yourself with maintaining and operating your database. Instead, find one that works for your Jamstack use case.
