Most relational database technologies can now store JSON data and perform queries on it. The introduction of the JSON data type to relational databases represented a turning point and opened up many new possibilities. Considering the innovative nature of this feature, you might have several doubts about whether adopting JSON data is the right choice. In particular, the risk of abusing of JSON and losing the benefits of the relational model is high. That is why having a clear understanding of the advantages and disadvantages that JSON data can bring to a relational database is so important.
Here, you will see everything you need to know about JSON columns and what are the best scenarios in which to rely on them.
What Is JSON?
JSON stands for JavaScript Object Notation and represents a lightweight, easy-to-understand, human-readable data format. In a relational database, the JSON data type is a special binary or text column that allows you to store data in JSON format.
{ "user": { "id": "1", "name": "Maria", "surname": "Williams", "hobbies": [ { "name": "Tennis", "since": 1985 }, { "name": "Reading", "since": 1974 } ] } }
The relational databases that support JSON generally come with JSON-specific functions and operators to query and perform operations on this data type.
Why Does Relational Databases Support JSON Data?
Over time, the features offered to users have become increasingly complex. As a result, it is unlikely that the right data structure to support an application can be purely relational. The importance of data is increasing, and the way to represent it is evolving accordingly. Those who develop relational database technologies know this and have introduced the JSON data type to support the new way of thinking about data.
MySQL introduced JSON support starting from version 5.7, in 2015. PostgreSQL supports JSON data type since version 9.2, released in 2012. SQL Server has supported JSON since SQL Server 2016. As you can see, the most popular RDBMSes enable storing JSON data. This does not mean that the relational model is dying or is no longer useful. On the contrary, it is simply evolving and adapting to new needs.
After all, relational database technologies support JSON only as a data type. This means that you can add one or more JSON columns to your relational tables. This does not radically disrupt the relational approach and represents just a possibility, a useful and powerful feature to have.
When To Store JSON Data in a Relational Database
Embracing the JSON data type and adding JSON columns to your tables can bring several advantages to your relational schema. This is especially true in the following scenarios.
For logging
You should be able to easily read and understand your log data, and the JSON format is a great way to store it. Therefore, you should consider turning your log data into JSON format and storing it in JSON columns. In addition, having a JSON column containing log data is an effective approach to keeping track of what happened just by looking at a row. With this approach, you do not have to define new tables, perform JOIN
s, or spend time retrieving the log data.
JSON columns are also useful for logging where the data came from, especially when importing data via API. Considering that API responses are generally in JSON format, storing them in a JSON column is an effortless solution to not lose this data and exploit it when needed.
To store permissions and configurations
Not all users may have access to the same features and sections of your application. Similarly, each user might configure your application based on their preferences. These are two common scenarios and involve data that changes a lot over time. This is because your application is likely to evolve, involving new configurations, views, features, and sections. As a result, you have to continuously update your relational schema to match the new data structure. This takes time and energy.
Instead, you can store permissions and configurations in a JSON column directly connected to your user table. Also, JSON is a good data format for your permissions and configuration. In fact, your application is likely to treat this data in JSON format.
To avoid slow performance on highly nested data
If performance is a key requirement and your application involves data that is highly nested, you should consider JSON columns. This is particularly true when you have to use it as one piece. In a relational approach, you would have to structure your data in several tables. To retrieve it, you would have to nest various JOIN
s, which would make queries very slow.
On the contrary, by storing this nested data in a JSON column, you can retrieve it with a simple SELECT query. This would make your data retrieval process faster while keeping your data structure easy.
When To Avoid JSON Data in a Relational Database
Some consider using the JSON data type as an antipattern and something to adopt sparingly. The reason is that using a format without constraints such as JSON can undermine your relational schema. In detail, you should avoid JSON in the following cases.
You are not sure what data to store in the JSON column
The JSON format is flexible, but this does mean that you can use it to store whatever you want. Before adding JSON columns to your database, you must clearly define their purpose. Otherwise, they are likely to become junk drawers. If this happens, the consequences can be deadly for the future of your application. The only way you have to avoid it is by designing a data format for your JSON columns.
Avoid using JSON columns if you do not know how to use them. Having JSON columns storing data in different formats means having no data validation on the database level, which can lead to no consistency or integrity at the application level.
You do not want to deal with complex queries
Storing data in JSON format is cool, but do not forget that it comes with additional costs and complications. One of these is that querying data in JSON columns leads to more complicated-looking queries.
Here is what a query involving JSON columns looks like in PostgreSQL:
And this is the equivalent query in a traditional scenario:
The second one is much easier to read and understand. Avoid using JSON if the benefit brought by JSON is not enough to justify the complexity of the new queries.
You have a strongly typed ORM
When mapping your database with a strongly typed ORM technology, you may encounter problems with types. This is because types at the database level are different from types at the application level. When you want to deal with the data stored in JSON columns at the application level, you have two approaches.
The first one is to define JSON columns in your ORM mapping as a string and then transform them into an object. Otherwise, if your ORM supports this option, you can map the column with a custom structured type. In either case, the application or the ORM must perform the data conversion behind the scenes. This comes at a cost and can compromise backend performance.
Conclusion
Data has evolved, and the relational model might no longer be enough to represent data efficiently. This is why relational database technologies introduced ways to deal with unstructured data. Specifically, the most popular RDBMSes available now support the JSON data type. Thanks to this, you can the benefits of NoSQL into your relational database. Here, we looked at what JSON data type is, why it was introduced, and when to use or avoid it.