Databases: Switching from relational to document models, Part 1
Reading Time: 5 minutes

How about more insights? Check out the video on this topic.

blank

This is a three-part blog series. Part two is located here, and part three can be found here.

In this blog, we will review relational databases and compare the idea of normalization between relational and document-oriented databases, which is a common challenge for those migrating from a relational to a document model. We will also discuss the SQL language, storage engines, and joins. Later, we will compare the model for normalization and discuss different ways to organize your document-oriented database. We will also delve into the query language, although it’s important to note that depending on the specific database, it could vary.

Finally, we will discuss best practices and common mistakes when migrating from a relational database to a document-oriented database. This blog mainly focuses on MongoDB, FerretDB, CosmosDB, and other common document-oriented databases. If you have any remaining questions, feel free to ask them in the comments below.

Relational Databases – Review

Now I will discuss the relational database, which will serve as a review for those who have never worked with a relational database before. This will give you an idea of what a relational database is, as they are commonly used nowadays. I will also mention some of the most popular relational databases currently in use, including Oracle, PostgreSQL, MySQL, and Microsoft SQL Server.

In the 1970s, Edgar Frank Codd wrote the first paper on relational databases. This was an important milestone for the development of the initial database. Nine years later, Oracle released their first commercial relational database, which was followed by others such as DB2 and Informix.

When these databases were first designed, they were created to use as little resources as possible due to the high cost of storage and processing power at the time. They were designed to run on single hosts and provide good performance with these limited resources.

Despite being over 40 years old, relational databases are still widely used today and are essential to numerous systems. Many of these systems are able to run smoothly with the initial design of these databases.

Relational Databases – Normalization

When discussing document-oriented databases, it’s important to consider the types of workloads that are best suited for them. In contrast, when working with relational databases, normalization is a crucial step in the database design process. This involves splitting tables to avoid duplicate data, save space, and ensure consistency. Relational databases use SQL, and there are several pros and cons associated with data normalization. On the positive side, normalization helps to avoid data duplication, saves space, and can improve performance by cashing small tables. However, there are also some cons to consider, such as a high dependency among tables, making it difficult to retrieve information if a key table is lost. Additionally, there can be a strict schema, making it challenging to add fields or change the primary key.

As an example, consider a simple normalized system for orders that includes customer, user, orders, and order details tables. The customer table includes the customer’s full name, birthday, and creation date, while the user table tracks which application created the customer. The orders table contains minimum fields such as order ID, customer ID, and created date, while the order details table includes the product ID, name, amount, and final price. This is a typical layout for a standard relational database.

blank

Relational Databases – SQL Language and Joins

Relational databases have different kinds of commands such as data manipulation commands (e.g., delete, update, select, join), and data definition commands (e.g., create table, create index). Some relational databases have their own SQL functions for certain operations, like creating backups or adding external data sources.

There is a standard SQL that most relational databases use, but some may have specific commands that are unique to them. It’s important to know these differences because migrating from one relational database to another may not work due to specific codes being used in data definition language (DDL). Some operations may block other operations, but modern storage engines and databases try to avoid blocking as much as possible.

To take advantage of normalization, you need to tell the database what you need. For example, to find all the orders made by a customer called “Joe” you would need to use an inner join to fetch data from child tables. The query optimizer will find the fastest path to retrieve information using indexes, statistics, or other information. Once the query is completed, the relational database returns the results to the client, which could be an application or a user.

This is a three-part blog series. Part two is located here, and part three can be found here.

0 Comments

About the author

blank

Adamo Tonete

MongoDB SME

With decade of experience in NoSQL Databases, Adamo have worked for different companies such as Percona, MongoDB and PingCAP. Currently Adamo works as a MongoDB SME for an Irish company, maintening a 24×7 system.

Related posts

Subscribe to Updates

Privacy Policy