How To Avoid Bottlenecks in Node.js Applications

Recently, I dealt with a serious performance problem in a Node.js application. My backend, built with Express, Sequelize, and Mongoose, was often unable to respond to any request for minutes at a time.

Let’s see what I did to identify the bottlenecks, why such issues can happen to any Node.js developer, and how to avoid them.

Detecting the Bottlenecks

I had monitored my VPSs for days without noticing anything alarming before realizing that the problem was elsewhere. Each database used by the application runs on its own VPS. The data set is very large, with hundreds of millions of records, but the 16 GB, 4-CPU VPSs I am using never came close to saturating their resources. Still, the problem was there.

When monitoring the application and the databases, I found out that everything seemed fine, even while my backend was struggling to respond.

First of all, I had to reproduce the problem. To do so, I stress-tested my application until the issue showed up. It occurred only with large queries, not with CPU-intensive operations.

To understand what was going on, I started monitoring the result of the following query:

SHOW PROCESSLIST

With this special query, MySQL returns a process list indicating the operations currently being performed by the set of threads executing within the server.

Bingo! By observing the result of this query while stressing my application, I was able to understand where the issue originated. My Express application never used more than five database connections at a time, and any additional request was queued.
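
To make this kind of observation easier to repeat, here is a minimal sketch of a helper that summarizes the rows returned by SHOW PROCESSLIST. It assumes the rows have already been fetched with whatever driver you use and that they keep MySQL's column names (User, Command, Time). The function name, the "app" user, and the sample rows are hypothetical.

```javascript
// Summarize rows shaped like the output of SHOW PROCESSLIST.
// Idle connections are reported by MySQL with Command = "Sleep".
function summarizeProcessList(rows, appUser) {
  const mine = rows.filter((row) => row.User === appUser);
  const active = mine.filter((row) => row.Command !== "Sleep");
  return {
    connections: mine.length,
    active: active.length,
    idle: mine.length - active.length,
    // Time is the number of seconds the thread has spent in its current state.
    longestRunningSeconds: active.reduce((max, row) => Math.max(max, row.Time), 0),
  };
}

// Hypothetical sample rows; real ones would come from the database driver.
const sampleRows = [
  { Id: 11, User: "app", Command: "Query", Time: 42, State: "Sending data" },
  { Id: 12, User: "app", Command: "Query", Time: 17, State: "Sending data" },
  { Id: 13, User: "app", Command: "Query", Time: 9, State: "Sorting result" },
  { Id: 14, User: "app", Command: "Query", Time: 3, State: "Sending data" },
  { Id: 15, User: "app", Command: "Sleep", Time: 120, State: "" },
  { Id: 20, User: "root", Command: "Query", Time: 0, State: "starting" },
];

console.log(summarizeProcessList(sampleRows, "app"));
```

If the connections count stays pinned at the same small number while requests pile up, the pool limit is a likely suspect.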

What Is a Connection Pool?

Opening and closing a database connection costs both CPU time and memory, so both are expensive operations. If an application opened a new connection for every request to the database, performance would suffer heavily.

To save resources and reuse connections, most database drivers use a different approach. Instead of establishing a new connection every time they need to perform an operation on the database, they borrow a connection from a special cache. As soon as the required operation is over, they give it back, releasing it. This special cache is known as the connection pool.

The downside of this approach is that when the connection pool limit is reached, any new request has to wait until an earlier request releases its connection.
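
The borrow, release, and wait behavior can be illustrated with a toy pool. This is only a sketch of the general idea, not how Sequelize or Mongoose implement their pools. The TinyPool class and the plain objects standing in for connections are hypothetical.

```javascript
// A toy connection pool: at most `max` connections are handed out,
// and further acquire() calls wait until someone calls release().
class TinyPool {
  constructor(max) {
    this.max = max;
    this.inUse = 0;     // connections currently borrowed
    this.created = 0;   // used only to give each fake connection an id
    this.waiters = [];  // resolve callbacks of requests waiting for a connection
  }

  acquire() {
    if (this.inUse < this.max) {
      this.inUse += 1;
      this.created += 1;
      // A real pool would open or reuse a database connection here.
      return Promise.resolve({ id: this.created });
    }
    // Pool exhausted: the caller waits in a FIFO queue.
    return new Promise((resolve) => this.waiters.push(resolve));
  }

  release(connection) {
    const next = this.waiters.shift();
    if (next) {
      next(connection); // hand the connection straight to the oldest waiter
    } else {
      this.inUse -= 1;
    }
  }
}

async function demo() {
  const pool = new TinyPool(2);
  const first = await pool.acquire();
  const second = await pool.acquire();
  const third = pool.acquire(); // pool is full, so this promise stays pending
  console.log("waiting requests:", pool.waiters.length);
  pool.release(first); // unblocks the third request
  const reused = await third;
  console.log("third request reused the first connection:", reused === first);
  pool.release(second);
  pool.release(reused);
}

demo();
```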

This is why the pool size should be chosen carefully. If your data is not very large or you do not expect many simultaneous users, you might not encounter any problems. Otherwise, this may easily become the bottleneck of your application.

The size of the connection pool is configurable in most database drivers, and the default value varies from driver to driver. The most important factors to consider when looking for the right value are the expected application load and the concurrency level.
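
As a starting point for turning load and concurrency into a number, a back-of-the-envelope estimate based on Little's law can help: the average number of busy connections is roughly the query rate multiplied by the average query duration. This heuristic is my own addition rather than a rule from either library, and the function name, the headroom factor, and the figures below are hypothetical.

```javascript
// Rough pool size estimate: busy connections ~= queries per second * average query time.
// `headroom` is an arbitrary safety factor to absorb bursts.
function estimatePoolSize(queriesPerSecond, avgQueryMs, headroom = 1.5) {
  const averageBusyConnections = (queriesPerSecond * avgQueryMs) / 1000;
  return Math.max(1, Math.ceil(averageBusyConnections * headroom));
}

// 50 queries per second, each taking about 200 ms, keep about 10 connections busy.
console.log(estimatePoolSize(50, 200));
```

Treat the result as a first guess to validate with a stress test, not as a final answer.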

Let’s see what the default connection pool size value is in Sequelize and Mongoose and how to configure it.

Configuring the Connection Pool Size in Sequelize

Whenever your application connects to MySQL, the server dedicates a thread to that connection, which is why every open connection shows up as an entry in the process list.

As stated in the official documentation, Sequelize sets up a connection pool on initialization that can be configured through the constructor’s options parameter.

By default, this pool is limited to 5 connections. This means that your application will use no more than five connections at a time, no matter how many requests it gets. As explained before, if six users concurrently hit any of your APIs that need to connect to the database, one of them will be queued and have to wait.
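
To get a feeling for what this queuing costs, here is a simplified model. It assumes that all requests arrive at the same moment and that every query takes the same time, which is rarely true in practice. The function name and the numbers are hypothetical.

```javascript
// With `poolMax` connections, simultaneous requests are served in "waves".
// The last request completes after ceil(requests / poolMax) query durations.
function lastResponseMs(concurrentRequests, poolMax, queryMs) {
  const waves = Math.ceil(concurrentRequests / poolMax);
  return waves * queryMs;
}

console.log(lastResponseMs(6, 5, 1000));   // the sixth user waits for a second wave
console.log(lastResponseMs(100, 5, 1000)); // 100 users need 20 waves
```

With slow queries and many concurrent users, the waves add up quickly, which matches the minutes-long stalls described earlier.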

You can modify this value by changing the pool.max parameter value when calling the Sequelize constructor as follows:

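const sequelize = new Sequelize(
    /* ... */,
    {
        // ...
        pool: {
            max: 5, // maximum number of connections in the pool (default: 5)
            min: 0, // minimum number of connections kept in the pool
            acquire: 30000, // maximum time, in ms, to wait for a connection before throwing an error
            idle: 10000 // maximum time, in ms, a connection can stay idle before being released
        }
    }
);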
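
Since the right value usually differs between environments, it can be convenient to read it from the environment instead of hard-coding it. The sketch below builds the pool object from an environment variable. The DB_POOL_MAX variable name and the buildPoolOptions helper are hypothetical, and the fallback values mirror the ones shown above.

```javascript
// Build the `pool` option for the Sequelize constructor from environment variables.
// Falls back to 5 when DB_POOL_MAX is missing or not a positive integer.
function buildPoolOptions(env) {
  const parsed = Number.parseInt(env.DB_POOL_MAX, 10);
  const max = Number.isInteger(parsed) && parsed > 0 ? parsed : 5;
  return { max, min: 0, acquire: 30000, idle: 10000 };
}

console.log(buildPoolOptions({ DB_POOL_MAX: "20" }));
console.log(buildPoolOptions({})); // no variable set: falls back to the default

// Intended use (connection details elided):
// new Sequelize(/* ... */, { pool: buildPoolOptions(process.env) });
```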

Please note that the right value for the max attribute also depends on your MySQL server configuration.

Keep in mind that mysqld actually permits only max_connections client connections, plus an extra connection reserved for use by accounts that have the SUPER privilege. In MySQL 5.6, its default value is 151.

When max_connections is reached, the server rejects any further connection attempt with a "Too many connections" error. This differs from what happens at the application layer, where requests are queued.
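
Because every running instance of the application keeps its own pool, the number to compare against max_connections is the pool size multiplied by the number of instances. Here is a small sanity check under that assumption. The function name, the reservedForOthers parameter (connections set aside for admin tools, cron jobs, and so on), and the figures are hypothetical.

```javascript
// Check whether all application pools together fit within the server limit.
function connectionBudget(instances, poolMax, maxConnections, reservedForOthers = 0) {
  const demand = instances * poolMax;                    // worst case: every pool is full
  const available = maxConnections - reservedForOthers;  // what the app may actually use
  return { demand, available, fits: demand <= available };
}

// Four instances with pool.max = 50 against the default max_connections of 151.
console.log(connectionBudget(4, 50, 151, 10));
// Four instances with pool.max = 30 leave some room.
console.log(connectionBudget(4, 30, 151, 10));
```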

Finding a good value for max_connections is complex and outside the scope of this article. Instead, I recommend reading this page from the MySQL official documentation.

Configuring the Connection Pool Size in Mongoose

The exact same issue may also occur when using Mongoose to deal with MongoDB.

As stated in the official documentation, each connection, whether created with mongoose.connect or mongoose.createConnection, is backed by an internal configurable connection pool. In Mongoose 5, its default maximum size is again 5.

You can modify this value by changing the poolSize option when calling mongoose.connect, as shown below. Note that Mongoose 6 and later use the maxPoolSize option instead, with a default of 100.

mongoose
    .connect(
        /* ... */,
        {
            // ...
            poolSize: 5 // maximum number of connections in the pool (Mongoose 5 default)
        }
    );
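
If the same code has to run against different Mongoose versions, the option name can be chosen at runtime. The sketch below assumes that Mongoose 6 and later expect maxPoolSize where Mongoose 5 expects poolSize, and that the installed version is available as a string such as mongoose.version. The poolOptionsFor helper is hypothetical.

```javascript
// Pick the pool size option name from the Mongoose major version.
// Assumption: Mongoose 5.x uses `poolSize`, 6.x and later use `maxPoolSize`.
function poolOptionsFor(mongooseVersion, size) {
  const major = Number.parseInt(String(mongooseVersion).split(".")[0], 10);
  return major >= 6 ? { maxPoolSize: size } : { poolSize: size };
}

console.log(poolOptionsFor("5.13.2", 20));
console.log(poolOptionsFor("6.0.0", 20));

// Intended use (connection string elided):
// mongoose.connect(/* ... */, { ...poolOptionsFor(mongoose.version, 20) });
```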

As with MySQL, the poolSize value should take your MongoDB server configuration into account. Specifically, the maximum number of simultaneous connections that mongos or mongod will accept is defined by the maxIncomingConnections setting. In MongoDB 4.4, its default value is 65536.

According to the MongoDB documentation, this parameter should have a value greater than the maximum size of your connection pool.

Please keep in mind that dealing with the maxIncomingConnections parameter may be tricky and you should not change it without a very good reason.

Conclusion

In this article, we looked at how connection pools may become bottlenecks in your Node.js application. Knowing what a connection pool is and how it works is vital to configuring it properly. In fact, the default limit that comes with Sequelize and Mongoose may have a detrimental effect on your application's performance. As shown, this is especially true when dealing with concurrent data-intensive requests.

Thanks for reading! I hope that you found this article helpful.