Node.js Child Process - Best Use Cases

Siva Kishore G
Posted : 13 Aug 2022
Modified : 17 Aug 2022

Node JS Architecture

A Node.js process is single threaded with non-blocking IO. In reality, though, it runs on a multi-threaded underlying system that is generally not exposed to the developer. Node.js also has a built-in capability to spawn sub-processes, through which one can execute low-level system commands, both synchronously and asynchronously.

With a single thread, the parent Node.js process can handle only so much load. This is where child processes shine in tackling scalability.

Child Process

Spawning a child process is as simple as importing the built-in child_process module and calling one of its functions: spawn(), fork(), exec() or execFile(). In essence, a child process lets us execute system commands and capture their output.

  • Spawn

    Spawn takes the system command and its arguments and emits the output as events, using the EventEmitter API. Use it when a large data output is expected.

  • Exec

    Exec creates a shell to execute the command that is passed to it. The output is buffered and handed as a whole to the callback function. Use it for smaller outputs, since the entire output is buffered in memory before being passed to the callback.

  • ExecFile

    ExecFile is very similar to Exec, except that it does not spawn a shell, which makes it perform slightly better. A word of caution when executing Windows-specific .bat and .cmd files: these require a shell, so fall back to exec or to spawn with the shell option enabled.

  • Fork

    Fork is similar to spawn, but the biggest difference is that a communication channel is attached to the forked process, through which the parent and child can exchange messages.

To get more in-depth details, please refer to the official docs.

Best Use Cases

In these use cases, I will explain the implementation at both the front end and the back end. We will work with an Express-powered web application.

1. Ping

Let's say you are at a digital marketing company and you want to check the health of your clients' websites. A quick check is to ping the domain. When you run ping in the terminal, you get the data incrementally (as a stream).

By default, each chunk of data received is a Buffer; simply call data.toString() to convert it to human-readable text. Let's see how to use spawn and send the data using Express.

app.post('/test-page', (req, res) => {
  var {spawn} = require('child_process') // Declare this outside if you use spawn in multiple routes
  var sp = spawn('ping', ['htmljstemplates.com']) // On Linux, add '-c', '4' so ping stops after four packets and the stream ends
  sp.stdout.on('data', (data) => {
    res.write(data.toString()) // Stream each chunk to the client as it arrives
  })
  sp.stderr.on('data', (data) => {
    res.write(data.toString())
  })
  sp.on('close', (code) => {
    res.end()
  })
})

Express is quite capable of sending data as a stream: call res.write('Some data') for each chunk and, at the end, call res.end(). Notice the 'close' event listener, where I end the response. The 'code' will be 0 if the command exited successfully, and a non-zero value if it encountered an error.

Let's see how to implement this at the front end, i.e. how to read the data incrementally using JavaScript.

(async () => {
  var req = await fetch('/test-page', {method: 'post'});
  if (req.status !== 200) return console.log(req.statusText)
  var reader = req.body.getReader(); // Read the response body as a stream
  var result;
  var decoder = new TextDecoder('utf8');
  while (!result?.done) {
    result = await reader.read();
    let chunk = decoder.decode(result.value);
    console.log(chunk); // e.g. document.getElementById('insertData').innerHTML += `<div>${chunk}</div>`
  }
})()

The code above outputs the data as soon as it is streamed from the server. I've added an optional commented line to show how to display the data to the user. This is a much better user experience than a generic ajax or fetch call, such as the one below.

$.ajax({
  url: '/test-page',
  type: 'post',
  cache: false,
  data: {url:'htmljstemplates.com'},
  success: function(result){
    console.log(result)
  },
  error: function(err){
    console.log(err)
  }
})

The code above buffers the entire response and delivers it to the user all at once. To the user, the page will look slow and unresponsive. If you insist on this approach, I recommend showing a "loading" spinner while the request completes.

2. Git

Some Git commands output all of their data at once, for example git status. In this case, we can use exec, since its syntax is very simple and employs a callback style.

var {exec} = require('child_process')
app.post('/test-page', (req, res) => {
  exec('git status', (err, stdout, stderr) => {
    if (err) return res.status(500).send("Error fetching the status") // 500, since this is a server-side failure
    res.send(stdout)
  })
})

We can run any system command and capture its output programmatically. Now you must be thinking: wait, can I build my own deployment pipeline? Absolutely.

If you are using forever to run your Node application, create an exec call and pass it git pull && forever restart index.js.

Run this in a separate process, since calling forever restart will disconnect the child process and make the application unstable.

3. Startup

Every web application involves some kind of database, MongoDB and SQL being the most common. A common practice is to save the connection once opened and reuse it. Opening a connection is the most expensive and time-consuming step; once it's done, subsequent data transfers are relatively fast.
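The reuse pattern itself can be sketched in a few lines; connect() here is a stand-in for whatever driver call actually opens the connection:

```javascript
var cached = null  // Saved connection, reused across requests
var opens = 0      // Counts how often the expensive open actually happens

function connect() {
  opens++
  return { id: opens } // Stand-in for a real driver connection object
}

function getConnection() {
  if (!cached) cached = connect() // Pay the expensive cost only once
  return cached
}

var first = getConnection()
var second = getConnection()
console.log(opens) // prints 1: the connection was opened only once
```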

Whenever we update the code, we usually restart the Node process. The first user online after the restart then has to wait for a fresh database connection to open, since there wasn't any previous connection to reuse. This is not a good user experience.

In this scenario, we create a pool of startup services and offload it to a new "slow.js" file. We fork this file and send it messages to initialize the databases. As a developer, this also creates a better opportunity to identify issues and resolve them.

const {fork} = require('child_process')
const forked = fork('slow.js')
forked.on('message', (msg) => {
  console.log(msg)
})
forked.send("mongo")
forked.send("sql") // Must match the case label in slow.js
// slow.js
process.on('message', (msg) => {
  switch (msg) {
    case 'mongo':
    // Start Mongo DB
    require('../db/mongoDB').init((err, data) => {
      if (err) {
        // Send notification to admin regarding the failed process
        return process.send("Unable to start Mongo DB")
      }
      process.send("Mongo DB started")
    })
    break;

    case 'sql':
    // Start SQL
    require('../db/sql').init((err, data) => {
      if (err) {
        // Send notification to admin regarding the failed process
        return process.send("Unable to start MySQL")
      }
      process.send("MySQL started")
    })
    break;

    default:
    break;
  }
})

Using Web Sockets

A typical web server can only send and receive data for as long as the request hasn't timed out. For example, the nginx "server request timeout" is 60 seconds, and for Node.js (6.x) it is 2 minutes. See below how to increase the timeout.

// Increasing the timeout globally
var server = app.listen(9001)
server.setTimeout(300000) // 300 Seconds -> 5 Minutes; 0 -> no timeout
// Increasing timeout for a specific route
app.post('/some-url',(req,res)=>{
  req.setTimeout(30000) // 30 Seconds; 0 -> no timeout
  // Some large computation
  res.send('RESULT')
})

So, what happens if the request doesn't complete within the stipulated timeout? You get an HTTP 504 Gateway Timeout error.

Web sockets to the rescue! Web sockets don't wait for a response in the request/response sense; they are driven by events. A handler only runs when a message arrives, which sidesteps the concept of a request timeout.

Let's use Socket.IO.

// Server
const io = require('socket.io')({ cors: {origin: "*"} }); // Optional CORS
var {spawn} = require('child_process')
io.on('connection', client => {
  client.on('get-server-status', data => {
    var sp = spawn('sudo', ['service', 'nginx', 'status'])
    sp.stdout.on('data', (data) => {
      client.emit('server-status', data.toString())
    })
    sp.on('close', (code) => {
      client.emit('command-status', `Program exited with code : ${code}`) // Code 0: ran successfully. Non-zero: encountered an error
    })
  })
})

// Client
var socket = io("<URL OF THE WEBSITE>");
socket.on('connect', () => {
  socket.emit("get-server-status", 'Optional message in plain text or json')
})
socket.on('server-status', (resp) => {
  console.log("Server response", resp)
})
socket.on('command-status', (resp) => {
  console.log(resp)
})

Conclusion

Child processes let a single-threaded Node.js application offload work to the underlying system: spawn for streaming large outputs, exec and execFile for smaller buffered outputs, and fork for child Node processes with a built-in communication channel. Combined with streamed responses or web sockets, they make tasks like health checks, Git automation and slow startup work both scalable and responsive.


