
doc: move topics/guides to website

This commit removes the topics and guides that the documentation
working group has proposed adding to the website. We want them to have
more visibility and believe that moving them to the website does that.

Ref: https://github.com/nodejs/nodejs.org/pull/1105
Fixes: https://github.com/nodejs/node/issues/10792
PR-URL: https://github.com/nodejs/node/pull/10896
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Stephen Belanger <admin@stephenbelanger.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Sam Roberts <vieuxtech@gmail.com>
v6 · Evan Lucas · 8 years ago · commit 7eef09ddcf

5 changed files, 1258 deletions:

1. doc/guides/timers-in-node.md (192 deletions)
2. doc/topics/blocking-vs-non-blocking.md (143 deletions)
3. doc/topics/domain-postmortem.md (301 deletions)
4. doc/topics/domain-resource-cleanup-example.js (136 deletions)
5. doc/topics/event-loop-timers-and-nexttick.md (486 deletions)

doc/guides/timers-in-node.md (192 deletions)

@@ -1,192 +0,0 @@
---
title: Timers in Node.js
layout: docs.hbs
---
# Timers in Node.js and beyond
The Timers module in Node.js contains functions that execute code after a set
period of time. Timers do not need to be imported via `require()`, since
all the methods are available globally to emulate the browser JavaScript API.
To fully understand when timer functions will be executed, it's a good idea to
read up on the Node.js
[Event Loop](../topics/event-loop-timers-and-nexttick.md).
## Controlling the Time Continuum with Node.js
The Node.js API provides several ways of scheduling code to execute at
some point after the present moment. The functions below may seem familiar,
since they are available in most browsers, but Node.js actually provides
its own implementation of these methods. Timers integrate very closely
with the system, and despite the fact that the API mirrors the browser
API, there are some differences in implementation.
### "When I say so" Execution ~ *`setTimeout()`*
`setTimeout()` can be used to schedule code execution after a designated
number of milliseconds. This function is similar to
[`window.setTimeout()`](https://developer.mozilla.org/en-US/docs/Web/API/WindowTimers/setTimeout)
from the browser JavaScript API; however, a string of code cannot be passed
to be executed.
`setTimeout()` accepts the function to execute as its first argument and the
delay in milliseconds, defined as a number, as its second argument. Additional
arguments may also be included, and these will be passed on to the function. Here
is an example:
```js
function myFunc(arg) {
  console.log('arg was => ' + arg);
}

setTimeout(myFunc, 1500, 'funky');
```
The above function `myFunc()` will execute as close to 1500
milliseconds (or 1.5 seconds) as possible due to the call of `setTimeout()`.
The timeout interval that is set cannot be relied upon to execute after
that *exact* number of milliseconds. This is because other executing code that
blocks or holds onto the event loop will push the execution of the timeout
back. The *only* guarantee is that the timeout will not execute *sooner* than
the declared timeout interval.
`setTimeout()` returns a `Timeout` object that can be used to reference the
timeout that was set. This returned object can be used to cancel the timeout (
see `clearTimeout()` below) as well as change the execution behavior (see
`unref()` below).
### "Right after this" Execution ~ *`setImmediate()`*
`setImmediate()` will execute code at the end of the current event loop cycle.
This code will execute *after* any I/O operations in the current event loop and
*before* any timers scheduled for the next event loop. This code execution
could be thought of as happening "right after this", meaning any code following
the `setImmediate()` function call will execute before the `setImmediate()`
function argument.
The first argument to `setImmediate()` will be the function to execute. Any
subsequent arguments will be passed to the function when it is executed.
Here's an example:
```js
console.log('before immediate');

setImmediate((arg) => {
  console.log(`executing immediate: ${arg}`);
}, 'so immediate');

console.log('after immediate');
```
The above function passed to `setImmediate()` will execute after all runnable
code has executed, and the console output will be:
```console
before immediate
after immediate
executing immediate: so immediate
```
`setImmediate()` returns an `Immediate` object, which can be used to cancel
the scheduled immediate (see `clearImmediate()` below).
Note: Don't get `setImmediate()` confused with `process.nextTick()`. There are
some major ways they differ. The first is that `process.nextTick()` will run
*before* any `Immediate`s that are set as well as before any scheduled I/O.
The second is that `process.nextTick()` is non-clearable, meaning once
code has been scheduled to execute with `process.nextTick()`, the execution
cannot be stopped, just like with a normal function. Refer to [this guide](../topics/event-loop-timers-and-nexttick.md#processnexttick)
to better understand the operation of `process.nextTick()`.
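As a minimal sketch of that ordering difference (not part of the original guide), note that `process.nextTick()` callbacks drain after the current operation completes and before any scheduled `Immediate`s run:

```js
const order = [];

setImmediate(() => order.push('immediate'));
process.nextTick(() => order.push('nextTick'));
order.push('sync');

// After the synchronous code finishes, the nextTick queue drains
// before the check (immediate) phase, so the final order is:
// 'sync', 'nextTick', 'immediate'
setImmediate(() => console.log(order.join(' -> ')));
```

Also note there is no `clearNextTick()`: once queued, the `nextTick` callback above cannot be cancelled, while the `Immediate` could be.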
### "Infinite Loop" Execution ~ *`setInterval()`*
If there is a block of code that should execute multiple times, `setInterval()`
can be used. `setInterval()` takes a function as its first argument that will
run an infinite number of times, with a given millisecond delay as the second
argument. Just like `setTimeout()`, additional arguments can be added beyond the
delay, and these will be passed on to the function call. Also like
`setTimeout()`, the delay cannot be guaranteed because of operations that may
hold on to the event loop, so it should be treated as an approximate delay. See
the example below:
```js
function intervalFunc() {
  console.log('Cant stop me now!');
}

setInterval(intervalFunc, 1500);
```
In the above example, `intervalFunc()` will execute about every 1500
milliseconds, or 1.5 seconds, until it is stopped (see below).
Just like `setTimeout()`, `setInterval()` also returns a `Timeout` object which
can be used to reference and modify the interval that was set.
## Clearing the Future
What can be done if a `Timeout` or `Immediate` object needs to be cancelled?
`setTimeout()`, `setImmediate()`, and `setInterval()` return a timer object
that can be used to reference the set `Timeout` or `Immediate` object.
By passing said object into the respective `clear` function, execution of
that object will be halted completely. The respective functions are
`clearTimeout()`, `clearImmediate()`, and `clearInterval()`. See below for an
example of each:
```js
const timeoutObj = setTimeout(() => {
  console.log('timeout beyond time');
}, 1500);

const immediateObj = setImmediate(() => {
  console.log('immediately executing immediate');
});

const intervalObj = setInterval(() => {
  console.log('interviewing the interval');
}, 500);

clearTimeout(timeoutObj);
clearImmediate(immediateObj);
clearInterval(intervalObj);
```
## Leaving Timeouts Behind
Remember that `Timeout` objects are returned by `setTimeout` and `setInterval`.
The `Timeout` object provides two functions, `unref()` and `ref()`, intended to
augment `Timeout` behavior. If there is a `Timeout` object scheduled using a
`set` function, `unref()` can be called on that object. This changes the
behavior slightly: the `Timeout` object will not keep the process alive, so
*if it is the last code left to execute*, its callback will never be called.
In similar fashion, a `Timeout` object that has had `unref()` called on it
can remove that behavior by calling `ref()` on that same `Timeout` object,
which will then ensure its execution. Be aware, however, that this does
not *exactly* restore the initial behavior for performance reasons. See
below for examples of both:
```js
const timerObj = setTimeout(() => {
  console.log('will i run?');
});

// if left alone, this statement will keep the above
// timeout from running, since the timeout will be the only
// thing keeping the program from exiting
timerObj.unref();

// we can bring it back to life by calling ref() inside
// an immediate
setImmediate(() => {
  timerObj.ref();
});
```
## Further Down the Event Loop
There's much more to the Event Loop and Timers than this guide
has covered. To learn more about the internals of the Node.js
Event Loop and how Timers operate during execution, check out
this Node.js guide: [The Node.js Event Loop, Timers, and
process.nextTick()](../topics/event-loop-timers-and-nexttick.md).

doc/topics/blocking-vs-non-blocking.md (143 deletions)

@@ -1,143 +0,0 @@
# Overview of Blocking vs Non-Blocking
This overview covers the difference between **blocking** and **non-blocking**
calls in Node.js. This overview will refer to the event loop and libuv but no
prior knowledge of those topics is required. Readers are assumed to have a
basic understanding of the JavaScript language and Node.js callback pattern.
> "I/O" refers primarily to interaction with the system's disk and
> network supported by [libuv](http://libuv.org/).
## Blocking
**Blocking** is when the execution of additional JavaScript in the Node.js
process must wait until a non-JavaScript operation completes. This happens
because the event loop is unable to continue running JavaScript while a
**blocking** operation is occurring.
In Node.js, JavaScript that exhibits poor performance due to being CPU intensive
rather than waiting on a non-JavaScript operation, such as I/O, isn't typically
referred to as **blocking**. Synchronous methods in the Node.js standard library
that use libuv are the most commonly used **blocking** operations. Native
modules may also have **blocking** methods.
All of the I/O methods in the Node.js standard library provide asynchronous
versions, which are **non-blocking**, and accept callback functions. Some
methods also have **blocking** counterparts, which have names that end with
`Sync`.
## Comparing Code
**Blocking** methods execute **synchronously** and **non-blocking** methods
execute **asynchronously**.
Using the File System module as an example, this is a **synchronous** file read:
```js
const fs = require('fs');
const data = fs.readFileSync('/file.md'); // blocks here until file is read
```
And here is an equivalent **asynchronous** example:
```js
const fs = require('fs');

fs.readFile('/file.md', (err, data) => {
  if (err) throw err;
});
```
The first example appears simpler than the second but has the disadvantage of
the second line **blocking** the execution of any additional JavaScript until
the entire file is read. Note that in the synchronous version if an error is
thrown it will need to be caught or the process will crash. In the asynchronous
version, it is up to the author to decide whether an error should throw as
shown.
Let's expand our example a little bit:
```js
const fs = require('fs');
const data = fs.readFileSync('/file.md'); // blocks here until file is read
console.log(data);
// moreWork(); will run after console.log
```
And here is a similar, but not equivalent asynchronous example:
```js
const fs = require('fs');

fs.readFile('/file.md', (err, data) => {
  if (err) throw err;
  console.log(data);
});

// moreWork(); will run before console.log
```
In the first example above, `console.log` will be called before `moreWork()`. In
the second example `fs.readFile()` is **non-blocking** so JavaScript execution
can continue and `moreWork()` will be called first. The ability to run
`moreWork()` without waiting for the file read to complete is a key design
choice that allows for higher throughput.
## Concurrency and Throughput
JavaScript execution in Node.js is single threaded, so concurrency refers to the
event loop's capacity to execute JavaScript callback functions after completing
other work. Any code that is expected to run in a concurrent manner must allow
the event loop to continue running as non-JavaScript operations, like I/O, are
occurring.
As an example, let's consider a case where each request to a web server takes
50ms to complete and 45ms of that 50ms is database I/O that can be done
asynchronously. Choosing **non-blocking** asynchronous operations frees up that
45ms per request to handle other requests. This is a significant difference in
capacity just by choosing to use **non-blocking** methods instead of
**blocking** methods.
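A rough back-of-the-envelope version of that arithmetic, assuming a single thread and the 50 ms / 45 ms split above:

```js
const totalMs = 50; // time to fully handle one request
const ioMs = 45;    // portion spent waiting on database I/O

// Blocking: the thread is tied up for the full 50 ms per request.
const blockingReqPerSec = 1000 / totalMs;             // 20

// Non-blocking: only the 5 ms of CPU work occupies the thread;
// the 45 ms of I/O overlaps with other requests.
const nonBlockingReqPerSec = 1000 / (totalMs - ioMs); // 200

console.log(blockingReqPerSec, nonBlockingReqPerSec);
```

This is an idealized upper bound, of course; it ignores scheduling and the cost of the I/O completions themselves, but it illustrates why freeing the thread during I/O matters.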
The event loop is different from models in many other languages, where
additional threads may be created to handle concurrent work.
## Dangers of Mixing Blocking and Non-Blocking Code
There are some patterns that should be avoided when dealing with I/O. Let's look
at an example:
```js
const fs = require('fs');

fs.readFile('/file.md', (err, data) => {
  if (err) throw err;
  console.log(data);
});

fs.unlinkSync('/file.md');
```
In the above example, `fs.unlinkSync()` is likely to be run before
`fs.readFile()`, which would delete `file.md` before it is actually read. A
better way to write this that is completely **non-blocking** and guaranteed to
execute in the correct order is:
```js
const fs = require('fs');

fs.readFile('/file.md', (err, data) => {
  if (err) throw err;
  console.log(data);
  fs.unlink('/file.md', (err) => {
    if (err) throw err;
  });
});
```
The above places a **non-blocking** call to `fs.unlink()` within the callback of
`fs.readFile()` which guarantees the correct order of operations.
## Additional Resources
- [libuv](http://libuv.org/)
- [About Node.js](https://nodejs.org/en/about/)

doc/topics/domain-postmortem.md (301 deletions)

@@ -1,301 +0,0 @@
# Domain Module Postmortem
## Usability Issues
### Implicit Behavior
It's possible for a developer to create a new domain and then simply run
`domain.enter()`, which then acts as a catch-all for any future exception
that could not be observed by the thrower. This allows a module author to
intercept the exceptions of unrelated code in a different module, preventing
the originator of the code from knowing about its own exceptions.
Here's an example of how one indirectly linked module can affect another:
```js
// module a.js
const b = require('./b');
const c = require('./c');
// module b.js
const d = require('domain').create();
d.on('error', () => { /* silence everything */ });
d.enter();
// module c.js
const dep = require('some-dep');
dep.method(); // Uh-oh! This method doesn't actually exist.
```
Since module `b` enters the domain but never exits, any uncaught exception
will be swallowed, leaving module `c` in the dark as to why it didn't run the
entire script, and potentially leaving a partially populated `module.exports`.
Doing this is not the same as listening for `'uncaughtException'`, as the
latter is explicitly meant to globally catch errors. The other issue is that
domains are processed prior to any `'uncaughtException'` handlers, and prevent
those handlers from running.
Another issue is that domains route errors automatically if no `'error'`
handler was set on the event emitter. There is no opt-in mechanism for this,
and it automatically propagates across the entire asynchronous chain. This may
seem useful at first, but once asynchronous calls are two or more modules deep
and one of them doesn't include an error handler, the creator of the domain
will suddenly be catching unexpected exceptions, and the thrower's exception
will go unnoticed by the author.
The following is a simple example of how a missing `'error'` handler allows
the active domain to hijack the error:
```js
const domain = require('domain');
const net = require('net');
const d = domain.create();
d.on('error', (err) => console.error(err.message));

d.run(() => net.createServer((c) => {
  c.end();
  c.write('bye');
}).listen(8000));
```
Even manually removing the connection via `d.remove(c)` does not prevent the
connection's error from being automatically intercepted.
A failure that plagues both error routing and exception handling is the
inconsistency in how errors are bubbled. The following is an example of how
nested domains will and won't bubble the exception based on when they happen:
```js
const domain = require('domain');
const net = require('net');
const d = domain.create();
d.on('error', () => console.error('d intercepted an error'));

d.run(() => {
  const server = net.createServer((c) => {
    const e = domain.create(); // No 'error' handler being set.
    e.run(() => {
      // This will not be caught by d's error handler.
      setImmediate(() => {
        throw new Error('thrown from setImmediate');
      });
      // Though this one will bubble to d's error handler.
      throw new Error('immediately thrown');
    });
  }).listen(8080);
});
```
It may be expected that nested domains always remain nested, and will always
propagate the exception up the domain stack. Or that exceptions will never
automatically bubble. Unfortunately both these situations occur, leading to
potentially confusing behavior that may even be prone to difficult to debug
timing conflicts.
### API Gaps
While APIs based on using `EventEmitter` can use `bind()` and errback-style
callbacks can use `intercept()`, alternative APIs that implicitly bind to the
active domain must be executed inside of `run()`. This means that if module
authors wanted to support domains using a mechanism other than those
mentioned, they would have to implement domain support manually themselves,
instead of being able to leverage the implicit mechanisms already in place.
### Error Propagation
Propagating errors across nested domains is not straightforward, if even
possible. Existing documentation shows a simple example of how to `close()` an
`http` server if there is an error in the request handler. What it does not
explain is how to close the server if the request handler creates another
domain instance for another async request. Using the following as a simple
example of the failure of error propagation:
```js
const domain = require('domain');

const d1 = domain.create();
d1.foo = true;  // custom member to make more visible in console
d1.on('error', (er) => { /* handle error */ });

d1.run(() => setTimeout(() => {
  const d2 = domain.create();
  d2.bar = 43;
  d2.on('error', (er) => console.error(er.message, domain._stack));
  d2.run(() => {
    setTimeout(() => {
      setTimeout(() => {
        throw new Error('outer');
      });
      throw new Error('inner');
    });
  });
}));
```
Even in the case that the domain instances are being used for local storage so
that access to resources is made available, there is still no way to allow the
error to continue propagating from `d2` back to `d1`. Quick inspection may
suggest that simply throwing from `d2`'s `'error'` handler would allow `d1` to
then catch the exception and execute its own error handler. Though that is not
the case. Upon inspection of `domain._stack` you'll see that the stack only
contains `d2`.
This may be considered a failing of the API, but even if it did operate in this
way there is still the issue of transmitting the fact that a branch in the
asynchronous execution has failed, and that all further operations in that
branch must cease. In the example of the http request handler, if we fire off
several asynchronous requests and each one then `write()`s data back to the
client, many more errors will arise from attempting to `write()` to a closed
handle. More on this in _Resource Cleanup on Exception_.
### Resource Cleanup on Exception
The script [`domain-resource-cleanup-example.js`][]
contains a more complex example of properly cleaning up in a small resource
dependency tree in the case that an exception occurs in a given connection or
any of its dependencies. Breaking down the script into its basic operations:
- When a new connection happens, concurrently:
- Open a file on the file system
- Open Pipe to unique socket
- Read a chunk of the file asynchronously
- Write chunk to both the TCP connection and any listening sockets
- If any of these resources error, notify all other attached resources that
they need to clean up and shutdown
As we can see from this example a lot more must be done to properly clean up
resources when something fails than what can be done strictly through the
domain API. All that domains offer is an exception aggregation mechanism. Even
the potentially useful ability to propagate data with the domain is easily
countered, in this example, by passing the needed resources as a function
argument.
One problem domains perpetuated was the supposed simplicity of being able to
continue execution of the application despite an unexpected exception,
contrary to what the documentation stated. This example demonstrates the
fallacy behind that idea.
Attempting proper resource cleanup on unexpected exception becomes more complex
as the application itself grows in complexity. This example only has 3 basic
resources in play, all of them with a clear dependency path. If an application
uses something like shared resources or resource reuse, the difficulty of
cleaning up, and of properly testing that cleanup has been done, grows greatly.
In the end, in terms of handling errors, domains aren't much more than a
glorified `'uncaughtException'` handler. Except with more implicit and
unobservable behavior by third-parties.
### Resource Propagation
Another use case for domains was to use them to propagate data along
asynchronous data paths. One problematic point is the ambiguity of when to
expect the correct domain when there are multiple in the stack (which must be
assumed if the async stack works with other modules). There is also a conflict
between depending on a domain for error handling and having it available to
retrieve the necessary data.
The following is an involved example demonstrating the failings of using
domains to propagate data along asynchronous stacks:
```js
const domain = require('domain');
const net = require('net');

const server = net.createServer((c) => {
  // Use a domain to propagate data across events within the
  // connection so that we don't have to pass arguments
  // everywhere.
  const d = domain.create();
  d.data = { connection: c };
  d.add(c);
  // Mock class that does some useless async data transformation
  // for demonstration purposes.
  const ds = new DataStream(dataTransformed);
  c.on('data', (chunk) => ds.data(chunk));
}).listen(8080, () => console.log('listening on 8080'));

function dataTransformed(chunk) {
  // FAIL! Because the DataStream instance also created a
  // domain we have now lost the active domain we had
  // hoped to use.
  domain.active.data.connection.write(chunk);
}

function DataStream(cb) {
  this.cb = cb;
  // DataStream wants to use domains for data propagation too!
  // Unfortunately this will conflict with any domain that
  // already exists.
  this.domain = domain.create();
  this.domain.data = { inst: this };
}

DataStream.prototype.data = function data(chunk) {
  // This code is self contained, but pretend it's a complex
  // operation that crosses at least one other module. So
  // passing along "this", etc., is not easy.
  this.domain.run(function() {
    // Simulate an async operation that does the data transform.
    setImmediate(() => {
      for (var i = 0; i < chunk.length; i++)
        chunk[i] = ((chunk[i] + Math.random() * 100) % 96) + 33;
      // Grab the instance from the active domain and use that
      // to call the user's callback.
      const self = domain.active.data.inst;
      self.cb.call(self, chunk);
    });
  });
};
```
The above shows that it is difficult to have more than one asynchronous API
attempt to use domains to propagate data. This example could possibly be fixed
by assigning `parent: domain.active` in the `DataStream` constructor. Then
restoring it via `domain.active = domain.active.data.parent` just before the
user's callback is called. Also the instantiation of `DataStream` in the
`'connection'` callback must be run inside `d.run()`, instead of simply using
`d.add(c)`, otherwise there will be no active domain.
In short, for this to have a prayer of a chance, usage would need to strictly
adhere to a set of guidelines that would be difficult to enforce or test.
## Performance Issues
A significant deterrent from using domains is the overhead. Using node's
built-in http benchmark, `http_simple.js`, without domains it can handle over
22,000 requests/second. Whereas if it's run with `NODE_USE_DOMAINS=1` that
number drops down to under 17,000 requests/second. In this case there is only
a single global domain. If we edit the benchmark so the http request callback
creates a new domain instance performance drops further to 15,000
requests/second.
While this probably wouldn't affect a server only serving a few hundred or even
a thousand requests per second, the amount of overhead is directly proportional
to the number of asynchronous requests made. So if a single connection needs to
connect to several other services all of those will contribute to the overall
latency of delivering the final product to the client.
Using `AsyncWrap` and tracking the number of times
`init`/`pre`/`post`/`destroy` are called in the mentioned benchmark we find
that the sum of all events called is over 170,000 times per second. This means
even adding 1 microsecond overhead per call for any type of setup or tear down
will result in a 17% performance loss. Granted, this is for the optimized
scenario of the benchmark, but I believe this demonstrates the necessity for a
mechanism such as domain to be as cheap to run as possible.
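The 17% figure follows directly from that event rate; as a quick sanity check of the arithmetic:

```js
const hookCallsPerSecond = 170000; // init/pre/post/destroy events observed
const overheadPerCallSec = 1e-6;   // the assumed 1 microsecond per call

// Fraction of each wall-clock second spent purely in hook overhead.
const lostFraction = hookCallsPerSecond * overheadPerCallSec; // ~0.17

console.log((lostFraction * 100).toFixed(0) + '% overhead');
```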
## Looking Ahead
The domain module has been soft deprecated since Dec 2014, but has not yet been
removed because node offers no alternative functionality at the moment. As of
this writing there is ongoing work building out the `AsyncWrap` API and a
proposal for Zones being prepared for the TC39. At such time there is suitable
functionality to replace domains it will undergo the full deprecation cycle and
eventually be removed from core.
[`domain-resource-cleanup-example.js`]: ./domain-resource-cleanup-example.js

doc/topics/domain-resource-cleanup-example.js (136 deletions)

@@ -1,136 +0,0 @@
'use strict';

const domain = require('domain');
const EE = require('events');
const fs = require('fs');
const net = require('net');
const util = require('util');
const print = process._rawDebug;

const pipeList = [];
const FILENAME = '/tmp/tmp.tmp';
const PIPENAME = '/tmp/node-domain-example-';
const FILESIZE = 1024;
var uid = 0;

// Setting up temporary resources
const buf = Buffer(FILESIZE);
for (var i = 0; i < buf.length; i++)
  buf[i] = ((Math.random() * 1e3) % 78) + 48; // Basic ASCII
fs.writeFileSync(FILENAME, buf);

function ConnectionResource(c) {
  EE.call(this);
  this._connection = c;
  this._alive = true;
  this._domain = domain.create();
  this._id = Math.random().toString(32).substr(2).substr(0, 8) + (++uid);

  this._domain.add(c);
  this._domain.on('error', () => {
    this._alive = false;
  });
}
util.inherits(ConnectionResource, EE);

ConnectionResource.prototype.end = function end(chunk) {
  this._alive = false;
  this._connection.end(chunk);
  this.emit('end');
};

ConnectionResource.prototype.isAlive = function isAlive() {
  return this._alive;
};

ConnectionResource.prototype.id = function id() {
  return this._id;
};

ConnectionResource.prototype.write = function write(chunk) {
  this.emit('data', chunk);
  return this._connection.write(chunk);
};

// Example begin
net.createServer((c) => {
  const cr = new ConnectionResource(c);

  const d1 = domain.create();
  fs.open(FILENAME, 'r', d1.intercept((fd) => {
    streamInParts(fd, cr, 0);
  }));

  pipeData(cr);

  c.on('close', () => cr.end());
}).listen(8080);

function streamInParts(fd, cr, pos) {
  const d2 = domain.create();
  var alive = true;
  d2.on('error', (er) => {
    print('d2 error:', er.message);
    cr.end();
  });
  fs.read(fd, new Buffer(10), 0, 10, pos, d2.intercept((bRead, buf) => {
    if (!cr.isAlive()) {
      return fs.close(fd);
    }
    if (cr._connection.bytesWritten < FILESIZE) {
      // Documentation says callback is optional, but doesn't mention that if
      // the write fails an exception will be thrown.
      const goodtogo = cr.write(buf);
      if (goodtogo) {
        setTimeout(() => streamInParts(fd, cr, pos + bRead), 1000);
      } else {
        cr._connection.once('drain', () => streamInParts(fd, cr, pos + bRead));
      }
      return;
    }
    cr.end(buf);
    fs.close(fd);
  }));
}

function pipeData(cr) {
  const pname = PIPENAME + cr.id();
  const ps = net.createServer();
  const d3 = domain.create();
  const connectionList = [];
  d3.on('error', (er) => {
    print('d3 error:', er.message);
    cr.end();
  });
  d3.add(ps);
  ps.on('connection', (conn) => {
    connectionList.push(conn);
    conn.on('data', () => {});  // don't care about incoming data.
    conn.on('close', () => {
      connectionList.splice(connectionList.indexOf(conn), 1);
    });
  });
  cr.on('data', (chunk) => {
    for (var i = 0; i < connectionList.length; i++) {
      connectionList[i].write(chunk);
    }
  });
  cr.on('end', () => {
    for (var i = 0; i < connectionList.length; i++) {
      connectionList[i].end();
    }
    ps.close();
  });
  pipeList.push(pname);
  ps.listen(pname);
}

process.on('SIGINT', () => process.exit());
process.on('exit', () => {
  try {
    for (var i = 0; i < pipeList.length; i++) {
      fs.unlinkSync(pipeList[i]);
    }
    fs.unlinkSync(FILENAME);
  } catch (e) { }
});

doc/topics/event-loop-timers-and-nexttick.md (486 deletions)

@@ -1,486 +0,0 @@
# The Node.js Event Loop, Timers, and `process.nextTick()`
## What is the Event Loop?
The event loop is what allows Node.js to perform non-blocking I/O
operations — despite the fact that JavaScript is single-threaded — by
offloading operations to the system kernel whenever possible.
Since most modern kernels are multi-threaded, they can handle multiple
operations executing in the background. When one of these operations
completes, the kernel tells Node.js so that the appropriate callback
may be added to the **poll** queue to eventually be executed. We'll explain
this in further detail later in this topic.
## Event Loop Explained
When Node.js starts, it initializes the event loop, processes the
provided input script (or drops into the [REPL][], which is not covered in
this document) which may make async API calls, schedule timers, or call
`process.nextTick()`, then begins processing the event loop.
The following diagram shows a simplified overview of the event loop's
order of operations.
```txt
   ┌───────────────────────┐
┌─>│        timers         │
│  └──────────┬────────────┘
│  ┌──────────┴────────────┐
│  │     I/O callbacks     │
│  └──────────┬────────────┘
│  ┌──────────┴────────────┐
│  │     idle, prepare     │
│  └──────────┬────────────┘      ┌───────────────┐
│  ┌──────────┴────────────┐      │   incoming:   │
│  │         poll          │<─────┤  connections, │
│  └──────────┬────────────┘      │   data, etc.  │
│  ┌──────────┴────────────┐      └───────────────┘
│  │         check         │
│  └──────────┬────────────┘
│  ┌──────────┴────────────┐
└──┤    close callbacks    │
   └───────────────────────┘
```
*note: each box will be referred to as a "phase" of the event loop.*
Each phase has a FIFO queue of callbacks to execute. While each phase is
special in its own way, generally, when the event loop enters a given
phase, it will perform any operations specific to that phase, then
execute callbacks in that phase's queue until the queue has been
exhausted or the maximum number of callbacks has executed. When the
queue has been exhausted or the callback limit is reached, the event
loop will move to the next phase, and so on.
Since any of these operations may schedule _more_ operations and new
events processed in the **poll** phase are queued by the kernel, poll
events can be queued while polling events are being processed. As a
result, long running callbacks can allow the poll phase to run much
longer than a timer's threshold. See the [**timers**](#timers) and
[**poll**](#poll) sections for more details.
_**NOTE:** There is a slight discrepancy between the Windows and the
Unix/Linux implementation, but that's not important for this
demonstration. The most important parts are here. There are actually
seven or eight steps, but the ones we care about (the ones Node.js
actually uses) are those above._
## Phases Overview
* **timers**: this phase executes callbacks scheduled by `setTimeout()`
and `setInterval()`.
* **I/O callbacks**: executes almost all callbacks with the exception of
close callbacks, the ones scheduled by timers, and `setImmediate()`.
* **idle, prepare**: only used internally.
* **poll**: retrieve new I/O events; node will block here when appropriate.
* **check**: `setImmediate()` callbacks are invoked here.
* **close callbacks**: e.g. `socket.on('close', ...)`.
Between each run of the event loop, Node.js checks if it is waiting for
any asynchronous I/O or timers and shuts down cleanly if there are not
any.
## Phases in Detail
### timers
A timer specifies the **threshold** _after which_ a provided callback
_may be executed_ rather than the **exact** time a person _wants it to
be executed_. Timer callbacks will run as early as they can be
scheduled after the specified amount of time has passed; however,
operating system scheduling or the running of other callbacks may delay
them.
_**Note**: Technically, the [**poll** phase](#poll) controls when timers
are executed._
For example, say you schedule a timeout to execute after a 100 ms
threshold, then your script starts asynchronously reading a file which
takes 95 ms:
```js
var fs = require('fs');
function someAsyncOperation (callback) {
// Assume this takes 95ms to complete
fs.readFile('/path/to/file', callback);
}
var timeoutScheduled = Date.now();
setTimeout(function () {
var delay = Date.now() - timeoutScheduled;
  console.log(delay + 'ms have passed since I was scheduled');
}, 100);
// do someAsyncOperation which takes 95 ms to complete
someAsyncOperation(function () {
var startCallback = Date.now();
// do something that will take 10ms...
while (Date.now() - startCallback < 10) {
; // do nothing
}
});
```
When the event loop enters the **poll** phase, it has an empty queue
(`fs.readFile()` has not completed), so it will wait for the number of ms
remaining until the soonest timer's threshold is reached. While it is
waiting, 95 ms pass, `fs.readFile()` finishes reading the file, and its
callback, which takes 10 ms to complete, is added to the **poll** queue
and executed. When the callback finishes, there are no more callbacks in
the queue, so the event loop will see that the threshold of the soonest
timer has been reached, then wrap back to the **timers** phase to execute
the timer's callback. In this example, you will see that the total delay
between the timer being scheduled and its callback being executed will
be 105 ms.
Note: To prevent the **poll** phase from starving the event loop,
[libuv](http://libuv.org/) (the C library that implements the Node.js
event loop and all of the asynchronous behaviors of the platform)
also has a hard maximum (system dependent) before it stops polling for
more events.
### I/O callbacks
This phase executes callbacks for some system operations, such as types
of TCP errors. For example, if a TCP socket receives `ECONNREFUSED` when
attempting to connect, some \*nix systems want to wait to report the
error. Such callbacks are queued to execute in the **I/O callbacks** phase.
### poll
The **poll** phase has two main functions:
1. Executing scripts for timers whose threshold has elapsed, then
2. Processing events in the **poll** queue.
When the event loop enters the **poll** phase _and there are no timers
scheduled_, one of two things will happen:
* _If the **poll** queue **is not empty**_, the event loop will iterate
through its queue of callbacks executing them synchronously until
either the queue has been exhausted, or the system-dependent hard limit
is reached.
* _If the **poll** queue **is empty**_, one of two more things will
happen:
* If scripts have been scheduled by `setImmediate()`, the event loop
will end the **poll** phase and continue to the **check** phase to
execute those scheduled scripts.
* If scripts **have not** been scheduled by `setImmediate()`, the
event loop will wait for callbacks to be added to the queue, then
execute them immediately.
Once the **poll** queue is empty the event loop will check for timers
_whose time thresholds have been reached_. If one or more timers are
ready, the event loop will wrap back to the **timers** phase to execute
those timers' callbacks.
### check
This phase allows a person to execute callbacks immediately after the
**poll** phase has completed. If the **poll** phase becomes idle and
scripts have been queued with `setImmediate()`, the event loop may
continue to the **check** phase rather than waiting.
`setImmediate()` is actually a special timer that runs in a separate
phase of the event loop. It uses a libuv API that schedules callbacks to
execute after the **poll** phase has completed.
Generally, as the code is executed, the event loop will eventually hit
the **poll** phase where it will wait for an incoming connection, request,
etc. However, if a callback has been scheduled with `setImmediate()`
and the **poll** phase becomes idle, it will end and continue to the
**check** phase rather than waiting for **poll** events.
### close callbacks
If a socket or handle is closed abruptly (e.g. `socket.destroy()`), the
`'close'` event will be emitted in this phase. Otherwise it will be
emitted via `process.nextTick()`.
## `setImmediate()` vs `setTimeout()`
`setImmediate()` and `setTimeout()` are similar, but behave in different
ways depending on when they are called.
* `setImmediate()` is designed to execute a script once the current
**poll** phase completes.
* `setTimeout()` schedules a script to be run after a minimum threshold
in ms has elapsed.
The order in which the timers are executed will vary depending on the
context in which they are called. If both are called from within the
main module, then timing will be bound by the performance of the process
(which can be impacted by other applications running on the machine).
For example, if we run the following script which is not within an I/O
cycle (i.e. the main module), the order in which the two timers are
executed is non-deterministic, as it is bound by the performance of the
process:
```js
// timeout_vs_immediate.js
setTimeout(function timeout () {
console.log('timeout');
}, 0);
setImmediate(function immediate () {
console.log('immediate');
});
```
```console
$ node timeout_vs_immediate.js
timeout
immediate
$ node timeout_vs_immediate.js
immediate
timeout
```
However, if you move the two calls within an I/O cycle, the immediate
callback is always executed first:
```js
// timeout_vs_immediate.js
var fs = require('fs');
fs.readFile(__filename, () => {
  setTimeout(() => {
    console.log('timeout');
  }, 0);
  setImmediate(() => {
    console.log('immediate');
  });
});
```
```console
$ node timeout_vs_immediate.js
immediate
timeout
$ node timeout_vs_immediate.js
immediate
timeout
```
The main advantage of using `setImmediate()` over `setTimeout()` is that
`setImmediate()` will always be executed before any timers if scheduled
within an I/O cycle, independently of how many timers are present.
## `process.nextTick()`
### Understanding `process.nextTick()`
You may have noticed that `process.nextTick()` was not displayed in the
diagram, even though it's a part of the asynchronous API. This is because
`process.nextTick()` is not technically part of the event loop. Instead,
the `nextTickQueue` will be processed after the current operation
completes, regardless of the current phase of the event loop.
Looking back at our diagram, any time you call `process.nextTick()` in a
given phase, all callbacks passed to `process.nextTick()` will be
resolved before the event loop continues. This can create some bad
situations because **it allows you to "starve" your I/O by making
recursive `process.nextTick()` calls**, which prevents the event loop
from reaching the **poll** phase.
### Why would that be allowed?
Why would something like this be included in Node.js? Part of it is a
design philosophy where an API should always be asynchronous even where
it doesn't have to be. Take this code snippet for example:
```js
function apiCall (arg, callback) {
if (typeof arg !== 'string')
return process.nextTick(callback,
new TypeError('argument should be string'));
}
```
The snippet does an argument check, and if the argument is not a string,
it passes an error to the callback. The API was updated fairly recently
to allow passing arguments to `process.nextTick()`: any arguments after
the callback are propagated as the arguments to the callback, so you
don't have to nest functions.
What we're doing is passing an error back to the user but only *after*
we have allowed the rest of the user's code to execute. By using
`process.nextTick()` we guarantee that `apiCall()` always runs its
callback *after* the rest of the user's code and *before* the event loop
is allowed to proceed. To achieve this, the JS call stack is allowed to
unwind, then the provided callback is executed immediately, which allows
a person to make recursive calls to `process.nextTick()` without hitting
a `RangeError: Maximum call stack size exceeded` from V8.
This philosophy can lead to some potentially problematic situations.
Take this snippet for example:
```js
// this has an asynchronous signature, but calls callback synchronously
function someAsyncApiCall (callback) { callback(); }
// the callback is called before `someAsyncApiCall` completes.
someAsyncApiCall(() => {
  // since someAsyncApiCall hasn't completed, bar hasn't been assigned any value
console.log('bar', bar); // undefined
});
var bar = 1;
```
The user defines `someAsyncApiCall()` to have an asynchronous signature,
but it actually operates synchronously. When it is called, the callback
provided to `someAsyncApiCall()` is called in the same phase of the
event loop because `someAsyncApiCall()` doesn't actually do anything
asynchronously. As a result, the callback tries to reference `bar` before
the variable has been assigned a value, because the script has not been
able to run to completion.
By placing the callback in a `process.nextTick()`, the script still has the
ability to run to completion, allowing all the variables, functions,
etc., to be initialized prior to the callback being called. It also has
the advantage of not allowing the event loop to continue before the
callback runs, which can be useful when the user needs to be alerted to
an error before the event loop proceeds. Here is the previous example
using `process.nextTick()`:
```js
function someAsyncApiCall (callback) {
  process.nextTick(callback);
}
someAsyncApiCall(() => {
console.log('bar', bar); // 1
});
var bar = 1;
```
Here's another real-world example:
```js
const net = require('net');

const server = net.createServer(() => {}).listen(8080);
server.on('listening', () => {});
```
When only a port is passed, the port is bound immediately, so the
`'listening'` callback could be called right away. The problem is that
the `.on('listening')` handler will not have been set by that time.
To get around this, the `'listening'` event is queued with
`process.nextTick()` to allow the script to run to completion, which
lets the user set any event handlers they want.
## `process.nextTick()` vs `setImmediate()`
We have two calls that are similar as far as users are concerned, but
their names are confusing.
* `process.nextTick()` fires immediately on the same phase
* `setImmediate()` fires on the following iteration or 'tick' of the
event loop
In essence, the names should be swapped. `process.nextTick()` fires more
immediately than `setImmediate()` but this is an artifact of the past
which is unlikely to change. Making this switch would break a large
percentage of the packages on npm. Every day more new modules are being
added, which means that every day we wait, more potential breakages occur.
While they are confusing, the names themselves won't change.
*We recommend developers use `setImmediate()` in all cases because it's
easier to reason about (and it leads to code that's compatible with a
wider variety of environments, like browser JS.)*
## Why use `process.nextTick()`?
There are two main reasons:
1. Allow users to handle errors, clean up any then-unneeded resources, or
perhaps retry the request before the event loop continues.
2. At times it's necessary to allow a callback to run after the call
stack has unwound but before the event loop continues.
One example is matching the user's expectations. A simple example:
```js
var net = require('net');

var server = net.createServer();
server.on('connection', function(conn) { });
server.listen(8080);
server.on('listening', function() { });
```
Say that `listen()` is run at the beginning of the event loop, but the
listening callback is placed in a `setImmediate()`. Unless a hostname is
passed, binding to the port will happen immediately. For the event loop
to proceed, it must hit the **poll** phase, which means there is a
non-zero chance that a connection could have been received, allowing the
connection event to be fired before the listening event.
Another example is a function constructor that, say, inherits from
`EventEmitter` and wants to emit an event from within the constructor:
```js
const EventEmitter = require('events');
const util = require('util');
function MyEmitter() {
EventEmitter.call(this);
this.emit('event');
}
util.inherits(MyEmitter, EventEmitter);
const myEmitter = new MyEmitter();
myEmitter.on('event', function() {
console.log('an event occurred!');
});
```
You can't emit an event from the constructor immediately
because the script will not have run to the point where the user
assigns a callback to that event. So, within the constructor itself,
you can use `process.nextTick()` to set a callback to emit the event
after the constructor has finished, which provides the expected results:
```js
const EventEmitter = require('events');
const util = require('util');
function MyEmitter() {
EventEmitter.call(this);
// use nextTick to emit the event once a handler is assigned
process.nextTick(function () {
this.emit('event');
}.bind(this));
}
util.inherits(MyEmitter, EventEmitter);
const myEmitter = new MyEmitter();
myEmitter.on('event', function() {
console.log('an event occurred!');
});
```