doc: async-hooks documentation

Author: Thorsten Lorenz <thlorenz@gmx.de> Author: Andreas Madsen <amwebdk@gmail.com> PR-URL: https://github.com/nodejs/node/pull/13287 Reviewed-By: Trevor Norris <trev.norris@gmail.com> Reviewed-By: Anna Henningsen <anna@addaleax.net>
8 years ago · 126170a05f
1 changed files with 554 additions and 0 deletions
--- a/doc/api/async_hooks.md
+++ b/doc/api/async_hooks.md
@ -0,0 +1,554 @@
+# Async Hooks
+
+> Stability: 1 - Experimental
+
+The `async_hooks` module provides an API to register callbacks tracking the
+lifetime of asynchronous resources created inside a Node.js application.
+It can be accessed using:
+
+```js
+const async_hooks = require('async_hooks');
+```
+
+## Terminology
+
+An asynchronous resource represents an object with an associated callback.
+This callback may be called multiple times, for example, the `connection` event
+in `net.createServer`, or just a single time like in `fs.open`. A resource
+can also be closed before the callback is called. AsyncHook does not
+explicitly distinguish between these different cases but will represent them
+as the abstract concept that is a resource.
+
+## Public API
+
+### Overview
+
+Following is a simple overview of the public API.
+
+```js
+const async_hooks = require('async_hooks');
+
+// Return the ID of the current execution context.
+const cid = async_hooks.currentId();
+
+// Return the ID of the handle responsible for triggering the callback of the
+// current execution scope to call.
+const tid = async_hooks.triggerId();
+
+// Create a new AsyncHook instance. All of these callbacks are optional.
+const asyncHook = async_hooks.createHook({ init, before, after, destroy });
+
+// Allow callbacks of this AsyncHook instance to call. This is not an implicit
+// action after running the constructor, and must be explicitly run to begin
+// executing callbacks.
+asyncHook.enable();
+
+// Disable listening for new asynchronous events.
+asyncHook.disable();
+
+//
+// The following are the callbacks that can be passed to createHook().
+//
+
+// init is called during object construction. The resource may not have
+// completed construction when this callback runs, therefore all fields of the
+// resource referenced by "asyncId" may not have been populated.
+function init(asyncId, type, triggerId, resource) { }
+
+// before is called just before the resource's callback is called. It can be
+// called 0-N times for handles (e.g. TCPWrap), and will be called exactly 1
+// time for requests (e.g. FSReqWrap).
+function before(asyncId) { }
+
+// after is called just after the resource's callback has finished.
+function after(asyncId) { }
+
+// destroy is called when an AsyncWrap instance is destroyed.
+function destroy(asyncId) { }
+```
+
+#### `async_hooks.createHook(callbacks)`
+
+<!-- YAML
+added: REPLACEME
+-->
+
+* `callbacks` {Object} the callbacks to register
+* Returns: `{AsyncHook}` instance used for disabling and enabling hooks
+
+Registers functions to be called for different lifetime events of each async
+operation.
+
+The callbacks `init()`/`before()`/`after()`/`destroy()` are called for the
+respective asynchronous event during a resource's lifetime.
+
+All callbacks are optional. So, for example, if only resource cleanup needs to
+be tracked then only the `destroy` callback needs to be passed. The
+specifics of all functions that can be passed to `callbacks` is in the section
+[`Hook Callbacks`][].
+
+##### Error Handling
+
+If any `AsyncHook` callbacks throw, the application will print the stack trace
+and exit. The exit path does follow that of an uncaught exception but
+all `uncaughtException` listeners are removed, thus forcing the process to
+exit. The `'exit'` callbacks will still be called unless the application is run
+with `--abort-on-uncaught-exception`, in which case a stack trace will be
+printed and the application exits, leaving a core file.
+
+The reason for this error handling behavior is that these callbacks are running
+at potentially volatile points in an object's lifetime, for example during
+class construction and destruction. Because of this, it is deemed necessary to
+bring down the process quickly in order to prevent an unintentional abort in the
+future. This is subject to change in the future if a comprehensive analysis is
+performed to ensure an exception can follow the normal control flow without
+unintentional side effects.
+
+
+##### Printing in AsyncHooks callbacks
+
+Because printing to the console is an asynchronous operation, `console.log()`
+will cause the AsyncHooks callbacks to be called. Using `console.log()` or
+similar asynchronous operations inside an AsyncHooks callback function will thus
+cause an infinite recursion. An easily solution to this when debugging is
+to use a synchronous logging operation such as `fs.writeSync(1, msg)`. This
+will print to stdout because `1` is the file descriptor for stdout and will
+not invoke AsyncHooks recursively because it is synchronous.
+
+```js
+const fs = require('fs');
+const util = require('util');
+
+function debug() {
+  // use a function like this one when debugging inside an AsyncHooks callback
+  fs.writeSync(1, util.format.apply(null, arguments));
+}
+```
+
+If an asynchronous operation is needed for logging, it is possible to keep
+track of what caused the asynchronous operation using the information
+provided by AsyncHooks itself. The logging should then be skipped when
+it was the logging itself that caused AsyncHooks callback to call. By
+doing this the otherwise infinite recursion is broken.
+
+#### `asyncHook.enable()`
+
+* Returns {AsyncHook} A reference to `asyncHook`.
+
+Enable the callbacks for a given `AsyncHook` instance. If no callbacks are
+provided enabling is a noop.
+
+The `AsyncHook` instance is by default disabled. If the `AsyncHook` instance
+should be enabled immediately after creation, the following pattern can be used.
+
+```js
+const async_hooks = require('async_hooks');
+
+const hook = async_hooks.createHook(callbacks).enable();
+```
+
+#### `asyncHook.disable()`
+
+* Returns {AsyncHook} A reference to `asyncHook`.
+
+Disable the callbacks for a given `AsyncHook` instance from the global pool of
+AsyncHook callbacks to be executed. Once a hook has been disabled it will not
+be called again until enabled.
+
+For API consistency `disable()` also returns the `AsyncHook` instance.
+
+#### Hook Callbacks
+
+Key events in the lifetime of asynchronous events have been categorized into
+four areas: instantiation, before/after the callback is called, and when the
+instance is destructed.
+
+##### `init(asyncId, type, triggerId, resource)`
+
+* `asyncId` {number} a unique ID for the async resource
+* `type` {string} the type of the async resource
+* `triggerId` {number} the unique ID of the async resource in whose
+  execution context this async resource was created
+* `resource` {Object} reference to the resource representing the async operation,
+  needs to be released during _destroy_
+
+Called when a class is constructed that has the _possibility_ to emit an
+asynchronous event. This _does not_ mean the instance must call
+`before`/`after` before `destroy` is called, only that the possibility
+exists.
+
+This behavior can be observed by doing something like opening a resource then
+closing it before the resource can be used. The following snippet demonstrates
+this.
+
+```js
+require('net').createServer().listen(function() { this.close(); });
+// OR
+clearTimeout(setTimeout(() => {}, 10));
+```
+
+Every new resource is assigned a unique ID.
+
+###### `type`
+
+The `type` is a string that represents the type of resource that caused
+`init` to call. Generally it will be the name of the resource's constructor.
+The resource types provided by the built-in Node.js modules are:
+
+```
+FSEVENTWRAP, FSREQWRAP, GETADDRINFOREQWRAP, GETNAMEINFOREQWRAP, HTTPPARSER,
+JSSTREAM, PIPECONNECTWRAP, PIPEWRAP, PROCESSWRAP, QUERYWRAP, SHUTDOWNWRAP,
+SIGNALWRAP, STATWATCHER, TCPCONNECTWRAP, TCPWRAP, TIMERWRAP, TTYWRAP,
+UDPSENDWRAP, UDPWRAP, WRITEWRAP, ZLIB, SSLCONNECTION, PBKDF2REQUEST,
+RANDOMBYTESREQUEST, TLSWRAP
+```
+
+Users are be able to define their own `type` when using the public embedder API.
+
+*Note:* It is possible to have type name collisions. Embedders are encouraged
+to use a unique prefixes, such as the npm package name, to prevent collisions
+when listening to the hooks.
+
+###### `triggerid`
+
+`triggerId` is the `asyncId` of the resource that caused (or "triggered") the
+new resource to initialize and that caused `init` to call. This is different
+from `async_hooks.currentId()` that only shows *when* a resource was created,
+while `triggerId` shows *why* a resource was created.
+
+
+The following is a simple demonstration of `triggerId`:
+
+```js
+const async_hooks = require('async_hooks');
+
+async_hooks.createHook({
+  init(asyncId, type, triggerId) {
+    const cId = async_hooks.currentId();
+    fs.writeSync(1, `${type}(${asyncId}): ` +
+                    `trigger: ${triggerId} scope: ${cId}\n`);
+  }
+}).enable();
+
+require('net').createServer((conn) => {}).listen(8080);
+```
+
+Output when hitting the server with `nc localhost 8080`:
+
+```
+TCPWRAP(2): trigger: 1 scope: 1
+TCPWRAP(4): trigger: 2 scope: 0
+```
+
+The first `TCPWRAP` is the server which receives the connections.
+
+The second `TCPWRAP` is the new connection from the client. When a new
+connection is made the `TCPWrap` instance is immediately constructed. This
+happens outside of any JavaScript stack (side note: a `currentId()` of `0`
+means it's being executed from C++, with no JavaScript stack above it).
+With only that information it would be impossible to link resources together in
+terms of what caused them to be created, so `triggerId` is given the task of
+propagating what resource is responsible for the new resource's existence.
+
+###### `resource`
+
+`resource` is an object that represents the actual resource. This can contain
+useful information such as the hostname for the `GETADDRINFOREQWRAP` resource
+type, which will be used when looking up the ip for the hostname in
+`net.Server.listen`. The API for getting this information is currently not
+considered public, but using the Embedder API users can provide and document
+their own resource objects. Such as resource object could for example contain
+the SQL query being executed.
+
+*Note:* In some cases the resource object is reused for performance reasons,
+it is thus not safe to use it as a key in a `WeakMap` or add properties to it.
+
+###### asynchronous context example
+
+Below is another example with additional information about the calls to
+`init` between the `before` and `after` calls, specifically what the
+callback to `listen()` will look like. The output formatting is slightly more
+elaborate to make calling context easier to see.
+
+```js
+const async_hooks = require('async_hooks');
+
+let indent = 0;
+async_hooks.createHook({
+  init(asyncId, type, triggerId) {
+    const cId = async_hooks.currentId();
+    fs.writeSync(1, ' '.repeat(indent) + `${type}(${asyncId}): ` +
+                    `trigger: ${triggerId} scope: ${cId}\n`);
+  },
+  before(asyncId) {
+    fs.writeSync(1, ' '.repeat(indent) + `before:  ${asyncId}`);
+    indent += 2;
+  },
+  after(asyncId) {
+    indent -= 2;
+    fs.writeSync(1, ' '.repeat(indent) + `after:   ${asyncId}`);
+  },
+  destroy(asyncId) {
+    fs.writeSync(1, ' '.repeat(indent) + `destroy: ${asyncId}`);
+  },
+}).enable();
+
+require('net').createServer(() => {}).listen(8080, () => {
+  // Let's wait 10ms before logging the server started.
+  setTimeout(() => {
+    console.log('>>>', async_hooks.currentId());
+  }, 10);
+});
+```
+
+Output from only starting the server:
+
+```
+TCPWRAP(2): trigger: 1 scope: 1
+TickObject(3): trigger: 2 scope: 1
+before:  3
+  Timeout(4): trigger: 3 scope: 3
+  TIMERWRAP(5): trigger: 3 scope: 3
+after:   3
+destroy: 3
+before:  5
+  before:  4
+    TTYWRAP(6): trigger: 4 scope: 4
+    SIGNALWRAP(7): trigger: 4 scope: 4
+    TTYWRAP(8): trigger: 4 scope: 4
+>>> 4
+    TickObject(9): trigger: 4 scope: 4
+  after:   4
+  destroy: 4
+after:   5
+before:  9
+after:   9
+destroy: 9
+destroy: 5
+```
+
+*Note*: As illustrated in the example, `currentId()` and `scope` each specify
+the value of the current execution context; which is delineated by calls to
+`before` and `after`.
+
+Only using `scope` to graph resource allocation results in the following:
+
+```
+TTYWRAP(6) -> Timeout(4) -> TIMERWRAP(5) -> TickObject(3) -> root(1)
+```
+
+The `TCPWRAP` isn't part of this graph; even though it was the reason for
+`console.log()` being called. This is because binding to a port without a
+hostname is actually synchronous, but to maintain a completely asynchronous API
+the user's callback is placed in a `process.nextTick()`.
+
+The graph only shows *when* a resource was created, not *why*, so to track
+the *why* use `triggerId`.
+
+
+##### `before(asyncId)`
+
+* `asyncId` {number}
+
+When an asynchronous operation is initiated (such as a TCP server receiving a
+new connection) or completes (such as writing data to disk) a callback is
+called to notify the user. The `before` callback is called just before said
+callback is executed. `asyncId` is the unique identifier assigned to the
+resource about to execute the callback.
+
+The `before` callback will be called 0 to N times. The `before` callback
+will typically be called 0 times if the asynchronous operation was cancelled
+or for example if no connections are received by a TCP server. Asynchronous
+like the TCP server will typically call the `before` callback multiple times,
+while other operations like `fs.open()` will only call it once.
+
+
+##### `after(asyncId)`
+
+* `asyncId` {number}
+
+Called immediately after the callback specified in `before` is completed.
+
+*Note:* If an uncaught exception occurs during execution of the callback then
+`after` will run after the `'uncaughtException'` event is emitted or a
+`domain`'s handler runs.
+
+
+##### `destroy(asyncId)`
+
+* `asyncId` {number}
+
+Called after the resource corresponding to `asyncId` is destroyed. It is also called
+asynchronously from the embedder API `emitDestroy()`.
+
+*Note:* Some resources depend on GC for cleanup, so if a reference is made to
+the `resource` object passed to `init` it's possible that `destroy` is
+never called, causing a memory leak in the application. Of course if
+the resource doesn't depend on GC then this isn't an issue.
+
+#### `async_hooks.currentId()`
+
+* Returns {number} the `asyncId` of the current execution context. Useful to track
+  when something calls.
+
+For example:
+
+```js
+console.log(async_hooks.currentId());  // 1 - bootstrap
+fs.open(path, (err, fd) => {
+  console.log(async_hooks.currentId());  // 2 - open()
+});
+```
+
+It is important to note that the ID returned fom `currentId()` is related to
+execution timing, not causality (which is covered by `triggerId()`). For
+example:
+
+```js
+const server = net.createServer(function onConnection(conn) {
+  // Returns the ID of the server, not of the new connection, because the
+  // onConnection callback runs in the execution scope of the server's
+  // MakeCallback().
+  async_hooks.currentId();
+
+}).listen(port, function onListening() {
+  // Returns the ID of a TickObject (i.e. process.nextTick()) because all
+  // callbacks passed to .listen() are wrapped in a nextTick().
+  async_hooks.currentId();
+});
+```
+
+#### `async_hooks.triggerId()`
+
+* Returns {number} the ID of the resource responsible for calling the callback
+  that is currently being executed.
+
+For example:
+
+```js
+const server = net.createServer((conn) => {
+  // Though the resource that caused (or triggered) this callback to
+  // be called was that of the new connection. Thus the return value
+  // of triggerId() is the ID of "conn".
+  async_hooks.triggerId();
+
+}).listen(port, () => {
+  // Even though all callbacks passed to .listen() are wrapped in a nextTick()
+  // the callback itself exists because the call to the server's .listen()
+  // was made. So the return value would be the ID of the server.
+  async_hooks.triggerId();
+});
+```
+
+## JavaScript Embedder API
+
+Library developers that handle their own I/O, a connection pool, or
+callback queues will need to hook into the AsyncWrap API so that all the
+appropriate callbacks are called. To accommodate this a JavaScript API is
+provided.
+
+### `class AsyncResource()`
+
+The class `AsyncResource` was designed to be extended by the embedder's async
+resources. Using this users can easily trigger the lifetime events of their
+own resources.
+
+The `init` hook will trigger when an `AsyncResource` is instantiated.
+
+It is important that `before`/`after` calls are unwound
+in the same order they are called. Otherwise an unrecoverable exception
+will occur and node will abort.
+
+The following is an overview of the `AsyncResource` API.
+
+```js
+// AsyncResource() is meant to be extended. Instantiating a
+// new AsyncResource() also triggers init. If triggerId is omitted then
+// async_hook.currentId() is used.
+const asyncResource = new AsyncResource(type, triggerId);
+
+// Call AsyncHooks before callbacks.
+asyncResource.emitBefore();
+
+// Call AsyncHooks after callbacks.
+asyncResource.emitAfter();
+
+// Call AsyncHooks destroy callbacks.
+asyncResource.emitDestroy();
+
+// Return the unique ID assigned to the AsyncResource instance.
+asyncResource.asyncId();
+
+// Return the trigger ID for the AsyncResource instance.
+asyncResource.triggerId();
+```
+
+#### `AsyncResource(type[, triggerId])`
+
+* arguments
+  * `type` {string} the type of ascyc event
+  * `triggerId` {number} the ID of the execution context that created this async
+    event
+
+Example usage:
+
+```js
+class DBQuery extends AsyncResource {
+  constructor(db) {
+    super('DBQuery');
+    this.db = db;
+  }
+
+  getInfo(query, callback) {
+    this.db.get(query, (err, data) => {
+      this.emitBefore();
+      callback(err, data);
+      this.emitAfter();
+    });
+  }
+
+  close() {
+    this.db = null;
+    this.emitDestroy();
+  }
+}
+```
+
+#### `asyncResource.emitBefore()`
+
+* Returns {undefined}
+
+Call all `before` callbacks and let them know a new asynchronous execution
+context is being entered. If nested calls to `emitBefore()` are made, the stack
+of `asyncId`s will be tracked and properly unwound.
+
+#### `asyncResource.emitAfter()`
+
+* Returns {undefined}
+
+Call all `after` callbacks. If nested calls to `emitBefore()` were made, then
+make sure the stack is unwound properly. Otherwise an error will be thrown.
+
+If the user's callback throws an exception then `emitAfter()` will
+automatically be called for all `asyncId`s on the stack if the error is handled by
+a domain or `'uncaughtException'` handler.
+
+#### `asyncResource.emitDestroy()`
+
+* Returns {undefined}
+
+Call all `destroy` hooks. This should only ever be called once. An error will
+be thrown if it is called more than once. This **must** be manually called. If
+the resource is left to be collected by the GC then the `destroy` hooks will
+never be called.
+
+#### `asyncResource.asyncId()`
+
+* Returns {number} the unique `asyncId` assigned to the resource.
+
+#### `asyncResource.triggerId()`
+
+* Returns {number} the same `triggerId` that is passed to the `AsyncResource`
+constructor.
+
+[`Hook Callbacks`]: #hook-callbacks