Empirical evidence suggests that OS-level load balancing (that is,
having multiple processes listen on a socket and have the operating
system wake up one when a connection comes in) produces skewed load
distributions on Linux, Solaris and possibly other operating systems.
The observed behavior is that a fraction of the listening processes
receive the majority of the connections. From the perspective of the
operating system, that somewhat makes sense: a task switch is expensive,
to be avoided whenever possible. That's why the operating system likes
to give preferential treatment to a few processes, because it reduces
the number of switches.
However, that rather subverts the purpose of the cluster module, which
is to distribute the load as evenly as possible. That's why this commit
adds (and defaults to) round-robin support, meaning that the master
process accepts connections and distributes them to the workers in a
round-robin fashion, effectively bypassing the operating system.
Round-robin is currently disabled on Windows due to how IOCP is wired
up. It works and you can select it manually but it probably results in
a heavy performance hit.
Fixes#4435.
Commit 9352c19 ("child_process: don't emit same handle twice") trades
one bug for another.
Before said commit, a handle was sometimes delivered with messages it
didn't belong to.
The bug fix introduced another bug that needs some explaining. On UNIX
systems, handles are basically file descriptors that are passed around
with the sendmsg() and recvmsg() system calls, using auxiliary data
(SCM_RIGHTS) as the transport.
node.js and libuv depend on the fact that none of the supported systems
ever emit more than one SCM_RIGHTS message from a recvmsg() syscall.
That assumption is something we should probably address someday for the
sake of portability but that's a separate discussion.
So, SCM_RIGHTS messages are never coalesced. SCM_RIGHTS and normal
messages however _are_ coalesced. That is, recvmsg() might return this:
recvmsg(); // { "message-with-fd", "message", "message" }
The operating system implicitly breaks pending messages along
SCM_RIGHTS boundaries. Most Unices break before such messages but Linux
also breaks _after_ them. When the sender looks like this:
sendmsg("message");
sendmsg("message-with-fd");
sendmsg("message");
Then on most Unices the receiver sees messages arriving like this:
recvmsg(); // { "message" }
recvmsg(); // { "message-with-fd", "message" }
The bug fix in commit 9352c19 assumes this behavior. On Linux however,
those messages can also come in like this:
recvmsg(); // { "message", "message-with-fd" }
recvmsg(); // { "message" }
In other words, it's incorrect to assume that the file descriptor is
always attached to the first message. This commit makes node wise up.
Fixes#5330.
This adds proper support for the following situation:
w.cork();
w.write(...);
w.cork();
w.write(...);
w.uncork();
w.write(...);
w.uncork();
This is relevant when you have a function (as we do in HTTP) that wants
to use cork, but in some cases, want to have a cork/uncork *around*
that function, without losing the benefits of writev.
In synchronous Writable streams (where the _write cb is called on the
current tick), the 'finish' event (and thus the end() callback) can in
some cases be called before all the write() callbacks are called.
Use a counter, and have stream.Transform rely on the 'prefinish' event
instead of the 'finish' event.
This has zero effect on most streams, but it corrects an edge case and
makes it perform more deterministically, which is a Good Thing.
Implement support for debugging cluster workers. Each worker process
is assigned a new debug port in an increasing sequence.
I.e. when master process uses port 5858, then worker 1 uses port 5859,
worker 2 uses port 5860, and so on.
Introduce new command-line parameter '--debug-port=' which sets debug_port
but does not start debugger. This option works for all node processes, it
is not specific to cluster workers.
Fixesjoyent/node#5318.
When developer calls setBreakpoint with an unknown script name,
we convert the script name into regular expression matching all
paths ending with given name (name can be a relative path too).
To create such breakpoint in V8, we use type `scriptRegEx`
instead of `scriptId` for `setbreakpoint` request.
To restore such breakpoint, we save the original script name
send by the user. We use this original name to set (restore)
breakpoint in the new child process.
This is a back-port of commit 5db936d from the master branch.
Add a watchdog class which executes a timer in a separate event loop in
a separate thread that will terminate v8 execution if it expires.
Add timeout argument to functions in vm module which use the watchdog
if a non-zero timeout is specified.
Forward-port the comments from commit 01e2920 (v0.10) to the master
branch. Everything else from that patch already exists in master.
It didn't merge cleanly because lib/http.js has been split up in
several files.
When developer calls setBreakpoint with an unknown script name,
we convert the script name into regular expression matching all
paths ending with given name (name can be a relative path too).
To create such breakpoint in V8, we use type `scriptRegEx`
instead of `scriptId` for `setbreakpoint` request.
To restore such breakpoint, we save the original script name
send by the user. We use this original name to set (restore)
breakpoint in the new child process.
Fixed a bug in debugger repl where `restart` command did not work
when a custom debug port was specified via command-line option
--port={number}.
File test/simple/helper-debugger-repl.js was extracted
from test/simple/test-debugger-repl.js
Fixed a bug in debugger repl where `restart` command did not work
when a custom debug port was specified via command-line option
--port={number}.
File test/simple/helper-debugger-repl.js was extracted
from test/simple/test-debugger-repl.js
Fixes#3740
In the case of pipelined requests, you can have a situation where
the socket gets destroyed via one req/res object, but then trying
to destroy *another* req/res on the same socket will cause it to
call undefined.destroy(), since it was already removed from that
message.
Add a guard to OutgoingMessage.destroy and IncomingMessage.destroy
to prevent this error.
Fixes#3740
In the case of pipelined requests, you can have a situation where
the socket gets destroyed via one req/res object, but then trying
to destroy *another* req/res on the same socket will cause it to
call undefined.destroy(), since it was already removed from that
message.
Add a guard to OutgoingMessage.destroy and IncomingMessage.destroy
to prevent this error.
It needs to apply the Transform class when the _readableState,
_writableState, or _transformState properties are accessed,
otherwise things like setEncoding and on('data') don't work
properly.
Also, the methods wrappers are no longer needed, since they're only
problematic because they access the undefined properties.
Clean up and DRY the cluster source code. Fix a few bugs while we're
here:
* Short-lived handles in long-lived worker processes were never
reclaimed, resulting in resource leaks.
* Handles in the master process are now closed when the last worker
that holds a reference to them quits. Previously, they were only
closed at cluster shutdown.
* The cluster object no longer exposes functions/properties that are
only valid in the 'other' process, e.g. cluster.fork() is no longer
exported in worker processes.
So much goodness and still manages to reduce the line count from 590
to 320.
An absolute path will always open the same location regardless of your
current working directory. For posix, this just means path.charAt(0) ===
'/', but on Windows it's a little more complicated.
Fixesjoyent/node#5299.
4716dc6 made assert.equal() and related functions work better by
generating a better toString() from the expected, actual, and operator
values passed to fail(). Unfortunately, this was accomplished by putting
the generated message into the error's `name` property. When you passed
in a custom error message, the error would put the custom error into
`name` *and* `message`, resulting in helpful string representations like
"AssertionError: Oh no: Oh no".
This commit resolves that issue by storing the generated message in the
`message` property while leaving the error's name alone and adding
a regression test so that this doesn't pop back up later.
Closes#5292.
I broke dgram.Socket#bind(port, cb) almost a year ago in 332fea5a but
it wasn't until today that someone complained and none of the tests
caught it because they all either specify the address or omit the
callback.
Anyway, now it works again and does what you expect: it binds the
socket to the "any" address ("0.0.0.0" for IPv4 and "::" for IPv6.)
Make http.request() and friends escape unsafe characters in the request
path. That is, a request for '/foo bar' is now escaped as '/foo%20bar'.
Before this commit, the path was used as-is in the request status line,
creating an invalid HTTP request ("GET /foo bar HTTP/1.1").
Fixes#4381.
Fix#5272
The consumption of a readable stream is a dance with 3 partners.
1. The specific stream Author (A)
2. The Stream Base class (B), and
3. The Consumer of the stream (C)
When B calls the _read() method that A implements, it sets a 'reading'
flag, so that parallel calls to _read() can be avoided. When A calls
stream.push(), B knows that it's safe to start calling _read() again.
If the consumer C is some kind of parser that wants in some cases to
pass the source stream off to some other party, but not before "putting
back" some bit of previously consumed data (as in the case of Node's
websocket http upgrade implementation). So, stream.unshift() will
generally *never* be called by A, but *only* called by C.
Prior to this patch, stream.unshift() *also* unset the state.reading
flag, meaning that C could indicate the end of a read, and B would
dutifully fire off another _read() call to A. This is inappropriate.
In the case of fs streams, and other variably-laggy streams that don't
tolerate overlapped _read() calls, this causes big problems.
Also, calling stream.shift() after the 'end' event did not raise any
kind of error, but would cause very strange behavior indeed. Calling it
after the EOF chunk was seen, but before the 'end' event was fired would
also cause weird behavior, and could lead to data being lost, since it
would not emit another 'readable' event.
This change makes it so that:
1. stream.unshift() does *not* set state.reading = false
2. stream.unshift() is allowed up until the 'end' event.
3. unshifting onto a EOF-encountered and zero-length (but not yet
end-emitted) stream will defer the 'end' event until the new data is
consumed.
4. pushing onto a EOF-encountered stream is now an error.
So, if you read(), you have that single tick to safely unshift() data
back into the stream, even if the null chunk was pushed, and the length
was 0.
Don't scan the whole string for a "NODE_" substring, just check that
the string starts with the expected prefix.
This is a reprise of dbbfbe7 but this time for the child_process
module.
Don't scan the whole string for a "NODE_CLUSTER_" substring, just check
that the string starts with the expected prefix. The linear scan was
causing a noticeable (but unsurprising) slowdown on messages with a
large .cmd string property.
Buffer.byteLength() works only for string inputs. Thus, when connection
has pending Buffer to write, it should just use it's length instead of
throwing exception.
Since 049903e, an end callback could be called before a write
callback if end() is called before the write is done. This patch
resolves the issue.
In collaboration with @gne
Fixes felixge/node-formidable#209
Fixes#5215
We were assuming that any string can be concatenated safely to
CRLF. However, for hex, base64, or binary encoded writes, this
is not the case, and results in sending the incorrect response.
An unusual edge case, but certainly a bug.