In cases where there are multiple @-chars in a url, Node currently
parses the hostname and auth sections differently than web browsers.
This part of the bug is serious, and should be landed in v0.10, and
also ported to v0.8, and releases made as soon as possible.
The less serious issue is that there are many other sorts of malformed
urls which Node either accepts when it should reject, or interprets
differently than web browsers. For example, `http://a.com*foo` is
interpreted by Node like `http://a.com/*foo` when web browsers treat
this as `http://a.com%3Bfoo/`.
In general, *only* the `hostEndingChars` should be the characters that
delimit the host portion of the URL. Most of the current `nonHostChars`
that appear in the hostname should be escaped, but some of them (such as
`;` and `%` when it does not introduce a hex pair) should raise an
error.
We need to have a broader discussion about whether it's best to throw in
these cases, and potentially break extant programs, or return an object
that has every field set to `null` so that any attempt to read the
hostname/auth/etc. will appear to be empty.
In some cases, the http CONNECT/Upgrade API is unshifting an empty
bodyHead buffer onto the socket.
Normally, stream.unshift(chunk) does not set state.reading=false.
However, this check was not being done for the case when the chunk was
empty (either `''` or `Buffer(0)`), and as a result, it was causing the
socket to think that a read had completed, and to stop providing data.
This bug is not limited to http or web sockets, but rather would affect
any parser that unshifts data back onto the source stream without being
very careful to never unshift an empty chunk. Since the intent of
unshift is to *not* change the state.reading property, this is a bug.
Fixes#5557FixesLearnBoost/socket.io#1242
Before this, entering something like:
> JSON.parse('066');
resulted in the "..." prompt instead of displaying the expected
"SyntaxError: Unexpected number"
1. Emit `sslOutEnd` only when `_internallyPendingBytes() === 0`.
2. Read before checking `._halfRead`, otherwise we'll see only previous
value, and will invoke `._write` callback improperly.
3. Wait for both `end` and `finish` events in `.destroySoon`.
4. Unpipe encrypted stream from socket to prevent write after destroy.
Stream's `._write()` callback should be invoked only after it's opposite
stream has finished processing incoming data, otherwise `finish` event
fires too early and connection might be closed while there's some data
to send to the client.
see #5544
Quote from SSL_shutdown man page:
The output of SSL_get_error(3) may be misleading,
as an erroneous SSL_ERROR_SYSCALL may be flagged even though
no error occurred.
Also, handle all other errors to prevent assertion in `ClearError()`.
When writing bad data to EncryptedStream it'll first get to the
ClientHello parser, and, only after it will refuse it, to the OpenSSL.
But ClientHello parser has limited buffer and therefore write could
return `bytes_written` < `incoming_bytes`, which is not the case when
working with OpenSSL.
After such errors ClientHello parser disables itself and will
pass-through all data to the OpenSSL. So just trying to write data one
more time will throw the rest into OpenSSL and let it handle it.
Otherwise, writing an empty string causes the whole program to grind to
a halt when piping data into http messages.
This wasn't as much of a problem (though it WAS a bug) in 0.8 and
before, because our hyperactive 'drain' behavior would mean that some
*previous* write() would probably have a pending drain event, and cause
things to start moving again.
This saves a few calls to gettimeofday which can be expensive, and
potentially subject to clock drift. Instead use the loop time which
uses hrtime internally.
fixes#5497
This commit adds an optimization to the HTTP client that makes it
possible to:
* Pack the headers and the first chunk of the request body into a
single write().
* Pack the chunk header and the chunk itself into a single write().
Because only one write() system call is issued instead of several,
the chances of data ending up in a single TCP packet are phenomenally
higher: the benchmark with `type=buf size=32` jumps from 50 req/s to
7,500 req/s, a 150-fold increase.
This commit removes the check from e4b716ef that pushes binary encoded
strings into the slow path. The commit log mentions that:
We were assuming that any string can be concatenated safely to
CRLF. However, for hex, base64, or binary encoded writes, this
is not the case, and results in sending the incorrect response.
For hex and base64 strings that's certainly true but binary strings
are 'das Ding an sich': string.length is the same before and after
decoding.
Fixes#5528.
When an internal api needs a timeout, they should use
timers._unrefActive since that won't hold the loop open. This solves
the problem where you might have unref'd the socket handle but the
timeout for the socket was still active.
Previously one could write anywhere in a buffer pool if they accidently
got their offset wrong. Mainly because the cc level checks only test
against the parent slow buffer and not against the js object properties.
So now we check to make sure values won't go beyond bounds without
letting the dev know.
Add localAddress and localPort properties to tls.CleartextStream.
Like remoteAddress and localPort, delegate to the backing net.Socket
object.
Refs #5502.
Test case:
var t = setInterval(function() {}, 1);
process.nextTick(t.unref);
Output:
Assertion failed: (args.Holder()->InternalFieldCount() > 0),
function Unref, file ../src/handle_wrap.cc, line 78.
setInterval() returns a binding layer object. Make it stop doing that,
wrap the raw process.binding('timer_wrap').Timer object in a Timeout
object.
Fixes#4261.
Commit 38149bb changes http.get() and http.request() to escape unsafe
characters. However, that creates an incompatibility with v0.10 that
is difficult to work around: if you escape the path manually, then in
v0.11 it gets escaped twice. Change lib/http.js so it no longer tries
to fix up bad request paths, simply reject them with an exception.
The actual check is rather basic right now. The full check for illegal
characters is difficult to implement efficiently because it requires a
few characters of lookahead. That's why it currently only checks for
spaces because those are guaranteed to create an invalid request.
Fixes#5474.
getServers returns an array of ips that are currently being used for
resolution
setServers takes an array of ips that are to be used for resolution,
this will throw if there's invalid input but preserve the original
configuration
Pretty much everything assumes strings to be utf-8, but crypto
traditionally used binary strings, so we need to keep the default
that way until most users get off of that pattern.
If there is an encoding, and we do 'stream.push(chunk, enc)', and the
encoding argument matches the stated encoding, then we're converting from
a string, to a buffer, and then back to a string. Of course, this is a
completely pointless bit of work, so it's best to avoid it when we know
that we can do so safely.
Empirical evidence suggests that OS-level load balancing (that is,
having multiple processes listen on a socket and have the operating
system wake up one when a connection comes in) produces skewed load
distributions on Linux, Solaris and possibly other operating systems.
The observed behavior is that a fraction of the listening processes
receive the majority of the connections. From the perspective of the
operating system, that somewhat makes sense: a task switch is expensive,
to be avoided whenever possible. That's why the operating system likes
to give preferential treatment to a few processes, because it reduces
the number of switches.
However, that rather subverts the purpose of the cluster module, which
is to distribute the load as evenly as possible. That's why this commit
adds (and defaults to) round-robin support, meaning that the master
process accepts connections and distributes them to the workers in a
round-robin fashion, effectively bypassing the operating system.
Round-robin is currently disabled on Windows due to how IOCP is wired
up. It works and you can select it manually but it probably results in
a heavy performance hit.
Fixes#4435.
Commit 9352c19 ("child_process: don't emit same handle twice") trades
one bug for another.
Before said commit, a handle was sometimes delivered with messages it
didn't belong to.
The bug fix introduced another bug that needs some explaining. On UNIX
systems, handles are basically file descriptors that are passed around
with the sendmsg() and recvmsg() system calls, using auxiliary data
(SCM_RIGHTS) as the transport.
node.js and libuv depend on the fact that none of the supported systems
ever emit more than one SCM_RIGHTS message from a recvmsg() syscall.
That assumption is something we should probably address someday for the
sake of portability but that's a separate discussion.
So, SCM_RIGHTS messages are never coalesced. SCM_RIGHTS and normal
messages however _are_ coalesced. That is, recvmsg() might return this:
recvmsg(); // { "message-with-fd", "message", "message" }
The operating system implicitly breaks pending messages along
SCM_RIGHTS boundaries. Most Unices break before such messages but Linux
also breaks _after_ them. When the sender looks like this:
sendmsg("message");
sendmsg("message-with-fd");
sendmsg("message");
Then on most Unices the receiver sees messages arriving like this:
recvmsg(); // { "message" }
recvmsg(); // { "message-with-fd", "message" }
The bug fix in commit 9352c19 assumes this behavior. On Linux however,
those messages can also come in like this:
recvmsg(); // { "message", "message-with-fd" }
recvmsg(); // { "message" }
In other words, it's incorrect to assume that the file descriptor is
always attached to the first message. This commit makes node wise up.
Fixes#5330.
This adds proper support for the following situation:
w.cork();
w.write(...);
w.cork();
w.write(...);
w.uncork();
w.write(...);
w.uncork();
This is relevant when you have a function (as we do in HTTP) that wants
to use cork, but in some cases, want to have a cork/uncork *around*
that function, without losing the benefits of writev.
In synchronous Writable streams (where the _write cb is called on the
current tick), the 'finish' event (and thus the end() callback) can in
some cases be called before all the write() callbacks are called.
Use a counter, and have stream.Transform rely on the 'prefinish' event
instead of the 'finish' event.
This has zero effect on most streams, but it corrects an edge case and
makes it perform more deterministically, which is a Good Thing.
Implement support for debugging cluster workers. Each worker process
is assigned a new debug port in an increasing sequence.
I.e. when master process uses port 5858, then worker 1 uses port 5859,
worker 2 uses port 5860, and so on.
Introduce new command-line parameter '--debug-port=' which sets debug_port
but does not start debugger. This option works for all node processes, it
is not specific to cluster workers.
Fixesjoyent/node#5318.
When developer calls setBreakpoint with an unknown script name,
we convert the script name into regular expression matching all
paths ending with given name (name can be a relative path too).
To create such breakpoint in V8, we use type `scriptRegEx`
instead of `scriptId` for `setbreakpoint` request.
To restore such breakpoint, we save the original script name
send by the user. We use this original name to set (restore)
breakpoint in the new child process.
This is a back-port of commit 5db936d from the master branch.
Add a watchdog class which executes a timer in a separate event loop in
a separate thread that will terminate v8 execution if it expires.
Add timeout argument to functions in vm module which use the watchdog
if a non-zero timeout is specified.
Forward-port the comments from commit 01e2920 (v0.10) to the master
branch. Everything else from that patch already exists in master.
It didn't merge cleanly because lib/http.js has been split up in
several files.