While the new Buffer implementation is much faster we still have the
necessity of using Buffer pools. This is undesirable because it may
still lead to unwanted memory retention, but for the time being this is
the best solution.
Because of this re-introduction, and since there is no more SlowBuffer
type, the SlowBuffer method has been re-purposed to return a non-pooled
Buffer instance. This will be helpful for developers to store data for
indeterminate lengths of time without introducing a memory leak.
Another change to Buffer pools was that they are only allocated if the
requested chunk is < poolSize / 2. This was done because allocations are
much quicker now, and it's a better use of the pool.
Memory allocations are now done through smalloc. The Buffer cc class has
been removed completely, but for backwards compatibility have left the
namespace as Buffer.
The .parent attribute is only set if the Buffer is a slice of an
allocation. Which is then set to the alloc object (not a Buffer).
The .offset attribute is now a ReadOnly set to 0, for backwards
compatibility. I'd like to remove it in the future (pre v1.0).
A few alterations have been made to how arguments are either coerced or
thrown. All primitives will now be coerced to their respective values,
and (most) all out of range index requests will throw.
The indexes that are coerced were left for backwards compatibility. For
example: Buffer slice operates more like Array slice, and coerces
instead of throwing out of range indexes. This may change in the future.
The reason for wanting to throw for out of range indexes is because
giving js access to raw memory has high potential risk. To mitigate that
it's easier to make sure the developer is always quickly alerted to the
fact that their code is attempting to access beyond memory bounds.
Because SlowBuffer will be deprecated, and simply returns a new Buffer
instance, all tests on SlowBuffer have been removed.
Heapdumps will now show usage under "smalloc" instead of "Buffer".
ParseArrayIndex was added to node_internals to support proper uint
argument checking/coercion for external array data indexes.
SlabAllocator had to be updated since handle_ no longer exists.
Split `tls.js` into `_tls_legacy.js`, containing legacy
`createSecurePair` API, and `_tls_wrap.js` containing new code based on
`tls_wrap` binding.
Remove tests that are no longer useful/valid.
This reverts commit a40133d10c.
Unfortunately, this breaks socket.io. Even though it's not strictly an
API change, it is too subtle and in too brittle an area of node, to be
done in a stable branch.
Conflicts:
doc/api/http.markdown
Normalize the encoding in getEncoding() before using it. Fixes a
"AssertionError: Cannot change encoding" exception when the caller
mixes "utf8" and "utf-8".
Fixes#5655.
This is only relevant for CentOS 6.3 using kernel version 2.6.32.
On other linuxes and darwin, the `read` call gets an ECONNRESET in that
case. On sunos, the `write` call fails with EPIPE.
However, old CentOS will occasionally send an EOF instead of a
ECONNRESET or EPIPE when the client has been destroyed abruptly.
Make sure we don't keep trying to write or read more in that case.
Fixes#5504
However, there is still the question of what libuv should do when it
gets an EOF. Apparently in this case, it will continue trying to read,
which is almost certainly the wrong thing to do.
That should be fixed in libuv, even though this works around the issue.
In cases where there are multiple @-chars in a url, Node currently
parses the hostname and auth sections differently than web browsers.
This part of the bug is serious, and should be landed in v0.10, and
also ported to v0.8, and releases made as soon as possible.
The less serious issue is that there are many other sorts of malformed
urls which Node either accepts when it should reject, or interprets
differently than web browsers. For example, `http://a.com*foo` is
interpreted by Node like `http://a.com/*foo` when web browsers treat
this as `http://a.com%3Bfoo/`.
In general, *only* the `hostEndingChars` should be the characters that
delimit the host portion of the URL. Most of the current `nonHostChars`
that appear in the hostname should be escaped, but some of them (such as
`;` and `%` when it does not introduce a hex pair) should raise an
error.
We need to have a broader discussion about whether it's best to throw in
these cases, and potentially break extant programs, or return an object
that has every field set to `null` so that any attempt to read the
hostname/auth/etc. will appear to be empty.
In some cases, the http CONNECT/Upgrade API is unshifting an empty
bodyHead buffer onto the socket.
Normally, stream.unshift(chunk) does not set state.reading=false.
However, this check was not being done for the case when the chunk was
empty (either `''` or `Buffer(0)`), and as a result, it was causing the
socket to think that a read had completed, and to stop providing data.
This bug is not limited to http or web sockets, but rather would affect
any parser that unshifts data back onto the source stream without being
very careful to never unshift an empty chunk. Since the intent of
unshift is to *not* change the state.reading property, this is a bug.
Fixes#5557FixesLearnBoost/socket.io#1242
Before this, entering something like:
> JSON.parse('066');
resulted in the "..." prompt instead of displaying the expected
"SyntaxError: Unexpected number"
1. Emit `sslOutEnd` only when `_internallyPendingBytes() === 0`.
2. Read before checking `._halfRead`, otherwise we'll see only previous
value, and will invoke `._write` callback improperly.
3. Wait for both `end` and `finish` events in `.destroySoon`.
4. Unpipe encrypted stream from socket to prevent write after destroy.
Stream's `._write()` callback should be invoked only after it's opposite
stream has finished processing incoming data, otherwise `finish` event
fires too early and connection might be closed while there's some data
to send to the client.
see #5544
Quote from SSL_shutdown man page:
The output of SSL_get_error(3) may be misleading,
as an erroneous SSL_ERROR_SYSCALL may be flagged even though
no error occurred.
Also, handle all other errors to prevent assertion in `ClearError()`.
When writing bad data to EncryptedStream it'll first get to the
ClientHello parser, and, only after it will refuse it, to the OpenSSL.
But ClientHello parser has limited buffer and therefore write could
return `bytes_written` < `incoming_bytes`, which is not the case when
working with OpenSSL.
After such errors ClientHello parser disables itself and will
pass-through all data to the OpenSSL. So just trying to write data one
more time will throw the rest into OpenSSL and let it handle it.
Otherwise, writing an empty string causes the whole program to grind to
a halt when piping data into http messages.
This wasn't as much of a problem (though it WAS a bug) in 0.8 and
before, because our hyperactive 'drain' behavior would mean that some
*previous* write() would probably have a pending drain event, and cause
things to start moving again.
This saves a few calls to gettimeofday which can be expensive, and
potentially subject to clock drift. Instead use the loop time which
uses hrtime internally.
fixes#5497
This commit adds an optimization to the HTTP client that makes it
possible to:
* Pack the headers and the first chunk of the request body into a
single write().
* Pack the chunk header and the chunk itself into a single write().
Because only one write() system call is issued instead of several,
the chances of data ending up in a single TCP packet are phenomenally
higher: the benchmark with `type=buf size=32` jumps from 50 req/s to
7,500 req/s, a 150-fold increase.
This commit removes the check from e4b716ef that pushes binary encoded
strings into the slow path. The commit log mentions that:
We were assuming that any string can be concatenated safely to
CRLF. However, for hex, base64, or binary encoded writes, this
is not the case, and results in sending the incorrect response.
For hex and base64 strings that's certainly true but binary strings
are 'das Ding an sich': string.length is the same before and after
decoding.
Fixes#5528.
When an internal api needs a timeout, they should use
timers._unrefActive since that won't hold the loop open. This solves
the problem where you might have unref'd the socket handle but the
timeout for the socket was still active.
Previously one could write anywhere in a buffer pool if they accidently
got their offset wrong. Mainly because the cc level checks only test
against the parent slow buffer and not against the js object properties.
So now we check to make sure values won't go beyond bounds without
letting the dev know.
Add localAddress and localPort properties to tls.CleartextStream.
Like remoteAddress and localPort, delegate to the backing net.Socket
object.
Refs #5502.
Test case:
var t = setInterval(function() {}, 1);
process.nextTick(t.unref);
Output:
Assertion failed: (args.Holder()->InternalFieldCount() > 0),
function Unref, file ../src/handle_wrap.cc, line 78.
setInterval() returns a binding layer object. Make it stop doing that,
wrap the raw process.binding('timer_wrap').Timer object in a Timeout
object.
Fixes#4261.
Commit 38149bb changes http.get() and http.request() to escape unsafe
characters. However, that creates an incompatibility with v0.10 that
is difficult to work around: if you escape the path manually, then in
v0.11 it gets escaped twice. Change lib/http.js so it no longer tries
to fix up bad request paths, simply reject them with an exception.
The actual check is rather basic right now. The full check for illegal
characters is difficult to implement efficiently because it requires a
few characters of lookahead. That's why it currently only checks for
spaces because those are guaranteed to create an invalid request.
Fixes#5474.
getServers returns an array of ips that are currently being used for
resolution
setServers takes an array of ips that are to be used for resolution,
this will throw if there's invalid input but preserve the original
configuration
Pretty much everything assumes strings to be utf-8, but crypto
traditionally used binary strings, so we need to keep the default
that way until most users get off of that pattern.
If there is an encoding, and we do 'stream.push(chunk, enc)', and the
encoding argument matches the stated encoding, then we're converting from
a string, to a buffer, and then back to a string. Of course, this is a
completely pointless bit of work, so it's best to avoid it when we know
that we can do so safely.
Empirical evidence suggests that OS-level load balancing (that is,
having multiple processes listen on a socket and have the operating
system wake up one when a connection comes in) produces skewed load
distributions on Linux, Solaris and possibly other operating systems.
The observed behavior is that a fraction of the listening processes
receive the majority of the connections. From the perspective of the
operating system, that somewhat makes sense: a task switch is expensive,
to be avoided whenever possible. That's why the operating system likes
to give preferential treatment to a few processes, because it reduces
the number of switches.
However, that rather subverts the purpose of the cluster module, which
is to distribute the load as evenly as possible. That's why this commit
adds (and defaults to) round-robin support, meaning that the master
process accepts connections and distributes them to the workers in a
round-robin fashion, effectively bypassing the operating system.
Round-robin is currently disabled on Windows due to how IOCP is wired
up. It works and you can select it manually but it probably results in
a heavy performance hit.
Fixes#4435.
Commit 9352c19 ("child_process: don't emit same handle twice") trades
one bug for another.
Before said commit, a handle was sometimes delivered with messages it
didn't belong to.
The bug fix introduced another bug that needs some explaining. On UNIX
systems, handles are basically file descriptors that are passed around
with the sendmsg() and recvmsg() system calls, using auxiliary data
(SCM_RIGHTS) as the transport.
node.js and libuv depend on the fact that none of the supported systems
ever emit more than one SCM_RIGHTS message from a recvmsg() syscall.
That assumption is something we should probably address someday for the
sake of portability but that's a separate discussion.
So, SCM_RIGHTS messages are never coalesced. SCM_RIGHTS and normal
messages however _are_ coalesced. That is, recvmsg() might return this:
recvmsg(); // { "message-with-fd", "message", "message" }
The operating system implicitly breaks pending messages along
SCM_RIGHTS boundaries. Most Unices break before such messages but Linux
also breaks _after_ them. When the sender looks like this:
sendmsg("message");
sendmsg("message-with-fd");
sendmsg("message");
Then on most Unices the receiver sees messages arriving like this:
recvmsg(); // { "message" }
recvmsg(); // { "message-with-fd", "message" }
The bug fix in commit 9352c19 assumes this behavior. On Linux however,
those messages can also come in like this:
recvmsg(); // { "message", "message-with-fd" }
recvmsg(); // { "message" }
In other words, it's incorrect to assume that the file descriptor is
always attached to the first message. This commit makes node wise up.
Fixes#5330.
This adds proper support for the following situation:
w.cork();
w.write(...);
w.cork();
w.write(...);
w.uncork();
w.write(...);
w.uncork();
This is relevant when you have a function (as we do in HTTP) that wants
to use cork, but in some cases, want to have a cork/uncork *around*
that function, without losing the benefits of writev.
In synchronous Writable streams (where the _write cb is called on the
current tick), the 'finish' event (and thus the end() callback) can in
some cases be called before all the write() callbacks are called.
Use a counter, and have stream.Transform rely on the 'prefinish' event
instead of the 'finish' event.
This has zero effect on most streams, but it corrects an edge case and
makes it perform more deterministically, which is a Good Thing.