Before this commit, events were set to undefined rather than deleted
from the EventEmitter's backing dictionary for performance reasons:
`delete obj.key` causes a transition of the dictionary's hidden class
and that can be costly.
Unfortunately, that introduces a memory leak when many events are added
and then removed again. The strings containing the event names are never
reclaimed by the garbage collector because they remain part of the
dictionary.
That's why this commit makes EventEmitter delete events again. This
effectively reverts commit 0397223.
Fixes#5970.
Avoid a costly buffer-to-string operation. Instead, allocate a new
buffer, copy the chunk header and data into it and send that.
The speed difference is negligible on small payloads but it really
shines with larger (10+ kB) chunks. benchmark/http/end-vs-write-end
with 64 kB chunks gives 45-50% higher throughput. With 1 MB chunks,
the difference is a staggering 590%.
Of course, YMMV will vary with real workloads and networks but this
commit should have a positive impact on CPU and memory consumption.
Big kudos to Wyatt Preul (@wpreul) for reporting the issue and providing
the initial patch.
Fixes#5941 and #5944.
When using url.parse(), path and pathname usually return '/' when there
is no path available. However when you have a protocol that contains
non-lowercase letters and the input string does not have a trailing
slash, both path and pathname will be undefined.
A pooled https agent may get a Connection: close, but never finish
destroying the socket as the prior request had already emitted finish
likely from a pipe.
Since the socket is not marked as destroyed it may get reused by the
agent pool and result in an ECONNRESET.
re: #5712#5739
There was previously up to a second exit delay when exiting node
right after an http request/response, due to the utcDate() function
doing a setTimeout to update the cached date/time.
Fixing this should increase the performance of our http tests.
This reverts commit a40133d10c.
Unfortunately, this breaks socket.io. Even though it's not strictly an
API change, it is too subtle and in too brittle an area of node, to be
done in a stable branch.
Conflicts:
doc/api/http.markdown
Normalize the encoding in getEncoding() before using it. Fixes a
"AssertionError: Cannot change encoding" exception when the caller
mixes "utf8" and "utf-8".
Fixes#5655.
This is only relevant for CentOS 6.3 using kernel version 2.6.32.
On other linuxes and darwin, the `read` call gets an ECONNRESET in that
case. On sunos, the `write` call fails with EPIPE.
However, old CentOS will occasionally send an EOF instead of a
ECONNRESET or EPIPE when the client has been destroyed abruptly.
Make sure we don't keep trying to write or read more in that case.
Fixes#5504
However, there is still the question of what libuv should do when it
gets an EOF. Apparently in this case, it will continue trying to read,
which is almost certainly the wrong thing to do.
That should be fixed in libuv, even though this works around the issue.
In cases where there are multiple @-chars in a url, Node currently
parses the hostname and auth sections differently than web browsers.
This part of the bug is serious, and should be landed in v0.10, and
also ported to v0.8, and releases made as soon as possible.
The less serious issue is that there are many other sorts of malformed
urls which Node either accepts when it should reject, or interprets
differently than web browsers. For example, `http://a.com*foo` is
interpreted by Node like `http://a.com/*foo` when web browsers treat
this as `http://a.com%3Bfoo/`.
In general, *only* the `hostEndingChars` should be the characters that
delimit the host portion of the URL. Most of the current `nonHostChars`
that appear in the hostname should be escaped, but some of them (such as
`;` and `%` when it does not introduce a hex pair) should raise an
error.
We need to have a broader discussion about whether it's best to throw in
these cases, and potentially break extant programs, or return an object
that has every field set to `null` so that any attempt to read the
hostname/auth/etc. will appear to be empty.
In some cases, the http CONNECT/Upgrade API is unshifting an empty
bodyHead buffer onto the socket.
Normally, stream.unshift(chunk) does not set state.reading=false.
However, this check was not being done for the case when the chunk was
empty (either `''` or `Buffer(0)`), and as a result, it was causing the
socket to think that a read had completed, and to stop providing data.
This bug is not limited to http or web sockets, but rather would affect
any parser that unshifts data back onto the source stream without being
very careful to never unshift an empty chunk. Since the intent of
unshift is to *not* change the state.reading property, this is a bug.
Fixes#5557FixesLearnBoost/socket.io#1242
Before this, entering something like:
> JSON.parse('066');
resulted in the "..." prompt instead of displaying the expected
"SyntaxError: Unexpected number"
1. Emit `sslOutEnd` only when `_internallyPendingBytes() === 0`.
2. Read before checking `._halfRead`, otherwise we'll see only previous
value, and will invoke `._write` callback improperly.
3. Wait for both `end` and `finish` events in `.destroySoon`.
4. Unpipe encrypted stream from socket to prevent write after destroy.
Stream's `._write()` callback should be invoked only after it's opposite
stream has finished processing incoming data, otherwise `finish` event
fires too early and connection might be closed while there's some data
to send to the client.
see #5544
Quote from SSL_shutdown man page:
The output of SSL_get_error(3) may be misleading,
as an erroneous SSL_ERROR_SYSCALL may be flagged even though
no error occurred.
Also, handle all other errors to prevent assertion in `ClearError()`.
When writing bad data to EncryptedStream it'll first get to the
ClientHello parser, and, only after it will refuse it, to the OpenSSL.
But ClientHello parser has limited buffer and therefore write could
return `bytes_written` < `incoming_bytes`, which is not the case when
working with OpenSSL.
After such errors ClientHello parser disables itself and will
pass-through all data to the OpenSSL. So just trying to write data one
more time will throw the rest into OpenSSL and let it handle it.
Otherwise, writing an empty string causes the whole program to grind to
a halt when piping data into http messages.
This wasn't as much of a problem (though it WAS a bug) in 0.8 and
before, because our hyperactive 'drain' behavior would mean that some
*previous* write() would probably have a pending drain event, and cause
things to start moving again.
This commit adds an optimization to the HTTP client that makes it
possible to:
* Pack the headers and the first chunk of the request body into a
single write().
* Pack the chunk header and the chunk itself into a single write().
Because only one write() system call is issued instead of several,
the chances of data ending up in a single TCP packet are phenomenally
higher: the benchmark with `type=buf size=32` jumps from 50 req/s to
7,500 req/s, a 150-fold increase.
This commit removes the check from e4b716ef that pushes binary encoded
strings into the slow path. The commit log mentions that:
We were assuming that any string can be concatenated safely to
CRLF. However, for hex, base64, or binary encoded writes, this
is not the case, and results in sending the incorrect response.
For hex and base64 strings that's certainly true but binary strings
are 'das Ding an sich': string.length is the same before and after
decoding.
Fixes#5528.
When an internal api needs a timeout, they should use
timers._unrefActive since that won't hold the loop open. This solves
the problem where you might have unref'd the socket handle but the
timeout for the socket was still active.
Previously one could write anywhere in a buffer pool if they accidently
got their offset wrong. Mainly because the cc level checks only test
against the parent slow buffer and not against the js object properties.
So now we check to make sure values won't go beyond bounds without
letting the dev know.
Test case:
var t = setInterval(function() {}, 1);
process.nextTick(t.unref);
Output:
Assertion failed: (args.Holder()->InternalFieldCount() > 0),
function Unref, file ../src/handle_wrap.cc, line 78.
setInterval() returns a binding layer object. Make it stop doing that,
wrap the raw process.binding('timer_wrap').Timer object in a Timeout
object.
Fixes#4261.
Pretty much everything assumes strings to be utf-8, but crypto
traditionally used binary strings, so we need to keep the default
that way until most users get off of that pattern.
If there is an encoding, and we do 'stream.push(chunk, enc)', and the
encoding argument matches the stated encoding, then we're converting from
a string, to a buffer, and then back to a string. Of course, this is a
completely pointless bit of work, so it's best to avoid it when we know
that we can do so safely.
Commit 9352c19 ("child_process: don't emit same handle twice") trades
one bug for another.
Before said commit, a handle was sometimes delivered with messages it
didn't belong to.
The bug fix introduced another bug that needs some explaining. On UNIX
systems, handles are basically file descriptors that are passed around
with the sendmsg() and recvmsg() system calls, using auxiliary data
(SCM_RIGHTS) as the transport.
node.js and libuv depend on the fact that none of the supported systems
ever emit more than one SCM_RIGHTS message from a recvmsg() syscall.
That assumption is something we should probably address someday for the
sake of portability but that's a separate discussion.
So, SCM_RIGHTS messages are never coalesced. SCM_RIGHTS and normal
messages however _are_ coalesced. That is, recvmsg() might return this:
recvmsg(); // { "message-with-fd", "message", "message" }
The operating system implicitly breaks pending messages along
SCM_RIGHTS boundaries. Most Unices break before such messages but Linux
also breaks _after_ them. When the sender looks like this:
sendmsg("message");
sendmsg("message-with-fd");
sendmsg("message");
Then on most Unices the receiver sees messages arriving like this:
recvmsg(); // { "message" }
recvmsg(); // { "message-with-fd", "message" }
The bug fix in commit 9352c19 assumes this behavior. On Linux however,
those messages can also come in like this:
recvmsg(); // { "message", "message-with-fd" }
recvmsg(); // { "message" }
In other words, it's incorrect to assume that the file descriptor is
always attached to the first message. This commit makes node wise up.
Fixes#5330.
When developer calls setBreakpoint with an unknown script name,
we convert the script name into regular expression matching all
paths ending with given name (name can be a relative path too).
To create such breakpoint in V8, we use type `scriptRegEx`
instead of `scriptId` for `setbreakpoint` request.
To restore such breakpoint, we save the original script name
send by the user. We use this original name to set (restore)
breakpoint in the new child process.
This is a back-port of commit 5db936d from the master branch.
Fixed a bug in debugger repl where `restart` command did not work
when a custom debug port was specified via command-line option
--port={number}.
File test/simple/helper-debugger-repl.js was extracted
from test/simple/test-debugger-repl.js
Fixes#3740
In the case of pipelined requests, you can have a situation where
the socket gets destroyed via one req/res object, but then trying
to destroy *another* req/res on the same socket will cause it to
call undefined.destroy(), since it was already removed from that
message.
Add a guard to OutgoingMessage.destroy and IncomingMessage.destroy
to prevent this error.
It needs to apply the Transform class when the _readableState,
_writableState, or _transformState properties are accessed,
otherwise things like setEncoding and on('data') don't work
properly.
Also, the methods wrappers are no longer needed, since they're only
problematic because they access the undefined properties.
4716dc6 made assert.equal() and related functions work better by
generating a better toString() from the expected, actual, and operator
values passed to fail(). Unfortunately, this was accomplished by putting
the generated message into the error's `name` property. When you passed
in a custom error message, the error would put the custom error into
`name` *and* `message`, resulting in helpful string representations like
"AssertionError: Oh no: Oh no".
This commit resolves that issue by storing the generated message in the
`message` property while leaving the error's name alone and adding
a regression test so that this doesn't pop back up later.
Closes#5292.
I broke dgram.Socket#bind(port, cb) almost a year ago in 332fea5a but
it wasn't until today that someone complained and none of the tests
caught it because they all either specify the address or omit the
callback.
Anyway, now it works again and does what you expect: it binds the
socket to the "any" address ("0.0.0.0" for IPv4 and "::" for IPv6.)
Fix#5272
The consumption of a readable stream is a dance with 3 partners.
1. The specific stream Author (A)
2. The Stream Base class (B), and
3. The Consumer of the stream (C)
When B calls the _read() method that A implements, it sets a 'reading'
flag, so that parallel calls to _read() can be avoided. When A calls
stream.push(), B knows that it's safe to start calling _read() again.
If the consumer C is some kind of parser that wants in some cases to
pass the source stream off to some other party, but not before "putting
back" some bit of previously consumed data (as in the case of Node's
websocket http upgrade implementation). So, stream.unshift() will
generally *never* be called by A, but *only* called by C.
Prior to this patch, stream.unshift() *also* unset the state.reading
flag, meaning that C could indicate the end of a read, and B would
dutifully fire off another _read() call to A. This is inappropriate.
In the case of fs streams, and other variably-laggy streams that don't
tolerate overlapped _read() calls, this causes big problems.
Also, calling stream.shift() after the 'end' event did not raise any
kind of error, but would cause very strange behavior indeed. Calling it
after the EOF chunk was seen, but before the 'end' event was fired would
also cause weird behavior, and could lead to data being lost, since it
would not emit another 'readable' event.
This change makes it so that:
1. stream.unshift() does *not* set state.reading = false
2. stream.unshift() is allowed up until the 'end' event.
3. unshifting onto a EOF-encountered and zero-length (but not yet
end-emitted) stream will defer the 'end' event until the new data is
consumed.
4. pushing onto a EOF-encountered stream is now an error.
So, if you read(), you have that single tick to safely unshift() data
back into the stream, even if the null chunk was pushed, and the length
was 0.
Don't scan the whole string for a "NODE_" substring, just check that
the string starts with the expected prefix.
This is a reprise of dbbfbe7 but this time for the child_process
module.
Don't scan the whole string for a "NODE_CLUSTER_" substring, just check
that the string starts with the expected prefix. The linear scan was
causing a noticeable (but unsurprising) slowdown on messages with a
large .cmd string property.
Buffer.byteLength() works only for string inputs. Thus, when connection
has pending Buffer to write, it should just use it's length instead of
throwing exception.