The copyright and license notice is already in the LICENSE file. There
is no justifiable reason to also require that it be included in every
file, since the individual files are not individually distributed except
as part of the entire package.
This prevents the following sort of thing from being confusing:
```javascript
stream.on('data', function() { console.error('got data'); });
stream.pause(); // stop reading
// turns out no data is available
stream.push(null);
// Hand the stream to someone else, who does stuff...
setTimeout(function() {
// too late! 'end' is already emitted!
stream.on('end', function() { console.error('got end'); });
});
```
With this change, the `end` event is not emitted until you call `read()`
*past* the EOF null. So, a paused stream will not swallow the `end`
event and emit it before you `resume()` the stream.
Closes#5860
In streams2, there is an "old mode" for compatibility. Once switched
into this mode, there is no going back.
With this change, there is a "flowing mode" and a "paused mode". If you
add a data listener, then this will start the flow of data. However,
hitting the `pause()` method will switch *back* into a non-flowing mode,
where the `read()` method will pull data out.
Every time `read()` returns a data chunk, it also emits a `data` event.
In this way, a passive data listener can be added, and the stream passed
off to some other reader, for use with progress bars and the like.
There is no API change beyond this added flexibility.
In some cases, the http CONNECT/Upgrade API is unshifting an empty
bodyHead buffer onto the socket.
Normally, stream.unshift(chunk) does not set state.reading=false.
However, this check was not being done for the case when the chunk was
empty (either `''` or `Buffer(0)`), and as a result, it was causing the
socket to think that a read had completed, and to stop providing data.
This bug is not limited to http or web sockets, but rather would affect
any parser that unshifts data back onto the source stream without being
very careful to never unshift an empty chunk. Since the intent of
unshift is to *not* change the state.reading property, this is a bug.
Fixes#5557FixesLearnBoost/socket.io#1242
Pretty much everything assumes strings to be utf-8, but crypto
traditionally used binary strings, so we need to keep the default
that way until most users get off of that pattern.
If there is an encoding, and we do 'stream.push(chunk, enc)', and the
encoding argument matches the stated encoding, then we're converting from
a string, to a buffer, and then back to a string. Of course, this is a
completely pointless bit of work, so it's best to avoid it when we know
that we can do so safely.
Fix#5272
The consumption of a readable stream is a dance with 3 partners.
1. The specific stream Author (A)
2. The Stream Base class (B), and
3. The Consumer of the stream (C)
When B calls the _read() method that A implements, it sets a 'reading'
flag, so that parallel calls to _read() can be avoided. When A calls
stream.push(), B knows that it's safe to start calling _read() again.
If the consumer C is some kind of parser that wants in some cases to
pass the source stream off to some other party, but not before "putting
back" some bit of previously consumed data (as in the case of Node's
websocket http upgrade implementation). So, stream.unshift() will
generally *never* be called by A, but *only* called by C.
Prior to this patch, stream.unshift() *also* unset the state.reading
flag, meaning that C could indicate the end of a read, and B would
dutifully fire off another _read() call to A. This is inappropriate.
In the case of fs streams, and other variably-laggy streams that don't
tolerate overlapped _read() calls, this causes big problems.
Also, calling stream.shift() after the 'end' event did not raise any
kind of error, but would cause very strange behavior indeed. Calling it
after the EOF chunk was seen, but before the 'end' event was fired would
also cause weird behavior, and could lead to data being lost, since it
would not emit another 'readable' event.
This change makes it so that:
1. stream.unshift() does *not* set state.reading = false
2. stream.unshift() is allowed up until the 'end' event.
3. unshifting onto a EOF-encountered and zero-length (but not yet
end-emitted) stream will defer the 'end' event until the new data is
consumed.
4. pushing onto a EOF-encountered stream is now an error.
So, if you read(), you have that single tick to safely unshift() data
back into the stream, even if the null chunk was pushed, and the length
was 0.
In cases where a stream may have data added to the read queue before the
user adds a 'readable' event, there is never any indication that it's
time to start reading.
True, there's already data there, which the user would get if they
checked However, as we use 'readable' event listening as the signal to
start the flow of data with a read(0) call internally, we ought to
trigger the same effect (ie, emitting a 'readable' event) even if the
'readable' listener is added after the first emission.
To avoid confusing weirdness, only the *first* 'readable' event listener
is granted this privileged status. After we've started the flow (or,
alerted the consumer that the flow has started) we don't need to start
it again. At that point, it's the consumer's responsibility to consume
the stream.
Closes#5141
Also, set paused=false *before* calling resume(). Otherwise,
there's an edge case where an immediately-emitted chunk might make
it call pause() again incorrectly.
This solves the problem of calling `readable.pipe(writable)` after the
readable stream has already emitted 'end', as often is the case when
writing simple HTTP proxies.
The spirit of streams2 is that things will work properly, even if you
don't set them up right away on the first tick.
This approach breaks down, however, because pipe()ing from an ended
readable will just do nothing. No more data will ever arrive, and the
writable will hang open forever never being ended.
However, that does not solve the case of adding a `on('end')` listener
after the stream has received the EOF chunk, if it was the first chunk
received (and thus, length was 0, and 'end' got emitted). So, with
this, we defer the 'end' event emission until the read() function is
called.
Also, in pipe(), if the source has emitted 'end' already, we call the
cleanup/onend function on nextTick. Piping from an already-ended stream
is thus the same as piping from a stream that is in the process of
ending.
Updates many tests that were relying on 'end' coming immediately, even
though they never read() from the req.
Fix#4942
In the function that pre-emptively fills the Readable queue, it relies
on a recursion through:
stream.push(chunk) ->
maybeReadMore(stream, state) ->
if (not reading more and < hwm) stream.read(0) ->
stream._read() ->
stream.push(chunk) -> repeat.
Since this was only calling read() a single time, and then relying on a
future nextTick to collect more data, it ends up causing a nextTick
recursion error (and potentially a RangeError, even) if you have a very
high highWaterMark, and are getting very small chunks pushed
synchronously in _read (as happens with TLS, or many simple test
streams).
This change implements a new approach, so that read(0) is called
repeatedly as long as it is effective (that is, the length keeps
increasing), and thus quickly fills up the buffer for streams such as
these, without any stacks overflowing.
Now that highWaterMark increases when there are large reads, this
greatly reduces the number of calls necessary to _read(size), assuming
that _read actually respects the size argument.
When a readable listener is added, call read(0) so that data will flow in, up to
the high water mark.
Otherwise, it's somewhat confusing that you have to listen for readable,
and ALSO call read() (when it will certainly return null) just to get some
data out of the stream.
See: #4720
Ability to return just the length of listeners for a given type, using
EventEmitter.listenerCount(emitter, event). This will be a lot cheaper
than creating a copy of the listeners array just to check its length.
This makes it so that `stream.push(chunk)` is the only way to signal the
end of reading, removing the confusing disparity between the
callback-style _read method, and the fact that most real-world streams
do not have a 1:1 corollation between the "please give me data" event,
and the actual arrival of a chunk of data.
It is still possible, of course, to implement a `CallbackReadable` on
top of this. Simply provide a method like this as the callback:
function readCallback(er, chunk) {
if (er)
stream.emit('error', er);
else
stream.push(chunk);
}
However, *only* fs streams actually would behave in this way, so it
makes not a lot of sense to make TCP, TLS, HTTP, and all the rest have
to bend into this uncomfortable paradigm.
A primary motivation of this is to make the onread function more
inline-friendly, but also to make it more easy to explore not having
onread at all, in favor of always using push() to signal the end of
reading.
The Readable and Writable classes will nextTick certain things
if in sync mode. The sync flag gets unset after a call to _read
or _write. However, most of these behaviors should also be
deferred until nextTick if no reads have been made (for example,
the automatic '_read up to hwm' behavior on Readable.push(chunk))
Set the sync flag to true in the constructor, so that it will not
trigger an immediate 'readable' event, call to _read, before the
user has had a chance to set a _read method implementation.
There are cases where a push() call would return true, even though
the thing being pushed was in fact way way larger than the high
water mark, simply because the 'needReadable' was already set, and
would not get unset until nextTick.
In some cases, this could lead to an infinite loop of pushing data
into the buffer, never getting to the 'readable' event which would
unset the needReadable flag.
Fix by splitting up the emitReadable function, so that it always
sets the flag on this tick, even if it defers until nextTick to
actually emit the event.
Also, if we're not ending or already in the process of reading, it
now calls read(0) if we're below the high water mark. Thus, the
highWaterMark value is the intended amount to buffer up to, and it
is smarter about hitting the target.