-
-
Notifications
You must be signed in to change notification settings - Fork 33.5k
[WIP] fs: add FD object to manage file descriptors #18109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Note: None of the existing fs APIs are modified to use this new object. To use the new object with, for instance, |
src/node_file.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Take note: this uses a sync close. If we decide that it's necessary to use an async close, we can switch to that.
There is an open question about what to do with the result. We can use the new env->SetImmediate() to schedule a callback out to javascript land after the close attempt in order to report any errors or to give the equivalent to fs.close(fd, callback), but given that the primary use case is for this to close on garbage collection, I did not consider adding that mechanism to be critical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is an error, will the file descriptor be leaked? Can we assure that we have no leak before swallowing?
This mechanism implies that we are cleaning up after ourselves whatever happens.
IMHO it should go to ‘uncaughtException’ if it is failing, and he OS fd is still there.
doc/api/fs.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be fd.close()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
|
Tbh, I don’t like the idea of exposing this as a public API. The JS engine is not aware about file descriptors being a (comparatively) scarce resource, so the garbage collector might not act on the need for cleaning these up; closing the FD once its usage is finished is still the right thing to do. I would definitely approve of this feature being enabled for |
doc/api/fs.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be something like {[fs.FD][#fs_class_fs_fd]} or {[fs.FD][]} with a bottom reference?
doc/api/fs.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fs.open() -> fs.openFD()
doc/api/fs.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fs.openSync() -> fs.openFDSync()
doc/api/fs.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fs.open() -> fs.openFD()
doc/api/fs.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fs.open( -> fs.openFD(
doc/api/fs.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fs.open( -> fs.openFD(
doc/api/fs.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fs.open() -> fs.openFD()? Is this note still applicable here?
doc/api/fs.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fs.open() -> fs.openFD()?
doc/api/fs.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same as above?
doc/api/fs.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fs.open() -> fs.openFD() with a new link?
|
I debugged too many leaked file descriptor situations because Node did not have this feature. I think fd manipulation is somehing very few are doing directly. For those that want to do that, this would simplify things considerably. For everybody else, it changes nothing. I’m +1. Can you also use it within streams in a separate commit in the PR? I think it will help making a case for it. Can we add something similar for net as well? Can it be made generic? |
I’ve thought about these quite a bit in the past, and the thing is, that does not work (for readable streams). These objects need to be GC roots, because unfortunately there is no reason why any For example, this code is perfectly valid (and common) releasing the stream for garbage collection would break it – and worst of all, in a non-deterministic way: fs.createReadStream('file.txt')
.on('data', ondata).on('end', end);
If that is the goal here, I would propose:
I think I could get on board with that. |
@addaleax the problem for file descriptor leaking in streams comes when you do I think the stream reference is kept alive by the closure created within |
There isn’t always a |
In the example you made, with In any case, if there is not a read scheduled and no one is holding a reference, then a file descriptor will be leaked. |
|
Garbage collection is non-deterministic. Basing deterministic resource management on something that isn't is a terrible idea of the you-don't-need-this-even-if-you-think-you-do variety. If the streams API has problems, they should be fixed, but not like this. |
This is not about basing deterministic resource management on gc. This is about preventing silent file descriptor leaks. |
|
A tool for detecting file descriptor leaks is a worthy goal but we can do much better than this PR. Just of the top off my head:
Best of all: can be done as a third-party module, doesn't need to be built-in. |
|
@addaleax said:
There is already a defined Web API for a
We could but I'm not a huge fan of that. The existing fs API, for all of it's warts, is familiar and well established. The promises based version should really try not to diverge from it too much in order to make code migration less painful. Eventually we might get there, and this would give us a foundation to work from, but I'd generally prefer not. What I can do right now in order to make that easier, however, is have the public API expose a pure JS object with the internal native @bnoordhuis said:
As @mcollina indicates, this is not about replacing or eliminating explicitly calling I like the idea of emitting a warning if the FWIW, I knew this bit was going to be controversial which is why I separated it out into it's own PR. There's no need to rush this part and I'd like to get it right. |
Okay, fair point – my line of thinking was, file descriptors are not necessarily a concept that JS programmers are familiar with, so we might want to provide a more explicit identifier than
We could make the methods on the new class Promises-only. 😉 But I’m also okay with saying “this is userland material” for that. |
|
Ok, so here's what I'm looking at for changes to this:
How does that look? |
|
That, to me, sounds pretty reasonable. |
|
Will adding a dual callback/promise system for |
|
@mcollina ... there won't be a dual callback. The |
|
Ok, I have points 1 and 2 implemented. I do not yet have point 3 implemented (failing with an unhandled error if closing during garbage collection). I will work on that next, but please take a look in the meantime |
doc/api/fs.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this return a Promise instead of returning a callback? the rest of the code is promise-based. It should be either 100% callbacks or 100% promises, having a callback here but fd.close().then() is not consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm going back and forth on this and thinking that rather than introduce fs.openFD() and fs.openFDSync() we just introduce a Promise returning fs.promises.openFD() and keep this isolated to use of the Promises API for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And if that is the approach to take on this, then I'll just close this PR and bundle this commit in with the one that introduces the initial broader set of promisified functions
|
Ok, updated the implementation on this to hit all three of the points with one exception: if close fails during the garbage collection, a This does not yet address @mcollina's last point about the ping @mcollina , @addaleax, @bnoordhuis |
src/node_file.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: if this stuff becomes more common, it may make sense to drop a couple of new utility functions in Environment... specifically... env->Resolve(promise, value) and env->Reject(promise, value)
(and various reject variants like, env->RejectUVException(promise, err, syscall, ...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am LGTM on all but the fs.openFD(callback) API.
If it's callback-based, it should all be callback based. If it's promise-based, it should all be promise based.
I propose we keep this into fs.promise, and then we add an equivalent callback-based concept if it is needed.
@jasnell would it be better to open up a separate repo for this? So we can review the individual changes?
doc/api/fs.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’d still go by my earlier point that this should ideally not be called FD … File might already be taken but I’d still kinda prefer something that would be more intuitive to people who haven’t done much UNIX/systems programming…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm open to suggestions :) I'm not good at the naming of things lol. Once had a cat named Cat.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nodejs/collaborators Ideas? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FileDescriptor, FileHandle
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's go with FileHandle
src/node_file.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The last three lines are no-ops (assuming the CHECK(persistent().IsEmpty()) does not crash)
src/node_file.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’d prefer the uv_ namespace to remain reserved for libuv, otherwise it gets hard to tell which definitions are coming from where … also, it seems like you could just pass the message string as the data argument for the immediate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the warning, sure, but not raising the UVException, which requires the errcode
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right. Sorry!
src/node_file.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not tearing down the process, it could still invoke uncaughtException handlers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm will have to test and verify that. I don't know if the uncaughtException handler will get triggered in this case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok yeah, uncaughtException does not catch the FatalException here. The process just dies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it prints the error at least, then dies :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$ ./node --expose-gc ~/tmp/test.js
undefined:0
Error: EBADF: Closing file descriptor 12 on garbage collection failed, close
src/node_file.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should inherit from ReqWrap<uv_fs_t>, like e.g. FSReqWrap does (or just inherit from the latter to begin with, that may or may not be easier)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, considered that but it makes it a bit more heavyweight than it really needs to be. Shouldn't be a problem tho.
src/node_file.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is async context propagated here? I can’t see anything? (same for resolve)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know less than I need to on that part. What needs to be added here to propagate that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jasnell Once CloseReq is an ReqWrap, you’ll want a line like InternalCallbackScope callback_scope(this); before the resolve calls. That’s all. :)
(The idea is that the CloseReq is an AsyncWrap then, so it tracks state automatically, and you only need to enter a scope to make async_hooks listeners aware that things are happening in the CloseReq’s async context.)
src/node_file.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you call this e.g. AfterOpenFD?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, no problem
src/node_file.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use the non-deprecated overloads of Int32Value(), or just .As<Int32>()->Value()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. Will make that change in Open() also
|
@mcollina ... understood on the JS api.. my plan is to get the basic implementation right for the C++ class here in this PR, but I'm going to move the actual commit into the FS promises PR along with everything else and not land this PR as is. |
|
Also, I don't think a separate repo is necessary. The changes are actually quite discreet are are no where near as massive as something like the http2 impl. One more PR should have the majority FS promise APIs implemented. |
Currently, if any env->SetImmediate callback uses any of the
env->Throw{Whatever} variants, the exception is lost and
nothing happens. This commit catches those with a v8::TryCatch
and does a FatalException to tear down the process.
In preparation for providinng a promisified fs API, introduce a FD wrapper class that automatically closes the file descriptor on garbage collection, helping to ensure that the fd will not leak. The object should still be explicitly closed, and a process warning will be emitted if it is closed on gc (a fatal exception will occur if closing during gc fails). The promisified fs API will use these objects instead of the numeric file descriptors in order to ensure that fd's are closed appropriately following promise resolve or reject operations.
|
@addaleax ... updated :-) |
| } | ||
| } else { | ||
| // Already closed. Just reject the promise immediately | ||
| resolver->Reject(context, UVException(isolate, UV_EBADF, "close")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@addaleax ... do we also need an InternalCallbackScope here? This is not using the CloseReq
|
Closing in favor of #18297 |
In preparation for providinng a promisified fs API, introduce a FD wrapper class that automatically closes
the file descriptor on garbage collection, helping to ensure that the fd will not leak.
The promisified fs API will use these objects instead of the numeric file descriptors in order to ensure that fd's are closed appropriately following promise resolve or reject operations.
This was extracted from the WIP fs promises PR: #17739
/cc @addaleax @mcollina
Checklist
make -j4 test(UNIX), orvcbuild test(Windows) passesAffected core subsystem(s)
fs