- 
                Notifications
    You must be signed in to change notification settings 
- Fork 13.9k
Implement fs::realpath #11734
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement fs::realpath #11734
Conversation
        
          
                src/libstd/io/fs.rs
              
                Outdated
          
        
      There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be a constant defined in libstd/num or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose this is indeed a terrible name for this constant. I meant for this to be the maximum number of symlinks which will get resolved. I'll update this.
        
          
                src/libstd/io/fs.rs
              
                Outdated
          
        
      There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not going to behave properly for Windows paths. I don't know if Windows even has symlinks, but if so, the root of the filesystem is not going to be /. You can use original.root_path() to get the root of the path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's also worth noting that os::make_absolute may not actually return a fully absolute path on Windows. If the given path is a volume-relative path, e.g. C:foo, and the cwd is not from that volume, the resulting path will be the volume-relative path rather than an absolute path. There's been some discussion in #11579 as to whether volume-relative paths really are valid, but for the moment they are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little confused. It looks like root_path returns option, so there's some failure modes to consider. On unix, I presume it'll never return None because make_absolute will work, but I don't understand the windows case at all.
If you have a volume-relative path, what's it relative to, if not the current directory? It could reference any file on the filesystem, but we opened the path from the OS and read the file so there's some path that it points to somewhere, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alexcrichton The output of os::make_absolute().root_path() should always be a Some value. I say "should" because this assumes that getcwd() returns an absolute path (which it damn well should). The result of absolute_path.join(any_path) is a path that will return a Some value from .root_path() (and that's how os::make_absolute() works). Given that, it's perfectly fine to say let mut result = original.root_path().unwrap().
Volume-relative paths are not going to be usable with this function. Technically, they're relative to the current working directory for the specified volume. Classically, Windows maintained a separate working directory per-volume, and this has to do with how they supported old DOS programs that didn't have the notion of directories. There's a bunch of history in #11579 about this if you're curious.
Assuming we keep volume-relative paths (and I'm leaning towards doing so, though I still haven't thought through everything), I think the appropraite behavior here is something like
let original = os::make_absolute(original);
if !original.is_absolute() { return original.clone(); } // for Windows C:foo paths
let mut result = original.root_path().unwrap();
// ...This way a volume-relative path will be returned unchanged.
| r=me if the noted issues can be fixed. | 
| r? @kballard This seems tricky enough I'd like another pair of eyes before auto r+'ing | 
This can be used to strip all symlinks from a pathname.
| As far as I can tell, this is broken, because if a symlink points to a path that contains a symlink as an intermediate component, that will remain in the final path. That is, realpath("/a") where /a is a symlink to "/b/c", /b is a symlink to "d": realpath will return "/b/c", but it should return "/d/c". Also, you can't just come up with an invented magic number for the link limit: it must be exactly the same as the one used by the operating system, which you get by  Is implementing this by hand in Rust the best approach, as opposed to calling uv_realpath in librustuv and in libnative realpath on POSIX and something like CreateFile+GetFinalPathNameByHandle on Windows? If so, I'd suggest reading the source code of realpath in glibc, copying what it does, and then testing that it works on Windows with both junctions and symbolic links (and if not, doing whatever is necessary there). | 
| Looks like I just learned that  No need to reimplement in rust due to how tricky it is. | 
| @bill-myers That is a very good collection of points. I didn't even realize the symloop limit was configurable. Curiously, my manpage for sysconf() does not list _SC_SYMLOOP_MAX (this is OS X), although unistd.h does inded have it. If libuv provides realpath functionality, using that seems like a good idea for libgreen. For libnative in POSIX, I agree that realpath() is a good solution. For Windows, I would suggest finding existing code that solves this problem (maybe even see what uv_realpath() does for Windows), because creating a file seems like a bad idea (or even opening one; resolving a path should work even if one does not have permission to read the file at the end of that path, as long as the ancestor directories are all readable). | 
| Actually, it appears that the manpages for  | 
| FWIW, on OS X, the realpath source (note: apparently covered by the 3-clause BSD license, if you have any worries about reading the source) uses  It also uses  Given all this, it seems to me that we should use  | 
| The manpage also says "The resolved_path == NULL feature, not standardized in POSIX.1-2001, but standardized in POSIX.1-2008, allows this design problem to be avoided". The function is usable as long as either PATH_MAX is defined, or you specify NULL for  Not too sure on Windows, but I'd suggest to try to avoid writing your own at all costs. Windows has short vs long file names, UNC paths, NT vs Win32 paths, two kinds of symbolic links (junctions and "symbolic links") plus pluggable reparse points with arbitrary behavior, and at least one among junctions and symbolic links, when it is on a network path, will refer to the path on the remote machine even if it is absolute. | 
| I'll defer to your judgements, sounds like you guys know more about this area than I do. | 
| Just filed an issue to track the fact that we need to find a solution for this. | 
| Thanks @kballard | 
When calculating the sysroot, it's more accurate to use realpath() rather than just one readlink() to account for any intermediate symlinks that the rustc binary resolves itself to. For rpath, realpath() is necessary because the rpath must dictate a relative rpath from the destination back to the originally linked library, which works more robustly if there are no symlinks involved. Concretely, any binary generated on OSX into $TMPDIR requires an absolute rpath because the temporary directory is behind a symlink with one layer of indirection. This symlink causes all relative rpaths to fail to resolve. cc rust-lang#11734 cc rust-lang#11857
When calculating the sysroot, it's more accurate to use realpath() rather than just one readlink() to account for any intermediate symlinks that the rustc binary resolves itself to. For rpath, realpath() is necessary because the rpath must dictate a relative rpath from the destination back to the originally linked library, which works more robustly if there are no symlinks involved. Concretely, any binary generated on OSX into $TMPDIR requires an absolute rpath because the temporary directory is behind a symlink with one layer of indirection. This symlink causes all relative rpaths to fail to resolve. cc rust-lang#11734 cc rust-lang#11857
This can be used to strip all symlinks from a pathname. This is then used to
close #3632.