How to resolve symbolic links in a shell script

Question

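Given an absolute or relative path on a Unix-like system, I would like to determine the full path of the target after resolving any intermediate symlinks. Bonus points for also resolving ~username notation at the same time.

If the target is a directory, it might be possible to chdir() into the directory and then call getcwd(), but I really want to do this from a shell script rather than writing a C helper. Unfortunately, shells tend to hide the existence of symlinks from the user (this is bash on OS X):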

$ ls -ld foo bar
drwxr-xr-x   2 greg  greg  68 Aug 11 22:36 bar
lrwxr-xr-x   1 greg  greg   3 Aug 11 22:36 foo -> bar
$ cd foo
$ pwd
/Users/greg/tmp/foo
$

What I want is a function resolve() such that, when executed from the tmp directory in the example above, resolve("foo") returns "/Users/greg/tmp/bar".
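
For the directory case mentioned in the question, a minimal sketch (assuming a POSIX shell; the name resolve_dir is made up for illustration) can do the chdir()/getcwd() step in a subshell, because pwd -P prints the physical working directory with symlinks resolved:

```sh
# Hypothetical helper: resolve a path that refers to a directory.
# The ( ... ) body runs in a subshell, so the caller's working
# directory is not changed.
resolve_dir() (
    CDPATH=                          # keep cd from searching CDPATH
    cd -- "$1" 2>/dev/null || exit 1 # fails for files and broken links
    pwd -P                           # physical path, symlinks resolved
)
```

Called as resolve_dir foo from the tmp directory of the example, this would be expected to print the path of bar. It does not handle symlinks that point to regular files, which is what the answers below address.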

- Greg Hewgill

21 Answers

- mklement0 · Answer 1

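Update:

As of at least macOS 13 (Ventura), macOS now also supports readlink -f for following a symlink to its ultimate target (FreeBSD has supported it for a long time), so the need for a POSIX-compliant solution has greatly diminished.
However, a POSIX-compliant solution, such as the one below, may still be of interest:
- if you need to support older macOS versions and other Unix-like platforms that do not implement readlink -f,
- and/or you need consistent behavior across platforms:
  - Notably, readlink -f on macOS / FreeBSD is not equivalent to the GNU implementation (as found on Linux distros): -f on macOS / FreeBSD corresponds to GNU's -e, which requires the ultimate target to exist (GNU's -f allows the last path segment of the ultimate target to be missing).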
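
To see which of the two behaviors described above a given system has, a small probe along these lines may help. This is only a sketch: the function name and the three labels are invented here, and it assumes mktemp -d is available.

```sh
# Hypothetical probe: report how the local `readlink -f` treats a path
# whose last component does not exist.
probe_readlink_f() (
    dir=$(mktemp -d) || exit 1
    if ! readlink -f "$dir" >/dev/null 2>&1; then
        result=unsupported   # this readlink has no usable -f option
    elif readlink -f "$dir/missing" >/dev/null 2>&1; then
        result=lenient       # missing last component is accepted
    else
        result=strict        # the final target must exist
    fi
    rm -rf "$dir"
    printf '%s\n' "$result"
)

probe_readlink_f
```

Going by the description above, GNU readlink would be expected to report lenient, and an implementation whose -f matches GNU's -e would report strict.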

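Below is a fully POSIX-compliant script / function that is therefore cross-platform (it also works on macOS, whose readlink still does not support -f as of 10.12 (Sierra)); it uses only POSIX shell language features and only POSIX-compliant utility calls.

It is a portable implementation of GNU's readlink -e (the stricter version of readlink -f).

You can run the script with sh, or source the function into your shell.

For example, inside a script you can use it as follows to get the running script's true directory of origin, with symlinks resolved: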

trueScriptDir=$(dirname -- "$(rreadlink "$0")")

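rreadlink script / function definition:

This code was gratefully adapted from this answer.

I have also created a bash-based stand-alone utility version here, which you can install with:

npm install rreadlink -g, if you have Node.js installed.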

#!/bin/sh

# SYNOPSIS
#   rreadlink <fileOrDirPath>
# DESCRIPTION
#   Resolves <fileOrDirPath> to its ultimate target, if it is a symlink, and
#   prints its canonical path. If it is not a symlink, its own canonical path
#   is printed.
#   A broken symlink causes an error that reports the non-existent target.
# LIMITATIONS
#   - Won't work with filenames with embedded newlines or filenames containing
#     the string ' -> '.
# COMPATIBILITY
#   This is a fully POSIX-compliant implementation of what GNU readlink's
#   -e option does.
# EXAMPLE
#   In a shell script, use the following to get that script's true directory of origin:
#     trueScriptDir=$(dirname -- "$(rreadlink "$0")")
rreadlink() ( # Execute the function in a *subshell* to localize variables and the effect of `cd`.

  target=$1 fname= targetDir= CDPATH=

  # Try to make the execution environment as predictable as possible:
  # All commands below are invoked via `command`, so we must make sure that
  # `command` itself is not redefined as an alias or shell function.
  # (Note that command is too inconsistent across shells, so we don't use it.)
  # `command` is a *builtin* in bash, dash, ksh, zsh, and some platforms do not
  # even have an external utility version of it (e.g, Ubuntu).
  # `command` bypasses aliases and shell functions and also finds builtins
  # in bash, dash, and ksh. In zsh, option POSIX_BUILTINS must be turned on for
  # that to happen.
  { \unalias command; \unset -f command; } >/dev/null 2>&1
  [ -n "$ZSH_VERSION" ] && options[POSIX_BUILTINS]=on # make zsh find *builtins* with `command` too.

  while :; do # Resolve potential symlinks until the ultimate target is found.
      [ -L "$target" ] || [ -e "$target" ] || { command printf '%s\n' "ERROR: '$target' does not exist." >&2; return 1; }
      command cd "$(command dirname -- "$target")" # Change to target dir; necessary for correct resolution of target path.
      fname=$(command basename -- "$target") # Extract filename.
      [ "$fname" = '/' ] && fname='' # !! curiously, `basename /` returns '/'
      if [ -L "$fname" ]; then
        # Extract [next] target path, which may be defined
        # *relative* to the symlink's own directory.
        # Note: We parse `ls -l` output to find the symlink target
        #       which is the only POSIX-compliant, albeit somewhat fragile, way.
        target=$(command ls -l "$fname")
        target=${target#* -> }
        continue # Resolve [next] symlink target.
      fi
      break # Ultimate target reached.
  done
  targetDir=$(command pwd -P) # Get canonical dir. path
  # Output the ultimate target's canonical path.
  # Note that we manually resolve paths ending in /. and /.. to make sure we have a normalized path.
  if [ "$fname" = '.' ]; then
    command printf '%s\n' "${targetDir%/}"
  elif [ "$fname" = '..' ]; then
    # Caveat: something like /var/.. will resolve to /private (assuming /var@ -> /private/var), i.e. the '..' is applied
    # AFTER canonicalization.
    command printf '%s\n' "$(command dirname -- "${targetDir}")"
  else
    command printf '%s\n' "${targetDir%/}/$fname"
  fi
)

rreadlink "$@"
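
The core trick in the function above is reading a symlink's target out of ls -l output with a parameter expansion. Isolated into a tiny helper (the name link_target is made up for this sketch), that step looks roughly like this:

```sh
# Hypothetical helper: print the immediate target of a symlink using
# only POSIX utilities. Like rreadlink, it breaks on names that contain
# ' -> ' or newlines.
link_target() (
    [ -L "$1" ] || return 1
    out=$(ls -l -- "$1") || return 1   # "... foo -> bar"
    printf '%s\n' "${out#* -> }"       # strip everything through the first ' -> '
)
```

For the foo -> bar link from the question, link_target foo would print bar. It resolves a single level only; rreadlink repeats this step in a loop until the result is no longer a symlink.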

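A tangent on security:

jarno, referring to the function making sure that the builtin command is not shadowed by an alias or shell function of the same name, asks in a comment:

What if unalias or unset and [ are set as aliases or shell functions?

The motivation behind rreadlink ensuring that command has its original meaning is to use it to bypass the (benign) convenience aliases and functions that are often used to shadow standard commands in interactive shells, such as redefining ls to include favorite options.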
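
As a small illustration of that motivation, here is a sketch in which a shell function shadows a standard utility and command is used to reach the real one. The wrapper itself is invented for the demo and only tags its output so the difference is visible:

```sh
# A benign "convenience" wrapper of the kind found in interactive shells
# (hypothetical; it prefixes the output of the real utility).
basename() { printf 'wrapped:%s\n' "$(command basename "$@")"; }

plain=$(command basename /usr/local/bin)   # bypasses the function
wrapped=$(basename /usr/local/bin)         # goes through the function

printf '%s\n' "$plain" "$wrapped"

unset -f basename                          # remove the wrapper again
```

The first value comes straight from the basename utility, the second from the wrapper, which is why rreadlink prefixes its utility calls with command.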

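I think it is safe to say that, unless you are dealing with an untrusted, malicious environment, there is no need to worry about unalias or unset (or, for that matter, while, do, ...) being redefined.

The function has to rely on something keeping its original meaning and behavior; there is no way around that.
That POSIX-like shells allow redefinition of builtins and even language keywords is inherently a security risk (and writing paranoid code is hard in general).

To address your concerns specifically:

The function relies on unalias and unset having their original meaning. Having them redefined as shell functions in a way that alters their behavior would be a problem; redefinition as an alias is not necessarily a concern, because quoting (part of) the command name (e.g., \unalias) bypasses aliases.

However, quoting is not an option for shell keywords (while, for, if, do, ...). Although shell keywords take precedence over shell functions, in bash and zsh aliases have the highest precedence, so to guard against redefinition of shell keywords you must run unalias with their names (in non-interactive bash shells, such as scripts, aliases are not expanded by default; they are expanded only if shopt -s expand_aliases is called explicitly).

To ensure that unalias, as a builtin, has its original meaning, you must first use \unset on it, which requires that unset have its original meaning:

unset is a shell builtin, so to ensure that it is invoked as such, you would have to make sure that it has not itself been redefined as a function. While you can bypass an alias form with quoting, you cannot bypass a shell-function form: catch 22.

Thus, unless you can rely on unset to have its original meaning, from what I can tell there is no guaranteed way to defend against all malicious redefinitions.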

- nakwa · Answer 2

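If you cannot use pwd (e.g. when calling a script from a different location), use realpath (with or without dirname):

$(dirname "$(realpath "$PATH_TO_BE_RESOLVED")")

This works whether the script is called through (multiple) symlinks or directly, from any location.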
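
Wrapped in a function, the same idea might look like the sketch below. The name resolved_dir_of is invented here, and the sketch assumes a realpath utility is installed (GNU coreutils ships one; older macOS releases do not).

```sh
# Hypothetical helper: print the directory that contains the fully
# resolved target of "$1". Assumes a `realpath` utility is on PATH.
resolved_dir_of() (
    resolved=$(realpath -- "$1") || return 1
    dirname -- "$resolved"
)
```

Inside a script, script_dir=$(resolved_dir_of "$0") would then give the directory of the real script file, even when the script was started through a symlink.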

- solidsnack · Answer 3

Here is a utility for resolving symlinks in Bash that works whether the link points to a directory or a non-directory:

function readlinks {(
  set -o errexit -o nounset
  declare n=0 limit=1024 link="$1"

  # If it's a directory, just skip all this.
  if cd "$link" 2>/dev/null
  then
    pwd -P
    return 0
  fi

  # Resolve until we are out of links (or recurse too deep).
  while [[ -L $link ]] && [[ $n -lt $limit ]]
  do
    cd "$(dirname -- "$link")"
    n=$((n + 1))
    link="$(readlink -- "${link##*/}")"
  done
  cd "$(dirname -- "$link")"

  if [[ $n -ge $limit ]]
  then
    echo "Recursion limit ($limit) exceeded." >&2
    return 2
  fi

  printf '%s/%s\n' "$(pwd -P)" "${link##*/}"
)}

Note that all the cd and set operations take place in a subshell.
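
The effect of the subshell can be checked with a stripped-down function of the same shape. This is a hypothetical demo, not part of the answer's code:

```sh
# Hypothetical demo: the ( ... ) body runs in a subshell, so neither the
# cd nor the shell option leaks out to the caller.
physical_root() (
    set -o errexit
    cd /
    pwd -P
)

before=$PWD
physical_root >/dev/null
[ "$PWD" = "$before" ] && echo "caller's directory unchanged"
```

With a { ... } body instead, the cd would move the calling shell to / and errexit would stay switched on after the call.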

- Dave · Answer 4

function realpath {
    local r=$1; local t=$(readlink $r)
    while [ $t ]; do
        r=$(cd $(dirname $r) && cd $(dirname $t) && pwd -P)/$(basename $t)
        t=$(readlink $r)
    done
    echo $r
}

#example usage
SCRIPT_PARENT_DIR=$(dirname $(realpath "$0"))/..

- Dave

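This will break with (a) any path containing whitespace or shell metacharacters and (b) broken symlinks (not an issue if you only want the running script's parent path, assuming (a) does not apply). - mklement0

If you are concerned about whitespace, use quotes. - Dave

Please do, although that won't address (b). - mklement0

Please show me an example of b) where this fails. By definition, a broken symlink points to a directory entry that does not exist. The point of this script is to resolve symlinks in the other direction. If the symlink were broken, you would not be able to execute the script. This example is intended to demonstrate resolving the currently executing script. - Dave

"intended to demonstrate resolving the currently executing script" - indeed, which narrows the scope of the question to the part you have chosen to focus on; that is fine, as long as you say so. Since you did not, I stated it in my comment. Please fix the quoting issue, which is a problem irrespective of the scope of the answer. - mklement0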
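
For reference, a quoted variant of the function from this answer could look like the following sketch. The name realpath_quoted is made up to avoid clashing with the realpath utility, and a subshell body is used here instead of local. It addresses point (a) from the comments only: broken and circular symlinks are still not handled.

```sh
# Quoted variant of the realpath function above (hypothetical name).
# The ( ... ) body keeps the variables and cd calls away from the caller.
realpath_quoted() (
    r=$1
    t=$(readlink -- "$r")
    while [ -n "$t" ]; do
        # Resolve the link's directory first, then the target's directory
        # relative to it, and re-attach the target's file name.
        r=$(cd -- "$(dirname -- "$r")" && cd -- "$(dirname -- "$t")" && pwd -P)/$(basename -- "$t")
        t=$(readlink -- "$r")
    done
    printf '%s\n' "$r"
)
```

As in the original, an argument that is not a symlink is printed unchanged rather than being made absolute.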

- keen · Answer 5

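Since I have run into this many times over the years, and this time around I needed a portable pure-Bash version that I could use on OSX and Linux, I went ahead and wrote one:

The latest version lives here:

https://github.com/keen99/shell-functions/tree/master/resolve_path

For the sake of SO, here is the current version (I feel it is well tested... but I am open to feedback!)

It might not be difficult to make it work in plain Bourne shell (sh), but I did not try... I like $FUNCNAME too much. :)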

#!/bin/bash

resolve_path() {
    #I'm bash only, please!
    # usage:  resolve_path <a file or directory> 
    # follows symlinks and relative paths, returns a full real path
    #
    local owd="$PWD"
    #echo "$FUNCNAME for $1" >&2
    local opath="$1"
    local npath=""
    local obase=$(basename "$opath")
    local odir=$(dirname "$opath")
    if [[ -L "$opath" ]]
    then
    #it's a link.
    #file or directory, we want to cd into its dir
        cd "$odir"
    #then extract where the link points.
        npath=$(readlink "$obase")
        #have to -L BEFORE we -f, because -f includes -L :(
        if [[ -L $npath ]]
         then
        #the link points to another symlink, so go follow that.
            resolve_path "$npath"
            #and finish out early, we're done.
            return $?
            #done
        elif [[ -f $npath ]]
        #the link points to a file.
         then
            #get the dir for the new file
            nbase=$(basename "$npath")
            npath=$(dirname "$npath")
            cd "$npath"
            ndir=$(pwd -P)
            retval=0
            #done
        elif [[ -d $npath ]]
         then
        #the link points to a directory.
            cd "$npath"
            ndir=$(pwd -P)
            retval=0
            #done
        else
            echo "$FUNCNAME: ERROR: unknown condition inside link!!" >&2
            echo "opath [[ $opath ]]" >&2
            echo "npath [[ $npath ]]" >&2
            return 1
        fi
    else
        if ! [[ -e "$opath" ]]
         then
            echo "$FUNCNAME: $opath: No such file or directory" >&2
            return 1
            #and break early
        elif [[ -d "$opath" ]]
         then 
            cd "$opath"
            ndir=$(pwd -P)
            retval=0
            #done
        elif [[ -f "$opath" ]]
         then
            cd "$odir"
            ndir=$(pwd -P)
            nbase=$(basename "$opath")
            retval=0
            #done
        else
            echo "$FUNCNAME: ERROR: unknown condition outside link!!" >&2
            echo "opath [[ $opath ]]" >&2
            return 1
        fi
    fi
    #now assemble our output
    echo -n "$ndir"
    if [[ "x${nbase:=}" != "x" ]]
     then
        echo "/$nbase"
    else 
        echo
    fi
    #now return to where we were
    cd "$owd"
    return $retval
}

Here is a classic example, thanks to brew:

%% ls -l `which mvn`
lrwxr-xr-x  1 draistrick  502  29 Dec 17 10:50 /usr/local/bin/mvn@ -> ../Cellar/maven/3.2.3/bin/mvn

Using this function, it returns the "real" path:

%% cat test.sh
#!/bin/bash
. resolve_path.inc
echo
echo "relative symlinked path:"
which mvn
echo
echo "and the real path:"
resolve_path `which mvn`


%% test.sh

relative symlinked path:
/usr/local/bin/mvn

and the real path:
/usr/local/Cellar/maven/3.2.3/libexec/bin/mvn

- diyism · Answer 6

Try this:

cd "$(dirname "$([ -L "$0" ] && readlink -f "$0" || echo "$0")")"

- Clemens Tolboom · Answer 7

To work around the Mac incompatibility, I came up with:

echo `php -r "echo realpath('foo');"`

Not great, but it works across operating systems.

- Clemens Tolboom

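Python 2.6+ is available on more end-user systems than php, so python -c "from os import path; print(path.realpath('${SYMLINK_PATH}'));" would probably make more sense. Still, by the time you need to call Python from a shell script, you should probably just use Python and save yourself the headache of cross-platform shell scripting. - Jonathan Baldwin

You don't need anything more than the sh built-ins readlink, dirname and basename. - Dave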

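@Dave: dirname, basename, and readlink are external utilities, not shell built-ins; dirname and basename are part of POSIX, readlink is not. - mklement0

@mklement0 - You're quite right. They are provided by CoreUtils or an equivalent. I shouldn't visit SO after 1am. The essence of my comment is that neither PHP nor any other language interpreter is required beyond what is installed in a base system. I have used the script I provided on this page on every Linux variant since 1997 and on MacOS X since 2006 without error. The OP did not ask for a POSIX solution. Their specific environment was Mac OS X. - Dave

@Dave: Yes, it is possible to do it with stock utilities, but it is also hard to do so (as evidenced by the shortcomings of your script). If OS X is truly the focus, then this answer is perfectly fine, and much simpler, given that php comes with OS X. However, even though the body of the question mentions OS X, it is not tagged as such, and it has become clear that people on various platforms come here for answers, so it is worth pointing out what is platform-specific / non-POSIX. - mklement0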
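
If the Python route from the comments is taken, one way to keep it robust is to pass the path as an argument rather than splicing it into the Python source. The sketch below assumes a python3 executable on PATH (the comment above predates Python 3 becoming the default), and the function name is invented:

```sh
# Hypothetical wrapper around the Python suggestion from the comments.
# Assumes `python3` is on PATH. The path travels in sys.argv, so quotes
# or other special characters in it cannot break the one-liner.
py_realpath() {
    python3 -c 'import os, sys; print(os.path.realpath(sys.argv[1]))' "$1"
}
```

For the question's example, py_realpath foo run from the tmp directory would be expected to print the full path of bar.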

- dxlr8r · Answer 8

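My 2 cents. This function is POSIX compliant, and both the source and the target can contain ->. However, I could not get it to work with filenames that contain newlines or tabs, as ls in general has problems with those.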

resolve_symlink() {
  test -L "$1" && ls -l "$1" | awk -v SYMLINK="$1" '{ SL=(SYMLINK)" -> "; i=index($0, SL); s=substr($0, i+length(SL)); print s }'
}

I believe the solution here is the file command, with a custom magic file that outputs only the target of the given symlink.
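
The idea behind the awk call in resolve_symlink, anchoring on the link's own name followed by " -> ", can also be expressed with a parameter expansion. The following is a sketch with an invented function name, not the author's code, and it shares the limitation with newlines because it still parses ls output:

```sh
# Hypothetical variant of resolve_symlink that anchors on the link's own
# name, as the awk version does, but uses a parameter expansion.
symlink_target() (
    [ -L "$1" ] || return 1
    out=$(ls -l -- "$1") || return 1
    target=${out#*"$1 -> "}   # drop everything through "<name> -> "
    printf '%s\n' "$target"
)
```

Because the pattern includes the quoted link name, a link called 'a -> b' that points to 'c -> d' should still yield 'c -> d'.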

- Andi · Answer 9

This is the best solution, tested in Bash 3.2.57:

# Read a path (similar to `readlink`) recursively, until the physical path without any links (like `cd -P`) is found.
# Accepts any existing path, prints its physical path and exits `0`, exits `1` if some contained links don't exist.
# Motivation: `${BASH_SOURCE[0]}` often contains links; using it directly to extract your project's path may fail.
#
# Example: Safely `source` a file located relative to the current script
#
#     source "$(dirname "$(rreadlink "${BASH_SOURCE[0]}")")/relative/script.sh"
#Inspiration: https://dev59.com/5nVD5IYBdhLWcg3wWaNh#51089005
rreadlink () {
    declare p="$1" d l
    while :; do
        d="$(cd -P "$(dirname "$p")" && pwd)" || return $? #absolute path without symlinks
        p="$d/$(basename "$p")"
        if [ -h "$p" ]; then
            l="$(readlink "$p")" || break

            #A link must be resolved from its fully resolved parent dir.
            d="$(cd "$d" && cd -P "$(dirname "$l")" && pwd)" || return $?
            p="$d/$(basename "$l")"
        else
            break
        fi
    done
    printf '%s\n' "$p"
}

- Arunas Bartisius · Answer 10

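My answer is here: Bash: how to get real path of a symlink?

But in short, this is very handy in scripts:

script_home=$(dirname "$(realpath "$0")")
echo "Original script home: $script_home"

These are part of GNU coreutils and are suitable for use on Linux systems.

To test everything, we put a symlink into /home/test2/, make a few additional changes, and run/call it from the root directory: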

/$ /home/test2/symlink
/home/test
Original script home: /home/test

where

Original script is: /home/test/realscript.sh
Called script is: /home/test2/symlink