Does sbt always recompile the whole project in CI, even with caching enabled?

11

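I'm trying to use sbt in a CI pipeline, with the following flow:

  1. Compile the tests
  2. Cache ~/.sbt and ~/.ivy2/cache
  3. Cache all target directories in the project

The subsequent steps are:

  1. Restore ~/.sbt and ~/.ivy2/cache
  2. Restore the full project, including the previously generated target directories containing the .class files, along with identical source code (it should be the same checkout)
  3. Run the tests via sbt test

Every time, the entire project is recompiled. I'd like to understand or debug why, since nothing has changed since the last compile (well, nothing should have changed, so what makes sbt think something did?).

I'm currently using CircleCI with the docker executor. That means each step gets a fresh docker instance from the same image, but I would expect the caching to take care of that.

The relevant parts of .circleci/config.yml (if you don't use Circle, it should still be understandable; I've annotated the parts I can):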

---
version: 2

jobs:
  # compile and cache compilation
  test-compile:
    working_directory: /home/circleci/myteam/myproj
    docker:
      - image: myorg/myimage:sbt-1.2.8
    steps:
      # the directory to be persisted (cached/restored) to the next step
      - attach_workspace:
          at: /home/circleci/myteam
      # git pull to /home/circleci/myteam/myproj
      - checkout
      - restore_cache:
          # look for a pre-existing set of ~/.ivy2/cache, ~/.sbt dirs 
          # from a prior build
          keys:
            - sbt-artifacts-{{ checksum "project/build.properties"}}-{{ checksum "build.sbt" }}-{{ checksum "project/Dependencies.scala" }}-{{ checksum "project/plugins.sbt" }}-{{ .Branch }}
      - restore_cache:
          # look for pre-existing set of 'target' dirs from a prior build
          keys:
            - build-{{ checksum "project/build.properties"}}-{{ checksum "build.sbt" }}-{{ checksum "project/Dependencies.scala" }}-{{ checksum "project/plugins.sbt" }}-{{ .Branch }}
      - run:
          # the compile step
          working_directory: /home/circleci/myteam/myproj
          command: sbt test:compile
      # per: https://www.scala-sbt.org/1.0/docs/Travis-CI-with-sbt.html
      # Cleanup the cached directories to avoid unnecessary cache updates
      - run:
          working_directory: /home/circleci
          command: |
            rm -rf /home/circleci/.ivy2/.sbt.ivy.lock
            find /home/circleci/.ivy2/cache -name "ivydata-*.properties" -print -delete
            find /home/circleci/.sbt -name "*.lock" -print -delete
      - save_cache:
          # cache ~/.ivy2/cache and ~/.sbt for subsequent builds
          key: sbt-artifacts-{{ checksum "project/build.properties"}}-{{ checksum "build.sbt" }}-{{ checksum "project/Dependencies.scala" }}-{{ checksum "project/plugins.sbt" }}-{{ .Branch }}-{{ .Revision }}
          paths:
            - /home/circleci/.ivy2/cache
            - /home/circleci/.sbt
      - save_cache:
          # cache the `target` dirs for subsequent builds
          key: build-{{ checksum "project/build.properties"}}-{{ checksum "build.sbt" }}-{{ checksum "project/Dependencies.scala" }}-{{ checksum "project/plugins.sbt" }}-{{ .Branch }}-{{ .Revision }}
          paths:
            - /home/circleci/myteam/myproj/target
            - /home/circleci/myteam/myproj/project/target
            - /home/circleci/myteam/myproj/project/project/target
      # in circle, a 'workflow' undergoes several jobs, this first one 
      # is 'compile', the next will run the tests (see next 'job' section
      # 'test-run' below). 
      # 'persist to workspace' takes any files from this job and ensures 
      # they 'come with' the workspace to the next job in the workflow
      - persist_to_workspace:
          root: /home/circleci/myteam
          # bring the git checkout, including all target dirs
          paths:
            - myproj
      - persist_to_workspace:
          root: /home/circleci
          # bring the big stuff
          paths:
            - .ivy2/cache
            - .sbt

  # actually runs the tests compiled in the previous job
  test-run:
    environment:
      SBT_OPTS: -XX:+UseConcMarkSweepGC -XX:+UnlockDiagnosticVMOptions  -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -Duser.timezone=Etc/UTC -Duser.language=en -Duser.country=US
    docker:
      # run tests in the same image as before, but technically 
      # a different instance
      - image: myorg/myimage:sbt-1.2.8
    steps:
      # bring over all files 'persist_to_workspace' in the last job
      - attach_workspace:
          at: /home/circleci/myteam
      # restore ~/.sbt and ~/.ivy2/cache via `mv` from the workspace 
      # back to the home dir
      - run:
          working_directory: /home/circleci/myteam
          command: |
            [[ ! -d /home/circleci/.ivy2 ]] && mkdir /home/circleci/.ivy2

            for d in .ivy2/cache .sbt; do
              [[ -d "/home/circleci/$d" ]] && rm -rf "/home/circleci/$d"
              if [ -d "$d"  ]; then
                mv -v "$d" "/home/circleci/$d"
              else
                echo "$d does not exist" >&2
                ls -la . >&2
                exit 1
              fi
            done
      - run:
          # run the tests, already compiled
          # note: recompiles everything every time!
          working_directory: /home/circleci/myteam/myproj
          command: sbt test
          no_output_timeout: 3900s

workflows:
  version: 2
  build-and-test:
    jobs:
      - test-compile
      - test-run:
          requires:
            - test-compile

The output of the second stage typically looks like this:

#!/bin/bash -eo pipefail
sbt test

[info] Loading settings for project myproj-build from native-packager.sbt,plugins.sbt ...
[info] Loading project definition from /home/circleci/myorg/myproj/project
[info] Updating ProjectRef(uri("file:/home/circleci/myorg/myproj/project/"), "myproj-build")...
[info] Done updating.
[warn] There may be incompatibilities among your library dependencies; run 'evicted' to see detailed eviction warnings.
[info] Compiling 1 Scala source to /home/circleci/myorg/myproj/project/target/scala-2.12/sbt-1.0/classes ...
[info] Done compiling.
[info] Loading settings for project root from build.sbt ...
[info] Set current project to Piranha (in build file:/home/circleci/myorg/myproj/)
[info] Compiling 1026 Scala sources to /home/circleci/myorg/myproj/target/scala-2.12/classes ...

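How can I determine why it recompiles all the sources again, and how can I mitigate it?
I'm running sbt 1.2.8 with Scala 2.12.8 in a Linux container.

Update

I haven't solved the problem, but I want to share a workaround for the worst part of it.

Primary problem: separating "test compile" from "test run"
Secondary problem: faster builds without having to recompile everything on every push

I have no solution for the secondary problem. For the primary one:

I can run the ScalaTest runner from the CLI via scala -cp ... org.scalatest.tools.Runner instead of via sbt test, which avoids any attempt to recompile. The runner can operate on a directory of .class files.

Summary of changes:

  1. Update the docker container to include a Scala CLI install. (Unfortunately, I now need to keep these versions in sync.)
  2. Build stage: sbt test:compile 'inspect run' 'export test:fullClasspath' | tee >(grep -F '.jar' > ~test-classpath.txt)
    • Compiles, and records a copy-pasteable classpath string suitable for passing to scala -cp VALUE_HERE to run the tests
  3. Test stage: scala -cp "$(cat test-classpath.txt)" org.scalatest.tools.Runner -R target/scala-2.12/test-classes/ -u target/test-reports -oD
    • Runs ScalaTest via the runner, using the compiled .class files in target/scala-2.12/test-classes and the classpath reported by the build stage, and prints to stdout as well as to a reports directory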
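
Because the recorded classpath contains absolute paths from the build stage, it only works in the test stage if every jar and directory has been restored to the same location. A minimal sketch of a pre-flight check, assuming the classpath file holds a single ':'-separated string; the function name is hypothetical and not part of sbt or ScalaTest:

```shell
#!/bin/sh
# Print every classpath entry that does not exist on this machine.
# $1: file holding one ':'-separated classpath string (assumed layout).
missing_classpath_entries() {
  tr ':' '\n' < "$1" | while IFS= read -r entry || [ -n "$entry" ]; do
    [ -n "$entry" ] || continue               # skip empty segments
    [ -e "$entry" ] || printf '%s\n' "$entry" # report what is missing
  done
}

# Possible use in the test stage, before invoking the ScalaTest runner:
#   missing="$(missing_classpath_entries test-classpath.txt)"
#   [ -z "$missing" ] || { echo "missing entries: $missing" >&2; exit 1; }
```

Failing early with the list of missing entries is likely easier to diagnose than a ClassNotFoundException from the runner.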

I'm not fond of this approach and it has some problems of its own, but I wanted to share the workaround.

5 Answers

1

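If you are using an sbt version newer than 1.0.4, caching will not work as expected because the compiler always invalidates everything. This Zinc compiler issue has already been reported here: https://github.com/sbt/sbt/issues/4168

My suggestion is to downgrade the sbt version used for CI. Also check and verify whether the CI is changing the timestamps of the files in .sbt or .ivy2. If they are being changed, cache those directories separately by zipping and unzipping them.

I ran into the same problem on Bitbucket Pipelines CI and managed to solve it there.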
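
One way to verify whether the CI changes timestamps is to record them before the cache is saved and again after it is restored, then diff the two listings. A minimal sketch, assuming GNU find and stat (as found in typical Linux images); the function name and the output file names are hypothetical:

```shell
#!/bin/sh
# Print "<path> <mtime>" for every regular file under $1, sorted by path.
# GNU stat's %y includes the fractional seconds, so truncation to whole
# seconds shows up in a diff.
snapshot_mtimes() {
  ( cd "$1" && find . -type f -exec stat -c '%n %y' {} + | LC_ALL=C sort )
}

# Possible use:
#   snapshot_mtimes "$HOME/.ivy2/cache" > mtimes-before.txt   # before save
#   snapshot_mtimes "$HOME/.ivy2/cache" > mtimes-after.txt    # after restore
#   diff mtimes-before.txt mtimes-after.txt
```

The same function can be pointed at the project's target directories, since those are cached and restored too.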


0

I had the same problem. I gave up trying to get all the timestamps to match, and eventually found that I can use:

sbt 'set  Compile / compile / skip := true' 'test'

It's still not perfect: sourceGenerators and possibly some other things may still run, but it is definitely much better than nothing.


0

I'm also running into this with sbt 1.2.8, in GitLab jobs. Previously (with sbt 0.13), caching the target directories worked fine.

I'm now trying to debug it manually by setting:

logLevel := Level.Debug,
incOptions := incOptions.value.withApiDebug(true).withRelationsDebug(true),

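in my build. This should print the reasons for invalidation. It produces so much output that I haven't yet pinned down the exact conditions under which the problem I see in CI reproduces.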
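
To make that volume of output manageable, the full log can be kept in a file while only the lines that mention invalidation are pulled out. A minimal sketch; the keyword is a guess at the wording of Zinc's debug log, which may differ between sbt versions, and the function and file names are hypothetical:

```shell
#!/bin/sh
# Keep only the lines from stdin that mention invalidation.
# "|| true" stops a no-match result from failing a "-eo pipefail" shell,
# which is how the CircleCI step shown in the question runs.
invalidation_lines() {
  grep -i 'invalidat' || true
}

# Possible use, with the debug settings above enabled in the build:
#   sbt test:compile 2>&1 | tee sbt-debug.log | invalidation_lines > invalidation.log
```

If the filtered log turns out to be empty, the keyword probably needs adjusting against the full sbt-debug.log.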


1
Which exact sbt version did caching the target directory work with? I just tried 0.13.17 and it still recompiles everything regardless. - Ihor Kaharlichenko

0

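I had a similar problem with Travis builds, and I think the solution applies to CircleCI as well. The root cause is that the cache is stored as a tar file, in which file modification times have only one-second resolution. You can specify a format with sufficient resolution. For me, the solution was to create a small script, travis_tar.sh: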

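#!/bin/bash
# Forward all arguments to the original tar, forcing the POSIX (pax) format.
# "$@" is quoted so that arguments containing spaces are passed through intact.
/bin/tar-orig --format=posix "$@"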

Then replace the system tar with this script:

sudo mv /bin/tar /bin/tar-orig
sudo mv .travis/travis_tar.sh /bin/tar
sudo chmod +x /bin/tar

This can be done after the cache has been loaded, since the original system tar can extract POSIX-format tar files just fine.
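
Whether the archive format really affects timestamps on a given image can be checked with a small round trip. A minimal sketch, assuming GNU tar, touch, and stat; the function name and the placeholder file are hypothetical:

```shell
#!/bin/sh
# Print a file's mtime before and after a tar round trip, one per line.
# $1: tar archive format to test, for example posix or gnu.
roundtrip_mtime() (
  work="$(mktemp -d)"
  mkdir "$work/in" "$work/out"
  echo 'placeholder' > "$work/in/A.class"
  # Give the file a modification time with a fractional-second part.
  touch -d '2020-01-02 03:04:05.123456789' "$work/in/A.class"
  tar --format="$1" -cf "$work/cache.tar" -C "$work/in" A.class
  tar -xf "$work/cache.tar" -C "$work/out"
  stat -c '%y' "$work/in/A.class" "$work/out/A.class"
  rm -rf "$work"
)

# Possible use:
#   roundtrip_mtime posix
#   roundtrip_mtime gnu
```

With the posix format the two printed lines should match. If another format prints a second line with the fractional seconds dropped, that would be consistent with the one-second resolution described above, but it is worth confirming on the actual CI image.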


Content provided by Stack Overflow.