使用Selenium在Chrome中进行全页面截图

20

据我所知,这在selenium+PhantomJS中是可能的。 - Andersson
4
是的,但我需要Chrome驱动程序。 - Majid Laissi
如果你正在使用Java,似乎有人已经创建了一个很好的库来完成这样的事情。也许可以试试这个 https://github.com/yandex-qatools/ashot/ - st0ve
@stewartm 那个库使用 JavaScript 滚动页面并拍摄许多截图。我在我的代码中采用了类似的方法,但当有浮动窗口时效果不佳。 - Majid Laissi
关于 Ashot -- 它可以滚动,因此浮动标题会重复出现。 - JohnP2
3个回答

38

自Chrome v59以来,使用Selenium可以拍摄整个网页截图。Chrome驱动程序新增了两个端点以直接调用DevTools API:

/session/:sessionId/chromium/send_command_and_get_result
/session/:sessionId/chromium/send_command

Selenium API并没有实现这些命令,因此您必须使用基础执行器直接发送它们。虽然不太直观,但至少可以产生与 DevTools 完全相同的结果。

以下是一个使用 Python 在本地或远程实例上工作的示例:

from selenium import webdriver
import json, base64

capabilities = {
  'browserName': 'chrome',
  'chromeOptions':  {
    'useAutomationExtension': False,
    'args': ['--disable-infobars']
  }
}

driver = webdriver.Chrome(desired_capabilities=capabilities)
driver.get("https://stackoverflow.com/questions")

png = chrome_takeFullScreenshot(driver)

with open(r"C:\downloads\screenshot.png", 'wb') as f:
  f.write(png)

以及获取整个页面截图的代码:

def chrome_takeFullScreenshot(driver) :

  def send(cmd, params):
    resource = "/session/%s/chromium/send_command_and_get_result" % driver.session_id
    url = driver.command_executor._url + resource
    body = json.dumps({'cmd':cmd, 'params': params})
    response = driver.command_executor._request('POST', url, body)
    return response.get('value')

  def evaluate(script):
    response = send('Runtime.evaluate', {'returnByValue': True, 'expression': script})
    return response['result']['value']

  metrics = evaluate( \
    "({" + \
      "width: Math.max(window.innerWidth, document.body.scrollWidth, document.documentElement.scrollWidth)|0," + \
      "height: Math.max(innerHeight, document.body.scrollHeight, document.documentElement.scrollHeight)|0," + \
      "deviceScaleFactor: window.devicePixelRatio || 1," + \
      "mobile: typeof window.orientation !== 'undefined'" + \
    "})")
  send('Emulation.setDeviceMetricsOverride', metrics)
  screenshot = send('Page.captureScreenshot', {'format': 'png', 'fromSurface': True})
  send('Emulation.clearDeviceMetricsOverride', {})

  return base64.b64decode(screenshot['data'])

使用Java:

public static void main(String[] args) throws Exception {

    ChromeOptions options = new ChromeOptions();
    options.setExperimentalOption("useAutomationExtension", false);
    options.addArguments("disable-infobars");

    ChromeDriverEx driver = new ChromeDriverEx(options);

    driver.get("https://stackoverflow.com/questions");
    File file = driver.getFullScreenshotAs(OutputType.FILE);
}
import java.lang.reflect.Method;
import java.util.Map;
import com.google.common.collect.ImmutableMap;
import org.openqa.selenium.OutputType;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeDriverService;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.remote.CommandInfo;
import org.openqa.selenium.remote.HttpCommandExecutor;
import org.openqa.selenium.remote.http.HttpMethod;


public class ChromeDriverEx extends ChromeDriver {

    public ChromeDriverEx() throws Exception {
        this(new ChromeOptions());
    }

    public ChromeDriverEx(ChromeOptions options) throws Exception {
        this(ChromeDriverService.createDefaultService(), options);
    }

    public ChromeDriverEx(ChromeDriverService service, ChromeOptions options) throws Exception {
        super(service, options);
        CommandInfo cmd = new CommandInfo("/session/:sessionId/chromium/send_command_and_get_result", HttpMethod.POST);
        Method defineCommand = HttpCommandExecutor.class.getDeclaredMethod("defineCommand", String.class, CommandInfo.class);
        defineCommand.setAccessible(true);
        defineCommand.invoke(super.getCommandExecutor(), "sendCommand", cmd);
    }

    public <X> X getFullScreenshotAs(OutputType<X> outputType) throws Exception {
        Object metrics = sendEvaluate(
            "({" +
            "width: Math.max(window.innerWidth,document.body.scrollWidth,document.documentElement.scrollWidth)|0," +
            "height: Math.max(window.innerHeight,document.body.scrollHeight,document.documentElement.scrollHeight)|0," +
            "deviceScaleFactor: window.devicePixelRatio || 1," +
            "mobile: typeof window.orientation !== 'undefined'" +
            "})");
        sendCommand("Emulation.setDeviceMetricsOverride", metrics);
        Object result = sendCommand("Page.captureScreenshot", ImmutableMap.of("format", "png", "fromSurface", true));
        sendCommand("Emulation.clearDeviceMetricsOverride", ImmutableMap.of());
        String base64EncodedPng = (String)((Map<String, ?>)result).get("data");
        return outputType.convertFromBase64Png(base64EncodedPng);
    }

    protected Object sendCommand(String cmd, Object params) {
        return execute("sendCommand", ImmutableMap.of("cmd", cmd, "params", params)).getValue();
    }

    protected Object sendEvaluate(String script) {
        Object response = sendCommand("Runtime.evaluate", ImmutableMap.of("returnByValue", true, "expression", script));
        Object result = ((Map<String, ?>)response).get("result");
        return ((Map<String, ?>)result).get("value");
    }
}

1
在Java中如何实现? - Majid Laissi
2
尝试扩展HttpCommandExecutor并调用defineCommand以使用新的端点。然后,使用扩展的HttpCommandExecutor实例化一个新的RemoteWebDriver。 - Florent B.
"Emulation.resetViewport" 在 Chrome 61 上似乎无法正常工作(而在 Chrome 60 上可以)。有其他替代方法吗? - Majid Laissi
命令“Emulation.forceViewport”和“Emulation.resetViewport”在API中已不再存在,而Emulation.setVisibleSize现已过时。它们被“Emulation.setDeviceMetricsOverride”和“Emulation.clearDeviceMetricsOverride”所取代。似乎不再可能冻结视口的大小。因此,如果内容的大小根据视口的大小设置,则屏幕截图可能会在底部被截断。我还添加了Java代码。 - Florent B.
我知道我不能评论表示感谢,但该死的,我找不到其他的方法来表达我的感激之情。非常感谢你的回答,希望你能在它被删除之前看到它。 - Marthinus Bosman
显示剩余2条评论

4
要在Java中使用Selenium Webdriver实现这一点需要花费一些功夫。正如Florent B.所示,我们需要更改默认ChromeDriver使用的某些类才能实现这一点。首先,我们需要创建一个新的DriverCommandExecutor,该执行器添加了新的Chrome命令:
import com.google.common.collect.ImmutableMap;
import org.openqa.selenium.remote.CommandInfo;
import org.openqa.selenium.remote.http.HttpMethod;
import org.openqa.selenium.remote.service.DriverCommandExecutor;
import org.openqa.selenium.remote.service.DriverService;

public class MyChromeDriverCommandExecutor extends DriverCommandExecutor {
    private static final ImmutableMap<String, CommandInfo> CHROME_COMMAND_NAME_TO_URL;

    public MyChromeDriverCommandExecutor(DriverService service) {
        super(service, CHROME_COMMAND_NAME_TO_URL);
    }

    static {
        CHROME_COMMAND_NAME_TO_URL = ImmutableMap.of("launchApp", new CommandInfo("/session/:sessionId/chromium/launch_app", HttpMethod.POST)
        , "sendCommandWithResult", new CommandInfo("/session/:sessionId/chromium/send_command_and_get_result", HttpMethod.POST)
        );
    }
}

接下来我们需要创建一个新的ChromeDriver类,然后使用这个东西。我们需要创建这个类,因为原始类没有构造函数让我们替换命令执行器...所以新类变成了:

import com.google.common.collect.ImmutableMap;
import org.openqa.selenium.Capabilities;
import org.openqa.selenium.WebDriverException;
import org.openqa.selenium.chrome.ChromeDriverService;
import org.openqa.selenium.html5.LocalStorage;
import org.openqa.selenium.html5.Location;
import org.openqa.selenium.html5.LocationContext;
import org.openqa.selenium.html5.SessionStorage;
import org.openqa.selenium.html5.WebStorage;
import org.openqa.selenium.interactions.HasTouchScreen;
import org.openqa.selenium.interactions.TouchScreen;
import org.openqa.selenium.mobile.NetworkConnection;
import org.openqa.selenium.remote.FileDetector;
import org.openqa.selenium.remote.RemoteTouchScreen;
import org.openqa.selenium.remote.RemoteWebDriver;
import org.openqa.selenium.remote.html5.RemoteLocationContext;
import org.openqa.selenium.remote.html5.RemoteWebStorage;
import org.openqa.selenium.remote.mobile.RemoteNetworkConnection;

public class MyChromeDriver  extends RemoteWebDriver implements LocationContext, WebStorage, HasTouchScreen, NetworkConnection {
    private RemoteLocationContext locationContext;
    private RemoteWebStorage webStorage;
    private TouchScreen touchScreen;
    private RemoteNetworkConnection networkConnection;

    //public MyChromeDriver() {
    //  this(ChromeDriverService.createDefaultService(), new ChromeOptions());
    //}
    //
    //public MyChromeDriver(ChromeDriverService service) {
    //  this(service, new ChromeOptions());
    //}

    public MyChromeDriver(Capabilities capabilities) {
        this(ChromeDriverService.createDefaultService(), capabilities);
    }

    //public MyChromeDriver(ChromeOptions options) {
    //  this(ChromeDriverService.createDefaultService(), options);
    //}

    public MyChromeDriver(ChromeDriverService service, Capabilities capabilities) {
        super(new MyChromeDriverCommandExecutor(service), capabilities);
        this.locationContext = new RemoteLocationContext(this.getExecuteMethod());
        this.webStorage = new RemoteWebStorage(this.getExecuteMethod());
        this.touchScreen = new RemoteTouchScreen(this.getExecuteMethod());
        this.networkConnection = new RemoteNetworkConnection(this.getExecuteMethod());
    }

    @Override
    public void setFileDetector(FileDetector detector) {
        throw new WebDriverException("Setting the file detector only works on remote webdriver instances obtained via RemoteWebDriver");
    }

    @Override
    public LocalStorage getLocalStorage() {
        return this.webStorage.getLocalStorage();
    }

    @Override
    public SessionStorage getSessionStorage() {
        return this.webStorage.getSessionStorage();
    }

    @Override
    public Location location() {
        return this.locationContext.location();
    }

    @Override
    public void setLocation(Location location) {
        this.locationContext.setLocation(location);
    }

    @Override
    public TouchScreen getTouch() {
        return this.touchScreen;
    }

    @Override
    public ConnectionType getNetworkConnection() {
        return this.networkConnection.getNetworkConnection();
    }

    @Override
    public ConnectionType setNetworkConnection(ConnectionType type) {
        return this.networkConnection.setNetworkConnection(type);
    }

    public void launchApp(String id) {
        this.execute("launchApp", ImmutableMap.of("id", id));
    }
}

这主要是原始类的副本,但有一些构造函数被禁用了(因为一些需要的代码是包私有的)。如果您需要这些构造函数,则必须将类放置在org.openqa.selenium.chrome包中。

通过这些更改,您可以使用Selenium API来调用所需的代码,如Florent B.所示,但现在是Java语言:

import com.google.common.collect.ImmutableMap;
import org.openqa.selenium.remote.Command;
import org.openqa.selenium.remote.Response;

import javax.annotation.Nonnull;
import javax.annotation.Nullable;
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

public class ChromeExtender {
    @Nonnull
    private MyChromeDriver m_wd;

    public ChromeExtender(@Nonnull MyChromeDriver wd) {
        m_wd = wd;
    }

    public void takeScreenshot(@Nonnull File output) throws Exception {
        Object visibleSize = evaluate("({x:0,y:0,width:window.innerWidth,height:window.innerHeight})");
        Long visibleW = jsonValue(visibleSize, "result.value.width", Long.class);
        Long visibleH = jsonValue(visibleSize, "result.value.height", Long.class);

        Object contentSize = send("Page.getLayoutMetrics", new HashMap<>());
        Long cw = jsonValue(contentSize, "contentSize.width", Long.class);
        Long ch = jsonValue(contentSize, "contentSize.height", Long.class);

        /*
         * In chrome 61, delivered one day after I wrote this comment, the method forceViewport was removed.
         * I commented it out here with the if(false), and hopefully wrote a working alternative in the else 8-/
         */
        if(false) {
            send("Emulation.setVisibleSize", ImmutableMap.of("width", cw, "height", ch));
            send("Emulation.forceViewport", ImmutableMap.of("x", Long.valueOf(0), "y", Long.valueOf(0), "scale", Long.valueOf(1)));
        } else {
            send("Emulation.setDeviceMetricsOverride",
                ImmutableMap.of("width", cw, "height", ch, "deviceScaleFactor", Long.valueOf(1), "mobile", Boolean.FALSE, "fitWindow", Boolean.FALSE)
            );
            send("Emulation.setVisibleSize", ImmutableMap.of("width", cw, "height", ch));
        }

        Object value = send("Page.captureScreenshot", ImmutableMap.of("format", "png", "fromSurface", Boolean.TRUE));

        // Since chrome 61 this call has disappeared too; it does not seem to be necessary anymore with the new code.
        // send("Emulation.resetViewport", ImmutableMap.of()); 
        send("Emulation.setVisibleSize", ImmutableMap.of("x", Long.valueOf(0), "y", Long.valueOf(0), "width", visibleW, "height", visibleH));

        String image = jsonValue(value, "data", String.class);
        byte[] bytes = Base64.getDecoder().decode(image);

        try(FileOutputStream fos = new FileOutputStream(output)) {
            fos.write(bytes);
        }
    }

    @Nonnull
    private Object evaluate(@Nonnull String script) throws IOException {
        Map<String, Object> param = new HashMap<>();
        param.put("returnByValue", Boolean.TRUE);
        param.put("expression", script);

        return send("Runtime.evaluate", param);
    }

    @Nonnull
    private Object send(@Nonnull String cmd, @Nonnull Map<String, Object> params) throws IOException {
        Map<String, Object> exe = ImmutableMap.of("cmd", cmd, "params", params);
        Command xc = new Command(m_wd.getSessionId(), "sendCommandWithResult", exe);
        Response response = m_wd.getCommandExecutor().execute(xc);

        Object value = response.getValue();
        if(response.getStatus() == null || response.getStatus().intValue() != 0) {
            //System.out.println("resp: " + response);
            throw new MyChromeDriverException("Command '" + cmd + "' failed: " + value);
        }
        if(null == value)
            throw new MyChromeDriverException("Null response value to command '" + cmd + "'");
        //System.out.println("resp: " + value);
        return value;
    }

    @Nullable
    static private <T> T jsonValue(@Nonnull Object map, @Nonnull String path, @Nonnull Class<T> type) {
        String[] segs = path.split("\\.");
        Object current = map;
        for(String name: segs) {
            Map<String, Object> cm = (Map<String, Object>) current;
            Object o = cm.get(name);
            if(null == o)
                return null;
            current = o;
        }
        return (T) current;
    }
}

这样可以按照指定的命令进行操作,并创建一个包含png格式图像的文件。当然,你也可以直接使用ImageIO.read()读取字节并创建BufferedImage。

太棒了!!!你能否添加ChromeExtender类的导入?我猜测了其中一些。 - SiKing
嗨@SiKing,我添加了导入。但是要注意,Chrome 61(我今天得到的)删除了在我的原始示例中使用的forceViewport调用。我已经在takeScreenshot()的代码中添加了一种替代方法,似乎对我有用。 - fjalvingh
"Emulation.resetViewport" 在 Chrome 61 上似乎无法正常工作(在 Chrome 60 上可以)。有没有其他替代方法? - Majid Laissi
似乎不再需要那个调用了。我刚刚将其删除,看起来没问题了? - fjalvingh
结果证明,在截图之前仅使用Emulation.setVisibleSize,并在之后将其重置为先前的值即可解决问题。至少在Chrome 61中是这样的。感谢您的帮助。 - Majid Laissi

0
在Selenium 4中,FirefoxDriver提供了一个getFullPageScreenshotAs方法,可以处理垂直和水平滚动,以及固定元素(例如导航栏)。ChromeDriver可能会在以后的版本中实现此方法。
System.setProperty("webdriver.gecko.driver", "path/to/geckodriver");
final FirefoxOptions options = new FirefoxOptions();
// set options...
final FirefoxDriver driver = new FirefoxDriver(options);
driver.get("https://stackoverflow.com/");
File fullScreenshotFile = driver.getFullPageScreenshotAs(OutputType.FILE);
// File will be deleted once the JVM exits, so you should copy it

对于 ChromeDriver,可以使用 selenium-shutterbug 库。

Shutterbug.shootPage(driver, Capture.FULL, true).save();
// Take screenshot of the whole page using Chrome DevTools

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接