java.io.IOException: ensureRemaining: Only 0 bytes remaining, trying to read 1

4
I'm having some problems using custom classes in Giraph. I created VertexInput and Output formats, but I always get the following error:
java.io.IOException: ensureRemaining: Only * bytes remaining, trying to read *

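with different values in place of the "*".

This was tested on a single-node cluster.

The problem occurs when the vertexIterator calls next() and there are no more vertices left. The iterator is invoked from the flush method, but I don't understand why the next() method fails. Here are some logs and classes...

My log: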

15/09/08 00:52:21 INFO bsp.BspService: BspService: Connecting to ZooKeeper with job giraph_yarn_application_1441683854213_0001, 1 on localhost:22181
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:host.name=localhost
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:java.version=1.7.0_79
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-7-openjdk-amd64/jre
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:java.class.path=.:${CLASSPATH}:./**/
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/l$
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:os.version=3.13.0-62-generic
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:user.name=hduser
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:user.home=/home/hduser
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:user.dir=/app/hadoop/tmp/nm-local-dir/usercache/hduser/appcache/application_1441683854213_0001/container_1441683854213_0001_01_000003
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:22181 sessionTimeout=60000 watcher=org.apache.giraph.worker.BspServiceWorker@4256d3a0
15/09/08 00:52:21 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:22181. Will not attempt to authenticate using SASL (unknown error)
15/09/08 00:52:21 INFO zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:22181, initiating session
15/09/08 00:52:21 INFO zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:22181, sessionid = 0x14fab0de0bb0002, negotiated timeout = 40000
15/09/08 00:52:21 INFO bsp.BspService: process: Asynchronous connection complete.
15/09/08 00:52:21 INFO netty.NettyServer: NettyServer: Using execution group with 8 threads for requestFrameDecoder.
15/09/08 00:52:21 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
15/09/08 00:52:21 INFO netty.NettyServer: start: Started server communication server: localhost/127.0.0.1:30001 with up to 16 threads on bind attempt 0 with sendBufferSize = 32768 receiveBufferSize = 524288
15/09/08 00:52:21 INFO netty.NettyClient: NettyClient: Using execution handler with 8 threads after request-encoder.
15/09/08 00:52:21 INFO graph.GraphTaskManager: setup: Registering health of this worker...
15/09/08 00:52:21 INFO yarn.GiraphYarnTask: [STATUS: task-1] WORKER_ONLY starting...
15/09/08 00:52:22 INFO bsp.BspService: getJobState: Job state already exists (/_hadoopBsp/giraph_yarn_application_1441683854213_0001/_masterJobState)
15/09/08 00:52:22 INFO bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/giraph_yarn_application_1441683854213_0001/_applicationAttemptsDir already exists!
15/09/08 00:52:22 INFO bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/giraph_yarn_application_1441683854213_0001/_applicationAttemptsDir already exists!
15/09/08 00:52:22 INFO worker.BspServiceWorker: registerHealth: Created my health node for attempt=0, superstep=-1 with /_hadoopBsp/giraph_yarn_application_1441683854213_0001/_applicationAttemptsDir/0/_superstepD$
15/09/08 00:52:22 INFO netty.NettyServer: start: Using Netty without authentication.
15/09/08 00:52:22 INFO bsp.BspService: process: partitionAssignmentsReadyChanged (partitions are assigned)
15/09/08 00:52:22 INFO worker.BspServiceWorker: startSuperstep: Master(hostname=localhost, MRtaskID=0, port=30000)
15/09/08 00:52:22 INFO worker.BspServiceWorker: startSuperstep: Ready for computation on superstep -1 since worker selection and vertex range assignments are done in /_hadoopBsp/giraph_yarn_application_1441683854$
15/09/08 00:52:22 INFO yarn.GiraphYarnTask: [STATUS: task-1] startSuperstep: WORKER_ONLY - Attempt=0, Superstep=-1
15/09/08 00:52:22 INFO netty.NettyClient: Using Netty without authentication.
15/09/08 00:52:22 INFO netty.NettyClient: Using Netty without authentication.
15/09/08 00:52:22 INFO netty.NettyClient: connectAllAddresses: Successfully added 2 connections, (2 total connected) 0 failed, 0 failures total.
15/09/08 00:52:22 INFO netty.NettyServer: start: Using Netty without authentication.
15/09/08 00:52:22 INFO handler.RequestDecoder: decode: Server window metrics MBytes/sec received = 0, MBytesReceived = 0.0001, ave received req MBytes = 0.0001, secs waited = 1.44168435E9
15/09/08 00:52:22 INFO worker.BspServiceWorker: loadInputSplits: Using 1 thread(s), originally 1 threads(s) for 1 total splits.
15/09/08 00:52:22 INFO worker.InputSplitsHandler: reserveInputSplit: Reserved input split path /_hadoopBsp/giraph_yarn_application_1441683854213_0001/_vertexInputSplitDir/0, overall roughly 0.0% input splits rese$
15/09/08 00:52:22 INFO worker.InputSplitsCallable: getInputSplit: Reserved /_hadoopBsp/giraph_yarn_application_1441683854213_0001/_vertexInputSplitDir/0 from ZooKeeper and got input split 'hdfs://hdnode01:54310/u$
15/09/08 00:52:22 INFO worker.InputSplitsCallable: loadFromInputSplit: Finished loading /_hadoopBsp/giraph_yarn_application_1441683854213_0001/_vertexInputSplitDir/0 (v=6, e=10)
15/09/08 00:52:22 INFO worker.InputSplitsCallable: call: Loaded 1 input splits in 0.16241108 secs, (v=6, e=10) 36.94329 vertices/sec, 61.572155 edges/sec
15/09/08 00:52:22 ERROR utils.LogStacktraceCallable: Execution of callable failed

java.lang.IllegalStateException: next: IOException
        at org.apache.giraph.utils.VertexIterator.next(VertexIterator.java:101)
        at org.apache.giraph.partition.BasicPartition.addPartitionVertices(BasicPartition.java:99)
        at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:115)
        at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.doRequest(NettyWorkerClientRequestProcessor.java:466)
        at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.flush(NettyWorkerClientRequestProcessor.java:412)
        at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:241)
        at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
        at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: ensureRemaining: Only 0 bytes remaining, trying to read 1
        at org.apache.giraph.utils.UnsafeReads.ensureRemaining(UnsafeReads.java:77)
        at org.apache.giraph.utils.UnsafeArrayReads.readByte(UnsafeArrayReads.java:123)
        at org.apache.giraph.utils.UnsafeReads.readLine(UnsafeReads.java:100)
        at pruebas.TextAndDoubleComplexWritable.readFields(TextAndDoubleComplexWritable.java:37)
        at org.apache.giraph.utils.WritableUtils.reinitializeVertexFromDataInput(WritableUtils.java:540)
        at org.apache.giraph.utils.VertexIterator.next(VertexIterator.java:98)
        ... 11 more
15/09/08 00:52:22 ERROR worker.BspServiceWorker: unregisterHealth: Got failure, unregistering health on /_hadoopBsp/giraph_yarn_application_1441683854213_0001/_applicationAttemptsDir/0/_superstepDir/-1/_workerHea$
15/09/08 00:52:22 ERROR yarn.GiraphYarnTask: GiraphYarnTask threw a top-level exception, failing task
java.lang.RuntimeException: run: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@4bbf48f0
        at org.apache.giraph.yarn.GiraphYarnTask.run(GiraphYarnTask.java:104)
        at org.apache.giraph.yarn.GiraphYarnTask.main(GiraphYarnTask.java:183)
Caused by: java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@4bbf48f0
        at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:193)
        at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:151)
        at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:136)
        at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:99)
        at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:233)
        at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:316)
        at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:409)
        at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:629)
        at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284)
        at org.apache.giraph.yarn.GiraphYarnTask.run(GiraphYarnTask.java:92)
        ... 1 more
Caused by: java.util.concurrent.ExecutionException: java.lang.IllegalStateException: next: IOException
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:202)
        at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:312)
        at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:185)
        ... 10 more
Caused by: java.lang.IllegalStateException: next: IOException
        at org.apache.giraph.utils.VertexIterator.next(VertexIterator.java:101)
        at org.apache.giraph.partition.BasicPartition.addPartitionVertices(BasicPartition.java:99)
        at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:115)
        at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.doRequest(NettyWorkerClientRequestProcessor.java:466)
        at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.flush(NettyWorkerClientRequestProcessor.java:412)
        at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:241)
        at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
        at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: ensureRemaining: Only 0 bytes remaining, trying to read 1
        at org.apache.giraph.utils.UnsafeReads.ensureRemaining(UnsafeReads.java:77)
        at org.apache.giraph.utils.UnsafeArrayReads.readByte(UnsafeArrayReads.java:123)
        at org.apache.giraph.utils.UnsafeReads.readLine(UnsafeReads.java:100)
        at pruebas.TextAndDoubleComplexWritable.readFields(TextAndDoubleComplexWritable.java:37)
        at org.apache.giraph.utils.WritableUtils.reinitializeVertexFromDataInput(WritableUtils.java:540)
        at org.apache.giraph.utils.VertexIterator.next(VertexIterator.java:98)
        ... 11 more

My input format:

package pruebas;

import org.apache.giraph.edge.Edge;
import org.apache.giraph.edge.EdgeFactory;
import org.apache.giraph.io.formats.AdjacencyListTextVertexInputFormat;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

/**
 * @author hduser
 *
 */
public class IdTextWithComplexValueInputFormat
        extends
        AdjacencyListTextVertexInputFormat<Text, TextAndDoubleComplexWritable, DoubleWritable> {

    @Override
    public AdjacencyListTextVertexReader createVertexReader(InputSplit split,
            TaskAttemptContext context) {
        return new TextComplexValueDoubleAdjacencyListVertexReader();
    }

    protected class TextComplexValueDoubleAdjacencyListVertexReader extends
            AdjacencyListTextVertexReader {

        /**
         * Default constructor. Takes no arguments and delegates to the
         * superclass constructor.
         */
        public TextComplexValueDoubleAdjacencyListVertexReader() {
            super();
        }

        @Override
        public Text decodeId(String s) {
            return new Text(s);
        }

        @Override
        public TextAndDoubleComplexWritable decodeValue(String s) {
            TextAndDoubleComplexWritable valorComplejo = new TextAndDoubleComplexWritable();
            valorComplejo.setVertexData(Double.valueOf(s));
            valorComplejo.setIds_vertices_anteriores("");
            return valorComplejo;
        }

        @Override
        public Edge<Text, DoubleWritable> decodeEdge(String s1, String s2) {
            return EdgeFactory.create(new Text(s1),
                    new DoubleWritable(Double.valueOf(s2)));
        }
    }

}

TextAndDoubleComplexWritable:

package pruebas;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Writable;

public class TextAndDoubleComplexWritable implements Writable {

    private String idsVerticesAnteriores;

    private double vertexData;

    public TextAndDoubleComplexWritable() {
        super();
        this.idsVerticesAnteriores = "";
    }

    public TextAndDoubleComplexWritable(double vertexData) {
        super();
        this.vertexData = vertexData;
    }

    public TextAndDoubleComplexWritable(String ids_vertices_anteriores,
            double vertexData) {
        super();
        this.idsVerticesAnteriores = ids_vertices_anteriores;
        this.vertexData = vertexData;
    }

    public void write(DataOutput out) throws IOException {
        out.writeUTF(idsVerticesAnteriores);
    }

    public void readFields(DataInput in) throws IOException {
        idsVerticesAnteriores = in.readLine();
    }

    public String getIds_vertices_anteriores() {
        return idsVerticesAnteriores;
    }

    public void setIds_vertices_anteriores(String ids_vertices_anteriores) {
        this.idsVerticesAnteriores = ids_vertices_anteriores;
    }

    public double getVertexData() {
        return vertexData;
    }

    public void setVertexData(double vertexData) {
        this.vertexData = vertexData;
    }
}

My input file:

Portada 0.0     Sugerencias     1.0
Sugerencias     3.0     Portada 1.0

And I run it with the following command:
$HADOOP_HOME/bin/yarn jar $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.1.0-for-hadoop-2.4.0-jar-with-dependencies.jar org.apache.giraph.GiraphRunner lectura_de_grafo.BusquedaDeCaminosNavegacionalesWikiquote -vif pruebas.IdTextWithComplexValueInputFormat -vip /user/hduser/input/wiki-graph-chiquito.txt -op /user/hduser/output/caminosNavegacionales -w 2 -yh 250

Any help would be appreciated!


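Update: There was a problem with my input file. Giraph (or the example I was using) does not cope well with outgoing messages addressed to vertices that are not listed in the input.

But the problem persists. I have updated the file data in the original question.

Update 2: No OutputFormat is used, and the computation algorithm is never executed. I removed both to help clarify the problem.

Update 3, November 19, 2015: The problem is not in the input format; the input format works fine and reads the data completely. The problem lies in the TextAndDoubleComplexWritable class, which I have added to the original question to better explain the final solution (I have also added an answer).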

3 Answers

3

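The root cause of the exception here is org.apache.giraph.utils.UnsafeReads.ensureRemaining. Note that it is called from the Giraph utils.

The exception means that the reader insists on getting more input from the input stream, but the stream does not have that much input left (i.e., EOF has been reached).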
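
To make that concrete, here is a minimal sketch of a bounds-checked reader. It is a hypothetical stand-in, not Giraph's code: the class `BoundedReaderSketch` and its helper `tryReadLine` are invented for illustration and only mirror the call chain visible in the stack trace (readLine calls readByte, which calls ensureRemaining). The input `{0, 0}` is the two-byte length prefix that `DataOutput.writeUTF("")` produces.

```java
import java.io.IOException;

/** Hypothetical stand-in for a bounds-checked byte reader; NOT Giraph's actual class. */
public class BoundedReaderSketch {
    private final byte[] buf;
    private int pos = 0;

    public BoundedReaderSketch(byte[] buf) {
        this.buf = buf;
    }

    /** Throws if fewer than {@code required} bytes are left in the buffer. */
    private void ensureRemaining(int required) throws IOException {
        int remaining = buf.length - pos;
        if (remaining < required) {
            throw new IOException("ensureRemaining: Only " + remaining
                    + " bytes remaining, trying to read " + required);
        }
    }

    public byte readByte() throws IOException {
        ensureRemaining(1);
        return buf[pos++];
    }

    /** Reads bytes until '\n'; fails if the buffer ends before a newline is found. */
    public String readLine() throws IOException {
        StringBuilder sb = new StringBuilder();
        while (true) {
            byte b = readByte();
            if (b == '\n') {
                return sb.toString();
            }
            sb.append((char) b);
        }
    }

    /** Returns the line read, or the exception message if the read ran off the end. */
    public static String tryReadLine(byte[] data) {
        try {
            return new BoundedReaderSketch(data).readLine();
        } catch (IOException e) {
            return e.getMessage();
        }
    }
}
```

Under these assumptions, `tryReadLine(new byte[] {0, 0})` consumes both bytes looking for a newline and then fails with the same message as in the question, while a buffer that does contain `'\n'` is read normally.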


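Thanks for your answer, but if you read my whole question, the problem occurs when the next() method is called, and that method should stop iterating once the reader reaches EOF, right? But it doesn't stop. I don't know why, which is why I'm asking here! ;) - chomp
@ash, or anyone who upvoted this answer: it would be great if you could provide some additional information to help me solve the problem. I have updated my question with as much information as I can. - chomp
Did you solve this problem? Without knowing the input format Giraph expects, it is hard for me to help. - ash
Regarding the earlier comment (about stopping iteration at EOF): the reason it cannot stop cleanly at EOF is that it still expects more input at the actual EOF, and it aborts in that case (i.e., throws the exception). If it were not expecting more input, I would guess it should signal EOF instead of throwing (I would need to read the code more carefully to be sure how it behaves). - ash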
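
One way a reader can end up "expecting more input at EOF" is when the writer and the reader disagree on how a record is framed. The sketch below illustrates this with plain `java.io` streams only; the class and method names are invented for the example, and the `available() > 0` loop is an assumed stand-in for an iterator's hasNext() check, not Giraph's implementation. Note also that `java.io.DataInputStream.readLine()` returns at end of stream instead of throwing, unlike the reader in the stack trace.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Illustrative sketch: records written with writeUTF but decoded with readLine. */
public class MisalignedReadSketch {

    /** Serializes each id with writeUTF (2-byte length prefix plus bytes, no newline). */
    public static byte[] writeAll(String... ids) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            for (String id : ids) {
                out.writeUTF(id);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Counts the records recovered when each one is decoded with readLine(). */
    @SuppressWarnings("deprecation")
    public static int countWithReadLine(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            int records = 0;
            while (in.available() > 0) { // assumed hasNext()-style check
                in.readLine();           // scans for a newline that was never written
                records++;
            }
            return records;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Counts the records recovered when each one is decoded with readUTF(). */
    public static int countWithReadUTF(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            int records = 0;
            while (in.available() > 0) {
                in.readUTF();
                records++;
            }
            return records;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For three ids written with `writeAll("a", "b", "c")`, the readLine loop recovers a single record because the first call swallows the whole buffer, whereas the readUTF loop recovers all three. With a strict bounds-checked reader, the same mismatch surfaces as an exception rather than a wrong count.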

1

Just a guess: have you tried checking whether the next() method returns null? That might be what happens when it reads to the end.

For example:

if (method != null) {
    // Continue
} else {
    // It's null
}

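Hi @thomasjcf21, thanks for your answer. The next() method is not returning null, I'm quite sure of that. But, as in any iteration, if hasNext() does not return true, next() should not be called... yet in my case it is being called, and I don't understand why... - chomp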

0
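The problem was in the TextAndDoubleComplexWritable class. I had not realized how important the readFields and write methods are when implementing the Writable interface. These two methods are critical because they are what let us send and receive information in Giraph. My write method only serialized an (empty) string, and readFields read it back with readLine(), when both methods should have handled both values of my vertex. I updated the two methods as follows: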
public void write(DataOutput out) throws IOException {
        out.writeDouble(this.vertexData);
        out.writeUTF(this.idsVerticesAnteriores != "" ? "hola"
                : this.idsVerticesAnteriores);
}

public void readFields(DataInput in) throws IOException {
    this.vertexData = in.readDouble();
    this.idsVerticesAnteriores = in.readUTF();
    // idsVerticesAnteriores = in.readLine();
}
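
For reference, here is a minimal, self-contained sketch of the same idea: write and readFields handle the same fields, in the same order, with matching calls (writeDouble/readDouble, writeUTF/readUTF). The class `ComplexValueRoundTrip` and its `roundTrip` helper are hypothetical names for this example; the class deliberately does not implement Hadoop's Writable so that it compiles with the JDK alone. It also writes the string as-is (empty if null) instead of using the `!= ""` check above, which compares references in Java and looks like leftover debugging code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for the vertex value class; mirrors the Writable method shapes. */
public class ComplexValueRoundTrip {
    private String idsVerticesAnteriores = "";
    private double vertexData;

    public ComplexValueRoundTrip() {
    }

    public ComplexValueRoundTrip(String idsVerticesAnteriores, double vertexData) {
        this.idsVerticesAnteriores = idsVerticesAnteriores;
        this.vertexData = vertexData;
    }

    /** Writes both fields; readFields must read them back in this exact order. */
    public void write(DataOutput out) throws IOException {
        out.writeDouble(vertexData);
        // writeUTF(null) would throw, so fall back to the empty string.
        out.writeUTF(idsVerticesAnteriores == null ? "" : idsVerticesAnteriores);
    }

    /** Reads the fields with the calls that match write(): readDouble, then readUTF. */
    public void readFields(DataInput in) throws IOException {
        vertexData = in.readDouble();
        idsVerticesAnteriores = in.readUTF();
    }

    public String getIdsVerticesAnteriores() {
        return idsVerticesAnteriores;
    }

    public double getVertexData() {
        return vertexData;
    }

    /** Serializes a value to bytes and deserializes it into a fresh instance. */
    public static ComplexValueRoundTrip roundTrip(ComplexValueRoundTrip value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            value.write(new DataOutputStream(bytes));
            ComplexValueRoundTrip copy = new ComplexValueRoundTrip();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A round trip like this is a cheap check to run before submitting a job: if `roundTrip(value)` does not return an equal value, write and readFields are out of sync.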

And finally, the program works!!

