Recognizing an object in an image: is there a way to get its color?

Question

java tensorflow image-recognition

9

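I am using TensorFlow to recognize objects in a supplied image. Following this tutorial and using this repository, I got my program to return the object in the picture. For example, this is the image I used as input:

This is my program's output:

I want to get the color of the recognized item (a red jersey in this last case). Is that possible?

Here is the code (from the last link, with only minor changes).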

/* Copyright 2016 The TensorFlow Authors. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
    http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
==============================================================================*/

package com.test.sec.compoment;

import java.io.IOException;
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import org.tensorflow.DataType;
import org.tensorflow.Graph;
import org.tensorflow.Output;
import org.tensorflow.Session;
import org.tensorflow.Tensor;
import org.tensorflow.TensorFlow;
import org.tensorflow.types.UInt8;

/** Sample use of the TensorFlow Java API to label images using a pre-trained model. */
public class ImageRecognition {
  private static void printUsage(PrintStream s) {
    final String url =
        "https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip";
    s.println(
        "Java program that uses a pre-trained Inception model (http://arxiv.org/abs/1512.00567)");
    s.println("to label JPEG images.");
    s.println("TensorFlow version: " + TensorFlow.version());
    s.println();
    s.println("Usage: label_image <model dir> <image file>");
    s.println();
    s.println("Where:");
    s.println("<model dir> is a directory containing the unzipped contents of the inception model");
    s.println("            (from " + url + ")");
    s.println("<image file> is the path to a JPEG image file");
  }

  public void index() {
    String modelDir = "C:/Users/Admin/Downloads/inception5h";
    String imageFile = "C:/Users/Admin/Desktop/red-tshirt.jpg";

    byte[] graphDef = readAllBytesOrExit(Paths.get(modelDir, "tensorflow_inception_graph.pb"));
    List<String> labels =
        readAllLinesOrExit(Paths.get(modelDir, "imagenet_comp_graph_label_strings.txt"));
    byte[] imageBytes = readAllBytesOrExit(Paths.get(imageFile));

    try (Tensor<Float> image = constructAndExecuteGraphToNormalizeImage(imageBytes)) {
      float[] labelProbabilities = executeInceptionGraph(graphDef, image);
      int bestLabelIdx = maxIndex(labelProbabilities);
      System.out.println(
          String.format("BEST MATCH: %s (%.2f%% likely)",
              labels.get(bestLabelIdx),
              labelProbabilities[bestLabelIdx] * 100f));
    }
  }

  private static Tensor<Float> constructAndExecuteGraphToNormalizeImage(byte[] imageBytes) {
    try (Graph g = new Graph()) {
      GraphBuilder b = new GraphBuilder(g);
      // Some constants specific to the pre-trained model at:
      // https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip
      //
      // - The model was trained with images scaled to 224x224 pixels.
      // - The colors, represented as R, G, B in 1-byte each were converted to
      //   float using (value - Mean)/Scale.
      final int H = 224;
      final int W = 224;
      final float mean = 117f;
      final float scale = 1f;

      // Since the graph is being constructed once per execution here, we can use a constant for the
      // input image. If the graph were to be re-used for multiple input images, a placeholder would
      // have been more appropriate.
      final Output<String> input = b.constant("input", imageBytes);
      final Output<Float> output =
          b.div(
              b.sub(
                  b.resizeBilinear(
                      b.expandDims(
                          b.cast(b.decodeJpeg(input, 3), Float.class),
                          b.constant("make_batch", 0)),
                      b.constant("size", new int[] {H, W})),
                  b.constant("mean", mean)),
              b.constant("scale", scale));
      try (Session s = new Session(g)) {
        return s.runner().fetch(output.op().name()).run().get(0).expect(Float.class);
      }
    }
  }

  private static float[] executeInceptionGraph(byte[] graphDef, Tensor<Float> image) {
    try (Graph g = new Graph()) {
      g.importGraphDef(graphDef);
      try (Session s = new Session(g);
          Tensor<Float> result =
              s.runner().feed("input", image).fetch("output").run().get(0).expect(Float.class)) {
        final long[] rshape = result.shape();
        if (result.numDimensions() != 2 || rshape[0] != 1) {
          throw new RuntimeException(
              String.format(
                  "Expected model to produce a [1 N] shaped tensor where N is the number of labels, instead it produced one with shape %s",
                  Arrays.toString(rshape)));
        }
        int nlabels = (int) rshape[1];
        return result.copyTo(new float[1][nlabels])[0];
      }
    }
  }

  private static int maxIndex(float[] probabilities) {
    int best = 0;
    for (int i = 1; i < probabilities.length; ++i) {
      if (probabilities[i] > probabilities[best]) {
        best = i;
      }
    }
    return best;
  }

  private static byte[] readAllBytesOrExit(Path path) {
    try {
      return Files.readAllBytes(path);
    } catch (IOException e) {
      System.err.println("Failed to read [" + path + "]: " + e.getMessage());
      System.exit(1);
    }
    return null;
  }

  private static List<String> readAllLinesOrExit(Path path) {
    try {
      return Files.readAllLines(path, Charset.forName("UTF-8"));
    } catch (IOException e) {
      System.err.println("Failed to read [" + path + "]: " + e.getMessage());
      System.exit(1); // non-zero status: reading the file failed
    }
    return null;
  }

  // In the fullness of time, equivalents of the methods of this class should be auto-generated from
  // the OpDefs linked into libtensorflow_jni.so. That would match what is done in other languages
  // like Python, C++ and Go.
  static class GraphBuilder {
    GraphBuilder(Graph g) {
      this.g = g;
    }

    Output<Float> div(Output<Float> x, Output<Float> y) {
      return binaryOp("Div", x, y);
    }

    <T> Output<T> sub(Output<T> x, Output<T> y) {
      return binaryOp("Sub", x, y);
    }

    <T> Output<Float> resizeBilinear(Output<T> images, Output<Integer> size) {
      return binaryOp3("ResizeBilinear", images, size);
    }

    <T> Output<T> expandDims(Output<T> input, Output<Integer> dim) {
      return binaryOp3("ExpandDims", input, dim);
    }

    <T, U> Output<U> cast(Output<T> value, Class<U> type) {
      DataType dtype = DataType.fromClass(type);
      return g.opBuilder("Cast", "Cast")
          .addInput(value)
          .setAttr("DstT", dtype)
          .build()
          .<U>output(0);
    }

    Output<UInt8> decodeJpeg(Output<String> contents, long channels) {
      return g.opBuilder("DecodeJpeg", "DecodeJpeg")
          .addInput(contents)
          .setAttr("channels", channels)
          .build()
          .<UInt8>output(0);
    }

    <T> Output<T> constant(String name, Object value, Class<T> type) {
      try (Tensor<T> t = Tensor.<T>create(value, type)) {
        return g.opBuilder("Const", name)
            .setAttr("dtype", DataType.fromClass(type))
            .setAttr("value", t)
            .build()
            .<T>output(0);
      }
    }
    Output<String> constant(String name, byte[] value) {
      return this.constant(name, value, String.class);
    }

    Output<Integer> constant(String name, int value) {
      return this.constant(name, value, Integer.class);
    }

    Output<Integer> constant(String name, int[] value) {
      return this.constant(name, value, Integer.class);
    }

    Output<Float> constant(String name, float value) {
      return this.constant(name, value, Float.class);
    }

    private <T> Output<T> binaryOp(String type, Output<T> in1, Output<T> in2) {
      return g.opBuilder(type, type).addInput(in1).addInput(in2).build().<T>output(0);
    }

    private <T, U, V> Output<T> binaryOp3(String type, Output<U> in1, Output<V> in2) {
      return g.opBuilder(type, type).addInput(in1).addInput(in2).build().<T>output(0);
    }
    private Graph g;
  }
}

- Neji Soltani

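This program uses the Inception network (https://arxiv.org/pdf/1512.00567v2.pdf). If you want segmentation results from it, you need to use a fully convolutional network. - matt

This network does not produce a segmentation; it only classifies the image. It will never extract the shape or the pixels of the jersey. So if you want to keep your existing code, I suggest you compute a histogram of the image and extract the most common color. - matt

@matt Thanks, I have already looked into that, but in most cases it returns the background color. - Neji Soltani

I changed the example you are working from to use the object detector code provided by Ultraviolet, in case you are interested. - matt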

2

@NejiSoltani Just follow the official OpenCV Android sample "color-blob-detection": https://github.com/opencv/opencv/tree/master/samples/android/color-blob-detection. You can try it out beforehand here: https://play.google.com/store/apps/details?id=com.jnardari.opencv_androidsamples - sladomic

3 Answers

0

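First, you need to remove the background pixels so that only your object remains, then build a list of all the remaining pixels, and finally compute the average color.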
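
The averaging step can be sketched as follows. This is only an illustration under assumptions that are not part of the answer: the class `MaskedAverageColor`, the method `averageColor`, and the `boolean[][] mask` (true where a pixel belongs to the object) are hypothetical names, and producing the mask (the background removal itself) is left to whatever segmentation or thresholding you use.

```java
import java.awt.image.BufferedImage;

public class MaskedAverageColor {
    /**
     * Averages the RGB values of the pixels where mask[y][x] is true.
     * Returns {r, g, b}, or null if the mask selects no pixels.
     * The mask is an assumption: it must come from your own background removal.
     */
    public static int[] averageColor(BufferedImage img, boolean[][] mask) {
        long r = 0, g = 0, b = 0, n = 0;
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                if (!mask[y][x]) {
                    continue; // background pixel: ignore it
                }
                int rgb = img.getRGB(x, y);
                r += (rgb >> 16) & 0xFF;
                g += (rgb >> 8) & 0xFF;
                b += rgb & 0xFF;
                n++;
            }
        }
        if (n == 0) {
            return null;
        }
        return new int[] {(int) (r / n), (int) (g / n), (int) (b / n)};
    }

    public static void main(String[] args) {
        // Synthetic 4x4 image: white background with a 2x2 red square in the middle.
        BufferedImage img = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        boolean[][] mask = new boolean[4][4];
        for (int y = 0; y < 4; y++) {
            for (int x = 0; x < 4; x++) {
                boolean object = x >= 1 && x <= 2 && y >= 1 && y <= 2;
                img.setRGB(x, y, object ? 0xFF0000 : 0xFFFFFF);
                mask[y][x] = object;
            }
        }
        int[] avg = averageColor(img, mask);
        System.out.println(avg[0] + "," + avg[1] + "," + avg[2]); // prints "255,0,0"
    }
}
```

A plain average can look muddy when the object has several distinct colors; in that case a most-frequent-color count may describe the object better.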

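For color detection methods, you can refer to "Color Image Processing: Emerging Applications", color detection, and especially how we handle color detection.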
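
Once an RGB value is available, one simple way to turn it into a name such as "red" is to look at its hue. The sketch below is not taken from the references above: the class `ColorNamer` and all hue, saturation, and brightness thresholds are assumptions chosen for illustration, and would need tuning for real photos. It relies only on `java.awt.Color.RGBtoHSB`.

```java
import java.awt.Color;

public class ColorNamer {
    /**
     * Maps an RGB triple (0..255 per channel) to a coarse color name.
     * All thresholds are illustrative assumptions, not a standard.
     */
    public static String name(int r, int g, int b) {
        float[] hsb = Color.RGBtoHSB(r, g, b, null); // each component is in [0, 1]
        float hue = hsb[0] * 360f;
        float sat = hsb[1];
        float bri = hsb[2];
        if (bri < 0.2f) return "black";
        if (sat < 0.15f) return bri > 0.8f ? "white" : "gray";
        if (hue < 15f || hue >= 345f) return "red";
        if (hue < 45f) return "orange";
        if (hue < 70f) return "yellow";
        if (hue < 170f) return "green";
        if (hue < 200f) return "cyan";
        if (hue < 260f) return "blue";
        if (hue < 320f) return "purple";
        return "pink";
    }

    public static void main(String[] args) {
        System.out.println(name(200, 30, 40)); // prints "red"
    }
}
```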

- A. STEFANI

0

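The snippet below gets the RGB color code. Because an image can contain pixels of many different colors, you have to pick a point (for example the center) and read the RGB value at its horizontal (X) and vertical (Y) coordinates.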

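import java.awt.Color;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Create an image object from the byte array (imageBytes as read in the question's code).
BufferedImage imageobj = null;
Color[][] imgcolor = null;
try {
    imageobj = ImageIO.read(new ByteArrayInputStream(imageBytes));
} catch (IOException e) {
    e.printStackTrace();
}
// ImageIO.read returns null when no registered reader can decode the data.
if (imageobj != null) {
    // Store the color of every pixel, indexed as [x][y].
    imgcolor = new Color[imageobj.getWidth()][imageobj.getHeight()];
    for (int i = 0; i < imageobj.getWidth(); i++) {
        for (int j = 0; j < imageobj.getHeight(); j++) {
            imgcolor[i][j] = new Color(imageobj.getRGB(i, j));
        }
    }
}

// Print the color of the pixel at the center of the image.
if (imgcolor != null && imgcolor.length > 0) {
    System.out.println("Object Color " + imgcolor[imageobj.getWidth() / 2][imageobj.getHeight() / 2]);
}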

- Syed Mudasir

This only works if the object is in the center. - Neji Soltani

- Sumsuddin Shojib · Accepted Answer

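The code you are using predicts a label for a given image, i.e. it classifies the image into one of the trained classes, so you do not know which exact pixels belong to the object.

So I suggest you do one of the following:

Use an object detector to locate the object and get its bounding box. Then take the color of the majority of the pixels inside it.
Use pixel-wise classification (segmentation), like this one, to get the exact pixels of the object.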
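
The "color of the majority of pixels" step from the first option might look like the sketch below. It assumes a detector has already returned a bounding box as a `java.awt.Rectangle`; the class `DominantColorInBox`, the method `dominantColor`, and the choice of 8 levels per channel (512 bins) are hypothetical and not part of any detector's API. Quantizing before counting keeps small shading differences from splitting the vote.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

public class DominantColorInBox {
    /**
     * Returns the most frequent color inside the box as 0xRRGGBB, after quantizing
     * each channel to 8 levels. The returned value is the center of the winning bin.
     * Returns -1 if the box does not overlap the image.
     */
    public static int dominantColor(BufferedImage img, Rectangle box) {
        // Clip the box to the image so getRGB never goes out of bounds.
        Rectangle r = box.intersection(new Rectangle(0, 0, img.getWidth(), img.getHeight()));
        if (r.isEmpty()) {
            return -1;
        }
        int[] counts = new int[512]; // 8 * 8 * 8 quantized colors
        for (int y = r.y; y < r.y + r.height; y++) {
            for (int x = r.x; x < r.x + r.width; x++) {
                int rgb = img.getRGB(x, y);
                int rq = ((rgb >> 16) & 0xFF) >> 5;
                int gq = ((rgb >> 8) & 0xFF) >> 5;
                int bq = (rgb & 0xFF) >> 5;
                counts[(rq << 6) | (gq << 3) | bq]++;
            }
        }
        int best = 0;
        for (int i = 1; i < counts.length; i++) {
            if (counts[i] > counts[best]) {
                best = i;
            }
        }
        // Convert the bin index back to the RGB value at the bin's center.
        int rc = ((best >> 6) & 7) * 32 + 16;
        int gc = ((best >> 3) & 7) * 32 + 16;
        int bc = (best & 7) * 32 + 16;
        return (rc << 16) | (gc << 8) | bc;
    }

    public static void main(String[] args) {
        // Synthetic 6x6 image: white background, reddish 3x3 "object" at (2, 2).
        BufferedImage img = new BufferedImage(6, 6, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < 6; y++) {
            for (int x = 0; x < 6; x++) {
                boolean object = x >= 2 && x <= 4 && y >= 2 && y <= 4;
                img.setRGB(x, y, object ? 0xC81E28 : 0xFFFFFF);
            }
        }
        int c = dominantColor(img, new Rectangle(2, 2, 3, 3));
        System.out.println(String.format("#%06X", c)); // prints "#D01030"
    }
}
```

A bounding box still contains some background, so this works best when the object fills most of the box; shrinking the box toward its center before counting is one way to reduce background votes.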

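Note that you may need to train the network (or model) on your objects yourself.

For a Java object detection example, check out this project. It is written for Android, but using it in a desktop application should be straightforward. More specifically, look at this part.

You do not need both object detection and segmentation. If you want both, though, I think you should first try training a segmentation model in Python (link provided above) and then use that model in Java the same way as the object detection model.

Edit 2:

I have added a simple object detection client in Java that uses TensorFlow Object Detection API models, just to show you that you can use any frozen model in Java.

Also, check out this nice repository, which uses pixel-wise segmentation.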