Using Protocol Buffers with an internal data model

Question

15

I have an existing internal data model for a picture, as follows:

package test.model;
public class Picture {

  private int height, width;
  private Format format;

  public enum Format {
    JPEG, BMP, GIF
  }

  // Constructor, getters and setters, hashCode, equals, toString etc.
}

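I now want to serialize it using protocol buffers. I wrote a Picture.proto file that mirrors the fields of the Picture class, and compiled the code under the test.model.protobuf package with PictureProtoBuf as the outer class name: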

package test.model.protobuf;

option java_package = "test.model.protobuf";
option java_outer_classname = "PictureProtoBuf";

message Picture {
  enum Format {
    JPEG = 1;
    BMP = 2;
    GIF = 3;
  }
  required uint32 width = 1;
  required uint32 height = 2;
  required Format format = 3;
}

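Now I'm assuming that if I have a Picture that I want to serialize and send somewhere, I have to create a PictureProtoBuf object and map all the fields across, like so: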

Picture p = new Picture(100, 200, Picture.Format.JPEG);
PictureProtoBuf.Picture.Builder output = PictureProtoBuf.Picture.newBuilder();
output.setHeight(p.getHeight());
output.setWidth(p.getWidth());

I come unstuck when my data model contains an enumeration. The approach I'm using right now is ugly:

output.setFormat(PictureProtoBuf.Picture.Format.valueOf(p.getFormat().name()));

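However, this is fragile: it relies on the enum names being the same in my internal data model and in the protocol buffer data model (which isn't a great assumption, since enum names need to be unique within a .proto file). If the .name() call from the internal model stops matching a protobuf-generated enum name, I can see myself having to hand-craft switch statements over the enums. What I want to know is: am I going about this the right way? Should I scrap my internal data model (test.model.Picture) in favour of the protobuf-generated one (test.model.protobuf.PictureProtoBuf)? If so, how would I implement some of the niceties my internal data model has (e.g. hashCode(), equals(Object), toString(), etc.)?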
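For reference, the hand-crafted switch mapping mentioned above might look roughly like the sketch below. ProtoFormat is only a stand-in for the generated PictureProtoBuf.Picture.Format, with deliberately different constant names to show that nothing depends on name(); FormatMapper is a hypothetical helper, not something protobuf generates.

```java
// Stand-in for the internal test.model.Picture.Format enum.
enum Format { JPEG, BMP, GIF }

// Stand-in for the generated protobuf enum (assumption: names differ from the internal ones).
enum ProtoFormat { FORMAT_JPEG, FORMAT_BMP, FORMAT_GIF }

// Hypothetical helper that keeps the mapping in one place, outside both models.
final class FormatMapper {
    private FormatMapper() {}

    // Internal model -> wire model.
    static ProtoFormat toProto(Format format) {
        switch (format) {
            case JPEG: return ProtoFormat.FORMAT_JPEG;
            case BMP:  return ProtoFormat.FORMAT_BMP;
            case GIF:  return ProtoFormat.FORMAT_GIF;
            default:   throw new IllegalArgumentException("Unmapped format: " + format);
        }
    }

    // Wire model -> internal model.
    static Format fromProto(ProtoFormat format) {
        switch (format) {
            case FORMAT_JPEG: return Format.JPEG;
            case FORMAT_BMP:  return Format.BMP;
            case FORMAT_GIF:  return Format.GIF;
            default:          throw new IllegalArgumentException("Unmapped format: " + format);
        }
    }
}
```

The default branches fail loudly if a constant is added to one enum but not to the mapping, which a silent valueOf(name()) mismatch would only reveal at runtime with a less specific error.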

- Catchwa

I haven't tried it (only because I'm mainly a .NET developer), but I believe protostuff lets you keep using your existing model. - Marc Gravell

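@MarcGravell - Thanks for the suggestion. Your hunch was right; protostuff does exactly what I was after while keeping protocol buffers on the back end (I haven't yet tested its compatibility with Google's protocol buffers library). - Catchwa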

3 Answers

6

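If you have control over your internal data model, you could modify test.model.Picture so that the enum values know their corresponding protobuf equivalents, probably by passing the correspondence in to your enum constructors.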

For example, using Guava's BiMap (a bidirectional map with unique values), we get something like:

import com.google.common.collect.ImmutableBiMap;

enum ProtoEnum { // we don't control this
  ENUM1, ENUM2, ENUM3;
}

enum MyEnum {
  ONE(ProtoEnum.ENUM1), TWO(ProtoEnum.ENUM2), THREE(ProtoEnum.ENUM3);

  // Keyed by ProtoEnum, so CORRESPONDENCE.get(protoEnum) returns a MyEnum.
  static final ImmutableBiMap<ProtoEnum, MyEnum> CORRESPONDENCE;

  static {
    ImmutableBiMap.Builder<ProtoEnum, MyEnum> builder = ImmutableBiMap.builder();
    for (MyEnum x : MyEnum.values()) {
      builder.put(x.corresponding, x);
    }
    CORRESPONDENCE = builder.build();
  }

  private final ProtoEnum corresponding;

  private MyEnum(ProtoEnum corresponding) {
    this.corresponding = corresponding;
  }

  // Direct lookup in the MyEnum -> ProtoEnum direction.
  ProtoEnum getCorresponding() {
    return corresponding;
  }
}

Then, to look up the MyEnum that corresponds to a ProtoEnum, we just call MyEnum.CORRESPONDENCE.get(protoEnum). To go the other way, we call MyEnum.CORRESPONDENCE.inverse().get(myEnum) or myEnum.getCorresponding().
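If pulling in Guava is not an option, the same idea can be sketched with only the JDK, swapping the BiMap for an EnumMap used for the reverse lookup. WireFormat, InternalFormat, toWire and fromWire are hypothetical names chosen for illustration; WireFormat stands in for a generated protobuf enum.

```java
import java.util.EnumMap;
import java.util.Map;

// Stand-in for a generated protobuf enum that we do not control.
enum WireFormat { WIRE_JPEG, WIRE_BMP, WIRE_GIF }

enum InternalFormat {
    // Each constant is handed its wire counterpart through the constructor.
    JPEG(WireFormat.WIRE_JPEG), BMP(WireFormat.WIRE_BMP), GIF(WireFormat.WIRE_GIF);

    // Reverse index, filled once after all constants have been created.
    private static final Map<WireFormat, InternalFormat> BY_WIRE = new EnumMap<>(WireFormat.class);

    static {
        for (InternalFormat f : values()) {
            InternalFormat previous = BY_WIRE.put(f.wire, f);
            if (previous != null) {
                // Mirrors BiMap's unique-values guarantee: one wire value per constant.
                throw new IllegalStateException(f.wire + " is mapped by both " + previous + " and " + f);
            }
        }
    }

    private final WireFormat wire;

    InternalFormat(WireFormat wire) {
        this.wire = wire;
    }

    // Internal model -> wire model.
    WireFormat toWire() {
        return wire;
    }

    // Wire model -> internal model.
    static InternalFormat fromWire(WireFormat wire) {
        InternalFormat f = BY_WIRE.get(wire);
        if (f == null) {
            throw new IllegalArgumentException("No internal format for " + wire);
        }
        return f;
    }
}
```

A call such as InternalFormat.fromWire(picture.getFormat()) would then replace the valueOf(name()) lookup, with the correspondence declared next to each constant rather than inferred from names.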

- Louis Wasserman

Thanks for the answer. I think I understand the concept, but I'm not sure how to implement it. Would you mind writing some code for me? - Catchwa

2

One approach is to keep only the generated enum:

package test.model;

import test.model.protobuf.PictureProtoBuf;

public class Picture {

  private int height, width;
  private PictureProtoBuf.Picture.Format format;

  // Constructor, getters and setters, hashCode, equals, toString etc.
}

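I've used this a few times; it may or may not make sense in your case. However, using the protobuf-generated classes themselves as your data model (or extending them to add functionality) is not recommended.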

- Dmitri

- Catchwa · Accepted Answer

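Although the existing answers are good, I decided to take Marc Gravell's suggestion further and look into protostuff. You can use the protostuff runtime module together with the dynamic ObjectSchema to create schemas for your internal data model at runtime. My code now reduces to: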

// Do this once
private static final Schema<Picture> schema = RuntimeSchema.getSchema(Picture.class);
private static final LinkedBuffer buffer = LinkedBuffer.allocate(LinkedBuffer.DEFAULT_BUFFER_SIZE);

// For each Picture you want to serialize...
Picture p = new Picture(100, 200, Picture.Format.JPEG);
try {
  return ProtobufIOUtil.toByteArray(p, schema, buffer);
} finally {
  // Reset the shared buffer so it can be reused, even if serialization throws.
  buffer.clear();
}

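This is a huge improvement over the Google protobuf library (see my question), particularly when the internal data model has a lot of attributes. I haven't noticed any speed penalty in my use cases.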