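vector<bool> strikes again
In reality, it is allocating for 0x1fffffffff20000 bits (that's 144 petabits) on my test box. That comes directly from IndexSet::resize().
Now I have serious questions about HElib using std::vector<bool> here (it seems they would be far better served with something like boost::icl::interval_set<>).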
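To put that number in perspective, here is a quick back-of-the-envelope check. It is only a sketch: the bit count is copied from the text above, the helper names are made up for illustration, and the byte figure assumes a bit-packed container that stores one bit per element, which is how std::vector<bool> is typically implemented.

```cpp
#include <cassert>
#include <cstdint>

// The bit count reported above, copied verbatim from the text.
constexpr std::uint64_t kBits = 0x1fffffffff20000ULL;

// Convert a bit count to decimal petabits (1 Pbit = 10^15 bits).
// Hypothetical helper, for illustration only.
constexpr double to_petabits(std::uint64_t bits) {
    return static_cast<double>(bits) / 1e15;
}

// Bytes needed by a bit-packed container (assumption: one bit per
// element, rounded up to whole bytes).
constexpr std::uint64_t packed_bytes(std::uint64_t bits) {
    return (bits + 7) / 8;
}
```

The value is just shy of 2^57 bits, so even a perfectly packed representation would need roughly 16 PiB of memory. No wonder the allocation fails.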
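Well. That was a wild goose chase (though that IndexSet serialization can be improved a lot). The real problem, however, is that you have Undefined Behaviour because you do not deserialize as the same type that you serialized.
You serialized a PubKey, but attempt to deserialize it as a PubKey*. Oops.
Beyond that, there are quite a few more problems: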
You had to modify the library to make private members public. This can easily violate ODR (making the class layout incompatible).
You seem to treat the context as a "dynamic" resource, which will engage Object Tracking. This could be a viable approach. BUT. You'll have to think about ownership.
It seems like you didn't do that yet. For example, the line in load_construct_data for DoubleCRT is a definite memory leak:
helib::Context * context = new helib::Context(2,3,1);
You never use it and you never free it. In fact, you simply overwrite it with the deserialized instance, which may or may not be owned. Catch-22.
Exactly the same happens in load_construct_data for PubKey.
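To make the leak concrete, here is a minimal sketch that does not use HElib at all. FakeContext is a hypothetical stand-in that merely counts live instances, and both function names are made up. The first function mirrors the "allocate a placeholder, then overwrite the pointer" pattern; the second does the same with std::unique_ptr, which is one way (not necessarily the one you want) to make ownership explicit.

```cpp
#include <cassert>
#include <memory>

int g_live = 0;  // number of FakeContext instances currently alive

// Hypothetical stand-in for a context type; it is NOT helib::Context.
struct FakeContext {
    FakeContext() { ++g_live; }
    FakeContext(const FakeContext&) { ++g_live; }
    ~FakeContext() { --g_live; }
};

// Raw-pointer pattern: the placeholder becomes unreachable once the
// pointer is overwritten. Returns how many instances were leaked.
int leaky_pattern() {
    const int before = g_live;
    FakeContext* context = new FakeContext();  // placeholder, never freed
    FakeContext* loaded  = new FakeContext();  // pretend: came from the archive
    context = loaded;                          // placeholder is lost here
    delete context;                            // frees only the loaded instance
    return g_live - before;
}

// Same flow with unique_ptr: assigning a new owner destroys the old object.
int owned_pattern() {
    const int before = g_live;
    {
        auto context = std::make_unique<FakeContext>();
        context = std::make_unique<FakeContext>();  // placeholder destroyed here
    }                                               // loaded instance destroyed here
    return g_live - before;
}
```

Calling leaky_pattern() leaves one instance alive that nobody can reach any more, whereas owned_pattern() leaves none.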
Worse, in save_construct_data you completely gratuitously copy context objects for each DoubleCRT in each SecKey:
auto context = polynomial->getContext();
archive << &context;
Because you fake it out as pointer serialization, (obviously useless) object tracking kicks in again, which just means you serialize redundant Context copies that will all be leaked on deserialization.
I'd be tempted to assume the context instances in both would always be the same? Why not serialize the context(s) separately anyways?
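In case the gratuitous copy is not obvious: plain auto never deduces a reference, so auto context = x.getContext(); makes a copy whenever the getter returns a reference. The sketch below shows that with hypothetical stand-in types. It assumes the getter returns const Context&, which is what the copying behaviour described above suggests, but I am not quoting HElib's declaration here.

```cpp
#include <cassert>

int g_copies = 0;  // counts copy-constructions of FakeContext

// Hypothetical stand-ins; they are NOT the HElib types.
struct FakeContext {
    FakeContext() = default;
    FakeContext(const FakeContext&) { ++g_copies; }
};

struct FakePolynomial {
    FakeContext ctx;
    // Assumption: the getter hands out a reference to a shared context.
    const FakeContext& getContext() const { return ctx; }
};

// Plain `auto` drops the reference, so this copy-constructs a local.
int copies_with_auto(const FakePolynomial& p) {
    const int before = g_copies;
    auto context = p.getContext();
    (void)context;
    return g_copies - before;
}

// `const auto&` binds to the original object, no copy is made.
int copies_with_ref(const FakePolynomial& p) {
    const int before = g_copies;
    const auto& context = p.getContext();
    (void)context;
    return g_copies - before;
}

// With a reference, taking the address yields the shared object itself,
// not the address of a short-lived local copy.
bool ref_aliases_original(const FakePolynomial& p) {
    const auto& context = p.getContext();
    return &context == &p.ctx;
}
```

So &context in the snippet above is the address of a fresh local copy each time, which is why object tracking cannot recognize it as the same context.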
In fact I went and analyzed the HElib source code to check these assumptions. It turns out I was correct. Nothing ever constructs a context outside
std::unique_ptr<Context> buildContextFromBinary(std::istream& str)
std::unique_ptr<Context> buildContextFromAscii(std::istream& str)
As you can see, they return owned pointers. You should have been using them. Perhaps even together with the built-in serialization, which I practically stumbled over here.
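Time To Regroup
I would use the serialization code from HElib (because, why reinvent the wheel and make tons of bugs doing so?). If you insist on integration with Boost Serialization, you can have your cake and eat it: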
template <class Archive> void save(Archive& archive, const helib::PubKey& pubkey, unsigned) {
using V = std::vector<char>;
using D = iostreams::back_insert_device<V>;
V data;
{
D dev(data);
iostreams::stream_buffer<D> sbuf(dev);
std::ostream os(&sbuf);
helib::writePubKeyBinary(os, pubkey);
}
archive << data;
}
template <class Archive> void load(Archive& archive, helib::PubKey& pubkey, unsigned) {
std::vector<char> data;
archive >> data;
using S = iostreams::array_source;
S source(data.data(), data.size());
iostreams::stream_buffer<S> sbuf(source);
{
std::istream is(&sbuf);
helib::readPubKeyBinary(is, pubkey);
}
}
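That's all. 24 lines of code. And it is going to be tested and maintained by the library authors. You can't beat that (clearly). I modified the tests a bit so we don't abuse private details anymore.
Cleaning Up The Code
By separating out a helper to deal with the blob writing, we can implement different helib types in a very similar fashion: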
namespace helib {
template <class A> void save(A& ar, const Context& o, unsigned) {
Blob data = to_blob(o, writeContextBinary);
ar << data;
}
template <class A> void load(A& ar, Context& o, unsigned) {
Blob data;
ar >> data;
from_blob(data, o, readContextBinary);
}
template <class A> void save(A& ar, const PubKey& o, unsigned) {
Blob data = to_blob(o, writePubKeyBinary);
ar << data;
}
template <class A> void load(A& ar, PubKey& o, unsigned) {
Blob data;
ar >> data;
from_blob(data, o, readPubKeyBinary);
}
}
That is elegance to me.
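If you want to play with the blob pattern without pulling in HElib or Boost, here is a standard-library-only sketch of the same idea. Note the differences: it uses std::ostringstream/std::istringstream instead of Boost Iostreams (so it makes an extra copy of the bytes), and Point, writePointBinary and readPointBinary are hypothetical stand-ins for a library type with its own stream-based binary IO.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

using Blob = std::vector<char>;

// Hypothetical type with stream-based binary IO (stand-in for a library type).
struct Point { int x = 0; int y = 0; };

void writePointBinary(std::ostream& os, const Point& p) {
    os.write(reinterpret_cast<const char*>(&p.x), sizeof p.x);
    os.write(reinterpret_cast<const char*>(&p.y), sizeof p.y);
}

void readPointBinary(std::istream& is, Point& p) {
    is.read(reinterpret_cast<char*>(&p.x), sizeof p.x);
    is.read(reinterpret_cast<char*>(&p.y), sizeof p.y);
}

// Run any writer(ostream&, const T&) and capture its output as a blob.
template <typename T, typename F>
Blob to_blob(const T& object, F writer) {
    std::ostringstream os(std::ios::binary);
    writer(os, object);
    const std::string bytes = os.str();
    return Blob(bytes.begin(), bytes.end());
}

// Feed a blob back into any reader(istream&, T&).
template <typename T, typename F>
void from_blob(const Blob& data, T& object, F reader) {
    const std::string bytes(data.begin(), data.end());
    std::istringstream is(bytes, std::ios::binary);
    reader(is, object);
}
```

The point of the design is that the archive only ever sees an opaque std::vector<char>, while the library's own reader and writer stay in charge of the actual format.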
Full Listing
I cloned a new gist https://gist.github.com/sehe/ba82a0329e4ec586363eb82d3f3b9326 that includes these change sets:
0079c07 Make it compile locally
b3b2cf1 Squelch the warnings
011b589 Endof investigations, regroup time
f4d79a6 Reimplemented using HElib binary IO
a403e97 Bitwise reproducible outputs
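Only the last two commits contain changes related to the actual fixes.
I'll list the full code here too for posterity. There are a number of subtle reorganizations and ditto comments in the test code. You would do well to read through them carefully to see whether you understand them and whether the implications suit your needs. I left comments describing why the test assertions are what they are, to help.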
File serialization.hpp
#ifndef EVOTING_SERIALIZATION_H
#define EVOTING_SERIALIZATION_H
#define BOOST_TEST_MODULE main
#include <helib/helib.h>
#include <boost/serialization/split_free.hpp>
#include <boost/serialization/vector.hpp>
#include <boost/iostreams/stream_buffer.hpp>
#include <boost/iostreams/device/back_inserter.hpp>
#include <boost/iostreams/device/array.hpp>
namespace {
using Blob = std::vector<char>;
template <typename T, typename F>
Blob to_blob(const T& object, F writer) {
using D = boost::iostreams::back_insert_device<Blob>;
Blob data;
{
D dev(data);
boost::iostreams::stream_buffer<D> sbuf(dev);
std::ostream os(&sbuf);
writer(os, object);
}
return data;
}
template <typename T, typename F>
void from_blob(Blob const& data, T& object, F reader) {
boost::iostreams::stream_buffer<boost::iostreams::array_source>
sbuf(data.data(), data.size());
std::istream is(&sbuf);
reader(is, object);
}
}
namespace helib {
template <class A> void save(A& ar, const Context& o, unsigned) {
Blob data = to_blob(o, writeContextBinary);
ar << data;
}
template <class A> void load(A& ar, Context& o, unsigned) {
Blob data;
ar >> data;
from_blob(data, o, readContextBinary);
}
template <class A> void save(A& ar, const PubKey& o, unsigned) {
Blob data = to_blob(o, writePubKeyBinary);
ar << data;
}
template <class A> void load(A& ar, PubKey& o, unsigned) {
Blob data;
ar >> data;
from_blob(data, o, readPubKeyBinary);
}
}
BOOST_SERIALIZATION_SPLIT_FREE(helib::Context)
BOOST_SERIALIZATION_SPLIT_FREE(helib::PubKey)
#endif
File test-serialization.cpp
#define BOOST_TEST_MODULE main
#include <boost/test/included/unit_test.hpp>
#include <helib/helib.h>
#include <fstream>
#include "serialization.hpp"
#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/binary_iarchive.hpp>
helib::Context helibTestMinimalContext(){
unsigned long p = 4999;
unsigned long m = 32109;
unsigned long r = 1;
return helib::Context(m, p, r);
}
helib::Context helibTestContext(){
auto context = helibTestMinimalContext();
unsigned long bits = 300;
unsigned long c = 2;
buildModChain(context, bits, c);
return context;
}
BOOST_AUTO_TEST_CASE(serialization_pubkey) {
auto context = helibTestContext();
helib::SecKey secret_key(context);
secret_key.GenSecKey();
helib::addSome1DMatrices(secret_key);
const helib::PubKey& original_pubkey = secret_key;
std::string const filename = "pubkey.serialized";
{
std::ofstream os(filename, std::ios::binary);
boost::archive::binary_oarchive oarchive(os);
oarchive << context << original_pubkey;
}
{
std::ofstream os(filename + ".2", std::ios::binary);
boost::archive::binary_oarchive oarchive(os);
oarchive << context << original_pubkey;
}
{
helib::Context surrogate = helibTestMinimalContext();
std::ifstream ifs(filename, std::ios::binary);
boost::archive::binary_iarchive iarchive(ifs);
iarchive >> surrogate;
BOOST_TEST((context == surrogate));
helib::SecKey independent(surrogate);
helib::PubKey& indep_pk = independent;
iarchive >> indep_pk;
BOOST_TEST((indep_pk != original_pubkey));
{
std::ofstream os(filename + ".3", std::ios::binary);
boost::archive::binary_oarchive oarchive(os);
oarchive << surrogate << indep_pk;
}
}
{
helib::PubKey restored_pubkey(context);
{
std::ifstream ifs(filename, std::ios::binary);
boost::archive::binary_iarchive iarchive(ifs);
iarchive >> context >> restored_pubkey;
}
BOOST_TEST((restored_pubkey == original_pubkey));
{
std::ofstream os(filename + ".4", std::ios::binary);
boost::archive::binary_oarchive oarchive(os);
oarchive << context << restored_pubkey;
}
}
}
Test Output
time ./test-serialization -l all -r detailed
Running 1 test case...
Entering test module "main"
test-serialization.cpp(34): Entering test case "serialization_pubkey"
test-serialization.cpp(61): info: check (context == surrogate) has passed
test-serialization.cpp(70): info: check (indep_pk != original_pubkey) has passed
test-serialization.cpp(82): info: check (restored_pubkey == original_pubkey) has passed
test-serialization.cpp(34): Leaving test case "serialization_pubkey"; testing time: 36385217us
Leaving test module "main"; testing time: 36385273us
Test module "main" has passed with:
1 test case out of 1 passed
3 assertions out of 3 passed
Test case "serialization_pubkey" has passed with:
3 assertions out of 3 passed
real 0m36,698s
user 0m35,558s
sys 0m0,850s
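Bitwise Reproducible Outputs
After repeated (de)serialization, the output does indeed appear to be bitwise identical, which may be an important property: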
sha256sum pubkey.serialized*
66b95adbd996b100bff58774e066e7a309e70dff7cbbe08b5c77b9fa0f63c97f pubkey.serialized
66b95adbd996b100bff58774e066e7a309e70dff7cbbe08b5c77b9fa0f63c97f pubkey.serialized.2
66b95adbd996b100bff58774e066e7a309e70dff7cbbe08b5c77b9fa0f63c97f pubkey.serialized.3
66b95adbd996b100bff58774e066e7a309e70dff7cbbe08b5c77b9fa0f63c97f pubkey.serialized.4
Note that it is (obviously) not identical across runs, because each run generates different key material.
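If you would rather assert this property from within a test instead of eyeballing sha256sum output, a byte-for-byte file comparison is enough. This is only a sketch using the standard library; same_bytes and write_file are hypothetical helper names, and the four-iterator std::equal requires C++14.

```cpp
#include <algorithm>
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Returns true when both files can be opened and have identical contents.
bool same_bytes(const std::string& path_a, const std::string& path_b) {
    std::ifstream a(path_a, std::ios::binary);
    std::ifstream b(path_b, std::ios::binary);
    if (!a || !b)
        return false;
    std::istreambuf_iterator<char> a_it(a), b_it(b), end;
    // The four-iterator overload also reports a mismatch when one
    // file is merely a prefix of the other.
    return std::equal(a_it, end, b_it, end);
}

// Helper for trying it out: write a string to a file in binary mode.
void write_file(const std::string& path, const std::string& bytes) {
    std::ofstream os(path, std::ios::binary);
    os.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
}
```

In the test above that could become something like BOOST_TEST(same_bytes(filename, filename + ".4")) after the last archive has been closed.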
Side Quest (The Wild Goose Chase)
One way to manually improve the IndexSet serialization code would be to use vector<bool> as well:
template<class Archive>
void save(Archive & archive, const helib::IndexSet & index_set, const unsigned int version){
std::vector<bool> elements;
elements.resize(index_set.last()-index_set.first()+1);
for (auto n : index_set)
elements[n-index_set.first()] = true;
archive << index_set.first() << elements;
}
template<class Archive>
void load(Archive & archive, helib::IndexSet & index_set, const unsigned int version){
long first_ = 0;
std::vector<bool> elements;
archive >> first_ >> elements;
index_set.clear();
for (size_t n = 0; n < elements.size(); ++n) {
if (elements[n])
index_set.insert(n+first_);
}
}
A better idea would be to use dynamic_bitset (for which I happen to have contributed the serialization code, see How to serialize boost::dynamic_bitset?):
template<class Archive>
void save(Archive & archive, const helib::IndexSet & index_set, const unsigned int version){
boost::dynamic_bitset<> elements;
elements.resize(index_set.last()-index_set.first()+1);
for (auto n : index_set)
elements.set(n-index_set.first());
archive << index_set.first() << elements;
}
template<class Archive>
void load(Archive & archive, helib::IndexSet & index_set, const unsigned int version) {
long first_ = 0;
boost::dynamic_bitset<> elements;
archive >> first_ >> elements;
index_set.clear();
for (size_t n = elements.find_first(); n != boost::dynamic_bitset<>::npos; n = elements.find_next(n))
index_set.insert(n+first_);
}
Of course, you will likely need to do similar things for IndexMap.
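For completeness, here is a sketch of the interval idea hinted at earlier (the one that made me think of boost::icl::interval_set<>). It deliberately avoids both HElib and Boost: a std::set<long> stands in for the index set, and to_intervals/from_intervals are hypothetical names. Instead of one bit per possible index, it stores one pair per run of consecutive indices, so a sparse set with a huge span stays tiny.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

using Interval = std::pair<long, long>;  // closed range [first, second]

// Collapse a sorted set of indices into maximal runs of consecutive values.
std::vector<Interval> to_intervals(const std::set<long>& indices) {
    std::vector<Interval> runs;
    for (long n : indices) {
        if (!runs.empty() && runs.back().second + 1 == n)
            runs.back().second = n;   // extend the current run
        else
            runs.emplace_back(n, n);  // start a new run
    }
    return runs;
}

// Expand the runs back into individual indices.
// Assumption: no run ends at LONG_MAX (the loop counter would overflow).
std::set<long> from_intervals(const std::vector<Interval>& runs) {
    std::set<long> indices;
    for (const Interval& r : runs)
        for (long n = r.first; n <= r.second; ++n)
            indices.insert(n);
    return indices;
}
```

A std::vector<std::pair<long, long>> is something Boost Serialization can handle with its stock vector.hpp and utility.hpp headers, so the save/load functions would end up about as short as the bitset versions above.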
The definition of helib::PubKey is missing. - Superlokkus
serialization::access and these methods: https://dev59.com/r4vda4cB1Zd3GeqPaYae#30595430 - sehe