How to avoid double backslash escaping when using serde_json in Rust?

3

In serde, is it possible to create a string value from bytes without the backslash characters being doubled?

Playground:

use serde_json::json;
use serde_json::{Value};
use std::str;

fn main() {
    let bytes = [79, 66, 88, 90, 70, 65, 68, 54, 80, 54, 76, 65, 92,
                117, 48, 48, 49, 102, 50, 50, 50, 50, 71, 66, 54, 87,
                65, 65, 85, 52, 54, 87, 87, 86, 92, 117, 48, 48, 49, 102,
                123, 92, 34, 36, 116, 122, 92, 34, 58, 92, 34, 69, 117, 114,
                111, 112, 101, 47, 66, 101, 114, 108, 105, 110, 92, 34, 125];
    let string = str::from_utf8(&bytes).unwrap();
    let json_string = json!(&string);
    let json_string2 = Value::String(string.to_string());
    println!("string: {}",string);
    println!("json 1: {}",json_string);
    println!("json 2: {}",json_string2);
}
3 Answers

3

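You have a string that already contains escape sequences. To keep the backslashes themselves from being escaped, you can interpret those escape sequences yourself before passing the string to serde. For example, using the unescape crate to interpret them: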

use serde_json::json;
use std::str;
use unescape::unescape;

fn main() {
    let bytes = [
        79, 66, 88, 90, 70, 65, 68, 54, 80, 54, 76, 65, 92, 117, 48, 48, 49, 102, 50, 50, 50, 50,
        71, 66, 54, 87, 65, 65, 85, 52, 54, 87, 87, 86, 92, 117, 48, 48, 49, 102, 123, 92, 34, 36,
        116, 122, 92, 34, 58, 92, 34, 69, 117, 114, 111, 112, 101, 47, 66, 101, 114, 108, 105, 110,
        92, 34, 125,
    ];
    let string_with_escapes = str::from_utf8(&bytes).unwrap();
    let unescaped_string = unescape(string_with_escapes).unwrap();
    let json_string = json!(&unescaped_string);
    println!("string with escapes: {}", string_with_escapes);
    println!("string without escapes: {}", unescaped_string);
    println!("json: {}", json_string);
}

Output (note that the unescaped string contains non-printable characters that cannot be rendered):

string with escapes: OBXZFAD6P6LA\u001f2222GB6WAAU46WWV\u001f{\"$tz\":\"Europe/Berlin\"}
string without escapes: OBXZFAD6P6LA2222GB6WAAU46WWV{"$tz":"Europe/Berlin"}
json: "OBXZFAD6P6LA\u001f2222GB6WAAU46WWV\u001f{\"$tz\":\"Europe/Berlin\"}"

If you want to avoid depending on unescape (which has not been updated since it was created in 2016), you can even have serde_json do the unescaping:

fn unescape(s: &str) -> serde_json::Result<String> {
    serde_json::from_str(&format!("\"{}\"", s))
}

Playground


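Hi @user4815162342, thanks for your answer. I ran into a similar problem, but I have one question: I am using a str rather than UTF-8 bytes, and I get an error back. Do you know why? Link: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=31fdf0620890af4e1ca074d37122a44d - steven-lie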

1
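The following characters are reserved in JSON and must be properly escaped to be used in strings:
  • Backspace is replaced with \b
  • Form feed is replaced with \f
  • Newline is replaced with \n
  • Carriage return is replaced with \r
  • Tab is replaced with \t
  • Double quote is replaced with \"
  • Backslash is replaced with \\
So the answer is "No": if the backslash were not escaped, serde would generate invalid JSON.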
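
To make the list concrete, here is a hand-written sketch of these escaping rules using only the standard library. `escape_json_string` is a hypothetical helper for illustration, not serde_json's actual implementation. It additionally assumes that any other control character below U+0020 is written as `\u00XX`, which is what the JSON specification requires.

```rust
/// Hypothetical helper: wraps `s` in quotes and applies the escapes listed above.
fn escape_json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '\u{08}' => out.push_str("\\b"),
            '\u{0C}' => out.push_str("\\f"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            // Any other control character needs a \u00XX escape.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

fn main() {
    // A real control character is escaped once.
    assert_eq!(escape_json_string("a\u{1f}b"), r#""a\u001fb""#);
    // A literal backslash followed by "u001f" gets its backslash doubled,
    // which is the effect seen in the question.
    assert_eq!(escape_json_string(r"a\u001fb"), r#""a\\u001fb""#);
    println!("{}", escape_json_string("say \"hi\"\n"));
}
```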

But how do I build a serde_json::Value::String from a &[u8]?

You must first create a regular string and then escape the reserved characters. Fortunately, serde already provides the json!() macro to take care of the latter:
use serde_json::json;

fn main() {
    // If the bytes are not UTF-8 encoded, use an external crate to decode them first.
    let slice: &[u8] = b"hel\"lo"; // some UTF-8 encoded slice
    let string = String::from_utf8(slice.to_vec()).unwrap();
    // json!() wraps the decoded text in a Value::String.
    let v = json!(string);
    println!("{:?}", v);
}

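OK, thanks. What I want to do is build a JSON object dynamically. Do you know how to build a serde_json::Value::String from a &[u8]? For example, if I have a &[u8], how can I insert it into a JSON object dynamically? - Marko Seidenglanz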

0

You can create a custom formatter:

use serde::ser::Serialize;
use serde_json::ser::{CharEscape, Serializer};
use std::{io, str};

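// Formatter that writes characters needing an escape without the leading
// backslash, so the output is valid JSON only if the input is already escaped.
struct NoEscape;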

impl serde_json::ser::Formatter for NoEscape {
    fn write_char_escape<W: ?Sized>(
        &mut self,
        writer: &mut W,
        char_escape: CharEscape,
    ) -> io::Result<()>
    where
        W: io::Write,
    {
        use self::CharEscape::*;

        let c = match char_escape {
            Quote => '"',
            ReverseSolidus => '\\',
            Solidus => '/',
            Backspace => 'b',
            FormFeed => 'f',
            LineFeed => 'n',
            CarriageReturn => 'r',
            Tab => 't',
            AsciiControl(_) => todo!(),
        };

        write!(writer, "{}", c)
    }
}

fn main() {
    let raw = [
        79, 66, 88, 90, 70, 65, 68, 54, 80, 54, 76, 65, 92, 117, 48, 48, 49, 102, 50, 50, 50, 50,
        71, 66, 54, 87, 65, 65, 85, 52, 54, 87, 87, 86, 92, 117, 48, 48, 49, 102, 123, 92, 34, 36,
        116, 122, 92, 34, 58, 92, 34, 69, 117, 114, 111, 112, 101, 47, 66, 101, 114, 108, 105, 110,
        92, 34, 125,
    ];
    let foo = str::from_utf8(&raw).unwrap();
    let mut ser = Serializer::with_formatter(Vec::new(), NoEscape {});
    foo.serialize(&mut ser).unwrap();
    let writer = ser.into_inner();
    let result = str::from_utf8(&writer).unwrap();

    assert_eq!(
        result,
        r#""OBXZFAD6P6LA\u001f2222GB6WAAU46WWV\u001f{\"$tz\":\"Europe/Berlin\"}""#
    )
}
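
For this particular input, the only characters the serializer hands to the formatter for escaping are backslashes and double quotes, and `NoEscape` writes both back unchanged, so the result is simply the original text between two quotes. A sketch of the same outcome using only the standard library follows. `quote_verbatim` is a hypothetical name, and the result is valid JSON only under the assumption that the input is already correctly escaped.

```rust
/// Hypothetical helper: wraps text that is assumed to be already
/// JSON-escaped in double quotes, without touching its contents.
fn quote_verbatim(pre_escaped: &str) -> String {
    format!("\"{}\"", pre_escaped)
}

fn main() {
    // Same text as the byte array above, written as a raw string literal.
    let foo = r#"OBXZFAD6P6LA\u001f2222GB6WAAU46WWV\u001f{\"$tz\":\"Europe/Berlin\"}"#;

    assert_eq!(
        quote_verbatim(foo),
        r#""OBXZFAD6P6LA\u001f2222GB6WAAU46WWV\u001f{\"$tz\":\"Europe/Berlin\"}""#
    );
}
```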
