How do I update a libc::c_char array with a String?

Question

4

I have written some FFI code to interface with C/C++:

use libc::c_char;
use std::ffi::CString;

type arr_type = [c_char; 20]; // arr_type is the type in C
let mut arr : arr_type = [0; 20]; 

let s = "happy123";
let c_s = CString::new(s).unwrap();
let s_ptr = c_s.as_ptr();

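How can I update arr with the string s? In C/C++ I could use memcpy, strcpy, etc.

I have tried many things, such as rlibc::memcpy, only to find that it cannot be used together with libc. The compiler rejected every attempt, and there is very little information about arrays in Rust.

Added: after reading the replies, I want to add some information and ask a few more questions.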

1.

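In C++, I use strcpy_s to copy a string into a char array, because both the length of the string and the size of the array are known.

I tried both of the approaches below. The std::iter::Zip approach is very much like strcpy_s, but I don't know whether it has any performance cost. The copy_nonoverlapping approach uses as_mut_ptr() to turn the array into a pointer, which carries no length information. Since the call sits inside an unsafe {} block, I tried copying a string longer than the array and no error was reported... Is that okay?

Is there a function in Rust similar to strcpy_s in C++?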
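
A bounds-checked copy in the spirit of strcpy_s can be sketched in safe Rust by comparing the lengths up front and then zipping the two sequences. The helper name copy_str_checked and its String error type are made up for illustration; they are not part of std or libc. The sketch uses std::os::raw::c_char, which is the same type that libc::c_char aliases, so that it builds without extra crates.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

/// Hypothetical helper (not a standard library function): copies `s` plus a
/// terminating NUL into `dst`, and writes nothing if it would not fit.
fn copy_str_checked(dst: &mut [c_char], s: &str) -> Result<(), String> {
    // CString::new rejects strings that contain an interior NUL byte.
    let c_s = CString::new(s).map_err(|e| e.to_string())?;
    let bytes = c_s.as_bytes_with_nul();
    if bytes.len() > dst.len() {
        return Err(format!(
            "need {} bytes, destination has {}",
            bytes.len(),
            dst.len()
        ));
    }
    for (d, &b) in dst.iter_mut().zip(bytes) {
        *d = b as c_char; // c_char is i8 or u8 depending on the target
    }
    Ok(())
}

fn main() {
    let mut arr: [c_char; 20] = [0; 20];
    assert!(copy_str_checked(&mut arr, "happy123").is_ok());
    assert_eq!(arr[0], b'h' as c_char);
    assert_eq!(arr[8], 0); // the NUL terminator was copied too

    // Too long for a 20-element array: reported as an error, nothing written.
    assert!(copy_str_checked(&mut arr, "this string is far too long").is_err());
}
```

Unlike the unsafe pointer copy, an oversized input here surfaces as an Err instead of silently writing past the end of the array.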

2.

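I'm using Windows and MSVC. By char array I mean ~~one with no encoding handling, or one that uses the default code page encoding~~.

All of the following are acceptable in source files: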

std::string s = "world is 世界";
std::wstring ws = L"world is 世界";

Qt:

QString qs = QStringLiteral("world is 世界");

Python 3:

s = 'world is 世界'

But in Rust, could the following be wrong? I ask because of what I saw in the Eclipse debug window.

let s = "world is 世界";

I found rust-encoding and tried the following:

use encoding::{Encoding, EncoderTrap};
use encoding::all::GB18030;

let s = "world is 世界";  
let enc = GB18030.encode(&s , EncoderTrap::Strict);

Is there a better way to do this in Rust?
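
For context on the debugger observation: a Rust string literal is always stored as UTF-8, so the two CJK characters occupy three bytes each. A tool that interprets those bytes with the system code page may display them garbled even though the string itself is fine. A small sketch using only the standard library (char_and_byte_counts is a name made up for illustration):

```rust
/// Hypothetical helper: returns (number of chars, number of UTF-8 bytes).
fn char_and_byte_counts(s: &str) -> (usize, usize) {
    (s.chars().count(), s.len())
}

fn main() {
    let s = "world is 世界";
    let (chars, bytes) = char_and_byte_counts(s);
    // 9 ASCII characters take 1 byte each, the 2 CJK characters take 3 each.
    println!("chars = {}, bytes = {}", chars, bytes);
    // Hex dump of the raw UTF-8 bytes that would be handed to C.
    println!("{:02x?}", s.as_bytes());
}
```

Whether those bytes must be re-encoded (for example to GB18030, as in the snippet above) depends on what the C side expects.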

- sbant

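“After reading the replies, I want to add some information and ask more questions.” Adding more information is fine, but Stack Overflow etiquette is to ask separate questions separately. Besides, people may not even see your second question. - Shepmaster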

3 Answers

2

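I suggest updating each character individually: iterate over the array and the string at the same time and assign each string byte to the corresponding array element. I appended the final \0 to the Rust string.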

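extern crate libc; // the libc crate from crates.io; no feature gate is needed

fn main() {
    use libc::c_char;

    type ArrType = [c_char; 20]; // ArrType mirrors the array type used on the C side
    let mut arr: ArrType = [0; 20];

    let s = "happy123\0";
    // Make sure the string, including its NUL terminator, fits in the array.
    assert!(s.len() <= arr.len());
    for (a, c) in arr.iter_mut().zip(s.bytes()) {
        *a = c as c_char; // c_char is i8 or u8 depending on the platform
    }
}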

Try it on the PlayPen.

In most cases, LLVM will optimize that loop into a memcpy.

define internal void @_ZN4main20hf4c098c7157f3263faaE() unnamed_addr #0 {
entry-block:
  %0 = alloca %"2.core::str::Bytes", align 8
  %arg4 = alloca %str_slice, align 8
  %1 = bitcast %"2.core::str::Bytes"* %0 to i8*
  call void @llvm.lifetime.start(i64 16, i8* %1)
  %2 = bitcast %str_slice* %arg4 to i8*
  call void @llvm.lifetime.start(i64 16, i8* %2)
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %2, i8* bitcast (%str_slice* @const26 to i8*), i64 16, i32 8, i1 false)
  call void @_ZN3str3str5bytes20h68b1cf722a654e56XOgE(%"2.core::str::Bytes"* noalias nocapture sret dereferenceable(16) %0, %str_slice* noalias nocapture dereferenceable(16) %arg4)
  call void @llvm.lifetime.end(i64 16, i8* %2)
  call void @llvm.lifetime.end(i64 16, i8* %1) #3, !alias.scope !0, !noalias !3
  call void @llvm.lifetime.end(i64 16, i8* %1)
  ret void
}

- oli_obk

1

“You probably don't want to pass [non-ASCII characters] to C”: why not? I have written and used C code that handles UTF-8. - Shepmaster

Then you shouldn't use c_char, but uint8_t or a similar type. - oli_obk

1

Even if the C function in question takes a char * argument? - Shepmaster

Then either it doesn't support UTF-8 or the interface specification is wrong. - oli_obk

1

Can you point to any resource showing that C functions taking char * accept only ASCII, and that an 8-bit integer type should be used for UTF-8 instead? - Shepmaster

I'll just go and hide quietly in a corner now. I'm very sorry for spreading misinformation. - oli_obk

1

“In C/C++, I can use memcpy, strcpy, etc...” You can use them in Rust as well, no problem:

// Declare C's memcpy with its full signature; it returns the destination pointer.
extern "C" {
    fn memcpy(dst: *mut libc::c_void, src: *const libc::c_void, len: libc::size_t) -> *mut libc::c_void;
}

let t_slice: &mut [c_char] = &mut arr;
unsafe {
    memcpy(t_slice.as_mut_ptr() as *mut libc::c_void,
        s_ptr as *const libc::c_void,
        c_s.as_bytes_with_nul().len() as libc::size_t);
}

But it is better to use the Rust equivalent from the ptr module, std::ptr::copy_nonoverlapping:

let t_slice: &mut [c_char] = &mut arr;
unsafe {
    // In current Rust the argument order is (src, dst, count).
    ptr::copy_nonoverlapping(s_ptr, t_slice.as_mut_ptr(), c_s.as_bytes_with_nul().len());
}

Note the unsafe block: you are responsible for checking that arr has enough room.
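
One way to take on that responsibility is to put the length check and the unsafe copy behind a single function, so callers cannot skip the check. This is a minimal sketch: the wrapper name copy_cstring and its bool return value are made up for illustration, and std::os::raw::c_char stands in for libc::c_char (they are the same type).

```rust
use std::ffi::CString;
use std::os::raw::c_char;
use std::ptr;

/// Hypothetical wrapper: returns false and writes nothing when `src`
/// (including its NUL terminator) does not fit in `dst`.
fn copy_cstring(dst: &mut [c_char], src: &CString) -> bool {
    let len = src.as_bytes_with_nul().len();
    if len > dst.len() {
        return false;
    }
    // SAFETY: `src` is valid for `len` reads, `dst` is valid for `len` writes
    // (checked above), and the buffers cannot overlap because `dst` is
    // borrowed mutably.
    unsafe {
        // Argument order in current Rust: (src, dst, count).
        ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr(), len);
    }
    true
}

fn main() {
    let mut arr: [c_char; 20] = [0; 20];
    let c_s = CString::new("happy123").unwrap();
    assert!(copy_cstring(&mut arr, &c_s));
    assert_eq!(arr[0], b'h' as c_char);
    assert_eq!(arr[8], 0); // NUL terminator
}
```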

- swizard

Content provided by Stack Overflow.

- Shepmaster · Accepted Answer

Here is a solution that doesn't require unsafe code, but unfortunately most of the methods it uses are marked unstable.

#![feature(libc)]
#![feature(core)]
#![feature(collections)]

extern crate libc;

use libc::c_char;
use std::ffi::CString;
use std::slice::IntSliceExt;

type arr_type = [c_char; 20];

fn main() {
    let mut c_string: arr_type = [0; 20]; 
    let value = CString::new("happy123").unwrap();

    c_string.clone_from_slice(value.as_bytes_with_nul().as_signed());
}
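
The feature gates and the IntSliceExt::as_signed call above date from pre-1.0 Rust and, as far as I can tell, are not available on a stable toolchain. A rough stable equivalent that still avoids unsafe is to convert the bytes to c_char first and then use copy_from_slice. This is only a sketch under that assumption: the helper name fill is made up for illustration, the temporary Vec costs one small allocation, and std::os::raw::c_char stands in for libc::c_char (they are the same type).

```rust
use std::ffi::CString;
use std::os::raw::c_char;

type ArrType = [c_char; 20];

/// Hypothetical helper: panics if `value` plus its NUL terminator does not fit.
fn fill(dst: &mut ArrType, value: &CString) {
    // Convert each u8 to the platform's c_char (i8 or u8).
    let converted: Vec<c_char> = value
        .as_bytes_with_nul()
        .iter()
        .map(|&b| b as c_char)
        .collect();
    // The slice index panics if `converted` is longer than the array.
    dst[..converted.len()].copy_from_slice(&converted);
}

fn main() {
    let mut c_string: ArrType = [0; 20];
    let value = CString::new("happy123").unwrap();
    fill(&mut c_string, &value);
    assert_eq!(c_string[0], b'h' as c_char);
    assert_eq!(c_string[8], 0); // NUL terminator
}
```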