C - unsigned int to unsigned char array conversion

Question

c byte unsigned-integer type-conversion unsigned-char

23

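I have an unsigned int (2 bytes) and I want to convert it to the unsigned char type. From my search, I find that most people recommend doing the following: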

 unsigned int x;
 ...
 unsigned char ch = (unsigned char)x;

Is this the right approach? I ask because unsigned char is 1 byte, and we are converting from 2 bytes of data down to 1 byte.

To prevent any data loss, I want to create an unsigned char[] array and save the individual bytes into it. I am stuck at the following:

 unsigned char ch[2];
 unsigned int num = 272;

 for(i=0; i<2; i++){
      // how should the individual bytes from num be saved in ch[0] and ch[1] ??
 }

Also, how would we convert the unsigned char[2] back to unsigned int?

Thanks a lot.

- Jake

Possible duplicate of [signed short to byte in c++](http://stackoverflow.com/questions/10288109/signed-short-to-byte-in-c). - Paul R

7 Answers

16

How about:

ch[0] = num & 0xFF;
ch[1] = (num >> 8) & 0xFF;

The reverse operation is left as an exercise.

- cnicutar

8

How about using a union?

union {
    unsigned int num;
    unsigned char ch[2];
}  theValue;

theValue.num = 272;
printf("The two bytes: %d and %d\n", theValue.ch[0], theValue.ch[1]);

- abelenky

7

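It really depends on your goal: why do you want to convert this to an unsigned char? Depending on the answer, there are a few different ways to do this: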

Truncate: This is what was recommended. If you are just trying to squeeze data into a function which requires an unsigned char, simply cast: uchar ch = (uchar)x, where uchar is shorthand for unsigned char (but, of course, beware of what happens if your int is too big: only the low byte survives).
Specific endian: Use this when your destination requires a specific format. Usually networking code likes everything converted to big endian arrays of chars:
```
int n = sizeof x;
for(int y=0; n-->0; y++)
    ch[y] = (x>>(n*8))&0xff;
```
does that.
Machine endian. Use this when there is no endianness requirement, and the data will only occur on one machine. The order of the array will change across different architectures. People usually take care of this with unions:
```
union {int x; unsigned char ch[sizeof (int)];} u;
u.x = 0xf00;
//use u.ch 
```
with memcpy:
```
uchar ch[sizeof(int)];
memcpy(&ch, &x, sizeof x);
```
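or with simple pointer casting (reading an object through an unsigned char * is well defined, but the ever-dangerous reverse, casting a char buffer to int *, is undefined behavior and crashes on numerous systems):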
```
unsigned char *ch = (unsigned char *)&x;
```

- Dave

4

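Of course, an array of chars large enough to hold a larger value has to be exactly as big as that value itself. So you can simply pretend that the larger value already is an array of chars: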

unsigned int x = 12345678;//well, it should be just 1234.
unsigned char* pChars;

pChars = (unsigned char*) &x;

pChars[0];//one byte is here
pChars[1];//another byte here

Once you understand what's going on, it can be done without any variables, all just casting.

- Agent_L

2

If 12345678 fits in an unsigned int, and sizeof(unsigned int) == 2, then CHAR_BIT is bigger than usual ;-) - Steve Jessop

I blame the 32-bit society for spoiling me! - Agent_L

3

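You just need to extract those bytes using the bitwise & operator. 0xFF is a hexadecimal mask that extracts one byte. See the various bit operations described here: http://www.catonmat.net/blog/low-level-bit-hacks-you-absolutely-must-know/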

Here is an example program:

#include <stdio.h>

int main()
{
    unsigned int i = 0x1122;
    unsigned char c[2];

    c[0] = i & 0xFF;
    c[1] = (i>>8) & 0xFF;

    printf("c[0] = %x \n", c[0]);
    printf("c[1] = %x \n", c[1]);
    printf("i    = %x \n", i);

    return 0;
}

Output:

$ gcc 1.c 
$ ./a.out 
c[0] = 22 
c[1] = 11 
i    = 1122 
$

- Sangeeth Saravanaraj

1

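You meant to shift num & 0xFF00 right by 8 bits, correct? (num & 0xFF00) >> 8. Otherwise, you just have a 16-bit integer whose low byte is zero. You still don't have a byte. Alternatively, you could just shift: num >> 8; - abelenky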

1

Supporting @abelenky's suggestion, using a union would be a more fail-proof way to do this.

union unsigned_number {
    unsigned int  value;        // An int is 4 bytes long
    unsigned char index[4];     // A char is 1 byte long
};

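The characteristic of this type is that the compiler allocates memory only for the biggest member of our data structure unsigned_number. In this case that is 4 bytes, since both members (value and index) have the same size. Had you defined it as a struct instead, 8 bytes would be allocated, because the compiler allocates space for every member of a struct.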

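Moreover, and this is what solves your problem, the members of a union share the same memory location, which means they all refer to the same data. Think of it like a hard link on GNU/Linux systems.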

So we would have:

union unsigned_number my_number;

// Assigning decimal value 202050300 to my_number
// which is represented as 0xC0B0AFC in hex format
my_number.value = 0xC0B0AFC;   // Representation:  Binary - Decimal
                               // Byte 3: 00001100 - 12
                               // Byte 2: 00001011 - 11
                               // Byte 1: 00001010 - 10
                               // Byte 0: 11111100 - 252

// Printing out my_number one byte at time
for (int i = 0; i < (sizeof(my_number.value)); i++)
{
    printf("index[%d]: %u, 0x%x\n", \
        i, my_number.index[i], my_number.index[i]);
}

// Printing out my_number as an unsigned integer
printf("my_number.value: %u, 0x%x", my_number.value, my_number.value);

The output (on a little-endian machine) will be:

index[0]: 252, 0xfc
index[1]: 10, 0xa
index[2]: 11, 0xb
index[3]: 12, 0xc
my_number.value: 202050300, 0xc0b0afc

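As for your final question, we wouldn't have to convert from unsigned char back to unsigned int, since the values are already there. You just have to choose how you want to access them.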

Note 1: I am using a 4-byte integer to make the concept easier to follow. For the problem you presented, you must use:

union unsigned_number {
    unsigned short int value;        // A short int is 2 bytes long
    unsigned char      index[2];     // A char is 1 byte long
};

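Note 2: I have assigned 252 to byte 0 in order to point out the unsigned characteristic of our index field. Were it declared as a signed char, index[0] would hold -4 (the same bit pattern, 0xfc) instead.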

- jonathask

- jpalecek · Accepted Answer

In this case you can use memcpy:

memcpy(ch, (char*)&num, 2); /* although sizeof(int) would be better */

In the same way, just reverse the arguments to memcpy to convert the unsigned char[2] back to unsigned int.