How do I convert a packed integer (16.16) fixed-point number to float?

Question

floating-point, fixed-point

16

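How do I convert a "32-bit signed fixed-point number (16.16)" to a float?

Is (fixed >> 16) + (fixed & 0xffff) / 65536.0 OK? What about -2.5? And -0.5?

Or is fixed / 65536.0 the right way?

(PS: What does a signed fixed-point "-0.5" look like in memory anyway?)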

- Ecir Hana

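A good way to think about it is to visualise simple values in the format <16-bit integer part in hex>.<16-bit fractional part in hex>. 1 = 0b1 = 0x0001.0000 = 65536. So if 1 is 65536, then 2 is 2 x 65536, and so on. - vaughan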
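The visualisation in this comment can be sketched in Java. The class and method names below are hypothetical (they do not appear in the thread); the helper only prints the upper and lower 16 bits of a raw 16.16 value as four hex digits each.

```java
// Hypothetical helper, not from the thread: render a raw 16.16 value as
// <integer part>.<fraction part>, each shown as four hex digits.
final class FixedHexView {
    static String toHexParts(int fixed) {
        int integerBits = fixed >>> 16;     // upper 16 bits, taken as raw bits
        int fractionBits = fixed & 0xFFFF;  // lower 16 bits
        return String.format("0x%04X.%04X", integerBits, fractionBits);
    }
}
```

For example, `FixedHexView.toHexParts(65536)` gives `0x0001.0000` (the value 1), and `FixedHexView.toHexParts(32768)` gives `0x0000.8000` (the value 0.5).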

4 Answers

6

class FixedPointUtils {
  public static final int ONE = 0x10000;

  /**
   * Convert an array of floats to 16.16 fixed-point
   * @param arr The array
   * @return A newly allocated array of fixed-point values.
   */
  public static int[] toFixed(float[] arr) {
    int[] res = new int[arr.length];
    toFixed(arr, res);
    return res;
  }

  /**
   * Convert a float to 16.16 fixed-point representation
   * @param val The value to convert
   * @return The resulting fixed-point representation
   */
  public static int toFixed(float val) {
    return (int)(val * 65536F);
  }

  /**
   * Convert an array of floats to 16.16 fixed-point
   * @param arr The array of floats
   * @param storage The location to store the fixed-point values.
   */
  public static void toFixed(float[] arr, int[] storage)
  {
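    // Stop at the shorter array so a short input cannot cause an out-of-bounds read.
    for (int i=0;i<Math.min(arr.length, storage.length);i++) {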
      storage[i] = toFixed(arr[i]);
    }
  }

  /**
   * Convert a 16.16 fixed-point value to floating point
   * @param val The fixed-point value
   * @return The equivalent floating-point value.
   */
  public static float toFloat(int val) {
    return ((float)val)/65536.0f;
  }

  /**
   * Convert an array of 16.16 fixed-point values to floating point
   * @param arr The array to convert
   * @return A newly allocated array of floats.
   */
  public static float[] toFloat(int[] arr) {
    float[] res = new float[arr.length];
    toFloat(arr, res);
    return res;
  }

  /**
   * Convert an array of 16.16 fixed-point values to floating point
   * @param arr The array to convert
   * @param storage Pre-allocated storage for the result.
   */
  public static void toFloat(int[] arr, float[] storage)
  {
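    // Stop at the shorter array so a short input cannot cause an out-of-bounds read.
    for (int i=0;i<Math.min(arr.length, storage.length);i++) {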
      storage[i] = toFloat(arr[i]);
    }
  }

}

- Ashok Domadiya

0

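CodesInChaos is actually wrong to say that

(fixed >> 16) + (fixed & 0xffff) / 65536.0

doesn't work. If fixed is a 32-bit signed integer, then a negative number is really stored as the value subtracted from 0, or rather from 0x1_0000_0000, a 33-bit number. That is how two's complement works. So the fractional bits are meant to be added to the next smaller integer to read the correct value.

So for the raw integer -1, (fixed >> 16) yields -1 (the shift is arithmetic, so it rounds toward negative infinity), and adding (fixed & 0xffff) / 65536.0 = 65535/65536 to it gives the correct value -1/65536, because -65536/65536 + 65535/65536 = -1/65536.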
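The arithmetic above can be checked with a small Java sketch. It assumes Java semantics, where `>>` on an `int` is an arithmetic (sign-extending) shift; in a language where right-shifting a negative value is a logical shift or is implementation-defined, the split form may behave differently. The class and method names are made up for illustration.

```java
// Illustrative only: compares the split form with the single division.
final class SplitVersusDivide {
    // Split form from the question. In Java, >> on int is an arithmetic
    // shift, so (fixed >> 16) is floor(fixed / 65536).
    static double split(int fixed) {
        return (fixed >> 16) + (fixed & 0xffff) / 65536.0;
    }

    // Single floating-point division.
    static double divide(int fixed) {
        return fixed / 65536.0;
    }
}
```

Under that assumption both methods agree on the values asked about: the raw integer -163840 (that is, -2.5 * 65536) splits into -3 + 32768/65536 = -2.5, and -32768 splits into -1 + 32768/65536 = -0.5.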

- bastel

0

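After reading CodesInChaos's answer, I wrote a handy C++ function template. You can pass the length of the fractional part (for example, the BMP file format uses 2.30 fixed-point numbers). If you omit it, the function assumes that the fractional and integer parts have the same length.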

#include <math.h> // for NaN
#include <limits.h> // for CHAR_BIT = 8

template<class T> inline double fixed_point2double(const T& x, int frac_digits = (CHAR_BIT * sizeof(T)) / 2 )
{
  if (frac_digits >= CHAR_BIT * sizeof(T)) return NAN;
  return double(x) / double(T(1) << frac_digits);
}

If you want to read such a number from memory, I wrote this function template:

#include <math.h> // for NaN
#include <limits.h> // for CHAR_BIT = 8

template<class T> inline double read_little_endian_fixed_point(const unsigned char *x, int frac_digits = (CHAR_BIT * sizeof(T)) / 2)
// ! do not use for single byte types 'T'
{
  if (frac_digits >= CHAR_BIT * sizeof(T)) return NAN;

  T res = 0;

  for (int i = 0, shift = 0; i < sizeof(T); ++i, shift += CHAR_BIT)
    res |= ((T)x[i]) << shift;

  return double(res) / double(T(1) << frac_digits);
}

- John Smith

Page content provided by Stack Overflow.

- CodesInChaos · Accepted Answer

I'm assuming two's complement 32-bit integers and operators that work as in C#.

How to do the conversion?

fixed / 65536.0

is correct and easy to understand.

(fixed >> 16) + (fixed & 0xffff) / 65536.0

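is equivalent to the above for positive integers, but slower and harder to read. You are basically using the distributive law to split the single division into two divisions, and writing the first one as a bit shift.

For negative integers, fixed & 0xffff doesn't give you the fractional bits, so it is not correct for negative numbers.

Look at the raw integer -1, which should map to -1/65536. This code returns 65535/65536 instead.

Depending on your compiler, it might be faster to do: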

fixed * (1/65536.0)

But I assume most modern compilers already do that optimization.
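As a small Java sketch of why the two forms are interchangeable: 1/65536.0 is exactly 2^-16, a power of two, so scaling a 32-bit integer by it loses no precision in a double and the multiplication gives the same result as the division. The class and constant names below are illustrative, not from the answer.

```java
// Illustrative only: division by 65536.0 versus multiplication by its reciprocal.
final class ReciprocalScale {
    // Exactly 2^-16, because 65536 is a power of two.
    static final double INV_ONE = 1 / 65536.0;

    static double byDivision(int fixed) {
        return fixed / 65536.0;
    }

    static double byMultiplication(int fixed) {
        return fixed * INV_ONE;
    }
}
```

This equivalence relies on the scale being a power of two; for a scale such as 1000.0 the reciprocal is not exactly representable and the two forms can differ in the last bit.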

How does a signed fixed-point "-0.5" look in memory?

Inverting the conversion gives us:

RoundToInt(float*65536)

Setting float=-0.5 gives us: -32768.
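To see the bit pattern behind that number, here is a minimal Java sketch. It assumes `Math.round` as a stand-in for the unspecified `RoundToInt` above, and the class and method names are hypothetical.

```java
// Illustrative only: encode a value as 16.16 and show its 32-bit pattern.
final class MinusHalfBits {
    // Math.round(double) returns a long; the cast assumes the scaled
    // value fits in 32 bits.
    static int toFixed(double value) {
        return (int) Math.round(value * 65536);
    }

    // Two's complement bit pattern as eight hex digits.
    static String bits(int fixed) {
        return String.format("0x%08X", fixed);
    }
}
```

`MinusHalfBits.bits(MinusHalfBits.toFixed(-0.5))` gives `0xFFFF8000`. On a little-endian machine those four bytes are stored as 00 80 FF FF.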