How do I print a floating-point number in assembly?

Question

3

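I'm trying to print a floating-point number by calling printf, but it always seems to print just the pi value (3.1415), even though the result, which should be the area of a circle, is supposed to be moved into the pi variable after it is calculated.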

.section .data
    value:
        .quad 0
    result:
            .asciz "The result is %lf \n"
    pi:
        .double 3.14159

.section .bss
.section .text
.globl _start
.type area, @function
area:

    nop
    imulq %rbx, %rbx
    movq %rbx, value
    fildq value
    fmul pi                           # multiply r^2 by pi
    fst  pi                           # Store result to pi
    movupd pi, %xmm0                  # move result to xmm0
    nop
    ret

_start:

    nop
    movq $2, %rbx
    call area                 # calculate for radius 2
    leaq result, %rdi         
    movq $1, %rax             # specify only one float value
    call printf                 
    movq $0, %rdi             # Exit
    call exit                     
    nop

I always get 3.1415. I don't know why, because it should have been overwritten by the fst instruction.

- KMG

1

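movupd loads 16 bytes, but you only have one .double. Use movsd. Also, you should store to value instead of overwriting your pi constant. If you insist on using legacy x87, use fldpi to get a more accurate pi constant. The standard calling convention passes the first argument in RDI, not RBX. Your area function is very strange. - Peter Cordes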

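@PeterCordes Thanks a lot. The function was just for testing, so I didn't pay much attention to the details. But why does moving 16 bytes not seem to cause any problem, even when I change the order in which the pi variable is defined in the .data section? - KMG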

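It won't actually fault unless those are the last 8 bytes before an unmapped page. But that could happen if you link with other files that put things in .data. Functions that take a "double" argument in an XMM register don't care whether the top half of the register is zero. Loading the high half is usually just inefficient (a store-forwarding stall after an 8-byte store, a possible cache-line split, and on older CPUs movupd is inherently slower than movsd or even movapd). - Peter Cordes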

2

@KhaledGaber Just because it seems to work right now doesn't mean it is correct. Wrong code may reveal its defects only when you least want it to. - fuz

1 Answer

- fuz · Accepted Answer

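If your floating-point instructions use memory operands, you need to add a size suffix to the instruction mnemonic. Otherwise, the GNU assembler implicitly uses single precision, which is not what you want. To fix your code, change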

fmul pi                           # multiply r^2 by pi
fst  pi                           # Store result to pi

to

fmull pi                           # multiply r^2 by pi
fstl  pi                           # Store result to pi

Some other remarks on your code:

- Use RIP-relative addressing modes instead of absolute addressing modes where possible. Specifically, replace foo with foo(%rip) in your memory operands, including lea result(%rip), %rdi.
- Leave a clean x87 stack at the end of your functions, or other code may spuriously overflow it. For example, use fstpl pi(%rip) to store the result and pop it off the stack.
- Use movsd, not movupd, to load a single double into an SSE register; movupd loads a pair.
- Consider using SSE instead of x87 for all the math if possible. It is the standard way to do scalar FP math on x86-64, which is why XMM registers are part of the calling convention. (Unless you need 80-bit extended precision, but your pi constant in memory is far less accurate than x87 fldpi anyway.)
```
   ...
   cvtsi2sd   %rbx, %xmm0       # convert the integer r^2 in RBX to a double
   mulsd      pi(%rip), %xmm0   # multiply by pi; the product stays in XMM0
   ret
```
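
Putting these remarks together, a complete program might look like the sketch below. It is one possible arrangement, not the answer author's code. It assumes x86-64 Linux with the System V calling convention, and it uses main linked through gcc instead of _start so that the C runtime is set up before printf is called. The file name area.s and the label fmt are hypothetical.

```asm
# Assumed build command: gcc area.s -o area
    .section .rodata
fmt:
    .asciz "The result is %f\n"
    .balign 8
pi:
    .double 3.14159

    .text
    .type area, @function
# double area(long r): r arrives in RDI, the result is returned in XMM0
area:
    imulq     %rdi, %rdi            # rdi = r * r
    cvtsi2sdq %rdi, %xmm0           # xmm0 = (double)(r * r)
    mulsd     pi(%rip), %xmm0       # xmm0 *= pi
    ret
    .size area, .-area

    .globl main
    .type main, @function
main:
    subq    $8, %rsp                # keep RSP 16-byte aligned at each call
    movl    $2, %edi                # radius 2 (writing EDI zero-extends into RDI)
    call    area                    # area of the circle is now in XMM0
    leaq    fmt(%rip), %rdi         # first argument: format string
    movl    $1, %eax                # AL = number of vector registers used (variadic call)
    call    printf@PLT
    xorl    %eax, %eax              # return 0 from main
    addq    $8, %rsp
    ret
    .size main, .-main

    .section .note.GNU-stack,"",@progbits
```

With radius 2 this should print "The result is 12.566360". Unlike the question's code, pi is never overwritten: the product stays in XMM0, which is where printf expects its first floating-point argument.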