Understanding a binary multiplier using a gate-level diagram

Question


Tags: binary, verilog, fpga, multiplication, synthesis


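I don't quite understand the unsigned 2-bit multiplication performed by the following code (bimpy.v).

Edit: following a comment from a friend of mine, the modification below performs the same operation with less logic!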

o_r <= (i_a[0] ? i_b : 2'b0) + ((i_a[1] ? i_b : 2'b0) << 1);

Question: what is the purpose of the two signals (w_r and c) in bimpy.v?

assign  w_r =  { ((i_a[1])?i_b:{(BW){1'b0}}), 1'b0 }
            ^ { 1'b0, ((i_a[0])?i_b:{(BW){1'b0}}) };

assign  c = { ((i_a[1])?i_b[(BW-2):0]:{(BW-1){1'b0}}) }
        & ((i_a[0])?i_b[(BW-1):1]:{(BW-1){1'b0}});

The code does not appear to match the gate-level diagram of a 2-bit by 2-bit binary multiplier. Please correct me if I am wrong.

2-bit by 2-bit binary multiplier

I have also attached a working waveform from bimpy.v for a simple 2x2 unsigned multiplier.

bimpy.v waveform

I also generated a gate-level representation of bimpy.v.

gate-level representation of bimpy.v

 ////////////////////////////////////////////////////////////////////////////////
//
// Filename:    bimpy
//
// Project: A multiply core generator
//
// Purpose: An unsigned 2-bit multiply based upon the fact that LUT's allow
//      6-bits of input, but a 2x2 bit multiply will never carry more
//  than one bit.  While this multiply is hardware independent, it is
//  really motivated by trying to optimize for a specific piece of
//  hardware (Xilinx-7 series ...) that has 4-input LUT's with carry
//  chains.
//
// Creator: Dan Gisselquist, Ph.D.
//      Gisselquist Technology, LLC
//
////////////////////////////////////////////////////////////////////////////////
//
// Copyright (C) 2015,2017-2019, Gisselquist Technology, LLC
//
// This program is free software (firmware): you can redistribute it and/or
// modify it under the terms of  the GNU General Public License as published
// by the Free Software Foundation, either version 3 of the License, or (at
// your option) any later version.
//
// This program is distributed in the hope that it will be useful, but WITHOUT
// ANY WARRANTY; without even the implied warranty of MERCHANTIBILITY or
// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
// for more details.
//
// You should have received a copy of the GNU General Public License along
// with this program.  If not, see <http://www.gnu.org/licenses/> for a
// copy.
//
// License: GPL, v3, as defined and found on www.gnu.org,
//      http://www.gnu.org/licenses/gpl.html
//
//
////////////////////////////////////////////////////////////////////////////////
module  bimpy(i_clk, i_reset, i_ce, i_a, i_b, o_r);
    parameter   BW=2, LUTB=2;
    input               i_clk, i_reset, i_ce;
    input       [(LUTB-1):0]    i_a;
    input       [(BW-1):0]  i_b;
    output  reg [(BW+LUTB-1):0] o_r;

    wire    [(BW+LUTB-2):0] w_r;
    wire    [(BW+LUTB-3):1] c;

    assign  w_r =  { ((i_a[1])?i_b:{(BW){1'b0}}), 1'b0 }
                ^ { 1'b0, ((i_a[0])?i_b:{(BW){1'b0}}) };
    assign  c = { ((i_a[1])?i_b[(BW-2):0]:{(BW-1){1'b0}}) }
            & ((i_a[0])?i_b[(BW-1):1]:{(BW-1){1'b0}});

    initial o_r = 0;
    always @(posedge i_clk)
        if (i_reset)
            o_r <= 0;
        else if (i_ce)
            o_r <= w_r + { c, 2'b0 };

endmodule

- kevin998x

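This might be a better fit for https://electronics.stackexchange.com/? - Andreas

@Andreas Is there a way to cross-post to https://electronics.stackexchange.com/ without having to recreate the post content? - kevin998x

I'm not sure, but I know this post can be migrated there. - Andreas

"The code does not match": I haven't checked, but it most likely produces the same result. To check whether they are the same, you can throw both at a formal verification tool or run a side-by-side simulation over all 16 possible inputs. - Oldfart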

1 Answer


- rascob · Accepted Answer

A note on MUXes

Note that ? denotes a multiplexer (MUX), so the statement:

out = sel ? x : y

is equivalent, at gate level, to:

out = (sel & x) | (~sel & y)

(when sel=1, out <= x; when sel=0, out <= y)

If y=0, the MUX reduces to an AND between x and sel: out = (sel & x) | (~sel & 0) = sel & x
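The reduction above is easy to confirm by brute force. The following is a minimal Python sketch, not part of the original answer; `mux` is a hypothetical helper that models the gate-level expression on single bits.

```python
from itertools import product

def mux(sel, x, y):
    """Gate-level 2:1 MUX on single bits: (sel & x) | (~sel & y)."""
    return (sel & x) | ((sel ^ 1) & y)  # sel ^ 1 is the 1-bit NOT of sel

# With the "else" input tied to 0, the MUX collapses to a plain AND.
for sel, x in product((0, 1), repeat=2):
    assert mux(sel, x, 0) == (sel & x)

# The general MUX still behaves like a conditional select.
for sel, x, y in product((0, 1), repeat=3):
    assert mux(sel, x, y) == (x if sel else y)
```

Both loops run without raising, covering every 1-bit input combination.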

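Deriving w_r

Assuming BW=2 and LUTB=2, w_r is a 3-bit signal (it is declared as [(BW+LUTB-2):0], i.e. [2:0]). Let's break it down:

w_r = w_rL ^ w_rR

w_rL = { ((i_a[1])?i_b:{(BW){1'b0}}), 1'b0 }

w_rR = { 1'b0, ((i_a[0])?i_b:{(BW){1'b0}}) }

Note that the "else" value of both MUXes is zero, so, as in the note above, each MUX reduces to an AND:

w_rL = { {BW{i_a[1]}} & i_b, 1'b0 } = { A1 & B1, A1 & B0, 0 }

w_rR = { 1'b0, {BW{i_a[0]}} & i_b } = { 0, A0 & B1, A0 & B0 }

I substituted i_a = {A1, A0} and i_b = {B1, B0} to simplify the notation. Finally, XORing the two bitwise gives:

w_r[0] = 0 ^ (A0 & B0) = A0 & B0
w_r[1] = (A1 & B0) ^ (A0 & B1)
w_r[2] = (A1 & B1) ^ 0 = A1 & B1
w_r[3] = 0 (w_r has no bit 3; it is implicitly zero-extended when added into the 4-bit o_r)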
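As a sanity check on the derivation, the sketch below models the vector form of the w_r assignment in Python and compares it with the per-bit equations for all 16 inputs. The function names are illustrative only, and the model assumes the default parameters BW=2 and LUTB=2.

```python
from itertools import product

def w_r_vector(a, b):
    """Model of: {a[1] ? b : 0, 1'b0} ^ {1'b0, a[0] ? b : 0} for 2-bit a, b."""
    left = (b if (a >> 1) & 1 else 0) << 1  # appending 1'b0 shifts left by one
    right = b if a & 1 else 0               # prepending 1'b0 changes nothing
    return left ^ right

def w_r_bits(a, b):
    """Per-bit equations from the derivation, packed into an integer."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1
    bit0 = a0 & b0
    bit1 = (a1 & b0) ^ (a0 & b1)
    bit2 = a1 & b1
    return (bit2 << 2) | (bit1 << 1) | bit0

for a, b in product(range(4), repeat=2):
    assert w_r_vector(a, b) == w_r_bits(a, b)
    assert w_r_vector(a, b) < 8  # always fits the [2:0] declaration
```

In other words, w_r is the sum of the two partial products with the carries left out.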

Deriving c

Similarly, for the 1-bit c signal:

c = cL & cR

cL = i_a[1] ? i_b[(BW-2):0]:{(BW-1){1'b0}} = {A1 & B0}

cR = i_a[0] ? i_b[(BW-1):1]:{(BW-1){1'b0}} = {A0 & B1}

The final result is:

c = {A1 & B0 & A0 & B1}
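The same brute-force approach works for c. This Python sketch (hypothetical names, BW=2 assumed) models the two MUXes and the AND, then lists the inputs for which c is set.

```python
from itertools import product

def c_vector(a, b):
    """Model of: (a[1] ? b[0] : 0) & (a[0] ? b[1] : 0) for 2-bit a, b."""
    c_left = (b & 1) if (a >> 1) & 1 else 0
    c_right = ((b >> 1) & 1) if a & 1 else 0
    return c_left & c_right

def c_gate(a, b):
    """The derived single-gate form: A1 & B0 & A0 & B1."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1
    return a1 & b0 & a0 & b1

for a, b in product(range(4), repeat=2):
    assert c_vector(a, b) == c_gate(a, b)

# Inputs (a, b) for which the carry is set.
set_inputs = [(a, b) for a, b in product(range(4), repeat=2) if c_vector(a, b)]
assert set_inputs == [(3, 3)]
```

c is the AND of the two terms that are XORed in w_r[1], so it is exactly the carry that the XOR discards at bit 1. It is set only for 3 x 3.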

Deriving o_r

If we break o_r down bit by bit:

o_r[0] = 0 + w_r[0] = A0 & B0
o_r[1] = 0 + w_r[1] = (A1 & B0) ^ (A0 & B1)
o_r[2] = c + w_r[2] = (A1 & B0 & A0 & B1) + (A1 & B1). Adding two bits gives a sum equal to their XOR and a carry equal to their AND, so: o_r[2] = (A1 & B0 & A0 & B1) ^ (A1 & B1)
o_r[3] = <carry out of the o_r[2] addition> = A1 & B0 & A0 & B1 & A1 & B1 = A1 & B0 & A0 & B1 (remember that ANDing a term with itself gives the term back, i.e. x & x = x)
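Putting the pieces together, the sketch below models the registered expression w_r + { c, 2'b0 } as pure combinational logic and checks it against ordinary integer multiplication. It is a Python model under stated assumptions: BW=2 and LUTB=2, with the clock, reset and i_ce ignored. `bimpy_model` is an illustrative name, not something from the original source.

```python
from itertools import product

def bimpy_model(a, b):
    """Combinational model of o_r <= w_r + {c, 2'b0} for 2-bit a and b."""
    pp1 = b if (a >> 1) & 1 else 0     # partial product selected by i_a[1]
    pp0 = b if a & 1 else 0            # partial product selected by i_a[0]
    w_r = (pp1 << 1) ^ pp0             # carry-less sum of the partial products
    c = (pp1 & 1) & ((pp0 >> 1) & 1)   # carry generated at bit 1
    return (w_r + (c << 2)) & 0xF      # {c, 2'b0} puts c at bit 2; o_r is 4 bits

for a, b in product(range(4), repeat=2):
    assert bimpy_model(a, b) == a * b
```

This is the usual identity x + y = (x ^ y) + 2*(x & y), applied to the two shifted partial products. w_r holds the XOR part, and c is the only bit of the AND part that can ever be non-zero.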

Gate-level diagram output

Your gate-level diagram shows the following equations:

C0 = A0 & B0 (=o_r[0])

C1 = (A0 & B1) ^ (A1 & B0) (=o_r[1])

C2 = (A0 & B1 & A1 & B0) ^ (A1 & B1) (=o_r[2] sum)

C3 = (A0 & B1 & A1 & B0) & (A1 & B1) = A0 & B1 & A1 & B0 (=o_r[3] carry)
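The four diagram equations can also be checked directly against integer multiplication, following the suggestion in the comments to try all 16 inputs. This is a small Python sketch with an illustrative function name.

```python
from itertools import product

def gate_level_multiply(a, b):
    """Evaluate C0..C3 from the diagram equations and pack them into an integer."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1
    carry1 = a0 & b1 & a1 & b0         # carry out of the bit-1 half adder
    c0 = a0 & b0
    c1 = (a0 & b1) ^ (a1 & b0)
    c2 = carry1 ^ (a1 & b1)
    c3 = carry1 & (a1 & b1)
    return (c3 << 3) | (c2 << 2) | (c1 << 1) | c0

mismatches = [(a, b) for a, b in product(range(4), repeat=2)
              if gate_level_multiply(a, b) != a * b]
assert mismatches == []
```

An empty mismatch list means the diagram equations compute the same 2x2 product that the o_r bit equations describe.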

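Why is the implementation so strange?!

The code comments indicate that the multiplier cell was built for a specific FPGA architecture, and it looks as though the original coder intended each multiplier cell to fit into a single LUT of that architecture. So I think the original coder was trying to "steer" an old, dumb tool into building the multiplier in an FPGA-efficient way, which is usually not the gate-efficient way. I believe such "manual" RTL-level optimizations are useless with today's EDA tools (hopefully!).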