Prototype Fabrication of Field-Programmable Digital Filter LSIs Using Multiple-Valued Current-Mode Logic
— Device Scaling and Future Prospects —

KATSUHIKO DEGAWA†, TAKAFUMI AOKI† AND TATSUO HIGUCHI‡

†Department of Computer and Mathematical Sciences, Graduate School of Information Sciences, Tohoku University, Aoba-yama 05, Sendai 980-8579, Japan
‡Department of Electronic Engineering, Faculty of Engineering, Tohoku Institute of Technology, Sendai 982-8577, Japan
E-mail: degawa@aoki.ecei.tohoku.ac.jp

This paper presents prototype design and fabrication of Field-Programmable Digital Filter (FPDF) LSIs, which employ carry-propagation-free redundant arithmetic algorithms for faster operation and Multiple-Valued Current-Mode Logic (MV-CML) for high-density low-power implementation. The original contribution of this paper is to evaluate, through actual chip fabrication, the potential impacts of MV-CML circuit technology on hardware reduction in programmable LSIs. The prototype FPDF fabrication with 0.6μm and 0.35μm CMOS technology demonstrates that the chip area and power consumption can be significantly reduced, compared with the standard binary logic implementation. Major problems associated with device scaling are also analyzed to discuss future prospects of MV-CML technology.

Keywords: Multiple-valued logic, signal processor, FPGAs, FIR filters

1 INTRODUCTION

In many real-time Digital Signal Processing (DSP) applications, traditional microprocessor-based architectures are often inadequate to meet the requirements for intensive computation. Field-Programmable Gate Arrays (FPGAs) provide an alternative that maintains a rapid prototyping capability while providing performance levels significantly beyond that of programmable processors. If we introduce domain-specific FPGA architectures, further performance improvement is expected in specific DSP applications. Recently, several DSP-oriented FPGA architectures have been reported [1]–[3]. The key to success in designing specialized FPGA architectures for DSP relies on hardware algorithms that make possible not only higher performance but also lower circuit complexity. The problem of interconnection complexity has been recognized as a basic limitation in applying the reconfigurable computing technique for DSP tasks.

Addressing this problem, this paper presents a technique for implementing high-performance DSP-oriented FPGAs exhibiting both low power consumption and reduced circuit complexity. The proposed technique employs (i) binary SD (Signed-Digit) arithmetic [4]–[6] for high-speed multiply-add operations, and (ii) Multiple-Valued Current-Mode Logic (MV-CML) for significant reduction in wiring complexity. Figure 1 summarizes the impacts of the proposed technique for realizing a high-performance configurable signal processing architecture.

In this paper, we demonstrate the potential of the proposed technique in a typical application example: the design of a specialized FPGA architecture for high-speed FIR filtering. Reference [7] has already proposed a high-speed programmable filter LSI, called a Field-Programmable Digital Filter (FPDF), dedicated to signal processing and communication applications. That architecture was designed on the basis of ordinary binary logic circuits. In this paper, on the other hand, we propose a new design of FPDF based on the combination of MV-CML circuit technology and redundant arithmetic algorithms. The goal of this paper is to provide a case study to evaluate the impact of MV-CML on the reduction of hardware complexity required for DSP-oriented programmable LSIs. Prototype FPDF chips using MV-CML have been successfully designed and fabricated in 0.6µm CMOS technology [8] and in 0.35μm CMOS technology. Our initial observation shows that the chip area and power consumption can be significantly reduced, compared with the standard binary logic implementation. Major problems associated with device scaling are also analyzed to discuss future prospects of MV-CML technology.

2 BINARY SD ARITHMETIC USING MV-CML

This section describes binary Signed-Digit (SD) number representation, which allows high-speed arithmetic operations without carry propagation and is suitable for the implementation with MV-CML. The SD number representation is a redundant representation using a symmetrical digit set \{-1, 0, 1\}. Any integer \(X\) can be represented as a sequence of radix-2 signed digits \(X_i\) as follows:

\[
X = \left[ X_{n-1}X_{n-2} \cdots X_1X_0 \right]_{SD2}
= \sum_{i=0}^{n-1} X_i \cdot 2^i,
\]

(1)

where \(X_i \in \{-1, 0, 1\}\).
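
As a small illustration of Eq. (1), the following Python sketch evaluates a binary SD digit string and shows that the representation is redundant, i.e., that different digit strings can denote the same integer. The function name `sd_value` and the most-significant-digit-first list convention are assumptions made for this example only.

```python
def sd_value(digits):
    """Evaluate a binary SD number given as a list of digits, most significant first."""
    value = 0
    for d in digits:
        if d not in (-1, 0, 1):
            raise ValueError("binary SD digits must be -1, 0 or 1")
        value = 2 * value + d  # Horner evaluation of sum(X_i * 2**i)
    return value

# Redundancy: two different digit strings for the same integer.
print(sd_value([0, 1, 1]))   # 3 = 2 + 1
print(sd_value([1, 0, -1]))  # 3 = 4 - 1
```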

Consider two binary SD numbers:

\[
X = \left[ X_{n-1} \cdots X_i \cdots X_0 \right]_{SD2},
\]

(2)

\[
Y = \left[ Y_{n-1} \cdots Y_i \cdots Y_0 \right]_{SD2}.
\]

(3)

The addition of these two numbers is performed by the following three steps for every digit:

Step 1: \(Z_i = X_i + Y_i\),

Step 2: \(2C_i + W_i = Z_i\),

Step 3: \(S_i = W_i + C_{i-1}\),

where \(Z_i, W_i, C_i\) and \(S_i\) are the linear sum, the intermediate sum, the carry and the final sum, respectively. These variables take the following ranges:

\(X_i, Y_i, S_i, C_i \in \{-1, 0, 1\}\),

\(Z_i \in \{-2, -1, 0, 1, 2\}\),

\(W_i \in \{-1, 0\}\) (if \(Z_{i-1} > 0\)),

\(W_i \in \{0, 1\}\) (if \(Z_{i-1} \leq 0\)).

Step 2 decomposes the linear sum \(Z_i\) into the intermediate sum \(W_i\) and the carry \(C_i\) so that the final sum \(S_i\) created in Step 3 fits within the range from $-1$ to $1$. Figure 2 shows how to generate $W_i$ and $C_i$ from $Z_i$ in Step 2. Since the carry $C_i$ can be computed without referring to the lower-digit carry $C_{i-1}$, carry-propagation-free addition can be achieved. Figure 3 shows an example of binary SD addition, where $\overline{1}$ denotes $-1$. Figure 4 illustrates the structure of a parallel adder using binary SD arithmetic, where “Binary SDFA” denotes a one-digit binary SD full adder cell realizing the operations of Step 1, Step 2 and Step 3.
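
The three steps can be sketched in Python as follows. The Step 2 rule used here is an assumption reconstructed from the ranges of $W_i$ and $C_i$ stated above (it is the conventional rule for binary SD addition and may differ in detail from Fig. 2). The names `sd_add`, `sd_value_lsb` and `exhaustive_check` are hypothetical, and digits are stored least significant first.

```python
from itertools import product

def sd_value_lsb(digits):
    """Integer value of a binary SD digit list, least significant digit first."""
    return sum(d * 2**i for i, d in enumerate(digits))

def sd_add(x, y):
    """Carry-propagation-free addition of two n-digit binary SD numbers (n >= 1)."""
    n = len(x)
    z = [x[i] + y[i] for i in range(n)]            # Step 1: linear sum
    w, c = [0] * n, [0] * n
    for i in range(n):                             # Step 2: uses only Z_i and Z_{i-1}
        lower_positive = i > 0 and z[i - 1] > 0
        if z[i] == 2:
            c[i], w[i] = 1, 0
        elif z[i] == 1:
            c[i], w[i] = (1, -1) if lower_positive else (0, 1)
        elif z[i] == -1:
            c[i], w[i] = (0, -1) if lower_positive else (-1, 1)
        elif z[i] == -2:
            c[i], w[i] = -1, 0
        # z[i] == 0 leaves c[i] = w[i] = 0
    # Step 3: final sum; the carry out of the top digit becomes digit n
    s = [w[i] + (c[i - 1] if i > 0 else 0) for i in range(n)]
    return s + [c[n - 1]]

def exhaustive_check(n):
    """Check the value and the digit range for every pair of n-digit operands."""
    for x in product((-1, 0, 1), repeat=n):
        for y in product((-1, 0, 1), repeat=n):
            s = sd_add(list(x), list(y))
            if sd_value_lsb(s) != sd_value_lsb(x) + sd_value_lsb(y):
                return False
            if any(d not in (-1, 0, 1) for d in s):
                return False
    return True
```

Each $C_i$ and $W_i$ is computed from $Z_i$ and $Z_{i-1}$ only, so all digit positions can be evaluated in parallel; the loop in the sketch is sequential only for readability.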

This section describes the implementation of the binary SD adder using MV-CML technology [5,9]. In MV-CML, each digit $\{-1, 0, 1\}$ for binary SD arithmetic can be represented by bidirectional current on a wire; the sign (+ or −) is represented by current direction (positive or negative) and the digit magnitude is represented by current level (amount).
Figure 5 shows the basic building blocks for MV-CML used in our design. Threshold detectors, which compare the amount of input current and that of reference current, are the most characteristic components in MV-CML circuits. In the following, we explain the operation of the pMOS Type-B threshold detector (Type-B pTD) in detail as an example (see the portion marked with “*” in Fig. 5). We assume that the two inputs indicated by \( I_{in} \) are identical current inputs produced by current mirrors in the Bidirectional Current Input (BCI) circuit. When \( I_{in} < k_1 I_0 \), the gate voltages of Tr1 and Tr2 become HIGH and hence \( I_{out} = 0 \). When \( k_1 I_0 < I_{in} < k_2 I_0 \), the gate voltage of Tr1 is HIGH and that of Tr2 is LOW resulting in \( I_{out} = m I_0 \). When \( k_2 I_0 < I_{in} \), the gate voltage of Tr1 is LOW and that of Tr2 is HIGH resulting in \( I_{out} = 0 \). Thus, Type-B pTD produces the current \( I_{out} = m I_0 \) if and only if the condition \( k_1 I_0 < I_{in} < k_2 I_0 \) is satisfied.
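
The input-output behavior of the Type-B pTD described above can be summarized by an idealized window function. The Python sketch below is a behavioral model only, not a circuit description; the function name `type_b_ptd` and the example parameters ($k_1 = 0.5$, $k_2 = 1.5$, $m = 1$) are assumptions chosen for illustration, and the behavior exactly at the thresholds is left as "no output" because the text specifies strict inequalities.

```python
def type_b_ptd(i_in, k1, k2, m, i0=10e-6):
    """Idealized Type-B pTD: output m*I0 if and only if k1*I0 < I_in < k2*I0."""
    if k1 * i0 < i_in < k2 * i0:
        return m * i0
    return 0.0

# Hypothetical window around one unit current (I0 = 10 uA).
for level in (0.0, 10e-6, 20e-6):
    print(level, type_b_ptd(level, k1=0.5, k2=1.5, m=1))
```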

One of the most important features of MV-CML is that linear summation can be performed by wiring without any active devices. This makes possible a drastic reduction in the number of transistors in arithmetic circuits. Figure 6 shows the MV-CML implementation of the binary SD full adder (SDFA) described in the last section. The linear summation operations for Step 1 and Step 3 are implemented by simple wiring points. Most of the circuit resources are devoted to Step 2, which accepts the bidirectional current sum \( Z_i \) and generates \( W_i \) and \( C_i \). The bidirectional current \( Z_i \) is copied to unidirectional currents \( I_1 \sim I_2 \) or to \( I_3 \sim I_5 \) depending on the current direction. The threshold detectors produce the binary voltage signals \( V_1 \sim V_5 \), which control the switching of 10 pass transistors. These pass transistors form a pair of series-parallel logic circuits to implement the transfer characteristics for \( W_i \) and \( C_i \) as illustrated in Fig. 2. For example, \( W_i = -1 \) (i.e., \( -I_0 \)) when \( \{ (V_2 \ OR \ V_3) \ AND \ V_6 \} = \text{HIGH} \).

The current sources are the most important components in MV-CML circuits since they are used to produce the output signals of functional modules as well as the reference signals for threshold detectors. In our typical design, the unit current \( I_0 \) is programmed as 10μA by controlling the transistor sizes (\( W \) and \( L \)) of current sources, and their gate bias voltages \( V_p \) and \( V_n \).

FIGURE 5
Basic MV-CML circuits, where $I_0$ is the unit current ($I_0 = 10\mu A$ in our design). The figure tabulates the symbol, schematic and function of the current sources, the current mirrors (with scale factors $a_i$), the threshold detectors with current sources (Type-A and Type-B) and the Bidirectional Current Input circuit; the Type-B pTD is marked with “*”. For the switched current sources, the pMOS source gives $I_{out} = 0$ if $V_{in}$ is high and $I_{out} = mI_0$ if $V_{in}$ is low, while the nMOS source gives $I_{out} = mI_0$ if $V_{in}$ is high and $I_{out} = 0$ if $V_{in}$ is low.

We use transistors with the channel length $L = 1.5 \times L_{\text{min}} \sim 2 \times L_{\text{min}}$ as current sources, where $L_{\text{min}}$ denotes the minimum channel length of a transistor. Thus, we can use fairly small transistors in MV-CML circuits compared with ordinary analog circuits. The operation of MV-CML circuits is not so sensitive to $V_t$ variation, since a designer must guarantee the correct circuit operation only for a limited range of discrete operating points, such as $-20\mu A, -10\mu A, 0\mu A, 10\mu A$ and $20\mu A$.

Note that MV-CML can carry a single digit ($\in \{-1, 0, 1\}$) of binary SD representation on a single wire. On the other hand, ordinary binary logic implementation requires a pair of wires to convey a single digit of SD representation since the digit set $\{-1, 0, 1\}$ must be encoded into binary vectors, such as $\{10, 00, 01\}$. Thus, 50% reduction in the number of I/O interconnections is expected for the MV-CML SDFA compared with the binary logic implementation.

Also, the current summation technique of MV-CML, which does not require any active devices, makes possible further reduction in the number of interconnections. For example, assume that we would like to send a pair of SD operands $X_i$ and $Y_i$ ($\in \{-1, 0, 1\}$) that should eventually be added at a specific destination. In voltage-mode binary logic implementation, we need 4 wires to transmit the SD operands $X_i$ and $Y_i$ independently. In MV-CML circuits, on the other hand, the two SD operands are represented by a pair of bidirectional current signals. Assuming the two numbers will be summed up at the receiver end, it may be a good idea to pack $X_i$ and $Y_i$ by current summation before transmission; thus, we transmit a 5-level current signal ($\in \{-2, -1, 0, 1, 2\}$) on a single interconnection. (Note that we need an SDFA to reduce the signal range from $\{-2, -1, 0, 1, 2\}$ to $\{-1, 0, 1\}$ at the receiver end.) This technique can reduce the number of interconnects in MV-CML circuits to 25% ($=1/4$) compared with the binary logic implementation. This property is particularly useful for DSP-oriented FPGA devices, in which efficient implementation of arithmetic operations is required.
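
The wire-count argument can be checked with a few lines of Python. This is only a bookkeeping sketch: the dictionary `BINARY_CODE` follows the two-wire encoding $\{10, 00, 01\}$ quoted above, while the function names and the modeling of a wire current as an integer multiple of $I_0$ are assumptions for illustration.

```python
# Two-wire binary encoding of one SD digit, as quoted in the text: -1 -> 10, 0 -> 00, 1 -> 01.
BINARY_CODE = {-1: (1, 0), 0: (0, 0), 1: (0, 1)}

def binary_wires(digits):
    """Number of binary wires needed to send the given SD digits independently."""
    return sum(len(BINARY_CODE[d]) for d in digits)

def packed_current_level(x_i, y_i):
    """Current level (in units of I0) on a single MV-CML wire carrying X_i + Y_i."""
    return x_i + y_i

x_i, y_i = 1, -1
print(binary_wires([x_i, y_i]))        # 4 wires in the binary implementation
print(packed_current_level(x_i, y_i))  # one 5-level signal on 1 wire
```

With one wire instead of four, the interconnect count for this operand pair is $1/4$ of the binary case, which is the 25% figure given above.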

3 ARCHITECTURE OF A FIELD-PROGRAMMABLE DIGITAL FILTER

3.1 Basic Architecture
In this section, we describe the design and implementation of a Field-Programmable Digital Filter (FPDF) – a programmable filter LSI for high-speed FIR filtering dedicated for signal processing and communication applications [10,11]. We choose FIR filter application in order to demonstrate the potential of the proposed technique – the combination of redundant arithmetic algorithms and MV-CML circuit technology – for high-speed, low-power and compact implementation of DSP-oriented programmable LSIs. This technique will also be effective in other applications, in which efficient implementation of multiply-add operation is required.

Figure 7 illustrates the overall architecture of the FPDF, which consists of Configurable Arithmetic Blocks (CABs), Connection Blocks (CBs), Switch Blocks (SBs) and Product Generators (PGs). The basic structure of FPDF is similar to those of conventional FPGAs: building blocks are connected by programmable interconnects. The major difference is that FPDF employs an 8-bit latched SD adder as a functional block (called the CAB) instead of the Configurable Logic Blocks (CLBs) used in typical FPGAs. These CABs are connected by programmable current-mode interconnections according to configuration data downloaded into FPDF in advance. Each current-mode interconnection can convey a 3-valued signal \(\{-1, 0, 1\}\) or a 5-valued signal \(\{-2, -1, 0, 1, 2\}\) as described in the last section. The routing elements, CBs and SBs, are used to guide the bidirectional current signals from source CABs to destination CABs.

Figure 8 illustrates how to map an FIR filter structure onto basic components of FPDF, where the word length of the filter is 8 bits in this example. The filter coefficients are encoded by Canonical Signed-Digit (CSD) representation [12]. The multiplications between the input signal and CSD coefficients are performed by PGs followed by CABs. The pipelined accumulation of generated products is performed by CABs to produce the output of the FIR filter. For pipelining, a CAB contains an 8-bit register, which could be bypassed when it is used for CSD multiplication. Bidirectional current summation is fully utilized for pipelined accumulation leading to drastic reduction in interconnection complexity.

3.2 Functional Blocks for FPDF
This subsection describes the design of basic functional blocks for FPDF, where every functional block is newly designed using MV-CML circuit technology. The use of MV-CML makes possible significant reduction in the number of programmable interconnects, which leads to compact implementation of CBs and SBs. Also, direct implementation of SD arithmetic using MV-CML without binary encoding allows high-speed low-power implementation of CABs. In the following, we describe the newly designed MV-CML functional blocks in detail.

FIGURE 7
MV-CML FPDF architecture.

FIGURE 8
FIR filter mapping.

CAB (Configurable Arithmetic Block) is an 8-bit latched binary SD adder with no carry propagation chain. Figure 9 (a) shows the 1-bit circuit for CAB. MV-CML circuit technology is used to implement the CAB circuit except for the latches for pipelining; the latches employ conventional voltage-mode circuits and can be bypassed by selectors when combinational operation is required. Figure 9 (b) shows the layout of CAB using 0.6µm CMOS triple-metal technology. Note that this circuit implements Step 2 and Step 3 of binary SD addition, while Step 1 is realized by simple current summation on an input wire.

There are two routing elements in FPDF, that is, CB (Connection Block) and SB (Switch Block). Figure 10 shows the structure of CB, which consists of nMOS switches for wire routing and SRAM cells for controlling the nMOS switches. CBs are used not only for programmable routing, but also for current summation by wiring (Step 1 of binary SD addition); if two current signals are connected to a common interconnection, these signals are added together to form a 5-level current signal (∈ {−2, −1, 0, 1, 2}), reducing the number of interconnects to 25% of that required by the binary logic implementation. On the other hand, SB shown in Fig. 11 is a switch matrix placed at an intersection of vertical and horizontal data lines.

PG (Product Generator) shown in Fig. 12, which is placed at the first stage of FPDF architecture, generates a product between an 8-bit input signal and an 8-digit CSD coefficient (that contains at most four nonzero digits, each −1 or 1). The sign-vector conversion technique of Signed-Weight (SW) arithmetic [7] is used to control the sign of partial products so that we can pack every four partial products into a single 8-digit string of 5-valued digits (∈ {−2, −1, 0, 1, 2}).

FIGURE 12
Product Generator (PG).

This product generation algorithm is explained in Fig. 13. Let us assume that the coefficient of multiplication is $0.1011011_2$ (= 0.7109375) in the two’s complement binary number system. This coefficient can be converted into the CSD coefficient $1.0\overline{1}00\overline{1}0\overline{1}_{CSD}$, which contains four nonzero digits. Assume also that the input to the FIR filter is $0.1011010_2$ in the two’s complement binary number system. Multiplication between the input and the CSD coefficient produces four partial products (corresponding to the four nonzero digits in the CSD coefficient). Note that the first partial product has positive sign and the other three partial products have negative signs as shown in Fig. 13. To make efficient use of the sign-symmetric number representation in SD notation, we need to balance the number of positive and negative partial products. Thus, in Fig. 13, the sign of the 4th partial product is flipped into positive by introducing a negative bias quantity as a side effect. (The bias thus introduced should be canceled at the final-stage adder in the FPDF architecture.) As a result, we have two positive partial products and two negative partial products. By adding every pair of positive and negative partial products, we can convert the four partial products into two partial products in binary SD number representation. Note that these two binary SD products must be added at CAB in the succeeding stage as shown in Fig. 8. For this purpose, PG generates a single 5-valued signal by packing two binary SD products with current summation in advance. The CAB at the next stage receives this packed signal and converts it into binary SD notation.
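
The CSD conversion and the resulting partial products in this example can be reproduced with the short Python sketch below. It uses the standard non-adjacent-form recoding on integers scaled by $2^7$; the function name `to_csd` and the variable names are hypothetical. The sketch does not model the sign-vector conversion or the bias of SW arithmetic, whose details are given in [7]; it only confirms the digit pattern and the signs of the four partial products.

```python
def to_csd(n):
    """Return the CSD digits of a non-negative integer n, least significant first."""
    digits = []
    while n != 0:
        if n & 1:
            d = 2 - (n & 3)   # +1 when n mod 4 == 1, -1 when n mod 4 == 3
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits

coeff = 0b1011011   # 0.1011011_2 scaled by 2**7 (= 91)
x = 0b1011010       # 0.1011010_2 scaled by 2**7 (= 90)

csd = to_csd(coeff)
# One shifted copy of the input per nonzero CSD digit, with the digit's sign.
partials = [d * (x << i) for i, d in enumerate(csd) if d != 0]

print(csd[::-1])                    # [1, 0, -1, 0, 0, -1, 0, -1]
print(sum(partials) == coeff * x)   # True
```

Read most significant digit first, the printed pattern is one positive digit followed by three negative digits, matching the one positive and three negative partial products described above.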

The above example illustrates 8-bit CSD coefficient multiplication. The wordlength of internal datapath can be extended to any multiple of 8 bits by cascading the carry-out/carry-in signals of CABs. Input/output data length, on the other hand, cannot be extended in our present design due to the limited number of I/O pins.

4 PROTOTYPE CHIP DESIGN

This section describes prototype design examples of FPDF chips using 0.6μm CMOS and 0.35μm CMOS technologies. In the 0.6μm CMOS implementation, we compare the proposed MV-CML FPDF with the corresponding voltage-mode binary logic implementation (using SW arithmetic) [7]. The binary logic FPDF considered here is based on the design described in [7], where each CAB employs a 4-to-2 counter with no carry propagation chain. Also, the binary logic FPDF does not employ PGs, since the partial product generation is performed by CBs through simple shift operations. Both FPDFs (binary logic and MV-CML) are dedicated to high-speed FIR filter realization and have the same function. The major design parameters, such as wordlength and data line width, are almost the same for both designs.

4.1 MV-CML FPDF Design in 0.6μm CMOS Technology

Figure 14 shows the layout and design specification of the MV-CML FPDF chip using 0.6μm CMOS technology. The test chip integrates about 150,000 transistors on a 4.4×4.4mm² die area. The size of configuration data required to define a specific FIR filter on the chip is 2,381 bits, which must be downloaded from outside of the chip in advance. The maximum number of FIR filter taps that can be mapped on the MV-CML FPDF chip is 30.

<table>
<thead>
<tr>
<th>Maximum filter tap</th>
<th>30 taps</th>
</tr>
</thead>
<tbody>
<tr>
<td>Basic components</td>
<td>60 CABs, 30 PGs, 64 CBs, 80 SBs, 1 final-stage adder</td>
</tr>
<tr>
<td>Configuration data</td>
<td>2,381 bits</td>
</tr>
<tr>
<td>Transistor count</td>
<td>150,247 transistors</td>
</tr>
<tr>
<td>Power supply</td>
<td>5.0 V</td>
</tr>
<tr>
<td>Chip size</td>
<td>4.4mm × 4.4mm</td>
</tr>
<tr>
<td>Effective size</td>
<td>3.4mm × 3.5mm</td>
</tr>
<tr>
<td>Process</td>
<td>ROHM 0.6μm CMOS with triple-metal layers</td>
</tr>
</tbody>
</table>

FIGURE 14
Chip layout and features of the MV-CML FPDF in 0.6μm CMOS technology.

To demonstrate the advantage of MV-CML, we designed two FPDFs based on binary (voltage-mode) logic and MV-CML using the same 0.6μm CMOS triple-metal technology. The binary-logic FPDF integrates 22 CABs on a chip and can implement an 11-tap FIR filter at most. On the other hand, the MV-CML FPDF integrates 60 CABs on a chip and can implement a 30-tap FIR filter at most. When mapping an 11-tap FIR filter, the binary-logic FPDF requires an active area of 9.0mm², while the MV-CML FPDF requires only 3.7mm². Thus, the use of MV-CML technology makes it possible to reduce the active area to 41% (3.7/9.0) of that of the binary logic implementation. This is mainly because the number of interconnects in the routing blocks, i.e., CBs and SBs, is significantly reduced by using MV-CML. The transistor count and power consumption (@40MHz operation) are also reduced to 47% and 71%, respectively. Figure 15 compares the power consumption of FPDFs using binary logic and MV-CML circuits.

A major disadvantage of MV-CML circuit technology is its limited range of operating frequency. In our design, the typical value of unit current (current for digit value 1) is 10\(\mu\)A. In this case, the maximum operating frequency is limited to 40MHz, while the binary-logic implementation reaches 90MHz. In order to achieve higher performance, we need to increase the unit current of MV-CML circuits. For example, if we introduce higher unit current, say 30\(\mu\)A, we can achieve 85MHz operation. We designed current sources so that we can change the unit current within the range of 10\(\mu\)A–40\(\mu\)A by changing the bias voltages \(V_p\) and \(V_n\).

Another major problem of MV-CML is its relatively high power dissipation in the low-frequency region. In order to resolve the trade-off between speed and power, application-specific consideration is required for selecting the optimal value of unit current. A variable-unit-current approach should also be explored for future applications.

Figure 16 shows an example of chip measurement using an LSI tester (Advantest T6671E). The critical delay is measured by connecting the basic components: PG, SB, CB, CAB, CB, SB, and the final-stage adder, in series (bypassing all the latches in CAB). In this particular case, the measured critical delay is 90ns, which depends on the circuit configuration mapped onto the FPDF.

4.2 MV-CML FPDF Design in 0.35\(\mu\)m CMOS Technology
In order to evaluate the effects of device scaling in MV-CML technology, we have designed another MV-CML test chip in 0.35\(\mu\)m CMOS technology. Figure 17 shows the chip layout and design specification. Compared with the design in 0.6\(\mu\)m CMOS, the number of transistors increases about three times, and accordingly the maximum number of filter taps becomes 64.

FIGURE 16
Chip measurement result of the MV-CML FPDF chip ($V_n = 2$ V) in 0.6μm CMOS.

<table>
<thead>
<tr>
<th>Maximum filter tap</th>
<th>64 taps</th>
</tr>
</thead>
<tbody>
<tr>
<td>Basic components</td>
<td>152 CABs, 64 PGs, 160 CBs, 140 SBs, 2 final-stage adders</td>
</tr>
<tr>
<td>Configuration data</td>
<td>5,450 bits</td>
</tr>
<tr>
<td>Transistor count</td>
<td>473,135 transistors</td>
</tr>
<tr>
<td>Power supply</td>
<td>3.3 V</td>
</tr>
<tr>
<td>Chip size</td>
<td>4.9mm × 4.9mm</td>
</tr>
<tr>
<td>Effective size</td>
<td>4.4mm × 4.4mm</td>
</tr>
<tr>
<td>Process</td>
<td>ROHM 0.35μm CMOS with triple-metal layers</td>
</tr>
</tbody>
</table>

FIGURE 17
Chip layout and features of the MV-CML FPDF in 0.35μm CMOS technology.

As the system size increases, the increase in interconnect delay and power consumption causes severe performance degradation. To reduce the delay in signal routing, we need to reduce the “ON resistance” of pass transistor switches in CBs and SBs. For this purpose, we employ pass transistor switches of channel width $10\lambda \sim 20\lambda$. Also, the “gate boosting” technique, which reduces the “ON resistance” of pass transistor switches by applying gate bias voltages higher than the power supply voltage, is employed in CBs and SBs. To reduce the power consumption, we separate the power supply for the upper portion of the gate array from that for the lower portion so that each can be turned off independently. We designed current sources so that we can change the unit current within the range of $10\mu A \sim 40\mu A$ by changing the bias voltages.

The chip operation has been verified by HSPICE simulation and the chip design has already been submitted to the VDEC chip fabrication service. HSPICE simulation shows that the power consumption of the 1-bit CAB in the MV-CML FPDF chip using 0.35$\mu$m CMOS can be reduced by 15% for the case of unit current 40$\mu$A and by 30% for the case of unit current 10$\mu$A, in comparison with the 0.6$\mu$m CMOS chip.

5 FUTURE PROSPECTS OF MV-CML CIRCUIT TECHNOLOGY

In this section, we discuss the effects of device scaling in MV-CML circuit technology and provide future prospects of this emerging circuit technology.

5.1 Technological Challenges

We examine the challenges in MV-CML circuits posed by deep-submicron technologies. Although deep-submicron microelectronic technologies enable greater degrees of semiconductor integration, we must address major problems of MV-CML circuits related to the effects of (i) enhanced channel-length modulation, (ii) reduced power supply voltage, (iii) increased interconnect capacitance and resistance, and (iv) increased leakage current.

First of all, channel-length modulation in a MOS transistor is caused by the increase of the depletion layer width at the drain as the drain voltage is increased. This leads to a shorter effective channel length and an increased drain current; that is, the drain current in saturation increases slightly as the drain voltage increases. This channel-length-modulation effect typically increases in small devices with low-doped substrates, which may cause problems in some MV-CML components, such as current sources and current mirrors. (For these components, constant drain current in the saturation region is important.) To address this problem, we need to employ transistors whose channel lengths are $1.5 \sim 2$ times larger than the minimum geometry for current sources and current mirrors. The use of a longer channel length also makes it possible to reduce the threshold variation of transistors.
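
The effect can be made concrete with the textbook first-order (square-law) model of a MOS transistor in saturation. This model is an assumption introduced here for illustration; deep-submicron devices follow it only approximately, and the symbols $\mu$, $C_{ox}$ and $\lambda$ do not appear elsewhere in this paper.

\[
I_D \approx \frac{\mu C_{ox}}{2}\,\frac{W}{L}\,(V_{GS} - V_t)^2\,(1 + \lambda V_{DS}),
\qquad \lambda \propto \frac{1}{L}.
\]

Under this model, a change $\Delta V_{DS}$ in drain voltage changes the output of a current source by a relative amount of roughly $\lambda\,\Delta V_{DS}$, so lengthening the channel by a factor of $1.5 \sim 2$ reduces the current error by about the same factor.
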
As for item (ii) above, it is well known that the reduction of supply voltage causes problems in analog circuit elements. A similar situation arises in designing MV-CML circuits, since they employ current mirrors as essential components. We need to keep a specific level of supply voltage so as to keep the current mirrors in operation. Also, low supply voltage may cause a problem when changing the magnitude of unit current adaptively by controlling the gate bias voltage at the current source. If we reduce the supply voltage, the available range of current variation is limited. We need to introduce multiple-$V_{dd}$, substrate biasing or other techniques for realizing variable unit-current circuits.

The increase in interconnect capacitance and resistance – the item (iii) above – will become a serious problem especially in programmable devices such as FPGAs and PLDs. In order to drive long wires in MV-CML, we may need multi-level current buffers (or repeaters), which are similar to voltage buffers used in conventional FPGAs. Consequently, high performance current mirrors are essential in applications to programmable LSIs.

As for the item (iv), the problem of static power dissipation due to leakage current in transistors may be a problem just like in the standard voltage-mode binary logic circuits. However, the use of current-mode logic will make the problem of static-power dissipation more serious, since the current-mode logic uses static current to keep logic values in the circuit. Consequently, an advanced power management technique, which could turn off the supply voltage to the unused sections of the chip adaptively, is of essential importance in future MV-CML LSIs.

5.2 Design of MV-CML Circuits in Advanced CMOS Technology

In the following, we analyze the impact of device scaling on the design of MV-CML circuits. For this purpose, we designed a set of binary SDFAs in MV-CML using 0.35$\mu$m, 0.25$\mu$m and 0.18$\mu$m CMOS technology. The SPICE model parameters used for circuit simulation are provided by MOSIS$^1$ (TSMC 0.35$\mu$m, 0.25$\mu$m and 0.18$\mu$m CMOS parameters). We successfully confirmed correct SDFA operation through circuit simulation for all the design parameters. The basic design strategy for MV-CML SDFA circuits in this experiment is summarized below.

1 Let $I_0$ denote the unit current corresponding to the logic value 1 in MV-CML circuits, where we change the unit current $I_0$ from 5$\mu$A to 30$\mu$A. We first design the $I_0/2$ current sources by adjusting gate bias voltages ($V_p$ and $V_n$) and transistor sizes ($L$ and $W$). To keep a sufficient level of accuracy, all the current sources used in our circuits are implemented by connecting $I_0/2$ current sources in parallel.

2 Design current mirrors that can copy input current with the magnitude range $-2I_0 \sim 2I_0$. We need to control the channel width $W$ to keep a specific level of output voltage ($\sim V_{dd}/2$) for the given range of input/output current. Also, we need to optimize the trade-off between channel width $W$ and operating speed of the current mirrors.

3 Design a BCI (Bidirectional Current Input) circuit and threshold detectors for the given range of input current.

4 Design combinational logic circuits with pass transistors. When low-voltage operation is required, we need to control the channel width $W$ of the pass transistors carefully, considering the ON resistance.

5 Combine the current-mode circuit components and combinational logic components to make an SDFA circuit.

$^1$ http://www.mosis.org/

Figure 18 shows the estimated delay time and power consumption of MV-CML SDFAs using 0.35$\mu$m, 0.25$\mu$m and 0.18$\mu$m CMOS design parameters from MOSIS/TSMC. We assume that the power supply voltage is 3.3V for 0.35$\mu$m CMOS, 2.5V for 0.25$\mu$m CMOS, and 1.8V for 0.18$\mu$m CMOS. As is observed in these plots, increasing the unit current reduces the delay time and achieves faster operation, but it also causes higher power consumption. Thus, within each generation of circuit technology, the power-delay product of the SDFA circuit remains almost constant as the unit current is varied. As the device size scales, on the other hand, the power-delay product scales correspondingly. This demonstrates the future potential of MV-CML circuit technology in the deep-submicron regime.
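
A first-order argument suggests why the power-delay product is nearly independent of the unit current. The model below is an assumption introduced for illustration, not a result taken from Fig. 18: $N$ is the number of unit currents flowing statically in the circuit, $C_L$ is the capacitance of a node, and $\Delta V$ is the voltage swing that the signal current must produce on that node.

\[
P \approx N I_0 V_{dd}, \qquad
t_d \approx \frac{C_L\,\Delta V}{I_0}, \qquad
P \cdot t_d \approx N\,C_L\,\Delta V\,V_{dd}.
\]

In this approximation $I_0$ cancels in the product, while smaller device geometries reduce $C_L$ and $V_{dd}$, which is in line with the scaling trend described above.
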

A major difficulty for future applications of MV-CML is the lack of design automation tools for circuit synthesis. In the design of FPDF chips, for example, we have adopted a full-custom design methodology starting from layout-level circuit design and HSPICE simulation. For system-level applications, especially for applications to system LSIs, high-level EDA tools for MV-CML circuits are essential. Establishment of a systematic design flow for MV-CML circuits remains as future work.

6 CONCLUSION

In this paper, we have described prototype design and fabrication of Field-Programmable Digital Filter (FPDF) LSIs, which employ carry-propagation-free redundant arithmetic algorithms for faster operation and Multiple-Valued Current-Mode Logic (MV-CML) for high-density low-power implementation. The prototype FPDF fabrication with 0.6\(\mu\)m and 0.35\(\mu\)m CMOS technology demonstrates that the chip area and power consumption can be significantly reduced, compared with the standard binary logic implementation. We also have analyzed major problems of MV-CML associated with device scaling and have discussed future prospects of this emerging circuit technology.

ACKNOWLEDGMENTS

The VLSI chips in this study have been fabricated under the chip fabrication program of the VLSI Design and Education Center (VDEC), the University of Tokyo in collaboration with Rohm Corporation and Toppan Printing Corporation.

REFERENCES


