# Improved Mixed Digital-Analog Nearest-Match Circuit for Fully-Parallel Associative Memories

Kazi Mujibur Rahman, Kazuhiro Kamimura, Tetsushi Koide and Hans Jürgen Mattausch

Research Center for Nanodevices and Systems, Hiroshima University, Japan

Email: {kmr, kamimura, koide, hjm}@sxsys.hiroshima-u.ac.jp

Abstract - In fully parallel associative memories, the search time is primarily determined by the performance of the winner lineup amplifier (WLA). The prime goal of the winner lineup circuitry is to amplify the distance between the winner and the nearest loser sufficiently so that the following winner-take-all (WTA) circuits can decide the winner at binary logic level. Because of the internal capacitances and resistances of the MOS transistors, there is an inherent delay, and the WLA circuit needs considerable time to settle at a stable operating point. Moreover, because it uses a feedback loop, the WLA circuit is prone to instability and oscillations at higher amplification. Adding pre-charged capacitances to the match lines improves the stability. The performance of the WLA network under different pre-charge conditions is investigated and reported in this article. It is observed that pre-charging the match lines to $V_{DD} - V_{threshold}$, instead of $V_{DD}$, improves the search speed by about 19.4%. The proposed WLA networks are discussed in detail and simulation results are presented for a search word length of 512 bits with 128 match lines.

#### 1. Introduction

The basic operation of pattern recognition is to find the nearest match between an input data word of W-bit length and a number R of reference data words [1]. In a Hamming-distance-based search engine employing a fully parallel search strategy, the mismatched bits contribute current to the match lines. To keep the average power of the whole network reasonably low, the current per mismatched bit should be as low as possible (a fraction of a microampere). The WLA circuit is primarily intended to magnify the distance in the narrow range between the winner and the nearest loser. For this purpose, the WLA circuit has to operate at relatively high gain for the winner and nearby losers. A feedback loop in the WLA network regulates the match lines and sets the desired operating point. The feedback regulation has to be done properly, as both under-regulation and over-regulation deteriorate the network performance [2].
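The nearest-match operation described above can be stated functionally in a few lines of Python. This is only a software sketch of what the search computes, not a model of the analog circuit; the function names and the 8-bit example words are illustrative assumptions and do not come from the paper.

```python
from typing import Sequence, Tuple


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which the two words differ."""
    return bin(a ^ b).count("1")


def nearest_match(search_word: int, references: Sequence[int]) -> Tuple[int, int]:
    """Return (row index, distance) of the reference word closest to search_word.

    In the hardware, every row is compared in parallel and each mismatched bit
    adds current to that row's match line; here the rows are simply scanned.
    """
    distances = [hamming_distance(search_word, ref) for ref in references]
    winner = min(range(len(references)), key=distances.__getitem__)
    return winner, distances[winner]


if __name__ == "__main__":
    # Hypothetical 8-bit reference words (the paper uses W = 512, R = 128).
    references = [0b10110010, 0b10110110, 0b01001101]
    winner, distance = nearest_match(0b10110011, references)
    print(f"winner row = {winner}, Hamming distance = {distance}")
```

The row with the smallest distance is the winner, and the row with the second-smallest distance is the nearest loser that the WLA has to separate from it.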

The difficult task of the WLA circuit is to guarantee stable operation under all search conditions, as the circuit has to operate in high-gain mode. Recent WLA networks [3] use parallel capacitances in the match lines to damp out oscillations during the search period. Adding capacitances lengthens the search time, so the capacitances must be chosen carefully to obtain a good tradeoff between stability and speed. In this paper, alternative schemes for pre-charging the match lines are investigated, aiming at a shorter search time without affecting network stability.

#### 2. WLA Network with Parallel Capacitances in Match Lines

The basic WLA circuit with parallel capacitances ( $C_1$ , ...,  $C_R$ ) in the match lines is shown in Fig. 1. A pre-charge transistor ( $M_{21}$ , ...,  $M_{2R}$ ) is added in each match line; otherwise, differing initial voltages on the match lines would lead to erroneous decisions. The pMOS transistors ( $M_{21}$ , ...,  $M_{2R}$ ) charge the match lines to  $V_{DD}$  when the *Enable* signal goes LOW. In the evaluation period, with *Enable* = HIGH, the pre-charge transistors turn off and the capacitances discharge through the nMOS transistors ( $M_{11}$ , ...,  $M_{1R}$ ) controlled by the feedback loop.



Fig. 1 WLA circuit (WLA-A) with capacitors in the match lines.

#### 3. Improved WLA Circuitry for Search Speed Enhancement

From Fig. 1, it is evident that the distance amplification part of the WLA-A network cannot start at the positive edge of the *Enable* signal; rather, it has to wait until the capacitance voltage falls to the threshold voltage of the pMOS transistors ( $M_{31}$ , ...,  $M_{3R}$ ). The search speed of WLA-A can be enhanced by adding a separate charging source to each match line, as shown in Fig. 2. The charging source has a pull-up pMOS diode ( $M_{p1}$ , ...,  $M_{pR}$ ) connected to a pull-down nMOS diode ( $M_{n1}$ , ...,  $M_{nR}$ ). It supplies a voltage of  $V_{DD} - V_{threshold}$  to the charging transistors ( $M_{21}$ , ...,  $M_{2R}$ ) of the match lines, where  $V_{threshold}$  is the threshold voltage of the pMOS transistors ( $M_{31}$ , ...,  $M_{3R}$ ). The charging circuit parameters are adjusted to give a current drain of 3.1 microamperes for each match line.
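To see why the pre-charge level matters, the waiting period can be approximated as the time a match-line capacitance needs to fall from its pre-charge voltage to the level at which the pMOS transistors start to conduct. The sketch below is a rough first-order estimate under loudly stated assumptions: the turn-on level is taken as $V_{DD} - V_{threshold}$, the discharge current is treated as constant (in the real circuit it is set by the feedback loop), and every numeric value is a hypothetical placeholder rather than a parameter from the paper.

```python
def dead_time_ns(c_ml_ff: float, v_precharge: float, v_dd: float,
                 v_th: float, i_discharge_ua: float) -> float:
    """Estimate the wait before amplification can start, in nanoseconds.

    Assumes a constant discharge current, so t = C * dV / I, where dV is the
    gap between the pre-charge level and the assumed turn-on level v_dd - v_th.
    Units: fF * V / uA = 1e-15 / 1e-6 s = ns.
    """
    delta_v = max(0.0, v_precharge - (v_dd - v_th))
    return c_ml_ff * delta_v / i_discharge_ua


if __name__ == "__main__":
    # All values below are illustrative assumptions, not taken from the paper.
    V_DD = 3.3    # supply voltage in volts (assumed)
    V_TH = 0.7    # pMOS threshold magnitude in volts (assumed)
    C_ML = 100.0  # match-line capacitance in fF (assumed)
    I_DIS = 2.0   # average discharge current in uA (assumed)

    wla_a = dead_time_ns(C_ML, V_DD, V_DD, V_TH, I_DIS)         # pre-charge to V_DD
    wla_b = dead_time_ns(C_ML, V_DD - V_TH, V_DD, V_TH, I_DIS)  # pre-charge to V_DD - V_th
    print(f"WLA-A style dead time: {wla_a:.1f} ns")
    print(f"WLA-B style dead time: {wla_b:.1f} ns")
```

Under this simplification, pre-charging directly to the turn-on level removes the initial discharge interval entirely, which matches the qualitative argument in the text. The simulated search times in Section 4 remain the authoritative numbers.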



Fig. 2 WLA circuit (WLA-B) with match line capacitances pre-charged to  $V_{DD} - V_{threshold}$.

For both the WLA-A and WLA-B circuits, the feedback line (gate voltage of ML) may hold some trapped charge whose magnitude differs between pre-search conditions. Although this does not have a significant impact on the search, it is better to have the same initial conditions in all parts of the WLA network before a search starts. A reset transistor ($M_{NL}$) added in the feedback path, as shown in Fig. 3, discharges any trapped charge and forces the feedback line to start from the same initial condition for every new search; the resulting network is referred to as WLA-C. The three versions of the winner lineup amplifier block (WLA-A, WLA-B and WLA-C) are tested in an associative memory with a Hamming-distance search strategy and standard WTA circuits [5] in the output stages. The search times of the winner and the nearest loser are evaluated for a search network of 512-bit word length and 128 match lines using a 350 nm technology.



Fig. 3 Reset transistors added in the feedback path in WLA-C network to ensure same initial conditions for all searches.

#### 4. Simulation Results

Search results for the associative memory with the nearest loser set at a 1-bit distance from the winner and all other losers 20 bits away are shown in Fig. 4. The search engine with the WLA-B network has an average search time of 137 ns, which is 33 ns less than with the WLA-A network (average search time 170 ns). Furthermore, the strong dependence of the search time on the winner-input distance is significantly reduced. The improvement from WLA-B to WLA-C is insignificant (about 1 ns only). As the distance between the winner and the nearest loser increases to 10 bits or more, as shown in Fig. 5, the winner search time remains nearly unchanged. Although the WLA-B and WLA-C circuits offer a significant improvement in search speed over WLA-A, this comes at the cost of increased power consumption, as shown in Figs. 6-7.
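As a quick check, the relative reduction in search time implied by the two averages quoted above (170 ns for WLA-A and 137 ns for WLA-B) works out as follows, in agreement with the roughly 19.4% figure stated in the abstract:

$$\frac{t_{\text{WLA-A}} - t_{\text{WLA-B}}}{t_{\text{WLA-A}}} = \frac{170\ \text{ns} - 137\ \text{ns}}{170\ \text{ns}} = \frac{33}{170} \approx 0.194$$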



Fig. 4 Winner search times with the nearest loser set 1 bit from the winner and the other losers set at a 20-bit distance.

The average power consumption with the WLA-B network is 164 mW, 27 mW more than with WLA-A (137 mW). WLA-C increases the power consumption by a further 3 mW or so. From Figs. 6 and 7, it is evident that the power consumption increases only insignificantly as the distance between the winner and the nearest loser grows from 1 bit to 10 bits.



Fig. 5 Search times for different configurations of the WLA network, with the winner set 10 bits from the nearest loser.



Fig. 6 Power consumption for different configurations of the WLA network, with the winner set 1 bit from the nearest loser.



Fig. 7 Power consumption for different configurations of the WLA network, with the winner set 10 bits from the nearest loser.

#### 5. Conclusions

A significant improvement in search speed is achieved by the WLA-B network, which charges the match line capacitances to  $V_{DD} - V_{threshold}$. The charging circuitry added for this purpose increases the overall power consumption by about 19%, while the search speed improves by 19.4%.

#### References

- [1] D. R. Tveter, The Pattern Recognition Basis of Artificial Intelligence, Los Alamitos, CA: IEEE Computer Society, 1998.
- [2] H.J. Mattausch et al., IEEE J. Solid-State Circuits, vol. 37, pp.218-227, 2002.
- [3] H.J. Mattausch et al., Symposium on VLSI Circuits, pp. 252-255, 2002.
- [4] Yuji Yano et al., First Hiroshima International Workshop on Nanoelectronics for Terra-Bit Information Processing, pp. 18-19, 2003.
- [5] J. Lazzaro et al., in Advances in Neural Information Processing Systems 1, D. S. Touretzky, Ed., San Mateo, CA: Morgan Kaufmann, 1989.