# Parameterized Block-Based Non-Gaussian Statistical Gate Timing Analysis

Soroush Abbaspour, Hanif Fatemi, Massoud Pedram Department of Electrical Engineering, University of Southern California {sabbaspo, fatemi, pedram}@usc.edu

# Abstract

As technology scales down, timing verification of digital integrated circuits becomes an increasingly challenging task due to gate and wire variability. Therefore, statistical timing analysis (denoted by  $\sigma$ TA) is becoming unavoidable. This paper introduces a new framework for performing statistical gate timing analysis for non-Gaussian sources of variation in block-based  $\sigma$ TA. First, an approach is described to approximate a variational RC- $\pi$  load by using a canonical first-order model. Next, an accurate variation-aware gate timing analysis based on statistical input transition, statistical gate timing library, and statistical RC- $\pi$  load is presented. Finally, to achieve the aforementioned objective, a statistical effective capacitance calculation method is presented. Experimental results show an average error of 6% for gate delay and output transition time with respect to the Monte Carlo simulation with 10<sup>4</sup> samples while the runtime is nearly two orders of magnitude shorter.

## 1. Introduction

Process technology and environment-induced variability of gates and wires in VLSI circuits makes timing analysis of such circuits a challenging task [1]. More precisely, advanced analysis tools must be developed that are capable of verifying changes in the circuit timing which stem from various sources of variations [2]. In block-based statistical timing analysis ( $\sigma$ TA), every timing quantity of interest (e.g., delay and slew, arrival time and required arrival time) is represented as a function of global sources of variation (denoted by  $X_i$ ) and independent random sources of variation (denoted by  $S_i$ ) in the canonical first-order (denoted by CFO) form. The advantages of such a formulation are that a) it can capture all correlations and b) it can produce delay sensitivities due to changes in various environmental and process-related parameters [2]. Sources of variations have often been assumed to be Gaussian, which in turn simplifies the block-based  $\sigma$ TA. However, it has been recently reported that certain process parameters exhibit non-Gaussian probability distributions [3].

Block-based  $\sigma$ TA breaks its analysis into two parts: 1) variational interconnect timing analysis [4][5] and 2) variational gate timing analysis. Unfortunately, block-based  $\sigma$ TA is lacking in variation-aware gate timing analysis. The authors in [7] propose a modeling technique for gate delay variability considering multiple input switching. In [8], a model for calculating statistical gate delay variation caused by intra-chip and inter-chip variability is presented. Recent works do not provide an accurate means of analyzing the gate propagation delay and output slew as a function of variational input transition, variation-aware gate timing library, and variational gate load. In this paper a new framework is proposed for determining variational gate timing behavior. This is achieved by performing the following steps:

1. Given the variational resistive-capacitive load (where all resistances and capacitances are represented in the CFO form), an efficient and accurate algorithm is presented to calculate the variation-aware RC- $\pi$  load. To perform the analysis, we calculate the variation-aware admittance moments (cf. section 3), and as a result, the resistance and capacitances in the RC- $\pi$  load can be written in the CFO form.

2. Based on the statistical RC- $\pi$  load obtained in step 1, we calculate the variation-aware effective capacitance in the CFO form. In order to achieve the aforementioned goal, a new approach for effective capacitance calculation in static timing analysis (STA) is proposed (cf. section 4.1.) This effective capacitance calculation method is used to calculate the variational effective capacitance considering non-Gaussian process and environmental sources of variation in the CFO form (cf. section 4.2.)

3. Given the variational input transition time, statistical gate timing library, and variational effective capacitance ( $c_{eff}$ ) load in the CFO form, we calculate the variational gate delay and output transition time in the CFO form (cf. section 2.2.1).

Although the remainder of this paper focuses mainly on CFO random variables to represent process and environmental sources of variation as well as the performance quantities of interest, the work itself is not limited to the first-order approximation of these quantities. In fact, it is straightforward to extend the approach to more complex (e.g., second-order) forms for both Gaussian and non-Gaussian parameter variations.

The remainder of this paper is organized as follows. In section 2, we review the background of block-based  $\sigma$ TA. We also show how to convert a quantity, which itself is a function of global and independent sources of variation, into a canonical first-order (CFO) form. The variation-aware *RC*- $\pi$  calculation is presented in section 3. Section 4 explains the statistical gate timing analysis for the variational input rise time, variation-aware gate timing library, and variational *RC*- $\pi$  load. In this section a new statistical effective capacitance calculation is proposed and used for gate timing analysis, which is the key contribution of this paper. Section 5 presents experimental results. Finally, conclusions are discussed in section 6. We use the notation shown in Table 1 throughout the paper.

Table 1: Useful notation and descriptions

| Notation | Description |
|----------|-------------|
| $A$ | A deterministic variable (does not take into account any statistical variation) |
| $\overset{\wp}{A}$ | An arbitrary (non-CFO) random variable, which is a function of *m* global and *p* independent random sources of variation |
| $\overset{\triangleleft \triangleright}{A}$ | A CFO random variable, which is a function of *m* global and *p* independent random sources of variation, i.e., $\overset{\triangleleft \triangleright}{A} = A_0 + \sum_{i=1}^m A_i \Delta X_i + \sum_{j=1}^p A_{m+j} \Delta S_j$ |

## 2. Background

As mentioned before, the sources of variation may exhibit non-Gaussian distributions. Therefore, in general, in addition to calculating the mean and variance of the electrical and timing parameters, we need to calculate the skewness of their distributions, i.e., we use the first three moments of the parameter variations.

**Definition**: The degree of asymmetry of a distribution is called skewness (denoted by  $\kappa$ ). A distribution, or data set, is symmetric if it looks the same to the left and right of the center point. The skewness for a normal distribution is zero. Negative values for the skewness indicate data that are skewed left, whereas positive values for the skewness indicate data that are skewed right. By skewed left (right), we mean that the left (right) tail is heavier than the right (left) tail.

The *skewness* of a distribution is defined to be  $\kappa = \frac{\mu_3}{\sigma^3}$  where  $\mu_3$  is the 3<sup>rd</sup> central moment and  $\sigma^2$  is the variance (second central moment.)

*Lemma* 1: Suppose  $\Delta S_1, ..., \Delta S_n$  are *n* independent random variables with distribution  $\Delta S_i \sim Dist_i$ ( $\mu$ =0,  $\sigma^2$ =1,  $\kappa_i$ ). Then:

$$\sum_{i=1}^{n} a_i \Delta S_i = \sqrt{\sum_{i=1}^{n} a_i^2} \cdot \Delta S_{eq} \quad \text{where} \quad \Delta S_{eq} \sim Dist \left( \mu = 0, \sigma^2 = 1, \kappa = \frac{\sum_{i=1}^{n} a_i^3 \kappa_i}{\left(\sum_{i=1}^{n} a_i^2\right)^{3/2}} \right)$$

Proof: It is omitted for brevity.
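Lemma 1 can be checked numerically with a few lines of Python. The sketch below relies only on the fact that, for independent variables, variances and third central moments add; the function name `combine_independent` is an illustrative choice and does not come from the paper.

```python
import math

def combine_independent(coeffs, skews):
    """Collapse sum_i a_i * dS_i into scale * dS_eq as in Lemma 1.

    Each dS_i is assumed independent with mean 0, variance 1 and
    skewness skews[i]. Returns (scale, skew_eq).
    """
    var = sum(a * a for a in coeffs)                      # variance of the sum
    mu3 = sum(a ** 3 * k for a, k in zip(coeffs, skews))  # third central moment
    scale = math.sqrt(var)
    skew_eq = mu3 / var ** 1.5 if var > 0 else 0.0
    return scale, skew_eq

# Example: 3*dS_1 + 4*dS_2 with skewness 1.0 and 0.0 respectively.
scale, skew_eq = combine_independent([3.0, 4.0], [1.0, 0.0])
print(scale, skew_eq)
```

For this example the scale is the square root of 3² + 4², and the equivalent skewness is 27/125, noticeably smaller than the skewness of the first source because the symmetric second source dominates the variance.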

## 2.1 Canonical first-order (CFO) model for timing and electrical parameters

In a block-based statistical timing analysis tool, a first-order variational model is employed for all timing quantities such as the gate and wire delays, arrival times, required arrival times, slacks and slews, i.e., all timing quantities are expressed in the CFO form as:

$$\overset{\triangleleft \triangleright}{a} = a_0 + \sum_{i=1}^m a_i \Delta X_i + a_{m+1} \Delta S_a$$

where  $a_0$  is the nominal value;  $\Delta X_i$ 's represent the variation of m global sources of variation,  $X_i$ , from their nominal values,  $a_i$ 's are the sensitivities to each of the global sources of variation,  $\Delta S_a$  is the variation of independent random variable  $S_a$  and  $a_{m+1}$  is the sensitivity of the timing quantity to  $S_a$ . By scaling the sensitivity coefficients, we can assume that  $\Delta X_i$  and  $\Delta S_a$  have distributions with  $\mu=0$  and  $\sigma^2=1$  and skewness=  $\kappa$  denoted by  $Dist(\mu=0,\sigma^2=1,\kappa)$ .

Variation in the physical dimensions of a wire changes its resistance and capacitance, which in turn causes the gate delay and slew as well as the wire delay and slew to vary [9]. Therefore, we need to capture the effect of geometric variations on the electrical parameters. For instance, resistance and capacitance in the CFO form are calculated as follows:

$$\stackrel{\scriptscriptstyle \triangleleft \triangleright}{r} = r_0 + \sum_{i=1}^m r_i \Delta X_i + r_{m+1} \Delta S_r \qquad \qquad \stackrel{\scriptscriptstyle \triangleleft \triangleright}{c} = c_0 + \sum_{i=1}^m c_i \Delta X_i + c_{m+1} \Delta S_c$$

where  $r_0$  and  $c_0$  represent nominal resistance and capacitance values, computed when the wire dimensions are at their nominal or typical values. The other parameters are as explained above.

**Observation:** Invariant Functional Form Property: This property states that  $y = f(x) \Leftrightarrow \overset{\wp}{Y} = f(\overset{\wp}{X})$ , which follows from the fact that the form of function *f* is independent of its input type (deterministic or variational).

## 2.2 Converting a variational function into CFO form

It is important to represent timing and electrical quantities in the CFO form. This in turn enables one to propagate first-order sensitivities to different sources of variation through the timing graph [2][9]. In addition, it makes statistical computations efficient and practical and provides timing diagnostics at a very small cost in run time. The remaining question is how to convert a quantity of interest (which itself is a function of different CFO variables) into the CFO form.

The following subsection presents a method to answer the above question. We use an example to show the procedure. The problem we address is how to convert the gate output transition time into the CFO form. However, this method can be easily applied to any other quantity of interest.

#### 2.2.1 Gate timing analysis for lumped capacitive load

**Problem Statement I:** Given is a variational CMOS driver where its input rise time,  $t_{in}$ , is in the CFO form and drives an output capacitive load, also, in the CFO form. Note that the distribution characteristics of all global and independent sources of variation ( $\mu$ =0,  $\sigma^2$ =1,  $\kappa$ ) are given. The objective is to calculate the output transition time,  $t_r$ , in the CFO form:

$$\overset{\triangleleft \triangleright}{t_r} = t_{r,0} + \sum_{i=1}^m t_{r,i} \Delta X_i + t_{r,m+1} \Delta S_{t_r}$$

i.e., calculate the nominal value  $(t_{r,0})$  and the sensitivity coefficients  $(t_{r,i} \text{ and } t_{r,m+1})$  as well as the skewness of the distribution of  $\Delta S_{t_r}$ .

The gate output transition time is a function of the input transition time, the logic gate characteristics (e.g., the W/L ratio, threshold voltage of transistors,  $V_{dd}$ , and temperature), and the output load. In commercial ASIC cell libraries, it is possible to characterize various output transition times (e.g., 10%, 50%, and 90%) as a function of the above variables, i.e.,

$$t_r = TF(t_{in}, c_l, z) \quad \text{where} \quad z = \left\{\frac{W}{L}, V_T, V_{dd}, Temp, \ldots\right\}$$
(1)

where  $t_r$  is the output transition time and *TF* is the corresponding output transition time function. *z* captures the gate characteristics and environmental factors,  $t_{in}$  is the input transition time, and  $c_l$  is the output *capacitive* load. Based on the Invariant Functional Form Property, the form of function *TF* is independent of its input type (deterministic or variational). Hence, we extend the above equation to the variational case. In block-based  $\sigma$ TA,  $t_{in}$ ,  $c_l$ , and every parameter in *z* are given in the CFO form, each as a function of *m* global sources and exactly one independent random source of variation. Therefore,  $t_r$  itself is a complex (non-CFO) random variable. To represent  $t_r$  in the CFO form, we replace  $t_{in}$ ,  $c_l$ , and *z* with their corresponding CFO models and expand to first order around the nominal point. By differentiating with respect to the global and independent random sources of variation,  $t_r$ , as a function of *m* global sources of variation and *p* independent random sources of variation, can be approximated as:

$$t_r \cong TF \Big|_{\Delta X_i=0,\,\Delta S_j=0} + \sum_{i=1}^{m} \frac{\partial TF}{\partial \Delta X_i} \Big|_{\Delta X_i=0,\,\Delta S_j=0} \cdot \Delta X_i + \sum_{j=1}^{p} \frac{\partial TF}{\partial \Delta S_j} \Big|_{\Delta X_i=0,\,\Delta S_j=0} \cdot \Delta S_j$$
(2)

Considering  $\Delta S_j$ 's having  $Dist_j(\mu=0, \sigma^2=1, \kappa_j)$ , Eqn.(2) can be re-written as:

$$\overset{\triangleleft \triangleright}{t_r} = TF \Big|_{\Delta X_i=0,\,\Delta S_j=0} + \sum_{i=1}^{m} \frac{\partial TF}{\partial \Delta X_{i}} \Big|_{\Delta X_i=0,\,\Delta S_j=0} \cdot \Delta X_{i} + \sqrt{\sum_{j=1}^{p} \left(\frac{\partial TF}{\partial \Delta S_{j}} \Big|_{\Delta X_i=0,\,\Delta S_j=0}\right)^{2}} \cdot \Delta S_{t_{r}}$$

By using Lemma 1:

$$\Delta S_{t_r} \sim Dist\left(\mu = 0, \sigma^2 = 1, \kappa = \sum_{j=1}^{p} \left(\frac{\partial TF}{\partial \Delta S_j}\Big|_{\Delta X_i=0,\,\Delta S_j=0}\right)^3 \kappa_j / \left(\sum_{j=1}^{p} \left(\frac{\partial TF}{\partial \Delta S_j}\Big|_{\Delta X_i=0,\,\Delta S_j=0}\right)^2\right)^{3/2}\right)$$
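The conversion described in this subsection can be sketched numerically. The Python fragment below is a minimal illustration, not the paper's implementation: it estimates the partial derivatives at the nominal point with central differences and then collapses the independent terms with Lemma 1. The function `to_cfo` and the toy transition-time model `tf_example` (with its coefficients) are hypothetical.

```python
import math

def to_cfo(f, m, skews, h=1e-5):
    """First-order (CFO) model of f(dX, dS) around dX = dS = 0.

    f takes two lists: m global deviations and p independent deviations.
    skews[j] is the skewness of the j-th independent source.
    Returns (nominal, global_sens, indep_sens, indep_skew).
    """
    p = len(skews)
    zero = [0.0] * (m + p)

    def g(v):
        return f(v[:m], v[m:])

    def partial(k):
        # Central difference with respect to the k-th source of variation.
        up, dn = list(zero), list(zero)
        up[k], dn[k] = h, -h
        return (g(up) - g(dn)) / (2.0 * h)

    nominal = g(zero)
    global_sens = [partial(i) for i in range(m)]
    indep = [partial(m + j) for j in range(p)]
    var = sum(d * d for d in indep)
    indep_sens = math.sqrt(var)                       # coefficient of dS_tr
    indep_skew = (sum(d ** 3 * k for d, k in zip(indep, skews)) / var ** 1.5
                  if var > 0 else 0.0)                # skewness from Lemma 1
    return nominal, global_sens, indep_sens, indep_skew

def tf_example(dx, ds):
    # Hypothetical cell model (ps, fF); coefficients are made up for illustration.
    t_in = 100.0 + 5.0 * dx[0] + 3.0 * ds[0]
    c_l = 50.0 + 2.0 * dx[0] + 4.0 * ds[1]
    return 20.0 + 0.2 * t_in + 1.5 * c_l + 0.001 * t_in * c_l

nominal, g_sens, s_sens, s_skew = to_cfo(tf_example, m=1, skews=[0.4, 0.8])
print(nominal, g_sens, s_sens, s_skew)
```

In a real flow, `tf_example` would be replaced by a lookup into the gate timing library, and the derivatives would come from the library itself or from finite differences over its tables.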

Lemma 2 shows how to express the sum, difference, product, and quotient of two CFO random variables as a new CFO random variable.

*Lemma* 2: Suppose *a* and *b* are two given CFO random variables:

$$\overset{\triangleleft \triangleright}{a} = a_0 + \sum_{i=1}^m a_i \Delta X_i + a_{m+1} \Delta S_a \qquad \overset{\triangleleft \triangleright}{b} = b_0 + \sum_{i=1}^m b_i \Delta X_i + b_{m+1} \Delta S_b$$

Then, for addition, subtraction, multiplication, and division of *a* and *b*, we have:

a) Addition and subtraction:

$$\overset{\triangleleft \triangleright}{c} = \overset{\triangleleft \triangleright}{a} \pm \overset{\triangleleft \triangleright}{b} = (a_0 \pm b_0) + \sum_{i=1}^m (a_i \pm b_i) \Delta X_i + \sqrt{a_{m+1}^2 + b_{m+1}^2} \, \Delta S_c$$

b) Multiplication:

$$\overset{\triangleleft \triangleright}{c} \cong \overset{\triangleleft \triangleright}{a} \times \overset{\triangleleft \triangleright}{b} = a_0 b_0 + \sum_{i=1}^m \left( a_0 b_i + a_i b_0 \right) \Delta X_i + \sqrt{\left( a_0 b_{m+1} \right)^2 + \left( a_{m+1} b_0 \right)^2} \, \Delta S_c$$

c) Division:

$$\overset{\triangleleft \triangleright}{c} \cong \frac{\overset{\triangleleft \triangleright}{a}}{\overset{\triangleleft \triangleright}{b}} = \frac{a_0}{b_0} + \sum_{i=1}^{m} \frac{a_i b_0 - a_0 b_i}{b_0^2} \Delta X_i + \sqrt{\left(\frac{a_{m+1}}{b_0}\right)^2 + \left(\frac{a_0 b_{m+1}}{b_0^2}\right)^2} \, \Delta S_c$$

Proof: It is omitted for brevity.
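The arithmetic of Lemma 2 maps naturally onto a small value type. The following Python sketch is one possible encoding, assuming that the two operands carry distinct, mutually independent random sources; the class name `CFO` and its field names are illustrative, and the skewness of the combined independent source (which would follow from Lemma 1) is not tracked here.

```python
import math
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class CFO:
    nominal: float
    glob: Tuple[float, ...]  # sensitivities to the m global sources
    indep: float             # sensitivity to the variable's own independent source

    def __add__(self, o):
        return CFO(self.nominal + o.nominal,
                   tuple(x + y for x, y in zip(self.glob, o.glob)),
                   math.hypot(self.indep, o.indep))

    def __sub__(self, o):
        return CFO(self.nominal - o.nominal,
                   tuple(x - y for x, y in zip(self.glob, o.glob)),
                   math.hypot(self.indep, o.indep))

    def __mul__(self, o):
        a0, b0 = self.nominal, o.nominal
        return CFO(a0 * b0,
                   tuple(a0 * y + x * b0 for x, y in zip(self.glob, o.glob)),
                   math.hypot(a0 * o.indep, self.indep * b0))

    def __truediv__(self, o):
        a0, b0 = self.nominal, o.nominal
        return CFO(a0 / b0,
                   tuple((x * b0 - a0 * y) / b0 ** 2
                         for x, y in zip(self.glob, o.glob)),
                   math.hypot(self.indep / b0, a0 * o.indep / b0 ** 2))

a = CFO(10.0, (1.0, 2.0), 3.0)
b = CFO(5.0, (0.5, -1.0), 4.0)
print(a + b)
print(a * b)
print(a / b)
```

Multiplication and division are first-order approximations, as in the lemma: second-order cross terms between the deviations are dropped.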

## 3. RC- $\pi$ Load Calculation in the CFO Form

In very deep sub-micron (VDSM) technologies, one cannot neglect the effect of interconnect resistance of the load on the gate delay and output transition time. In STA, an adequate approximation of an  $n^{th}$  order load seen by the gate (i.e., a load with *n* distributed capacitances to ground) is obtained by replacing the load by a second order RC- $\pi$  model [10]. Equating the first, second, and third moments of the admittance of the real load with the first, second, and third moments of the RC- $\pi$  load, one can compute  $c_n$ ,  $r_{\pi}$ , and  $c_f$  as [11]:

$$c_n = Y_{1,in} - \frac{Y_{2,in}^2}{Y_{3,in}} \qquad r_\pi = -\frac{Y_{3,in}^2}{Y_{2,in}^3} \qquad c_f = \frac{Y_{2,in}^2}{Y_{3,in}}$$
(3)

where  $Y_{k,in}$  is the  $k^{th}$  moment of the admittance of the real load. In  $\sigma$ TA, it is required to consider the effect of variability of the load on the gate timing analysis, as detailed below.
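As a quick deterministic check of Eqn. (3), the Python sketch below maps three admittance moments to the π-model elements and verifies the mapping on the moments of a known π circuit. The helper `pi_moments` uses the series expansion of $s c_n + s c_f / (1 + s r_\pi c_f)$; both function names and the numeric values are illustrative.

```python
def pi_model(y1, y2, y3):
    """RC-pi elements from the first three admittance moments (Eqn. (3))."""
    c_f = y2 * y2 / y3
    c_n = y1 - c_f
    r_pi = -(y3 * y3) / (y2 ** 3)
    return c_n, r_pi, c_f

def pi_moments(c_n, r_pi, c_f):
    """First three admittance moments of a pi circuit:
    Y(s) = s*c_n + s*c_f / (1 + s*r_pi*c_f)."""
    return c_n + c_f, -r_pi * c_f ** 2, r_pi ** 2 * c_f ** 3

# Round trip on an arbitrary pi load (units: fF and kOhm).
c_n, r_pi, c_f = pi_model(*pi_moments(30.0, 0.2, 70.0))
print(c_n, r_pi, c_f)
```

The round trip recovers the original elements up to floating-point rounding, which confirms the signs in Eqn. (3): the second moment of a passive RC load is negative and the third is positive.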

**Problem Statement II:** Given is an *RC* network representation of the load of a logic gate in a design as exemplified in Figure 1(a), where each *r* and *c* is in the CFO form. Note that the distribution characteristics of all global and independent sources of variation ( $\mu$ =0,  $\sigma^2$ =1,  $\kappa$ ) are given. The objective is to calculate an equivalent variational *RC*- $\pi$  load (i.e.,  $c_n$ ,  $r_\pi$ , and  $c_f$  of Figure 1(b) are in the CFO form), while its admittance matches the admittance of the real load in the frequency range of interest.

 $c_n$ ,  $r_{\pi}$ , and  $c_f$  are functions of the admittance moments as seen from Eqn. (3). Hence, by calculating the variational admittance moments, we can calculate the CFO parameters of RC- $\pi$  load (using the technique explained in section 2.2.) This can be done by differentiating the expressions in Eqn. (3) with respect to the sources of variation (cf. section 2.2.) However, as it will be shown next, a recursive operation is utilized to calculate the variational admittance moments and since in each recursion step, we have a complex (non-CFO) random variable which will feeds in the next step and this may increase the complexity of the calculations;

We represent the admittance moments in the CFO form throughout the recursion. This helps us by controlling the complexity of presenting the moments as the recursive function proceeds. Following shows how to calculate the input admittance moments of the real load in the CFO form. Consider the *RCY* segment shown in Figure 2. Assume that the admittances at nodes i and j are represented by infinite series using the admittance moments:



Figure 1: (a) a variational *RC* network representation of a net in a design. (b) the equivalent variational RC- $\pi$  model.

$$Y_i(s) = sY_{1,i} + s^2Y_{2,i} + \dots + s^kY_{k,i} + \dots$$
  
$$Y_j(s) = sY_{1,j} + s^2Y_{2,j} + \dots + s^kY_{k,j} + \dots$$

where  $Y_{k,i}$ , which is the coefficient of  $s^k$ , denotes the  $k^{th}$  moment of the admittance of node *i*. Thus, in STA, the admittance at node *i* is recursively computed in terms of the admittance at node *j* [11]:

$$\begin{aligned} \mathbf{Y}_{1,i} &= \mathbf{Y}_{1,j} + c_i \\ \mathbf{Y}_{k,i} &= \mathbf{Y}_{k,j} - r_i \sum_{l=1}^{k-1} \mathbf{Y}_{l,i} \mathbf{Y}_{k-l,j} - r_i c_i \mathbf{Y}_{k-1,i} \quad \text{for } k \ge 2 \end{aligned}$$
(4)

Using the Invariant Functional Form Property, we extend the above equation to the variational case. Assume the admittance moments of node *j* are written in the CFO form. Thus, by differentiating  $Y_{k,i}$  with respect to the sources of variations, the  $Y_{k,i}$  moments can be also represented in the CFO form (cf. section 2.2.)

Figure 2: An *RCY* segment model for recursive admittance moment calculation.

By using the above recursive operations, we easily compute the moments of  $Y_{in}=Y_1$  in the CFO form, and hence we calculate the values of  $c_n$ ,  $r_{\pi}$ , and  $c_f$  in the CFO form using Eqn. (3).
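The recursion of Eqn. (4) is easy to exercise in the deterministic case. The Python sketch below walks an RC ladder from its far end back to the driver; it assumes each segment is a series resistance followed by a capacitance to ground, and the function names are illustrative. In the variational case, each multiplication and addition would be replaced by the corresponding CFO operation of Lemma 2 so that every intermediate moment stays in the CFO form.

```python
def upstream_moments(y_j, r_i, c_i, order=3):
    """Admittance moments at node i from those at node j (Eqn. (4)).

    y_j[k-1] holds Y_{k,j}; the returned list holds Y_{k,i} for k = 1..order.
    """
    y_i = [y_j[0] + c_i]
    for k in range(2, order + 1):
        conv = sum(y_i[l - 1] * y_j[k - l - 1] for l in range(1, k))
        y_i.append(y_j[k - 1] - r_i * conv - r_i * c_i * y_i[k - 2])
    return y_i

def ladder_input_moments(segments, order=3):
    """segments: list of (r, c) pairs ordered from the driver to the far end."""
    y = [0.0] * order          # open circuit beyond the last node
    for r, c in reversed(segments):
        y = upstream_moments(y, r, c, order)
    return y

# Two identical unit segments: Y(s) = (2s + s^2) / (1 + 3s + s^2).
moments = ladder_input_moments([(1.0, 1.0), (1.0, 1.0)])
print(moments)
```

For the two-segment example, expanding the rational function by hand gives the series 2s − 5s² + 13s³ + …, which the recursion reproduces.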

## 4. Gate Timing Analysis for the *RC*-π Load in Block-Based σTA

**Problem Statement III:** Given is a variational CMOS driver, whose input rise time,  $t_{in}$ , is in the CFO form and drives a variational *RC*- $\pi$  load. The resistance and capacitances of this load are also in the CFO form. The distribution characteristics of all global and independent sources of variation ( $\mu$ =0,  $\sigma^2$ =1,  $\kappa$ ) are given. The objective is to calculate the output transition time,  $t_r$ , in the CFO form:

$$\overset{\triangleleft \triangleright}{t_r} = t_{r,0} + \sum_{i=1}^m t_{r,i} \Delta X_i + t_{r,m+1} \Delta S_{t_r}$$

i.e., calculate the nominal value  $(t_{r,0})$  and the sensitivity coefficients  $(t_{r,i} \text{ and } t_{r,m+1})$  as well as the skewness of the distribution of  $\Delta S_{t_r}$ .

Section 2.2.1 solves the same problem for a gate that drives a variational purely capacitive load in the CFO form (cf. Eqn. (1)). Therefore, if we substitute the RC- $\pi$  load with its equivalent variational effective capacitance,  $c_{eff}$ , in the CFO form, then the solution to Problem Statement I is an acceptable solution to Problem Statement III. Based on this reasoning, the following subsections propose a solution for calculating the effective capacitance in the CFO form. Section 4.1 presents a new effective capacitance calculation in static timing analysis. This approach is used in section 4.2, where the statistical effective capacitance is calculated.

## 4.1 A new approach for effective capacitance calculation in static timing analysis

By definition, the effective capacitance is a pure capacitance that replaces an RC- $\pi$  load and has the property that it gives the most accurate result from a timing model that is characterized with lumped capacitance. Typically, the effective capacitance stores the same amount of charge as the RC- $\pi$  load until a certain point of the output voltage transition [11][12][13] (e.g., the 50% point of the output transition). Figure 3(a) depicts a typical CMOS driver with its input waveform and RC- $\pi$  load. The output voltage waveform may be modeled as a weighted linear sum of ramp and exponential waveforms as shown in Figure 3(b). We therefore assume that the *actual*  $c_{eff}$  can be obtained as a weighted average of the values obtained for the ramp and exponential output waveforms.

In the following, we calculate  $c_{eff}$  for ramp and exponential waveforms of the gate output voltage.



Figure 3: (a) A gate, which drives an RC- $\pi$  calculated load. (b) Gate output waveform is neither ramp nor exponential.

**Theorem 1:** Suppose that output voltage of a gate is approximated with an exponential waveform:

$$V_N(t) = V_{dd} \left( 1 - e^{-pt} \right)$$
 where  $p = \frac{\ln \left( \frac{1 - \alpha}{1 - \beta} \right)}{t_r}$ 

where  $V_N(t)$  is the gate output voltage waveform in time domain and  $t_r$  is the output rise time from  $\alpha$ % transition to  $\beta$ % transition of this waveform. Note that  $t_r$  is a function of the input transition time  $(t_{in})$  and the output load. Thus, the iterative effective capacitance equation for matching any  $\theta$ % point of the gate output transition time can be written as:

$$\begin{aligned} c_{\text{eff}}^{\text{Exp}}\left(\theta\right) &= G\left(t_{r}, c_{n}, r_{\pi}, c_{f}\right) = c_{n} + k_{\text{Exp}}\left(\theta\right)c_{f} \quad \text{where} \\ k_{\text{Exp}}\left(\theta\right) &= \left[1 + \frac{y}{\theta}\left(e^{\ln(1-\theta)/y} - 1\right)\right] \quad \text{and} \quad y = \ln\left(\frac{1-\alpha}{1-\beta}\right) \times \frac{r_{\pi}c_{f}}{t_{r}\left(t_{in}, c_{\text{eff}}^{\text{Exp}}\left(\theta\right)\right)} \end{aligned}$$

Similarly for the ramp output voltage waveform, we have:

$$c_{eff}^{Ramp}(\theta) = H(t_r, c_n, r_\pi, c_f) = c_n + k_{Ramp}(\theta) c_f \quad \text{where}$$

$$k_{Ramp}(\theta) = \left[1 - \frac{x}{\theta} (1 - e^{-\theta/x})\right] \quad \text{and} \quad x = (\beta - \alpha) \frac{r_\pi c_f}{t_r(t_{in}, c_{eff}^{Ramp}(\theta))}$$

Proof: It is omitted for brevity.

Now, based on the assumption made above, an iterative equation for actual  $c_{eff}$  calculation for any  $\theta\%$  point of the output transition time may be written as:

$$c_{eff}^{Exp} \left( \theta \right) = G \left( t_r \left( t_{in}, c_{eff}^{Exp} \left( \theta \right) \right), c_n, r_{\pi}, c_f \right)$$

$$c_{eff}^{Ramp} \left( \theta \right) = H \left( t_r \left( t_{in}, c_{eff}^{Ramp} \left( \theta \right) \right), c_n, r_{\pi}, c_f \right)$$

$$c_{eff} \left( \theta \right) = F \left( t_r \left( t_{in}, c_{eff} \left( \theta \right) \right), c_n, r_{\pi}, c_f \right) = \zeta \cdot G + (1 - \zeta) H$$
(5)

where  $0 \le \zeta \le 1$  is the weighting factor for the linear combination of exponential and ramp waveforms. We have observed that, for  $\theta = 50\%$ ,  $\zeta=0.5$  results in the minimum error between the iterative  $c_{eff}$  equation in Eqn. (5) and the actual sign-off  $c_{eff}$  value.
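The iteration implied by Eqn. (5) can be sketched as a plain fixed-point loop. The Python fragment below transcribes the $k_{Exp}$ and $k_{Ramp}$ expressions of Theorem 1 as printed and blends them with the weight ζ. The transition-time model `t_r_linear`, the units (ps, fF, kΩ), and all numeric values are assumptions made purely for illustration; a real flow would query the gate timing library instead.

```python
import math

def k_exp(y, theta):
    return 1.0 + (y / theta) * (math.exp(math.log(1.0 - theta) / y) - 1.0)

def k_ramp(x, theta):
    return 1.0 - (x / theta) * (1.0 - math.exp(-theta / x))

def blend_F(t_r, c_n, r_pi, c_f, theta=0.5, alpha=0.1, beta=0.9, zeta=0.5):
    """F = zeta*G + (1 - zeta)*H from Eqn. (5), for a given t_r."""
    tau = r_pi * c_f
    y = math.log((1.0 - alpha) / (1.0 - beta)) * tau / t_r
    x = (beta - alpha) * tau / t_r
    g = c_n + k_exp(y, theta) * c_f
    h = c_n + k_ramp(x, theta) * c_f
    return zeta * g + (1.0 - zeta) * h

def effective_capacitance(t_in, c_n, r_pi, c_f, t_r_of, tol=1e-9, max_iter=100):
    c_eff = c_n + c_f                      # start from the total capacitance
    for _ in range(max_iter):
        c_new = blend_F(t_r_of(t_in, c_eff), c_n, r_pi, c_f)
        if abs(c_new - c_eff) < tol:
            return c_new
        c_eff = c_new
    return c_eff

def t_r_linear(t_in, c_l):
    # Hypothetical 10%-90% transition-time model (ps, fF); not from a real library.
    return 0.3 * t_in + 2.0 * c_l

c_eff = effective_capacitance(100.0, 50.0, 0.2, 100.0, t_r_linear)
print(c_eff)
```

With these example values the loop contracts quickly, and the result lies between $c_n$ and $c_n + c_f$, reflecting the resistive shielding of the far-end capacitance.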

## 4.2 Calculating $c_{eff}$ in the CFO form

Suppose  $t_{in}$ ,  $c_n$ ,  $r_{\pi}$ , and  $c_f$  in the CFO form are given as:

$$\overset{\triangleleft \triangleright}{t_{in}} = t_{in,0} + \sum_{i=1}^{m} t_{in,i} \Delta X_{i} + t_{in,m+1} \Delta S_{t_{in}}$$
(6)

$$\overset{\triangleleft \triangleright}{c_n} = c_{n,0} + \sum_{i=1}^{m} c_{n,i} \Delta X_{i} + c_{n,m+1} \Delta S_{c_n}$$
(7)

$$\overset{\triangleleft \triangleright}{r_{\pi}} = r_{\pi,0} + \sum_{i=1}^{m} r_{\pi,i} \Delta X_i + r_{\pi,m+1} \Delta S_{r_{\pi}}$$
(8)

$$\overset{\triangleleft \triangleright}{c_f} = c_{f,0} + \sum_{i=1}^{m} c_{f,i} \Delta X_{i} + c_{f,m+1} \Delta S_{c_{f}}$$
(9)

$$\Delta S_{t_{in}} \sim Dist\left(\mu = 0, \sigma^2 = 1, \kappa_{t_{in}}\right) \qquad \Delta S_{c_n} \sim Dist\left(\mu = 0, \sigma^2 = 1, \kappa_{c_n}\right)$$

$$\Delta S_{r_{\pi}} \sim Dist\left(\mu = 0, \sigma^2 = 1, \kappa_{r_{\pi}}\right) \qquad \Delta S_{c_f} \sim Dist\left(\mu = 0, \sigma^2 = 1, \kappa_{c_f}\right)$$
(10)

The effective capacitance for this problem generally becomes a complex random variable, i.e.  $c_{eff}^{\wp}$ . Therefore, we approximate it with its CFO form and the objective becomes to calculate the coefficients of  $c_{eff}$  in the CFO form as well as the skewness of  $\Delta S_{c_{eff}}$  as:

$$\overset{\triangleleft \triangleright}{c_{eff}} = c_{eff,0} + \sum_{i=1}^{m} c_{eff,i} \Delta X_i + c_{eff,m+1} \Delta S_{c_{eff}}$$
(11)

such that  $E\left[\left(\overset{\triangleleft \triangleright}{c_{eff}} - F\left(t_r\left(\overset{\triangleleft \triangleright}{t_{in}}, \overset{\triangleleft \triangleright}{c_{eff}}\right), c_n, r_{\pi}, c_f\right)\right)^2\right]$  is minimized,

where F is given in Eqn. (5). Theorem 2 presents the solution for calculating these unknown values.

**Theorem 2:** For a variational circuit, where  $t_{in}$ ,  $c_n$ ,  $r_{\pi}$ , and  $c_f$  in the CFO form are written as in Eqns. (6)-(10), the coefficients of  $c_{eff}$  in the CFO form (Eqn. (11)), can be calculated as:

$$c_{eff,0} = F\left(t_r\left(t_{in,0}, c_{eff,0}\right), c_{n,0}, r_{\pi,0}, c_{f,0}\right)$$
(12)

$$c_{eff,i} = \frac{\left(\frac{\partial t_r}{\partial t_{in}}\right)^{nom} \cdot \left(\frac{\partial F}{\partial t_r}\right)^{nom} \cdot t_{in,i} + \left(\frac{\partial F}{\partial c_n}\right)^{nom} \cdot c_{n,i} + \left(\frac{\partial F}{\partial r_{\pi}}\right)^{nom} \cdot r_{\pi,i} + \left(\frac{\partial F}{\partial c_{f}}\right)^{nom} \cdot c_{f,i}}{1 - \left(\frac{\partial F}{\partial t_r}\right)^{nom} \cdot \left(\frac{\partial t_r}{\partial c_{eff}}\right)^{nom}}$$
(13)

$$c_{eff,m+1} = \sqrt{\left(c_{eff,m+1}^{\ t_{in}}\right)^{2} + \left(c_{eff,m+1}^{\ c_n}\right)^{2} + \left(c_{eff,m+1}^{\ r_{\pi}}\right)^{2} + \left(c_{eff,m+1}^{\ c_f}\right)^{2}}$$
(14)

$$\Delta S_{c_{eff}} \sim Dist \left( \mu = 0, \sigma^2 = 1, \kappa = \frac{\sum_{u \in U} (c_{eff, m+1}^{\ u})^3 \kappa_u}{\left(\sum_{u \in U} (c_{eff, m+1}^{\ u})^2\right)^{3/2}} \right)$$
(15)

and  $U = \{ t_{in}, c_{n}, r_{\pi}, c_{f} \}$ 

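where, for each  $u \in U$ ,  $c_{eff,m+1}^{\ u}$  is the contribution of the independent random source of  $u$ , obtained from the same first-order expansion as Eqn. (13):

$$c_{eff,m+1}^{\ t_{in}} = \frac{\left(\frac{\partial t_r}{\partial t_{in}}\right)^{nom} \left(\frac{\partial F}{\partial t_r}\right)^{nom} t_{in,m+1}}{D} \qquad c_{eff,m+1}^{\ c_n} = \frac{\left(\frac{\partial F}{\partial c_n}\right)^{nom} c_{n,m+1}}{D}$$

$$c_{eff,m+1}^{\ r_{\pi}} = \frac{\left(\frac{\partial F}{\partial r_{\pi}}\right)^{nom} r_{\pi,m+1}}{D} \qquad c_{eff,m+1}^{\ c_f} = \frac{\left(\frac{\partial F}{\partial c_f}\right)^{nom} c_{f,m+1}}{D} \qquad D = 1 - \left(\frac{\partial F}{\partial t_r}\right)^{nom} \left(\frac{\partial t_r}{\partial c_{eff}}\right)^{nom}$$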



*Proof:* It is omitted for brevity.

Eqn. (12) is the iterative  $c_{eff}$  calculation under the nominal conditions of the circuit. Hence,  $c_{eff,0}$  can be evaluated by using the conventional effective capacitance calculation [12][13].

 $t_{in,i}$ ,  $c_{n,i}$ ,  $r_{\pi i}$ ,  $c_{f,i}$ , are given (cf. Eqns. (6)-(9).) To evaluate Eqns. (13) and (14), we must calculate the derivatives of function F (function F is given in Eqn. (5)) with respect to  $t_r$ ,  $c_n$ ,  $r_{\pi r}$ ,  $c_{fr}$  and evaluate these derivatives for the nominal values of the circuit parameters (when all sources of variation are set to zero i.e.,  $(\partial F/\partial t_r)^{nom}$ ,  $(\partial F/\partial c_n)^{nom}$ ,  $(\partial F/\partial r_n)^{nom}$ , and  $(\partial F/\partial c_f)^{nom}$ ) These terms are easy to evaluate. For the remaining terms, we need to calculate the derivatives of the output transition time  $(t_r)$  with respect to  $t_{in}$  and  $c_{eff}$  and evaluate them under the nominal condition of the circuit (i.e.,  $(\partial t_r/\partial t_{in})^{nom}$  and  $(\partial t_r/\partial c_{eff})^{nom}$ .) Therefore, we propose two different solutions:

1. Updating the gate library look-up table and utilizing the additional data during  $\sigma$ TA: The revised tables now provide not only the timing quantity for each combination of  $t_{in}$  and  $c_l$ , but also the derivatives of the timing quantity  $(t_r)$  with respect to  $t_{in}$  and  $c_l$  for each combination of  $t_{in}$  and  $c_l$ .

2. Using the existing gate library look-up table, but performing additional calculations during  $\sigma$ TA: To approximately calculate  $(\partial t_r/\partial t_{in})^{nom}$ , we read  $t_r$  (from the gate library) for  $\langle t_{in,0}, c_{l,0} \rangle$  and  $\langle t_{in,0}+\delta, c_{l,0} \rangle$ . Next, we calculate  $\Delta t_r/\delta$  as the approximation.  $(\partial t_r/\partial c_{eff})^{nom}$  can be similarly calculated.
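The second solution can be sketched as a forward difference over an interpolated look-up table. The Python fragment below is a minimal illustration under stated assumptions: the table axes, the table contents (generated from a made-up linear model), and the function names are all hypothetical, and bilinear interpolation stands in for whatever interpolation the library actually uses.

```python
from bisect import bisect_right

def bilinear(xs, ys, table, x, y):
    """Bilinear interpolation; table[i][j] is the value at (xs[i], ys[j])."""
    i = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
    j = min(max(bisect_right(ys, y) - 1, 0), len(ys) - 2)
    tx = (x - xs[i]) / (xs[i + 1] - xs[i])
    ty = (y - ys[j]) / (ys[j + 1] - ys[j])
    lo = table[i][j] * (1 - ty) + table[i][j + 1] * ty
    hi = table[i + 1][j] * (1 - ty) + table[i + 1][j + 1] * ty
    return lo * (1 - tx) + hi * tx

def table_derivatives(xs, ys, table, t_in0, c_l0, delta_t=1.0, delta_c=1.0):
    """Forward-difference estimates of dt_r/dt_in and dt_r/dc_l at the nominal point."""
    base = bilinear(xs, ys, table, t_in0, c_l0)
    d_tin = (bilinear(xs, ys, table, t_in0 + delta_t, c_l0) - base) / delta_t
    d_cl = (bilinear(xs, ys, table, t_in0, c_l0 + delta_c) - base) / delta_c
    return d_tin, d_cl

# Hypothetical table: t_r = 10 + 0.2*t_in + 0.05*c_l (ps, fF).
t_axis = [10.0, 80.0, 150.0, 220.0, 300.0]
c_axis = [100.0, 400.0, 800.0, 1400.0]
t_r_table = [[10.0 + 0.2 * t + 0.05 * c for c in c_axis] for t in t_axis]

d_tin, d_cl = table_derivatives(t_axis, c_axis, t_r_table, 100.0, 500.0)
print(d_tin, d_cl)
```

The step sizes `delta_t` and `delta_c` trade truncation error against table resolution; steps that cross a grid line of a coarse table pick up the change in slope between neighbouring cells.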

Using either of the above solutions, Eqns. (13) and (14) become closed-form expressions, which can be evaluated in constant time. Note that we calculate  $(\partial F/\partial t_r)^{nom}$ ,  $(\partial F/\partial c_n)^{nom}$ ,  $(\partial F/\partial r_{\pi})^{nom}$ , and  $(\partial F/\partial c_f)^{nom}$  only once and in constant time. Therefore, the complexity of our method is dominated by the iterative effective capacitance calculation under the nominal conditions.
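To illustrate how Eqn. (13) is evaluated, the Python sketch below computes one global sensitivity of $c_{eff}$ from nominal-point derivatives and compares it with a brute-force perturbation of the fixed-point solution. To keep the example short, *F* is taken to be the ramp-only form *H* of Theorem 1 (i.e., ζ = 0), the derivatives are obtained by central differences, and the transition-time model, units (ps, fF, kΩ), and numeric values are assumptions made for illustration only.

```python
import math

def t_r_model(t_in, c_eff):
    # Hypothetical transition-time model (ps, fF); not from a real library.
    return 0.3 * t_in + 2.0 * c_eff

def F(t_r, c_n, r_pi, c_f, theta=0.5, alpha=0.1, beta=0.9):
    # Ramp-only form H of Theorem 1, used here in place of the blended F.
    x = (beta - alpha) * r_pi * c_f / t_r
    return c_n + (1.0 - (x / theta) * (1.0 - math.exp(-theta / x))) * c_f

def solve_ceff(t_in, c_n, r_pi, c_f, iters=60):
    c = c_n + c_f
    for _ in range(iters):                      # fixed-point iteration, Eqn. (12)
        c = F(t_r_model(t_in, c), c_n, r_pi, c_f)
    return c

def diff(fn, args, k, h=1e-4):
    up, dn = list(args), list(args)
    up[k] += h
    dn[k] -= h
    return (fn(*up) - fn(*dn)) / (2.0 * h)

def ceff_sensitivity(nom, sens):
    """Eqn. (13): sens holds (t_in_i, c_n_i, r_pi_i, c_f_i) for one global source."""
    t_in, c_n, r_pi, c_f = nom
    c0 = solve_ceff(*nom)
    f_args = [t_r_model(t_in, c0), c_n, r_pi, c_f]
    dF = [diff(F, f_args, k) for k in range(4)]  # wrt t_r, c_n, r_pi, c_f
    dtr_dtin = diff(t_r_model, [t_in, c0], 0)
    dtr_dceff = diff(t_r_model, [t_in, c0], 1)
    num = (dtr_dtin * dF[0] * sens[0] + dF[1] * sens[1]
           + dF[2] * sens[2] + dF[3] * sens[3])
    return num / (1.0 - dF[0] * dtr_dceff)

nom = (100.0, 50.0, 0.2, 100.0)
sens = (5.0, 2.0, 0.01, 4.0)
analytic = ceff_sensitivity(nom, sens)

eps = 1e-3                                       # brute-force check along dX_i
plus = solve_ceff(*[n + eps * s for n, s in zip(nom, sens)])
minus = solve_ceff(*[n - eps * s for n, s in zip(nom, sens)])
brute = (plus - minus) / (2.0 * eps)
print(analytic, brute)
```

The two numbers agree closely because Eqn. (13) is exactly the implicit derivative of the fixed-point equation; the closed form avoids re-running the iteration for every source of variation.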

## 5. Experimental Results

Our experiments use 90nm CMOS process parameters to model gates and interconnect parasitics. We assumed two different configurations for the experimental setup. The first one consists of two inverters connected in series whereas the second one is a CMOS inverter followed by a 2-input NAND gate. For both configurations, we apply a ramp input to the first inverter while its nominal value is chosen from the set  $(t_{in})^{nom}$ ={10ps,80ps,150ps,220ps,300ps}. For the first configuration, size of the first inverter is fixed at  $W_p/W_n$  =30/15 $\mu$ m whereas size of the second inverter is chosen to be one of  $W_p/W_n$ ={20/10, 50/25, 70/35, 100/50} $\mu$ m. For the second configuration, size of the first inverter is again fixed at  $W_p/W_n$  =30/15 $\mu$ m whereas this time the size of the succeeding 2-input NAND gate is chosen to be one of  $W_p/W_n$ ={40/40, 50/50, 100/100} $\mu$ m.

To characterize the timing behavior of the gate, a look-up table based library is employed which represents the gate delay and output transition time as a function of input rise time, output capacitive load,  $V_{dd}$ , and temperature. We apply different loading scenarios for the second-stage gate as explained in the following subsections, i.e., pure capacitive load and general RC load. We have also considered four different global sources of variation ( $V_{dd}$ , temperature, Metal layer 1 width, and ILD) and one independent random source of variation for each electrical parameter (i.e., *r* and *c*) and timing parameter (for instance  $t_{in}$ ) in the circuit. The sensitivity of each given quantity to the sources of variation is chosen randomly, while the total  $\sigma$  variation for each quantity is chosen to be 10% or 15% of its nominal value. We also assumed that the sources of variation are skewed with different skewness values as explained in each subsection. Mean, variance, and skewness of the effective capacitance, the gate 50% propagation delay, and the 10%-90% output transition time (slew) are calculated using the approaches presented in this paper.

To validate the results, we ran Monte Carlo simulation with $10^4$ samples on each test scenario and derived the mean, variance, and skewness of the effective capacitance, the gate 50% propagation delay, and the 10%-90% output transition time. We report the average percentage errors of these quantities between the Monte Carlo results and the results calculated with the proposed statistical gate timing analysis approach.
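The comparison procedure can be sketched as follows. This is a toy illustration under stated assumptions, not the experimental flow of the paper: the Monte Carlo reference here samples a linear model rather than running circuit simulation, the skewed sources are generated from a shifted and scaled gamma distribution (one convenient choice among many), and the nominal value and sensitivities are hypothetical. Because the toy model is exactly linear, the reported errors reflect only sampling noise.

```python
import numpy as np

def skewed_unit_samples(skew, size, rng):
    """Zero-mean, unit-variance samples with a given positive skewness.

    Uses a standardized gamma distribution, whose skewness is 2/sqrt(shape).
    """
    k = 4.0 / skew ** 2
    g = rng.gamma(shape=k, scale=1.0, size=size)
    return (g - k) / np.sqrt(k)

def sample_moments(y):
    """Sample mean, variance, and skewness of a 1-D array."""
    m = y.mean()
    c = y - m
    var = np.mean(c ** 2)
    skew = np.mean(c ** 3) / var ** 1.5
    return m, var, skew

def pct_error(estimate, reference):
    """Absolute percentage error of `estimate` relative to `reference`."""
    return 100.0 * abs(estimate - reference) / abs(reference)

rng = np.random.default_rng(1)
n_samples = 10_000                        # same sample count as in the text
a0 = 100.0                                # hypothetical nominal delay, in ps
a = np.array([6.0, 5.0, 4.0, 3.0, 2.0])   # hypothetical sensitivities, in ps
src_skew = 0.4

# Monte Carlo reference: evaluate the model on sampled sources.
x = skewed_unit_samples(src_skew, (n_samples, a.size), rng)
mc = sample_moments(a0 + x @ a)

# Analytical moments of the first-order form (independent sources assumed).
var = np.sum(a ** 2)
ana = (a0, var, src_skew * np.sum(a ** 3) / var ** 1.5)

for name, est, ref in zip(("mean", "variance", "skewness"), ana, mc):
    print(f"{name}: error = {pct_error(est, ref):.2f}%")
```

Note that the sample skewness converges more slowly than the mean and variance, so its percentage error is typically the largest of the three at a fixed sample count.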

#### A. Purely Capacitive Load

The load in this section is purely capacitive. Its nominal value is chosen from the set $(C)^{nom}$ = {400, 500, 800, 1400} fF. The scaled distribution of the sources of variation has a skewness of 0.4, 0.6, or 0.8. We performed our experiments on both circuit configurations explained above. The results for the first configuration (where the second gate is an inverter) are presented in Table 2 (for a skewness of 0.4) and Table 3 (for a skewness of 0.8). The results for the second configuration are provided in Table 4 (for a skewness of 0.6). Experimental results indicate an average error of about 3% for the two $\sigma$ values, i.e., 10% and 15%. As we increase the $\sigma$ value (i.e., the total $\sigma$ variation of each quantity, e.g., the $\sigma$ variation of $t_{in}$ and $c_l$) from 10% to 15%, the errors in the calculated mean, variance, and skewness of the delay and slew increase only slightly. The sources of error fall mainly into two groups: 1) the inaccuracy of the gate library table lookup and 2) the linear first-order approximation of the timing and electrical parameters with respect to the sources of variation. On average, the proposed algorithm runs 89 times faster than the Monte Carlo based approach.

Table 2: Average error for the inverter driving pure capacitive load (Skewness=0.4)

|               | σ=10% |      | σ=15% |      |
|---------------|-------|------|-------|------|
| Average error | Delay | Slew | Delay | Slew |
| Mean          | 1.5%  | 1.7% | 2.2%  | 2.3% |
| Variance      | 1.2%  | 1.3% | 1.8%  | 1.9% |
| Skewness      | 1.0%  | 1.1% | 1.4%  | 1.3% |

Table 3: Average error for the inverter driving pure capacitive load (Skewness=0.8)

|               | σ=10% |      | σ=15% |      |
|---------------|-------|------|-------|------|
| Average error | Delay | Slew | Delay | Slew |
| Mean          | 1.9%  | 2.3% | 2.5%  | 2.9% |
| Variance      | 1.6%  | 1.7% | 1.9%  | 2.1% |
| Skewness      | 1.4%  | 1.5% | 1.5%  | 1.9% |

Table 4: Average error for the 2-input NAND gate driving pure capacitive load (Skewness=0.6)

|               | σ=10% |      | σ=15% |      |
|---------------|-------|------|-------|------|
| Average error | Delay | Slew | Delay | Slew |
| Mean          | 3.0%  | 3.1% | 3.2%  | 3.1% |
| Variance      | 2.5%  | 2.7% | 2.8%  | 2.9% |
| Skewness      | 2.2%  | 2.3% | 2.5%  | 2.6% |

#### B. General RC Load

In this section, the load is an *RC* tree of varying topology. The nominal value of the total load resistance is chosen from the set $(R)^{nom}$ = {150, 260, 300, 710, 1000} $\Omega$, and the nominal value of the total load capacitance is chosen from the set $(C)^{nom}$ = {400, 500, 800, 1400} fF. The scaled distribution of the sources of variation has a skewness of 0.5, 0.75, or 1.

Again, we performed the experiment on both circuit configurations as explained before. The results for the first configuration (where the second gate is an inverter) are presented in Table 5 (for a skewness of 0.5) and Table 6 (for a skewness of 0.75). The results for the second configuration are provided in Table 7 (for a skewness of 1). Experimental results indicate an average error of about 6% for the different $\sigma$ values. As we increase the $\sigma$ value (i.e., the total $\sigma$ variation of each quantity, e.g., the $\sigma$ variation of $t_{in}$, $c_n$, $r_{\pi}$, and $c_f$) from 10% to 15%, the errors in the calculated mean, variance, and skewness of $c_{eff}$, the gate delay, and the output transition time increase only slightly. Similarly, as the skewness (e.g., the skewness of $t_{in}$, $c_n$, $r_{\pi}$, and $c_f$) increases from 0.5 to 0.75, the errors in the calculated mean, variance, and skewness of $c_{eff}$, the delay, and the slew increase only slightly. The sources of error fall mainly into four groups: 1) the inaccuracy of the gate library table lookup, 2) the linear first-order approximation of the timing and electrical parameters with respect to the sources of variation, 3) the error in calculating the variational RC-$\pi$ load, and 4) the error in the iterative effective capacitance equation proposed in Section 4.1. On average, the proposed algorithm runs 95 times faster than the Monte Carlo based approach.

Table 5: Average error for the inverter driving general *RC* load (Skewness=0.5)

|               | σ=10% |       |      | σ=15% |       |      |
|---------------|-------|-------|------|-------|-------|------|
| Average error | Ceff  | Delay | Slew | Ceff  | Delay | Slew |
| Mean          | 3.2%  | 3.5%  | 4.9% | 3.5%  | 5.4%  | 5.8% |
| Variance      | 2.4%  | 3.3%  | 4.5% | 2.6%  | 5.9%  | 5.2% |
| Skewness      | 2.5%  | 3.3%  | 4.9% | 2.0%  | 5.5%  | 5.5% |

Table 6: Average error for the inverter driving general *RC* load (Skewness=0.75)

|               | σ=10% |       |      | σ=15% |       |      |
|---------------|-------|-------|------|-------|-------|------|
| Average error | Ceff  | Delay | Slew | Ceff  | Delay | Slew |
| Mean          | 3.5%  | 5.1%  | 5.3% | 3.8%  | 5.9%  | 6.1% |
| Variance      | 2.9%  | 4.3%  | 5.5% | 3.6%  | 6.2%  | 6.2% |
| Skewness      | 2.8%  | 4.1%  | 4.9% | 3.1%  | 5.9%  | 5.9% |

Table 7: Average error for the 2-input NAND gate driving general *RC* load (Skewness=1)

|               | σ=10% |       |      | σ=15% |       |      |
|---------------|-------|-------|------|-------|-------|------|
| Average error | Ceff  | Delay | Slew | Ceff  | Delay | Slew |
| Mean          | 4.1%  | 5.2%  | 5.1% | 4.2%  | 6.1%  | 6.7% |
| Variance      | 3.9%  | 5.4%  | 5.2% | 4.3%  | 6.1%  | 6.1% |
| Skewness      | 4.0%  | 6.1%  | 5.6% | 4.2%  | 6.5%  | 6.3% |

# 6. Conclusion

In this paper, we presented a framework for variation-aware gate timing analysis in block-based $\sigma$TA considering non-Gaussian sources of variation. First, we proposed an approach to calculate a variational *RC*-$\pi$ load, which can be used in place of the actual variational *RC* load for gate timing analysis purposes. Next, we presented a new approach for calculating effective capacitance in STA. We used this technique to calculate the statistical $c_{eff}$ in the CFO form, and thereby calculated the gate delay and output slew in that form. Experimental results show an average error of 6% with respect to Monte Carlo simulation with 10<sup>4</sup> samples.

## 7. References

- [1] S. Nassif, "Modeling and Analysis of Manufacturing Variations," *CICC*, 2001, pp. 223-228.
- [2] C. Visweswariah, K. Ravindran, K. Kalafala, S.G. Walker, S. Narayan, "First-order incremental block-based statistical timing analysis," *DAC*, 2004, pp. 331-336.
- [3] H. Chang, V. Zolotov, S. Narayan, and C. Visweswariah "Parameterized Block-Based Statistical Timing Analysis with Non-Gaussian and Non-Linear Parameters," *Int'l Workshop on Timing Issues* (TAU), 2005.
- [4] Y. Liu, L. T. Pileggi, and A. J. Strojwas, "Model Order Reduction of RC(L) Interconnect Including Variational Analysis," DAC, 1999, pp. 201-206.
- [5] J.D. Ma and R.A. Rutenbar, "Interval-Valued Reduced Order Statistical Interconnect Modeling," *ICCAD*, 2004, pp. 460-467.
- [6] K. Agarwal, D. Sylvester, D. Blaauw, F. Liu, S. Nassif, and S. Vrudhula, "Variational delay metrics for interconnect timing analysis," *DAC*, 2004, pp. 381-384.
- [7] A. Agarwal, F. Dartu, D. Blaauw, "Statistical Gate Delay Model Considering Multiple Input Switching," *DAC*, 2004, pp. 658-663.
- [8] K. Okada, K. Yamaoka, and H. Onodera, "A statistical gate-delay model considering intra-gate variability," *ICCAD*, 2003, pp. 908-913.
- [9] V. Mehrotra, S. Nassif, D. Boning, and J. Chung, "Modeling the Effects of Manufacturing Variation on High-Speed Microprocessor Interconnect Performance," *IEEE Electron Devices Meetings*, 1998, pp. 767-770.
- [10] P.R. O'Brien and T. L. Savarino, "Modeling the Driving-Point Characteristics of Resistive Interconnect for Accurate Delay Estimation," *ICCAD*, 1989, pp.512-515.
- [11] A.B. Kahng, S. Muddu, "Improved effective capacitance computations for use in logic and layout optimization," VLSI Design, 1999, pp. 578-582.
- [12] F. Dartu, N. Menezes, and L. Pileggi, "Performance Computation for Precharacterized Gates with RC Loads," *IEEE Trans. on Computer Aided Design* 15(5):544-553, 1996.
- [13] S. Abbaspour, M. Pedram, "Calculating the Effective Capacitance for the RC Interconnect in VDSM Technologies," *ASPDAC*, 2003.