# Parameterized Block-Based Non-Gaussian Variational Gate Timing Analysis 

Soroush Abbaspour, Hanif Fatemi, and Massoud Pedram<br>University of Southern California<br>Department of Electrical Engineering<br>Los Angeles CA


#### Abstract

As technology scales down, timing verification of digital integrated circuits becomes an extremely difficult task due to gate and wire variability. Therefore, statistical timing analysis (denoted by $\sigma$TA) is becoming unavoidable. In this paper, two new approaches for statistical gate timing analysis with Gaussian and non-Gaussian sources of variation in block-based $\sigma$TA are presented. To start, a variational $R C-\pi$ load is approximated by using a canonical first-order model. Next, an accurate variational gate timing analysis (VGTA) technique, which accounts for variational $R C-\pi$ loads, statistical input transitions, and a variation-aware gate library, is introduced. The proposed method relies on a novel static effective capacitance calculation method and its variational form. Experimental results demonstrate that VGTA exhibits an average error of only $4 \%$ for gate delay and output transition time with respect to Monte Carlo simulation with $10^{4}$ samples. Next, a more efficient variational gate timing analysis (called F-VGTA) based on a single-iteration variational effective capacitance calculation is presented. Experimental results show that F-VGTA achieves an average error of $7 \%$ for gate delay and output slew time with respect to Monte Carlo simulation with $10^{4}$ samples, but with runtimes that are about two times faster than VGTA.


## 1. Introduction

Process technology and environment-induced variability of gates and wires in VLSI circuits makes timing analysis of such circuits a challenging task [1]. More precisely, advanced analysis tools must be developed which are capable of verifying the changes in the circuit timing that stem from various sources of variations. These sources are in turn due to the following: imperfect CMOS manufacturing processes (e.g., variations in $L, T_{O X}, V_{T}$ or $I L D$ thickness), environmental factors such as drops in $V_{d d}$ (resistive drop and ground bounce), substrate temperature changes (due to migration of local hot spots over the chip area), and device fatigue phenomena (e.g., electro-migration, hot electron effects, and negative bias temperature instability) [2].

$\sigma$TA approaches can be classified into two major groups: path-based and block-based. In the path-based algorithms, a selected set of paths is submitted to the statistical timer for detailed analysis. Path-based statistical timing is accurate and can realistically capture correlations, but it suffers from other weaknesses. For instance, it is not clear how to select paths for the detailed analysis, since one of the omitted paths may be critical in some part of the process space. In addition, path-based statistical timing often does not provide the diagnostics necessary to improve the robustness of the design [2]. Because of these shortcomings of path-based $\sigma$TA, block-based $\sigma$TA has received a lot of attention. In block-based $\sigma$TA, every timing quantity of interest (e.g., delay and slew time, arrival time and required arrival time) is represented as a function of global sources of variation (denoted by $X_{i}$) and independent random sources of variation (denoted by $S_{i}$) in a canonical first-order (denoted by CFO) form. The advantages of such a formulation are that it can capture many of the key correlations and can produce delay sensitivities due to changes in a variety of environmental and process-related parameters [2]. Sources of variation have often been assumed to be Gaussian, which in turn simplifies block-based $\sigma$TA. However, it has recently been reported that certain process parameters exhibit non-Gaussian probability distributions [3].

Block-based $\sigma$ TA breaks its analysis into two parts: 1) variational interconnect timing analysis, and 2) variational gate timing analysis. Variational interconnect timing analysis has been studied by a
number of researchers. References [4] and [5] presented reduced order modeling approaches for interconnect propagation delay calculation, which accounts for manufacturing variations. These approaches are computationally expensive due to the lack of closed form expressions. The authors of [6] expressed the canonical first-order model of the interconnect delay in closed form and showed how to propagate it through the interconnect. The authors of [7] described a modeling technique for gate delay variability considering multiple input switching. In [8], a model for calculating statistical gate delay variations caused by intra-chip and inter-chip variabilities was presented. These works, however, do not provide an accurate means of analyzing the gate propagation delay and output slew time as a function of variational $R C-\pi$ loads, statistical input transitions, and a variation-aware gate library. In this paper, two new techniques are presented for determining the variational gate timing behavior.

The first technique, called VGTA (for Variational Gate Timing Analysis), performs the following steps. Given the variational resistive-capacitive load (where all resistances and capacitances are represented in the CFO form), an efficient and accurate algorithm is presented to calculate the variational $R C-\pi$ load. To perform the analysis, we calculate the variation-aware admittance moments (cf. section 3), and as a result, the resistance and capacitances in the $R C-\pi$ load can be written in the CFO form. Based on the statistical $R C-\pi$ load obtained in this way, we calculate the variational effective capacitance in the CFO form. To accomplish this goal, first a new approach for effective capacitance calculation in static timing analysis (STA) is presented (cf. section 4.1). This effective capacitance calculation method is then used to calculate the variational effective capacitance in the CFO form, considering non-Gaussian process and environmental sources of variation (cf. section 4.2). Given statistical input transition times, a variation-aware gate library, and the variational effective capacitance ($c_{\text {eff }}$) load in the CFO form, we calculate the variational gate delay and output transition time in the CFO form (cf. section 2.2.1).

The second technique, called F-VGTA (for Fast Variational Gate Timing Analysis), works as follows. The first step of F-VGTA is similar to the first step of the VGTA algorithm. To determine the variational gate delay and output slew time in the CFO form, a "variation-aware effective capacitance" technique is proposed in section 4.3, which is based on the single-iteration $c_{\text {eff }}$ calculation approach of section 4.1.

We point out that although, in this paper, we focus on the random variables in CFO form to represent process and environmental sources of variation as well as the performance quantities of interest, the work itself is not limited to the first-order approximation of these sources of variation. In fact, it is straightforward to extend the approach to more complex (e.g., second-order) forms for both Gaussian and non-Gaussian parameter variations.

The remainder of this paper is organized as follows. In section 2, we review the background of parameterized block-based $\sigma$TA. We also show how to convert a quantity, which itself is a function of global and independent sources of variation, into a canonical first-order (CFO) form. The variation-aware $R C-\pi$ calculation is presented in section 3. Section 4 explains two new statistical gate timing analysis techniques which handle statistical input rise times, a variation-aware gate library, and a variational $R C-\pi$ load. Section 5 presents experimental results. Finally, conclusions are discussed in section 6. We use the notation shown in Table 1 throughout the paper.

Table 1: Useful notation and terminology

| Notation | Description |
| :---: | :---: |
| A | A deterministic variable (which does not take into account any statistical variation) |
| $\stackrel{\wp}{A}$ | An arbitrary (non-CFO) random variable, which is a function of $m$ global and $p$ independent random sources of variation |
| $\stackrel{\triangleleft \triangleright}{A}$ | A CFO random variable, which is a function of $m$ global and $p$ independent random sources of variation, i.e., $\stackrel{\triangleleft \triangleright}{A}=A_{0}+\sum_{i=1}^{m} A_{i} \Delta X_{i}+\sum_{j=1}^{p} A_{m+j} \Delta S_{j}$ |

## 2. Background

In $\sigma \mathrm{TA}$, it is required to evaluate the distribution of the delay and slew time of the critical paths. Until now, this goal has typically been achieved by calculating the mean and variance of the distributions of the delay and slew time. However, as mentioned earlier, sources of variation may have an arbitrary (i.e., non-Gaussian) distribution. Therefore, in general, in addition to calculating the mean and variance of the electrical and timing parameters, one must calculate at least the skewness of their distributions, i.e. one must at the minimum calculate the first three moments of the circuit parameter variations.

Definition 1: The degree of asymmetry of a probability distribution function is called its skewness (denoted by $\kappa$.) A distribution, or data set, is symmetric if it looks the same to the left and right of the center point. The skewness of a normal distribution is zero. Negative values for the skewness indicate distributions which are skewed to the left whereas positive values for the skewness indicate distributions which are skewed to the right. By left (right) skew, we mean that the left (right) tail is heavier than the right (left) tail. The skewness of a distribution is defined to be $\kappa=\frac{\mu_{3}}{\sigma^{3}}$ where $\mu_{3}$ is the $3^{\text {rd }}$ central moment and $\sigma^{2}$ is the variance (second central moment.)

Definition 2: We say $X$ is equal to $Y$ in the first three moments ($X \stackrel{d_{3}}{=} Y$) if the mean, variance, and skewness of $X$ and $Y$ are equal (i.e., they have the same mean and the same second and third central moments).

Lemma 1: Suppose $\Delta S_{1}, \ldots, \Delta S_{n}$ are $n$ independent random variables with distributions $\Delta S_{i} \sim \operatorname{Dist}_{i}\left(\mu=0, \sigma^{2}=1, \kappa_{i}\right)$. Then,

$$
\sum_{i=1}^{n} a_{i} \Delta S_{i} \stackrel{d_{3}}{=} \sqrt{\sum_{i=1}^{n} a_{i}^{2}} \cdot \Delta S_{e q} \quad \text { where } \quad \Delta S_{e q} \sim \operatorname{Dist}\left(\mu=0, \sigma^{2}=1, \kappa=\frac{\sum_{i=1}^{n} a_{i}^{3} \cdot \kappa_{i}}{\left(\sqrt{\sum_{i=1}^{n} a_{i}^{2}}\right)^{3}}\right)
$$

Proof: Using expectation value properties, we have

$$
E\left(\sum_{i=1}^{n} a_{i} \Delta S_{i}\right)=\sum_{i=1}^{n} E\left(a_{i} \Delta S_{i}\right)=\sum_{i=1}^{n} a_{i} E\left(\Delta S_{i}\right)=\sum_{i=1}^{n} a_{i} \cdot 0=0
$$

and

$$
E\left(\sqrt{\sum_{i=1}^{n} a_{i}^{2}} \cdot \Delta S_{e q}\right)=\sqrt{\sum_{i=1}^{n} a_{i}^{2}} \cdot E\left(\Delta S_{e q}\right)
$$

From Definition 2, the mean values of the two sides of the equality must be equal, so

$$
\sqrt{\sum_{i=1}^{n} a_{i}^{2}} E\left(\Delta S_{e q}\right)=0 \Rightarrow E\left(\Delta S_{e q}\right)=0
$$

For the variance of $\Delta S_{\text {eq }}$, we have

$$
E\left(\sum_{i=1}^{n} a_{i} \Delta S_{i}\right)^{2}=\sum_{i=1}^{n} E\left(a_{i} \Delta S_{i}\right)^{2}+2 \sum_{i=1}^{n} \sum_{j=i+1}^{n} E\left(a_{i} \Delta S_{i} \cdot a_{j} \Delta S_{j}\right)
$$

Since the $\Delta S_{i}$'s are independent with zero mean, the cross terms vanish, and we have

$$
=\sum_{i=1}^{n} E\left(a_{i} \Delta S_{i}\right)^{2}+0=\sum_{i=1}^{n} a_{i}^{2} E\left(\Delta S_{i}\right)^{2}=\sum_{i=1}^{n} a_{i}^{2}
$$

and

$$
E\left(\sqrt{\sum_{i=1}^{n} a_{i}^{2}} \cdot \Delta S_{e q}\right)^{2}=\sum_{i=1}^{n} a_{i}^{2} \cdot E\left(\Delta S_{e q}\right)^{2}
$$

From Definition 2, the second moments of the two sides must be equal, so

$$
\begin{aligned}
& \sum_{i=1}^{n} a_{i}^{2} E\left(\Delta S_{e q}\right)^{2}=\sum_{i=1}^{n} a_{i}^{2} \Rightarrow E\left(\Delta S_{e q}\right)^{2}=1 \\
& \text { Since } E\left(\Delta S_{e q}\right)=0 \Rightarrow \sigma_{\Delta S_{e q}}^{2}=1 .
\end{aligned}
$$

In addition, for the skewness of $\Delta S_{\text {eq }}$ distribution, we have

$$
E\left(\sum_{i=1}^{n} a_{i} \Delta S_{i}\right)^{3}=\sum_{i=1}^{n} E\left(a_{i} \Delta S_{i}\right)^{3}+3 \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \neq i}}^{n} E\left(\left(a_{i} \Delta S_{i}\right)^{2} \cdot a_{j} \Delta S_{j}\right)
$$

Since $\Delta S_{i}$ 's are independent, we have

$$
=\sum_{i=1}^{n} E\left(a_{i} \Delta S_{i}\right)^{3}+0=\sum_{i=1}^{n} a_{i}{ }^{3} E\left(\Delta S_{i}\right)^{3}=\sum_{i=1}^{n} a_{i}{ }^{3} \kappa_{i}
$$

and

$$
E\left(\sqrt{\sum_{i=1}^{n} a_{i}^{2}} \cdot \Delta S_{e q}\right)^{3}=\left(\sqrt{\sum_{i=1}^{n} a_{i}^{2}}\right)^{3} \cdot E\left(\Delta S_{e q}\right)^{3}
$$

From Definitions 1 and 2

$$
\begin{aligned}
& \left(\sqrt{\sum_{i=1}^{n} a_{i}^{2}}\right)^{3} E\left(\Delta S_{e q}\right)^{3}=\sum_{i=1}^{n} a_{i}^{3} \kappa_{i} \Rightarrow E\left(\Delta S_{e q}\right)^{3}=\frac{\sum_{i=1}^{n} a_{i}^{3} \kappa_{i}}{\left(\sqrt{\sum_{i=1}^{n} a_{i}^{2}}\right)^{3}} \\
& \text { Since } E\left(\Delta S_{e q}\right)=0 \text { and } \sigma_{\Delta S_{e q}}^{2}=1 \Rightarrow \kappa_{\Delta S_{e q}}=\frac{\sum_{i=1}^{n} a_{i}^{3} \kappa_{i}}{\left(\sqrt{\sum_{i=1}^{n} a_{i}^{2}}\right)^{3}}
\end{aligned}
$$
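As a quick numerical companion to Lemma 1, the following minimal sketch collapses a weighted sum of independent, zero-mean, unit-variance sources into one equivalent source. The function name `combine_independent` and its argument layout are illustrative assumptions, not part of the original work.

```python
import math


def combine_independent(coeffs, skews):
    """Collapse sum_i a_i * dS_i into scale * dS_eq, following Lemma 1.

    Assumes every dS_i is independent with mean 0 and variance 1, and that
    skews[i] is the skewness of dS_i. Returns (scale, kappa_eq), where dS_eq
    has mean 0, variance 1, and skewness kappa_eq.
    """
    if len(coeffs) != len(skews):
        raise ValueError("coeffs and skews must have the same length")
    scale = math.sqrt(sum(a * a for a in coeffs))
    if scale == 0.0:
        # Degenerate case: no independent variation at all.
        return 0.0, 0.0
    third_moment = sum(a ** 3 * k for a, k in zip(coeffs, skews))
    return scale, third_moment / scale ** 3


# Two sources with coefficients 3 and 4 and unit skewness.
scale, kappa_eq = combine_independent([3.0, 4.0], [1.0, 1.0])
print(scale, kappa_eq)
```

For coefficients 3 and 4 with unit skewness, the lemma gives a scale of 5 and an equivalent skewness of $91/125$, which is smaller than either input skewness, as expected when independent sources are summed.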

### 2.1 Canonical First-Order (CFO) Representation for Timing and Electrical Parameter Modeling

In a block-based statistical timing analysis tool, a first-order variational model is employed for all timing quantities such as the gate and wire delays, arrival times, required arrival times, slacks, and slew times, i.e., any timing quantity, $a$, is expressed in the CFO form as:

$$
\stackrel{\triangleleft}{a}=a_{0}+\sum_{i=1}^{m} a_{i} \Delta X_{i}+a_{m+1} \Delta S_{a}
$$

where $a_{0}$ is the nominal value of the timing quantity of interest; $\Delta X_{i}$ 's represent the variation of $m$ global sources of variation, $X_{i}$, from their nominal values, $a_{i}$ 's are the sensitivities to each of the global sources of variation, $\Delta S_{a}$ is the variation of independent random variable $S_{a}$, and $a_{m+1}$ is the sensitivity of the timing quantity to $S_{a}$. By scaling the sensitivity coefficients, we can assume that $\Delta X_{i}$ and $\Delta S_{a}$ have distributions with $\mu=0$ and $\sigma^{2}=1$ and skewness $=\kappa$ denoted by $\operatorname{Dist}\left(\mu=0, \sigma^{2}=1, \kappa\right)$. Moreover, we define $a_{i} / a_{0}$ as the normalized sensitivity coefficient (denoted by NSC.)

Variation in the physical dimensions of a wire changes its resistance and capacitance, which in turn causes the gate delay and slew time as well as the interconnect propagation delay and slew time to vary accordingly [9]. Therefore, we need to capture the effect of geometric variations on the electrical parameters of the interconnect. For instance, resistance and capacitance in the CFO form are calculated as follows:

$$
\stackrel{\triangleleft \triangleright}{r}=r_{0}+\sum_{i=1}^{m} r_{i} \Delta X_{i}+r_{m+1} \Delta S_{r} \quad \stackrel{\triangleleft}{c}=c_{0}+\sum_{i=1}^{m} c_{i} \Delta X_{i}+c_{m+1} \Delta S_{c}
$$

where $r_{0}$ and $c_{0}$ represent nominal resistance and capacitance values, computed when the wire dimensions are at their nominal (or typical) values. The $\Delta X_{i}$'s are the global sources of variation, and $\Delta S_{r}$ and $\Delta S_{c}$ represent the independent random sources of variation for the resistance and capacitance, respectively. $r_{i}$ and $c_{i}$ are the sensitivity coefficients of the resistance and capacitance, respectively, with respect to the sources of variation. As before, $\Delta X_{i}$, $\Delta S_{r}$, and $\Delta S_{c}$ are assumed to have distributions with $\mu=0$ and $\sigma^{2}=1$.

Observation (Invariant Functional Form Property): This property states that $y=f(x) \Leftrightarrow \stackrel{\wp}{Y}=f(\stackrel{\wp}{X})$, which simply underlines the fact that the form of the function $f$ operating on some input variable $x$ to produce output variable $y$ is independent of its input/output type (i.e., whether $x$ and $y$ are deterministic or variational).

### 2.2 Converting a Variational Function into a CFO Form

As mentioned before, it is important to represent timing and electrical quantities in the CFO form. This in turn enables one to propagate first order sensitivities to different sources of variation through a timing graph [2][9]. Additionally, it makes variational computations efficient and practical and provides timing diagnostics at a very small cost in terms of the cpu time. The remaining question is how to convert a quantity of interest (which itself is a function of different CFO variables) into the CFO form.

The following subsection presents a technique to answer the above question. We use an important example to illustrate the various steps of the proposed procedure. The problem we address is how to convert the gate output transition time into the CFO form. However, this method can be easily applied to any other quantity of interest.

### 2.2.1 Example: Gate timing analysis for lumped capacitive load in block-based $\sigma$ TA

Problem Statement I: Given is a variational CMOS driver where its input rise time, $t_{i n}$, is in the CFO form and drives an output capacitive load, also, in the CFO form. Note that the distribution characteristics of all global and independent sources of variation ( $\mu=0, \sigma^{2}=1, \kappa$ ) are given. The objective is to calculate the output transition time, $t_{r}$, in the CFO form:

$$
t_{r}=t_{r, 0}+\sum_{i=1}^{m} t_{r, i} \Delta X_{i}+t_{r, m+1} \Delta S_{t_{r}}
$$

i.e., calculate the nominal value $\left(t_{r, 0}\right)$ and the sensitivity coefficients ($t_{r, i}$ and $t_{r, m+1}$) as well as the skewness of the distribution of $\Delta S_{t_{r}}$.

The gate output transition time is a function of the input transition time, the logic gate characteristics (e.g., the $W / L$ ratio, threshold voltage of transistors, $V_{d d}$, and temperature), and the output load. In commercial ASIC cell libraries, it is possible to characterize various output transition times (e.g., $10 \%$, $50 \%$, and $90 \%$) as a function of the above variables, i.e.,

$$
\begin{equation*}
t_{r}=T F\left(t_{i n}, c_{l}, z\right) \text { where } z=\left\{\frac{W}{L}, V_{T}, V_{d d}, \text { Temp }, \ldots\right\} \tag{1}
\end{equation*}
$$

where $t_{r}$ is the output transition time and $T F$ is the corresponding output transition time function. $z$ captures the gate characteristics and environmental factors, $t_{i n}$ is the input transition time, and $c_{l}$ is the output capacitive load. Based on the Invariant Functional Form Property, the form of function $T F$ is independent of its input type (deterministic or variational). Hence, we extend the above equation to the variational case. In block-based $\sigma$TA, $t_{i n}$, $c_{l}$, and every parameter in $z$ are given in the CFO form as functions of $m$ global sources of variation and exactly one independent random source of variation each. Therefore, $t_{r}$ itself is a non-CFO random variable. To represent the non-CFO $t_{r}$ in the CFO form, we replace $t_{i n}$, $c_{l}$, and $z$ with their corresponding CFO models and collect terms. Then, by differentiating with respect to the global and independent random sources of variation, $t_{r}$ can be approximated to first order as a function of $m$ global sources of variation and $p$ independent random sources of variation:

$$
\begin{align*}
& \stackrel{\wp}{t_{r}}=T F\left(\Delta X_{1} \ldots \Delta X_{m}, \Delta S_{1} \ldots \Delta S_{p}\right) \Rightarrow \\
& \stackrel{\wp}{t_{r}} \cong\left.T F\right|_{\substack{\Delta X_{l}=0 \\ \Delta S_{k}=0}}+\left.\sum_{i=1}^{m} \frac{\partial T F}{\partial \Delta X_{i}}\right|_{\substack{\Delta X_{l}=0 \\ \Delta S_{k}=0}} \cdot \Delta X_{i}+\left.\sum_{j=1}^{p} \frac{\partial T F}{\partial \Delta S_{j}}\right|_{\substack{\Delta X_{l}=0 \\ \Delta S_{k}=0}} \cdot \Delta S_{j} \quad \text { where } \quad\left\{\begin{array}{l}
l=1 \ldots m \\
k=1 \ldots p
\end{array}\right. \tag{2}
\end{align*}
$$

Considering that $\Delta S_{j}$ 's are $\operatorname{Dist}_{j}\left(\mu=0, \sigma^{2}=1, \kappa_{j}\right)$, Eqn.(2) can be re-written as:

$$
\stackrel{\triangleleft \triangleright}{t_{r}}=\left.T F\right|_{\substack{\Delta X_{l}=0 \\ \Delta S_{k}=0}}+\left.\sum_{i=1}^{m} \frac{\partial T F}{\partial \Delta X_{i}}\right|_{\substack{\Delta X_{l}=0 \\ \Delta S_{k}=0}} \cdot \Delta X_{i}+\sqrt{\sum_{j=1}^{p}\left(\left.\frac{\partial T F}{\partial \Delta S_{j}}\right|_{\substack{\Delta X_{l}=0 \\ \Delta S_{k}=0}}\right)^{2}} \cdot \Delta S_{t_{r}}
$$

From Lemma 1,

$$
\Delta S_{t_{r}} \sim \operatorname{Dist}\left(\mu=0, \sigma^{2}=1, \kappa=\frac{\sum_{j=1}^{p}\left(\left.\frac{\partial T F}{\partial \Delta S_{j}}\right|_{\substack{\Delta X_{l}=0 \\ \Delta S_{k}=0}}\right)^{3} \cdot \kappa_{j}}{\left(\sqrt{\sum_{j=1}^{p}\left(\left.\frac{\partial T F}{\partial \Delta S_{j}}\right|_{\substack{\Delta X_{l}=0 \\ \Delta S_{k}=0}}\right)^{2}}\right)^{3}}\right)
$$
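The conversion above can be sketched numerically. The example below is only an illustration under stated assumptions: the partial derivatives are estimated with central finite differences rather than analytically, and `to_cfo`, `example_tf`, and all coefficient values are hypothetical names and numbers chosen for the demonstration.

```python
import math


def to_cfo(func, n_global, skews, h=1e-4):
    """First-order (CFO) approximation of func(dX, dS) around dX = dS = 0.

    func takes two lists: n_global global deviations and len(skews)
    independent deviations. Returns (nominal, global_sens, indep_sens, kappa),
    where the independent terms are merged into one source via Lemma 1.
    """
    p = len(skews)
    nominal = func([0.0] * n_global, [0.0] * p)

    def central_diff(index, is_global):
        plus_x, minus_x = [0.0] * n_global, [0.0] * n_global
        plus_s, minus_s = [0.0] * p, [0.0] * p
        if is_global:
            plus_x[index], minus_x[index] = h, -h
        else:
            plus_s[index], minus_s[index] = h, -h
        return (func(plus_x, plus_s) - func(minus_x, minus_s)) / (2.0 * h)

    global_sens = [central_diff(i, True) for i in range(n_global)]
    local = [central_diff(j, False) for j in range(p)]
    indep_sens = math.sqrt(sum(d * d for d in local))
    kappa = 0.0
    if indep_sens > 0.0:
        kappa = sum(d ** 3 * k for d, k in zip(local, skews)) / indep_sens ** 3
    return nominal, global_sens, indep_sens, kappa


def example_tf(dx, ds):
    # Hypothetical transition-time model: t_in and c_l are CFO inputs that
    # share one global source and are combined nonlinearly.
    t_in = 100.0 + 5.0 * dx[0] + 3.0 * ds[0]
    c_l = 20.0 + 2.0 * dx[0] + 1.0 * ds[1]
    return 0.3 * t_in + 2.0 * c_l + 0.01 * t_in * c_l


nominal, global_sens, indep_sens, kappa = to_cfo(example_tf, 1, [0.5, -0.2])
print(nominal, global_sens, indep_sens, kappa)
```

A real flow would more likely differentiate the library model analytically or read sensitivities from a variation-aware library; finite differences are used here only to keep the sketch self-contained.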

In Lemma 2, we present the key results, which enable us to add, subtract, multiply, and divide two CFO forms and put the result in a new CFO form. (Notice that this lemma allows us to evaluate the above equation.)

Lemma 2: Suppose $a$ and $b$ are two given CFO random variables as follows:

$$
\stackrel{\triangleleft \triangleright}{a}=a_{0}+\sum_{i=1}^{m} a_{i} \Delta X_{i}+a_{m+1} \Delta S_{a} \quad \stackrel{\triangleleft}{b}=b_{0}+\sum_{i=1}^{m} b_{i} \Delta X_{i}+b_{m+1} \Delta S_{b}
$$

The following describes the result of various operations performed on $a$ and $b$.
a) Addition and subtraction:

$$
\stackrel{\triangleleft}{c}=\stackrel{\triangleleft}{a} \pm \stackrel{\triangleleft}{b}=\left(a_{0} \pm b_{0}\right)+\sum_{i=1}^{m}\left(a_{i} \pm b_{i}\right) \Delta X_{i}+\sqrt{a_{m+1}^{2}+b_{m+1}^{2}} \Delta S_{c}
$$

b) Multiplication:

$$
\stackrel{\triangleleft}{c} \cong \stackrel{\triangleleft}{a} \times \stackrel{\Delta}{b}=a_{0} b_{0}+\sum_{i=1}^{m}\left(a_{0} b_{i}+a_{i} b_{0}\right) \Delta X_{i}+\sqrt{\left(a_{0} b_{m+1}\right)^{2}+\left(a_{m+1} b_{0}\right)^{2}} \Delta S_{c}
$$

c) Division:

$$
\stackrel{\triangleleft \triangleright}{c} \cong \frac{\stackrel{\triangleleft \triangleright}{a}}{\stackrel{\triangleleft \triangleright}{b}}=\frac{a_{0}}{b_{0}}+\sum_{i=1}^{m} \frac{a_{i} b_{0}-a_{0} b_{i}}{b_{0}{ }^{2}} \Delta X_{i}+\sqrt{\left(\frac{a_{m+1}}{b_{0}}\right)^{2}+\left(\frac{a_{0} b_{m+1}}{b_{0}{ }^{2}}\right)^{2}} \Delta S_{c}
$$

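Proof: For part (a), combining the two CFO forms term by term and merging the two independent terms with Lemma 1, we have

$$
\begin{aligned}
\stackrel{\triangleleft \triangleright}{c} & =\stackrel{\triangleleft \triangleright}{a} \pm \stackrel{\triangleleft \triangleright}{b}=a_{0} \pm b_{0}+\sum_{i=1}^{m} a_{i} \Delta X_{i} \pm \sum_{i=1}^{m} b_{i} \Delta X_{i}+a_{m+1} \Delta S_{a} \pm b_{m+1} \Delta S_{b} \\
& \stackrel{d_{3}}{=} a_{0} \pm b_{0}+\sum_{i=1}^{m} a_{i} \Delta X_{i} \pm \sum_{i=1}^{m} b_{i} \Delta X_{i}+\sqrt{a_{m+1}^{2}+b_{m+1}^{2}} \Delta S_{c}
\end{aligned}
$$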

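which proves part (a). For part (b), keeping only the first-order terms of the product, we have

$$
\begin{aligned}
\stackrel{\triangleleft \triangleright}{c} & \cong \stackrel{\triangleleft \triangleright}{a} \times \stackrel{\triangleleft \triangleright}{b} \cong a_{0} \times b_{0}+\sum_{i=1}^{m} a_{i} b_{0} \Delta X_{i}+\sum_{i=1}^{m} a_{0} b_{i} \Delta X_{i}+a_{m+1} b_{0} \Delta S_{a}+a_{0} b_{m+1} \Delta S_{b} \\
& \stackrel{d_{3}}{=} a_{0} \times b_{0}+\sum_{i=1}^{m} a_{i} b_{0} \Delta X_{i}+\sum_{i=1}^{m} a_{0} b_{i} \Delta X_{i}+\sqrt{a_{m+1}{ }^{2} b_{0}{ }^{2}+a_{0}{ }^{2} b_{m+1}{ }^{2}} \Delta S_{c}
\end{aligned}
$$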

Part (c) can be proved similarly by applying the first-order expansion of Eqn. (2) to the quotient, with $\Delta S_{1}=\Delta S_{a}$ and $\Delta S_{2}=\Delta S_{b}$. Therefore, we can write

$$
\begin{aligned}
\stackrel{\triangleleft \triangleright}{c} & \cong \frac{\stackrel{\triangleleft \triangleright}{a}}{\stackrel{\triangleleft \triangleright}{b}} \\
& \cong\left.\frac{\stackrel{\triangleleft \triangleright}{a}}{\stackrel{\triangleleft \triangleright}{b}}\right|_{\substack{\Delta X_{l}=0 \\ \Delta S_{k}=0}}+\left.\sum_{i=1}^{m} \frac{\partial\left(\stackrel{\triangleleft \triangleright}{a} / \stackrel{\triangleleft \triangleright}{b}\right)}{\partial \Delta X_{i}}\right|_{\substack{\Delta X_{l}=0 \\ \Delta S_{k}=0}} \cdot \Delta X_{i}+\sqrt{\sum_{j=1}^{2}\left(\left.\frac{\partial\left(\stackrel{\triangleleft \triangleright}{a} / \stackrel{\triangleleft \triangleright}{b}\right)}{\partial \Delta S_{j}}\right|_{\substack{\Delta X_{l}=0 \\ \Delta S_{k}=0}}\right)^{2}} \cdot \Delta S_{c} \\
& =\frac{a_{0}}{b_{0}}+\sum_{i=1}^{m} \frac{a_{i} b_{0}-a_{0} b_{i}}{b_{0}{ }^{2}} \cdot \Delta X_{i}+\sqrt{\left(\frac{a_{m+1}}{b_{0}}\right)^{2}+\left(\frac{a_{0} b_{m+1}}{b_{0}{ }^{2}}\right)^{2}} \cdot \Delta S_{c}
\end{aligned}
$$
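The three operations of Lemma 2 map naturally onto a small value type. The sketch below is one possible encoding, not the authors' implementation: the `CFO` class, its field names, and the helper `_merge` are assumptions made for illustration. It also assumes that $\Delta S_{a}$ and $\Delta S_{b}$ are independent of each other, and it tracks the skewness of $\Delta S_{c}$ by applying Lemma 1 to the two merged terms.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CFO:
    nominal: float            # a_0
    sens: Tuple[float, ...]   # a_1 .. a_m, sensitivities to global sources
    indep: float              # a_{m+1}, sensitivity to the independent source
    kappa: float = 0.0        # skewness of the independent source


def _merge(u, ku, v, kv):
    """Lemma 1 for u*dS_a + v*dS_b with dS_a and dS_b independent."""
    scale = math.hypot(u, v)
    kappa = 0.0 if scale == 0.0 else (u ** 3 * ku + v ** 3 * kv) / scale ** 3
    return scale, kappa


def add(a, b, sign=1.0):
    """Part (a): a + b for sign=+1, a - b for sign=-1."""
    indep, kappa = _merge(a.indep, a.kappa, sign * b.indep, b.kappa)
    sens = tuple(x + sign * y for x, y in zip(a.sens, b.sens))
    return CFO(a.nominal + sign * b.nominal, sens, indep, kappa)


def mul(a, b):
    """Part (b): first-order product."""
    indep, kappa = _merge(a.indep * b.nominal, a.kappa,
                          a.nominal * b.indep, b.kappa)
    sens = tuple(a.nominal * y + x * b.nominal for x, y in zip(a.sens, b.sens))
    return CFO(a.nominal * b.nominal, sens, indep, kappa)


def div(a, b):
    """Part (c): first-order quotient."""
    b0_sq = b.nominal ** 2
    indep, kappa = _merge(a.indep / b.nominal, a.kappa,
                          -a.nominal * b.indep / b0_sq, b.kappa)
    sens = tuple((x * b.nominal - a.nominal * y) / b0_sq
                 for x, y in zip(a.sens, b.sens))
    return CFO(a.nominal / b.nominal, sens, indep, kappa)


a = CFO(10.0, (1.0, 2.0), 3.0, 0.5)
b = CFO(5.0, (0.5, -1.0), 4.0, 0.0)
print(add(a, b), mul(a, b), div(a, b))
```

Keeping the type immutable means every operation returns a fresh CFO value, which suits a recursive calculation where intermediate results are reused.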

## 3. $\mathrm{RC}-\pi$ Load Calculation in the CFO Form

In VDSM technologies, one cannot neglect the effect of interconnect resistance of the load on the gate delay and output transition time. In STA, an adequate approximation of an $n^{\text {th }}$ order load seen by the gate (i.e., a load with $n$ distributed capacitances to ground) is obtained by replacing the load by a second-order $R C-\pi$ model [10]. Equating the first, second, and third moments of the admittance of the real load with the first, second, and third moments of the $R C-\pi$ load, we can compute $c_{n}$, $r_{\pi}$, and $c_{f}$ as follows [11]:

$$
\begin{equation*}
c_{n}=Y_{1, i n}-\frac{Y_{2, i n}{ }^{2}}{Y_{3, i n}} \quad r_{\pi}=-\frac{Y_{3, i n}{ }^{2}}{Y_{2, i n}{ }^{3}} \quad c_{f}=\frac{Y_{2, i n}{ }^{2}}{Y_{3, i n}} \tag{3}
\end{equation*}
$$

where $Y_{k, i n}$ denotes the $k^{\text {th }}$ moment of the admittance of the real load. In $\sigma$ TA, it is necessary to consider the effect of variability of the load on the gate timing analysis.
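As a sanity check on Eqn. (3), the deterministic mapping from admittance moments to the $R C-\pi$ parameters can be sketched in a few lines. The function name `pi_model` is an illustrative assumption. The example uses a single series resistor $r$ feeding a capacitor $c$, whose admittance $s c /(1+s r c)$ has moments $c$, $-r c^{2}$, and $r^{2} c^{3}$.

```python
def pi_model(y1, y2, y3):
    """Return (c_n, r_pi, c_f) from the first three admittance moments, Eqn. (3)."""
    c_f = y2 ** 2 / y3
    r_pi = -(y3 ** 2) / y2 ** 3
    c_n = y1 - c_f
    return c_n, r_pi, c_f


# Single series-R, shunt-C load with r = 2 and c = 3 (arbitrary units).
r, c = 2.0, 3.0
c_n, r_pi, c_f = pi_model(c, -r * c ** 2, r ** 2 * c ** 3)
print(c_n, r_pi, c_f)
```

For this load the reduction recovers the original elements: all capacitance sits at the far end behind the resistor, so $c_{n}=0$, $r_{\pi}=r$, and $c_{f}=c$.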

Problem Statement II: Given is an $R C$ network representation of the load of a logic gate in a design as exemplified in Figure 1(a), where each $r$ and $c$ is in the CFO form. Note that the distribution characteristics of all global and independent sources of variation $\left(\mu=0, \sigma^{2}=1, \kappa\right)$ are given. The
objective is to calculate an equivalent variational $R C-\pi$ load (where $c_{n}, r_{\pi}$, and $c_{f}$ are in the CFO form), while its admittance matches the admittance of the real load in the frequency range of interest.


Figure 1: (a) A variational $R C$ network representation of a net in a design. (b) The equivalent variational $R C-\pi$ model.

$c_{n}, r_{\pi}$, and $c_{f}$ are functions of the admittance moments, as seen from Eqn. (3). Hence, by calculating the variational admittance moments, we can calculate the CFO parameters of the $R C-\pi$ load by differentiating the expressions in Eqn. (3) with respect to the sources of variation (cf. section 2.2). However, as will be shown next, a recursive operation is used to calculate the variational admittance moments. Each recursion step returns a non-CFO random variable, which feeds into the next recursion step, and this may increase the complexity of the calculations.

Therefore, instead, we represent the admittance moments in the CFO form throughout the recursion. This helps us by controlling the complexity of representing the moments as the recursive function proceeds. The following shows how to calculate the input admittance moments of a real load in the CFO form.

Consider the $R C Y$ segment shown in Figure 2. Assume that the admittances at nodes $j$ and $i$ are represented by infinite series using the admittance moments:

$$
\begin{aligned}
& Y_{j}(s)=s Y_{1, j}+s^{2} Y_{2, j}+\ldots+s^{k} Y_{k, j}+\ldots \\
& Y_{i}(s)=s Y_{1, i}+s^{2} Y_{2, i}+\ldots+s^{k} Y_{k, i}+\ldots
\end{aligned}
$$

where $Y_{k, j}$ denotes the $k^{\text {th }}$ moment of the admittance of the node $j$. In STA, the admittance at node $i$ is recursively computed in terms of the admittance at node $j$ as follows [11]:

$$
\begin{align*}
& Y_{1, i}=Y_{1, j}+c_{i} \\
& Y_{k, i}=Y_{k, j}-r_{i} \sum_{l=1}^{k-1} Y_{l, i} Y_{k-l, j}-r_{i} c_{i} Y_{k-1, i} \quad \text { for } k \geq 2 \tag{4}
\end{align*}
$$
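The deterministic recursion of Eqn. (4) can be sketched as follows. The function name `upstream_moments` and the list-based layout (index $k-1$ holds the $k^{\text{th}}$ moment) are assumptions made for illustration; the variational version would replace the floating-point arithmetic with CFO operations.

```python
def upstream_moments(y_down, r, c, order=3):
    """Apply Eqn. (4) once: moments at node i from the moments at node j.

    y_down[k-1] holds Y_{k,j}; the returned list holds Y_{k,i} at index k-1.
    r and c are the resistance and capacitance of the segment at node i.
    """
    y_up = [y_down[0] + c]
    for k in range(2, order + 1):
        # Convolution term: sum over l = 1 .. k-1 of Y_{l,i} * Y_{k-l,j}.
        conv = sum(y_up[l - 1] * y_down[k - l - 1] for l in range(1, k))
        y_up.append(y_down[k - 1] - r * conv - r * c * y_up[k - 2])
    return y_up


# Far-end segment (no downstream admittance), then one segment upstream of it.
leaf = upstream_moments([0.0, 0.0, 0.0], r=2.0, c=3.0)
root = upstream_moments(leaf, r=1.0, c=1.0)
print(leaf, root)
```

For a far-end node the downstream moments are all zero, so the recursion reduces to $Y_{1}=c$, $Y_{2}=-r c^{2}$, and $Y_{3}=r^{2} c^{3}$, which matches the series expansion of $s c /(1+s r c)$.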



Figure 2: An $R C Y$ segment model for recursive admittance moment calculation.

Using the Invariant Functional Form Property, we extend the above equation to the variational case. Assume the admittance moments of node $j$ are written in the CFO form. Then, by differentiating $Y_{k, i}$ with respect to the sources of variation, the $Y_{k, i}$ moments can also be represented in the CFO form (cf. section 2.2).

As an example, consider the circuit shown in Figure 1. To calculate the admittance moments of $Y_{i n}=Y_{1}$ in the CFO form, we need to start from the far end nodes of the $R C$ tree ( $Y_{2}$ and $Y_{4}$ ) and recursively apply Eqn. (4). Therefore, we calculate the first three moments of $Y_{4}$ in the CFO form as follows:

1) $\stackrel{\triangleleft \triangleright}{Y_{1,4}}=\stackrel{\triangleleft \triangleright}{c_{4}}$;
2) $\stackrel{\wp}{Y_{2,4}}=-\stackrel{\triangleleft \triangleright}{r_{4}} \stackrel{\triangleleft \triangleright}{c_{4}} \stackrel{\triangleleft \triangleright}{Y_{1,4}}=-\stackrel{\triangleleft \triangleright}{r_{4}} \stackrel{\triangleleft \triangleright}{c_{4}}{ }^{2}$;
3) Calculate $\stackrel{\triangleleft \triangleright}{Y_{2,4}}$;
4) $\stackrel{\wp}{Y_{3,4}}=-\stackrel{\triangleleft \triangleright}{r_{4}} \stackrel{\triangleleft \triangleright}{c_{4}} \stackrel{\triangleleft \triangleright}{Y_{2,4}}$;
5) Calculate $\stackrel{\triangleleft \triangleright}{Y_{3,4}}$;

Based on the problem statement assumption, $c_{4}$ is in the CFO form, thereby, $Y_{1,4}$ is also in the CFO form. However, since $Y_{2,4}$ and $Y_{3,4}$ are nonlinear functions of the CFO variables and as a result they are complex random variables, we ought to use the techniques described in section 2.2 to transform $Y_{2,4}$ and $Y_{3,4}$ to the CFO form. Similarly, the first three admittance moments of $Y_{3}$ as a function of the moments of $Y_{4}$ are obtained as:


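1) $\stackrel{\triangleleft \triangleright}{Y_{1,3}}=\stackrel{\triangleleft \triangleright}{Y_{1,4}}+\stackrel{\triangleleft \triangleright}{c_{3}}$;
2) $\stackrel{\wp}{Y_{2,3}}=\stackrel{\triangleleft \triangleright}{Y_{2,4}}-\stackrel{\triangleleft \triangleright}{r_{3}} \stackrel{\triangleleft \triangleright}{Y_{1,3}} \stackrel{\triangleleft \triangleright}{Y_{1,4}}-\stackrel{\triangleleft \triangleright}{r_{3}} \stackrel{\triangleleft \triangleright}{c_{3}} \stackrel{\triangleleft \triangleright}{Y_{1,3}}$;
3) Calculate $\stackrel{\triangleleft \triangleright}{Y_{2,3}}$;
4) $\stackrel{\wp}{Y_{3,3}}=\stackrel{\triangleleft \triangleright}{Y_{3,4}}-\stackrel{\triangleleft \triangleright}{r_{3}}\left(\stackrel{\triangleleft \triangleright}{Y_{1,3}} \stackrel{\triangleleft \triangleright}{Y_{2,4}}+\stackrel{\triangleleft \triangleright}{Y_{2,3}} \stackrel{\triangleleft \triangleright}{Y_{1,4}}\right)-\stackrel{\triangleleft \triangleright}{r_{3}} \stackrel{\triangleleft \triangleright}{c_{3}} \stackrel{\triangleleft \triangleright}{Y_{2,3}}$;
5) Calculate $\stackrel{\triangleleft \triangleright}{Y_{3,3}}$;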

By using the above recursive operations, we easily compute the moments of $Y_{\text {in }}=Y_{1}$ in the CFO form, and hence, calculate the values of $c_{n}, r_{\pi}$, and $c_{f}$ in the CFO form using Eqn. (3).

## 4. Gate Timing Analysis for the RC- $\pi$ Load in Block-Based $\sigma$ TA

Problem statement III: Given is a variational CMOS driver, whose input rise time, $t_{i n}$, is in the CFO form and drives a variational $R C-\pi$ load. The resistance and two capacitances of this load are also in the CFO forms. Note that the distribution characteristics of all global and independent sources
of variation $\left(\mu=0, \sigma^{2}=1, \kappa\right)$ are given. The objective is to calculate the output transition time, $t_{r}$, in the CFO form:

$$
\stackrel{\triangleleft \triangleright}{t_{r}}=t_{r, 0}+\sum_{i=1}^{m} t_{r, i} \Delta X_{i}+t_{r, m+1} \Delta S_{t_{r}}
$$

i.e., calculate the nominal value $\left(t_{r, 0}\right)$ and the sensitivity coefficients ($t_{r, i}$ and $t_{r, m+1}$) as well as the skewness of the distribution of $\Delta S_{t_{r}}$.

Section 2.2.1 solves the same problem for the case where the gate drives a variational purely capacitive load in the CFO form (cf. Eqn. (1)). Therefore, if we substitute the $R C-\pi$ load with its equivalent variational effective capacitance, $c_{e f f}$, in the CFO form, then the solution to problem statement I is an acceptable solution to problem statement III. Based on this reasoning, the following subsections propose a solution for calculating the effective capacitance in the CFO form. Section 4.1 presents a new effective capacitance calculation method in static timing analysis. This method is used in section 4.2, where a technique for statistical effective capacitance calculation is presented. Section 4.3 combines a heuristic with the technique of section 4.1 to obtain the second, faster variational gate timing analysis technique.

### 4.1 A New Effective Capacitance Calculation Method in STA

The effective capacitance is a pure capacitance that replaces an $R C-\pi$ load and has the property that it gives the most accurate result from a timing model that is characterized with lumped capacitance. Typically, the effective capacitance stores the same amount of charge as the $R C-\pi$ load until a certain point of the output voltage transition [11][12][13] (e.g., the $50 \%$ trip point of the output transition.) Figure 3(a) depicts a typical CMOS driver with its input waveform and $R C-\pi$ load. The output voltage waveform may be modeled as a weighted linear sum of ramp and exponential waveforms as shown in Figure 3(b). We therefore assume that the actual $c_{\text {eff }}$ can be obtained as a weighted average of that obtained for the ramp output waveform and that obtained for the exponential output waveform.

In the following, we calculate $c_{\text {eff }}$ for ramp and exponential waveforms of the gate output voltage. Suppose that output voltage of a gate is approximated with an exponential waveform:

$$
V_{N}(t)=V_{d d}\left(1-e^{-p t}\right) \quad \text { where } \quad p=\frac{\ln \left(\frac{1-\alpha}{1-\beta}\right)}{t_{r}}
$$

where $V_{N}(t)$ is the gate output voltage waveform in time domain and $t_{r}$ is the output rise time from $\alpha \%$ trip point to $\beta \%$ trip point of this waveform.
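The expression for $p$ follows directly from the definition of $t_{r}$. As a short worked step (the crossing times $t_{\alpha}$ and $t_{\beta}$ are names introduced here for illustration, with $\alpha$ and $\beta$ taken as fractions of $V_{d d}$):

$$
\begin{aligned}
V_{N}\left(t_{\alpha}\right)=\alpha V_{d d} & \Rightarrow t_{\alpha}=-\frac{\ln (1-\alpha)}{p}, \qquad V_{N}\left(t_{\beta}\right)=\beta V_{d d} \Rightarrow t_{\beta}=-\frac{\ln (1-\beta)}{p} \\
t_{r}=t_{\beta}-t_{\alpha} & =\frac{1}{p} \ln \left(\frac{1-\alpha}{1-\beta}\right)
\end{aligned}
$$

For a $10 \%$-$90 \%$ transition, for instance, this gives $p=\ln (9) / t_{r} \approx 2.197 / t_{r}$.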

Figure 3: (a) A gate, which drives an $R C-\pi$ calculated load. (b) Gate output waveform is neither ramp nor exponential.

$t_{r}$ is a function of the input transition time $\left(t_{i n}\right)$ and the output load. Thus, the iterative effective capacitance equation for matching any $\theta \%$ trip point of the gate output transition time may be written as:

$$
\begin{aligned}
& c_{eff}^{\text {Exp }}(\theta)=G\left(t_{r}, c_{n}, r_{\pi}, c_{f}\right)=c_{n}+k_{\text {Exp }}(\theta) \cdot c_{f} \quad \text { where } \\
& k_{\text {Exp }}(\theta)=\left[1+\frac{y}{\theta}\left(\mathrm{e}^{\ln (1-\theta) / y}-1\right)\right] \text { and } \\
& y=\ln \left(\frac{1-\alpha}{1-\beta}\right) \cdot \frac{r_{\pi} c_{f}}{t_{r}\left(t_{i n}, c_{eff}^{\text {Exp }}(\theta)\right)}
\end{aligned}
$$

Similarly, for the ramp output voltage waveform, we have:

$$
\begin{aligned}
& c_{eff}^{\text {Ramp }}(\theta)=H\left(t_{r}, c_{n}, r_{\pi}, c_{f}\right)=c_{n}+k_{\text {Ramp }}(\theta) \cdot c_{f} \quad \text { where } \\
& k_{\text {Ramp }}(\theta)=\left[1-\frac{x}{\theta} \cdot\left(1-\mathrm{e}^{-\theta / x}\right)\right] \text { and } \\
& x=(\beta-\alpha) \cdot \frac{r_{\pi} c_{f}}{t_{r}\left(t_{i n}, c_{eff}^{\text {Ramp }}(\theta)\right)}
\end{aligned}
$$
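One way to see where $k_{\text {Ramp }}$ comes from is to apply the charge-matching definition quoted above to an ideal ramp at the gate output. The following is a sketch under that assumption; the symbols $s$ (ramp slope), $\tau$, $T$ (time at which the output reaches $\theta V_{d d}$), and $V_{f}$ (voltage on the far-end capacitor $c_{f}$) are introduced here for illustration only.

$$
\begin{aligned}
V_{N}(t) & =s t, \quad s=\frac{(\beta-\alpha) V_{d d}}{t_{r}}, \quad \tau=r_{\pi} c_{f}, \quad T=\frac{\theta V_{d d}}{s}=\frac{\theta t_{r}}{\beta-\alpha} \\
V_{f}(t) & =s\left[t-\tau\left(1-\mathrm{e}^{-t / \tau}\right)\right] \quad \text { (solution of } \tau \dot{V}_{f}+V_{f}=V_{N} \text { with } V_{f}(0)=0 \text { ) } \\
c_{eff} V_{N}(T) & =c_{n} V_{N}(T)+c_{f} V_{f}(T) \quad \Rightarrow \quad c_{eff}=c_{n}+\frac{V_{f}(T)}{\theta V_{d d}} \cdot c_{f} \\
\frac{V_{f}(T)}{\theta V_{d d}} & =1-\frac{s \tau}{\theta V_{d d}}\left(1-\mathrm{e}^{-T / \tau}\right)=1-\frac{x}{\theta}\left(1-\mathrm{e}^{-\theta / x}\right), \quad x=\frac{s \tau}{V_{d d}}=(\beta-\alpha) \cdot \frac{r_{\pi} c_{f}}{t_{r}}
\end{aligned}
$$

The last line uses $T / \tau=\theta / x$, and the resulting factor multiplying $c_{f}$ is the $k_{\text {Ramp }}(\theta)$ given above.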

Now, based on the assumption made above, an iterative equation for actual $c_{\text {eff }}$ calculation for any $\theta \%$ trip point of the output transition may be written as:

$$
\left.\begin{array}{l}
c_{eff}^{\text {Exp }}(\theta)=G\left(t_{r}\left(t_{i n}, c_{eff}^{\text {Exp }}(\theta)\right), c_{n}, r_{\pi}, c_{f}\right) \\
c_{eff}^{\text {Ramp }}(\theta)=H\left(t_{r}\left(t_{i n}, c_{eff}^{\text {Ramp }}(\theta)\right), c_{n}, r_{\pi}, c_{f}\right)
\end{array}\right\} \Rightarrow c_{eff}(\theta)=F\left(t_{r}\left(t_{i n}, c_{eff}(\theta)\right), c_{n}, r_{\pi}, c_{f}\right)=\zeta \cdot G+(1-\zeta) \cdot H \tag{5}
$$

where $0 \leq \zeta \leq 1$ is the weighting factor for the linear combination of exponential and ramp waveforms. In practice, we have observed that when $\theta \%=50 \%$, then $\zeta=0.5$ results in the minimum error between the iterative $c_{\text {eff }}$ equation in Eqn. (5) and the actual sign-off $c_{e f f}$ value.
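The iterative nature of Eqn. (5) can be illustrated with a simple fixed-point loop. The sketch below is not the authors' implementation: `toy_t_r` is a hypothetical linear stand-in for the gate-library lookup $t_{r}\left(t_{i n}, c_{eff}\right)$, the numeric values are invented, and the expressions for $k_{\text {Exp }}$ and $k_{\text {Ramp }}$ are transcribed as they appear above, with $\alpha$, $\beta$, and $\theta$ expressed as fractions.

```python
import math

def k_exp(theta, y):
    # k_Exp(theta), transcribed from the expression given above
    return 1.0 + (y / theta) * (math.exp(math.log(1.0 - theta) / y) - 1.0)

def k_ramp(theta, x):
    # k_Ramp(theta), transcribed from the expression given above
    return 1.0 - (x / theta) * (1.0 - math.exp(-theta / x))

def toy_t_r(t_in, c_load):
    # HYPOTHETICAL stand-in for the gate-library lookup t_r(t_in, c_load),
    # in seconds and farads; a real flow would interpolate library tables.
    return 50e-12 + 0.3 * t_in + 150.0 * c_load

def F(c_eff, t_in, c_n, r_pi, c_f, t_r=toy_t_r,
      theta=0.5, alpha=0.1, beta=0.9, zeta=0.5):
    # Right-hand side of Eqn. (5): zeta * G + (1 - zeta) * H
    tr = t_r(t_in, c_eff)
    y = math.log((1.0 - alpha) / (1.0 - beta)) * r_pi * c_f / tr
    x = (beta - alpha) * r_pi * c_f / tr
    g = c_n + k_exp(theta, y) * c_f
    h = c_n + k_ramp(theta, x) * c_f
    return zeta * g + (1.0 - zeta) * h

def solve_c_eff(t_in, c_n, r_pi, c_f, rel_tol=1e-9, max_iter=200, **kw):
    c = c_n + c_f                      # initial guess: total capacitance
    for _ in range(max_iter):
        c_new = F(c, t_in, c_n, r_pi, c_f, **kw)
        if abs(c_new - c) <= rel_tol * abs(c):
            return c_new
        c = c_new
    raise RuntimeError("c_eff iteration did not converge")

# Illustrative (made-up) values: 100 ps input, 50 fF / 100 ohm / 100 fF pi-load
c_eff = solve_c_eff(t_in=100e-12, c_n=50e-15, r_pi=100.0, c_f=100e-15)
```

Starting from the total capacitance $c_{n}+c_{f}$ mirrors the usual choice of initial guess; whether the loop converges depends on the actual library function $t_{r}$, which the toy model does not represent.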

### 4.2 Variational Gate Timing Analysis (VGTA)

Suppose $t_{i n}, c_{n}, r_{\pi}$, and $c_{f}$ are given in the CFO form as:

$$
\begin{align*}
& \stackrel{\triangleleft \triangleright}{t_{i n}}=t_{i n, 0}+\sum_{i=1}^{m} t_{i n, i} \Delta X_{i}+t_{i n, m+1} \Delta S_{t_{i n}}  \tag{6}\\
& \stackrel{\triangleleft \triangleright}{c_{n}}=c_{n, 0}+\sum_{i=1}^{m} c_{n, i} \Delta X_{i}+c_{n, m+1} \Delta S_{c_{n}}  \tag{7}\\
& \stackrel{\triangleleft \triangleright}{r_{\pi}}=r_{\pi, 0}+\sum_{i=1}^{m} r_{\pi, i} \Delta X_{i}+r_{\pi, m+1} \Delta S_{r_{\pi}}  \tag{8}\\
& \stackrel{\triangleleft \triangleright}{c_{f}}=c_{f, 0}+\sum_{i=1}^{m} c_{f, i} \Delta X_{i}+c_{f, m+1} \Delta S_{c_{f}} \tag{9}
\end{align*}
$$

$$
\begin{array}{ll}
\Delta S_{t_{i n}} \sim \operatorname{Dist}\left(\mu=0, \sigma^{2}=1, \kappa_{t_{i n}}\right) & \Delta S_{c_{n}} \sim \operatorname{Dist}\left(\mu=0, \sigma^{2}=1, \kappa_{c_{n}}\right)  \tag{10}\\
\Delta S_{r_{\pi}} \sim \operatorname{Dist}\left(\mu=0, \sigma^{2}=1, \kappa_{r_{\pi}}\right) & \Delta S_{c_{f}} \sim \operatorname{Dist}\left(\mu=0, \sigma^{2}=1, \kappa_{c_{f}}\right)
\end{array}
$$

The effective capacitance for this problem generally becomes an arbitrary (non-CFO) random variable. Thus, we approximate it with its CFO form, $\stackrel{\triangleleft \triangleright}{c_{eff}}$, and the objective becomes to calculate the coefficients of $c_{eff}$ in the CFO form as well as the skewness of $\Delta S_{c_{eff}}$ as:

$$
\begin{gather*}
\stackrel{\triangleleft \triangleright}{c_{eff}}=c_{eff, 0}+\sum_{i=1}^{m} c_{eff, i} \Delta X_{i}+c_{eff, m+1} \Delta S_{c_{eff}}  \tag{11}\\
\text { such that } E\left[\left(\stackrel{\triangleleft \triangleright}{c_{eff}}-F\left(t_{r}\left(\stackrel{\triangleleft \triangleright}{t_{i n}}, \stackrel{\triangleleft \triangleright}{c_{eff}}\right), \stackrel{\triangleleft \triangleright}{c_{n}}, \stackrel{\triangleleft \triangleright}{r_{\pi}}, \stackrel{\triangleleft \triangleright}{c_{f}}\right)\right)^{2}\right] \text { is minimized. }
\end{gather*}
$$

Function $F$ is given in Eqn. (5) and $E(\cdot)$ denotes the expected value.

Theorem: For a variational circuit, where $t_{i n}, c_{n}, r_{\pi}$, and $c_{f}$ in the CFO form are written as in Eqns. (6)-(10), the coefficients of $c_{e f f}$ in the CFO form (Eqn. (11)), can be calculated as:

$$
\begin{equation*}
c_{e f f, 0}=F\left(t_{r}\left(t_{i n, 0}, c_{e f f, 0}\right), c_{n, 0}, r_{\pi, 0}, c_{f, 0}\right) \tag{12}
\end{equation*}
$$

$$
\begin{align*}
& c_{e f f, i}=\frac{\left(\frac{\partial t_{r}}{\partial t_{i n}}\right)^{n o m} \cdot\left(\frac{\partial F}{\partial t_{r}}\right)^{n o m} \cdot t_{i n, i}+\left(\frac{\partial F}{\partial c_{n}}\right)^{n o m} \cdot c_{n, i}}{1-\left(\frac{\partial F}{\partial t_{r}}\right)^{n o m} \cdot\left(\frac{\partial t_{r}}{\partial c_{e f f}}\right)^{n o m}}+\frac{\left(\frac{\partial F}{\partial r_{\pi}}\right)^{n o m} \cdot r_{\pi, i}+\left(\frac{\partial F}{\partial c_{f}}\right)^{n o m} \cdot c_{f, i}}{1-\left(\frac{\partial F}{\partial t_{r}}\right)^{n o m} \cdot\left(\frac{\partial t_{r}}{\partial c_{e f f}}\right)^{n o m}}  \tag{13}\\
& c_{e f f, m+1}=\sqrt{\left(c_{e f f, m+1}^{t_{i n}}\right)^{2}+\left(c_{e f f, m+1}^{c_{n}}\right)^{2}+\left(c_{e f f, m+1}^{r_{\pi}}\right)^{2}+\left(c_{e f f, m+1}^{c_{f}}\right)^{2}}  \tag{14}\\
& \Delta S_{c_{e f f}} \sim \operatorname{Dist}\left(\mu=0, \sigma^{2}=1, \kappa=\frac{\sum_{u \in U}\left(c_{e f f, m+1}^{u}\right)^{3} \kappa_{u}}{\left(\sum_{u \in U}\left(c_{e f f, m+1}^{u}\right)^{2}\right)^{3 / 2}}\right)  \tag{15}\\
& \text { where } U=\left\{t_{i n}, c_{n}, r_{\pi}, c_{f}\right\}
\end{align*}
$$

and

$$
\begin{aligned}
& c_{e f f, m+1}^{t_{\text {in }}}=\frac{\left(\frac{\partial F}{\partial t_{r}}\right)^{n o m}\left(\frac{\partial t_{r}}{\partial t_{i n}}\right)^{n o m} \cdot t_{i n, m+1}}{1-\left(\frac{\partial F}{\partial t_{r}}\right)^{n o m}\left(\frac{\partial t_{r}}{\partial c_{e f f}}\right)^{n o m}} \quad c_{e f f, m+1}^{c_{n}}=\frac{\left(\frac{\partial F}{\partial c_{n}}\right)^{n o m} \cdot c_{n, m+1}}{1-\left(\frac{\partial F}{\partial t_{r}}\right)^{n o m}\left(\frac{\partial t_{r}}{\partial c_{e f f}}\right)^{n o m}} \\
& c_{e f f, m+1}^{r_{\pi}}=\frac{\left(\frac{\partial F}{\partial r_{\pi}}\right)^{n o m} \cdot r_{\pi, m+1}}{1-\left(\frac{\partial F}{\partial t_{r}}\right)^{n o m}\left(\frac{\partial t_{r}}{\partial c_{e f f}}\right)^{n o m}} \\
& c_{e f f, m+1}^{c_{f}}=\frac{\left(\frac{\partial F}{\partial c_{f}}\right)^{n o m} \cdot c_{f, m+1}}{1-\left(\frac{\partial F}{\partial t_{r}}\right)^{n o m}\left(\frac{\partial t_{r}}{\partial c_{e f f}}\right)^{n o m}}
\end{aligned}
$$
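Once the nominal partial derivatives are available, the theorem reduces to a few arithmetic operations per source of variation. The following minimal sketch evaluates Eqns. (13)-(15) under that assumption; the function name, the dictionary layout, and the key names (`'t_in'`, `'c_n'`, `'r_pi'`, `'c_f'`) are illustrative choices, not part of the original method description.

```python
import math

INPUTS = ("t_in", "c_n", "r_pi", "c_f")

def c_eff_coefficients(dF, dtr, coef, kappa):
    """Evaluate Eqns. (13)-(15).

    dF    : nominal partials of F, keys 't_r', 'c_n', 'r_pi', 'c_f'
    dtr   : nominal partials of t_r, keys 't_in', 'c_eff'
    coef  : coef[u] = [u_1, ..., u_m, u_{m+1}] for each u in INPUTS
    kappa : kappa[u] = skewness of the independent term of input u
    """
    # Common denominator 1 - (dF/dt_r)(dt_r/dc_eff)
    denom = 1.0 - dF["t_r"] * dtr["c_eff"]
    # Chain-rule weight of each input (numerator factors of Eqn. (13))
    w = {"t_in": dF["t_r"] * dtr["t_in"], "c_n": dF["c_n"],
         "r_pi": dF["r_pi"], "c_f": dF["c_f"]}
    m = len(coef["t_in"]) - 1
    # Eqn. (13): sensitivities to the m global sources
    sens = [sum(w[u] * coef[u][i] for u in INPUTS) / denom for i in range(m)]
    # Per-input independent contributions c_eff,m+1^u
    indep = {u: w[u] * coef[u][m] / denom for u in INPUTS}
    # Eqn. (14): root-sum-square of the independent contributions
    c_m1 = math.sqrt(sum(v * v for v in indep.values()))
    # Eqn. (15): skewness of the combined independent term
    skew = (sum(indep[u] ** 3 * kappa[u] for u in INPUTS) / c_m1 ** 3
            if c_m1 > 0.0 else 0.0)
    return sens, c_m1, skew
```

As a sanity check, setting $\partial F / \partial t_{r}=0$ and $\partial F / \partial c_{n}=\partial F / \partial c_{f}=1$ makes $c_{eff}$ behave like $c_{n}+c_{f}$, so the global sensitivities simply add and the independent terms combine in root-sum-square fashion.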

Proof: Based on the proposed effective capacitance equations in section 4.1, the $c_{\text {eff }}$ iterative equation can be rewritten as:

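$$
\begin{aligned}
\stackrel{\triangleleft \triangleright}{c_{eff}}=F\left(t_{r}\left(\stackrel{\triangleleft \triangleright}{t_{i n}}, \stackrel{\triangleleft \triangleright}{c_{eff}}\right), \stackrel{\triangleleft \triangleright}{c_{n}}, \stackrel{\triangleleft \triangleright}{r_{\pi}}, \stackrel{\triangleleft \triangleright}{c_{f}}\right) & =\zeta \cdot G\left(t_{r}\left(\stackrel{\triangleleft \triangleright}{t_{i n}}, \stackrel{\triangleleft \triangleright}{c_{eff}}\right), \stackrel{\triangleleft \triangleright}{c_{n}}, \stackrel{\triangleleft \triangleright}{r_{\pi}}, \stackrel{\triangleleft \triangleright}{c_{f}}\right) \\
& +(1-\zeta) \cdot H\left(t_{r}\left(\stackrel{\triangleleft \triangleright}{t_{i n}}, \stackrel{\triangleleft \triangleright}{c_{eff}}\right), \stackrel{\triangleleft \triangleright}{c_{n}}, \stackrel{\triangleleft \triangleright}{r_{\pi}}, \stackrel{\triangleleft \triangleright}{c_{f}}\right)
\end{aligned}
$$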

Next, we need to compute $\stackrel{\triangleleft \triangleright}{c_{eff}}$ such that:

$$
\begin{align*}
& E\left[\left(\stackrel{\triangleleft \triangleright}{c_{eff}}-F\left(t_{r}\left(\stackrel{\triangleleft \triangleright}{t_{i n}}, \stackrel{\triangleleft \triangleright}{c_{eff}}\right), \stackrel{\triangleleft \triangleright}{c_{n}}, \stackrel{\triangleleft \triangleright}{r_{\pi}}, \stackrel{\triangleleft \triangleright}{c_{f}}\right)\right)^{2}\right] \text { is minimized, where }  \tag{16}\\
& \stackrel{\triangleleft \triangleright}{c_{eff}}=c_{eff, 0}+\sum_{i=1}^{m} c_{eff, i} \Delta X_{i}+c_{eff, m+1} \Delta S_{c_{eff}} \\
& \quad=c_{eff, 0}+\sum_{i=1}^{m} c_{eff, i} \Delta X_{i}+c_{eff, m+1}^{t_{i n}} \Delta S_{t_{i n}}+c_{eff, m+1}^{c_{n}} \Delta S_{c_{n}} \\
& \quad+c_{eff, m+1}^{r_{\pi}} \Delta S_{r_{\pi}}+c_{eff, m+1}^{c_{f}} \Delta S_{c_{f}}
\end{align*}
$$

Using partial derivatives, we can expand the non-linear function $F$ to first order around the nominal values of the global and independent sources of variation as:

$$
\begin{aligned}
& F\left(t_{r}\left(\stackrel{\triangleleft \triangleright}{t_{i n}}, \stackrel{\triangleleft \triangleright}{c_{eff}}\right), \stackrel{\triangleleft \triangleright}{c_{n}}, \stackrel{\triangleleft \triangleright}{r_{\pi}}, \stackrel{\triangleleft \triangleright}{c_{f}}\right) \cong F\left(t_{r}\left(t_{i n, 0}, c_{eff, 0}\right), c_{n, 0}, r_{\pi, 0}, c_{f, 0}\right) \\
& +\sum_{i=1}^{m} \left.\frac{\partial}{\partial \Delta X_{i}} F\right|_{\substack{\Delta X_{l}=0 \\ \Delta S_{k}=0}} \Delta X_{i}+\left.\frac{\partial}{\partial \Delta S_{t_{i n}}} F\right|_{\substack{\Delta X_{l}=0 \\ \Delta S_{k}=0}} \Delta S_{t_{i n}}+\left.\frac{\partial}{\partial \Delta S_{c_{n}}} F\right|_{\substack{\Delta X_{l}=0 \\ \Delta S_{k}=0}} \Delta S_{c_{n}} \\
& +\left.\frac{\partial}{\partial \Delta S_{r_{\pi}}} F\right|_{\substack{\Delta X_{l}=0 \\ \Delta S_{k}=0}} \Delta S_{r_{\pi}}+\left.\frac{\partial}{\partial \Delta S_{c_{f}}} F\right|_{\substack{\Delta X_{l}=0 \\ \Delta S_{k}=0}} \Delta S_{c_{f}}
\end{aligned}
$$

Therefore, to satisfy Eqn. (16), we need to have:

$$
\begin{align*}
& c_{eff, 0}=F\left(t_{r}\left(t_{i n, 0}, c_{eff, 0}\right), c_{n, 0}, r_{\pi, 0}, c_{f, 0}\right) \\
& c_{eff, i}=\left.\frac{\partial}{\partial \Delta X_{i}} F\right|_{\substack{\Delta X_{l}=0 \\ \Delta S_{k}=0}} \quad \forall i \in\{1 \ldots m\} \\
& \text { and } c_{eff, m+1}^{u}=\left.\frac{\partial}{\partial \Delta S_{u}} F\right|_{\substack{\Delta X_{l}=0 \\ \Delta S_{k}=0}} \text { for } u \in\left\{t_{i n}, c_{n}, r_{\pi}, c_{f}\right\} \tag{17}
\end{align*}
$$

Expanding Eqn. (17) will give us the following:

$$
\begin{aligned}
c_{eff, i} & =\left.\frac{\partial}{\partial \Delta X_{i}} F\right|_{\substack{\Delta X_{l}=0 \\ \Delta S_{k}=0}} \\
& =\left(\frac{\partial F}{\partial t_{r}} \cdot \frac{\partial t_{r}}{\partial c_{eff}} \cdot c_{eff, i}+\frac{\partial F}{\partial t_{r}} \cdot \frac{\partial t_{r}}{\partial t_{i n}} \cdot t_{i n, i}+\frac{\partial F}{\partial c_{n}} \cdot c_{n, i}+\frac{\partial F}{\partial r_{\pi}} \cdot r_{\pi, i}+\frac{\partial F}{\partial c_{f}} \cdot c_{f, i}\right)_{\substack{\Delta X_{l}=0 \\ \Delta S_{k}=0}}
\end{aligned}
$$

Therefore, the value of $c_{eff, i}$ can be calculated as in Eqn. (13). Using the same method, we can derive $c_{eff, m+1}^{t_{i n}}$:

$$
\begin{aligned}
c_{eff, m+1}^{t_{i n}} & =\left.\frac{\partial}{\partial \Delta S_{t_{i n}}} F\right|_{\substack{\Delta X_{l}=0 \\ \Delta S_{k}=0}} \\
& =\left.\left(\frac{\partial F}{\partial t_{r}} \cdot \frac{\partial t_{r}}{\partial \stackrel{\triangleleft \triangleright}{c_{eff}}} \cdot \frac{\partial \stackrel{\triangleleft \triangleright}{c_{eff}}}{\partial \Delta S_{t_{i n}}}+\frac{\partial F}{\partial t_{r}} \cdot \frac{\partial t_{r}}{\partial \stackrel{\triangleleft \triangleright}{t_{i n}}} \cdot \frac{\partial \stackrel{\triangleleft \triangleright}{t_{i n}}}{\partial \Delta S_{t_{i n}}}\right)\right|_{\substack{\Delta X_{l}=0 \\ \Delta S_{k}=0}} \\
& =\left.\left(\frac{\partial F}{\partial t_{r}} \cdot \frac{\partial t_{r}}{\partial c_{eff}} \cdot c_{eff, m+1}^{t_{i n}}+\frac{\partial F}{\partial t_{r}} \cdot \frac{\partial t_{r}}{\partial t_{i n}} \cdot t_{i n, m+1}\right)\right|_{\substack{\Delta X_{l}=0 \\ \Delta S_{k}=0}}
\end{aligned}
$$

where $c_{eff, m+1}^{t_{i n}}$ is obtained after collecting terms. Similarly, $c_{eff, m+1}^{c_{n}}$, $c_{eff, m+1}^{r_{\pi}}$, and $c_{eff, m+1}^{c_{f}}$ are calculated, and Eqn. (14) is proved. Finally, Lemma 1 proves Eqn. (15).
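The "collecting terms" step can be written out explicitly. As a short worked step, using $\partial t_{i n} / \partial \Delta S_{t_{i n}}=t_{i n, m+1}$ from Eqn. (6) and evaluating all partial derivatives at the nominal point, $c_{eff, m+1}^{t_{i n}}$ appears on both sides of the chain-rule expansion, so that:

$$
\begin{aligned}
c_{eff, m+1}^{t_{i n}}\left[1-\left(\frac{\partial F}{\partial t_{r}}\right)^{n o m}\left(\frac{\partial t_{r}}{\partial c_{eff}}\right)^{n o m}\right] & =\left(\frac{\partial F}{\partial t_{r}}\right)^{n o m}\left(\frac{\partial t_{r}}{\partial t_{i n}}\right)^{n o m} \cdot t_{i n, m+1} \\
\Rightarrow \quad c_{eff, m+1}^{t_{i n}} & =\frac{\left(\frac{\partial F}{\partial t_{r}}\right)^{n o m}\left(\frac{\partial t_{r}}{\partial t_{i n}}\right)^{n o m} \cdot t_{i n, m+1}}{1-\left(\frac{\partial F}{\partial t_{r}}\right)^{n o m}\left(\frac{\partial t_{r}}{\partial c_{eff}}\right)^{n o m}}
\end{aligned}
$$

This is the expression for $c_{eff, m+1}^{t_{i n}}$ stated in the theorem; the same rearrangement applied to the expansion of $c_{eff, i}$ yields Eqn. (13).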

Eqn. (12) is the iterative $c_{eff}$ calculation under the nominal conditions of the circuit. Hence, $c_{eff, 0}$ can be evaluated by using the effective capacitance calculation presented in section 4.1 or any conventional effective capacitance calculation [12][13].
$t_{i n, i}, c_{n, i}, r_{\pi, i}$, and $c_{f, i}$ are given (cf. Eqns. (6)-(9)). To evaluate Eqns. (13) and (14), we must calculate the derivatives of function $F$ (given in Eqn. (5)) with respect to $t_{r}, c_{n}, r_{\pi}$, and $c_{f}$, and evaluate these derivatives at the nominal values of the circuit parameters, i.e., with all sources of variation set to zero: $\left(\partial F / \partial t_{r}\right)^{n o m},\left(\partial F / \partial c_{n}\right)^{n o m},\left(\partial F / \partial r_{\pi}\right)^{n o m}$, and $\left(\partial F / \partial c_{f}\right)^{n o m}$. These terms are easy to evaluate. For the remaining terms, we need to calculate the derivatives of the output transition time $\left(t_{r}\right)$ with respect to $t_{i n}$ and $c_{eff}$ and evaluate them under the nominal condition of the circuit, i.e., $\left(\partial t_{r} / \partial t_{i n}\right)^{n o m}$ and $\left(\partial t_{r} / \partial c_{eff}\right)^{n o m}$. To do this, we propose two different solutions.

1) Updating the gate library look-up table and utilizing the additional data during $\sigma$TA: The revised tables now provide not only the timing quantity for each combination of $t_{i n}$ and $c_{l}$, but also the derivatives of the timing quantity $\left(t_{r}\right)$ with respect to $t_{i n}$ and $c_{l}$ for each such combination.
2) Using the existing gate library look-up table, but performing additional calculations during $\sigma$TA: To approximately calculate $\left(\partial t_{r} / \partial t_{i n}\right)^{n o m}$, we read $t_{r}$ (from the gate library) for $\left\langle t_{i n, 0} ; c_{l, 0}\right\rangle$ and $\left\langle t_{i n, 0}+\delta ; c_{l, 0}\right\rangle$. Next, we calculate $\Delta t_{r} / \delta$ as the approximation. $\left(\partial t_{r} / \partial c_{eff}\right)^{n o m}$ can be similarly calculated.
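The second solution amounts to a forward finite difference on the interpolated library table. A minimal sketch follows, assuming a two-dimensional table indexed by $t_{i n}$ and load capacitance with bilinear interpolation between grid points; the table layout, the function names, and the interpolation scheme are assumptions made for illustration, since the library format is not specified here.

```python
from bisect import bisect_right

def lookup(table, t_axis, c_axis, t_in, c_l):
    """Bilinear interpolation of table[i][j] = t_r(t_axis[i], c_axis[j])."""
    def cell(axis, v):
        # Index of the grid cell containing v (clamped to the table range)
        i = min(max(bisect_right(axis, v) - 1, 0), len(axis) - 2)
        return i, (v - axis[i]) / (axis[i + 1] - axis[i])
    i, u = cell(t_axis, t_in)
    j, v = cell(c_axis, c_l)
    return ((1 - u) * (1 - v) * table[i][j] + u * (1 - v) * table[i + 1][j]
            + (1 - u) * v * table[i][j + 1] + u * v * table[i + 1][j + 1])

def nominal_derivatives(table, t_axis, c_axis, t_in0, c_l0, dt, dc):
    """Forward-difference estimates of dt_r/dt_in and dt_r/dc_l at the nominal point."""
    base = lookup(table, t_axis, c_axis, t_in0, c_l0)
    d_tin = (lookup(table, t_axis, c_axis, t_in0 + dt, c_l0) - base) / dt
    d_cl = (lookup(table, t_axis, c_axis, t_in0, c_l0 + dc) - base) / dc
    return d_tin, d_cl
```

With a table generated from a linear function of $t_{i n}$ and $c_{l}$, the estimates reproduce the slopes exactly; for a real library, the step sizes `dt` and `dc` trade off truncation error against sensitivity to table granularity.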

Using either of the above solutions, Eqns. (13) and (14) become closed-form expressions, which can be evaluated in constant time. Note that we calculate $\left(\partial F / \partial t_{r}\right)^{n o m},\left(\partial F / \partial c_{n}\right)^{n o m},\left(\partial F / \partial r_{\pi}\right)^{n o m}$, and $\left(\partial F / \partial c_{f}\right)^{n o m}$ only once in constant time. The complexity of our method is thus dominated by the iterative effective capacitance calculation under the nominal conditions. It is therefore important to improve the efficiency of the statistical $c_{eff}$ calculation, as is done in the next section.

### 4.3 Fast Variational Gate Timing Analysis (F-VGTA)

As mentioned earlier, to perform accurate gate delay and output slew time calculation, an iterative calculation of $c_{\text {eff }}$ is inevitable [12][13]. However, as the number of sources of variations increases, the number of required $c_{\text {eff }}$ runs rises exponentially (it is proportional to the number of corner points in static timing analysis), which becomes very CPU-intensive very quickly. In the previous section, we presented a statistical $c_{\text {eff }}$ calculation technique. Here, we present another, more efficient, technique to find $c_{\text {eff }}$ in the CFO form.

Suppose the actual $c_{eff}$ in the CFO form can be represented as:

$$
\begin{equation*}
\stackrel{\triangleleft \triangleright}{c_{eff}}=c_{eff, 0}+\sum_{i=1}^{m} c_{eff, i} \Delta X_{i}+c_{eff, m+1} \Delta S_{c_{eff}}=c_{eff, 0}\left(1+\sum_{i=1}^{m} \frac{c_{eff, i}}{c_{eff, 0}} \Delta X_{i}+\frac{c_{eff, m+1}}{c_{eff, 0}} \Delta S_{c_{eff}}\right) \tag{18}
\end{equation*}
$$

and let $\stackrel{\triangleleft \triangleright}{c_{eff}^{k}}$ denote the CFO form obtained from the first $k$ iterations of the statistical iterative $c_{eff}$ algorithm:

$$
\begin{equation*}
\stackrel{\triangleleft \triangleright}{c_{eff}^{k}}=c_{eff, 0}^{k}+\sum_{i=1}^{m} c_{eff, i}^{k} \Delta X_{i}+c_{eff, m+1}^{k} \Delta S_{c_{eff}^{k}}=c_{eff, 0}^{k}\left(1+\sum_{i=1}^{m} \frac{c_{eff, i}^{k}}{c_{eff, 0}^{k}} \Delta X_{i}+\frac{c_{eff, m+1}^{k}}{c_{eff, 0}^{k}} \Delta S_{c_{eff}^{k}}\right) \tag{19}
\end{equation*}
$$

$c_{eff}^{0}$ means representing $c_{eff}$ using the total capacitance (i.e., $c_{n}+c_{f}$), $c_{eff}^{1}$ means the value of the effective capacitance obtained by using a single iteration, and so on. We define $c_{eff, i}^{k} / c_{eff, 0}^{k}$ and $c_{eff, i} / c_{eff, 0}$ as the iterative and actual normalized sensitivity coefficients (denoted by NSC's), respectively. The NSC's capture the effect of the load variation on the $c_{eff}$ value. It can be shown that the iterative NSC's change only slightly from one iteration to the next (for $k \geq 1$) and converge to their actual NSC values, i.e.,

$$
\begin{equation*}
\frac{c_{e f f, i}}{c_{e f f, 0}} \cong \frac{c_{e f f, i}^{k}}{c_{e f f, 0}^{k}} \quad 1 \leq i \leq m, \tag{20}
\end{equation*}
$$

Using the above observation, problem statement III can be solved by the following steps:

1) Evaluate $\stackrel{\triangleleft \triangleright}{c_{eff}^{k}}$ (section 4.1) and therefore find $c_{eff, 0}^{k}$ and $c_{eff, i}^{k}$ for $1 \leq i \leq m+1$.
2) Find the actual $c_{e f f, 0}$ by performing conventional iterative effective capacitance algorithm for the nominal conditions of the circuit.
3) Using Eqn. (20) and the results of steps 1 and 2 , determine

$$
c_{e f f, i}=c_{e f f, 0} \cdot \frac{c_{e f f, i}^{k}}{c_{e f f, 0}^{k}} \quad \forall i, 1 \leq i \leq m+1
$$

4) Having found $c_{eff, 0}$ and $c_{eff, i}$ for $1 \leq i \leq m+1$, calculate $\stackrel{\triangleleft \triangleright}{c_{eff}}$. Using the method presented in section 2.2, determine the gate delay and output slew in the CFO form and the skewness of $\Delta S_{t_{\alpha}}$.

Figure 4: (a) A gate, which drives an $R C$ - $\pi$ calculated load. (b) Gate output waveform is neither ramp nor exponential

Step 2 is performed by using the STA-based (non-variational) $c_{eff}$ algorithm presented in section 4.1 or any other conventional effective capacitance calculation [12][13]. Step 3 is a simple algebraic equation, while step 4 is performed as per section 2.3. For step 1, the following sections show how to calculate $\stackrel{\triangleleft \triangleright}{c_{eff}^{0}}$ and $\stackrel{\triangleleft \triangleright}{c_{eff}^{1}}$.

### 4.3.1 Finding $\stackrel{\triangleleft \triangleright}{c_{eff}}$ from $\stackrel{\triangleleft \triangleright}{c_{eff}^{0}}$

As mentioned before, $c_{eff}^{0}$ approximates $c_{eff}$ with the total capacitance (i.e., $c_{n}+c_{f}$). Thus, $\stackrel{\triangleleft \triangleright}{c_{eff}^{0}}$ is equal to the sum of $\stackrel{\triangleleft \triangleright}{c_{n}}$ and $\stackrel{\triangleleft \triangleright}{c_{f}}$; i.e., if

$$
\begin{equation*}
\stackrel{\triangleleft \triangleright}{c_{n}}=c_{n, 0}+\sum_{i=1}^{m} c_{n, i} \Delta X_{i}+c_{n, m+1} \Delta S_{c_{n}} \quad \stackrel{\triangleleft \triangleright}{c_{f}}=c_{f, 0}+\sum_{i=1}^{m} c_{f, i} \Delta X_{i}+c_{f, m+1} \Delta S_{c_{f}} \tag{21}
\end{equation*}
$$

then

$$
\begin{equation*}
\stackrel{\triangleleft \triangleright}{c_{eff}^{0}}=\left(c_{n, 0}+c_{f, 0}\right) \cdot\left(1+\sum_{i=1}^{m} \frac{c_{n, i}+c_{f, i}}{c_{n, 0}+c_{f, 0}} \cdot \Delta X_{i}+\frac{\sqrt{c_{n, m+1}^{2}+c_{f, m+1}^{2}}}{c_{n, 0}+c_{f, 0}} \cdot \Delta S_{c_{eff}^{0}}\right) \tag{22}
\end{equation*}
$$

We must calculate $c_{eff}$ for the nominal condition of the circuit (i.e., with every quantity in the circuit at its nominal value) to get $c_{eff, 0}$. Therefore, by using Eqns. (18), (20), and (22), the variational effective capacitance can be written as:

$$
\begin{equation*}
\stackrel{\triangleleft \triangleright}{c_{eff}}=c_{eff, 0}+\sum_{i=1}^{m} \frac{c_{n, i}+c_{f, i}}{c_{n, 0}+c_{f, 0}} \cdot c_{eff, 0} \Delta X_{i}+\frac{\sqrt{c_{n, m+1}^{2}+c_{f, m+1}^{2}}}{c_{n, 0}+c_{f, 0}} \cdot c_{eff, 0} \Delta S_{c_{eff}} \tag{23}
\end{equation*}
$$

Now, we can use $\stackrel{\triangleleft \triangleright}{c_{eff}}$ in Eqn. (23) and the method presented in section 2.2 to generate the gate propagation delay and output slew time in the CFO form. However, this approach may not capture the effect of the variations of the resistance in the $R C-\pi$ load on the gate timing analysis. Therefore, the next approach finds the NSC's based on a reasonably accurate single-iteration $c_{eff}$ calculation.
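Eqns. (22) and (23) translate into a few lines of arithmetic. The sketch below is illustrative only: each CFO quantity is represented by a hypothetical tuple `(nominal, [sens_1, ..., sens_m], indep_coef)`, and `c_eff0` is assumed to come from a separate nominal iterative $c_{eff}$ calculation.

```python
import math

def c_total_cfo(c_n, c_f):
    """CFO form of c_n + c_f, as in Eqn. (22).

    Global sensitivities add term by term; the two independent terms
    combine in root-sum-square fashion.
    """
    n0, n_sens, n_ind = c_n
    f0, f_sens, f_ind = c_f
    return (n0 + f0,
            [a + b for a, b in zip(n_sens, f_sens)],
            math.hypot(n_ind, f_ind))

def c_eff_from_c_total(c_eff0, c_n, c_f):
    """Eqn. (23): scale the NSC's of the total capacitance by the nominal c_eff."""
    t0, t_sens, t_ind = c_total_cfo(c_n, c_f)
    scale = c_eff0 / t0
    return c_eff0, [s * scale for s in t_sens], t_ind * scale

# Illustrative (made-up) coefficients, e.g. in fF
print(c_eff_from_c_total(200.0, (100.0, [6.0, 2.0], 3.0), (300.0, [10.0, 6.0], 4.0)))
```

Note that no coefficient of $r_{\pi}$ enters this computation, which is why this variant cannot reflect resistance variations in the load.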

### 4.3.2 Finding $\stackrel{\triangleleft \triangleright}{c_{eff}}$ from $\stackrel{\triangleleft \triangleright}{c_{eff}^{1}}$

In this section, we find the nominal value of the effective capacitance by performing the iterative $c_{eff}$ calculation for the nominal conditions of the circuit. Next, we find the NSC's by applying a single-iteration effective capacitance method. $c_{eff}^{1}$ means using a single iteration of Eqn. (5) as the gate load. Thus, $\stackrel{\triangleleft \triangleright}{c_{eff}^{1}}$ may be obtained by differentiating Eqn. (5) with respect to the sources of variation (cf. section 2.2).

Subsequently, using the same approach as in section 4.1, we can find $\stackrel{\triangleleft \triangleright}{c_{eff}}$ while the NSC's are calculated using the above single-iteration $c_{eff}$ technique. Experimental results show that evaluating the variational $c_{eff}$ with this approach yields an average error of $7 \%$ in the final delay and output slew time calculation with respect to Monte Carlo simulation.

## 5. Experimental Results

Our experiments use 90 nm CMOS process parameters to model gates and interconnect parasitics. We assumed two different configurations for the experimental setup. The first consists of two inverters connected in series, whereas the second is a CMOS inverter followed by a 2-input NAND gate. For both configurations, we apply a ramp input to the first inverter, with its nominal transition time chosen from the set $\left(t_{i n}\right)^{n o m}=\{10 \mathrm{ps}, 80 \mathrm{ps}, 150 \mathrm{ps}, 220 \mathrm{ps}, 300 \mathrm{ps}\}$. For the first configuration, the size of the first inverter is fixed at $W_{p} / W_{n}=30 / 15 \mu \mathrm{m}$, whereas the size of the second inverter is chosen to be one of $W_{p} / W_{n}=\{20 / 10,50 / 25,70 / 35,100 / 50\} \mu \mathrm{m}$. For the second configuration, the size of the first inverter is again fixed at $W_{p} / W_{n}=30 / 15 \mu \mathrm{m}$, whereas the size of the succeeding 2-input NAND gate is chosen to be one of $W_{p} / W_{n}=\{40 / 40,50 / 50,100 / 100\} \mu \mathrm{m}$.

To characterize the timing behavior of the gate, a look-up table based library is employed which represents the gate delay and output transition time as a function of input rise time, output capacitive load, $V_{d d}$, and temperature. We apply different loading scenarios for the second-stage gate as explained in the following subsections, i.e., pure capacitive load and general $R C$ load. We have also considered four different global sources of variation ($V_{d d}$, temperature, Metal layer 1 width, and ILD) and one independent random source of variation for each electrical parameter (i.e., $r$ and $c$) and timing parameter (for instance, $t_{i n}$) in the circuit. The sensitivity of each given quantity to the sources of variation is chosen randomly, while the total $\sigma$ variation for each quantity is chosen to be $10 \%$ and $15 \%$ of its nominal value. We also assumed that the sources of variation are skewed with different skewness values as explained in each subsection. The mean, variance, and skewness of the effective capacitance, the gate $50 \%$ propagation delay, and the $10 \%-90 \%$ output transition time (slew time) are calculated using the approaches presented in this paper.

To compare the results, we ran Monte Carlo simulation with $10^{4}$ samples on each test scenario and derived the mean, variance, and skewness of the effective capacitance, the gate $50 \%$ propagation delay, and the $10 \%-90 \%$ output transition time. We report the average percentage errors in these quantities between the results obtained from HSPICE and the results calculated using the VGTA and F-VGTA algorithms.
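For concreteness, the comparison metric can be sketched as follows. This is an assumed formulation rather than the exact one used in the experiments: it uses population (divide-by-$n$) estimators for the sample moments and the absolute relative error in percent, and the function names are illustrative.

```python
def moments(samples):
    """Sample mean, variance, and skewness (population estimators)."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    # Skewness = third central moment normalized by sigma^3
    skew = (sum((s - mean) ** 3 for s in samples) / n) / var ** 1.5
    return mean, var, skew

def pct_error(estimate, reference):
    """Absolute percentage error of an analytical estimate vs. a reference value."""
    return 100.0 * abs(estimate - reference) / abs(reference)

# Usage: mc_samples would hold the Monte Carlo delays of one scenario, and
# (mean_a, var_a, skew_a) the corresponding analytically computed values:
#   errors = [pct_error(a, r) for a, r in zip((mean_a, var_a, skew_a), moments(mc_samples))]
```

Averaging such per-scenario errors over all input transition times, gate sizes, and loads would give one table entry per statistic.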

### A. Pure Capacitive Load

The load in this section is considered to be purely capacitive. Its nominal value is chosen from the set $(C)^{n o m}=\{400,500,800,1400\} f F$. The scaled distribution of the sources of variation is considered to have a skewness of 0.4, 0.6, and 0.8. We performed our experiments on both circuit configurations explained above. The results for the first configuration (where the second gate is an inverter) are presented in Table 2 (for a skewness of 0.4) and Table 3 (for a skewness of 0.8). The results for the second configuration are provided in Table 4 (for a skewness of 0.6). Experimental results indicate an average error of about $3 \%$ for the two $\sigma$ values, i.e., $10 \%$ and $15 \%$. As we increase the $\sigma$ value (i.e., the total $\sigma$ variation for each quantity, e.g., the $\sigma$ variation of $t_{i n}$ and $c_{l}$) from $10 \%$ to $15 \%$, the errors in the calculated mean, variance, and skewness of the delay and slew time increase, but only slightly. The sources of error can be mainly classified into two groups: 1) the inaccuracy of the gate library table lookup and 2) the linear first-order approximation of the timing and electrical parameters with respect to the sources of variation. Note that the proposed algorithm is, on average, 129 times faster than the Monte Carlo based approach.

Table 2: Average error for the inverter driving pure capacitive load (Skewness=0.4)

|  | $\sigma=10 \%$ |  | $\sigma=15 \%$ |  |
| :---: | :---: | :---: | :---: | :---: |
| Average error | Delay | Slew <br> time | Delay | Slew <br> time |
| Mean | $1.5 \%$ | $1.7 \%$ | $2.2 \%$ | $2.3 \%$ |
| Variance | $1.2 \%$ | $1.3 \%$ | $1.8 \%$ | $1.9 \%$ |
| Skewness | $1.0 \%$ | $1.1 \%$ | $1.4 \%$ | $1.3 \%$ |

Table 3: Average error for the inverter driving pure capacitive load (Skewness=0.8)

|  | $\sigma=10 \%$ |  | $\sigma=15 \%$ |  |
| :---: | :---: | :---: | :---: | :---: |
| Average error | Delay | Slew <br> time | Delay | Slew <br> time |
| Mean | $1.9 \%$ | $2.3 \%$ | $2.5 \%$ | $2.9 \%$ |
| Variance | $1.6 \%$ | $1.7 \%$ | $1.9 \%$ | $2.1 \%$ |
| Skewness | $1.4 \%$ | $1.5 \%$ | $1.5 \%$ | $1.9 \%$ |

Table 4: Average error for the 2-input NAND gate driving pure capacitive load (Skewness=0.6)

|  | $\sigma=10 \%$ |  | $\sigma=15 \%$ |  |
| :---: | :---: | :---: | :---: | :---: |
| Average error | Delay | Slew <br> time | Delay | Slew <br> time |
| Mean | $3.0 \%$ | $3.1 \%$ | $3.2 \%$ | $3.1 \%$ |
| Variance | $2.5 \%$ | $2.7 \%$ | $2.8 \%$ | $2.9 \%$ |
| Skewness | $2.2 \%$ | $2.3 \%$ | $2.5 \%$ | $2.6 \%$ |

### B. General RC Load

For this section, the load is considered to be an $R C$ tree of varying topology. The nominal value of the total resistance of the load is chosen from the set $(R)^{n o m}=\{150,260,300,710,1000\} \Omega$ and the nominal value of the total capacitance of the load is chosen from the set $(C)^{n o m}=\{400,500,800,1400\} f F$. The scaled distribution of the sources of variation is considered to have a skewness of 0.5, 0.75, and 1.

Again, we performed the experiment on both circuit configurations as explained before. The results for the first configuration (where the second gate is an inverter) are presented in Table 5 (for a skewness of 0.5) and Table 6 (for a skewness of 0.75). The results for the second configuration are provided in Table 7 (for a skewness of 1). Experimental results indicate an average error of about $6 \%$ for different $\sigma$ values. As we increase the $\sigma$ value (i.e., the total $\sigma$ variation for each quantity, e.g., the $\sigma$ variation of $t_{i n}, c_{n}, r_{\pi}$, and $c_{f}$) from $10 \%$ to $15 \%$, the errors in the calculated mean, variance, and skewness of $c_{eff}$, the gate delay, and the output transition time increase, but only slightly. Similarly, as the skewness (e.g., the skewness of $t_{i n}, c_{n}, r_{\pi}$, and $c_{f}$) increases from 0.5 to 0.75, the errors in the calculated mean, variance, and skewness of $c_{eff}$, as well as the errors in delay and slew time, increase, but only slightly. The sources of error can be mainly classified into four groups: 1) the inaccuracy of the gate library table lookup, 2) the linear first-order approximation of the timing and electrical parameters with respect to the sources of variation, 3) the error in calculating the variational $R C-\pi$ load, and 4) the error in the effective capacitance iterative equation proposed in section 4.1. Note that the proposed algorithm is, on average, 95 times faster than the Monte Carlo based approach.

Table 5: Average error for the inverter driving general $R C$ load (Skewness=0.5)

|  | $\sigma=10 \%$ |  |  | $\sigma=15 \%$ |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Average error | $\boldsymbol{C}_{\text {eff }}$ | Delay | Slew <br> time | $\boldsymbol{C}_{\text {eff }}$ | Delay | Slew <br> time |
| Mean | $3.2 \%$ | $3.5 \%$ | $4.9 \%$ | $3.5 \%$ | $5.4 \%$ | $5.8 \%$ |
| Variance | $2.4 \%$ | $3.3 \%$ | $4.5 \%$ | $2.6 \%$ | $5.9 \%$ | $5.2 \%$ |
| Skewness | $2.5 \%$ | $3.3 \%$ | $4.9 \%$ | $2.0 \%$ | $5.5 \%$ | $5.5 \%$ |

Table 6: Average error for the inverter driving general $\boldsymbol{R C}$ load (Skewness=0.75)

|  | $\sigma=10 \%$ |  |  | $\sigma=15 \%$ |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Average error | $\boldsymbol{C}_{\text {eff }}$ | Delay | Slew <br> time | $\boldsymbol{C}_{\text {eff }}$ | Delay | Slew <br> time |
| Mean | $3.5 \%$ | $5.1 \%$ | $5.3 \%$ | $3.8 \%$ | $5.9 \%$ | $6.1 \%$ |
| Variance | $2.9 \%$ | $4.3 \%$ | $5.5 \%$ | $3.6 \%$ | $6.2 \%$ | $6.2 \%$ |
| Skewness | $2.8 \%$ | $4.1 \%$ | $4.9 \%$ | $3.1 \%$ | $5.9 \%$ | $5.9 \%$ |

Table 7: Average error for the 2-input NAND gate driving general $\boldsymbol{R C}$ load (Skewness=1)

|  | $\sigma=10 \%$ |  |  | $\sigma=15 \%$ |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Average error | $C_{\text {eff }}$ | Delay | Slew <br> time | $\boldsymbol{C}_{\text {eff }}$ | Delay | Slew <br> time |
| Mean | $4.1 \%$ | $5.2 \%$ | $5.1 \%$ | $4.2 \%$ | $6.1 \%$ | $6.7 \%$ |
| Variance | $3.9 \%$ | $5.4 \%$ | $5.2 \%$ | $4.3 \%$ | $6.1 \%$ | $6.1 \%$ |
| Skewness | $4.0 \%$ | $6.1 \%$ | $5.6 \%$ | $4.2 \%$ | $6.5 \%$ | $6.3 \%$ |

For the F-VGTA algorithm, we again performed the experiment on both circuit configurations as explained before. The results for the first configuration (where the second gate is an inverter) are presented in Table 8 (when $C_{\text {total }}$ is used for calculating the NSC's) and Table 9 (when the single-iteration $C_{eff}$ is used for calculating the NSC's). The results for the second configuration are provided in Table 10 (when $C_{\text {total }}$ is used for calculating the NSC's) and Table 11 (when the single-iteration $C_{eff}$ is used for calculating the NSC's). Experimental results indicate an average error of about $19 \%$ for different $\sigma$ values when $C_{\text {total }}$ is used for calculating the NSC's, and an average error of about $7 \%$ for different $\sigma$ values when the single-iteration $C_{eff}$ is used. As we increase the $\sigma$ value (i.e., the total $\sigma$ variation for each quantity, e.g., the $\sigma$ variation of $t_{i n}, c_{n}, r_{\pi}$, and $c_{f}$) from $10 \%$ to $15 \%$, the errors in the calculated mean and variance of $C_{eff}$, the gate delay, and the output transition time increase, but only slightly. The sources of error can be mainly classified into five groups: 1) the inaccuracy of the gate library table lookup, 2) the linear first-order approximation of the timing and electrical parameters with respect to the sources of variation, 3) the error in calculating the variational $R C-\pi$ load, 4) the error in the effective capacitance iterative equation, and 5) the error in the NSC approximation (Eqn. (20)). Note that the proposed algorithm is, on average, 185 times faster than the Monte Carlo based approach.

Table 8: Average error for the inverter driving general $R C$ load when $C_{\text {total }}$ is used for calculating NSC

|  | $\sigma=10 \%$ |  | $\sigma=15 \%$ |  |
| :---: | :---: | :---: | :---: | :---: |
| Average error | Delay | Slew <br> time | Delay | Slew <br> time |
| Mean | $14.6 \%$ | $15.8 \%$ | $18.1 \%$ | $18.3 \%$ |
| Variance | $15.4 \%$ | $16.3 \%$ | $16.9 \%$ | $17.9 \%$ |
| Skewness | $15.9 \%$ | $17.5 \%$ | $17.3 \%$ | $18.5 \%$ |

Table 9: Average error for the inverter driving general $R C$ load when single iteration $C_{e f f}$ is used for calculating NSC

|  | $\sigma=10 \%$ |  |  | $\sigma=15 \%$ |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Average error | $C_{\text {eff }}$ | Delay | Slew <br> time | $C_{\text {eff }}$ | Delay | Slew <br> time |
| Mean | $4.1 \%$ | $6.5 \%$ | $6.7 \%$ | $4.2 \%$ | $6.4 \%$ | $6.4 \%$ |
| Variance | $3.9 \%$ | $5.6 \%$ | $6.0 \%$ | $4.3 \%$ | $6.5 \%$ | $6.3 \%$ |
| Skewness | $3.7 \%$ | $5.1 \%$ | $5.5 \%$ | $4.4 \%$ | $6.9 \%$ | $6.4 \%$ |

Table 10: Average error for the 2-input NAND gate driving general $R C$ load when $C_{\text {total }}$ is used for calculating NSC

|  | $\sigma=10 \%$ |  | $\sigma=15 \%$ |  |
| :---: | :---: | :---: | :---: | :---: |
| Average error | Delay | Slew <br> time | Delay | Slew <br> time |
| Mean | $16.6 \%$ | $16.8 \%$ | $19.1 \%$ | $18.2 \%$ |
| Variance | $16.4 \%$ | $17.3 \%$ | $17.9 \%$ | $18.8 \%$ |
| Skewness | $16.1 \%$ | $17.7 \%$ | $17.5 \%$ | $19.0 \%$ |

Table 11: Average error for the 2-input NAND gate driving general $R C$ load when single iteration $C_{\text {eff }}$ is used for calculating NSC

|  | $\sigma=10 \%$ |  |  | $\sigma=15 \%$ |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Average error | $C_{\text {eff }}$ | Delay | Slew <br> time | $C_{\text {eff }}$ | Delay | Slew <br> time |
| Mean | $3.7 \%$ | $5.6 \%$ | $5.8 \%$ | $4.6 \%$ | $6.1 \%$ | $6.2 \%$ |
| Variance | $4.1 \%$ | $5.4 \%$ | $5.3 \%$ | $4.5 \%$ | $5.9 \%$ | $5.8 \%$ |
| Skewness | $4.5 \%$ | $5.3 \%$ | $5.2 \%$ | $4.3 \%$ | $5.6 \%$ | $5.3 \%$ |

## 6. Conclusion

In this paper, we presented two frameworks for variation-aware gate timing analysis in block-based $\sigma$ TA that account for non-Gaussian sources of variation. As a first step for both frameworks, we proposed an approach to calculate a variational $R C-\pi$ load, which can be used in place of the actual variational $R C$ load for gate timing analysis. Next, we presented a new approach for calculating the effective capacitance in STA. We used this technique to calculate the statistical $C_{\text {eff }}$ in canonical first-order (CFO) form and, from it, the gate delay and output slew time in CFO form. Experimental results show an average error of $4 \%$ with respect to HSPICE Monte Carlo simulation.
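To illustrate what a result in CFO form carries, the sketch below assumes a quantity $d = a_{0} + \sum_{i} a_{i} \Delta X_{i}$ whose sources $\Delta X_{i}$ are zero-mean and mutually independent. Under that assumption the variance and the third central moment of $d$ add term by term, so its mean, variance, and skewness follow directly from the sensitivities. The names `Source` and `cfo_moments` and all numerical values are hypothetical, and the CFO form used in the paper may contain further terms that this sketch omits.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """Zero-mean source of variation (assumed independent of the others)."""
    sigma: float      # standard deviation
    mu3: float = 0.0  # third central moment; 0 for a Gaussian source


def cfo_moments(nominal, sensitivities, sources):
    """Mean, variance, and skewness of nominal + sum(a_i * dX_i)."""
    var = sum((a * s.sigma) ** 2 for a, s in zip(sensitivities, sources))
    third = sum(a ** 3 * s.mu3 for a, s in zip(sensitivities, sources))
    skew = third / var ** 1.5 if var > 0 else 0.0
    return nominal, var, skew


if __name__ == "__main__":
    # Hypothetical gate delay: one Gaussian source and one skewed source.
    sources = [Source(sigma=1.0), Source(sigma=2.0, mu3=8.0)]
    mean, var, skew = cfo_moments(10.0, [2.0, 1.0], sources)
    print(f"mean={mean:.3f} variance={var:.3f} skewness={skew:.3f}")
```

Because all three moments follow from the sensitivities in closed form, a quantity in this form can be compared against Monte Carlo statistics without sampling.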

## 7. References

[1] R. Nassif, "Modeling and Analysis of Manufacturing Variations," CICC, pp. 223-228, 2001.
[2] C. Visweswariah, K. Ravindran, K. Kalafala, S. G. Walker, and S. Narayan, "First-Order Incremental Block-Based Statistical Timing Analysis," 41st Design Automation Conference (DAC), pp. 331-336, June 7-11, 2004.
[3] H. Chang, V. Zolotov, S. Narayan, and C. Visweswariah, "Parameterized Block-Based Statistical Timing Analysis with Non-Gaussian and Non-Linear Parameters," International Workshop on Timing Issues (TAU), 2005.
[4] Y. Liu, L. T. Pileggi, and A. J. Strojwas, "Model Order Reduction of RC(L) Interconnect Including Variational Analysis," DAC, pp. 201-206, 1999.
[5] J. D. Ma and R. A. Rutenbar, "Interval-Valued Reduced Order Statistical Interconnect Modeling," ICCAD, pp. 460-467, 2004.
[6] K. Agarwal, D. Sylvester, D. Blaauw, F. Liu, S. Nassif, and S. Vrudhula, "Variational Delay Metrics for Interconnect Timing Analysis," 41st Design Automation Conference (DAC), pp. 381-384, June 7-11, 2004.
[7] A. Agarwal, F. Dartu, and D. Blaauw, "Statistical Gate Delay Model Considering Multiple Input Switching," DAC, pp. 658-663, 2004.
[8] K. Okada, K. Yamaoka, and H. Onodera, "A Statistical Gate-Delay Model Considering Intra-Gate Variability," International Conference on Computer Aided Design (ICCAD), pp. 908-913, Nov. 9-13, 2003.
[9] V. Mehrotra, S. Nassif, D. Boning, and J. Chung, "Modeling the Effects of Manufacturing Variation on High-Speed Microprocessor Interconnect Performance," IEEE Electron Devices Meetings, pp. 767-770, 1998.
[10] P. R. O'Brien and T. L. Savarino, "Modeling the Driving-Point Characteristics of Resistive Interconnect for Accurate Delay Estimation," Proc. of IEEE Int'l Conf. on Computer Aided Design, pp. 512-515, 1989.
[11] A. B. Kahng and S. Muddu, "Improved Effective Capacitance Computations for Use in Logic and Layout Optimization," VLSI Design, pp. 578-582, 1999.
[12] F. Dartu, N. Menezes, and L. Pileggi, "Performance Computation for Precharacterized Gates with RC Loads," IEEE Trans. on Computer Aided Design, 15(5):544-553, 1996.
[13] S. Abbaspour and M. Pedram, "Calculating the Effective Capacitance for the RC Interconnect in VDSM Technologies," ASPDAC, 2003.


[^0]:    * This paper combines and extends results of our works, which were presented at the 2005 International Conference on Computer Design and the 2006 Asia-South Pacific Design Automation Conference.

