

# Home assignment 3

**MCC092**  
**Introduction to**  
**Integrated Circuit Design**  
Chalmers University of Technology

**2018-10-15**  
**Sequencing, metastability**  
**& prefix adders**

**Task #1 Sequencing 1**

You have the following data for one type of flip-flop from a cell library:

| Parameter                              | Value |
|----------------------------------------|-------|
| Setup time, $t_{\text{setup}}$         | 65 ps |
| clk-to-Q propagation delay, $t_{pcq}$  | 50 ps |
| clk-to-Q contamination delay, $t_{ccq}$ | 35 ps |
| Hold time, $t_{\text{hold}}$           | 30 ps |

For your convenience we have included part of Figure 11.4 from Weste & Harris, which shows these delays. See Fig. 1 below.



Figure 1. This is part of Figure 11.4 from Weste and Harris, which shows the six relevant delays: two for the combinational logic and four for the flip-flops.

You are to use this type of flip-flop for sequencing in a system. Your task is to design the combinational logic, CL, which is to be placed between the flip-flops; therefore you need to determine the timing constraints for this logic.

- a) Determine the sequencing overhead for this type of flip-flop.
- b) Determine the logic propagation delay, $t_{\text{pd}}$, allowed if the system is clocked with a clock frequency, $f_c$, of 2 GHz.
- c) Determine the logic contamination delay, $t_{\text{cd}}$, allowed with the same clock frequency as in task b).
- d) What if there is a maximum clock skew of 50 ps between any two flip-flops in the system? What is the sequencing overhead then? With the same clock frequency as before, what are then the logic propagation delay, $t_{\text{pd}}$, and the logic contamination delay, $t_{\text{cd}}$, allowed for the logic you are to design?
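The max-delay and min-delay constraints for flip-flop sequencing from Weste & Harris can be sketched in a few lines of Python. This is only an illustration: the function name `flop_timing_constraints` and all numeric values are hypothetical and are deliberately not the library data given above.

```python
def flop_timing_constraints(t_setup, t_pcq, t_ccq, t_hold, T_c, t_skew=0):
    """Return (sequencing overhead, max logic t_pd, min logic t_cd), all in ps.

    Setup (max-delay) constraint: t_pd <= T_c - (t_pcq + t_setup + t_skew)
    Hold (min-delay) constraint:  t_cd >= t_hold - t_ccq + t_skew
    """
    overhead = t_pcq + t_setup + t_skew
    t_pd_max = T_c - overhead
    t_cd_min = t_hold - t_ccq + t_skew
    return overhead, t_pd_max, t_cd_min


# Hypothetical flip-flop data in ps (NOT the values from the assignment).
ideal = flop_timing_constraints(t_setup=40, t_pcq=60, t_ccq=30, t_hold=25, T_c=1000)
skewed = flop_timing_constraints(t_setup=40, t_pcq=60, t_ccq=30, t_hold=25,
                                 T_c=1000, t_skew=20)
print(ideal)   # (100, 900, -5)
print(skewed)  # (120, 880, 15)
```

A non-positive minimum contamination delay means that the hold constraint is met even with no logic between the flip-flops. Note how skew tightens both constraints at once.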

**Task #2 Sequencing 2**

You have the following data for another type of flip-flop from a cell library:

| Parameter                              | Value |
|----------------------------------------|-------|
| Setup time, $t_{\text{setup}}$         | 60 ps |
| clk-to-Q propagation delay, $t_{pcq}$  | 70 ps |
| clk-to-Q contamination delay, $t_{ccq}$ | 50 ps |
| Hold time, $t_{\text{hold}}$           | 20 ps |

The circuit shown in Figure 2 is implemented using this type of flip-flop, as well as three XOR gates. Each XOR gate has $t_{pd} = 100$ ps and $t_{cd} = 55$ ps.



Figure 2. A sequential circuit comprising four flip-flops and three XOR gates.

- a) If there is no clock skew, what is the maximum operating frequency of the circuit?
- b) How much clock skew can the circuit tolerate if it must operate at 2 GHz?
- c) How much clock skew can the circuit tolerate before it might experience a hold violation?
- d) Alice points out that she can redesign the combinational logic between the registers to be faster and tolerate more clock skew. Her improved circuit also uses three XOR gates, but they are arranged differently. What is her circuit? With this new circuit, what is now the maximum frequency if there is no clock skew? How much clock skew can the circuit tolerate before it might experience a hold violation?
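The questions above all follow from the same two constraints, applied to the longest and the shortest register-to-register path. A minimal Python sketch is given below, assuming that the path delays are simply sums of gate delays. The function names and all numbers are hypothetical and do not correspond to the circuit in Figure 2.

```python
def max_frequency_ghz(t_setup, t_pcq, path_pd, t_skew=0):
    """Maximum clock frequency in GHz; all delays are given in ps."""
    T_min = t_pcq + path_pd + t_setup + t_skew  # shortest usable clock period
    return 1000.0 / T_min


def setup_skew_budget(t_setup, t_pcq, path_pd, T_c):
    """Skew (ps) tolerated by the setup constraint at clock period T_c."""
    return T_c - (t_pcq + path_pd + t_setup)


def hold_skew_budget(t_hold, t_ccq, path_cd):
    """Skew (ps) tolerated before a hold violation becomes possible."""
    return t_ccq + path_cd - t_hold


# Hypothetical gate: t_pd = 80 ps, t_cd = 40 ps.
# Assumed longest path: three gates in series; assumed shortest path: one gate.
longest_pd = 3 * 80
shortest_cd = 1 * 40

f_max = max_frequency_ghz(t_setup=40, t_pcq=60, path_pd=longest_pd)
skew_setup = setup_skew_budget(t_setup=40, t_pcq=60, path_pd=longest_pd, T_c=400)
skew_hold = hold_skew_budget(t_hold=25, t_ccq=30, path_cd=shortest_cd)
print(f_max, skew_setup, skew_hold)
```

Since only the longest and shortest paths enter the formulas, rearranging the same gates changes both the maximum frequency and the skew tolerance.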

### Task #3 Metastability

*In this task you are to use data from the article “Metastability challenges for 65nm and beyond” by Beer and co-authors for a 65 nm process. This article is available from the PingPong page for assignment HA3. The background on metastability can be found in the article “Metastability and Synchronizers: A tutorial” by Ran Ginosar, which is also available from the assignment page. Read pages 23-28 up to “two-flip-flop synchronizer”.*

You are working as a hardware designer for a satellite-payload system for future telecom systems. Through a project for the European Space Agency, ESA, a company is to develop an ASIC in a 65 nm process for this purpose. The hardware is to act as a huge software-reconfigurable switch for large amounts of telecom data coming from the earth. The company has previously demonstrated the ASIC concept in a 0.35 $\mu$m process and a 180 nm process. Now they are moving to the 65 nm process while raising the clock frequency and data frequency to increase the throughput.

Your task is to investigate the reliability of the new chips due to metastability.

- a) Investigate **one** flip-flop in the data path of the processor. Use data from the paper for $T_W$ and $\tau$<sup>1</sup>. Assume that the processor runs at full $V_{DD}$, 1.25 V, but that the temperature can vary (as it does in space). Assume that the clock frequency, $f_c$, is 5 GHz and the data rate, $f_D$, is 200 MHz. If we set the resolution time, $S$, to one clock cycle, what is the resulting MTBF?
- b) There are 4000 flip-flops like the one in task a) in each chip, and a full system can comprise 20 chips. What is the resulting MTBF for the entire system if we assume that all the flip-flops are properly designed?
- c) Some flip-flops in the control structure must fail much less often. Investigate **one** such flip-flop in the processor. Again, use the same data from the paper for $T_W$ and $\tau$. These flip-flops, fortunately, do not switch as often. Assume that $f_c = 100$ MHz and $f_D = 10$ MHz. How long a resolution time for metastability, $S$, is required to have a failure rate of 1 per 10 years, that is, an MTBF of not less than 10 years? Express the resolution time, $S$, in the number of clock periods, $T_c$, that are required.
- d) The control logic should work correctly even if the operating conditions are less than ideal. In space the power supply and temperature may vary. Assume that the control logic should work correctly over the entire temperature range (space can be cold!) and also when $V_{DD}$ is as low as 1 V. Use other data for $\tau$ and $T_W$ from the paper that capture these worse operating conditions. What is the required resolution time then to achieve an MTBF of 10 years? Express this time in the number of clock periods, $T_c$, that are required.

<sup>1</sup> Note that in this context $\tau$ denotes the time constant for metastability resolution – note the time constant for
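The calculations in this task rest on the MTBF expression from the Ginosar tutorial, $\text{MTBF} = e^{S/\tau} / (T_W \cdot f_c \cdot f_D)$. The Python sketch below shows one way to evaluate it and to solve it for $S$. The function names are made up for this illustration, and the values of `tau` and `T_W` are hypothetical placeholders, not data from the Beer paper. The system-level function assumes identical flip-flops that fail independently, so that their failure rates add.

```python
import math


def mtbf_seconds(S, tau, T_W, f_c, f_D):
    """MTBF in seconds for one flip-flop: exp(S / tau) / (T_W * f_c * f_D)."""
    return math.exp(S / tau) / (T_W * f_c * f_D)


def required_resolution_time(mtbf_target, tau, T_W, f_c, f_D):
    """Resolution time S (seconds) that gives exactly the target MTBF (seconds)."""
    return tau * math.log(mtbf_target * T_W * f_c * f_D)


def system_mtbf(mtbf_single, n_flops):
    """MTBF of n identical, independently failing flip-flops (assumption)."""
    return mtbf_single / n_flops


# Hypothetical placeholder values (NOT taken from the paper).
tau = 10e-12   # resolution time constant, s
T_W = 20e-12   # metastability window, s
f_c = 1e9      # clock frequency, Hz
f_D = 1e8      # data rate, Hz

ten_years = 10 * 365 * 24 * 3600                    # target MTBF, s
S = required_resolution_time(ten_years, tau, T_W, f_c, f_D)
cycles = math.ceil(S * f_c)                         # S in whole clock periods
print(S, cycles)
```

Because the MTBF depends exponentially on $S/\tau$, a modest increase in resolution time or a modest change in $\tau$ changes the MTBF by many orders of magnitude.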

#### Task #4

The fundamental insight behind prefix adders is that the logical functions for block generate and block propagate can be computed for smaller blocks and then combined to form all the necessary block generate and propagate signals. In this task you are to investigate this feature further. Figure 3 shows a prefix adder that is not one of the usual ones.



Figure 3. The PG-tree part of an unknown prefix adder. Part of the tree is missing.

- a) Write down the spans for the nine black and six grey PG cells that are numbered in Figure 3. As an example, the span of the black cell above grey cell number 2 is 3:2. See also Figure 11.29 in Weste & Harris.

Note that equations 11.4 in Weste & Harris for forming the group signals recursively can be more generally formulated as:

$$G_{i:j} = G_{i:m} + P_{i:m} \cdot G_{k:j}$$

$$P_{i:j} = P_{i:m} \cdot P_{k:j}$$

for  $(i \geq k \geq m-1 \geq j)$

That is, it is OK for the upper block and the lower block to overlap; when they are recursively combined the result will be correct anyway. For the AND function used in the propagate signal, this property (called idempotency) is easily seen, but for the generate function it is less obvious (at least in my opinion).<sup>2</sup>

<sup>2</sup> For those of you who might be especially interested in the formal proofs for prefix adder functions I have put an article on this subject in the “Extra readings for the interested” folder in PingPong.
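The overlap property can be checked by brute force for small word lengths. The Python sketch below is one way to do this; it is an exhaustive check, not a proof. The helper names `group_pg` and `overlap_rule_holds` are made up for this illustration, and the extra condition $i \geq m$ (a non-empty upper block) is an assumption added here.

```python
from itertools import product


def group_pg(g, p, i, j):
    """Return (G_{i:j}, P_{i:j}), built bit by bit from position j up to i."""
    G, P = g[j], p[j]
    for t in range(j + 1, i + 1):
        G = g[t] | (p[t] & G)
        P = p[t] & P
    return G, P


def overlap_rule_holds(n):
    """Check the generalized combination rule for every n-bit g and p."""
    for bits in product((0, 1), repeat=2 * n):
        g, p = bits[:n], bits[n:]
        for i in range(n):
            for j in range(i):                     # i > j
                for m in range(j + 1, i + 1):      # m - 1 >= j and i >= m
                    for k in range(m - 1, i + 1):  # i >= k >= m - 1
                        G_ij, P_ij = group_pg(g, p, i, j)
                        G_im, P_im = group_pg(g, p, i, m)
                        G_kj, P_kj = group_pg(g, p, k, j)
                        if G_ij != (G_im | (P_im & G_kj)):
                            return False
                        if P_ij != (P_im & P_kj):
                            return False
    return True


print(overlap_rule_holds(4))
```

The case $k = m - 1$ is the ordinary non-overlapping combination; every larger $k$ makes the lower block overlap the upper one.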

**Home assignment #3**  
**Due Monday October 22 2018 @ 23.59 (midnight)**

- b) What is the difference between the black and the grey cells in the PG tree?
- c) What is the function to form one of the sums? A hint is that in the full adder the sum is formed as  $S = A \oplus B \oplus C_{in}$ . (Study section 11.2.2.2 in Weste & Harris if necessary).
- d) If we assume just straight lines in the white space at the bottom of the tree in Figure 3, which of the 32 sums cannot be formed from the tree as shown in Figure 3?
- e) Complete the adder schematics below with grey cells so that all sums can be formed. Any schematic that creates all the necessary signals is considered correct. Draw your solution in the attached bigger schematic. You may find it helpful to write out the spans as is done in Figure 11.29 in Weste & Harris.
- f) If we (incorrectly) assume that all black and grey PG cells have the same propagation delay, which sum(s) will be done last in your solution from task e)? List ALL sums that have this propagation delay. How many PG cells do these longest (critical) paths contain? Ignore the buffers (triangles) shown in Figure 3 when you find these paths. It may be helpful to draw the paths in the attached schematic and submit it with your solution.

---

P,G Setup cells 0-31



Sum cells 0-31

|      |      |      |      |      |      |      |      |      |      |      |     |     |     |     |     |
|------|------|------|------|------|------|------|------|------|------|------|-----|-----|-----|-----|-----|
| 30:0 | 28:0 | 26:0 | 24:0 | 22:0 | 20:0 | 18:0 | 16:0 | 14:0 | 12:0 | 10:0 | 8:0 | 6:0 | 4:0 | 2:0 | 0:0 |
| 31:0 | 29:0 | 27:0 | 25:0 | 23:0 | 21:0 | 19:0 | 17:0 | 15:0 | 13:0 | 11:0 | 9:0 | 7:0 | 5:0 | 3:0 | 1:0 |