Knowledge Base: Logic Synthesis Using Synopsys Design Compiler

The basic flow of using any logic synthesis tool would be:

Part I

Setup search path and design library
Setup technology libraries
Read RTL files
Link design
Check design quality
Define design environment

Part II

Define system interface
Setup design constraints and goals

Part III

Compile the design
Analyze results and generate reports
Write out the netlist and associated files

Part II

Define system interface

The external logic that drives or receives signals from our design is unknown to the tool. This information must be provided to accurately model the system-level environment in which the design will operate. It consists of:

Input drive strength - Drive characteristics for input ports
Capacitive load - Loads on input and output ports
Output Fanout load - Fanout loads on output ports

Input Drive Strength: By default, the tool assumes zero drive resistance on input ports, meaning infinite drive strength. To avoid this unrealistic assumption, a driving cell can be specified on the input ports, which causes the tool to calculate the actual transition time as though the specified library cell were driving the port.

This can be done by using set_driving_cell command:

set_driving_cell -library ex28nm_ss0p95v125c -lib_cell NBUFFX8 -pin Y [get_ports {in1}]

If for some reason the input port drive capability can't be modeled with a cell in the technology library, you can use the set_drive and set_input_transition commands. Together these commands can represent the drive resistance of the top-level ports; however, they are not as accurate for non-linear delay models as set_driving_cell.
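As a rough sketch of that fallback, the two commands might be applied as follows. The resistance and transition values are illustrative assumptions, not taken from any library, and their units follow those of the loaded technology library.

```tcl
# Hypothetical values: model in1 as driven through a fixed resistance
# with a fixed transition time, instead of through a library cell.
set_drive 1.5 [get_ports {in1}]
set_input_transition 0.2 [get_ports {in1}]
```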

Capacitive Load: By default, the tool assumes that the external load on the ports is zero. To avoid this unrealistic assumption, the total capacitance driven by the output cells has to be estimated accurately. This helps the tool select the correct cell drive strength for an output pad.

External capacitive load can be specified using set_load command:

set_load 1.5 [get_ports {out1}]
set_load [load_of ex28nm_ss0p95v125c/NBUFFX8/A] [get_ports {out1}]

Output Fanout Load: This is a unitless value that represents a numerical contribution to the total fanout. The tool uses fanout load primarily to measure the fanout presented by each input pin. An input pin normally has a fanout load of 1, but it can have a higher value.

The fanout load is set using set_fanout_load command:

set_fanout_load 4 [get_ports {out1}]

The tool will try to make sure that the sum of the fanout load on the output port and the fanout loads of the cells connected to the output port's driver stays within the maximum fanout limit set by the library, the library cell and the design.
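Stated as an inequality, the check described above looks roughly like this. Here F_ext is the value given to set_fanout_load, the f_i are the fanout_load attributes of the internal pins on the same net, and F_max is the applicable maximum fanout limit; the symbols are notation introduced here for illustration.

```latex
F_{\text{ext}} + \sum_{i} f_i \le F_{\max}
```

For example, with set_fanout_load 4 on out1 and an assumed limit of 6, the driver of out1 could feed at most 2 more units of internal fanout load.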

Setup design constraints and goals

Design constraints provide a mechanism for giving hints to the tool, so that it can choose among the available optimization paths based on timing, area and power specifications.

Timing Constraints: To accurately set up timing constraints, the tool needs to know the following:

Clocks
I/O timing requirements
Combinational path delay requirements
Exceptions

Clocks: Accurately specifying the clocks and the relationships among them is one of the most important tasks. An incorrect specification can lead to silicon failure.

Clock Definition: The clock definition includes information such as source, period, duty cycle, skew and clock name, among others. The create_clock command is used to define a clock, and create_generated_clock is used to define an internally generated clock.

create_clock -name SYS_CLK -period 5 [get_ports {CLK}]
create_generated_clock -name SYS_CLK_DIV_2X -source [get_ports {CLK}] -divide_by 2 [get_pins {FD_DIV/Y}]

Clock Latency and Skew: By default, the tool assumes clock networks to be ideal. To avoid unrealistic assumptions, clock latency and skew have to be estimated.

Clock latency is the propagation time from the actual clock origin to the register clock pin. This can be divided into two broader categories.

Source Latency: This refers to the latency from the actual clock origin to the clock definition point.
Network Latency: This refers to the latency from the clock definition point to the register clock pin.

These can be provided using set_clock_latency command:

set_clock_latency -source 1 [get_clocks {SYS_CLK}]

The -source switch specifies source latency.

set_clock_latency 2 [get_clocks {SYS_CLK}]

If the -source switch is omitted, the value is treated as network latency.
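Taken together, the two commands above place the ideal clock edge at the register clock pins 1 + 2 = 3 time units after the clock origin. A slightly fuller sketch, assuming the design needs separate worst-case and best-case network estimates, could use the -max and -min options; the values are hypothetical.

```tcl
# Source latency: clock origin to the clock definition point
set_clock_latency -source 1 [get_clocks {SYS_CLK}]
# Network latency: assumed worst-case and best-case estimates
set_clock_latency -max 2.0 [get_clocks {SYS_CLK}]
set_clock_latency -min 1.6 [get_clocks {SYS_CLK}]
```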

Uncertainty accounts for varying delays between the clock network branches. There are two types of uncertainty: simple and inter-clock. Simple uncertainty applies the setup and hold uncertainty to all paths captured by the specified clock. Inter-clock uncertainty allows different skew values to be specified between clock domains. These values can be over-estimated to provide additional margin for setup and hold.

Setup uncertainty should include both jitter and network skew, because the check is made between two different clock edges. Hold uncertainty need only cover network skew, because launch and capture use the same clock edge, so jitter on that edge is common to both. Uncertainty is specified using the set_clock_uncertainty command:

set_clock_uncertainty -setup 0.250 [get_clocks {SYS_CLK}]
set_clock_uncertainty -hold 0.040 [get_clocks {SYS_CLK}]
# Inter-clock uncertainty needs a value; 0.300 is an illustrative figure
set_clock_uncertainty 0.300 -from [get_clocks {SYS_CLK}] -to [get_clocks {PHI_CLK}]
set_clock_uncertainty 0.300 -from [get_clocks {PHI_CLK}] -to [get_clocks {SYS_CLK}]
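To see how setup uncertainty eats into the cycle, consider the 5-unit SYS_CLK defined earlier with its 0.250 setup uncertainty. A simplified budget, ignoring launch/capture latency differences and writing t_su for the register setup time from the library (a symbol introduced here), would be:

```latex
t_{\text{data,max}} \le T_{\text{clk}} - U_{\text{setup}} - t_{su} = 5 - 0.250 - t_{su} = 4.750 - t_{su}
```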

Once clock tree synthesis has been performed, the actual network skew data is available. At that point, it is recommended to reduce the uncertainty values so that they cover only jitter variations, and to instruct the tool to compute the network skew itself. The set_propagated_clock command specifies that delays are to be propagated through the clock network to determine the latency at register clock pins. This command should be used for post-route analysis.

set_propagated_clock [get_clocks {SYS_CLK}]

The clock definition, latency and skew information can be queried using:

report_clock
report_clock -skew

I/O timing requirements: The input and output ports need to be constrained properly in order to accurately constrain input-to-register and register-to-output delays. By default, the tool assumes the external delays to be 0, which may be highly unrealistic.

set_input_delay command specifies how much time is used by the external logic. The tool will then calculate the available time left for the internal logic.

set_input_delay -max 2.5 -clock [get_clocks {SYS_CLK}] [get_ports {in1}]

set_output_delay command specifies how much time the external logic would need. The tool will then calculate the available time left for the internal logic.

set_output_delay -max 2.5 -clock [get_clocks {SYS_CLK}] [get_ports {out1}]
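The examples above constrain only the maximum (setup) side. A hedged sketch of a fuller port constraint might add -min values for hold analysis; the 0.5 minimum external delays below are assumptions made for illustration.

```tcl
# With a 5-unit SYS_CLK, a 2.5 max input delay leaves 5 - 2.5 = 2.5
# for the internal logic, before uncertainty and register setup time.
set_input_delay  -max 2.5 -clock [get_clocks {SYS_CLK}] [get_ports {in1}]
set_input_delay  -min 0.5 -clock [get_clocks {SYS_CLK}] [get_ports {in1}]   ;# assumed value
set_output_delay -max 2.5 -clock [get_clocks {SYS_CLK}] [get_ports {out1}]
set_output_delay -min 0.5 -clock [get_clocks {SYS_CLK}] [get_ports {out1}]  ;# assumed value
```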

Combinational path delay requirements: The combinational path delay can be constrained using either of set_max_delay/set_min_delay or set_input_delay/set_output_delay commands.

The set_max_delay command allows you to specify the maximum path delay from any startpoint to any endpoint. The tool will try to keep the path delay below the specified value.

set_max_delay 2.5 -from [get_ports {in1}] -to [get_ports {out1}]

The set_min_delay command allows you to specify the minimum path delay from any startpoint to any endpoint. If a path is faster than the specified value, the tool will try to add delay so that the path delay is at least that value.

set_min_delay 1 -from [get_ports {in1}] -to [get_ports {out1}]

Alternatively, if set_input_delay and set_output_delay are specified on input and output ports respectively, and there is a combinational path between such an input-output pair, then the combinational path is automatically constrained to:

Clock_Period - [(set_input_delay) + (set_output_delay)]
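Plugging in the numbers used earlier in this section (a 5-unit SYS_CLK with 2.5 of input delay on in1 and 2.5 of output delay on out1) gives a quick sanity check. The symbols are shorthand introduced here.

```latex
t_{\text{comb}} \le T_{\text{clk}} - (t_{\text{in}} + t_{\text{out}}) = 5 - (2.5 + 2.5) = 0
```

So a direct combinational path from in1 to out1 would have no timing budget with those example values. In practice the external delays would have to be smaller, or the path constrained separately.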

To report the maximum and minimum delay constraints:

report_timing_requirements

Exceptions: Almost every design has exceptions. Exceptions can be false paths or multicycle paths.

The set_false_path command is used to instruct the tool to ignore timing on certain paths.

CLKA and CLKB are asynchronous clocks. A path launched from Block_A by CLKA and sampled in Block_B in the CLKB domain needs to be declared false. [Note: An appropriate CDC solution needs to be implemented for transfers between asynchronous domains before any paths are declared false.]

set_false_path -from [get_clocks {CLKA}] -to [get_clocks {CLKB}]
set_false_path -from [get_clocks {CLKB}] -to [get_clocks {CLKA}]

The set_false_path command doesn't convey why a particular path was declared false. The set_clock_groups command is a better way to express this relationship between clocks.

When defining the relationship between two or more clocks, their exact topology and coexistence can be stated precisely using the set_clock_groups command.

If two clocks are asynchronous, they have no phase relationship at all. So, instead of using definite timing windows based on arrival times, skew, etc., the tool uses an infinite timing window when calculating aggressors and victims; therefore you will see the maximum SI impact.

set_clock_groups -asynchronous -group {CLKA} -group {CLKB}

If two clocks are logically exclusive, timing paths between these clock domains are false, but both clocks can exist in the design at the same time, so SI interactions between paths in these domains should be considered. Crosstalk analysis is done with regular timing windows based on arrival times, skew, etc.

set_clock_groups -logically_exclusive -group {CLKA} -group {CLKB}

If two clocks are physically exclusive, timing paths between these clock domains are false, but only one clock can exist in the design at the same time, so there should be no SI victim/aggressor interaction at all between nets clocked by physically exclusive clocks.

set_clock_groups -physically_exclusive -group {CLKA} -group {CLKB}
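Each -group can list several clocks, which is handy when a generated clock belongs to the same domain as its master. As a sketch, assuming a hypothetical generated clock CLKA_DIV2 derived from CLKA and a group name chosen here for illustration:

```tcl
# CLKA and its (hypothetical) divided clock stay synchronous to each other,
# while both are asynchronous to CLKB.
set_clock_groups -asynchronous -name ASYNC_A_B \
    -group {CLKA CLKA_DIV2} \
    -group {CLKB}
```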

The set_multicycle_path command tells the tool to relax the timing of paths that require more than a single clock cycle.

For instance, the adder highlighted in the diagram above would take 6 clock cycles to finish. To constrain this using a multicycle path:

create_clock -name clk -period 10 [get_ports {clk}]
set_multicycle_path 6 -setup -to [get_pins {c_reg[*]/D}]

One caveat with a multicycle path is that it also moves the hold analysis edge, which needs to be restored to its original location. This can be done using:

set_multicycle_path 5 -hold -to [get_pins {c_reg[*]/D}]
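The usual pattern is a setup multiplier of N paired with a hold multiplier of N - 1. A small sketch that keeps the two in step, using a Tcl variable n introduced here for illustration:

```tcl
set n 6   ;# number of cycles allowed for the path
# Braces keep Tcl from treating [*] as command substitution
set_multicycle_path $n -setup -to [get_pins {c_reg[*]/D}]
set_multicycle_path [expr {$n - 1}] -hold -to [get_pins {c_reg[*]/D}]
```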

Area: The tool will perform minimal area optimizations unless an area constraint is set. This can be done using set_max_area command:

set_max_area 1000

To get minimal area, the user can set the maximum area constraint to 0. However, this is not recommended as it often increases the runtime significantly.

A good trade-off between runtime and quality of results is to set the maximum area value to around 90% of the minimum area. The design's minimum area can be found by running simple compile mode with no clock or timing constraints:

set simple_compile_mode true;
compile
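Once the minimum area is known, the 90% target can be computed rather than typed by hand. In this sketch min_area is a hypothetical figure, assumed to have been read off the area report of the unconstrained run:

```tcl
set min_area 1200.0                      ;# assumed figure from report_area
set_max_area [expr {0.9 * $min_area}]    ;# 90% of the assumed minimum area
```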

Design Rules: The technology library vendor imposes design rules that restrict how cells are connected to one another, based on capacitance, transition and fanout. These can be made more restrictive using the set_max_capacitance, set_max_transition and set_max_fanout commands; where a library limit and a user limit both apply, the tool honors the tighter one.

The set_max_capacitance command specifies the maximum capacitance allowed on an object, and the tool will try to keep the capacitance of each net below the specified value. The maximum value for a net is the smallest of the maximum capacitance values of the cell pins and design ports on that net.

set_max_capacitance 1.0 [get_ports {out2}]

The set_max_transition command specifies the maximum transition time allowed on a design object.

set_max_transition 1.0 [get_ports {in2}]

The set_max_fanout command instructs the tool to ensure that the sum of the fanout load attributes for input pins or nets driven by specific ports or all nets in the specified design is less than the given value.

The fanout load value doesn't represent capacitance; it represents the weighted numerical contribution to the total fanout load. Let us take an example of two inverters INV1X2 and INV1X16. The fanout load attribute for each of these can be found by using get_attribute command:

get_attribute ex28nm_ss1p08v125c/INV1X2/A fanout_load
get_attribute ex28nm_ss1p08v125c/INV1X16/A fanout_load

If the above commands return 0.25 and 3.00 respectively, and we have set the max fanout constraint as:

set_max_fanout 6 [get_ports {in1}]

Then the tool can load port in1 with 6/0.25 = 24 INV1X2 cells, but with only 6/3 = 2 INV1X16 cells.
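The same bookkeeping extends to a mix of cells. Writing n_2 and n_16 for the number of INV1X2 and INV1X16 loads on in1 (notation introduced here), and using the fanout_load values assumed above:

```latex
0.25\,n_{2} + 3.00\,n_{16} \le 6
```

For instance, one INV1X16 and twelve INV1X2 cells give 3.00 + 12 × 0.25 = 6.00, which just meets the limit.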

Om Prakash Hari