Скачати 46.8 Kb.
A PROGRAM SYSTEM FOR DISTRIBUTED EVENT-DRIVEN LOGIC SIMULATION OF VHDL-DESIGNS
LADYZHENSKY Y.V., POPOFF Y.V.
Ladyzhensky Y.V., Popoff Y.V. A program system for distributed event-driven logic simulation of VHDL-designs. Analysis of synchronization protocols for logic simulations is fulfilled. Algorithms for experimental research of parallel simulations are developed. A system architecture is proposed. Software was implemented and tested in a local network.
Project verification is an important stage in creating of electronic designs. Executing of simulation process is an effective way to verify if an electronic device operates correctly. The purpose of simulation is both a functional project verification and time analysis of device behavior. Simulating systems with high complexity and large parameter spaces require tremendous computational and memory resources. Hardware accelerators process simulation process much faster then usual computers. But hardware accelerators are also much more expensive then usual computers. Accelerating of simulation may also be obtained by implementing a simulation model on a computer network.
Last research analysis has outlined that VHDL is a leader in industrial and academic worlds. It is a standard of The Institute of Electrical and Electronics Engineers for more then 15 years [1, 2]. Nowadays there are many systems for creating projects using VHDL: ModelSim of Model Technology Inc., Active-HDL of Aldec, Foundatein Series of Xilinx and oth. All these systems allow users execute simulation in different levels. But these systems do not allow users to execute distributed simulation of their projects. Therefore simulation process is slow when simulating a large project. Fundamental algorithm review is presented in [3, 4]. Multithreaded, parallel and distributed programming principles are outlined in . An effective way for creating fundamentals of designing distributed logical simulation systems is object-oriented approach described in . Parallel and distributed VHDL-simulation questions are outlined in .
Goals and tasks of the article research include analysis of digital devices simulation methods and showing up peculiarities of a program system for distributed event-driven simulation of VHDL-projects based on experimental data.
1. Digital devices simulation algorithms
Functionality of any digital device may be represented as a set of states and events (fig. 1). A current set of signal values in all circuit nodes represents a system state. Event ei is a signal modification. T(ei) – is a moment of Virtual Time (VT) when ei is scheduled to execute. Executing of simulation now means a processing of events that occurs in an exact order in time and a state modification that corresponds to these events. For state modifications in future, new events are to be scheduled .
Figure 1 – Events and states
Simulation algorithms are distinguished with an order of processing and scheduling of new events. A generalized simulation algorithms classification is presented on fig. 2.
In a sequential simulation model events are processed one by one, definitely in order of increasing T(ei). Events with T(ei)=T(ej) are processed one after another. Accelerating of simulation may be obtained by processing these events concurrently.
In a parallel simulation model some acceleration is achieved due to parallel executing of algorithm parts in presence of common memory. For example, there may be an event processing thread, an event scheduling thread and a thread for sending/receiving messages.
In distributed simulation a model parallelism of a simulated circuit is considered, not of a simulation algorithm. A source circuit is partitioned to execute distributed simulation. Every part of the circuit is simulated on a separate computer (Simulation Processor, SimProc). P(ei) is a processor that processes event ei. If T(ei)=T(ej) and P(ei)<>P(ej), then events ei and ej can be processed simultaneously.
After circuit partitioning there may occur events that are generated in one part but affect on other part of the circuit. To obtain correct simulation results it is necessary to process events in chronological order. Synchronization protocols are used to achieve this.
One of possible ways to synchronize simulation processors is to process events in chronological order globally overall the system. This may be obtained if local virtual time (LVT) of every simulation processor equals global virtual time (GVT). Synchronous protocols follow this idea.
Events ei and ej scheduled at different virtual time moments may occur. Circuit nodes ni and nj are called independent if there are paths neither from ni to nj nor from nj to ni. If circuit nodes ni and nj affected by events ei and ej are independent, then events ei and ej are called concurrent and may be processed concurrently. Synchronous protocols do not allow ei and ej to be processed if T(ei)<>T(ej), even if ei and ej are concurrent.
Figure 2 – Generalized classification
of simulation algorithms
Simultaneous processing of concurrent events scheduled at different virtual time moments is allowed in asynchronous synchronization protocols. Local virtual clocks of every simulation processor keep local virtual time (LVT) that is different from global virtual time (GVT) in asynchronous protocols to allow simultaneous processing of events with different timestamps. LVT is always greater then or equals GVT.
Due to different LVT values on different processors in an asynchronous protocol, causality errors are possible. Causality error may occur, for example, when a SimProc Р1 has advanced it’s LVT to t1 that is in a far future in a comparison to SimProc P2. P2 has advanced it’s LVT to t2
In conservative synchronization protocols simulation processors do not allow straggler events to appear. Conservative protocol allows SimProc to process only “safe” events. Event esafe is safe if there are surely no events estraggler, T(estraggler)
In a conservative model a deadlock (cyclic waiting) is possible. A deadlock occurs if, for example a processor P1 is waiting for events from P2 and P2 is waiting for events from P1. Large cycles are possible also. When a deadlock occurs, no processors can advance their LVT and process more events.
Different approaches for a deadlock resume / avoidance are used in different conservative synchronization protocols. Deadlock resume algorithms allow a deadlock to occur, but provide a mechanism to detect a deadlock and to resume a system from a deadlock. Deadlock avoidance algorithms make deadlocks not possible.
A conservative simulation process speed relies on prediction algorithms heavily. Prediction algorithms determine how far a simulation processor can advance it’s LVT. Virtual time until which simulation processor can advance it’s LVT is called a local virtual time horizon (LVTH). T(esafe)<=LVTH.
Possible shortcoming of a conservative synchronization protocol is a presence of a waiting mode. It may turn out that a simulation processor is waiting even if no straggler events will arrive.
In optimistic synchronization protocols simulation processors do not wait until events become safe and allow straggler events and thus causality errors to occur. Optimistic synchronization protocols provide a mechanism to recover from a causality error. When a straggler event arrives, an optimistic synchronization protocol rolls back to a state in the past, and then allows a simulation processor to process this event. To make a rollback possible, all state variables are saved in a state stack. St is a state of a system at virtual time t. While a rollback, all state variables St, t>=tstraggler can be either removed from a memory at once or stay in a memory. If a state St, t>LVT, is saved in the memory, then it can be used if a straggler event has not produced any real modifications in the state variables.
Larger amount of memory may be required to store a state stack. It is possible to store modified state variables only (a partial copy of state variables) in the state stack to save some memory. But in this case it is necessary to look through all St, T(estraggler)<=t
After a straggler event arrives, all events ei>estraggler are considered invalid. These events include local events and external events. Local events are generated and processed on a same processor. External events are generated and processed on different processors. When an external event is generated, a message is sent to a remote processor. It is necessary to notify remote processors if external events become invalid. This operation is called an event cancellation. An antimessage is sent to cancel an event. There are two fundamental event cancellation algorithms.
In an aggressive cancellation algorithm a processor P1 sends antimessages to a processor P2 for all events scheduled at estraggler<=ei, thus it informs P2 faster about invalid messages but also cancels events that are possibly valid. In a lazy cancellation algorithm a processor P1 sends antimessages to a processor P2 after P1 has advanced it’s LVT and it is became clear that an external event is really invalid: T(estraggler)<=T(ei)<=LVT. A combination of these methods can be used: events T(estraggler)<=T(ei)<=LVT+t are cancelled, t is a parameter of the cancellation aggressiveness, 0<=t<=+. If t=0, this algorithm corresponds to a lazy cancellation algorithm. If t=+, this algorithm corresponds to an agressive calcellation algorithm.
Simulation processors process events scheduled at LVT<=T(ei)<=LVTH+t in combined simulation algorithms. t is a degree of a synchronization protocol optimism, 0<=t<=+. If t=0, this protocol conforms a conservative synchronization protocol. If t=+, this protocol conforms an optimistic synchronization protocol. Another approach to limit model optimism is to process events scheduled at LVT<=T(ei)<=GVT+t. In this case t is a degree of system asynchronousness. When t=0, this synchronization protocol conforms a synchronous model.
2. Architecture of a program system for distributed event-driven logic simulation of VHDL-projects
A program system for distributed event-driven simulation of VHDL-projects is an extension for existing simulation systems. A program system architecture is presented in fig. 3. The input data for the system is a VHDL-project and a set of test sequences. A result of the synthesis is a project description on a behavioral level, register-transfer level or a logical level. This description is now considered to be a circuit. Every element used in the circuit has a description in a corresponding dynamic-link library (DLL). We may use different as a circuit elements: from a single gate to a processor unit.
Figure 3 – Architecture of a program system for distributed event-driven logic simulation of VHDL-projects
Partitioning subsystem partitions a input circuit into several parts, each part will be simulated on a separate computer. A partitioned circuit description with test sequences is passed into a distributed simulation subsystem.
The result of the simulation is a file with information about the simulation process. Cause-and-effect relations diagrams (CER-Diagrams) are built on this information. The information is written in a language of cause-and-effect relations diagrams (CER-Program). CER-program includes signal sequences on the circuit outputs and may additionally include information about internal method calls, about all system state modifications and about all messages sent and received from a network.
CER-programs are used for building waveform diagrams on external and internal circuit nodes, for statistical information analysis and for building CER-diagrams.
The purpose of waveform diagrams (fig. 4) is to research a validity of simulation results.
Figure 4 – Waveform diagram
The purpose of CER-diagrams (fig. 5) is a visual simulation process representation and a research of synchronization protocol behavior features.
Figure 5 – Cause-and-effect relations diagram example
- a list of state variables for every simulation processor. This list depends on a used synchronization protocol. Common variables are signal values in circuit nodes and LVT;
- modifications of these signals in real and virtual time;
- messages transferred over network;
- actions performed by simulation processors. A list of actions depends on a synchronization protocol used. Common actions are an event processing and event generation;
- physical timestamps.
3. Experimental research of a program system for distributed logical event-driven simulation
There are two distributed simulation algorithms implemented in the system:
1) based on a conservative synchronization protocol. Marker is used for deadlock detection/recovery. Null-messages described in  are used for better prediction;
2) based on a combined (with a limited optimism) synchronization protocol. Maximum difference in local and global virtual clocks is used as an optimism limit. Lazy cancellation algorithm is used for an invalid message cancellation.
Experiments were performed on three circuits:
1) a small circut. Graph of processors contains some cycles (fig. 6). Every processor contains only two elements. There is a single connection channel between every pair of processors;
Figure 6 – Graph of processors in experiment 1
2) a medium-size circuit. Graph of processors does not contain cycles (fig. 7). A weight of a processor equals to a number of elements on the processor. A weight of a connection equals number of communication channels between processors. There are 23815 elements and 471 communication channels totally in the system. A circuit from  is used in this experiment;
Figure 7 – Graph of processors in experiment 2
3) the same circuit as in 2) but a graph of processors contains some cycles (fig. 8). It was possible to decrease a total number of communication channels in the system due to cycle presence. Total number of channels is 292 (less by 38% then in the previous experiment).
Figure 8 – Graph of processors in experiment 3
All experiments were performed on P4 2.4 HHz computers, 100 Mbit network adapters, operation system - Windows-98 SE. DirectX 8.0 was used to transfer messages over a network. A single computer corresponds to a single simulation processor.
Simulation time (in seconds) in dependence on a degree of optimism (in VT units) in experiment 1 is presented on fig. 9. Optimistic (a) and conservative (b) synchronization protocols are used. There is no optimism degree in a conservative synchronization protocol, but the line (b) is shown for comparison purposes.
Simulation time has reached it’s minimum value when the optimism degree was 2 VT units in the optimistic synchronization protocol. Simulation time was about 1.5 – 2 minutes up until 30 VT units of the optimism degree. Further increasing of optimism degree caused simulation time to increase very quickly. When the optimism degree was 34 VT units, simulation time was greater then simulation time of the conservative model. When optimism degree was 42 VT units (not shown on the figure), simulation time was greater then 1.5 hours. When optimism degree reached 50 VT units, global virtual time has advanced up until 350 (from total 10,000) in 10 hours of simulation.
Figure 9 – Simulation time on optimism degree dependency in experiment 1
Simulation time with optimism degree of 0 is of a special interest. When optimism degree equals to zero, the optimistic model conforms to a synchronous model. It was required 243 seconds to simulate synchronously, and 291 seconds (greater by 20%) to simulate conservatively. Experiments with synchronous and conservative synchronization protocols were performed repeatedly. Maximum time deviation did not exceed 2 seconds. This occurs because a deadlock detection / recovery mechanism used in conservative synchronization protocol is very slow. It is necessary to pass a marker three times (by cycle among all processors) to detect a deadlock and a single message to recover from a deadlock. These messages are transferred one after another and can not be transferred simultaneously. In the worst case there are 6 messages (from each processor to every other processor) to be transferred between processors in an optimistic model to determine current global virtual time. But in contrast to a conservative model, these messages are sent all together and do not wait a previous message to arrive. Less number of messages may be transferred in the optimistic model if some processors have not advanced their LVTs in the last simulation step.
Simulation time (in seconds) in dependence on a degree of optimism (in VT units) in experiment 2 is shown on fig. 10. Optimistic (a) and conservative (b) synchronization protocols are used. There is no optimism degree in a conservative synchronization protocol, but the line (b) is shown for comparison purposes.
It is clear from the chart on fig. 10 that simulation time is decreasing gradually when optimism degree is low. When an optimism degree is high, simulation time is almost constant but is very unstable. This points out that random delays have high significance when an optimism degree is high. Random delays may occur while transferring a message over a network or while processing of events.
Similarly to the previous case, simulation time was higher for a asynchronous model (an optimistic model with an optimism degree equals to 0) then for a conservative model. Simulation time was 545 seconds for a synchronous model and 566 seconds (by about 4% greater) for a conservative model.
As we can see, simulation time decreased when an optimism degree increased. Optimistic model was better for higher optimism degree. Additional network traffic did not cause simulation time to increase in this case. Additional traffic appears for due to invalid calculations in the optimistic model.
Figure 10 – Simulation time on optimism degree dependency in experiment 2
Average simulation time was less then 400 seconds when the optimistic synchronization protocol was used. Simulation time was 566 seconds for the conservative synchronization protocol. Average acceleration is about 30% when changing a synchronization protocol from conservative to optimistic. This is probably because of the presence of circuit parts on the 3 and 4 processors that are not depend on external inputs connected to the processor 1. While messages are transferred from the processor 1 to processors 3 and 4, processors 3 and 4 process events in the future and send messages to the processor 2. After a rollback on the processor 3, it may turn out that straggler event does not cause values on external output nodes of the processor 3 to be changed, or causes only values on few output nodes to be changed. Processor 3 will send to processor 2 less messages after receiving next message from processor 1 in the optimistic model then in the conservative model. This occurs because lazy cancellation algorithm is used in the optimistic synchronization protocol realization.
Simulation time (in seconds) in dependence on degree of optimism (in VT units) in experiment 3 is shown on fig. 11. Optimistic (a) and conservative (b) synchronization protocols are used.
A very low deceleration in the optimistic model with a very high and a very low optimism degree is visible on fig. 11. In total it is clear that simulation time is unstable and depends on random delays. Random delays occur while transferring messages over a network and while processing events. Simulation time deviation is 10% if to execute simulation several times with the same optimism degree.
Insignificant random delays may cause significant growth of total simulation time by a following scheme. An event e1 has delayed in the network, for example, by 10-3 seconds. A receiver processor has advanced it’s local virtual time by a single unit (computers are very fast). The LVT advancement has caused, for example, 5 events to occur on 5 of 100 external output nodes. 5 messages have been sent to notify remote processors about these new events. A rollback begins after e1 is received. After resimulation it may turn out that 2 of these 5 events are invalid and thus 2 messages sent are invalid. An antimessage is sent for every invalid message. Due to lazy cancellation algorithm is used, antimessages are sent not immediately after e1 is received but after resimulation is processed and it became clear which events are really invalid. As a result 4 additional messages were sent over a network. 2 of these messages will cause receiver processors to process erroneous simulation and, possibly, new invalid messages to occur.
Figure 11 – Simulation time on optimism degree dependency in experiment 3
1. A synchronous model (an optimistic model with optimism degree 0) is faster then conservative model in all experiments. This is because a global state calculation algorithm used in the optimistic protocol works faster then a deadlock detection / recovery algorithm used in the conservative synchronization protocol.
2. An optimistic model was better then a conservative model in all experiments. This is because there were many deadlocks in the conservative model and a deadlock detection / recovery took much time for execution. The only exclusion is the first experiment with a high optimism degree. It turned out that GVT calculation algorithm used in the optimistic synchronization protocol works faster then the deadlock detection / recovery algorithm in the conservative synchronization protocol. This feature may be used in conservative protocols to make a deadlock detection/recovery faster.
3. It is clear from experiment 1 that simulation time increased exceedingly during simulation of small circuits on fast computers. This occurs because many invalid events are generated and invalid messages are sent over network in response to a single arrived invalid message.
4. Simulation time is very unstable if optimistic synchronization protocol with a high optimism degree is used. Simulation time is unstable because it depends on random delays while a message is transferred over a network and while events are processed. This occurs because a small delay in transferring of a message will cause many invalid messages to be generated (as processors are very fast). Every invalid message, in turn, causes new invalid messages to be generated.
5. Deadlocks in the conservative synchronization protocol and rollbacks in the optimistic synchronization protocols appear due to cycles in a graph of processors. But it is clear from experiments that high number of communication channels may cause greater simulation deceleration then additional cycles in the graph of processors.
1. Грушвицкий Р.И., Мурсаев А.Х., Угрюмов Е.П. Проектирование систем на микросхемах программируемой логики. – СПб.: БХВ-Петербург, 2002. – 608 с.: илл.
2. Перельройзен Е.З. Проектируем на VHDL. – М.: СОЛОН-Пресс, 2004. – 448 с.: илл.
3. Ferscha Alois. Parallel and Distributed Simulation of Discrete Event Systems. In Hardbound of Parallel and Distributed Computing. McGraw-Hill, 1995.
4. Chandy K.M., Misra J. Asynchronous Distributed Simulation via a Sequence of Parallel Computations. Communications of the ACM, 24(11): 198-206, November, 1981.
5. Эндрюс Г.Р. Основы многопоточного, параллельного и распределенного программирования: Пер. с англ. – М.: Издательский дом «Вильямс», 2003. – 512 с.: Илл.
6. Ладыженский Ю.В., Попов Ю.В. Объектно-ориентированная модель протоколов синхронизации при распределенном логическом моделировании цифровых устройств // Наукові праці Донецького національного технічного університету. Серія: Обчислювальна техніка та автоматизація. Випуск 64. – Донецьк: Вид-во ДонНТУ, 2003. – 280с. – с. 212-221.
7. Lungeanu D., Shi C. Parallel and Distributed VHDL Simulation. – ACM SIGDA Publications on CD-ROM: Date 2000. - Department of Computer Science, University of Lowa. – p. 658.
8. Brglez F., Bryan D., Kozminski K. Combinational Profiles of Sequential Benchmark Circuits. ISCAS'89 Benchmark Circuits. - Proc. IEEE Int. Symposium on Circuits and Systems. - May 1989. - pp. 1929-1934.
|Abstract Ladyzhensky Y. V., Popoff Y. V. A program system for distributed event-driven logic simulation of vhdl-designs||Abstract Y. V. Ladyzhensky, Y. V. Popoff. A program system for distributed event-driven logic simulation|
|Abstract Y. V. Ladyzhensky, Y. V. Popoff. A program system for distributed event-driven logic simulation||Abstract. The abstract body should be in Times New Roman, 10 pt Italic type. Do not cite references in the abstract. Key words|
|Abstract. The abstract body should be in Times New Roman, 10 pt Italic type. Do not cite references in the abstract. Key words||Abstract. The abstract body should be in Times New Roman, 10 pt Italic type. Do not cite references in the abstract. Key words|
|Харченко ПавлоІгорович Освіта|
Участь в освітньому проекті піраньї кроку, за напрямом «Event». Команда Blue-BloodedPiranhas. Отримав спектр знань з особистісного...
|Abstracts application form to the State abstract database “Ukrainika naukova” and Ukrainian abstract journal “Dzherelo”|
|Основные надписи united system for program documentation. Basic legends|
Постановлением Государственного комитета СССР по стандартам 18 декабря от 1978 г. №3351 срок введения установлен
|Master's thesis theme: "Logical and parametric simulation of the mos of the vlsi circuit" Teacher: Dr. Sci. Tech. Andrjuhin Alexander Ivanovich Absatract «Logical and parametric simulation of the mos of the vlsi circuit»|