Off-Line Processing of Large GDSII Files

This application note explains the use and advantages of off-line processing of very large GDSII files. First, some background is in order: how the GDSII database is constructed, and what steps must be taken to view part or all of it.


The GDSII database is hierarchical: one can define a group of entities once, give the group a name, and then refer to it by name (or "place" it) as many times as needed. The group is called a STRUCTURE, and a reference to a structure is called an SREF in GDSII terminology. Since integrated circuits use the same cell many times, the GDSII database is quite efficient at describing an IC's layout. There is no limit to the number of levels of reference: a STRUCTURE can contain references to other structures, and so on. However, self-referencing is not allowed. GDSII does not impose any order in which structures are defined or referenced; one can refer to a STRUCTURE early in the database and define it later. The result is that an incredibly complex design can be described with a compact and efficient database.
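As a rough illustration of the hierarchy described above, the following Python sketch models structures and their references in memory. The class name `Structure`, the helper `find_cycle` and the cell names are invented for this example and are not part of GDSII or QIS; the geometry is left abstract.

```python
from dataclasses import dataclass, field


@dataclass
class Structure:
    """A named group of entities plus references (SREFs) to other structures."""
    name: str
    shapes: list = field(default_factory=list)  # local geometry, kept abstract here
    srefs: list = field(default_factory=list)   # names of the structures it places


def find_cycle(structures):
    """Return the name of a structure that references itself (directly or
    through other structures), or None if the hierarchy is legal."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {name: WHITE for name in structures}

    def visit(name):
        color[name] = GRAY
        for child in structures[name].srefs:
            if child not in structures:
                raise KeyError(f"{name} references undefined structure {child}")
            if color[child] == GRAY:
                return child          # child is already on the path: a cycle
            if color[child] == WHITE:
                found = visit(child)
                if found:
                    return found
        color[name] = BLACK
        return None

    for name in structures:
        if color[name] == WHITE:
            found = visit(name)
            if found:
                return found
    return None


# TOP refers to NAND2 and INV before either is defined; GDSII permits this order.
lib = {}
lib["TOP"] = Structure("TOP", srefs=["NAND2", "NAND2", "INV"])
lib["NAND2"] = Structure("NAND2", shapes=["poly", "metal1"], srefs=["INV"])
lib["INV"] = Structure("INV", shapes=["poly"])

print(find_cycle(lib))  # None: no structure refers back to itself
```

Because references are stored by name, the order of definition does not matter; the check for self-reference can only be made once every structure has been read.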

[Figure: GDSII hierarchy]

This efficiency comes at a high cost when it is time to display or otherwise process the data, because one cannot extract or calculate which entities are located in a particular part of the design without traversing and exploding the hierarchy. For small designs this is not a serious problem, but for today's large chips the time required to traverse a 10 or 20 GB database and do the exploding (the hierarchy-flattening computations) can be extremely long: minutes or even hours. Panning or zooming through a large database then becomes virtually impossible.
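To make the cost of exploding concrete, here is a minimal Python sketch that flattens a small hypothetical design. It assumes references carry only an (x, y) offset; real SREFs can also carry rotation, mirroring and magnification, which are omitted. The dictionary layout and the name `flatten` are illustrative only.

```python
def flatten(structures, name, origin=(0, 0)):
    """Yield (shape, x, y) for every shape reachable from `name`,
    with the offsets of all enclosing references applied."""
    shapes, srefs = structures[name]
    ox, oy = origin
    for shape, (x, y) in shapes:
        yield shape, ox + x, oy + y
    for child, (dx, dy) in srefs:
        yield from flatten(structures, child, (ox + dx, oy + dy))


# Hypothetical three-level design; each entry is (shapes, srefs).
design = {
    "CELL": ([("rect", (0, 0))], []),
    "ROW": ([], [("CELL", (i * 10, 0)) for i in range(100)]),
    "TOP": ([], [("ROW", (0, j * 10)) for j in range(100)]),
}

flat = list(flatten(design, "TOP"))
stored = sum(len(shapes) + len(srefs) for shapes, srefs in design.values())
print(stored, len(flat))  # 201 stored records explode into 10000 placed shapes
```

Note that finding the handful of shapes inside one small window still requires walking all 10,000 placements, which is why a naive display has to traverse the whole hierarchy for every pan or zoom.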



Scanning the File
To traverse and explode a file as efficiently as possible, QIS first scans the entire file and builds a table of structures (some IC designs have as many as 500,000 structure definitions) and a table of references to each structure (millions of references are not unusual in a large design). From these tables it determines the hierarchy.
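The scan can be pictured with the following Python sketch. It is not the QIS implementation, only an illustration of one pass over the GDSII stream format, in which every record starts with a 2-byte big-endian length, a 1-byte record type and a 1-byte data type. The function names `scan` and `record` are invented here, and the demo bytes are synthetic (their data-type byte is left at 0 because the scanner ignores it).

```python
import io
import struct
from collections import defaultdict

# GDSII stream record types used by this sketch.
ENDLIB, BGNSTR, STRNAME, ENDSTR = 0x04, 0x05, 0x06, 0x07
SREF, ENDEL, SNAME = 0x0A, 0x11, 0x12


def scan(stream):
    """One pass over a GDSII stream. Returns
    ({structure name: file offset of its BGNSTR record},
     {referenced name: [names of the structures that reference it]})."""
    offsets, refs = {}, defaultdict(list)
    current, start = None, None
    while True:
        pos = stream.tell()
        header = stream.read(4)
        if len(header) < 4:
            break
        length, rectype, _datatype = struct.unpack(">HBB", header)
        if length < 4:
            break  # trailing padding or a malformed record
        payload = stream.read(length - 4)
        if rectype == BGNSTR:
            start = pos
        elif rectype == STRNAME:
            current = payload.rstrip(b"\0").decode("ascii")
            offsets[current] = start
        elif rectype == SNAME:  # name carried by an SREF or AREF element
            refs[payload.rstrip(b"\0").decode("ascii")].append(current)
        elif rectype == ENDSTR:
            current = None
        elif rectype == ENDLIB:
            break
    return offsets, dict(refs)


def record(rectype, payload=b""):
    """Build one synthetic record for the demo (payload padded to even length)."""
    if len(payload) % 2:
        payload += b"\0"
    return struct.pack(">HBB", len(payload) + 4, rectype, 0) + payload


data = (
    record(BGNSTR, bytes(24)) + record(STRNAME, b"TOP")
    + record(SREF) + record(SNAME, b"INV") + record(ENDEL)
    + record(ENDSTR)
    + record(BGNSTR, bytes(24)) + record(STRNAME, b"INV") + record(ENDSTR)
    + record(ENDLIB)
)
offsets, refs = scan(io.BytesIO(data))
print(offsets, refs)
```

Storing the file offset of each structure is what later allows a traversal to seek directly to a definition rather than rereading the file from the start.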

The scan module normally places the scan-tree results directly into memory.

How long does a scan take? This depends greatly on the size of the GDSII file, the number of structures, the number of references and the CPU speed of the computer. For large GDSII files, say 10 GB or more, a scan can take 20 minutes on a powerful workstation. The results of the scan must be held in memory, since this information is used repeatedly to traverse the GDSII file.



Building a Quad Tree

When GDSII files were reasonably sized, one could load the entire file into memory and traverse it very quickly to pull out the entities to display in any particular region. Today's GDSII files are much too large to load into memory, and caching doesn't help since one generally must traverse the entire file for each new display. With the GDSII file left on disk, access is 100 to 1000 times slower than from memory, and traversing a 10 GB file becomes impossibly slow. What to do?


One builds a "quad tree" which is a sophisticated database that organizes data by its location. Quad trees are very efficient for "flat" data (such as star maps) but building a quad tree for hierarchical data is much more complex and involves very complicated tradeoffs between quad tree size, efficiency and time-to-build.

The quad tree must be built and the results stored in memory.

The quad tree "points" to an entity's location on the disk. When one is zoomed into a small region, the quad tree greatly narrows down which entities have to be pulled off of disk thus reducing the effects of slow disk I/O.
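The idea of a spatial index that points to disk locations can be sketched in Python as follows. This is a plain quad tree over flat data, the easy case; it makes no attempt at the hierarchical tradeoffs mentioned above and is not how QIS is implemented. The class name, the capacity and depth limits, and the 64-byte record size in the demo are all assumptions made for illustration.

```python
class QuadTree:
    """Quad tree mapping bounding boxes to disk offsets (flat data only)."""

    def __init__(self, bounds, capacity=4, depth=0, max_depth=8):
        self.bounds = bounds  # (xmin, ymin, xmax, ymax)
        self.capacity = capacity
        self.depth, self.max_depth = depth, max_depth
        self.items = []       # (bbox, offset) pairs held at this node
        self.children = None

    @staticmethod
    def _overlaps(a, b):
        return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

    @staticmethod
    def _contains(outer, inner):
        return (outer[0] <= inner[0] and outer[1] <= inner[1]
                and inner[2] <= outer[2] and inner[3] <= outer[3])

    def _split(self):
        x0, y0, x1, y1 = self.bounds
        xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
        quads = [(x0, y0, xm, ym), (xm, y0, x1, ym),
                 (x0, ym, xm, y1), (xm, ym, x1, y1)]
        self.children = [QuadTree(q, self.capacity, self.depth + 1, self.max_depth)
                         for q in quads]
        items, self.items = self.items, []
        for bbox, offset in items:  # push existing items down where possible
            self.insert(bbox, offset)

    def insert(self, bbox, offset):
        if self.children is not None:
            for child in self.children:
                if self._contains(child.bounds, bbox):
                    child.insert(bbox, offset)
                    return
            self.items.append((bbox, offset))  # straddles a split line: keep here
            return
        self.items.append((bbox, offset))
        if len(self.items) > self.capacity and self.depth < self.max_depth:
            self._split()

    def query(self, window):
        """Return the offsets of all items whose bounding box overlaps `window`."""
        hits = [off for bbox, off in self.items if self._overlaps(bbox, window)]
        if self.children is not None:
            for child in self.children:
                if self._overlaps(child.bounds, window):
                    hits.extend(child.query(window))
        return hits


# 10,000 hypothetical entities on a 10-unit grid, each 64 bytes apart on disk.
tree = QuadTree((0, 0, 1000, 1000))
for i in range(100):
    for j in range(100):
        x, y = i * 10, j * 10
        tree.insert((x, y, x + 8, y + 8), offset=(i * 100 + j) * 64)

hits = tree.query((0, 0, 15, 15))
print(len(hits), "of 10000 entities need to be read from disk")
```

A query for a small window descends only into the quadrants that overlap it, so the number of disk reads tracks the size of the view rather than the size of the file.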

Building the quad tree is compute-intensive and can take 60 minutes for large, multi-gigabyte files.

Nevertheless, for very large GDSII files we have found the quad tree to be the best way to go.



Initial Display Delay

Since no part of the layout can be displayed before the scan and the quad tree build are complete, a user might wait 60 or even 90 minutes for the initial view of a very large file. No user will be happy about this, and in some cases, for example when the user is sitting in front of a focused ion beam (FIB) system that costs $600/hr to run, such a delay is totally unacceptable.

Off-line Processing to the Rescue

Imagine that the night before you need to view a GDSII file, you instruct QIS to run the scan and quad tree build on it and to store the results in a disk file instead of in memory. When you come in the next morning and open the file, QIS doesn't scan and build a new quad tree; it merely reads the previous night's results directly into memory and starts displaying the data. The display starts drawing almost instantaneously.
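The time-shifting pattern itself is simple, and can be sketched in Python as below. The on-disk format QIS uses is not described in this note, so the sketch just pickles a stand-in index alongside the size and modification time of the source file; `load_or_build_index`, `slow_build` and the `.idx` suffix are hypothetical names chosen for the example.

```python
import os
import pickle
import tempfile


def load_or_build_index(gds_path, build, cache_path=None):
    """Return (index, was_cached). The index is rebuilt only when the cache
    file is missing, unreadable, or older than the GDSII file it describes."""
    cache_path = cache_path or gds_path + ".idx"
    stat = os.stat(gds_path)
    stamp = (stat.st_size, stat.st_mtime_ns)
    try:
        with open(cache_path, "rb") as fh:
            saved_stamp, index = pickle.load(fh)
        if saved_stamp == stamp:
            return index, True
    except (OSError, pickle.UnpicklingError, EOFError, ValueError):
        pass  # no usable cache: fall through and rebuild
    index = build(gds_path)  # the slow scan and quad tree build
    tmp = cache_path + ".tmp"
    with open(tmp, "wb") as fh:
        pickle.dump((stamp, index), fh, protocol=pickle.HIGHEST_PROTOCOL)
    os.replace(tmp, cache_path)  # swap in place so no partial file is ever read
    return index, False


calls = []


def slow_build(path):
    """Stand-in for the real scan and quad tree build."""
    calls.append(path)
    return {"structures": 3, "source": os.path.basename(path)}


with tempfile.TemporaryDirectory() as tmpdir:
    gds = os.path.join(tmpdir, "chip.gds")
    with open(gds, "wb") as fh:
        fh.write(b"\0" * 16)
    first = load_or_build_index(gds, slow_build)   # the "overnight" run
    second = load_or_build_index(gds, slow_build)  # the "next morning" open

print(first[1], second[1], len(calls))
```

Recording the file size and modification time with the saved results guards against opening stale results after the GDSII file has been regenerated. Pickle files should only be loaded when they come from a trusted source, such as your own overnight run.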

QIS does the same amount of computation in the same time (or even slightly longer), but since it does so off-line we don't consider that time very expensive. As long as we are able to time-shift the scanning and quad tree building, off-line processing can be of great value to the busy engineer.


