Scanning and Loading Large GDSII Files

For users who need to process very large GDSII or OASIS files using QISLIB, the first hurdle is getting the data loaded into the computer's memory. Since layout files are hierarchical, the loading process includes building tables of cells and references. In Artwork's case, we also build a number of quad trees that provide fast access to the data in a given window.

QISLIB requires two passes over the file: an initial database scan, followed by a second read that loads the database and builds the quad trees at the same time. Once both passes are complete, the extraction of polygons can begin.

To provide some estimates of how fast large files can be loaded, we took three of our largest files and scanned and loaded each of them. We also varied some parameters that affect the load time.

 

Scan/Load Timing [i7-3930K @ 3.2 GHz, 32 GB Installed RAM, SSD]

TEST #  FILE NAME  SIZE (GB)  QT    LOAD TO MEM  SCAN (sec)  LOAD (sec)  MEMORY (GB)
1       P8         8.9        256   YES          54          36          2.97
2       P9         20.9       256   YES          108         121         10.57
3       P9X2       42         256   YES          303         639         18.39
4       P9         20.9       1024  YES          151         162         10.76
5       P9X2       42         1024  YES          305         636         18.9
6       P9         20.9       1024  NO           150         116         3.22
7       P9X2       42         1024  NO           305         588         3.93
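
As a rough illustration, the timings above can be converted into throughput figures. The sketch below simply copies the rows from the table and assumes 1 GB = 1000 MB; the names `ROWS` and `summarize` are illustrative and are not part of QISLIB.

```python
# Rows copied from the timing table:
# (test, file, size_gb, qt, load_to_mem, scan_s, load_s, memory_gb)
ROWS = [
    (1, "P8", 8.9, 256, True, 54, 36, 2.97),
    (2, "P9", 20.9, 256, True, 108, 121, 10.57),
    (3, "P9X2", 42.0, 256, True, 303, 639, 18.39),
    (4, "P9", 20.9, 1024, True, 151, 162, 10.76),
    (5, "P9X2", 42.0, 1024, True, 305, 636, 18.9),
    (6, "P9", 20.9, 1024, False, 150, 116, 3.22),
    (7, "P9X2", 42.0, 1024, False, 305, 588, 3.93),
]


def summarize(row):
    """Return total time and throughput for one table row (assumes 1 GB = 1000 MB)."""
    test, name, size_gb, qt, to_mem, scan_s, load_s, mem_gb = row
    total_s = scan_s + load_s
    return {
        "test": test,
        "total_s": total_s,
        "scan_mb_per_s": 1000.0 * size_gb / scan_s,
        "overall_mb_per_s": 1000.0 * size_gb / total_s,
    }


if __name__ == "__main__":
    for row in ROWS:
        s = summarize(row)
        print(
            f"test {s['test']}: total {s['total_s']} s, "
            f"scan {s['scan_mb_per_s']:.0f} MB/s, "
            f"overall {s['overall_mb_per_s']:.0f} MB/s"
        )
```

For test 1, this works out to 90 seconds in total and roughly 165 MB/s during the scan pass.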


NOTES

Disk and CPU Limited

Both Scan and Load are single-threaded processes, so CPU clock speed matters more than the number of available CPU cores. (Unfortunately, processors with more cores tend to run at lower clock speeds.) Further, the initial pass (scan) reads the layout data from disk, so disk I/O is another important factor. One should consider using newer SSD interfaces that are faster than SATA 6 Gb/s.


QT (Quad Tree Count)

This parameter determines how many entities can go into a quad cell before the cell is divided. The default value is 256, but as your files get larger and denser you may wish to increase it. We ran tests with values of both 256 and 1024.
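
To make the effect of this threshold concrete, here is a minimal, generic quad tree sketch. It is not QISLIB's implementation: it stores points rather than layout entities with extents, and the class and function names (`QuadNode`, `build`) are hypothetical. The `capacity` field plays the role of the QT value.

```python
from dataclasses import dataclass, field


@dataclass
class QuadNode:
    x0: float
    y0: float
    x1: float
    y1: float
    capacity: int  # plays the role of the QT value (e.g. 256 or 1024)
    points: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def insert(self, x, y):
        # Interior nodes pass the point down to the matching quadrant.
        if self.children:
            self._child_for(x, y).insert(x, y)
            return
        self.points.append((x, y))
        # Divide the cell once it holds more than `capacity` points.
        if len(self.points) > self.capacity and (self.x1 - self.x0) > 1e-9:
            self._split()

    def _split(self):
        mx = (self.x0 + self.x1) / 2
        my = (self.y0 + self.y1) / 2
        self.children = [
            QuadNode(self.x0, self.y0, mx, my, self.capacity),
            QuadNode(mx, self.y0, self.x1, my, self.capacity),
            QuadNode(self.x0, my, mx, self.y1, self.capacity),
            QuadNode(mx, my, self.x1, self.y1, self.capacity),
        ]
        pts, self.points = self.points, []
        for px, py in pts:
            self._child_for(px, py).insert(px, py)

    def _child_for(self, x, y):
        mx = (self.x0 + self.x1) / 2
        my = (self.y0 + self.y1) / 2
        index = (1 if x >= mx else 0) + (2 if y >= my else 0)
        return self.children[index]

    def leaf_count(self):
        if not self.children:
            return 1
        return sum(child.leaf_count() for child in self.children)


def build(capacity, n=64):
    """Insert an n x n grid of points into a unit-square quad tree."""
    root = QuadNode(0.0, 0.0, 1.0, 1.0, capacity)
    for i in range(n):
        for j in range(n):
            root.insert((i + 0.5) / n, (j + 0.5) / n)
    return root
```

With this uniform 64 x 64 grid of 4096 points, `build(256)` ends up with 16 leaf cells and `build(1024)` with 4. A larger threshold gives fewer, fuller cells, so there is less tree structure to build and store, but more entities to examine in each cell that a window touches.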


Input Layer Filtering

For P8, the input layer filter passes layers 40, 43, 46, 47 and 49. For P9, it passes layers 25, 43, 46, 47 and 49. Loading only the layers that are needed reduces the memory footprint of both the quad tree and the entity data.
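
The idea of an input layer filter can be sketched as follows. This is an illustration only and does not use the QISLIB API; the `(layer, polygon)` tuples, the `filter_layers` function and the sample data are all hypothetical. Only the layer numbers come from the tests described above.

```python
# Layer sets taken from the text; everything else here is illustrative.
P8_LAYERS = frozenset({40, 43, 46, 47, 49})
P9_LAYERS = frozenset({25, 43, 46, 47, 49})


def filter_layers(entities, keep):
    """Yield only the (layer, polygon) pairs whose layer is in `keep`."""
    for layer, polygon in entities:
        if layer in keep:
            yield layer, polygon


# Hypothetical sample entities: (layer number, list of vertices)
sample = [
    (25, [(0, 0), (10, 0), (10, 10), (0, 10)]),
    (40, [(0, 0), (5, 0), (5, 5), (0, 5)]),
    (43, [(2, 2), (4, 2), (4, 4), (2, 4)]),
    (99, [(0, 0), (1, 0), (1, 1), (0, 1)]),
]

kept_p8 = list(filter_layers(sample, P8_LAYERS))
kept_p9 = list(filter_layers(sample, P9_LAYERS))
```

Entities on layers outside the filter are dropped before they reach the quad tree or the entity store, which is why the filter lowers both parts of the footprint.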

Memory Usage

MEMORY usage was determined with the Windows Task Manager by checking memory usage before the program started and again after scan and load had completed. The memory footprint includes the data tables generated by the scan, the quad trees, and the entity data loaded to memory if LOAD TO MEM = YES.


Effect of OS File Caching

These tests were run serially, so the operating system's file caching has some influence on the timings that is beyond our control. For example, one would expect the scan times for tests 2 and 4 to be roughly the same, yet test 2 takes 108 seconds and test 4 takes 151 seconds. The only explanation we have is that the OS handles the file differently from run to run.


LOAD TO MEM

Notice the large difference in memory footprint when we do not use LOAD TO MEM. The quad tree + entity data memory footprint typically ranges from 25% to 45% of the GDSII file size. For a workstation equipped with 192 GB of RAM, one should be able to use the LOAD TO MEM option (for maximum clip extraction performance) for GDSII files up to about 400 GB.
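
The sizing arithmetic behind this estimate can be written out as a small sketch. It uses only the 25% to 45% range quoted above and ignores memory needed by the operating system and by the scan tables, so treat the result as an upper bound; the function names are illustrative.

```python
LOW_FRACTION = 0.25   # lower end of the quoted footprint range
HIGH_FRACTION = 0.45  # upper end of the quoted footprint range


def footprint_range_gb(file_gb):
    """Estimated (low, high) quad tree + entity footprint for a file, in GB."""
    return file_gb * LOW_FRACTION, file_gb * HIGH_FRACTION


def max_file_gb(ram_gb, fraction=HIGH_FRACTION):
    """Largest file whose estimated footprint still fits in `ram_gb`."""
    return ram_gb / fraction


if __name__ == "__main__":
    low, high = footprint_range_gb(42.0)
    print(f"42 GB file: {low:.1f} to {high:.1f} GB in memory")
    print(f"192 GB RAM: files up to about {max_file_gb(192):.0f} GB")
```

With the conservative 45% figure, 192 GB of RAM corresponds to a file of roughly 427 GB, which is consistent with a practical limit of about 400 GB.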

dbload

To reduce the scan/load time, one can produce a "dbload" file, which is essentially a memory map of the quad tree and entity data. Loading this file is much quicker than going through the scan/load process. Very often a failure analysis lab will produce the dbload file offline the night before, saving equipment time the next day, when that time is more precious.
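
The general idea of loading a pre-built binary image through a memory map, rather than rebuilding structures from the source file, can be sketched as below. This does not reflect the actual dbload file format, which is not described here; the record layout (one bounding box stored as four 32-bit integers) and all function names are assumptions made purely for illustration.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-size record: one bounding box as 4 little-endian int32 values.
RECORD = struct.Struct("<4i")


def write_image(path, boxes):
    """Write the pre-built records to disk once (the 'offline' step)."""
    with open(path, "wb") as f:
        for box in boxes:
            f.write(RECORD.pack(*box))


def read_box(path, index):
    """Map the file and decode one record in place, without parsing the rest."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return RECORD.unpack_from(mm, index * RECORD.size)


def demo():
    boxes = [(i, i, i + 10, i + 10) for i in range(1000)]
    fd, path = tempfile.mkstemp(suffix=".img")
    os.close(fd)
    try:
        write_image(path, boxes)
        return read_box(path, 500)
    finally:
        os.remove(path)
```

Because the records are already in their final binary layout, opening the image costs little more than mapping the file; the operating system pages data in as it is touched, and no tables or trees have to be rebuilt.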


