Multi-threaded Exploder Yields Much Faster Clip Extraction
July 28, 2016
Steve DiBartolomeo
Applications Manager
Artwork's QISLIB is often used to extract large numbers of small bitmap clips from GDSII layout data for analysis and for comparison with data acquired from inspection equipment. For large GDSII files, the rate at which clips can be extracted has been limited by the "explosion", that is, the traversal through the GDSII database. The bigger and more complex the file, the more this step becomes the bottleneck when compared to our rasterizer.
New Multi-Threaded Exploder
Artwork has been working on a multi-threaded exploder for several years. We now have working code and recently mated the exploder to our latest rasterizer using a highly optimized queue scheduler. The first results are most encouraging.
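The article says only that the exploder and rasterizer are joined by "a highly optimized queue scheduler". As a rough illustration of that general shape, and not of Artwork's actual scheduler, the sketch below runs several exploder workers that feed one rasterizer through a bounded queue. Every name here (`explode_window`, `rasterize`, `run_pipeline`) and the fake geometry are hypothetical stand-ins; none of this is QISLIB's API.

```python
import queue
import threading


def explode_window(window):
    """Hypothetical stand-in for the exploder: return the polygons in a window."""
    x, y, size = window
    # One fake rectangle per window, stored as (xmin, ymin, xmax, ymax).
    return [(x, y, x + size, y + size)]


def rasterize(polygons):
    """Hypothetical stand-in for the RIP: return the total rectangle area."""
    return sum((x1 - x0) * (y1 - y0) for x0, y0, x1, y1 in polygons)


def run_pipeline(windows, n_exploders=4):
    work = queue.Queue()              # windows waiting to be exploded
    ready = queue.Queue(maxsize=64)   # exploded polygons waiting for the RIP
    results = {}

    def exploder():
        while True:
            window = work.get()
            if window is None:        # sentinel: no more windows
                break
            ready.put((window, explode_window(window)))

    def rasterizer():
        while True:
            item = ready.get()
            if item is None:          # sentinel: all exploders have finished
                break
            window, polygons = item
            results[window] = rasterize(polygons)

    for w in windows:
        work.put(w)
    for _ in range(n_exploders):
        work.put(None)                # one sentinel per exploder thread

    exploders = [threading.Thread(target=exploder) for _ in range(n_exploders)]
    rip = threading.Thread(target=rasterizer)
    for t in exploders:
        t.start()
    rip.start()
    for t in exploders:
        t.join()
    ready.put(None)                   # tell the rasterizer to stop
    rip.join()
    return results
```

The bounded `ready` queue keeps the exploders from running arbitrarily far ahead of the rasterizer. In standard CPython the global interpreter lock prevents these threads from running Python code in parallel, so the sketch shows only the structure, not the speedup.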
The Test File
Here are the file properties for P9.GDS, the one we used for our tests:
Parameter | Value |
File Size | 20 GB |
Extents | 9.2 x 9.1 mm |
Cell Definitions | 34,966 |
SREFs (records) | ~123M |
AREFs (records) | ~670K |
Boundaries (records) | ~105M |
Paths (records) | ~158M |
Layers | 187 |
Polygon Density (polygons/um2) | ~68 |
Single-Threaded Exploder
Window Size (um) | Number of Windows | Polygon Extraction plus RIP (secs) | Explosion Rate (M Polys/sec) | Explosion Rate (M Vertices/sec) |
100 x 100 | 625 | 207 | 1.000 | 5.42 |
31.25 x 31.25 | 6,400 | 220 | 0.995 | 5.41 |
10 x 10 | 62,500 | 275 | 0.825 | 4.5 |
Multi-Threaded Exploder
Window Size (um) | Number of Windows | Polygon Extraction plus RIP (secs) | Explosion Rate (M Polys/sec) | Explosion Rate (M Vertices/sec) |
100 x 100 | 625 | 27.7 | 7.55 | 40.8 |
31.25 x 31.25 | 6,400 | 25.6 | 8.27 | 44.8 |
10 x 10 | 62,500 | 29.6 | 7.42 | 40.6 |
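As a quick sanity check on the two result tables, the sketch below computes the total area covered by each run and the polygon count implied by each row. It assumes the explosion rate column is simply polygons divided by the elapsed time in the same row; the article does not state this, so treat the implied counts as approximate.

```python
# Each row: (window edge in um, number of windows, seconds, M polys/sec),
# copied from the single-threaded and multi-threaded tables.
SINGLE = [(100.0, 625, 207.0, 1.000),
          (31.25, 6400, 220.0, 0.995),
          (10.0, 62500, 275.0, 0.825)]
MULTI = [(100.0, 625, 27.7, 7.55),
         (31.25, 6400, 25.6, 8.27),
         (10.0, 62500, 29.6, 7.42)]


def covered_area_mm2(edge_um, count):
    """Total area of `count` square windows, converted from um^2 to mm^2."""
    return edge_um * edge_um * count / 1e6


def implied_polygons_millions(seconds, rate_m_per_sec):
    """Polygons processed, in millions, ASSUMING rate = polygons / seconds."""
    return seconds * rate_m_per_sec


for label, rows in (("single", SINGLE), ("multi", MULTI)):
    for edge, count, secs, rate in rows:
        area = covered_area_mm2(edge, count)
        polys = implied_polygons_millions(secs, rate)
        print(f"{label:6s} {edge:6.2f} um: {area:.2f} mm^2, ~{polys:.0f}M polygons")
```

All three window sizes cover the same 6.25 mm2 in total, so the runs are comparable in area, and the implied polygon counts fall between roughly 207M and 227M.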
How Much Faster?
Across the window sizes tested (10 x 10 um to 100 x 100 um), the multi-threaded exploder is roughly 7.5X to 9.3X faster, or about 8X overall.
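The speedup for each window size follows directly from the elapsed times in the two tables. A minimal sketch of that arithmetic (the dictionary and function names are just for illustration):

```python
# Elapsed seconds (single-threaded, multi-threaded) per window size,
# taken from the "Polygon Extraction plus RIP (secs)" columns.
TIMINGS = {
    "100 x 100": (207.0, 27.7),
    "31.25 x 31.25": (220.0, 25.6),
    "10 x 10": (275.0, 29.6),
}


def speedup(single_secs, multi_secs):
    """Ratio of single-threaded to multi-threaded elapsed time."""
    return single_secs / multi_secs


for window, (single_secs, multi_secs) in TIMINGS.items():
    print(f"{window:>14s} um: {speedup(single_secs, multi_secs):.2f}X")
# Prints 7.47X, 8.59X and 9.29X for the three window sizes.
```

The gain is largest for the smallest windows, where the single-threaded run was slowest.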
Caveats, Conditions and Restrictions
You knew there would be fine print.
We used a 6-core Intel i7 with hyperthreading (12 hardware threads). More cores would no doubt achieve better results, but we do not yet know how speed scales with core count. We are also not sure that hyperthreading actually improves throughput; more tests will be run.
The polygons were rasterized by our latest RIP, but the resulting bitmaps were only written to memory; nothing further was done with them.
The improvements reported apply to "small" bitmap clips. If the extraction window were large (say 1/10 of the chip area or larger), the bottleneck would move from data explosion to rasterization, so a multi-threaded exploder would produce much less benefit, if any.
All data is read from and written to RAM. If the GDSII data were on disk, multi-threading would be of no use because disk I/O would be the defining bottleneck.
These multi-threaded results apply only when the exploder is used in conjunction with Artwork's ACSRasterLib and an optimized queue scheduler. They do not apply when extracting clips as GDSII rather than as bitmaps.
These results are currently valid only for GDSII input data, not for OASIS files.
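On the first caveat above (6 cores, 12 hardware threads), one way to put the measured speedups in context is parallel efficiency, meaning speedup divided by the number of workers. The sketch below computes it against both the physical core count and the hardware thread count. The article does not say how many worker threads were actually used, so both divisors are assumptions.

```python
# Speedups derived from the elapsed times in the two result tables.
SPEEDUPS = {
    "100 x 100": 207.0 / 27.7,
    "31.25 x 31.25": 220.0 / 25.6,
    "10 x 10": 275.0 / 29.6,
}
PHYSICAL_CORES = 6      # from the test machine description
HARDWARE_THREADS = 12   # with hyperthreading enabled


def parallel_efficiency(speedup, workers):
    """Speedup per worker; 1.0 would be perfect linear scaling."""
    return speedup / workers


for window, s in SPEEDUPS.items():
    per_core = parallel_efficiency(s, PHYSICAL_CORES)
    per_thread = parallel_efficiency(s, HARDWARE_THREADS)
    print(f"{window:>14s} um: {per_core:.2f} per core, {per_thread:.2f} per thread")
```

Every speedup exceeds the physical core count (efficiency of about 1.2 to 1.5 per core) while staying below the hardware thread count (about 0.6 to 0.8 per thread). These figures alone cannot show whether the extra gain comes from hyperthreading, from per-thread improvements in the new exploder, or from something else.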
Next - The new QISLIB MT Architecture
The next page will discuss the specific architecture we created to achieve these results.