QisMRaster - Raster Extension to QisMLib

QisMRaster Extension Library

This rasterization extension to QisMLib is intended to deliver the highest raster performance for demanding tasks while remaining simple for a programmer to incorporate into an application, because it manages many of the complex operations internally.

QisMRaster is an extension to the QisMLib.

There are two distinct use cases when rasterizing GDSII/OASIS layout data: a) a very large number of small windows is needed, each containing relatively few polygons; b) a few very large swaths are needed, where each "swath" may contain millions of polygons.

The QisMRaster library can be configured to handle either type of application optimally.

Pattern Recognition Speed Up

Many rasterization applications for electronics consist of a small "circuit" whose pattern has been repeated over a large panel. This can include IC packages, PCBs and liquid crystal displays.

QisMRaster has an internal function that looks for cells in the GDSII or OASIS layout that are placed repeatedly on the panel. Rather than process each instance of the cell, QisMRaster stores a bitmap of the cell and then places the bitmap directly, bypassing both the exploder and the rasterizer. For layouts that contain many repeated cells, the resulting improvement in throughput is significant. [Flat input data is not analyzed, as the time saved could easily be exceeded by the time required to perform the pattern checks.]
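The idea of rendering a repeated cell once and then placing its bitmap can be sketched in a few lines of Python. This is only an illustration of the caching principle, not QisMRaster's internal implementation: the names `stamp`, `place_cells` and `rasterize_cell` are hypothetical, and bitmaps are modeled as plain lists of 0/1 rows.

```python
from typing import Callable, Dict, List, Tuple

Bitmap = List[List[int]]  # rows of 0/1 pixels (illustrative representation)


def stamp(canvas: Bitmap, tile: Bitmap, x0: int, y0: int) -> None:
    """OR the tile into the canvas with its top-left corner at (x0, y0)."""
    for dy, row in enumerate(tile):
        for dx, bit in enumerate(row):
            y, x = y0 + dy, x0 + dx
            # Pixels falling outside the canvas are clipped.
            if bit and 0 <= y < len(canvas) and 0 <= x < len(canvas[0]):
                canvas[y][x] = 1


def place_cells(
    canvas: Bitmap,
    placements: List[Tuple[str, int, int]],
    rasterize_cell: Callable[[str], Bitmap],
) -> None:
    """Rasterize each distinct cell once, then stamp the cached bitmap."""
    cache: Dict[str, Bitmap] = {}
    for name, x0, y0 in placements:
        if name not in cache:
            # Expensive path (explode + rasterize), taken once per cell.
            cache[name] = rasterize_cell(name)
        stamp(canvas, cache[name], x0, y0)
```

With three placements of the same cell, `rasterize_cell` runs once and the remaining two placements are simple bitmap copies, which is where the saving comes from.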

Optimizing for Desired Throughput

Raster applications (such as lithography and inspection) generally have a target throughput. That is, they need to rasterize an image area at a specified DPI within a specified time.

For advanced electronic applications this means processing millions or billions of polygons into billions of pixels. To do that, all rasterizers divide the image area into small windows and work on each window individually.
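To give a feel for the numbers, the sketch below computes the pixel dimensions of an image area at a given DPI and splits it into square windows. The helper names and the panel size used in the usage note are illustrative assumptions, not values from QisMRaster.

```python
import math
from typing import List, Tuple


def pixel_dimensions(width_in: float, height_in: float, dpi: float) -> Tuple[int, int]:
    """Image size in pixels for an area given in inches at the given DPI."""
    return math.ceil(width_in * dpi), math.ceil(height_in * dpi)


def tile_windows(width_px: int, height_px: int, win_px: int) -> List[Tuple[int, int, int, int]]:
    """Cover the image with win_px x win_px windows as (x0, y0, x1, y1).

    Windows on the right and bottom edges are clipped to the image.
    """
    windows = []
    for y0 in range(0, height_px, win_px):
        for x0 in range(0, width_px, win_px):
            x1 = min(x0 + win_px, width_px)
            y1 = min(y0 + win_px, height_px)
            windows.append((x0, y0, x1, y1))
    return windows
```

For example, a hypothetical 20 x 24 inch panel at 25,400 DPI (1 micron pixels) is 508,000 x 609,600 pixels, or roughly 3.1 x 10^11 pixels in total, which is why the area has to be processed window by window.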

There are two main tasks in the pipeline, and both affect throughput:

a) extracting (exploding) the polygons within a given window.

b) rasterizing those polygons into pixels.

For any given combination of input file size/complexity, image area, DPI and window size it is unlikely that the two tasks will be balanced. One or the other will be the throughput bottleneck.

Deploying Exploders and Rasterizers

The QisMRaster extension library can be configured as needed to find the optimal balance between exploding polygons and rasterizing them. The block diagram below describes the internal flow:

The client (calling) program provides a list of windows to rasterize.

The QisMRaster library allocates windows to the next available exploder thread.

The Exploder extracts any polygons that cross the window under consideration.

Polygons are placed into a memory buffer "p".

One or more rasterizers process the polygons into bits.

Bits (pixels) generated by the rasterizer threads are placed in a memory buffer.

The client gets the bits using a callback function that reads them directly from memory.
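The flow above is a classic two-stage producer/consumer pipeline. The following Python sketch mirrors its structure with standard-library threads and queues. It is not the QisMRaster API: `run_pipeline`, `explode`, `rasterize` and `on_bits` are hypothetical names, and because of Python's global interpreter lock the sketch shows the structure only, not the performance.

```python
import queue
import threading


def run_pipeline(windows, explode, rasterize, on_bits,
                 n_exploders=1, n_rasterizers=1):
    """Push windows through exploder threads, then rasterizer threads."""
    window_q = queue.Queue()          # windows waiting for an exploder
    poly_q = queue.Queue(maxsize=8)   # bounded polygon buffer between stages
    done = object()                   # sentinel telling a worker to stop

    def exploder():
        while True:
            w = window_q.get()
            if w is done:
                break
            poly_q.put((w, explode(w)))       # blocks if the buffer is full

    def rasterizer():
        while True:
            item = poly_q.get()               # idle time accrues here
            if item is done:
                break
            w, polys = item
            on_bits(w, rasterize(w, polys))   # client callback gets the bits

    ex = [threading.Thread(target=exploder) for _ in range(n_exploders)]
    ra = [threading.Thread(target=rasterizer) for _ in range(n_rasterizers)]
    for t in ex + ra:
        t.start()
    for w in windows:
        window_q.put(w)
    for _ in ex:
        window_q.put(done)   # one stop signal per exploder
    for t in ex:
        t.join()
    for _ in ra:
        poly_q.put(done)     # stop rasterizers once all polygons are queued
    for t in ra:
        t.join()
```

Note that `on_bits` may be called from several rasterizer threads at once, so in this sketch the client callback has to be thread-safe.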

How Many Exploders and How Many Rasterizers?

The simplest setup is one exploder thread followed by one rasterizer thread. Suppose you deploy this configuration to run with your application.

Now assume your measured throughput is 4X slower than your target. What should you do next to meet it?

You could simply reconfigure the library to use four exploder threads, each with its own rasterizer (assuming you have sufficient licenses). Before choosing that option, however, it is worth studying the information included with each callback to determine where most of the time was spent. The callback data will show either a) that the rasterizer spent a lot of time idle, waiting for polygons, or b) that the rasterizer never waited at all.
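One way to turn that timing information into a decision is sketched below. The format of the data is an assumption: each sample is taken to be a (seconds the rasterizer waited for polygons, seconds it spent rasterizing) pair, and the 25% idle threshold is an arbitrary illustrative value. The actual fields reported by the QisMRaster callback may differ.

```python
from typing import Iterable, Tuple


def diagnose(samples: Iterable[Tuple[float, float]],
             idle_threshold: float = 0.25) -> str:
    """Classify the bottleneck from (wait_seconds, busy_seconds) samples.

    A rasterizer that is idle for a large share of the time is starved of
    polygons, so the exploder is the bottleneck (case a). A rasterizer that
    rarely waits is itself the bottleneck (case b).
    """
    total_wait = 0.0
    total_busy = 0.0
    for wait_s, busy_s in samples:
        total_wait += wait_s
        total_busy += busy_s
    total = total_wait + total_busy
    if total == 0:
        return "no data"
    idle_fraction = total_wait / total
    if idle_fraction >= idle_threshold:
        return "exploder-bound"      # case a: add exploder threads
    return "rasterizer-bound"        # case b: add rasterizer threads
```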

In case a) you might configure the library with four exploder threads, each with a single rasterizer thread.

In case b) you might configure the library with a single exploder thread and four rasterizer threads.

Or you may find that the sweet spot between throughput and license cost is a configuration like this:

The optimal configuration is the one that achieves the desired headroom at the lowest licensing cost. Licensing costs are tied to the number of exploder threads. Each licensed exploder thread can feed any number of rasterizer threads, limited only by the available hardware cores.
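The trade-off can be explored with a simple model. The sketch below assumes, purely for illustration, that throughput scales linearly with thread count, that every thread occupies one core, and that each exploder feeds the same number of rasterizers; real scaling will be less ideal. The function name and the rate parameters are hypothetical and would come from your own measurements.

```python
import math
from typing import Optional, Tuple


def cheapest_config(target: float, explode_rate: float, raster_rate: float,
                    cores: int) -> Optional[Tuple[int, int]]:
    """Return (exploders, rasterizers_per_exploder) with the fewest exploders.

    target       -- required throughput, e.g. windows per second
    explode_rate -- measured throughput of one exploder thread
    raster_rate  -- measured throughput of one rasterizer thread
    cores        -- hardware cores available (one thread per core assumed)
    """
    # Fewest exploders that can supply polygons fast enough.
    first = max(1, math.ceil(target / explode_rate))
    for exploders in range(first, cores // 2 + 1):
        # Rasterizers each exploder needs so the raster stage keeps up.
        per = max(1, math.ceil(target / (exploders * raster_rate)))
        if exploders * (1 + per) <= cores:
            return exploders, per   # first hit = lowest license count
    return None                     # target not reachable on this hardware
```

Under these assumptions, a target of 4 with `explode_rate=1` and `raster_rate=4` on 16 cores gives four exploders with one rasterizer each, while `explode_rate=4` and `raster_rate=1` gives one exploder with four rasterizers.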

Hardware Limitations

The computer hardware may also constrain the configuration. Three factors trade off against one another: CPU clock speed, number of cores, and RAM.