Software architecture, outside-the-box solutions, and optimization
I wanted to set up a nice, simple interface so my teammates could easily integrate their components with my circuitry system. I determined that we would need three components: a PowerSource, which a button can toggle on or off; a Circuit, which a piston can read to check whether it is powered; and a CircuitManager, which handles all the grouping and power propagation behind the scenes.
This means that a button simply needs to toggle a boolean on the PowerSource to change whether its circuit is powered, and a piston only has to check the boolean on its Circuit to determine if it should extend. All the complicated heavy lifting goes on in CircuitManager.
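To make the division of labour concrete, here is a minimal sketch of that interface. The game itself is written in Unity/C#; Python and these exact member names are illustrative assumptions, not the actual implementation.

```python
# Hypothetical sketch of the three-component interface. Only the
# CircuitManager (not shown here) ever writes Circuit.powered.

class PowerSource:
    """Toggled directly by a button: one boolean, nothing else."""
    def __init__(self):
        self.on = False


class Circuit:
    """Read directly by a piston: one boolean, nothing else."""
    def __init__(self):
        self.powered = False


class Piston:
    def __init__(self, circuit):
        self.circuit = circuit

    def should_extend(self):
        # A piston only checks the boolean on its Circuit.
        return self.circuit.powered
```

The point of the design is that buttons and pistons never talk to each other or to the grouping logic; they each touch a single boolean.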
Note that there are only three times CircuitManager needs to update. The first is when the level starts, as we need to determine the initial circuit groups. The second is when a new portal is drawn, as the circuit groups may have changed. The third is when a PowerSource is turned on or off, which does not change the circuit groups but may change their power state. If we keep track of the group that every Circuit belongs to, then whenever a PowerSource turns on or off, we can easily update the power state of every Circuit in its group, making the third case much cheaper to compute.
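That cheap third case can be sketched as follows. The group map and helper names are assumptions; the idea is just that the circuit-to-group map is rebuilt only on level start or when a portal is drawn, so toggling a source touches nothing but its own group.

```python
# Sketch of the cached-group update (structure assumed).

class Circuit:
    def __init__(self):
        self.powered = False


class PowerSource(Circuit):
    def __init__(self):
        super().__init__()
        self.on = False


class CircuitManager:
    def __init__(self, group_of):
        # group_of: circuit -> group ID, produced by the grouping algorithm
        self.group_of = group_of

    def on_source_toggled(self, source, on):
        source.on = on
        gid = self.group_of[source]
        # A group is powered if any PowerSource in it is on.
        powered = any(c.on for c in self.group_of
                      if isinstance(c, PowerSource) and self.group_of[c] == gid)
        for c, g in self.group_of.items():
            if g == gid:
                c.powered = powered
```

No flood fill, no rendering — just a scan over the cached map.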
The next step was to figure out how we would determine the circuit groups. My first thought was something along the lines of Pipe Mania, as it has to solve a similar problem of determining whether a "circuit" (or in its case, a plumbing system) is connected. The problem is that this approach relies on a tile-based world, where we can e.g. check to see if a pipe with an upward connector exists one tile down from the current pipe. In Gemma's Great Gambit, the circuits aren't aligned to a grid — they sit in floating-point space like every other object in the game. Even if they were initially constrained to a grid, the player can cut holes of any arbitrary size and position out of the circuitry.
It then occurred to me that if I were to reduce the floating-point space down to a tile-based one, I could use a flood fill algorithm to determine which points were connected. In fact, rather than thinking in terms of tiles, I realized that I could think in terms of actual pixels: the simplest way to reduce arbitrarily-shaped objects down to a regular grid, it seemed, is to render them to a raster image. With that in mind, I set out to build the circuit grouping algorithm.
Step 1: I set up a special camera that culls out everything but the circuitry. The camera is positioned and scaled so that it exactly captures all of the circuitry on the level. This is important because we'll be reducing the world down to a relatively small pixel grid, so we want the circuits to be as large as possible on-camera for accuracy's sake.
Step 2: We render the circuitry to a RenderTexture, which reduces the practically infinite precision of the circuits' floating-point positions to a relatively small grid. Initially, after some experimentation, 640x480 seemed to give a good tradeoff between performance and accuracy (though the resolution eventually changed, as explained later in this article). Also, note that this camera is disabled most of the time — we only make render calls to it when we need to update the circuit groups, so we're not constantly wasting GPU resources.
Step 3: We want to classify each pixel in the texture as either "circuit" or "not circuit." As well, because the rendering system averages neighbouring pixels in the process of downscaling, thin strips of circuitry may have become semi-transparent pixels in the process of rendering, and we don't want to ignore them. To solve both of these problems, we apply a threshold where any pixel with a non-zero R, G, B, or A value becomes white, and anything else becomes black.
Step 4: Rather than continuing to deal with colors, we can at this point convert down to a single integer value per pixel. We convert the texture to a 2D array of ints, where circuitry (white) is 0 and empty space (black) is -1.
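Combined, the threshold and the conversion amount to a one-pass classification. Here is a CPU sketch of that logic, assuming pixels arrive as RGBA tuples (the article's actual threshold ends up on the GPU, as discussed in the optimization section):

```python
# CPU sketch of the threshold + int conversion (pixel format assumed).

def to_grid(pixels):
    """Any pixel with a non-zero channel counts as circuitry (0);
    everything else is empty space (-1)."""
    return [[0 if any(px) else -1 for px in row] for row in pixels]
```

A fully transparent black pixel maps to -1; even a faint semi-transparent pixel left over from downscaling maps to 0, which is exactly the bias the threshold is designed to have.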
Step 5: Let curId = 0. For each element of the 2D array, we check if its value is 0. If it is, we increment curId by 1, then perform a flood fill at that location with the value of curId. Because curId is incremented before each fill, every element affected by a flood fill ends up with a non-zero value, so we only perform one flood fill per group.
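The labelling step can be sketched like this, assuming the grid format from the previous steps (0 = circuit, -1 = empty); the iterative stack-based fill here is an assumption, standing in for the flood fill implementation the game actually uses:

```python
# Sketch of step 5: flood-fill each connected group of 0s with a
# distinct positive ID. Mutates grid in place; returns the group count.

def label_groups(grid):
    h, w = len(grid), len(grid[0])
    cur_id = 0
    for y in range(h):
        for x in range(w):
            if grid[y][x] != 0:
                continue
            cur_id += 1                      # IDs start at 1, so a filled
            stack = [(y, x)]                 # pixel is never revisited
            while stack:
                cy, cx = stack.pop()
                if 0 <= cy < h and 0 <= cx < w and grid[cy][cx] == 0:
                    grid[cy][cx] = cur_id
                    stack += [(cy + 1, cx), (cy - 1, cx),
                              (cy, cx + 1), (cy, cx - 1)]
    return cur_id
```

After this pass, every circuit pixel holds the ID of its connected group.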
Step 6: For each piece of circuitry in the scene, find its corresponding coordinate in the integer array. The value at that coordinate is its group ID, meaning that any circuits sharing an ID are connected.
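The world-to-grid lookup might look like the following; the camera rectangle parameters and the clamping are assumptions about how the mapping would be done, not the game's actual code:

```python
# Sketch of step 6: map a circuit's world position into the labelled
# grid and read off its group ID.

def group_id(labels, world_pos, cam_min, cam_size):
    """labels: grid from the flood-fill step; cam_min/cam_size describe
    the world-space rectangle the circuitry camera captured."""
    h, w = len(labels), len(labels[0])
    px = int((world_pos[0] - cam_min[0]) / cam_size[0] * w)
    py = int((world_pos[1] - cam_min[1]) / cam_size[1] * h)
    px = min(max(px, 0), w - 1)              # clamp against edge rounding
    py = min(max(py, 0), h - 1)
    return labels[py][px]
```

Two circuits that return the same ID belong to the same group and are therefore connected.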
I found that the algorithm was causing a brief hitch whenever it executed. Before cranking down the resolution of the RenderTexture, I first ran the algorithm under Unity's profiling tools to see where the time was going. I made three major optimizations:
First, rather than applying the threshold in step 3 on the CPU, I implemented it as a fragment shader on the camera. This dramatically reduces the overhead of step 3, as the problem is easily parallelized per pixel.
Secondly, I reduced the number of pixels that need to be checked in step 5. Rather than iterating over every element of the 2D array, I instead iterate over each piece of circuitry and check only whether we need to perform a flood fill at its corresponding coordinate. Though this doesn't reduce the number of flood fills, it does significantly reduce the number of if statements checking whether a pixel is non-zero, which provides a decent speedup.
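The seeded variant can be sketched as follows (structure assumed): flood fills start only at each circuit's pixel coordinate, skipping the full-grid scan. This works because we only ever care about the group IDs at circuit positions anyway.

```python
# Sketch of the optimized step 5: seed flood fills at circuit
# coordinates instead of scanning every pixel.

def label_groups_from_seeds(grid, seeds):
    """seeds: one (y, x) grid coordinate per piece of circuitry."""
    h, w = len(grid), len(grid[0])
    cur_id = 0
    for sy, sx in seeds:
        if grid[sy][sx] != 0:                # already claimed by a fill
            continue
        cur_id += 1
        stack = [(sy, sx)]
        while stack:
            cy, cx = stack.pop()
            if 0 <= cy < h and 0 <= cx < w and grid[cy][cx] == 0:
                grid[cy][cx] = cur_id
                stack += [(cy + 1, cx), (cy, cx + 1),
                          (cy - 1, cx), (cy, cx - 1)]
    return cur_id
```

The number of flood fills is unchanged, but the outer loop now runs once per circuit rather than once per pixel.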
Lastly, I noticed that on many levels in the game, our circuitry spans only a small portion of the level. However, the two alternate worlds in the game are implemented as two versions of the level offset from each other along the X axis. This means that if both worlds contain only a small amount of circuitry, the circuitry camera has to capture a very large horizontal area (most of which is empty) in order to encapsulate both sets of circuitry. As a result, the circuitry is compressed into a very small area in the RenderTexture, so we need to increase its resolution to maintain reasonable accuracy.
To solve this, I run the grouping algorithm once for each world, capturing only the circuitry in that world. Obviously running the algorithm twice is more expensive than only running it once, but it allowed me to drop the resolution to 320x240 while maintaining a similar level of accuracy. As the algorithm's running time is approximately linearly dependent on the number of pixels in the RenderTexture, and 320x240 is only a quarter as many pixels as 640x480, this means that even though we now run the algorithm twice, it actually takes approximately half as much time as before!
Naturally, there are a few caveats with this approach. The most obvious one is that it does not scale well to very large levels — if a level has a very large area covered in circuitry, then the RenderTexture will be less accurate (resulting in connections being detected where they don't exist) or the resolution will need to be increased (which degrades performance quickly). For the purposes of this game, none of our levels needed to be that large, so it didn't become an issue. Still, most of the processing time is spent in the flood fill algorithm, which I used more or less out-of-the-box from the Unity wiki, so if I were to attempt to improve performance, I would start there.
Secondly, the threshold is heavily biased towards white pixels rather than black ones, meaning we are much more likely to have false positives (i.e. circuits being grouped even though they aren't connected) than false negatives (i.e. circuits not being grouped even though they are connected). I found that false positives were rare and much less annoying than false negatives, so I consider this to be a reasonable trade-off. To my knowledge, solving this problem would likely require an entirely different approach to the circuit system, which would surely come with its own trade-offs.