The meeting with a client this morning was postponed for about an hour. My job there was to find a way to adjust a small single-purpose machine that was used to continuously check the quality of a manufacturing process. Every four hours, a dedicated operator inserted a batch of five devices under test into the machine, and the quality of the manufacturing process was assessed.
At first glance, there seemed to be no problems with the machine or the process whatsoever. The operator came with five freshly made pieces. She inserted the first one into the machine and fiddled with the piece until the machine turned the red indicator into a green one, signaling that the first device conformed to the standards. That was the big red flag, hidden behind the nice, comforting green glow.
The operator then proceeded to write the value from the display into the spreadsheet. Then she put the second piece into the machine. It took a little longer, but the green light eventually came. Again, she proceeded with the spreadsheet. This went on for all five devices.
Confirming the measurements
When she was finished, she left, and I turned to the client and asked how we could find out whether the values measured by the machine were indeed inside the allowed ranges. By that time, I was almost sure that if the machine had been generating random numbers instead of actual precise measurements, the final spreadsheet would have looked the same. We had to check the numbers.
To my even greater surprise, the process that was supposed to confirm or reject the data seemed even more random. I was standing near a piece of equipment that consisted of a very large magnifying glass with two thin black lines crossing in its center, combined with manual controls and a digital display. The display showed the precise distance the cross had traveled along the vertical and horizontal axes.
The operator handling this equipment pointed the cross at an arbitrary point on the black silhouette, pressed a button, and moved the cross to a different, seemingly arbitrary point on that silhouette. He read the value and repeated the process on a different part of the silhouette. Then he confirmed with absolute certainty that the device dimensions conformed to the required ranges.
To be fair, I had been informed beforehand that the equipment needed maintenance and that I should only take it as an illustration. The lightbulb that would reveal the surface features on the silhouette was broken and needed replacement. The operator admitted he had chosen the points from memory. But that meant we had no way of making sure that the original machine in question was making the right measurements and the right decisions based on them.
Converting pixels to millimeters
Back at the original machine, the client was certain that the operator had been doing the work the same way for several years and that there had been no complaints from their customer. Only in recent weeks had it taken more fiddling to get the green light, and they asked whether I could do something about it.
Since I was already there, I wanted to do my best. The bulk of the machine was an inspection camera. I was not terribly familiar with its GUI, but I knew some basics. Here, I could again see the black silhouette of the device under test. This time the silhouette was expected, however, as inspection cameras work precisely this way. The device under test is lit from behind, toward the sensor, and the inspection tools are then applied to the resulting image. The tools can find where the pixel color changes from white to black and act on it. Of course, inspection cameras can work in other ways too, but this is among the core principles.
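To illustrate the white-to-black principle, here is a minimal Python sketch. It is not how this particular camera is implemented; the function name `find_edges`, the threshold of 128, and the sample pixel values are all made up for the example. It scans a single row of grayscale pixels and reports where the brightness crosses the threshold, which is where the silhouette begins and ends.

```python
def find_edges(row, threshold=128):
    """Return the indices where a row of grayscale pixels (0-255)
    switches between bright (background) and dark (silhouette).

    The threshold of 128 is an arbitrary value chosen for this sketch.
    """
    edges = []
    for i in range(1, len(row)):
        was_dark = row[i - 1] < threshold
        is_dark = row[i] < threshold
        if was_dark != is_dark:
            edges.append(i)
    return edges


# Hypothetical scan line: bright background, dark part, bright background.
scan_line = [255, 250, 248, 30, 20, 25, 22, 240, 255]

edges = find_edges(scan_line)

# Width of the silhouette along this row, in pixels.
width_px = edges[1] - edges[0]
print(edges, width_px)
```

The result is a distance in pixels only. Turning it into millimeters is a separate step.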
The camera was taking two measurements on the device, as already pointed out. The measurements presented to the operator were in millimeters. Yet the camera only knows pixels. The way the camera calculates a distance in millimeters is to take the number of pixels and multiply it by a coefficient. The coefficient is calculated beforehand, based on the camera resolution and the distance of the object from the sensor. After a few minutes of looking around the GUI, I was able to find this coefficient. But there was a catch. The coefficients for the two measurements were slightly different, even though both measurements were taken at the same distance. Because the coefficients did not match, I raised the possibility that one, or maybe both, of them had been artificially adjusted to make the values conform to the required ranges.
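To make the coefficient idea concrete, here is a minimal Python sketch. None of the numbers come from the actual machine: the field of view, the resolution, the nominal dimension and its tolerance are invented for illustration, and deriving the coefficient from the field of view is just one plausible way to express "resolution plus distance". The point is how little the coefficient has to move to flip a result from red to green.

```python
def mm_per_pixel(field_of_view_mm, resolution_px):
    """Coefficient: how many millimeters one pixel covers.

    Assumes the visible width at the working distance is known
    (hypothetical calibration input for this sketch).
    """
    return field_of_view_mm / resolution_px


def pixels_to_mm(length_px, coefficient):
    """Convert a measured pixel distance to millimeters."""
    return length_px * coefficient


def within_tolerance(value_mm, nominal_mm, tolerance_mm):
    """True if the value lies inside nominal +/- tolerance."""
    return abs(value_mm - nominal_mm) <= tolerance_mm


# Hypothetical setup: 50 mm visible across 2000 pixels.
coefficient = mm_per_pixel(50.0, 2000)

# Hypothetical part: the camera sees 812 pixels, the spec is 20.0 +/- 0.2 mm.
measured_px = 812
honest_mm = pixels_to_mm(measured_px, coefficient)
tweaked_mm = pixels_to_mm(measured_px, coefficient * 0.99)  # coefficient nudged by 1 %

print(honest_mm, within_tolerance(honest_mm, 20.0, 0.2))    # out of range: red
print(tweaked_mm, within_tolerance(tweaked_mm, 20.0, 0.2))  # in range: green
```

With these made-up numbers, the same 812 pixels come out at roughly 20.3 mm with the calculated coefficient and roughly 20.1 mm with the nudged one. Two measurements taken at the same distance should share one coefficient, so a mismatch between them is worth questioning.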
In the end, the machine was still glowing red more often than they were used to. Had the machine started measuring the devices differently, or had the devices changed slightly? Without other working equipment that could provide reliable measurements, we could not be certain what the cause was. But the fact that the coefficients were different made people think. Was the machine really flawed? Was my thinking flawed? Or was there something completely different at play? It was, however, too easy to adjust the coefficient to make the measurements fall within the range. Anyone operating the machine could do it. I may never find out. Being only a potential contract worker, I was not presented with all the details. But hopefully it will spark a more open discussion inside the organization, one that revisits the correctness of the given process. If they decide to hide the issue, it could cause unnecessary and avoidable problems in the future.
This is the 2nd post of #100daystooffload.