Interesting post.
"--the difference between 25 and 30 is negligible, though of course you must draw the line somewhere"
fyi,
I believe the 30-piece number is the minimum required for 95% confidence under an assumed normal distribution. Statistical confidence drops rather sharply as the sample size falls below 30; conversely, it grows rather slowly as the sample gets bigger (the 300 number corresponds to 99%).
So, it's just a "most bang for the buck" number.
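To make the "bang for the buck" point concrete, here's a rough Python sketch (my illustration, not anything from the original comment): it draws sample means from a deliberately skewed population and measures how quickly their distribution approaches normal (skewness near zero) as the sample size grows. The improvement from 5 to 30 is steep; past 30 it's slow.

```python
import random
import statistics

random.seed(1)

def mean_skewness(n, trials=4000):
    """Skewness of the distribution of sample means of size n,
    drawn from a skewed (exponential) population.
    Closer to 0 means closer to normal."""
    means = [statistics.fmean(random.expovariate(1.0) for _ in range(n))
             for _ in range(trials)]
    mu = statistics.fmean(means)
    sd = statistics.pstdev(means)
    return statistics.fmean(((m - mu) / sd) ** 3 for m in means)

for n in (5, 10, 30, 100):
    print(f"n={n:>3}: skewness of sample mean ~ {mean_skewness(n):.2f}")
```

The central limit theorem says this skewness shrinks like 1/sqrt(n), which is exactly why the returns diminish past 30 or so.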
For this reason, the 30 number was adopted by Shainin quality techniques for isoplots and such (as well as by Six Sigma folks and everyone else).
As with every rule, there are plenty of exceptions. Data is only as smart as the people using it.
If you poke around on Google for "central limit theorem" and "Shainin", you can dive into it as deep as you like (assuming you have trouble sleeping at night).
Suffice it to say that quite a few really, really smart folks say that's where the line should be drawn.
The other side of the coin is: you are correct about subsystems or families of data.
If you are not looking for a specific defect and are willing to look at broader families of data (multiple years or multiple subgroups), high failure rates do not need as much data to accurately reflect general quality.
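A quick way to see why high failure rates need less data: if failures are independent at rate p, the chance of catching at least one in n samples is 1 - (1 - p)^n. A small sketch (my own illustration, with made-up rates):

```python
import math

def samples_needed(p, confidence=0.95):
    """Smallest n such that the probability of seeing at least one
    failure in n parts reaches `confidence`, assuming independent
    failures at rate p. Solves 1 - (1-p)**n >= confidence for n."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - p))

for p in (0.20, 0.05, 0.01):
    print(f"failure rate {p:.0%}: need {samples_needed(p)} samples")
```

So a 20% failure rate shows up in a handful of parts, while a 1% rate takes samples on the order of hundreds before you can trust what you're seeing.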
You can pick any number you want as a target sample size, but as you said, the more you capture, the more you will see families of data. Example: cars made in June/July often struggle more because this is the normal timing for model launches and model-year change issues. I have not seen anyone try to publicly capture that data before. That would be a cool one to add into the mix of rankings.