Ok, I hope this doesn't come off as combative. I'm no believer in positive compensation (I think tuning should result in less muzzle movement), and you seem to actually be doing more real testing than most people; I'm just trying to understand the actual test you're running. Let's leave the vector aside, as I agree it can't be measured, and clamping the gun is surely the best we can do anyway. Although, if you're clamping the muzzle, that is an issue, since the whole idea of positive compensation requires the muzzle to move, doesn't it?
If what you're doing is saying "these bullets fall within a predicted confidence interval (at whatever confidence level you're going with) based on velocity and BC," then I'm not sure that's actually the right test. Especially if the predicted confidence intervals for all the observed velocities overlap, and the shots fall inside the interval for every one of those velocities. In that case, your test doesn't have the power to say much at all. Sure, *if* you found bullets landing way outside the predicted interval, that might support positive compensation, but it's not true to say "since we don't see that, it's not there".
Absence of evidence is not evidence of absence, as they say. Perhaps you're actually doing something else though.
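To put numbers on why the overlap matters, here's a rough sketch. It uses a toy no-drag drop model and made-up velocities and group SD (a real BC-based solver would separate the predictions more), but it shows the check I'd want to see: does each shot's predicted interval already contain every other velocity's predicted impact? If yes, "the shots landed inside their intervals" can't distinguish anything.

```python
G = 9.81  # m/s^2

def drop_m(range_m, mv):
    """Toy flat-fire, no-drag drop: 0.5 * g * (time of flight)^2.
    A real solver would use the BC and a drag model; this only shows the shape of the problem."""
    tof = range_m / mv
    return 0.5 * G * tof ** 2

RANGE = 300.0        # m, assumed test distance
SIGMA_GROUP = 0.025  # m, assumed vertical SD of rifle+ammo at 300 m (roughly 0.3 MOA)
Z95 = 1.96           # two-sided 95% band

velocities = [845, 855, 865, 875]  # m/s, assumed observed MVs

bands = []
for v in velocities:
    center = -drop_m(RANGE, v)  # predicted impact height relative to the bore line
    lo, hi = center - Z95 * SIGMA_GROUP, center + Z95 * SIGMA_GROUP
    bands.append((lo, hi))
    print(f"{v} m/s: predicted {center*1000:7.1f} mm, 95% band [{lo*1000:7.1f}, {hi*1000:7.1f}] mm")

# If every band already contains every other velocity's predicted center, then a shot
# landing "inside its interval" tells you almost nothing: the test has no power.
centers = [-drop_m(RANGE, v) for v in velocities]
print("every interval covers every predicted center:",
      all(lo <= c <= hi for c in centers for lo, hi in bands))
```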
Two ways I would look into this: first, given a load with a certain average velocity, do lower-velocity shots tend to land high within their confidence interval, and do higher-velocity shots tend to land low within theirs? Do the impacts tend to have lower dispersion (variance in the waterline) than predicted? Either outcome would point towards a "tuning" effect without any shot ever landing outside its predicted interval.
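Something like this, where the shot data and the stand-in predictor are completely invented (the predictor would be whatever solver you're already using to build the intervals); it's just to show the bookkeeping for both checks:

```python
import math

RANGE = 300.0  # m, assumed test distance
G = 9.81

def predicted_drop_mm(mv):
    """Stand-in predictor (same toy no-drag model as above); swap in your real solver here."""
    tof = RANGE / mv
    return -0.5 * G * tof ** 2 * 1000.0

# Hypothetical shots: (muzzle velocity m/s, measured vertical POI mm relative to bore line).
# Numbers are invented purely to show what a compensation-like pattern would look like.
shots = [(848, -590.0), (853, -588.0), (860, -585.5), (866, -584.0), (872, -581.0)]

def pearson(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den

def sd(xs):
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

vel = [v for v, _ in shots]
meas = [y for _, y in shots]
pred = [predicted_drop_mm(v) for v in vel]
resid = [m - p for m, p in zip(meas, pred)]

# Check 1: with positive compensation, slow shots sit high relative to their own
# prediction, so velocity vs residual should trend negative.
print("corr(velocity, residual):", round(pearson(vel, resid), 3))

# Check 2: the observed waterline should be tighter than velocity alone predicts.
print("observed vertical SD:", round(sd(meas), 1), "mm vs predicted:", round(sd(pred), 1), "mm")
```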
Second, positive compensation says that the launch angle changes shot to shot based on velocity. Ok, you can't measure the angle, but the gun is clamped, so the bore should be nominally pointed in the same direction as best we can control. But launch angle isn't the only prediction... along with it, there should be only one "tuned" range where the shots converge back to a small dispersion, and there should be a point of maximum vertical spread at some intermediate range... why not test that? If you know where the groups are supposed to be tight, you can calculate a flight path and find where the maximum dispersion is supposed to be. If the groups there have a smaller MOA than at the farther "tuned" distance, that wouldn't support positive compensation. If someone claims their gun shoots tighter *everywhere*, then that's not positive compensation anyway.
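A sketch of that second check, again with the toy no-drag model and assumed numbers (a real drag model moves the peak around, but the shape is the same): take the claimed tuned distance, back out the launch-angle difference that would make the fast and slow shots converge there, and scan for where the vertical split between them peaks.

```python
G = 9.81
R_TUNED = 600.0                # m, assumed distance where the groups are claimed to converge
V_FAST, V_SLOW = 870.0, 850.0  # m/s, assumed velocity extremes

def drop(range_m, mv):
    """Same toy no-drag drop as above; a real BC-based solver belongs here."""
    return 0.5 * G * (range_m / mv) ** 2

# Launch-angle difference the slow shot would need so both impacts meet at R_TUNED
# (small-angle approximation, radians).
dtheta = (drop(R_TUNED, V_SLOW) - drop(R_TUNED, V_FAST)) / R_TUNED

worst_r, worst_split = 0, 0.0
for r in range(50, int(R_TUNED) + 1, 50):
    split = dtheta * r - (drop(r, V_SLOW) - drop(r, V_FAST))  # vertical separation, m
    if split > worst_split:
        worst_r, worst_split = r, split
    print(f"{r:4d} m: vertical split {split*1000:6.1f} mm ({split/r*1000:5.3f} mrad)")

print(f"max split of ~{worst_split*1000:.0f} mm lands near {worst_r} m: that's the range to actually test")
```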
On the sidebar, again, 30 observations doesn't "fix" anything. For a given variable, more observations shrink the confidence interval, but that doesn't mean 30 observations give you small confidence intervals while 10 give you large ones. 30 will narrow it down, but by how much isn't fixed; it might not help much if the variable being measured is random as hell. It's entirely possible to have narrow confidence intervals from a low observation count if the variable has low variance, and it's absolutely possible to have confidence intervals as wide as the ocean with 100 observations if there's a lot of underlying variation.
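The half-width of a 95% interval for a mean is roughly t * s / sqrt(n), so it's the s/sqrt(n) combination that matters, not n by itself. A quick sketch with assumed SDs, using scipy for the t critical value:

```python
from scipy.stats import t

# 95% half-width of a CI for the mean: t(0.975, n-1) * s / sqrt(n).
# What matters is the s/sqrt(n) combination, not the shot count on its own.
for s in (5.0, 25.0):            # assumed SDs of whatever is being measured (say, fps of MV)
    for n in (10, 30, 100):
        hw = t.ppf(0.975, n - 1) * s / n ** 0.5
        print(f"s = {s:4.1f}, n = {n:3d}: mean pinned down to about +/-{hw:4.1f}")
```

With those made-up SDs, 10 shots of the low-variance case pin the mean tighter than 100 shots of the high-variance one, which is exactly the point.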
A real problem for us is that shooting is destructive testing. As you say, more shots can give diminishing returns, and can actually produce worse data: more observations should narrow our confidence intervals, but barrel fouling and wear may introduce more noise rather than less. We have to aim for parsimonious data collection. If someone has to shoot 30 rounds per test, is the 30th shot really going down the same bore at that point? Should we clean it? Should we not? If we then do another test of 30 rounds, how do we start to compensate for throat erosion? Especially for hot magnums and short-barrel-life rounds. Etc. There's no perfect way to do it.