The "blind bag test" for handload development...

Dan Newberry

Apr 7, 2013
Wytheville, VA
www.bangsteel.com
A common theme in the threads in the Handloading section is "Will this (fill in the blank) help my accuracy?" Whether it be neck turning, weighing cases, deburring flash holes, getting Redding comp dies, etc., many of us wonder "Is it worth it?"

And the truth will generally be difficult to pin down. Mainly because in some instances __________ will help one's accuracy, and in others--it won't.

Being human, we can fall for the "magic mitt" effect with regard to whether or not something makes a difference. A children's story tells of a little boy who was having trouble catching the baseball during his Little League games until... until his father loaned him his "magic mitt." The child then believes that the mitt has magic, and he then--through his belief that it will work--performs better himself.

In another scenario, it is easy to get drawn into the belief that you must do A, B, C, etc., in order to get even a modicum of accuracy. By skipping any of these "benchrest proven" steps we spoil our chances (so we are taught) of an acceptable outcome with our field rifles. Thumb through the latest flyer from any reloading supply retailer. It's easy enough to see that the industry feeds into and profits from the proliferation of such ideas.

Now. Getting back to what I mentioned earlier about truths. Many presumed truths will not be absolute because they will not apply to every situation. In an extreme example, if one were to neck turn a batch of .223 cases for a Ruger Mini 14, and compare the accuracy to non-neck turned cases in that blunderbuss of a rifle, he would almost certainly see no difference. I don't think any thinking man would dispute this.

But somewhere along the continuum that runs between Ruger Mini 14 accuracy and the bug-hole accuracy of the winning benchrest rifle, neck turning--assuming the brass cases do not have evenly made necks--begins to make a difference. Flash hole deburring will begin to make a difference (assuming the particular lot of brass shows obstructions here, and the rifle firing the shots is accurate enough to display the difference). Reducing runout to .002" or less may begin to make a difference, again, depending on the abilities of the particular rifle--and this will also be a bullet dependent measurement, so .004" of runout with one bullet might make it misbehave in an extremely accurate rifle, but .006" of runout on another bullet may go unnoticed in that same rifle. Here is where one guy says "Four thousandths of runout makes a difference." And the other guy says "No it doesn't, I tested that theory." Only thing is, they were talking about different bullets in different rifles.

But where (again, on this continuum) do these things actually begin to matter? If, for instance, it could be scientifically proven that a particular rifle could realize an improvement in 600 yard accuracy when the runout amount was reduced from .004" to .002" with a particular bullet, how much difference might this actually make? Almost certainly very little. Maybe--and realistically--so little difference that any accuracy advantage would get "lost in the noise" of other factors like wind, shooter limitations, etc.

In the end, there is only one way to know whether a particular step of match prepping will help you or not. You've got to test the idea--and you have to test the idea correctly.

You cannot simply take two rows of assembled cartridges to the range or field and fire the "improved" group at one target and the "unimproved" group at another. Your own psychology--the placebo effect--will call "advantage improved" before the first shot is fired, and your subtle, unconscious behavior will quite possibly force the outcome that you already suspect, and perhaps want to believe. We're all human, and this effect is alive and well at all times. Scientists know this, which is why they take measures to conceal--often even from themselves--which group is which. A third party will hold the information as to which group of subjects got the medicine, and which got the sugar pills. And only after the test results are in do the scientists ask the third party to reveal which group was which. These scientists realize that their own body language and other behaviors might induce certain outcomes in the study groups, so they do not want to know which group is which as the test is under way.

And this is also how we should conduct tests to decide whether or not performing "presumed improvement A" (on sale from MidwayUSA this month~!) really helps or not.

So. Let's say we want to check the effects of neck turning in a particular rifle with a particular lot of brass. (And remember, we must realize that the final results of any such test will only be applicable to the test rifle, and the test batch of brass. Different rifles and different lots of brass will almost certainly realize different results).

We assemble forty or so cartridges, identical in all respects save the one that we're testing. This means that group A will have the necks turned, and group B will not have, or vice versa.

Once you have completed the forty or so cartridges, the next step would be to examine them all to see if it is obvious by looking at them which is which. If you can tell the difference with the Mark 1 eyeball then you'll have to enlist the help of an assistant when you get to the range. He/she will load the rifle for each shot while you look away. You should also look away as you eject the shell casing, and allow your assistant to pick it up and put it back into the bag.

Oh yeah. The bags. What you're going to do with the group A and group B cartridges--before you head to the range--is you're going to place each group into its own paper bag. Obviously you'll need to use identical bags (or boxes or whatever). Write "group A, turned" inside one bag, and put those shells in there. Write "group B, un-turned" inside the other bag, and put those shells into it.

Sometime before leaving for the range, you'll deliberately mix these bags up. If you have to, have someone switch them around and give them back to you. The main thing, and the most important thing, is that you do not know which bag is which. The human mind is amazing, and can even unconsciously discern little subtleties about one bag, and unconsciously realize which one it is (assuming you put the cartridges into the bags yourself, that is). So be thorough and careful with this "blind bag" test. If possible, have someone else fill and label the bags.

Set up two identical targets at the range. Two bullseyes on the same square of paper should be fine. Shoot two fouling shots from a clean barrel, and make sure those fouling shots are put together with the same powder as you're using in the test. Bullets with the same jacket finish should also be used (e.g., moly, naked, etc.).

Next step. Most folks will not have ever considered this, but it is of vital importance to the validity of the test. Fire one shot from the first bag (remember, you don't know which is bag A or B at this time) at the first bullseye, then fire one shot from the second bag at the second bullseye. Keep alternating back and forth in this manner. This will spread the barrel fouling and heating effect evenly across both groups. This will also spread the "man, I'm getting tired and bleary eyed and I gotta take a whizz effect" evenly across both groups. If you've shot comparisons in the past without doing them in this alternating manner, I would respectfully question your results.
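
For anyone who wants to print a firing order to take to the bench, the alternating sequence described above can be sketched in a few lines of Python. This is only an illustration: the function name `alternating_schedule` and the "bag 1" / "bag 2" labels are made up for the example and are not part of Dan's write-up.

```python
def alternating_schedule(total_shots, bags=("bag 1", "bag 2")):
    """Return (shot number, bag, target) tuples, switching bags on every shot.

    Each bag always goes to its own bullseye, so fouling, barrel heat and
    shooter fatigue are spread evenly across both groups.
    """
    schedule = []
    for shot in range(1, total_shots + 1):
        idx = (shot - 1) % len(bags)  # 0, 1, 0, 1, ...
        schedule.append((shot, bags[idx], f"bullseye {idx + 1}"))
    return schedule


if __name__ == "__main__":
    for shot, bag, target in alternating_schedule(6):
        print(f"shot {shot}: {bag} -> {target}")
```

With forty shots this gives twenty per bag, and no two consecutive shots come from the same bag.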

Once you've fired all forty (or so) shots onto the paper, then--and only then--do you reveal to yourself which bag was which.
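
The post does not say how to score the two targets, so the following is an assumption rather than Dan's method: one common way to put a number on each group before opening the bags is the mean radius, meaning the average distance of the shots from the group's own center. A minimal Python sketch, with a hypothetical function name and a toy four-shot input chosen only because the answer is easy to check by hand:

```python
import math


def mean_radius(shots):
    """Average distance of each shot from the group's center.

    `shots` is a list of (x, y) impact coordinates in any consistent unit,
    measured from any fixed point on the target.
    """
    cx = sum(x for x, _ in shots) / len(shots)  # group center, horizontal
    cy = sum(y for _, y in shots) / len(shots)  # group center, vertical
    return sum(math.hypot(x - cx, y - cy) for x, y in shots) / len(shots)


if __name__ == "__main__":
    # Toy input: four shots on the corners of a 2 x 2 square.
    # Each corner sits sqrt(2) from the center, so the mean radius is sqrt(2).
    corners = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
    print(mean_radius(corners))
```

Score both bullseyes this way, write the two numbers down, and only then look inside the bags.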

And of course you'll know--more surely than the average guy doing the average test will know--whether the "improvement" actually helps or not.;)

Dan
 
As a physician I realize that, as you say, only double blind tests can truly reveal a difference! Your test setup is brilliant and should yield incontestable results. Thanks for taking the time to put this together. I'll have to give it a try in the future when the handloading issues you described arise.
 
Simple and effective. So, while OCW seems a slightly different animal, it also may be subject to human bias.

What are your feelings about (e.g.) an "8-bagger" (or 8 cartridge markings done by an assistant) blind OCW test? Added value, unnecessary complication, or the choice requires self-evaluation lol?
 
Double-blind tests are valuable for getting an accurate result by removing conscious and subconscious biases, which is why psychics and hologram bracelet salesmen hate them ;)
 
Basic scientific method, but well stated.

To help with your double blind test, the helper takes the rounds and puts them in other bags than you did, and labels them with 1 and 2 instead of A and B. Anything to break up the continuity of you knowing what is what.
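
If the helper would rather not pick the numbers by hand, the relabeling can be left to a random shuffle. This is a minimal Python sketch of that idea and nothing more; the function name `blind_labels` and the group descriptions are assumptions made for the example.

```python
import random


def blind_labels(groups=("A, turned", "B, un-turned"), rng=None):
    """Randomly assign neutral codes (1, 2, ...) to the real group names.

    Returns a dict {code: group name}. The helper keeps this key sealed;
    the shooter only ever sees the codes written on the bags.
    """
    if rng is None:
        rng = random.Random()
    codes = list(range(1, len(groups) + 1))
    rng.shuffle(codes)  # which group gets which code is now left to chance
    return dict(zip(codes, groups))


if __name__ == "__main__":
    key = blind_labels()
    print("Codes to write on the bags:", sorted(key))
    # The helper records `key` privately and reveals it after the last shot.
```

The shooter should not be the one to run it, for the same reason the shooter should not be the one to fill the bags.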

BEST way would be for one person to make the rounds. Another person to load, and a different person to shoot.

But the other important point is NOT extrapolating your findings to EVERY rifle, shooting EVERY caliber, and EVERY bullet, and EVERY other variable, until you test that SINGLE change on a number of rifles, calibers, bullets, etc., to see if this is a universal truth or not.
 
Something else to consider... if the relative improvement between process 'A' and process 'B' is fairly small, you need to increase the sample sizes to be able to reliably determine whether the improvement is real, or due to random chance. Aka increase the 'power' of the test. While doing some form of blind testing to remove bias is definitely a step in the right direction, I think it's very possible to still end up very frustrated when a given load/process/component appears to give superior performance in load development, but fails to perform up to expectations later on. There are formal methods for doing this sort of thing, but generally speaking, just doing an extended test (what I typically refer to as an 'acid test') under actual use conditions before committing to purchasing large quantities of said components is a good idea ;)
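
To get a feel for the sample-size point, here is a rough Monte Carlo sketch in Python. Everything in it is an assumption made for illustration: shots are modeled as round normal scatter, load A is given 10% less dispersion than load B, groups are scored by mean radius, and the function names are made up. It simply counts how often the truly tighter load also looks tighter on paper for a given number of shots per group.

```python
import math
import random


def mean_radius(shots):
    """Average distance of each shot from the group's own center."""
    cx = sum(x for x, _ in shots) / len(shots)
    cy = sum(y for _, y in shots) / len(shots)
    return sum(math.hypot(x - cx, y - cy) for x, y in shots) / len(shots)


def simulate_group(n, sigma, rng):
    """Simulate n shots with independent normal scatter (std dev sigma) in x and y."""
    return [(rng.gauss(0, sigma), rng.gauss(0, sigma)) for _ in range(n)]


def chance_better_looks_better(n_per_group, sigma_a, sigma_b, trials=2000, seed=1):
    """Fraction of simulated tests in which load A shows the smaller mean radius."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        a = mean_radius(simulate_group(n_per_group, sigma_a, rng))
        b = mean_radius(simulate_group(n_per_group, sigma_b, rng))
        if a < b:
            wins += 1
    return wins / trials


if __name__ == "__main__":
    # Assumed dispersions: load A is 10% tighter than load B.
    for n in (5, 20, 80):
        p = chance_better_looks_better(n, sigma_a=0.90, sigma_b=1.00)
        print(f"{n} shots per group: A looks better in {p:.0%} of simulated tests")
```

A result near 50% means the test is close to a coin flip at that sample size. Real rifles will not follow this simple model exactly, so treat the numbers as a guide to how many rounds to load, not as a prediction.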
 
Very true.

But when you shoot in different conditions, you are doing another test.

One thing about Dan's OCW method is that it looks for a charge weight that will be usable even when various things change a bit.
 
A science fiction writer (I think it was Asimov, or maybe Heinlein) wrote that 'Anything which is indistinguishable from magic..., is magic.'.

Well; I'm not in the business of magic.

With this in the back of our minds, I think we overcomplicate our load development process when we attempt to split our hairs into finer and finer denominations. In my world, this is precisely the kind of information I am NOT looking for. If the product of my endeavor is so easily influenced, that's an attribute I'm trying to minimize. The stuff I'm looking for is going to be robust and invulnerable to such ephemeral influence, or I'm not going to follow such an avenue to any sort of a conclusion. This is why I choose things like propellant burn rates and bullet vs twist rates with additional care at the earliest stage of a development cycle. Pushing limits and establishing critical dependencies are not an intentional part of my agenda. This is where the pursuit of ultimate accuracy, as opposed to reliably adequate accuracy, derails our intended outcomes.

Now, about the unconscious influence of prior knowledge on the testing process; I think it's a question of mental discipline. I'm not saying that it's not a factor, I'm saying it's a matter of preconception. If we are truly seeking to find out how a specific variable influences an outcome, making any kind of an assumption about outcomes is simply faulty logic. If we think we can predict an outcome, we're usually dead wrong, and it's only pride that allows us to assume such an omniscient attitude. We test precisely because we don't know; because if we knew, the testing would be redundant. Likewise, knowing which variable is crucial, and knowing which value of that variable is present, can allow us to place most emphasis on technique, deriving the most relevant benefit from that particular increment.

I've allowed myself to get quite anal about randomizing human inputs and assumptions in the testing process. My ultimate conclusion about that approach is that I'd have saved a lot of time and trouble by simply ignoring the entire issue. It's just not relevant. If I'm doing my job right, I'm so busy doing that job right that such influences never even occur to me at the time. If they do, they are simply a distraction, and that is an especially good place for that mental discipline I was talking about to intervene.

Stop and think. Being humans, we are all subject to human bias. Trying to separate that bias from the human is doomed from the start. Moreover, success should be, maybe must be, invulnerable to such bias, must work in a manner that consciously recognizes and acknowledges that we constantly exist within the midst of this bias. Only then are our solutions viable in a real world. So any goals need to be framed in such a context.

Greg
 
You cannot simply take two rows of assembled cartridges to the range or field and fire the "improved" group at one target and the "unimproved" group at another. Your own psychology--the placebo effect--will call "advantage improved" before the first shot is fired, and your subtle, unconscious behavior will quite possibly force the outcome that you already suspect, and perhaps want to believe.

I have soooooo fallen into this trap :embarrassed:
As always, thanks Dan. Fantastic write-up, thank you for sharing your info and knowledge:cool:
 
The problem is, you cannot control that bias.

That is why all proper scientific studies are done as double blind studies. The people doing the work do not know which is the control and which is the thing being studied, and they do not know the expected outcome.

And unless you expect some outcome, you wouldn't do the study in the first place. :) You set a hypothesis, and test to see if you are right or wrong.

With shooting, the hypothesis is that if I do X, it will improve something. Very few people test to see what makes things WORSE. :)
 
I think it was Clarke and it was more like "any sufficiently advanced technology is indistinguishable from magic."

The third of his three laws.

Clarke's Three Laws are three "laws" of prediction formulated by the British writer Arthur C. Clarke. They are:
1 - When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
2 - The only way of discovering the limits of the possible is to venture a little way past them into the impossible.
3 - Any sufficiently advanced technology is indistinguishable from magic.
 
I think it was Clarke and it was more like "any sufficiently advanced technology is indistinguishable from magic."

Clarke it is.

That wasn't really what I was trying to say, but I'll take it anyway.

Maybe I'm different, and maybe I'm just kidding myself, but doing the double blind bit never rendered any different results for me than just marking things according to what they were and then running the tests. Moreover, as I suggest, I think that faking myself out is excessive, maybe (for me, anyway) pretentious, and accomplishes little more for me than sowing self-doubt. Believe me, I already have more than enough of that without going out of my way to add another helping.

Real professionals do scientific studies, with reports and peer review and that formal stuff. Me, I'm just trying to find a handload that my rifle likes.

But to each their own; I just wanted to add my own (usually kooky) viewpoint.

Greg
 
The point is, did you really find a better load, or did you just shoot more carefully with the loads that SHOULD have been better?

Double blind tests take that out of the equation.

During his course, Dan mentioned a top shooter who was loading for a big match. He was very careful loading and any rounds that did not feel right or he had any doubt about he set aside. He went to the match, easily won it, and on the way home, realized he had brought the wrong box of ammo, not the carefully selected rounds, but the rounds he had set aside as not good enough. But as he was shooting the match, he THOUGHT he had his best rounds.
 
Funny. All my weighed brass, turned necks, cleaned primer pockets, deboned primer pockets, checked for concentricity, voodoo applied, sacrificed a live rooster LOADS have all revealed one thing to me. As long as I use good components, anneal, and find the proper load-node for my barrel (ok, maybe that is like 2 or 4 things, I am too stupid to count, but some people already knew that), the accuracy will be there in spades.

I agree with what you are saying Dan, I have said it myself for some time. I will also add that I think a lot of the reloading gear makers have gotten like the fishing tackle industry in some ways.