I wanted to comment on the behavior of sample ES at small sample sizes. It suffers from the same problem as sample SD. As the plots below show, sample ES tends to be underestimated when the sample size is small, much like sample SD. Depending on the population SD, convergence onto the population value can take as few as 10 observations or as many as 30. Since you do not know the population SD, you have no idea how badly underestimated the statistic might be. Worse, the plots below show “on-average” results, and no one is testing their rifle 500k times just to estimate a statistic. The minimums and maximums in the table illustrate how dispersed the result of any one test could be. As one would expect, more observations shrink the range between the minimum and maximum.
TLDR:
ES at small sample sizes is unreliable.
[Attachment 8480321]
[Attachment 8480322]
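
For anyone who wants to poke at this themselves, here is a rough sketch of the kind of simulation described above. It is not the code behind the attached plots. The function name `simulate_es`, the normal distribution, the 2800 fps mean, the 10 fps population SD, and the 20,000 trials per sample size are all my own assumptions for illustration. It draws many shot strings of a given size and reports the average, smallest, and largest ES seen.

```python
import random
import statistics


def simulate_es(n_shots, sigma=10.0, mean=2800.0, trials=20_000, seed=1):
    """Simulate `trials` shot strings of `n_shots` velocities and summarize ES.

    Assumption: velocities are normally distributed with the given
    population mean and SD (fps). Real loads may not behave this way.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    spreads = []
    for _ in range(trials):
        shots = [rng.gauss(mean, sigma) for _ in range(n_shots)]
        spreads.append(max(shots) - min(shots))  # ES of this one string
    return {
        "mean": statistics.fmean(spreads),  # the "on-average" ES
        "min": min(spreads),                # luckiest single string
        "max": max(spreads),                # unluckiest single string
    }


if __name__ == "__main__":
    for n in (3, 5, 10, 20, 30):
        r = simulate_es(n)
        print(f"n={n:>2}  mean ES={r['mean']:6.1f}  "
              f"min={r['min']:6.1f}  max={r['max']:6.1f}")
```

Under these assumptions the average ES from short strings comes out lower than the average from longer strings, and the min and max columns show how far any single string can land from that average. Change `sigma` to see how the picture shifts with a different population SD.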