STDEV.S is the correct method. End of story.
The difference between the two (open Google, because statisticians don't use those terms): there is the standard deviation (STDEV.P), for which statisticians use sigma, and then there is the ESTIMATED standard deviation (STDEV.S), which is known as "s".
It's like how your measured values are "x" and the mean is "mu": Roman letters mean "measured" and Greek letters are "intrinsic", i.e. properties of the distribution itself.
Standard Deviation is the width of a Normal or Gaussian Distribution. The formula is:
f(x) = (1 / (σ√(2π))) · exp( −(x − μ)² / (2σ²) )
sigma (σ) is the Std Dev; mu (μ, the u thingy) is the mean.
It's a complicated formula, but just know that sigma controls the width and mu controls the location:
[attachment: plots of normal curves with different sigma (width) and mu (location)]
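You can make a plot like that yourself. Here's a quick sketch (mine, not the original attachment; the 2800/10 numbers are just for illustration) that codes up the formula above and draws a few curves:

[FONT=courier new]import numpy as np
import matplotlib.pyplot as plt

def normal_pdf(x, mu, sigma):
    # the formula above: exp(-(x-mu)^2 / (2*sigma^2)) / (sigma*sqrt(2*pi))
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

x = np.linspace(2750, 2850, 500)
for sigma in (5, 10, 20):  # bigger sigma = wider curve
    plt.plot(x, normal_pdf(x, 2800, sigma), label=f"mu=2800, sigma={sigma}")
plt.plot(x, normal_pdf(x, 2820, 10), "--", label="mu=2820, sigma=10")  # shifted mu
plt.legend()
plt.show()[/FONT]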
That's pretty standard stuff, but that's pure THEORY (egads a running gag in my own posts).
When you MEASURE something, you are sampling from that theoretical distribution (you get points at random from a random distribution). Assuming your distribution is normal (not always safe), you can estimate the TRUE standard deviation by using the estimated standard deviation, s. The key here is that there is some unknown TRUE sigma which we try to estimate by taking samples. The estimated and true values only begin to approach each other once the number of samples becomes very, very large.
So the TLDR version is STDEV.S would be the statistician's choice to ESTIMATE the standard deviation of your velocity.
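In numpy terms (Excel's STDEV.P divides by n, STDEV.S divides by n-1; numpy exposes that as the ddof argument):

[FONT=courier new]import numpy as np

x = np.array([2794.7, 2816.5, 2797.3, 2785.6, 2791.6])  # the five velocities from below, rounded

print(np.std(x))          # divides by n   -> Excel's STDEV.P
print(np.std(x, ddof=1))  # divides by n-1 -> Excel's STDEV.S, i.e. "s"[/FONT]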
Example: I created a distribution with mean 2800 and sigma 10. I grabbed 5 samples at random, 10 times, and took the STDEV.P and STDEV.S of each batch. However, we KNOW it's 10 (I made it!). Here are the results:
"True Method"
[13.275930373733024,
8.342655830573312,
7.23926247655942,
5.156843761443447,
8.550960467769086,
13.76093532000941,
3.2590625589140196,
7.783767404854264,
3.911747500033366,
7.031416934586289]
Estimated (STDEV.S):
[14.842941390110614,
9.327372775023447,
8.09374150227517,
5.765526599966628,
9.560264439422538,
15.385193404759432,
3.6437427123280806,
8.702516519150631,
4.373466660444734,
7.861363121939068]
With the estimated std dev, you get a value that is higher: instead of dividing by the number of samples, you divide by n-1 for math reasons (in reality, using the "true" method on a sample induces a bias towards 0). So on any one sample you could get lucky or you could get unlucky.
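Written out (these are the standard definitions, with x̄ the sample mean):

STDEV.P: σ̂ = √( Σ(xᵢ − x̄)² / n )
STDEV.S: s = √( Σ(xᵢ − x̄)² / (n − 1) )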
OR
Based on the following random data I "sampled":
array([2794.70526224, 2816.48889134, 2797.31856425, 2785.55297754,
2791.62315356])
You use confidence intervals: since I have to publish shit for the dreaded peer review, I would publish an s of 11.67 with a 95% CI of (4.07, 14.80), meaning I am 95% sure the TRUE sigma lies somewhere on the interval 4.07 to 14.80.
So I read that as a sample size of 5 means dick either way. But "s" is the correct method.
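For the record, the textbook (non-bootstrap) way to get that interval, assuming normality, is based on the chi-square distribution. A sketch with scipy (my addition; this is not what my code below does, which bootstraps instead):

[FONT=courier new]import numpy as np
from scipy.stats import chi2

x = np.array([2794.70526224, 2816.48889134, 2797.31856425,
              2785.55297754, 2791.62315356])
n = len(x)
s2 = x.var(ddof=1)  # sample variance, n-1 divisor

# 95% CI for sigma: divide (n-1)*s^2 by the chi-square quantiles, take sqrt
lo = np.sqrt((n - 1) * s2 / chi2.ppf(0.975, df=n - 1))
hi = np.sqrt((n - 1) * s2 / chi2.ppf(0.025, df=n - 1))
print(np.sqrt(s2), lo, hi)  # s ~ 11.67, interval roughly (7, 33.5)[/FONT]

Don't be shocked that it comes out wider than my bootstrap numbers; with only 5 samples the bootstrap tends to run narrow.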
Also, statisticians don't use Excel; they use R, which is a filthy language.
I teach data science/AI, so I use Python:
Go to Google Colab and reproduce my results. I prob made a mistake, but fuck it. That's how you think about this as a "statistician":
colab.research.google.com
Spaces (indentation) are important in the code, btw. Due to RNG, your numbers may vary slightly from mine; I'm not getting into pseudo-random numbers with howler monkeys.
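(If you want the exact same numbers on every run, you can seed numpy's generator first; this line is my addition:)

[FONT=courier new]import numpy as np
np.random.seed(42)  # any fixed seed makes the "random" draws repeatable[/FONT]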
[FONT=courier new]import numpy as np
import matplotlib.pyplot as plt

p = []      # "true"/population formula (STDEV.P, divides by n)
s = []      # estimated formula (STDEV.S, divides by n-1)
n = 10      # number of experiments
size = 5    # samples per experiment

for i in range(n):
    # draw 5 "velocities" from a normal with mu=2800, sigma=10
    x = np.random.normal(loc=2800, scale=10.0, size=size)
    p.append(np.std(x))            # population std dev, divides by n
    tmp = (x - x.mean()) ** 2      # squared deviations from the sample mean
    tmp = tmp.sum()
    tmp = 1 / (size - 1) * tmp     # divide by n-1 instead of n
    tmp = np.sqrt(tmp)
    s.append(tmp)                  # same as np.std(x, ddof=1)

plt.scatter(np.linspace(0, 9, 10), p, label="True")
plt.scatter(np.linspace(0, 9, 10), s, label="Estimated")
plt.legend()
plt.show()

# bootstrap a CI for sigma from the last sample x:
# resample x with replacement 100 times, compute s each time
sigma = []
for i in range(100):
    tmp = np.random.choice(x, 5, replace=True)
    print(tmp)
    sigma.append(np.sqrt((((tmp - tmp.mean()) ** 2).sum()) / 4))

# the 2.5th and 97.5th percentiles bracket the central 95% interval
print(np.percentile(sigma, 97.5))
print(np.percentile(sigma, 2.5))
print(np.sqrt(((x - x.mean()) ** 2).sum() / 4))  # the point estimate s[/FONT]