Looks like a good start. As you say, all systems have bias. Refining any model to reduce that bias comes down to eliminating extraneous influences where you can, so take this as advice on potential issues that are fixable.
What scoring system did you normalize against? Most functional scoring...