I agree that it would be helpful to have an accessible system for interface tests (i.e. registering tests for AbstractArrays that the author of a custom array type can easily find and run against their own type). Invenia published a package and/or a workflow for interface testing, but it is not well known and it requires a fair amount of ceremony to get right. This could establish whether e.g. addition must be commutative for some abstract type, or important facts about how iteration should behave.
For the issue of packages using the interface of a value incorrectly, I think tooling and greater awareness of the issues could help. An important part might be some way of discovering what the interfaces actually are. That would be handy both for people implementing new types and for people using interfaces.
Simple tools like an easy way to run a package’s tests but replacing regular arrays with star wars arrays and with @inbounds turned off might be handy. Linters and property testing have been successful for other languages, I think.
I don’t see any quick way around the correctness issues with our statistics libraries. Perhaps someone could write a bunch more tests or copy tests from R or python libraries that do similar work?