The presenter in this YouTube video says that Go does not compute PBKDF2 efficiently: https://www.youtube.com/watch?v=k_szwKBuNBw&t=418
The relevant segment runs from 6:58 to 9:00.
I don't understand the internals, but could someone please check whether the implementation still does the unoptimized 4i amount of hashing work for i iterations, rather than the optimized 2+2i?