Why measuring performance is our biggest blind spot in quantum machine learning