The AI chip conversation almost always focuses on compute and ignores memory. But modern AI accelerators are almost always limited by memory bandwidth, not by compute: the arithmetic units sit idle waiting for data to arrive. Any custom chip that does not solve the memory problem is not going to be dramatically better, no matter how much raw compute it adds.
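One way to make this concrete is a rough roofline-style estimate. In that model, attainable throughput is the smaller of peak compute and memory bandwidth times arithmetic intensity (FLOPs performed per byte moved). The sketch below is only illustrative: the peak compute and bandwidth figures are made-up round numbers, not the specs of any real chip, and the intensity of about 1 FLOP/byte assumes a matrix-vector multiply with 2-byte weights where each weight is read once, ignoring traffic for the vector itself.

```python
def attainable_flops(peak_flops, mem_bandwidth, intensity):
    """Roofline estimate: throughput is capped by compute or by memory traffic.

    peak_flops     -- peak compute in FLOP/s
    mem_bandwidth  -- memory bandwidth in bytes/s
    intensity      -- arithmetic intensity in FLOPs per byte moved
    """
    return min(peak_flops, mem_bandwidth * intensity)


def is_memory_bound(peak_flops, mem_bandwidth, intensity):
    """True when the workload sits below the roofline's ridge point."""
    return intensity < peak_flops / mem_bandwidth


# Hypothetical accelerator (assumed numbers, for illustration only).
PEAK_FLOPS = 1.0e15   # 1 PFLOP/s
MEM_BW = 2.0e12       # 2 TB/s

# Assumed workload: matrix-vector multiply with 2-byte weights.
# Each weight contributes one multiply and one add (2 FLOPs) per 2 bytes read.
INTENSITY = 2.0 / 2.0  # about 1 FLOP/byte

baseline = attainable_flops(PEAK_FLOPS, MEM_BW, INTENSITY)
double_compute = attainable_flops(2 * PEAK_FLOPS, MEM_BW, INTENSITY)
double_bandwidth = attainable_flops(PEAK_FLOPS, 2 * MEM_BW, INTENSITY)

print("memory bound:", is_memory_bound(PEAK_FLOPS, MEM_BW, INTENSITY))
print("baseline FLOP/s:", baseline)
print("with 2x compute:", double_compute)
print("with 2x bandwidth:", double_bandwidth)
```

Under these assumptions, doubling peak compute leaves the estimate unchanged, while doubling memory bandwidth doubles it. That is the point above in miniature: for a low-intensity workload, extra compute buys nothing until the memory side improves.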
