A CUDA core is just a vector processor like every GPUs since the late 90s has been made of, but with a different name so it sounds special. It doesn’t just run CUDA, it runs everything else a GPU has traditionally been for, too, and that was stuff people were doing before CUDA was introduced. There are lots of tasks that require the same sequence of operations to be applied to groups of 32 numbers.
An NPU is a much more specialised piece of hardware, and it’s only really neural network training and inference that it can help with. There aren’t many tasks that require one operation to be applied over and over to groups of hundreds of numbers. Most people aren’t finding that they’re spending lots of time waiting for neural network inference or draining their batteries doing neural network inference, so making it go faster and use less power isn’t a good use of their money compared to making their computer better at the things they do actually do.
A CUDA core is just a vector processor like every GPUs since the late 90s has been made of, but with a different name so it sounds special. It doesn’t just run CUDA, it runs everything else a GPU has traditionally been for, too, and that was stuff people were doing before CUDA was introduced. There are lots of tasks that require the same sequence of operations to be applied to groups of 32 numbers.
An NPU is a much more specialised piece of hardware, and it’s only really neural network training and inference that it can help with. There aren’t many tasks that require one operation to be applied over and over to groups of hundreds of numbers. Most people aren’t finding that they’re spending lots of time waiting for neural network inference or draining their batteries doing neural network inference, so making it go faster and use less power isn’t a good use of their money compared to making their computer better at the things they do actually do.