Extract first histogram-only pass in DeviceTopK into its own kernel
#8299
+311
−221