Submissions from runanywhere.ai

		Fastest LLM decode engine on Apple Silicon. 658 tok/s on M4 Max, beats MLX by 19% (runanywhere.ai)
		5 points by sanchitmonga22 3 months ago | 3 comments