AWS Neuron introduces speculative decoding

April 10, 2024
By admin

Today, AWS announces the release of Neuron 2.18, which moves PyTorch 2.1 support out of beta to stable and adds support for speculative decoding, with a Llama-2-70B sample, in the Transformers NeuronX library.