Medusa

Medusa decoding module for faster LLM inference.

Status

This module is currently under development. Check back for updates.

Planned Features

  • Medusa head implementation
  • Speculative decoding
  • Multi-token prediction