Llama 3.3 70B Results

Llama 3.3 70B is listed with 2 endpoints, served by Together AI and Groq.

Provider Endpoints

Llama 3.3 70B has 27 public runs across 2 providers. Provider-hosted versions can differ in serving configuration, infrastructure, and latency.

Best Public Scores

  • IFEval: 100.0% across 7 public runs.
  • MuSR: 100.0% across 6 public runs.
  • GSM8K: 97.0% across 1 public run.
  • GSM8K: 96.0% across 2 public runs.
  • IFEval: 90.3% across 1 public run.
  • MMLU: 86.5% across 3 public runs.
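Because GSM8K and IFEval each appear twice in the list above, it can help to collapse the entries into one row per benchmark. The following is a minimal sketch, not part of Benchscope: the `entries` data is copied by hand from the list, and the `summarize` helper is a hypothetical name introduced for illustration. It assumes the repeated names are separate entries for the same benchmark (for example, one per provider), which the page does not state.

```python
from collections import defaultdict

# (benchmark, best score in percent, public runs), copied from the list above.
entries = [
    ("IFEval", 100.0, 7),
    ("MuSR", 100.0, 6),
    ("GSM8K", 97.0, 1),
    ("GSM8K", 96.0, 2),
    ("IFEval", 90.3, 1),
    ("MMLU", 86.5, 3),
]


def summarize(rows):
    """Return {benchmark: (highest score, total listed runs)}.

    Hypothetical helper: assumes rows sharing a benchmark name can be merged.
    """
    best = {}
    runs = defaultdict(int)
    for name, score, n in rows:
        # Keep the highest score seen so far for this benchmark.
        best[name] = max(score, best.get(name, score))
        # Add up the run counts of all entries with this name.
        runs[name] += n
    return {name: (best[name], runs[name]) for name in best}


summary = summarize(entries)
for name, (score, n) in summary.items():
    print(f"{name}: best {score:.1f}% over {n} listed runs")
```

The run counts in the six entries sum to 20, so the list does not account for all 27 public runs reported for the model.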
