Docs

MAX provides a unified and extensible platform that includes everything you need to deploy low-latency, high-throughput AI inference pipelines into production.

Get started with MAX

What is MAX?

MAX Engine

MAX Serving

Mojo

What you can do with MAX

MAX Engine

Benchmark any model without any code

Use a simple command-line tool to execute any model in MAX Engine with MLPerf.

MAX Engine

Write a custom op with Mojo

Create custom ops for your model that get optimized with the rest of the graph.

MAX Serving

Start an inference service in Triton

Try MAX Serving in a container and respond to inference requests from an HTTP/gRPC client.

Mojo

Write Mojo code that uses Python

Learn how to write Mojo code that interoperates with Python packages like NumPy and Matplotlib.

MAX Engine

Try Llama2 or Stable Diffusion

Check out our code examples that run inference with a variety of models.

Mojo

Start coding with Mojo in your browser

Go to our Mojo coding playground that's built into this website. There's nothing to install.

MAX Engine

Run an existing model from Python

Learn how to run inference using a model from PyTorch, TensorFlow, or ONNX.

MAX Engine

Build an inference graph in Mojo

Learn how to build a high-performance inference graph in Mojo with the MAX Graph API.

Want to know more about MAX?

Read about MAX

One AI runtime for any model from any ML framework on any hardware

Unparalleled performance for generative and traditional AI models

Compatible with the tools and technologies you already use in production

Provides model extensibility with custom ops and kernels written in Mojo
