Thinking Machines Lab Launches Tinker: A Breakthrough in AI Training APIs
Introduction
Thinking Machines Lab has officially released Tinker, its advanced training API, to the public. This release brings significant enhancements, including compatibility with the Kimi K2 Thinking reasoning model, OpenAI-compatible sampling, and image input via Qwen3-VL vision models. Together, these make Tinker a practical tool for AI developers who want to fine-tune modern models without standing up their own distributed training infrastructure.
What’s Tinker?
Tinker is a user-friendly training API that specializes in fine-tuning large language models. What sets it apart is how it abstracts away the complexity of distributed training. Users write a straightforward Python loop that runs on a plain CPU machine: you specify your data or reinforcement learning environment, outline the loss function, and define the training logic. Tinker then maps this loop onto a cluster of GPUs and executes the specified computations.
Key Features of Tinker
The Tinker API provides a limited but powerful set of primitives that make the training process intuitive. These include:
- forward_backward: Computes gradients for model training.
- optim_step: Updates model weights based on the computed gradients.
- sample: Generates outputs from the model.
- Save/Load functions: For managing model states.
This simplicity allows users to concentrate on implementing supervised learning, reinforcement learning, or preference optimization without getting bogged down by GPU management issues.
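To make the shape of such a loop concrete, here is a minimal sketch built from the four primitives above. The `StubTrainingClient` class is invented for this illustration: it stands in for Tinker's real training client, and its internals (a single scalar "weight" trained toward a target with squared-error loss) are toy stand-ins, not Tinker's actual behavior. Only the method names mirror the API.

```python
# Illustrative only: a stub standing in for Tinker's training client, showing
# the loop structure of forward_backward -> optim_step. The stub's internals
# (a scalar "model" and squared-error loss) are invented for this demo.

class StubTrainingClient:
    def __init__(self):
        self.weights = 0.0          # toy "model": one scalar weight
        self._grad = 0.0

    def forward_backward(self, batch):
        # toy loss: mean squared error between weights and each target
        targets = [target for _, target in batch]
        self._grad = sum(2 * (self.weights - t) for t in targets) / len(targets)
        return sum((self.weights - t) ** 2 for t in targets) / len(targets)

    def optim_step(self, lr=0.1):
        # apply the gradient computed by the last forward_backward call
        self.weights -= lr * self._grad

    def sample(self):
        # in the real API this would generate model outputs
        return f"weights={self.weights:.3f}"

client = StubTrainingClient()
data = [("prompt", 1.0), ("prompt", 1.0)]   # toy dataset, target value 1.0

for step in range(50):
    loss = client.forward_backward(data)    # compute gradients
    client.optim_step(lr=0.1)               # update the weights

print(client.sample())
```

The real loop has the same skeleton; only the client and the batch contents change.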
Low-Rank Adaptation (LoRA)
Tinker employs Low-Rank Adaptation (LoRA) instead of full fine-tuning for the supported models. LoRA trains small adapter matrices on top of fixed base weights. This approach minimizes memory usage, making it feasible to conduct repeated experiments with large models in the same cluster.
What’s New in Tinker?
General Availability and Kimi K2 Thinking
The most notable update in December 2025 is that Tinker is now fully available without a waitlist. Anyone can create an account, explore current models, and work with cookbook examples directly from the platform.
Among the models available, users can now fine-tune moonshotai/Kimi-K2-Thinking, a reasoning model with roughly 1 trillion parameters in a mixture-of-experts architecture. It excels at long reasoning sequences and complex tool use, making it the largest offering in Tinker’s catalog.
Capabilities of Kimi K2 Thinking
In the Tinker lineup, Kimi K2 appears as a Reasoning MoE model. Alongside it, you’ll find Qwen3’s dense and mixture of experts variants, Llama-3 generation models, and DeepSeek-V3.1. Reasoning models like Kimi K2 generate internal thought processes before delivering final outputs, whereas instruction models prioritize response speed.
OpenAI Compatible Sampling
Previously, Tinker had a native sampling interface through its SamplingClient. The latest version introduces a second sampling pathway that aligns with OpenAI’s completions interface. You can now reference a model checkpoint on Tinker using a URI, allowing for simple integration with existing OpenAI tools. For example:
# openai_client is a standard OpenAI Python client configured to point at
# Tinker's inference endpoint; the tinker:// URI identifies a checkpoint
# saved during a training run.
response = openai_client.completions.create(
    model="tinker://0034d8c9-0a88-52a9-b2b7-bce7cb1e6fef:train:0/sampler_weights/000080",
    prompt="The capital of France is",
    max_tokens=20,
    temperature=0.0,
    stop=["\n"],
)
Image Input with Qwen3-VL
Another exciting feature is the introduction of image input capabilities via Qwen3-VL models. Tinker now offers two variants: Qwen/Qwen3-VL-30B-A3B-Instruct and Qwen/Qwen3-VL-235B-A22B-Instruct. These models are categorized as Vision MoE and can be used for both training and sampling through the same API.
To send an image to a model, you’ll create a ModelInput that combines an ImageChunk with text chunks. Here’s a quick example:
model_input = tinker.ModelInput(chunks=[
    tinker.types.ImageChunk(data=image_data, format="png"),
    tinker.types.EncodedTextChunk(tokens=tokenizer.encode("what's this?")),
])
In this snippet, image_data represents the raw bytes of the image, while format denotes the file type, such as PNG or JPEG. This setup is versatile enough for both supervised learning and reinforcement learning fine-tuning, keeping your multimodal workflows consistent.
Performance Comparisons: Qwen3-VL vs. DINOv2
To illustrate the capabilities of the new vision input feature, Tinker’s team fine-tuned Qwen3-VL-235B-A22B-Instruct for image classification tasks across four standard datasets:
- Caltech 101
- Stanford Cars
- Oxford Flowers
- Oxford Pets
Because Qwen3-VL is a language model that accepts visual input, image classification is framed as generating text sequences. When given an image, the model produces the corresponding class name in text format.
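Under this framing, evaluation reduces to comparing generated text against the label. The sketch below shows one plausible way to score such outputs: normalize both strings and count exact matches. The toy predictions and the `normalize`/`accuracy` helpers are invented for illustration; Tinker's actual evaluation code may differ.

```python
# Sketch: scoring a generative classifier by exact match after normalization.
# The predictions below are toy stand-ins for text a fine-tuned Qwen3-VL
# model might emit; the helper functions are illustrative, not Tinker's code.

def normalize(text: str) -> str:
    # strip whitespace and case so "Abyssinian\n" matches "abyssinian"
    return text.strip().lower()

def accuracy(predictions, labels):
    correct = sum(normalize(p) == normalize(l)
                  for p, l in zip(predictions, labels))
    return correct / len(labels)

# toy generations for three hypothetical Oxford Pets images
preds  = ["Abyssinian\n", "beagle", "persian"]
labels = ["abyssinian", "Beagle", "Bengal"]

print(accuracy(preds, labels))  # two of three match
```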
For comparison, a DINOv2 base model was fine-tuned alongside it. DINOv2 is a self-supervised vision transformer that encodes images into embeddings, commonly used in various vision tasks. A classification head was attached to DINOv2 to predict a distribution across multiple labels.
Both models were trained with LoRA adapters in Tinker. To measure data efficiency, the team varied the number of labeled examples per class, starting from a single example, and measured classification accuracy at each point.
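Constructing the training sets for such a sweep amounts to sampling k labeled examples per class. A minimal sketch, assuming a dataset represented as (example, label) pairs; the helper and the toy data are invented for illustration, not the benchmark's real loading code:

```python
# Sketch of building a k-shot subset: at most k labeled examples per class.
# Dataset and class names are toy placeholders, not the actual benchmarks.

import random
from collections import defaultdict

def k_shot_subset(dataset, k, seed=0):
    """dataset: list of (example, label) pairs -> at most k pairs per label."""
    rng = random.Random(seed)          # fixed seed for reproducible subsets
    by_label = defaultdict(list)
    for example, label in dataset:
        by_label[label].append((example, label))
    subset = []
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        subset.extend(items[:k])
    return subset

# toy dataset: 4 classes, 10 examples each
toy = [(f"img_{i}", f"class_{i % 4}") for i in range(40)]
one_shot = k_shot_subset(toy, k=1)
print(len(one_shot))  # one example per class -> 4
```

Sweeping k from 1 upward and fine-tuning at each setting yields the accuracy-vs-examples curve described above.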
Conclusion and Notable Insights
With Tinker now available to all, anyone can easily fine-tune open-weight LLMs using a straightforward Python training loop, while Tinker manages the complex distributed training backend.
Tinker supports Kimi K2 Thinking, an impressive 1 trillion parameter reasoning model from Moonshot AI, now offered as a fine-tunable option in its model lineup.
On top of that, the platform’s new OpenAI-compatible inference interface expands its usability, allowing sampling from training checkpoints via a tinker:// URI through standard OpenAI clients.
Lastly, the introduction of vision input through Qwen3-VL models empowers developers to create multimodal training pipelines, significantly enhancing the functionality of Tinker. The performance of Qwen3-VL-235B, fine-tuned on Tinker, demonstrates superior few-shot image classification compared to the DINOv2 baseline across various datasets, showcasing the data efficiency of large vision language models.