Research and latest news.

Code Review Bench: The Software Factory's Inspection Problem

K-Steering: Controlling Multiple Behaviors in Language Models at Once

February 26, 2026

Code Review Bench: Towards Billion Dollar Benchmarks

January 30, 2026

ARES: Open-Source Infrastructure for Online RL on Coding Agents

January 15, 2026

Beyond Static Mechanistic Interpretability: Agentic Long-Horizon Tasks as the Next Frontier

December 7, 2025

Martian Interpretability Challenge, Part 2: The Core Problems In Interpretability

October 30, 2025

Beyond Beyond Monoliths: An Exploration of Martian’s Position Paper - Part 1

October 3, 2025

Up and to the left! How Martian Uses Routing to Push the Pareto Frontier

September 30, 2025

Hiring Announcement: Fazl Barez

August 18, 2025

Approximating Human Preferences Using a Multi-Judge Learned System

Research Highlight: Guardian Loop

Beyond Monolithic AI: The Case for an Expert Orchestration Architecture

December 6, 2024

AI Safety Grant Update: Purging Corrupted Capabilities across Language Models

September 16, 2024

Martian Partners with Accenture, Launches Airlock Compliance for Enterprises

Claude Sonnet 3.5 Release: Token Prices and Jevons Paradox

Cracking the Code: Automated Prompt Optimization. Insights from Industry Leaders

Scaling AI Interpretability

AI Safety vs Capitalism

The Sustainability Challenge of AI: Tackling the Energy Footprint of LLMs

Model Mapping: The Key to AI Alignment and Beyond

Introducing RouterBench

Introducing Martian - Better AI Tools Through Better Understanding