Our Blog

Interpretable Intelligence: AI you can Understand and Trust

March 19, 2026

Alignment Without Retraining: Auditing and Controlling Steerling-8B

March 19, 2026

The FineWeb Concept Atlas

March 05, 2026

Discovering human-understandable concepts in Steerling-8B

February 27, 2026

Steering Interpretable Language Models

February 25, 2026

Steerling-8B: The First Inherently Interpretable Language Model

February 23, 2026

PRISM: Training Data Prototypes for Language Models

December 08, 2025

Scaling Interpretable Language Models to 8 Billion Parameters

December 06, 2025

Causal Diffusion Language Models

December 04, 2025

Atlas: Orienting the Pre-Training data of an LLM

December 02, 2025

Introducing Guide Labs: Engineering Interpretable and Auditable AI Systems

November 17, 2024