Interpretable Foundation Models
We have developed foundation models that can reliably explain their reasoning and are easy to steer, debug, and align.
Interpretable Model API
We have developed interpretable foundation models (LLMs, Diffusion models, and large-scale classifiers) that can:
explain and steer a model's output using human-understandable features;
provide reliable prompt attribution, i.e., indicate which parts of the prompt are important; and
identify tokens in the pre-training and fine-tuning data that most influence the model's generated output.
from guidelabs import Client

gl = Client(api_key="<your secret key>")
prompt_poem = "Once upon a time there was a pumpkin, "

response, explanation = gl.chat.create(model="cb-llm-v1",
                                       prompt=prompt_poem,
                                       prompt_attribution=True,
                                       concept_importance=True,
                                       influential_points=10)

# Explanations
prompt_attribution = explanation['prompt_attribution']
concept_importance = explanation['concept_importance']
influential_points = explanation['influential_points']
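For illustration, the explanation fields above might be inspected as follows. The shapes assumed here (per-token attribution scores, named concepts with weights, and raw influential training examples) are assumptions for this sketch, not the documented response schema.

# Hypothetical sketch: the field shapes below are assumptions, not the documented schema.
# Prompt tokens the generated output is most sensitive to.
for token, score in prompt_attribution:
    print(f"{token}: {score:.3f}")

# Human-understandable concepts that most influenced the generation.
for concept, weight in concept_importance:
    print(f"{concept}: {weight:.3f}")

# Training examples that most influenced the generated output.
for point in influential_points:
    print(point)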
from guidelabs import Client

gl = Client(api_key="<your secret key>")

# Upload the fine-tuning data file
gl.files.create(file=open("pathtodata.jsonl", "rb"),
                purpose="fine-tune")

# Fine-tune the interpretable model on the uploaded file
gl.fine_tuning.jobs.create(training_file="file-name",
                           model="cb-llm-v1")
Fine-tuning API for interpretable models
Get high-quality output and explanations
Our interpretable models are as accurate and high-performing as standard foundation models, while also providing explanations.
Fine-tune the model with customized data
Use your own data to insert high-level concepts into the model to steer and control the model's output.
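As a rough sketch of what such customized data could look like, the snippet below writes a JSONL file in which each record pairs a prompt and completion with concept labels supplied by a domain expert. The "concepts" field and the overall record layout are illustrative assumptions, not the documented fine-tuning format.

import json

# Hypothetical record layout: the "concepts" field is an assumption for illustration.
records = [
    {
        "prompt": "Summarize the clinical note: ...",
        "completion": "The patient presents with ...",
        "concepts": ["symptom", "diagnosis", "medication"],
    },
]

with open("pathtodata.jsonl", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")

The resulting file can then be uploaded and used in a fine-tuning job as shown in the API example above.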
Interpretable models that provide reliable explanations for debugging, aligning, and steering the model.
Novel Research
Our team built these new models based on a decade of insights from the machine learning literature.
Accurate and Interpretable
Our interpretable models do not underperform their black-box counterparts.
Prompt Attribution
Identify the part of the prompt that the generated output is most sensitive to.
Data Attribution
Identify which pre-training and/or fine-tuning data is most influential for the generated output.
Concept Explanation
Customize and explain the foundation model using high-level, human-understandable features provided by a domain expert.
Multi-Modal
Our models can be trained or fine-tuned on any data modality.
Interpretable ML Experts
PhD in ML interpretability from MIT; Google Brain, Meta, & Prescient Design; published more than 10 ML interpretability papers.
PhD in ML interpretability from MIT; Meta & Apple; key developer of the Captum package.
Join the Waitlist for Exclusive Early Access!
We are currently working with selected users to test these models. Sign up to get early access.
Stay Informed
If you have any questions, please feel free to contact us at info@guidelabs.ai or julius@guidelabs.ai.
Follow us on Twitter @guidelabsai for news and product updates.