How do Large Language Models Navigate Honesty and Helpfulness?

Do We Need Zero Training Loss After Achieving Zero Training Error?

The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks

Unlocking the Capabilities of Thought: A Reasoning Boundary Framework to Quantify and Optimize Chain-of-Thought
