Amanda Askell

Philosopher and AI alignment researcher

About

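Amanda Askell is a philosopher and researcher at Anthropic, where she leads the work on shaping Claude's character and values. She holds a PhD in philosophy from New York University, and her research spans moral philosophy and AI alignment, with a focus on building AI systems that are helpful, harmless, and honest. She has been instrumental in developing Claude's personality traits and ethical guidelines, bringing rigorous philosophical thinking to the practical challenge of shaping AI behavior.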

Key Contributions

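Leads Claude's character work, treating the model's character as a deliberately designed interface rather than an accidental byproduct
Brings moral philosophy into alignment practice, particularly on questions of honesty, harmlessness, autonomy, and refusals
Contributes to Anthropic's Constitutional AI approach, which uses written principles to shape model behavior during training at scale
Shows that humanities expertise can play a role inside frontier labs, not only as outside criticism but by directly shaping products
Her work also raises the governance problem behind "AI character": whose values become the default presented to millions of users?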

Videos and Interviews

Amanda Askell Answers Questions About Claude's Character

Anthropic philosopher answers community questions about her work shaping Claude


Amanda Askell Segment: Claude's Character Training

Segment from Lex Fridman Podcast #452 discussing the ethical and epistemic virtues Claude should enact
