I like the critiques of consequentialism and deontological principles, but "virtue" still feels like a stand-in for the flexible, nuanced, elusive thing we want AIs aligned to. Like conceptual negative space.
Would Anthropic's move from the earlier Constitutional AI to the current "soul" document approach be considered moving from a purely deontological (obey rules) approach to a proto-virtue approach (express certain traits like honesty, kindness, helpfulness, and possess a stable disposition)?
https://www.meaningalignment.org/
Are you aware of their work?