Accounting for Context: Shaping Moral Credences for Value Alignment

arXiv cs.AI 06/08/26, 04:00 AM Papers
moral-uncertainty value-alignment pluralistic-alignment ethics ai-alignment decision-theory simpsons-paradox
Summary
This paper argues that aggregating moral evaluations for AI value alignment must account for contextual factors, showing that ignoring context can lead to violations of the weak Pareto principle, analogous to Simpson's paradox.
arXiv:2606.06972v1 Announce Type: new Abstract: Ensuring that agent behaviours are aligned with human moral values inevitably raises the problem of how to account for the plurality of moral perspectives that societies -- and even individuals -- typically adopt. Work on moral uncertainty proposes mechanisms to fairly and democratically aggregate evaluations of actions across different moral theories. However, this paper argues that one needs to account for contextual factors when aggregating moral evaluations. For example, consequentialist perspectives assume an ability to accurately determine how an agent's actions change the world; an assumption that often does not hold in real world settings. We, therefore, formalise agent decision making under moral uncertainty, while also accounting for these kinds of contextual factors. We thereby show that a seemingly commonsensical property -- the weak Pareto principle -- is violated. We argue that this apparent problem is, in fact, a variation of Simpson's paradox, and hence reveals the limitations of aggregation mechanisms that ignore the impact of contextual factors.
Cached at: 06/08/26, 09:14 AM
# Accounting for Context: Shaping Moral Credences for Value Alignment
Source: [https://arxiv.org/html/2606.06972](https://arxiv.org/html/2606.06972)
###### Abstract

Ensuring that agent behaviours are aligned with human moral values inevitably raises the problem of how to account for the plurality of moral perspectives that societies – and even individuals – typically adopt. Work on moral uncertainty proposes mechanisms to fairly and democratically aggregate evaluations of actions across different moral theories. However, this paper argues that one needs to account for *contextual factors* when aggregating moral evaluations. For example, consequentialist perspectives assume an ability to accurately determine how an agent’s actions change the world, an assumption that often does not hold in real-world settings. We therefore formalise agent decision making under moral uncertainty, while also accounting for these kinds of contextual factors. We thereby show that a seemingly commonsensical property – the *weak Pareto principle* – is violated. We argue that this apparent problem is, in fact, a variation of *Simpson’s paradox*, and hence reveals the limitations of aggregation mechanisms that ignore the impact of contextual factors.

## 1 Introduction

*Value alignment* aims at ensuring that the behaviours of intelligent (and ultimately ‘superintelligent’) agents are aligned with human ethical values (Gabriel [2020](https://arxiv.org/html/2606.06972#bib.bib44)). However, human societies rarely exhibit moral consensus; societies (and even individuals) evaluate actions according to different ethical theories that may advocate conflicting prescriptions (Haidt [2012](https://arxiv.org/html/2606.06972#bib.bib43); Awad et al. [2018](https://arxiv.org/html/2606.06972#bib.bib88); MacAskill et al. [2020](https://arxiv.org/html/2606.06972#bib.bib9)). *Pluralistic alignment* (Sorensen et al. [2024](https://arxiv.org/html/2606.06972#bib.bib162)) (also called *democratic value alignment* (Gabriel [2020](https://arxiv.org/html/2606.06972#bib.bib44))) is therefore concerned with aligning agents’ actions in a way that respects the plurality of ethical values found in human societies.

A notable approach to value alignment adopts a multi-step pipeline (Noothigattu et al. [2019](https://arxiv.org/html/2606.06972#bib.bib127)), presented here in a modified and simplified version, that: 1) elicits individuals’ ethical preferences in simulated, ethically-salient scenarios (e.g. the Moral Machine experiment (Awad et al. [2018](https://arxiv.org/html/2606.06972#bib.bib88))); 2) uses some form of inverse reinforcement learning (Adams et al. [2022](https://arxiv.org/html/2606.06972#bib.bib2)) to derive the individuals’ reward (and utility) functions from these preferences; 3) aggregates these functions so as to appropriately account for the diverse preferences, where the results of aggregation are adopted as the agent’s reward function.

This paper focuses on the third step, with implications for the whole pipeline; specifically, we focus on formalisms inspired by recent philosophical studies of ‘*moral uncertainty*’ (MacAskill et al. [2020](https://arxiv.org/html/2606.06972#bib.bib9)). The core idea is to extend the standard decision-theoretic framework to account for uncertainty about which ethical theories should be used to evaluate actions. This uncertainty is expressed in the form of moral credences in various ethical theories, extending the notion of credences (subjective probabilities) standardly associated with beliefs (Skyrms [2000](https://arxiv.org/html/2606.06972#bib.bib10)) to credences in moral theories. For example, given ethical theories $t_1$ and $t_2$, each assigning numerical evaluations to actions $a$ and $b$, these evaluations are aggregated using a social choice framework that accounts for the relative proportion of individuals endorsing each theory. (Within pluralistic alignment, *social choice* is often used to formally study the process of aggregating different values (Baum [2020](https://arxiv.org/html/2606.06972#bib.bib141); Qiu [2024](https://arxiv.org/html/2606.06972#bib.bib164); Conitzer et al. [2024](https://arxiv.org/html/2606.06972#bib.bib163)).) Intuitively, these proportions are expressed as moral credences; indeed, these credences may also express an individual’s relative confidence in the appropriateness/suitability of the theories’ possibly distinct evaluations. Consequently, moral uncertainty has been used in AI research to aggregate the values of different groups within societies (Bogosian [2017](https://arxiv.org/html/2606.06972#bib.bib21); Bhargava and Kim [2017](https://arxiv.org/html/2606.06972#bib.bib67); Ecoffet and Lehman [2021](https://arxiv.org/html/2606.06972#bib.bib30); Martinho et al. [2021](https://arxiv.org/html/2606.06972#bib.bib59)). In summary then, moral uncertainty advocates explicit reference to the plurality of ethical theories that account for the diversity of elicited human evaluations in the above-mentioned pipeline.

In this paper, we examine how contextual factors influence the moral credences assigned to ethical theories and, consequently, how accounting for these factors impacts the outcomes of moral aggregation. Consider an agent and the question of whether the agent should evaluate actions from a deontological or utilitarian perspective. Suppose a population is evenly split: $50\%$ prefer a deontological evaluation and $50\%$ a consequentialist one, resulting in equal credence ($0.5$) assigned to each theory. These evaluations are understood as measures of the actions’ overall ‘goodness’. However, such credences are typically elicited in response to generic scenarios – such as those used in standard trolley problems (Foot [1967](https://arxiv.org/html/2606.06972#bib.bib56)) or large-scale studies like the Moral Machine experiment (Awad et al. [2018](https://arxiv.org/html/2606.06972#bib.bib88)) – where context-specific details are abstracted away. In practice, it is neither realistic nor adequate to ignore these contextual factors. For instance, in a particular setting, the availability of computational resources for accurately estimating the consequences of actions should influence the weight (i.e. moral credence) assigned to a consequentialist evaluation. This suggests that moral credences are not static, but should be dynamically adjusted in light of relevant contextual constraints.

#### Running Example

Let us present our running example, which we use to demonstrate many of our results regarding contextual features. Consider a small, mobile firefighter robot called FROBO whose main aim is to contain fires and facilitate rescue in environments inaccessible to humans. We suppose FROBO is deployed in a burning hospital, in a country that is under military blockade. FROBO has been sent to a hallway with two rooms, one on the left and one on the right, both of which contain fires. FROBO only has the capacity to successfully deal with one of these fires. The left room contains a significant amount of insulin, a medicine that is extremely hard to come by while the country is under blockade. Without this insulin, many diabetics may die if the blockade is not lifted. On the other hand, there is an unconscious patient in the right-hand room. If FROBO does not attend to the right room, the patient will die. What should FROBO do?

![Refer to caption](https://arxiv.org/html/2606.06972v1/FROBO.png)

Figure 0: FROBO’s dilemma: whether to save a person in immediate danger (right room) or preserve enough insulin to treat dozens of patients (left room).

#### Contributions

Section [2.1](https://arxiv.org/html/2606.06972#S2.SS1) provides the necessary background for our formal framework as well as a short summary of deontological and utilitarian ethics. In Section [2.2](https://arxiv.org/html/2606.06972#S2.SS2), we review a set of contextual features that, we argue, should be considered when aggregating moral preferences derived from simulation-based studies. Sections [3](https://arxiv.org/html/2606.06972#S3) and [4](https://arxiv.org/html/2606.06972#S4) present the paper’s core contributions. We begin (Section [3](https://arxiv.org/html/2606.06972#S3)) by formalising a model of moral uncertainty that explicitly incorporates these contextual factors. Then, in Section [4](https://arxiv.org/html/2606.06972#S4), we demonstrate that a broad class of aggregation methods violates the *weak Pareto principle*, a property often regarded as non-controversial. We interpret this result as a variant of the well-known *Simpson’s paradox*, which we take to highlight the conceptual and practical limitations of context-insensitive aggregation. We argue that this insight motivates a new area of research aimed at understanding how contextual features shape aggregation of moral evaluations. Section [5](https://arxiv.org/html/2606.06972#S5) outlines key directions for future work. In particular, we suggest an alternative approach to addressing the limitations of our current three-step pipeline: the integration of ‘thicker’ ethical information, in particular through the use of structured dialogues and argumentation.

## 2 Accounting for Contextual Factors in Shaping Moral Credences

### 2.1 Background

This paper’s contributions are presented within a framework proposed by (MacAskill et al. [2020](https://arxiv.org/html/2606.06972#bib.bib9)) and formalised in (Szabo et al. [2024](https://arxiv.org/html/2606.06972#bib.bib70)). Let $T$ be a set of ethical theories and $A$ a set of actions. Each ethical theory $t \in T$ is a function $t: A \to \mathbb{R}$, so that for each action $a \in A$, $t(a)$ represents the *choiceworthiness* of action $a$ with respect to theory $t$. The induced ordering $\preceq_t$ of a theory $t$ is the implicit ordering of actions by their choiceworthiness, i.e. for all $a, b \in A$, $a \preceq_t b$ iff $t(a) \leq t(b)$. A *credence function* $c$ is a function that maps each ethical theory $t \in T$ to a real number $c(t) \in (0, 1)$ such that the weights of all the theories sum up to $1$, i.e. $\sum_{t \in T} c(t) = 1$. Finally, an *ethical framework* $\langle T, c \rangle$ is a pair consisting of a set of ethical theories $T$ and a credence function $c$.

This formalism facilitates the study of *metanormative theories*: various aggregation functions that yield an overall ranking of actions by integrating the theories’ choiceworthiness, each weighted by its associated credence $c(t)$. For example, in (Szabo et al. [2024](https://arxiv.org/html/2606.06972#bib.bib70)), the issue of which metanormative theories avoid fanatical outcomes is studied, where a fanatical outcome arises when a theory $t$ with low credence (i.e. advocated by very few in a society) has an outsized effect upon aggregation.
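As an illustration of this formalism, the following minimal sketch represents an ethical framework $\langle T, c \rangle$ in Python and aggregates choiceworthiness by a credence-weighted sum, i.e. the expected choiceworthiness that the Maximum Expected Choiceworthiness (MEC) rule maximises. The names `EthicalFramework`, `mec` and `rank`, the action labels `l` and `r`, and all choiceworthiness numbers are assumptions made purely for illustration; only the credences $0.6$ and $0.4$ reappear later in the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Theory = Callable[[str], float]  # maps an action to its choiceworthiness


@dataclass
class EthicalFramework:
    theories: Dict[str, Theory]  # the set T, keyed by a theory name
    credence: Dict[str, float]   # the credence function c

    def __post_init__(self) -> None:
        # Every theory needs a credence in (0, 1), and credences sum to 1.
        assert set(self.theories) == set(self.credence)
        assert all(0.0 < v < 1.0 for v in self.credence.values())
        assert abs(sum(self.credence.values()) - 1.0) < 1e-9


def mec(framework: EthicalFramework, action: str) -> float:
    """Credence-weighted sum of choiceworthiness for one action."""
    return sum(framework.credence[name] * theory(action)
               for name, theory in framework.theories.items())


def rank(framework: EthicalFramework, actions: List[str]) -> List[str]:
    """Order actions from highest to lowest expected choiceworthiness."""
    return sorted(actions, key=lambda a: mec(framework, a), reverse=True)


# Hypothetical choiceworthiness values (not taken from the paper).
def utilitarianism(action: str) -> float:
    return {"l": 10.0, "r": 1.0}[action]


def deontology(action: str) -> float:
    return {"l": 0.0, "r": 5.0}[action]


frobo = EthicalFramework(
    theories={"u": utilitarianism, "d": deontology},
    credence={"u": 0.6, "d": 0.4},
)
```

With these made-up numbers, `rank(frobo, ["r", "l"])` places `l` first, since the expected choiceworthiness is $0.6 \cdot 10 + 0.4 \cdot 0 = 6$ for `l` against $0.6 \cdot 1 + 0.4 \cdot 5 = 2.6$ for `r`. Other metanormative theories would replace `mec` with a different aggregation function over the same inputs.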

In this paper we focus on two ethical theories: *deontology* and *utilitarianism*. Utilitarianism is a consequentialist ethical theory (Sinnott-Armstrong [2022](https://arxiv.org/html/2606.06972#bib.bib39)): the right thing to do is the action whose consequences impartially maximise well-being. On the other hand, deontological theories (Alexander and Moore [2021](https://arxiv.org/html/2606.06972#bib.bib48)) specify rules encoding obligations and prohibitions on agent behaviours, without necessarily accounting for the consequences of abiding by these rules (hence the deontological maxim ‘ends do not justify the means’). We also consider the *two-level utilitarianism* variant of utilitarianism (Bauer [2020](https://arxiv.org/html/2606.06972#bib.bib14)), a kind of synthesis of utilitarianism and deontology. Normally, ethical rules guide agent behaviours. However, in novel or unforeseen situations – where no applicable rules exist or the existing ones fall short – agents revert to direct consequentialist calculations to determine the morally appropriate course of action.

### 2.2 The Impact of Contextual Factors

Recall (Section [1](https://arxiv.org/html/2606.06972#S1)) that we assume that moral credences are a proxy for the relative proportions of individuals in a group that advocate for the use of a theory $t$’s choiceworthiness. We do not consider here the issue of how the exact choiceworthiness values are assigned (although we comment further on this issue in Section [5](https://arxiv.org/html/2606.06972#S5)). Suffice it to say that, as discussed in (MacAskill et al. [2020](https://arxiv.org/html/2606.06972#bib.bib9)), different theories posit different moral stakes associated with actions. For example, the deontological prohibition on torture is absolute and thus may assign a negative choiceworthiness that is an order of magnitude greater than the negative choiceworthiness assigned by consequentialism (as in the archetypal dilemma in which torture might be used on a terrorist to locate a ticking bomb).

We also assume that the credence-adjusted aggregation that is used to align the utility function of an AI agent is obtained on simulated scenarios, and so does not account for the specific contexts in which the AI agent is deployed (although Section [5](https://arxiv.org/html/2606.06972#S5) will consider examples where an individual’s relative credences in distinct ethical theories are used to support AI decision making ‘on the fly’).

We are therefore interested in accounting for how context\-specific factors can be used, on the fly, to adjust the moral credences in the relevant ethical theories that are elicited on simulations, and the impact this has on the credence adjusted aggregations\. We now briefly review a non\-exhaustive list of context\-specific factors:

#### Resource Bounds

A crucial insight from the study of *machine ethics* is that different ethical theories necessitate different algorithmic implementations (Bench-Capon [2020](https://arxiv.org/html/2606.06972#bib.bib113)). In this paper we assume (as in, for example, (Bauer [2020](https://arxiv.org/html/2606.06972#bib.bib14))) that rule-based ethics such as deontology can be implemented efficiently, although this is a matter of contention (Stenseke [2024](https://arxiv.org/html/2606.06972#bib.bib167)). For example, if deontological rules are specified in *propositional defeasible logic* then inference has linear time complexity (Maher [2001](https://arxiv.org/html/2606.06972#bib.bib166)). In contrast, we assume that consequentialist ethics, such as utilitarianism, cannot be implemented efficiently, for three reasons (Stenseke [2024](https://arxiv.org/html/2606.06972#bib.bib167)): 1) there is a combinatorial explosion in the number of possible actions that an agent has to evaluate; 2) causal predictions are expensive under epistemic uncertainty; 3) causal predictions are expensive in the presence of other agents. While Reinforcement Learning (RL) methods can alleviate these issues, such stochastic methods presuppose a trial-and-error method, which can be disastrous in morally-salient situations.

With this in mind, the essential idea is that under resource bounds some ethical theories fare well, while others do not\. For example, assume for simplicity that FROBO has a single deontological ethical rule: fight fires in rooms where someone is in immediate danger\. In our running example, the rule applies only to the right room, and so deontically prescribes a preference for FROBO attending to the right room\.

Consequentialist ethical theories do not fare well under resource bounds. FROBO has limited time; if FROBO waits too long, both rooms will burn down. FROBO also has limited computational resources for reasoning; FROBO is a small robot, not a supercomputer. Suppose then that under the given bounds, and given the available information, FROBO predicts that the military blockade will stay in place and that insulin supplies will remain scarce for the foreseeable future. Hence, FROBO calculates that saving the medicine, not the patient, is the action that saves the greatest number of people. Therefore, FROBO attends to the left room.

However, such predicted outcomes can easily be wrong. FROBO’s resource-bounded processing of the available information indicates that ongoing negotiations are unlikely to end soon. However, a more thoroughgoing analysis would have indicated that the military blockade would likely end relatively soon. FROBO reached the wrong conclusion, but would have reached the right conclusion if its reasoning were subject to less strict bounds. As a result, FROBO’s implementation of utilitarianism is unreliable in the given resource-bounded context. On the other hand, whether the blockade will continue or not makes no difference to the deontological obligation to attend to the right room. Consequently, under resource bounds, not every theory is equally appropriate. The broader lesson demonstrated by this scenario is that when aggregating the evaluations of ethical theories, it does not suffice to rely on their relative support (moral credences); one needs to account for the extent to which concrete context-specific factors differentially impact the appropriateness of their application. In the case of FROBO, even if a majority of people prefer utilitarianism to deontology, deontology’s evaluation should take precedence.

#### Uncertainty

Another contextual factor, closely related to resource bounds, is uncertainty about: i) the facts; ii) whether an action will have the desired (intended) consequences; iii) the consequences tout court. While it is unrealistic to comprehensively account for uncertainty in simulated scenarios, it can significantly influence moral evaluation. Consider a case where deontology prescribes action $a$, whereas utilitarianism advocates action $b$, based on the immediate consequences of the actions. While uncertainty of types i) and ii) may not differentiate between deontology and utilitarianism, the extent to which there is uncertainty of type iii) – broader or more downstream consequences – may well impact the moral credence assigned to utilitarianism.

The trolley dilemmas (Foot [1967](https://arxiv.org/html/2606.06972#bib.bib56)) are a classic example. A common objection that subjects raise when asked to judge whether to steer the trolley so as to kill the one, rather than let the trolley continue on its way and kill the five, is that nothing is assumed known about the five or the one (Greene [2015](https://arxiv.org/html/2606.06972#bib.bib149)). What if the five were escaped murderers from a local prison, and the one a renowned scientist on the verge of a major medically beneficial discovery? This is an admittedly extreme and unrealistic scenario, albeit one that makes an intuitive point: the greater the uncertainty w.r.t. downstream consequences, the less credence one might intuitively assign to consequentialist evaluations. (Research in moral psychology supports the view that as uncertainty increases, the appropriateness of relying on utilitarian calculations diminishes (Kortenkamp and Moore [2014](https://arxiv.org/html/2606.06972#bib.bib17)).)

#### Novelty

*Rule utilitarianism* (Sinnott-Armstrong [2022](https://arxiv.org/html/2606.06972#bib.bib39)) is, as the name implies, a rule-based variant of utilitarianism (the version of utilitarianism we have encountered thus far is called *act utilitarianism*). Under this view, agents ought to act according to moral rules derived from utilitarian calculations – rules that, when generally followed, promote the greatest overall well-being in the long run. Such adherence serves two key purposes: it avoids the need for computationally costly utility calculations on a case-by-case basis (Bauer [2020](https://arxiv.org/html/2606.06972#bib.bib14)), and it facilitates social coordination (Serramia et al. [2018](https://arxiv.org/html/2606.06972#bib.bib170)).

However, the rules may be imperfect: in novel or unexpected situations, strict adherence to the rules may result in undesirable outcomes (Bench-Capon and Modgil [2017](https://arxiv.org/html/2606.06972#bib.bib169), [2019](https://arxiv.org/html/2606.06972#bib.bib171)). For example, consider the famous ‘no vehicle in the park’ example (Schlag [1999](https://arxiv.org/html/2606.06972#bib.bib173)). There is a rule that states that no vehicles are to drive into the park, as their presence can endanger those on foot. Now, consider that there is an emergency and an ambulance is needed to pick up an injured person. Clearly, the rule prohibiting the ambulance’s entry did not anticipate this situation; the ambulance should enter the park. Indeed, as some philosophers argue (Hare [1963](https://arxiv.org/html/2606.06972#bib.bib172)), in such novel situations, agents ought to reason from first principles. An ambulance driver ought to realise that the rule is mistaken and should be violated. Therefore, some rule-based ethical theories, such as rule utilitarianism, perform poorly in unexpected, novel situations. In such situations, their moral credence ought to be lowered.

Note that these concerns also extend to deontological rules that are not grounded in consequentialist considerations. Consider the following example. A literal-minded interpretation of one of Asimov’s famous laws of robotics (Asimov [1940](https://arxiv.org/html/2606.06972#bib.bib4)) – ‘a robot may not injure a human being or, through inaction, allow a human being to come to harm’ – might entail a rescue robot intervening in a situation where a citizen is performing an emergency tracheostomy on a victim in a disaster zone. Contextual information should clearly reduce credence in this deontological prohibition.

#### Supererogation

*Supererogation* (Heyd [2019](https://arxiv.org/html/2606.06972#bib.bib1)) is the phenomenon wherein an action is good and yet not morally required. For example, jumping into a stormy and treacherous sea to save a drowning person is clearly a morally good action and yet goes beyond what is reasonably expected from anyone. We do not typically expect strangers to risk their own lives to save someone else’s.

The extent to which an action is considered supererogatory may depend on alternative courses of action that were not anticipated in simulations. Again, consider the Moral Machine experiment (Awad et al. [2018](https://arxiv.org/html/2606.06972#bib.bib88)), and suppose that on simulations in which an autonomous vehicle (AV) is in a one-way tunnel, subjects judge that the AV should swerve sharply to the right ($r$) to avoid ploughing straight on ($s$) and killing three pedestrians. However, $r$ is likely to lead to the death of the passenger, as the simulation is such that the car will crash head-on into the tunnel wall.

However, suppose that in a real-world deployment of the AV, the width of the tunnel is such that the AV can swerve slightly to the right and slide along the inside wall of the tunnel ($w$), with the non-negligible possibility of injury to the passenger and serious injury to the one pedestrian closest to the tunnel wall (a scenario unanticipated in simulations). Now, in this context, $r$ remains the preferred utilitarian option, but it has become supererogatory in a way that it was not in the simulated scenario. The presence of the unanticipated third option $w$ – which may lead to greater overall harm, but with less severe consequences for the passenger – introduces a morally acceptable alternative that renders $r$ no longer strictly required. Arguably, this expanded context should lower moral credence in the utilitarian prescription of $r$, as it shows that the original obligation was contingent on a limited set of actions.

## 3 Formalising Contextual Shaping of Moral Credences

In this section, we formalise ethical decision-making in a way that accounts for the contextual shaping of moral credences. Then, in Section [4](https://arxiv.org/html/2606.06972#S4), we show that a class of intuitively reasonable metanormative theories – including Maximum Expected Choiceworthiness (MEC) – violates the weak Pareto principle. However, we argue that this is not necessarily undesirable but rather a variation on Simpson’s paradox.

In the FROBO example, the left room contains insulin, which may save numerous lives if the military blockade continues. However, as mentioned in Section [2.2](https://arxiv.org/html/2606.06972#S2.SS2), due to resource bounds, FROBO’s utilitarian evaluation of the left room is potentially unreliable. On the other hand, FROBO can accurately evaluate the consequences of attending to the right room; it would save a single person. Therefore, we would like to modulate the credences such that for the left room, utilitarianism has a lower credence, while for the right room, its credence remains unchanged. To accommodate such context-dependent and action-specific credences, we define *credence profiles*.

###### Definition 1 (Credence profile).

Given actions $A = \{a, b, \ldots\}$ and a set of ethical theories $T$, a *credence profile* $C = \langle c_a, c_b, \ldots \rangle$ is a list of credence functions, one for each action.

For example, $c_r$ denotes the credence function for action $r$, attending to the right room.

We formalise*contextual features*as functions that affect credences as they pertain to actions:

###### Definition 3 (Contextual feature).

Given a set of actions $A$ and a set of ethical theories $T$, let a *contextual feature* be a function $g: A \times T \to (0, 1]$.

Let $rb$ denote the resource bounds contextual feature. In our FROBO example, utilitarianism $\mathbf{u}$ does not produce a highly reliable evaluation for the left room $l$ under the resource bounds that apply in the given context, and so $rb(l, \mathbf{u}) = 0.1$ is small. In general, we want lower-valued contextual features to decrease (or at least not increase) the credence of the theory. Hence, we expect the credence $c_l(\mathbf{u})$ to be decreased. On the other hand, deontology $\mathbf{d}$’s evaluation of the action $l$ is not impacted by the bounds on resources, and so $rb(l, \mathbf{d}) = 1$. Similarly, we typically expect higher-valued contextual features to increase (or at least not decrease) the credence of the theory. (One might expect that if $rb(l, \mathbf{d}) = 1$, this would have no impact on the moral credence assigned to $\mathbf{d}$. However, because the total moral credence across ethical theories must sum to $1$, a reduction in credence for one theory due to a contextual factor may require a corresponding increase in credence for another theory that is not similarly affected.) Here, since $rb(l, \mathbf{d})$ is maximal, we expect the credence $c_l(\mathbf{d})$ to be increased.

Consider the right room $r$. As discussed in Section [2.2](https://arxiv.org/html/2606.06972#S2.SS2), utilitarianism $\mathbf{u}$ produces reliable evaluations for the right room, even under the resource bounds, and so $rb(r, \mathbf{u}) = 1$ is maximal. Similarly, deontology $\mathbf{d}$’s evaluation of $r$ is also not impacted by resource bounds, and so $rb(r, \mathbf{d}) = 1$ is also maximal. (Notice that we do not allow contextual features to take the value $0$; we do not want a contextual feature to completely invalidate credence in an ethical theory.)

In Section [2.2](https://arxiv.org/html/2606.06972#S2.SS2), we reviewed a number of different contextual features. Since we want our formalism to be sufficiently general to accommodate a range of contextual factors, we define a *context* to be an $m$-ary list of contextual features.

###### Definition 4 (Context).

Given a set of actions $A$, an $m$-ary *context* $\mathit{con}$ is a list of contextual features $\langle g_1, \ldots, g_m \rangle$ such that for all $l \in [1, m]$, $g_l: A \times T \to (0, 1]$ is a contextual feature.

For example, $\langle rb \rangle$ denotes a context containing only $rb$.
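Definitions 3 and 4 can be sketched with plain Python callables and lists. The values of `rb` are those of the FROBO example; the helper name `make_feature`, the type aliases, and the choice of a lookup table as the encoding of a function $g: A \times T \to (0, 1]$ are assumptions of this sketch rather than part of the formalism.

```python
from typing import Callable, Dict, List, Tuple

# g(action, theory) -> value in the half-open interval (0, 1]
ContextualFeature = Callable[[str, str], float]
# A context <g_1, ..., g_m> is an ordered list of contextual features.
Context = List[ContextualFeature]


def make_feature(table: Dict[Tuple[str, str], float]) -> ContextualFeature:
    """Build a contextual feature from a lookup table (hypothetical helper)."""
    for key, value in table.items():
        # Zero is excluded: a feature may weaken a theory but never rule it out.
        if not 0.0 < value <= 1.0:
            raise ValueError(f"feature value for {key} must lie in (0, 1]")
    return lambda action, theory: table[(action, theory)]


# Resource-bounds feature for the FROBO example:
# actions l (left room) and r (right room), theories u and d.
rb = make_feature({
    ("l", "u"): 0.1, ("l", "d"): 1.0,
    ("r", "u"): 1.0, ("r", "d"): 1.0,
})

con: Context = [rb]  # the unary context <rb>
```

Further features from Section 2.2 (uncertainty, novelty, supererogation) would be additional entries in `con`, each built the same way.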

There are different options, which we encode as *adjustment functions*, as to how to formalise updates to moral credences on the basis of contextual features. So, an adjustment function takes the initial moral credences (credence function $c$) and updates them, taking into account the contextual features enumerated in the context $\mathit{con}$.

###### Definition 5 (Adjustment function).

Given a set of actions $A$, a credence function $c$ and a context $\mathit{con}$, an *adjustment function* $\mathit{adj}$ maps $\langle A, c, \mathit{con} \rangle$ to a credence profile $C$.

We use the notation $\mathit{adj}(A, c, \mathit{con})$ to denote application of $\mathit{adj}$ to $\langle A, c, \mathit{con} \rangle$. In this paper, we introduce two adjustment functions. The first – $\mathit{prod}$ – takes the product of all the contextual features and the initial credence function. In other words, each contextual feature can multiplicatively increase the credence or decrease it.

###### Definition 6 (Adjustment function $\mathit{prod}$).

Let $T$ be a set of ethical theories, $A$ a set of actions $\{a, b, \dots\}$, and $\mathit{con}$ the context $\langle g_1, \ldots, g_m \rangle$. Then, for every function $c$ that assigns a moral credence to some $t \in T$, $\mathit{prod}$ is the function that returns the context-adjusted moral credence of $t$’s evaluation of each action in $A$ (i.e., a credence profile; recall Definition [1](https://arxiv.org/html/2606.06972#Thmtheorem1) and Remark [2](https://arxiv.org/html/2606.06972#Thmtheorem2)). That is to say, for a given theory $t$, $C$ is the credence profile $\langle c_a, c_b, \ldots \rangle = \mathit{prod}(A, c, \mathit{con})$, where for $x \in A$:

$$c_x(t) = \alpha \left[ \prod_{l \in [1, m]} g_l(x, t) \right] c(t)$$

and where $\alpha \in \mathbb{R}^{+}$ is a normalising constant defined such that $\sum_{t \in T} c_x(t) = 1$. (Recall that the credence functions have to be unitary, i.e., sum up to $1$.)

Consider the FROBO example with initial credences $c(\mathbf{u}) = 0.6$ and $c(\mathbf{d}) = 0.4$; the initial credence of utilitarianism is higher than that of deontology. Here, the only contextual feature we take into account is resource bounds $rb$, where $rb(l, \mathbf{u}) = 0.1$ and $rb(l, \mathbf{d}) = 1$. Then the updated (context-adjusted) credences are $c_l(\mathbf{u}) = \alpha \times 0.1 \times 0.6 = 0.06\alpha$ and $c_l(\mathbf{d}) = \alpha \times 1 \times 0.4 = 0.4\alpha$. Since $c_l$ is a credence function, we must have $0.06\alpha + 0.4\alpha = 1$ and so $\alpha = \frac{50}{23}$. Therefore, $c_l(\mathbf{u}) \approx 0.13$ and $c_l(\mathbf{d}) \approx 0.87$. In other words, despite the initial credence favouring utilitarianism, the adjusted credence for the left room strongly prefers deontology.

For the right room, we have $rb(r,\mathbf{u})=1$ and $rb(r,\mathbf{d})=1$, as explained earlier (following Definition [3](https://arxiv.org/html/2606.06972#Thmtheorem3)). Then, the updated credences for the right room are $c_r(\mathbf{u})=\beta\times 1\times 0.6=0.6\beta$ and $c_r(\mathbf{d})=\beta\times 1\times 0.4=0.4\beta$. Since $c_r$ is a credence function, we must have $0.6\beta+0.4\beta=1$ and so $\beta=1$. Therefore, $c_r(\mathbf{u})=0.6$ and $c_r(\mathbf{d})=0.4$. In other words, the adjusted credences for the right room match the initial credence $c$, unlike the adjusted credences for the left room (recall that the adjusted credence functions are individuated with respect to the different actions).
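The two $\mathit{prod}$ calculations above can be reproduced with a short script. This is a minimal sketch rather than the authors' implementation: the function name `prod_adjust`, the dictionary encoding of credences, and the lookup table `RB` are illustrative assumptions, and exact fractions are used so that the normalising constant is not rounded.

```python
from fractions import Fraction as F

def prod_adjust(c, features, action):
    """Context-adjusted credence c_x for one action under prod.

    c        : dict mapping theory -> initial credence (hypothetical encoding)
    features : list of functions g(action, theory) -> value in (0, 1]
    """
    raw = {}
    for theory, credence in c.items():
        weight = credence
        for g in features:
            weight *= g(action, theory)   # multiply in every contextual feature
        raw[theory] = weight
    alpha = 1 / sum(raw.values())         # normalising constant
    return {theory: alpha * w for theory, w in raw.items()}

# FROBO values from the text: u = utilitarianism, d = deontology.
c = {"u": F(6, 10), "d": F(4, 10)}
RB = {("l", "u"): F(1, 10), ("l", "d"): F(1),
      ("r", "u"): F(1), ("r", "d"): F(1)}

def rb(action, theory):
    return RB[(action, theory)]

c_l = prod_adjust(c, [rb], "l")   # u: 3/23 (about 0.13), d: 20/23 (about 0.87)
c_r = prod_adjust(c, [rb], "r")   # unchanged: u: 3/5, d: 2/5
```

Keeping the result as one credence function per action mirrors the credence profile $\langle c_l,c_r\rangle$ used in the text.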

The second adjustment function we introduce is $\mathit{mini}$, which takes the minimum of all contextual features and the initial credence. In other words, our updated (non-normalised) credence should not exceed the value of any contextual feature. Intuitively, this is a ‘risk-averse’ way of adjusting credences, as a low contextual feature or moral credence cannot be traded off against high contextual features or moral credences. By contrast, $\mathit{prod}$ can be said to be ‘risk-neutral’, as the low values of some contextual features can be negated by the high values of other contextual features.

###### Definition 7 (Adjustment function $\mathit{mini}$).

Let $T$ be a set of ethical theories, $A$ a set of actions $\{a,b,\dots\}$, and $\mathit{con}$ the context $\langle g_1,\dots,g_m\rangle$. Then, for every function $c$ that assigns a moral credence to some $t\in T$, $\mathit{mini}$ is a function that returns the context-adjusted moral credence of $t$’s evaluation of each action in $A$. That is to say, for a given theory $t$, $C$ is the credence profile $\langle c_a,c_b,\dots\rangle=\mathit{mini}(A,c,\mathit{con})$, where for $x\in A$:

$$c_x(t)=\alpha\min\left(\left[\min_{l\in[1,m]}g_l(x,t)\right],c(t)\right)$$

where $\alpha\in\mathbb{R}^{+}$ is a normalising constant defined such that $\sum_{t\in T}c_x(t)=1$. (Recall the unitary requirement.)

Consider again the FROBO example with initial credences $c(\mathbf{u})=0.6$ and $c(\mathbf{d})=0.4$. Again, we only take into account resource bounds ($rb$), where $rb(l,\mathbf{u})=0.1$ and $rb(l,\mathbf{d})=1$. Then the updated credences for the left room are $c_l(\mathbf{u})=\alpha\min(0.1,0.6)=0.1\alpha$ and $c_l(\mathbf{d})=\alpha\min(1,0.4)=0.4\alpha$. Since $c_l$ is a credence function, $0.1\alpha+0.4\alpha=1$ and so $\alpha=2$. Therefore, $c_l(\mathbf{u})=0.2$ and $c_l(\mathbf{d})=0.8$. In other words, despite the initial credence favouring utilitarianism, the adjusted credence for the left room strongly favours deontology.

For the right room: $rb(r,\mathbf{u})=1$ and $rb(r,\mathbf{d})=1$. Then, the updated credences for the right room are $c_r(\mathbf{u})=\beta\min(1,0.6)=0.6\beta$ and $c_r(\mathbf{d})=\beta\min(1,0.4)=0.4\beta$. Since $c_r$ is a credence function, $0.6\beta+0.4\beta=1$ and so $\beta=1$. Therefore, $c_r(\mathbf{u})=0.6$ and $c_r(\mathbf{d})=0.4$. That is, the adjusted credences for the right room match the initial credence $c$, unlike the adjusted credences for the left room.
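The same FROBO inputs can be pushed through $\mathit{mini}$ in a few lines. Again this is only a sketch under assumptions: `mini_adjust`, the dictionary encoding, and the table `RB` are hypothetical names introduced for illustration.

```python
from fractions import Fraction as F

def mini_adjust(c, features, action):
    """Context-adjusted credence c_x for one action under mini."""
    raw = {
        theory: min([g(action, theory) for g in features] + [credence])
        for theory, credence in c.items()
    }
    alpha = 1 / sum(raw.values())   # normalising constant
    return {theory: alpha * w for theory, w in raw.items()}

# FROBO values from the text: u = utilitarianism, d = deontology.
c = {"u": F(6, 10), "d": F(4, 10)}
RB = {("l", "u"): F(1, 10), ("l", "d"): F(1),
      ("r", "u"): F(1), ("r", "d"): F(1)}

def rb(action, theory):
    return RB[(action, theory)]

c_l = mini_adjust(c, [rb], "l")   # u: 1/5, d: 4/5
c_r = mini_adjust(c, [rb], "r")   # unchanged: u: 3/5, d: 2/5
```

Compared with $\mathit{prod}$, the left-room credence for utilitarianism drops to $0.2$ rather than roughly $0.13$, because the minimum ignores the size of the initial credence once a feature falls below it.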

In the moral uncertainty literature, *metanormative theories* tell us how to order the actions, given an ethical framework $\langle T,c\rangle$ consisting of a set of ethical theories $T$, a credence function $c$, and the evaluations assigned by the theories to the actions. Since we are interested in context-adjusted credences individuated w.r.t. the actions, in this paper a metanormative theory $f$ takes a credence profile as an argument, not a credence function.

###### Definition 8 (Metanormative theory).

A *metanormative theory* $f$ is a function that maps every set of ethical theories $T$ and credence profile $C$ to a total preorder $f(T,C)$ over the set of actions.

In this paper, we use MEC as our primary metanormative theory, which we now formally define. (Note that this is because MEC is the standard in the moral uncertainty literature (MacAskill [2016](https://arxiv.org/html/2606.06972#bib.bib13); Bogosian [2017](https://arxiv.org/html/2606.06972#bib.bib21)).) Given a set of ethical theories $T$ and a credence profile $C$, the weighted arithmetic mean of an action $x\in A$ is defined as:

$$\mathit{wam}(T,C,x)=\sum_{t\in T}c_x(t)\,t(x)$$

###### Definition 9 (Maximising expected choiceworthiness, $\textit{mec}$).

Let $a,b\in A$ be actions, and $\langle T,C\rangle$ an ethical framework. Then $a\preceq_{\textit{mec}}b$ iff $\mathit{wam}(T,C,a)\leq\mathit{wam}(T,C,b)$, where $\textit{mec}(T,C)=\ \preceq_{\textit{mec}}$.

Note that the weighted arithmetic mean $\mathit{wam}$ is often called the *expected choiceworthiness* (MacAskill [2014](https://arxiv.org/html/2606.06972#bib.bib23)), hence the name ‘maximising expected choiceworthiness’.

Table 1: FROBO’s evaluations and credence profile. Here, the columns $c_l(t)$ and $c_r(t)$ denote the credence profile of the ethical theories. Moreover, the columns $t(l)$ and $t(r)$ denote the evaluations of the actions (left room and right room, respectively) by the different ethical theories. Finally, $\mathit{wam}$ gives the expected choiceworthiness of the actions.

Consider FROBO with adjustment function $\mathit{mini}$ and a context consisting only of $rb$ (see Table [1](https://arxiv.org/html/2606.06972#S3.T1)). Recall our earlier calculations: $c_l(\mathbf{u})=0.2$, $c_l(\mathbf{d})=0.8$, and $c_r(\mathbf{u})=0.6$, $c_r(\mathbf{d})=0.4$. Moreover, assume that the evaluations are $\mathbf{u}(l)=1$, $\mathbf{u}(r)=0$, and $\mathbf{d}(l)=0$, $\mathbf{d}(r)=3$. In other words, while utilitarian calculations prefer the left room, deontological rules prioritise the right room. MEC orders actions according to their expected choiceworthiness $\mathit{wam}$. For the left room, we have:

$$\mathit{wam}(\{\mathbf{d},\mathbf{u}\},\langle c_l,c_r\rangle,l)=0.2\times 1+0.8\times 0=0.2$$

Similarly, for the right room, we have:

$$\mathit{wam}(\{\mathbf{d},\mathbf{u}\},\langle c_l,c_r\rangle,r)=0.6\times 0+0.4\times 3=1.2$$

Thus, due to its higher expected choiceworthiness, MEC leads FROBO to choose the right room.
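The MEC step for this example can be sketched as follows. The nested-dictionary layout (`evals[theory][action]`, `profile[action][theory]`) is an assumption made for illustration; the numbers are the $\mathit{mini}$-adjusted credences and the evaluations stated in the text.

```python
from fractions import Fraction as F

def wam(evals, profile, x):
    """Expected choiceworthiness of action x, using x's own credence function."""
    return sum(profile[x][t] * evals[t][x] for t in evals)

# Evaluations t(x) and the mini-adjusted credence profile <c_l, c_r>.
evals = {"u": {"l": 1, "r": 0}, "d": {"l": 0, "r": 3}}
profile = {"l": {"u": F(1, 5), "d": F(4, 5)},
           "r": {"u": F(3, 5), "d": F(2, 5)}}

scores = {x: wam(evals, profile, x) for x in ("l", "r")}   # l: 1/5, r: 6/5
best = max(scores, key=scores.get)                          # "r"
```

Note that each action is scored with its own credence function, which is exactly where the construction departs from standard MEC with a single credence function.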

## 4 Violation of the Weak Pareto Principle

We now define an important basic property of moral uncertainty \(and social choice\): the weak Pareto principle, which ensures that unanimous decisions are respected\.

###### Definition 10 (Weak Pareto principle).

A metanormative theory $f$ and an adjustment function $\mathit{adj}$ are said to satisfy the *weak Pareto principle* if, for every ethical framework $\langle T,c\rangle$, every context $\mathit{con}$, and every pair of actions $a,b\in A$:

- if $t(a)<t(b)$ holds for every theory $t\in T$,
- then $a\prec_f b$ must hold, where
- $\preceq_f\ =f(T,C)$ and $C=\mathit{adj}(A,c,\mathit{con})$.
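To make the definition concrete, the sketch below checks a single instance of the principle, taking MEC as the metanormative theory $f$ and taking the credence profile as already adjusted. The theory names `t1` and `t2`, the evaluations, and both profiles are hypothetical values chosen for illustration; they do not come from the paper.

```python
from fractions import Fraction as F

def wam(evals, profile, x):
    """Expected choiceworthiness of action x under its own credence function."""
    return sum(profile[x][t] * evals[t][x] for t in evals)

def violates_weak_pareto(evals, profile, a, b):
    """True if every theory ranks a strictly below b, yet MEC does not."""
    unanimous = all(evals[t][a] < evals[t][b] for t in evals)
    return unanimous and not (wam(evals, profile, a) < wam(evals, profile, b))

# Hypothetical evaluations: both theories strictly prefer b to a.
evals = {"t1": {"a": 1, "b": 2}, "t2": {"a": 3, "b": 4}}

# One shared credence function for both actions: no violation is possible.
flat = {x: {"t1": F(1, 2), "t2": F(1, 2)} for x in ("a", "b")}

# Action-specific credences, of the kind a context adjustment can produce.
skewed = {"a": {"t1": F(1, 10), "t2": F(9, 10)},
          "b": {"t1": F(9, 10), "t2": F(1, 10)}}

flat_result = violates_weak_pareto(evals, flat, "a", "b")      # False
skewed_result = violates_weak_pareto(evals, skewed, "a", "b")  # True
```

With a shared credence function the weighted means inherit the unanimous ordering; the violation only appears once the two actions are weighted differently.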

We identify a class of metanormative theories and adjustment functions – specifically *inter-theoretically responsive metanormative theories* and *context-surjective adjustment functions* – that jointly *violate* the weak Pareto property.

###### Definition 11 (Inter-theoretic responsiveness).

A metanormative theory $f$ is said to be *inter-theoretically responsive* if for all sets of ethical theories $T$ and all pairs of actions $a,b\in A$ where $u(a)>v(b)$ holds for some theories $u,v\in T$, there exists a credence profile $C$ such that $a\succ_f b$, where $\preceq_f\ =f(T,C)$.

In other words, an inter-theoretically responsive metanormative theory is such that if there is a justification for preferring $a$ over $b$ (specifically, if there exist two theories $u$ and $v$ such that $u(a)>v(b)$), then there must exist a credence profile $C$ such that $a$ is preferred to $b$ in the aggregate ordering, i.e. $a\succ_f b$ holds.

###### Definition 12 (Context surjectivity).

An adjustment function $\mathit{adj}$ is *context-surjective* if for any set of actions $A$, credence function $c$ and credence profile $C$, there exists a context $\mathit{con}$ such that $\mathit{adj}(A,c,\mathit{con})=C$.

Equivalently, $\mathit{adj}$ is context-surjective if the function $\mathit{adj}(A,c,\cdot)$ is surjective onto the set of credence profiles for all sets of actions $A$ and credence functions $c$.

Intuitively, context surjectivity means that the context can update the initial credence function arbitrarily. Both $\mathit{prod}$ and $\mathit{mini}$ are context-surjective (see Theorems [15](https://arxiv.org/html/2606.06972#Thmtheorem15) and [16](https://arxiv.org/html/2606.06972#Thmtheorem16)).

### 4.1 Proofs

We first prove that inter\-theoretically responsive metanormative theories and context\-surjective adjustment functions do not jointly satisfy the weak Pareto property\.

###### Theorem 13\.

A metanormative theory $f$ and an adjustment function $\mathit{adj}$ do not jointly satisfy the weak Pareto property if $f$ is inter-theoretically responsive and $\mathit{adj}$ is context-surjective.

###### Proof\.

Let $a,b\in A$ be arbitrary actions. Let $\langle T,c\rangle$ be any ethical framework such that for all theories $t$, $t(a)<t(b)$. Moreover, let $T$ be such that there exist theories $u$ and $v$ with $u(a)>v(b)$. By inter-theoretic responsiveness, there exists a credence profile $C$ such that $a\succ_f b$, where $\preceq_f\ =f(T,C)$.

Because $\mathit{adj}$ is context-surjective, for every credence profile $C^{*}$ there exists a context $\mathit{con}^{*}$ such that $C^{*}=\mathit{adj}(A,c,\mathit{con}^{*})$. Specifically, let $\mathit{con}$ be such that $C=\mathit{adj}(A,c,\mathit{con})$. Therefore, even though $t(a)<t(b)$ for all theories $t$, we also have that $a\succ_f b$, thereby violating the weak Pareto property. ∎

To show the significance of the above result, we first prove that MEC is inter\-theoretically responsive\.

###### Theorem 14\.

The metanormative theory MEC is inter\-theoretically responsive\.

###### Proof\.

Let $a,b$ be arbitrary actions. Let $T$ be any set of ethical theories such that there exist theories $u$ and $v$ with $u(a)>v(b)$. We now present a credence profile $C$ such that $a\succ_{\mathit{mec}}b$ holds, where $\preceq_{\mathit{mec}}\ =\mathit{mec}(T,C)$.

Under MEC, the actions are ordered by their weighted arithmetic mean. That is, for all credence profiles $C$:

$$\mathit{wam}(T,C,a)=\sum_{t\in T}c_a(t)\,t(a)$$

and

$$\mathit{wam}(T,C,b)=\sum_{t\in T}c_b(t)\,t(b)$$

By Lemma [18](https://arxiv.org/html/2606.06972#Thmtheorem18) (see later), for any $\epsilon>0$ we can set $c_a$ such that $|u(a)-\mathit{wam}(T,C,a)|<\epsilon$, i.e. $\mathit{wam}(T,C,a)\in(u(a)-\epsilon,u(a)+\epsilon)$. Specifically, such that

$$\mathit{wam}(T,C,a)>u(a)-\epsilon \tag{1}$$

Similarly, by Lemma [18](https://arxiv.org/html/2606.06972#Thmtheorem18), for any $\delta>0$ we can set $c_b$ such that $|v(b)-\mathit{wam}(T,C,b)|<\delta$, i.e. $\mathit{wam}(T,C,b)\in(v(b)-\delta,v(b)+\delta)$. Specifically, such that

$$\mathit{wam}(T,C,b)<v(b)+\delta \tag{2}$$

We now show that for appropriate $\epsilon,\delta>0$, $\mathit{wam}(T,C,a)>\mathit{wam}(T,C,b)$, i.e. $a\succ_{\mathit{mec}}b$. We know that $u(a)>v(b)$. Let $d=u(a)-v(b)>0$. Then, choose $\epsilon,\delta<\frac{d}{2}$.

To show that $\mathit{wam}(T,C,a)>\mathit{wam}(T,C,b)$, it is sufficient to show that $v(b)+\delta<u(a)-\epsilon$ (by Inequalities [1](https://arxiv.org/html/2606.06972#S4.E1) and [2](https://arxiv.org/html/2606.06972#S4.E2)). Rearranging, we need to show that $\delta+\epsilon<u(a)-v(b)$. Note that $u(a)-v(b)=d$, by definition. Using the fact that $\epsilon,\delta<\frac{d}{2}$, we obtain that $\delta+\epsilon<\frac{d}{2}+\frac{d}{2}=d$, as required. ∎
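The construction in this proof can be sketched numerically: concentrate almost all credence on $u$ when scoring $a$, and on $v$ when scoring $b$. The helper names (`concentrated`, `responsive_profile`), the particular choice $\epsilon=\delta=d/4$, and the spread bound are assumptions made for illustration; the evaluations are those of the FROBO example in Section 4.2, with the right room playing the role of $a$.

```python
from fractions import Fraction as F

def concentrated(theories, target, spread):
    """Credence giving 1 - spread to `target` and splitting spread over the rest."""
    others = [t for t in theories if t != target]
    cred = {t: spread / len(others) for t in others}
    cred[target] = 1 - spread
    return cred

def responsive_profile(evals, a, b, u, v):
    """Credence profile for actions a and b under which wam(a) > wam(b)."""
    d = F(evals[u][a] - evals[v][b])
    assert d > 0, "requires u(a) > v(b)"
    bound = max(1, max(abs(e) for t in evals for e in evals[t].values()))
    eps = d / 4                                # any value below d/2 works
    spread = min(F(1, 2), eps / (4 * bound))   # below eps / (2 * bound), cf. Lemma 18
    return {a: concentrated(evals, u, spread),
            b: concentrated(evals, v, spread)}

def wam(evals, profile, x):
    return sum(profile[x][t] * evals[t][x] for t in evals)

# Every theory prefers l to r, but v(r) = 3 > u(l) = 1.
evals = {"u": {"l": 1, "r": 0}, "v": {"l": 4, "r": 3}}
# Here theory "v" plays the proof's role of u, and theory "u" the role of v.
profile = responsive_profile(evals, a="r", b="l", u="v", v="u")
wam_r, wam_l = wam(evals, profile, "r"), wam(evals, profile, "l")
```

With these inputs the right room scores $93/32$ against $35/32$ for the left room, so MEC ranks the right room first even though both theories rank it second.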

We now prove that $\mathit{prod}$ satisfies context surjectivity.

###### Theorem 15.

The adjustment function $\mathit{prod}$ is context-surjective.

###### Proof\.

An adjustment function $\mathit{adj}$ is *context-surjective* if for every set of actions $A=\{a,b,\ldots\}$, credence function $c$ and credence profile $C=\langle c_a,c_b,\dots\rangle$, there exists a context $\mathit{con}=\langle g_1,\dots,g_m\rangle$ such that $C'=C$, where $C'=\mathit{adj}(A,c,\mathit{con})=\langle c_a',c_b',\dots\rangle$.

Specifically, we show that for arbitrary $x\in A$, we can set the context $\mathit{con}$ so that $c'_x=c_x$ holds. For $\mathit{prod}$, given a set of ethical theories $T$, for arbitrary $x\in A$, let

$$M_x'=\min_{t\in T}\frac{c(t)}{c_x(t)} \tag{3}$$

Note that for all $t\in T$, $c(t)>0$ and $c_x(t)>0$ (as they are both credence functions) and so $M_x'>0$. Moreover, let $M_x=\min(1,M_x')$.

We can now define the context $\mathit{con}$. For any $x\in A$, $t\in T$, let:

$$g_1(x,t)=M_x\times\frac{c_x(t)}{c(t)} \tag{4}$$

Note that we must have $g_1(x,t)\in(0,1]$. By definition of $M_x$ and $M_x'$, we know that $M_x\leq M'_x\leq\frac{c(t)}{c_x(t)}$. And so, we must have that $\frac{1}{M_x}\geq\frac{c_x(t)}{c(t)}$ and so $1\geq M_x\times\frac{c_x(t)}{c(t)}$, i.e. $g_1(x,t)\leq 1$. Moreover, we know that $M_x>0$ and $\frac{c_x(t)}{c(t)}>0$ and so $g_1(x,t)>0$ is also true.

Moreover, for each $l\in[2,m]$, $x\in A$, $t\in T$, let

$$g_l(x,t)=1 \tag{5}$$

By definition of $\mathit{prod}$, for each $t\in T$ and $x\in A$, it must be the case that:

$$c'_x(t)=\alpha\times g_1(x,t)\times\left[\prod_{l\in[2,m]}g_l(x,t)\right]c(t) \tag{6}$$

By Equations [4](https://arxiv.org/html/2606.06972#S4.E4) and [5](https://arxiv.org/html/2606.06972#S4.E5), we substitute for $g_1$ and $g_l$ in Eq. [6](https://arxiv.org/html/2606.06972#S4.E6):

$$\begin{aligned}
c'_x(t) &= \alpha\times M_x\times\frac{c_x(t)}{c(t)}\times\left[\prod_{l\in[2,m]}1\right]c(t) && (7)\\
&= \alpha\times M_x\times\frac{c_x(t)}{c(t)}\times c(t)\\
&= \alpha\times M_x\times c_x(t) && (8)
\end{aligned}$$

We can calculate $\alpha$ from the fact that $c'_x$ must be unitary:

$$\sum_{t\in T}c'_x(t)=1$$

Substituting in Equation [8](https://arxiv.org/html/2606.06972#S4.E8) gives

$$\sum_{t\in T}\alpha\times M_x\times c_x(t)=1$$

which can be rewritten as:

$$\alpha\times M_x\sum_{t\in T}c_x(t)=1$$

Since $c_x$ is unitary ($\sum_{t\in T}c_x(t)=1$):

$$\alpha\times M_x=1$$

which, substituted into Equation [8](https://arxiv.org/html/2606.06972#S4.E8), gives that for all $t\in T$, $x\in A$:

$$c'_x(t)=c_x(t)$$

Therefore, $\mathit{prod}$ is context-surjective.

∎
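The proof is constructive, so it can be checked on a small instance. The sketch below builds the single non-trivial feature $g_1$ of Equation 4 for one action and confirms that $\mathit{prod}$ then returns the target credence function. The function names and the target credences are hypothetical; the remaining features $g_2,\dots,g_m$ are all $1$ and are therefore omitted.

```python
from fractions import Fraction as F

def prod_context_for(c, target):
    """g_1(x, .) from Eq. 4 for one action x, given the target credence c_x."""
    m_prime = min(c[t] / target[t] for t in c)   # Eq. 3
    m = min(1, m_prime)
    return {t: m * target[t] / c[t] for t in c}

def prod_adjust_single(c, g1):
    """prod with one contextual feature, for one action."""
    raw = {t: g1[t] * c[t] for t in c}
    alpha = 1 / sum(raw.values())                # normalising constant
    return {t: alpha * w for t, w in raw.items()}

c = {"u": F(3, 5), "d": F(2, 5)}            # initial credence
target = {"u": F(1, 10), "d": F(9, 10)}     # hypothetical target c_x

g1 = prod_context_for(c, target)            # u: 2/27, d: 1
recovered = prod_adjust_single(c, g1)       # equals target
```

The scaling by $M_x$ is what keeps every feature value inside $(0,1]$; the normalising constant $\alpha=1/M_x$ then undoes it.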

Similarly, we show that $\mathit{mini}$ is also context-surjective.

###### Theorem 16.

The adjustment function $\mathit{mini}$ is context-surjective.

###### Proof\.

An adjustment function $\mathit{adj}$ is *context-surjective* if for every set of actions $A=\{a,b,\ldots\}$, credence function $c$ and credence profile $C=\langle c_a,c_b,\dots\rangle$, there exists a context $\mathit{con}=\langle g_1,\dots,g_m\rangle$ such that $C'=C$, where $C'=\mathit{adj}(A,c,\mathit{con})=\langle c_a',c_b',\dots\rangle$.

Specifically, we show that for arbitrary $x\in A$, we can set the context $\mathit{con}$ so that $c'_x=c_x$ holds. Let:

$$M=\min_{t\in T}c(t) \tag{9}$$

Note that for all $t$, $c(t)\in(0,1)$, and so $M\in(0,1)$.

We can now define the context $\mathit{con}$. For any $x\in A$, $t\in T$, let:

$$g_1(x,t)=M\times c_x(t) \tag{10}$$

Note that $g_1$ is a contextual feature: we have $g_1(x,t)\in(0,1]$, as $M\in(0,1)$ and $c_x(t)\in(0,1)$ and so their product is also between $0$ and $1$. Moreover, since $c_x(t)<1$, we have $g_1(x,t)\leq M$. Moreover, for any $l\in[2,m]$, any $x\in A$, $t\in T$, let

$$g_l(x,t)=1 \tag{11}$$

By definition of $\mathit{mini}$, for any $t\in T$:

$$c'_x(t)=\alpha\min\left(g_1(x,t),\left[\min_{l\in[2,m]}g_l(x,t)\right],c(t)\right) \tag{12}$$

which, when substituting for $g_1(x,t)$ (Eq. [10](https://arxiv.org/html/2606.06972#S4.E10)) and $g_l(x,t)$ (Eq. [11](https://arxiv.org/html/2606.06972#S4.E11)), gives:

$$c'_x(t)=\alpha\min\left(M\times c_x(t),\left[\min_{l\in[2,m]}1\right],c(t)\right)=\alpha\min\left(M\times c_x(t),1,c(t)\right)$$

and since $M\times c_x(t)<1$ (recall Eq. [10](https://arxiv.org/html/2606.06972#S4.E10) and $g_1(x,t)\in(0,1]$) and $c(t)<1$:

$$c'_x(t)=\alpha\min\left(M\times c_x(t),c(t)\right) \tag{13}$$

From Eq. [9](https://arxiv.org/html/2606.06972#S4.E9), we have that $M\leq c(t)$. Moreover, $c_x(t)<1$ for all $t$. Hence $M\times c_x(t)<c(t)$, and so:

$$c'_x(t)=\alpha\times M\times c_x(t) \tag{14}$$

We can calculate $\alpha$ from the fact that $c'_x$ must be unitary:

$$\sum_{t\in T}c'_x(t)=1$$

Substituting in Equation [14](https://arxiv.org/html/2606.06972#S4.E14), we obtain:

$$\sum_{t\in T}\alpha\times M\times c_x(t)=1$$

which can be rewritten as:

$$\alpha\times M\sum_{t\in T}c_x(t)=1$$

Since $c_x$ is unitary ($\sum_{t\in T}c_x(t)=1$), we obtain:

$$\alpha\times M=1$$

And so, substituting into Equation [14](https://arxiv.org/html/2606.06972#S4.E14), we obtain that for all $t\in T$, $x\in A$:

$$c'_x(t)=c_x(t)$$

Therefore, $\mathit{mini}$ is context-surjective.

∎
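As with $\mathit{prod}$, the construction can be verified on a small instance. The sketch below builds $g_1$ from Equation 10 and checks that $\mathit{mini}$ recovers the target credence function. The function names and the target values are hypothetical, and the features $g_2,\dots,g_m$ are omitted because they are all $1$ and never attain the minimum.

```python
from fractions import Fraction as F

def mini_context_for(c, target):
    """g_1(x, .) from Eq. 10 for one action x, given the target credence c_x."""
    m = min(c.values())                      # Eq. 9
    return {t: m * target[t] for t in c}

def mini_adjust_single(c, g1):
    """mini with one contextual feature, for one action."""
    raw = {t: min(g1[t], c[t]) for t in c}
    alpha = 1 / sum(raw.values())            # normalising constant
    return {t: alpha * w for t, w in raw.items()}

c = {"u": F(3, 5), "d": F(2, 5)}            # initial credence
target = {"u": F(1, 10), "d": F(9, 10)}     # hypothetical target c_x

g1 = mini_context_for(c, target)            # u: 1/25, d: 9/25
recovered = mini_adjust_single(c, g1)       # equals target
```

Because every $g_1(x,t)$ lies below the smallest initial credence, the initial credences never attain the minimum, so only the (rescaled) target survives normalisation.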

From Theorems [13](https://arxiv.org/html/2606.06972#Thmtheorem13), [14](https://arxiv.org/html/2606.06972#Thmtheorem14), [15](https://arxiv.org/html/2606.06972#Thmtheorem15) and [16](https://arxiv.org/html/2606.06972#Thmtheorem16), the following result immediately follows:

###### Corollary 17.

The metanormative theory $\mathit{mec}$, paired with either (1) the adjustment function $\mathit{prod}$ or (2) the adjustment function $\mathit{mini}$, violates the weak Pareto principle.

We now prove the earlier referenced Lemma [18](https://arxiv.org/html/2606.06972#Thmtheorem18):

###### Lemma 18.

Let $S$ be a set of real numbers and let $s_{i'}\in S$ be an arbitrary element of $S$. Then, for any $\epsilon>0$, there exists a weight function $w:S\to(0,1)$ such that $\left|s_{i'}-\sum_{s_i\in S}w(s_i)s_i\right|<\epsilon$.

###### Proof\.

We prove the claim by constructing an appropriate $w$. In particular, let $w(s_{i'})=1-\delta$ and, for all $i\neq i'$, let $w(s_i)=\frac{\delta}{|S|-1}$, where $\delta\in(0,1)$ is a constant (we give its precise value later). Therefore, we must have $\sum_{s_i\in S\wedge i\neq i'}w(s_i)=\delta$, since we add up $\frac{\delta}{|S|-1}$ a total of $|S|-1$ times. Overall, we have:

$$\sum_{s_i\in S}w(s_i)s_i=(1-\delta)s_{i'}+\sum_{s_i\in S\wedge i\neq i'}w(s_i)s_i$$

And so, we must have:

$$\begin{aligned}
\left|s_{i'}-\sum_{s_i\in S}w(s_i)s_i\right| &= \left|s_{i'}-(1-\delta)s_{i'}-\sum_{s_i\in S\wedge i\neq i'}w(s_i)s_i\right|\\
&= \left|\delta s_{i'}-\sum_{s_i\in S\wedge i\neq i'}w(s_i)s_i\right|
\end{aligned}$$

Since $\delta>0$ and $w(s_i)>0$ for all $s_i\in S$, we can derive the following inequality using the triangle inequality:

$$\begin{aligned}
\left|\delta s_{i'}-\sum_{s_i\in S\wedge i\neq i'}w(s_i)s_i\right| &\leq |\delta s_{i'}|+\left|\sum_{s_i\in S\wedge i\neq i'}w(s_i)s_i\right|\\
&\leq \delta|s_{i'}|+\sum_{s_i\in S\wedge i\neq i'}w(s_i)|s_i|
\end{aligned}$$

Now, let $M'=\max_{s_i\in S}|s_i|$ and let $M=\max(1,M')$. Therefore, we must have $\delta|s_{i'}|\leq\delta M$. Moreover, for any other element $s_i$, it must be that $w(s_i)|s_i|\leq w(s_i)M$. Hence:

$$\begin{aligned}
\delta|s_{i'}|+\sum_{s_i\in S\wedge i\neq i'}w(s_i)|s_i| &\leq \delta M+\sum_{s_i\in S\wedge i\neq i'}w(s_i)M\\
&= \delta M+M\sum_{s_i\in S\wedge i\neq i'}w(s_i)
\end{aligned}$$

Note that we have defined $w$ such that $\sum_{s_i\in S\wedge i\neq i'}w(s_i)=\delta$, and so this bound equals

$$\delta M+M\delta=2\delta M$$

Note that $M>0$ is some positive constant. Therefore, we can choose $\delta$ to be arbitrarily small; namely, we can make $2\delta M<\epsilon$ for any chosen $\epsilon>0$ by setting $\delta<\frac{\epsilon}{2M}$ (while keeping $\delta<1$, so that $w$ takes values in $(0,1)$). ∎
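A quick numerical check of the lemma's construction is sketched below. The function name `near_point_weights`, the list encoding of $S$, the sample values, and the specific choice of $\delta$ (one valid value strictly below $\frac{\epsilon}{2M}$ and below $1$) are assumptions made for illustration.

```python
from fractions import Fraction as F

def near_point_weights(values, index, eps):
    """Weights in (0, 1), summing to 1, whose weighted mean is within eps of values[index]."""
    m = max(1, max(abs(s) for s in values))            # M in the proof
    delta = min(F(1, 2), F(eps) / (4 * m))             # below eps / (2M) and below 1
    w = [delta / (len(values) - 1)] * len(values)      # delta spread over the other elements
    w[index] = 1 - delta                               # almost all weight on the chosen element
    return w

values = [F(-3), F(0), F(5)]     # hypothetical set S (needs at least two elements)
eps = F(1, 100)
w = near_point_weights(values, 2, eps)
mean = sum(wi * si for wi, si in zip(w, values))       # 19987/4000
gap = abs(values[2] - mean)                            # 13/4000
```

Here the gap is $13/4000$, comfortably below $\epsilon=1/100$, and it shrinks linearly as $\delta$ is reduced.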

### 4.2 Simpson’s Paradox

We now invoke *Simpson’s paradox* to comment on what at first glance is a seemingly unintuitive violation of the weak Pareto property by context-adjusted metanormative theories. A well-known illustration of Simpson’s paradox is the gender disparity paradox observed in admissions to the University of California, Berkeley (Wagner [1982](https://arxiv.org/html/2606.06972#bib.bib174)). The fact that men were more likely to be admitted to Berkeley than women prompted the university to review each department’s admission rates. They discovered a seeming paradox: for each department, the admission rate for women was higher than for men. The explanation was that women were more likely than men to apply to departments whose admission rates were lower, with men more likely to apply to departments with higher admission rates. As a result, while women had higher admission rates for individual departments, their overall admission rate was lower.

Note that the men had higher overall acceptance rates because the acceptance rates of women in the ‘harder-to-get-into’ departments were lower than the acceptance rates of men in the ‘easier-to-get-into’ departments. In other words, for the average acceptance rate of men to be higher than that of women, there had to be two departments such that the acceptance rate of men in an ‘easier’ department was higher than the acceptance rate of women in a ‘harder’ department. This idea is captured by inter-theoretic responsiveness: if some theory $u$ evaluates an action $a$ more highly than another theory $v$ evaluates an action $b$ (i.e. $u(a)>v(b)$), then we can set the credences such that $a$ wins out over $b$. In other words, inter-theoretic responsiveness is the systematic possibility of Simpson’s paradox happening.
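The reversal described above is easy to reproduce with made-up numbers. The figures below are hypothetical and chosen only to exhibit the pattern; they are not the Berkeley admissions data, and the department labels `easy` and `hard` are likewise illustrative.

```python
from fractions import Fraction as F

# Hypothetical (admitted, applicants) counts per department and group.
data = {
    "easy": {"men": (80, 100), "women": (18, 20)},
    "hard": {"men": (4, 20), "women": (30, 100)},
}

def rate(admitted, applicants):
    return F(admitted, applicants)

def overall(group):
    """Pooled admission rate of a group across all departments."""
    admitted = sum(data[dept][group][0] for dept in data)
    applicants = sum(data[dept][group][1] for dept in data)
    return F(admitted, applicants)

# Women are ahead within every department ...
women_ahead_everywhere = all(
    rate(*data[dept]["women"]) > rate(*data[dept]["men"]) for dept in data
)
# ... yet behind once departments are pooled: 2/5 versus 7/10.
women_behind_overall = overall("women") < overall("men")

# The cross-department condition that makes the reversal possible:
# men's rate in the easy department exceeds women's rate in the hard one.
cross_condition = rate(*data["easy"]["men"]) > rate(*data["hard"]["women"])
```

The application counts play the role of the action-specific credences: each group's pooled rate is a weighted mean with its own weights, just as each action's $\mathit{wam}$ uses its own credence function.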

Table 2: FROBO example demonstrating the violation of the weak Pareto property using MEC.

Let us now consider a concrete example demonstrating Simpson’s paradox for MEC and $\mathit{prod}$. Let $\mathbf{u}$ be an (act) utilitarian ethical theory with $\mathbf{u}(l)=1$ and $\mathbf{u}(r)=0$. Let the other theory be $\mathbf{v}$, a two-level utilitarian theory. In particular, the rule-utilitarian component of $\mathbf{v}$ has a rule that FROBO should save lives wherever possible. As a result, attending to the right room is seen as quite a good action, as it satisfies an ethical norm, i.e. $\mathbf{v}(r)=3$. However, FROBO has no established ethical rules to evaluate the left room, so it resorts to evaluating the left room from first (utilitarian) principles. As stated earlier, FROBO reasons that the blockade will not be lifted and so the left room is a preferable action to the right room, i.e. $\mathbf{v}(l)=4$.

Assume that the only contextual feature FROBO takes into account is the resource bounds $rb$. Here, consider a modified version of the FROBO example wherein the computational power of FROBO is severely limited. We therefore assume that FROBO’s (act) utilitarian calculations do not fare well under these stricter resource bounds in either of the rooms, i.e. $rb(l,\mathbf{u})=rb(r,\mathbf{u})=0.1$. Since $\mathbf{v}$’s evaluation of the left room is based on similar calculations, it also does not perform well under FROBO’s resource constraints, i.e. $rb(l,\mathbf{v})=0.1$. However, the ethical rule FROBO uses to evaluate the right room is not affected by these constraints, i.e. $rb(r,\mathbf{v})=1$. In other words, $\mathbf{v}$ is an ethical theory such that only one of its prescriptions is negatively impacted by the resource bounds.

Assume the initial credences are $c(\mathbf{u})=0.6$ and $c(\mathbf{v})=0.4$. Using $\mathit{prod}$, we can calculate the updated credences. First, $c_l$ is such that $c_l(\mathbf{u})=\alpha\times 0.1\times 0.6=0.06\alpha$ and $c_l(\mathbf{v})=\alpha\times 0.1\times 0.4=0.04\alpha$. Therefore, $\alpha=10$ and so $c_l(\mathbf{u})=0.6$ and $c_l(\mathbf{v})=0.4$. In other words, for the left room the initial credences remain unchanged, i.e. $c_l=c$.

Second, $c_r$ is such that $c_r(\mathbf{u})=\beta\times 0.1\times 0.6=0.06\beta$ and $c_r(\mathbf{v})=\beta\times 1\times 0.4=0.4\beta$. Therefore, $\beta=\frac{50}{23}$ and $c_r(\mathbf{u})\approx 0.13$ and $c_r(\mathbf{v})\approx 0.87$. In other words, for the right room, the two-level utilitarian theory $\mathbf{v}$ is weighted significantly more, despite its lower initial credence.

We calculate the weighted arithmetic mean of both actions. For the left room: $\mathit{wam}(\{\mathbf{u},\mathbf{v}\},\langle c_l,c_r\rangle,l)=c_l(\mathbf{u})\times\mathbf{u}(l)+c_l(\mathbf{v})\times\mathbf{v}(l)=0.6\times 1+0.4\times 4=2.2$. For the right room: $\mathit{wam}(\{\mathbf{u},\mathbf{v}\},\langle c_l,c_r\rangle,r)=c_r(\mathbf{u})\times\mathbf{u}(r)+c_r(\mathbf{v})\times\mathbf{v}(r)\approx 0.13\times 0+0.87\times 3\approx 2.61$. Due to its higher weighted arithmetic mean, MEC prefers the right room.

That is, despite all ethical theories preferring the left room to the right room, overall the right room is preferred. Note that this happened because for the left room the evaluation of $\mathbf{u}$ was dominant, while for the right room the evaluation of $\mathbf{v}$ was dominant, where $\mathbf{u}(l)<\mathbf{v}(r)$.
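The whole example can be replayed end to end with exact arithmetic. This is a sketch under assumptions: the dictionary encodings and the helper names `prod_credence` and `wam` are illustrative, while the credences, evaluations, and resource bounds are those stated in the text.

```python
from fractions import Fraction as F

c = {"u": F(3, 5), "v": F(2, 5)}                          # initial credences
evals = {"u": {"l": 1, "r": 0}, "v": {"l": 4, "r": 3}}    # t(x)
rb = {("l", "u"): F(1, 10), ("r", "u"): F(1, 10),
      ("l", "v"): F(1, 10), ("r", "v"): F(1)}             # resource bounds

def prod_credence(x):
    """prod-adjusted credence function for action x (single feature rb)."""
    raw = {t: rb[(x, t)] * c[t] for t in c}
    alpha = 1 / sum(raw.values())
    return {t: alpha * w for t, w in raw.items()}

def wam(x):
    cx = prod_credence(x)
    return sum(cx[t] * evals[t][x] for t in c)

unanimous_for_left = all(evals[t]["r"] < evals[t]["l"] for t in c)   # True
wam_l, wam_r = wam("l"), wam("r")                                     # 11/5 and 60/23
mec_prefers_right = wam_r > wam_l                                     # True
```

Exactly, the right room scores $60/23\approx 2.61$ against $11/5=2.2$ for the left room, so the unanimous preference for the left room is overturned.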

## 5 Conclusions and Future Work

The central thesis of our paper is that contextual factors should be taken into account when aggregating ethical evaluations of actions elicited from simulations. We believe that our formalisation of context-adjusted moral credences in the ethical theories licensing evaluations, and the study thereof, opens up a new area of study within moral uncertainty research and value alignment more generally, with many crucial follow-up questions.

#### Contextually\-Weighted Social Choice

We have shown that MEC, the standard mode of aggregation in moral uncertainty, may violate the weak Pareto property. Moreover, we have argued that despite the apparent paradox, this is not problematic. This departure from standard assumptions implies that contextually-weighted preference aggregation diverges significantly from traditional social choice theory, where the weak Pareto property is typically regarded as foundational (List [2022](https://arxiv.org/html/2606.06972#bib.bib47)). Our findings therefore motivate a reassessment of the criteria used to evaluate aggregation methods in context-sensitive settings. Addressing these questions could lead to a new area of research at the intersection of moral uncertainty and social choice theory.

#### Contextual Factors

We have identified several contextual factors that influence how pluralistic ethical evaluations are aggregated. We do not claim this list to be exhaustive. An important direction for future work is to investigate more comprehensively the additional factors that may shape aggregation. One example is the extent to which unanticipated, context-specific alternative choices of action may affect the degree to which one is risk averse (Buchak [2013](https://arxiv.org/html/2606.06972#bib.bib34)), and how this may differentially affect credences in moral theories.

#### Moral Uncertainty

This paper has not addressed several key challenges in the literature on moral uncertainty. One such challenge is the *problem of fanaticism* (MacAskill et al. [2020](https://arxiv.org/html/2606.06972#bib.bib9); Szabo et al. [2024](https://arxiv.org/html/2606.06972#bib.bib70)), in which low-credence moral theories can disproportionately influence an agent’s decision-making. Notably, MEC is known to be susceptible to this problem. Further research is needed to understand how contextual factors might mitigate or exacerbate fanaticism and related phenomena.
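
To make the fanaticism worry concrete, the following minimal sketch applies a plain credence-weighted sum (the usual reading of MEC) to two hypothetical theories. The theory names `mainstream` and `fringe`, the helper `mec_choice`, and all numbers are assumptions made for illustration only.

```python
def mec_choice(credences, choiceworthiness):
    """Return the action with the highest expected choiceworthiness, plus all scores.

    credences:        theory name -> credence in that theory
    choiceworthiness: theory name -> {action -> value assigned by that theory}
    """
    actions = list(next(iter(choiceworthiness.values())))
    scores = {
        a: sum(credences[t] * choiceworthiness[t][a] for t in credences)
        for a in actions
    }
    return max(scores, key=scores.get), scores


# Hypothetical setup: a 1% credence theory assigns an enormous value to action "b".
credences = {"mainstream": 0.99, "fringe": 0.01}
choiceworthiness = {
    "mainstream": {"a": 10.0, "b": 0.0},
    "fringe": {"a": 0.0, "b": 1_000_000.0},
}

best, scores = mec_choice(credences, choiceworthiness)
print(best, scores)
```

Although the `fringe` theory holds only 1% credence, the magnitude of its evaluation outweighs the 99% credence theory, so the aggregate selects `b`. How context-adjusted credences would change this picture is exactly the open question raised above.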

#### Adjustment Functions

We introduced two simple adjustment functions: $\mathit{prod}$ and $\mathit{mini}$. However, we do not claim that these functions are *descriptive* (that is, reflective of how humans actually incorporate contextual information), nor do we claim that they are *prescriptive* (that is, that agents ought to adjust their credence in these ways). Future work should explore adjustment functions along both these dimensions. Empirical studies could help illuminate the descriptive question, while philosophical analysis can contribute to understanding the normative dimension.
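
As a rough illustration of what adjustment functions of this kind might look like, the sketch below combines a base credence with a context score in two generic ways: by multiplication and by taking the minimum. This is only a reading suggested by the names; the function names `prod_adjust` and `mini_adjust`, the single context score in $[0, 1]$ per theory, and the renormalisation step are all assumptions, and the definitions given earlier in the paper take precedence wherever they differ.

```python
def prod_adjust(credence, context_score):
    """Assumed product-style adjustment: scale the credence by the context score."""
    return credence * context_score


def mini_adjust(credence, context_score):
    """Assumed minimum-style adjustment: cap the credence at the context score."""
    return min(credence, context_score)


def normalise(weights):
    """Rescale non-negative weights so they sum to 1 (an assumed extra step)."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("cannot normalise: all adjusted credences are zero")
    return {name: w / total for name, w in weights.items()}


# Hypothetical base credences and context scores; a low score for the
# consequentialist theory might stand for outcomes that are hard to predict.
base = {"consequentialist": 0.6, "deontological": 0.4}
context = {"consequentialist": 0.3, "deontological": 0.9}

adjusted_prod = normalise({t: prod_adjust(base[t], context[t]) for t in base})
adjusted_mini = normalise({t: mini_adjust(base[t], context[t]) for t in base})
print(adjusted_prod, adjusted_mini)
```

Under these made-up numbers both variants shift weight from the consequentialist to the deontological theory, but by different amounts, which is the kind of difference that descriptive and prescriptive studies of adjustment functions would need to assess.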

#### ‘On the fly’ Alignment Through Dialogue

In Section [2.2](https://arxiv.org/html/2606.06972#S2.SS2) we suggested applying our work to scenarios in which an individual’s relative credences in distinct ethical theories are used to support AI decision making ‘on the fly’. Indeed, while the value alignment problem initially came to prominence in anticipation of more generally intelligent, in particular ‘superintelligent’, systems (Bostrom [2014](https://arxiv.org/html/2606.06972#bib.bib105)), it also applies more prosaically to narrow AI systems.

Consider a personal assistant large language model, PAL, that assembles holiday itineraries for Sally. Based on her basic requirements, PAL recommends itineraries in Malaga and Lanzarote. Sally then asks PAL to estimate the overall carbon footprints of these itineraries. On this basis, Sally adopts a consequentialist argument for preferring Malaga. However, PAL reminds Sally that in their previous interactions she deontologically preferred destinations with better value-for-money hotels, and that this constitutes a deontological argument for preferring Lanzarote. Sally responds by arguing that in this decision-making context sustainability is more important than cost, because she will be spending very little time in a hotel, given that the weather in both destinations will be gorgeous. In other words, in this context her moral credence for the consequentialist-based preference is greater than that for the deontology-based preference. As a result of this interaction, PAL’s understanding of Sally’s values is augmented and learnt for re-use in future interactions.

This scenario adheres to the spirit of the value alignment solutions advocated by Hadfield-Menell et al. ([2016](https://arxiv.org/html/2606.06972#bib.bib27)) and Russell ([2019](https://arxiv.org/html/2606.06972#bib.bib106)), namely that AI systems should perform tasks while simultaneously learning users’ value-based preferences as they evolve over time. However, these works assume that humans know their value-based preferences from the outset (and that these are learnt through passive observation and instruction from users), rather than being shaped by reasons and argument. In the Sally-PAL dialogical interaction, by contrast, PAL’s superior information processing is leveraged in support of the decision-making task, while PAL’s superior consequential reasoning is also exploited to help Sally establish her value-based preferences. Indeed, she is further supported by PAL’s reference to her prior decisions, which help establish, within this particular decision-making context, the differential moral credences assigned to the arguments associated with distinct consequentialist and deontological theories. We therefore aim to integrate our work on moral credences and the impact of contextual factors with ongoing proposals for argumentation-based dialogues designed to support value alignment (Bezou-Vrakatseli et al. [2024](https://arxiv.org/html/2606.06972#bib.bib26)). Moreover, the use of argument to support users in assigning theory-specific valuations of actions, and differential credences in these theories, would be especially useful when subjects are asked to judge simulations in the multi-step pipeline approaches that our current paper assumes (Noothigattu et al. [2019](https://arxiv.org/html/2606.06972#bib.bib127)). After all, the simulated scenarios typically present the kinds of ethical dilemmas that subjects are unlikely to have encountered in their everyday lives. As a result: 1) they are unlikely to feel confident in their ethical evaluations; 2) individual subjects may wish to assess individual actions under different ethical theories, attributing varying degrees of confidence (i.e. moral credences) to the relevance of each theory; and 3) they may seek arguments that offer prescriptive guidance, especially since eliciting their descriptive preferences, as is typically done for value alignment, is less feasible given the ethical novelty of the simulations.

#### Thick Ethics

Moral uncertainty research, and more generally value-alignment research, often makes strong, oversimplifying assumptions regarding ethical theories; in philosophical terms, ethical theories are represented ‘thinly’ (Väyrynen [2025](https://arxiv.org/html/2606.06972#bib.bib3); MacAskill et al. [2020](https://arxiv.org/html/2606.06972#bib.bib9)). That is, moral theories are seen merely as utility functions or preference orderings. Such representations ignore the ‘thick’ concepts and commitments to which these ethical theories subscribe: *how* and *why* these distinct theories support ethical evaluations. Ignoring the ‘how’ is problematic because many ethical theories have fundamentally different algorithmic properties, as discussed in this paper. Moreover, ignoring the ‘why’ is problematic because some ethical theories, such as deontological theories, cannot be properly represented by utility functions or preference orderings (Alexander and Moore [2021](https://arxiv.org/html/2606.06972#bib.bib48); MacAskill et al. [2020](https://arxiv.org/html/2606.06972#bib.bib9)). In short, the very need to consider contextual factors arises from the limitations of thin representations of moral theories. In this sense, our paper can be viewed as an attempt to ‘thicken’ ethical evaluations derived from simulations. An alternative, and potentially complementary, approach would be to account for inherently thicker rationales for ethical evaluations, such as those elicited by dialogue and argument. The Sally-PAL dialogue illustrates the extraction of richer ethical information than that captured through preferences or utility functions alone.

## References

- S\. Adams, T\. Cody, and P\.A\. Beling \(2022\)A survey of inverse reinforcement learning\.Artificial Intelligence Review55,pp\. 4307–4346\.Cited by:[§1](https://arxiv.org/html/2606.06972#S1.p2.1)\.
- L\. Alexander and M\. Moore \(2021\)Deontological Ethics\.InThe Stanford Encyclopedia of Philosophy,E\. N\. Zalta \(Ed\.\),Note:https://plato\.stanford\.edu/archives/win2021/entries/ethics\-deontological/Cited by:[§2\.1](https://arxiv.org/html/2606.06972#S2.SS1.p3.1),[§5](https://arxiv.org/html/2606.06972#S5.SS0.SSS0.Px6.p1.1)\.
- I\. Asimov \(1940\)I, robot\.Narkaling Productions\.Cited by:[§2\.2](https://arxiv.org/html/2606.06972#S2.SS2.SSS0.Px3.p3.1)\.
- E\. Awad, S\. Dsouza, R\. Kim, J\. Schulz, J\. Henrich, A\. Shariff, J\. Bonnefon, and I\. Rahwan \(2018\)The moral machine experiment\.Nature563\(7729\),pp\. 59–64\.Cited by:[§1](https://arxiv.org/html/2606.06972#S1.p1.1),[§1](https://arxiv.org/html/2606.06972#S1.p2.1),[§1](https://arxiv.org/html/2606.06972#S1.p4.3),[§2\.2](https://arxiv.org/html/2606.06972#S2.SS2.SSS0.Px4.p2.3)\.
- W\. A\. Bauer \(2020\)Virtuous vs\. utilitarian artificial moral agents\.AI Soc\.35\(1\),pp\. 263–271\.External Links:[Link](https://doi.org/10.1007/s00146-018-0871-3),[Document](https://dx.doi.org/10.1007/s00146-018-0871-3)Cited by:[§2\.1](https://arxiv.org/html/2606.06972#S2.SS1.p3.1),[§2\.2](https://arxiv.org/html/2606.06972#S2.SS2.SSS0.Px1.p1.1),[§2\.2](https://arxiv.org/html/2606.06972#S2.SS2.SSS0.Px3.p1.1)\.
- S\. D\. Baum \(2020\)Social choice ethics in artificial intelligence\.Ai & Society35\(1\),pp\. 165–176\.Cited by:[footnote 3](https://arxiv.org/html/2606.06972#footnote3)\.
- T\. Bench\-Capon and S\. Modgil \(2017\)Norms and value based reasoning: justifying compliance and violations\.Artificial Intelligence and Law25,pp\. 29–64\.Cited by:[§2\.2](https://arxiv.org/html/2606.06972#S2.SS2.SSS0.Px3.p2.1)\.
- T\. J\. Bench\-Capon \(2020\)Ethical approaches and autonomous systems\.Artificial Intelligence281,pp\. 103239\.Cited by:[§2\.2](https://arxiv.org/html/2606.06972#S2.SS2.SSS0.Px1.p1.1)\.
- T\. Bench\-Capon and S\. Modgil \(2019\)Norms and extended argumentation frameworks\.InProceedings of the Seventeenth International Conference on Artificial Intelligence and Law,pp\. 174–178\.Cited by:[§2\.2](https://arxiv.org/html/2606.06972#S2.SS2.SSS0.Px3.p2.1)\.
- E\. Bezou\-Vrakatseli, O\. Cocarascu, and S\. Modgil \(2024\)Towards dialogues for joint human\-ai reasoning and value alignment\.External Links:2405\.18073,[Link](https://arxiv.org/abs/2405.18073)Cited by:[§5](https://arxiv.org/html/2606.06972#S5.SS0.SSS0.Px5.p3.1)\.
- V\. Bhargava and T\. W\. Kim \(2017\)Autonomous vehicles and moral uncertainty\.Robot ethics2\.Cited by:[§1](https://arxiv.org/html/2606.06972#S1.p3.4)\.
- K\. Bogosian \(2017\)Implementation of moral uncertainty in intelligent machines\.Minds Mach\.27\(4\),pp\. 591–608\.External Links:[Link](https://doi.org/10.1007/s11023-017-9448-z),[Document](https://dx.doi.org/10.1007/s11023-017-9448-z)Cited by:[§1](https://arxiv.org/html/2606.06972#S1.p3.4),[footnote 9](https://arxiv.org/html/2606.06972#footnote9)\.
- N\. Bostrom \(2014\)Superintelligence: paths, dangers, strategies\.Oxford University Press, Oxford\.Cited by:[§5](https://arxiv.org/html/2606.06972#S5.SS0.SSS0.Px5.p1.1)\.
- L\. Buchak \(2013\)Risk and rationality\.Oxford University Press\.Cited by:[§5](https://arxiv.org/html/2606.06972#S5.SS0.SSS0.Px2.p1.1)\.
- V\. Conitzer, R\. Freedman, J\. Heitzig, W\. H\. Holliday, B\. M\. Jacobs, N\. Lambert, M\. Mossé, E\. Pacuit, S\. Russell, H\. Schoelkopf, E\. Tewolde, and W\. S\. Zwicker \(2024\)Position: social choice should guide AI alignment in dealing with diverse human feedback\.InForty\-first International Conference on Machine Learning, ICML 2024, Vienna, Austria, July 21\-27, 2024,External Links:[Link](https://openreview.net/forum?id=w1d9DOGymR)Cited by:[footnote 3](https://arxiv.org/html/2606.06972#footnote3)\.
- A\. Ecoffet and J\. Lehman \(2021\)Reinforcement learning under moral uncertainty\.InProceedings of the 38th International Conference on Machine Learning, ICML 2021, 18\-24 July 2021, Virtual Event,M\. Meila and T\. Zhang \(Eds\.\),Proceedings of Machine Learning Research, Vol\.139,pp\. 2926–2936\.External Links:[Link](http://proceedings.mlr.press/v139/ecoffet21a.html)Cited by:[§1](https://arxiv.org/html/2606.06972#S1.p3.4)\.
- P\. Foot \(1967\)The problem of abortion and the doctrine of double effect\.Vol\.5,Oxford\.Cited by:[§1](https://arxiv.org/html/2606.06972#S1.p4.3),[§2\.2](https://arxiv.org/html/2606.06972#S2.SS2.SSS0.Px2.p2.1)\.
- I\. Gabriel \(2020\)Artificial intelligence, values, and alignment\.Minds and machines30\(3\),pp\. 411–437\.Cited by:[§1](https://arxiv.org/html/2606.06972#S1.p1.1)\.
- J\. Greene \(2015\)Moral tribes: emotion, reason and the gap between us and them\.Atlantic Books\.Cited by:[§2\.2](https://arxiv.org/html/2606.06972#S2.SS2.SSS0.Px2.p2.1)\.
- D\. Hadfield–Menell, A\. Dragan, P\. Abbeel, and S\. Russell \(2016\)Cooperative inverse reinforcement learning\.InNIPS’16: Proc\. 30th Int\. Conference on Neural Information Processing Systems,pp\. 3916–3924\.Cited by:[§5](https://arxiv.org/html/2606.06972#S5.SS0.SSS0.Px5.p3.1)\.
- J\. Haidt \(2012\)The righteous mind: why good people are divided by politics and religion\.Vintage\.Cited by:[§1](https://arxiv.org/html/2606.06972#S1.p1.1)\.
- R\. M\. Hare \(1963\)Freedom and reason\.Vol\.92,Oxford Paperbacks\.Cited by:[§2\.2](https://arxiv.org/html/2606.06972#S2.SS2.SSS0.Px3.p2.1)\.
- D\. Heyd \(2019\)Supererogation\.InThe Stanford Encyclopedia of Philosophy,E\. N\. Zalta \(Ed\.\),Note:https://plato\.stanford\.edu/archives/win2019/entries/supererogation/Cited by:[§2\.2](https://arxiv.org/html/2606.06972#S2.SS2.SSS0.Px4.p1.1)\.
- K\. V\. Kortenkamp and C\. F\. Moore \(2014\)Ethics under uncertainty: the morality and appropriateness of utilitarianism when outcomes are uncertain\.The American journal of psychology127\(3\),pp\. 367–382\.Cited by:[footnote 6](https://arxiv.org/html/2606.06972#footnote6)\.
- C\. List \(2022\)Social Choice Theory\.InThe Stanford Encyclopedia of Philosophy,E\. N\. Zalta and U\. Nodelman \(Eds\.\),Note:https://plato\.stanford\.edu/archives/win2022/entries/social\-choice/Cited by:[§5](https://arxiv.org/html/2606.06972#S5.SS0.SSS0.Px1.p1.1)\.
- W\. MacAskill, K\. Bykvist, and T\. Ord \(2020\)Moral uncertainty\.Oxford University Press\.Cited by:[§1](https://arxiv.org/html/2606.06972#S1.p1.1),[§1](https://arxiv.org/html/2606.06972#S1.p3.4),[§2\.1](https://arxiv.org/html/2606.06972#S2.SS1.p1.21),[§2\.2](https://arxiv.org/html/2606.06972#S2.SS2.p1.1),[§5](https://arxiv.org/html/2606.06972#S5.SS0.SSS0.Px3.p1.1),[§5](https://arxiv.org/html/2606.06972#S5.SS0.SSS0.Px6.p1.1)\.
- W\. MacAskill \(2014\)Normative uncertainty\.Ph\.D\. Thesis,University of Oxford\.Cited by:[§3](https://arxiv.org/html/2606.06972#S3.p18.1)\.
- W\. MacAskill \(2016\)Normative uncertainty as a voting problem\.Mind125\(500\),pp\. 967–1004\.Cited by:[footnote 9](https://arxiv.org/html/2606.06972#footnote9)\.
- M\. J\. Maher \(2001\)Propositional defeasible logic has linear complexity\.Theory and Practice of Logic Programming1\(6\),pp\. 691–711\.External Links:[Document](https://dx.doi.org/10.1017/S1471068401001168)Cited by:[footnote 5](https://arxiv.org/html/2606.06972#footnote5)\.
- A\. Martinho, M\. Kroesen, and C\. Chorus \(2021\)Computer says i don’t know: an empirical approach to capture moral uncertainty in artificial intelligence\.Minds and Machines31\(2\),pp\. 215–237\.Cited by:[§1](https://arxiv.org/html/2606.06972#S1.p3.4)\.
- R\. Noothigattu, D\. Bouneffouf, N\. Mattei, R\. Chandra, P\. Madan, K\. R\. Varshney, M\. Campbell, M\. Singh, and F\. Rossi \(2019\)Teaching AI agents ethical values using reinforcement learning and policy orchestration\.IBM J\. Res\. Dev\.63\(4/5\),pp\. 2:1–2:9\.External Links:[Link](https://doi.org/10.1147/JRD.2019.2940428),[Document](https://dx.doi.org/10.1147/JRD.2019.2940428)Cited by:[§1](https://arxiv.org/html/2606.06972#S1.p2.1),[§5](https://arxiv.org/html/2606.06972#S5.SS0.SSS0.Px5.p3.1)\.
- T\. Qiu \(2024\)Representative social choice: from learning theory to AI alignment\.CoRRabs/2410\.23953\.External Links:[Link](https://doi.org/10.48550/arXiv.2410.23953),[Document](https://dx.doi.org/10.48550/ARXIV.2410.23953),2410\.23953Cited by:[footnote 3](https://arxiv.org/html/2606.06972#footnote3)\.
- S\. Russell \(2019\)Human compatible: ai and the problem of control\.Penguin Uk\.Cited by:[§5](https://arxiv.org/html/2606.06972#S5.SS0.SSS0.Px5.p3.1)\.
- P\. Schlag \(1999\)No vehicles in the park\.Seattle UL Rev\.23,pp\. 381\.Cited by:[§2\.2](https://arxiv.org/html/2606.06972#S2.SS2.SSS0.Px3.p2.1)\.
- M\. Serramia, M\. López\-Sánchez, J\. A\. Rodríguez\-Aguilar, J\. Morales, M\. J\. Wooldridge, and C\. Ansótegui \(2018\)Exploiting moral values to choose the right norms\.InProceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, AIES 2018, New Orleans, LA, USA, February 02\-03, 2018,J\. Furman, G\. E\. Marchant, H\. Price, and F\. Rossi \(Eds\.\),pp\. 264–270\.External Links:[Link](https://doi.org/10.1145/3278721.3278735),[Document](https://dx.doi.org/10.1145/3278721.3278735)Cited by:[§2\.2](https://arxiv.org/html/2606.06972#S2.SS2.SSS0.Px3.p1.1)\.
- W\. Sinnott\-Armstrong \(2022\)Consequentialism\.InThe Stanford Encyclopedia of Philosophy,E\. N\. Zalta and U\. Nodelman \(Eds\.\),Note:https://plato\.stanford\.edu/archives/win2022/entries/consequentialism/Cited by:[§2\.1](https://arxiv.org/html/2606.06972#S2.SS1.p3.1),[§2\.2](https://arxiv.org/html/2606.06972#S2.SS2.SSS0.Px3.p1.1)\.
- B\. Skyrms \(2000\)Choice and chance\.Belmont, CA: Wadsworth/Thompson\.Cited by:[footnote 2](https://arxiv.org/html/2606.06972#footnote2)\.
- T\. Sorensen, J\. Moore, J\. Fisher, M\. L\. Gordon, N\. Mireshghallah, C\. M\. Rytting, A\. Ye, L\. Jiang, X\. Lu, N\. Dziri, T\. Althoff, and Y\. Choi \(2024\)Position: A roadmap to pluralistic alignment\.InForty\-first International Conference on Machine Learning, ICML 2024, Vienna, Austria, July 21\-27, 2024,External Links:[Link](https://openreview.net/forum?id=gQpBnRHwxM)Cited by:[§1](https://arxiv.org/html/2606.06972#S1.p1.1)\.
- J\. Stenseke \(2024\)On the computational complexity of ethics: moral tractability for minds and machines\.Artificial Intelligence Review57\(4\),pp\. 105\.Cited by:[§2\.2](https://arxiv.org/html/2606.06972#S2.SS2.SSS0.Px1.p1.1)\.
- J\. Szabo, N\. Criado, J\. Such, and S\. Modgil \(2024\)Moral uncertainty and the problem of fanaticism\.InProceedings of the AAAI Conference on Artificial Intelligence,Vol\.38,pp\. 19948–19955\.Cited by:[§2\.1](https://arxiv.org/html/2606.06972#S2.SS1.p1.21),[§2\.1](https://arxiv.org/html/2606.06972#S2.SS1.p2.2),[§5](https://arxiv.org/html/2606.06972#S5.SS0.SSS0.Px3.p1.1)\.
- P\. Väyrynen \(2025\)Thick Ethical Concepts\.InThe Stanford Encyclopedia of Philosophy,E\. N\. Zalta and U\. Nodelman \(Eds\.\),Note:https://plato\.stanford\.edu/archives/spr2025/entries/thick\-ethical\-concepts/Cited by:[§5](https://arxiv.org/html/2606.06972#S5.SS0.SSS0.Px6.p1.1)\.
- C\. H\. Wagner \(1982\)Simpson’s paradox in real life\.The American Statistician36\(1\),pp\. 46–48\.Cited by:[§4\.2](https://arxiv.org/html/2606.06972#S4.SS2.p1.1)\.