Detecting and Mitigating Bias by Treating Fairness as a Symmetry Operation

arXiv cs.AI 06/08/26, 04:00 AM Papers
bias-mitigation fairness machine-learning symmetry regularization classification ai-ethics
Summary
The paper proposes treating fairness as a symmetry operation in machine learning classifiers, implementing loss-based regularization to enforce invariance under swapping of sensitive attributes while holding merit features fixed. The framework achieves over 90% bias reduction with minimal accuracy loss and requires no causal graph knowledge.
arXiv:2606.06514v1 Announce Type: new Abstract: Machine learning systems deployed in high-stakes socioeconomic settings routinely display bias. We formalize bias as a symmetry breaking operation: a classifier is fair if its outputs remain invariant under the counterfactual operation of switching a sensitive attribute, with merit features held fixed. We implement loss-based regularization as a symmetry restoring mechanism and evaluate the framework on four synthetic datasets with varying levels of noise, correlation, and bias. The framework achieves upwards of 90% violation reduction, with accuracy costs around 5%. This framework does not require causal graph knowledge, is computationally lightweight, and generalizes to any sensitive attribute definable as a bit-flip, making it suitable for contexts where local sources of discrimination remain absent from mainstream benchmarks.
# Detecting and Mitigating Bias by Treating Fairness as a Symmetry Operation
Source: [https://arxiv.org/html/2606.06514](https://arxiv.org/html/2606.06514)
###### Abstract

Machine learning systems deployed in high-stakes socioeconomic settings routinely display bias. We formalize bias as a symmetry breaking operation: a classifier is fair if its outputs remain invariant under the counterfactual operation of switching a sensitive attribute, with merit features held fixed. We implement loss-based regularization as a symmetry restoring mechanism and evaluate the framework on four synthetic datasets with varying levels of noise, correlation, and bias. The framework achieves upwards of 90% violation reduction, with accuracy costs around 5%. This framework does not require causal graph knowledge, is computationally lightweight, and generalizes to any sensitive attribute definable as a bit-flip, making it suitable for contexts where local sources of discrimination remain absent from mainstream benchmarks.

Machine Learning, ICML

## 1 Introduction

Machine learning is increasingly used in high-stakes decisions concerning real social groups. In these contexts, it is vital that the models and algorithms used are fair and just. However, such systems often exhibit systematic biases against socially sensitive groups, proving to be a significant source of harm (Hardt et al., [2016](https://arxiv.org/html/2606.06514#bib.bib2); Chouldechova, [2017](https://arxiv.org/html/2606.06514#bib.bib1)). Previous research has proposed statistical definitions of fairness via criteria like demographic parity, equalized odds, and calibration, applied as post-hoc constraints on a trained model. Chouldechova ([2017](https://arxiv.org/html/2606.06514#bib.bib1)) and Kleinberg et al. ([2016](https://arxiv.org/html/2606.06514#bib.bib3)) prove that calibration and equalized odds cannot be satisfied simultaneously when base rates differ across groups, opening the landscape for a fundamentally different approach: rather than imposing post-hoc constraints, we ask what structural invariance an unbiased model should satisfy by construction.

In this work, we treat fairness as a symmetry operation, and bias as a symmetry breaking operation. Borrowing the language of group theory and physics, a system has a symmetry if it is invariant under a group action. Formally, a classifier $f$ is invariant under action $T$ if $f(x) = f(Tx)$ for all $x \in \mathcal{X}$. In our setting, $T$ is the counterfactual operator that flips all sensitive attributes (attributes like caste, gender, race etc.) while holding merit features (attributes like education, experience, age etc.) fixed: $T(\mathbf{x}) = [\mathbf{x}_m;\ \mathbf{1} - \mathbf{x}_s]$, where $\mathbf{x}_m$ are merit features and $\mathbf{x}_s$ are sensitive attributes. This line of work closely follows the paradigm of symmetry-aware ML in texts like Cohen and Welling ([2016](https://arxiv.org/html/2606.06514#bib.bib4)). This invariance condition is an observational flip, and thus requires no causal graph knowledge, making it simpler and more tractable than counterfactual fairness (Kusner et al., [2017](https://arxiv.org/html/2606.06514#bib.bib5)), at the cost of not accounting for causal mediation through correlated features.

## 2 Problem Formulation

### 2.1 Setup

We define a probabilistic classifier $f: X \to [0,1]$. Under the model, $f(x) = P(y = 1 \mid x)$, where $y \in \{0,1\}$ is a binary label and $x \in X$. The input $x$ is partitioned as $x = [x_m; x_s]$, where $x_m$ are merit features and $x_s$ are sensitive attributes. We define the transformation $T: X \to X$ as $T(x) = [x_m; 1 - x_s]$.
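The transformation $T$ can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the helper name `flip_sensitive`, the column layout (four merit columns followed by two binary sensitive columns), and the sample values are all assumptions made for the example.

```python
import numpy as np

def flip_sensitive(X, sensitive_idx):
    """Apply T(x) = [x_m; 1 - x_s]: flip binary sensitive columns, keep merit columns."""
    X_flipped = np.array(X, dtype=float, copy=True)
    X_flipped[:, sensitive_idx] = 1.0 - X_flipped[:, sensitive_idx]
    return X_flipped

# Hypothetical layout: columns 0-3 are merit features, columns 4-5 are sensitive bits.
X = np.array([[35.0, 10.0, 2.0, 70.0, 1.0, 0.0],
              [28.0,  3.0, 1.0, 55.0, 0.0, 0.0]])
SENSITIVE_IDX = [4, 5]

X_T = flip_sensitive(X, SENSITIVE_IDX)
X_TT = flip_sensitive(X_T, SENSITIVE_IDX)  # applying T twice returns the original input
```

Because the sensitive attributes are binary, $T$ is its own inverse, which is why applying it twice recovers $x$.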

### 2.2 Bias as Symmetry Breaking

A classifier is symmetric, or $T$-invariant, if $f(x) = f(Tx)$ for all $x \in X$. Similar to Dwork et al. ([2012](https://arxiv.org/html/2606.06514#bib.bib6)), we define pointwise violation as $v(x) = |f(Tx) - f(x)|$, and population violation as $V = E_{x \sim P_X}[|f(x) - f(T(x))|]$, where $P_X$ is the data distribution. Practically, population violation is approximated by the empirical mean $\hat{V} = \frac{1}{n}\sum_{i=1}^{n}|f(Tx_i) - f(x_i)|$.
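The empirical violation $\hat{V}$ can be computed directly from any function that returns predicted probabilities. The sketch below assumes a toy setting that is not from the paper: two Gaussian merit features, one sensitive bit, and two hand-picked logistic weight vectors (`w_biased`, `w_fair`) chosen only to show the two extremes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def flip_sensitive(X, sensitive_idx):
    """T(x): flip binary sensitive columns, keep all other columns fixed."""
    X_flipped = np.array(X, dtype=float, copy=True)
    X_flipped[:, sensitive_idx] = 1.0 - X_flipped[:, sensitive_idx]
    return X_flipped

def empirical_violation(predict_proba, X, sensitive_idx):
    """V_hat = (1/n) * sum_i |f(T x_i) - f(x_i)|."""
    f_x = predict_proba(X)
    f_Tx = predict_proba(flip_sensitive(X, sensitive_idx))
    return float(np.mean(np.abs(f_Tx - f_x)))

# Hypothetical data: two merit features and one sensitive bit (last column).
rng = np.random.default_rng(0)
X = np.column_stack([rng.normal(size=(200, 2)), rng.integers(0, 2, size=200)])

# Hypothetical classifiers: one uses the sensitive bit, one ignores it.
w_biased = np.array([0.5, -0.2, 1.0])
w_fair = np.array([0.5, -0.2, 0.0])

v_biased = empirical_violation(lambda A: sigmoid(A @ w_biased), X, [2])
v_fair = empirical_violation(lambda A: sigmoid(A @ w_fair), X, [2])
```

A classifier whose weight on the sensitive bit is zero is $T$-invariant, so its violation vanishes, while any nonzero weight on that bit yields a strictly positive violation.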

Table 1: Dataset taxonomy, where $\gamma = [\gamma_{gender}, \gamma_{race}]$ and $d = 6$ (2 sensitive attributes + 4 merit features).

### 2.3 Dataset Generation

The label $y$ is drawn from

$$y \mid x \sim \text{Bernoulli}\!\left(\sigma\!\left(\beta_0 + \boldsymbol{\beta}^{\top} x_m + \boldsymbol{\gamma}^{\top} x_s\right)\right)$$

where $\sigma(z) = (1 + e^{-z})^{-1}$ is the logistic sigmoid, $x_m = [\texttt{age},\ \texttt{years\_exp},\ \texttt{education},\ \texttt{skill\_score}]^{\top}$ are the merit features, $x_s = [\texttt{gender},\ \texttt{race}]^{\top}$ are the sensitive attributes, $\boldsymbol{\beta} = [0.0,\ 0.04,\ 0.70,\ 0.035]^{\top}$ are the merit coefficients, $\beta_0 = -3.0$ is the intercept, and $\boldsymbol{\gamma}$ is the injected bias vector whose values vary by dataset as described in Table 1.

The baseline employment probability (average merit, no bias) is about 4.7%, reflecting competitive hiring. Education is given the highest influence on merit, followed by experience and skill. The symmetry breaking term is the bias injection, which is $\boldsymbol{\gamma} = [0.5, 0.375]$ for low bias datasets and $\boldsymbol{\gamma} = [1.8, 1.35]$ for high bias datasets. If $\boldsymbol{\gamma} = 0$, $V = 0$ by construction.
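The label model can be sketched as follows, using the coefficients quoted above. The function names are hypothetical. The quoted 4.7% baseline matches $\sigma(\beta_0) = \sigma(-3.0)$, which suggests that the merit term is zero at average merit; treating the merit features as centered is an assumption of this sketch, not something the text states.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

BETA0 = -3.0
BETA = np.array([0.0, 0.04, 0.70, 0.035])  # age, years_exp, education, skill_score
GAMMA_LOW = np.array([0.5, 0.375])         # [gender, race], low bias datasets
GAMMA_HIGH = np.array([1.8, 1.35])         # [gender, race], high bias datasets

def hire_probability(x_m, x_s, gamma):
    """P(y = 1 | x) = sigmoid(beta0 + beta^T x_m + gamma^T x_s), one value per row."""
    return sigmoid(BETA0 + x_m @ BETA + x_s @ gamma)

def sample_labels(x_m, x_s, gamma, rng):
    """Draw y ~ Bernoulli(P(y = 1 | x)) for each row."""
    return rng.binomial(1, hire_probability(x_m, x_s, gamma))

# Assumption: centered merit features, so "average merit" is the zero vector.
avg_merit = np.zeros((1, 4))
p_base = hire_probability(avg_merit, np.zeros((1, 2)), GAMMA_HIGH)[0]  # both bits 0
p_flip = hire_probability(avg_merit, np.ones((1, 2)), GAMMA_HIGH)[0]   # both bits 1

rng = np.random.default_rng(0)
labels = sample_labels(np.zeros((1000, 4)), np.ones((1000, 2)), GAMMA_HIGH, rng)
```

Under the high bias setting, flipping both sensitive bits at average merit moves the logit from $-3.0$ to $0.15$, so the injected bias alone moves the hiring probability from roughly 4.7% to a little over one half.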

In order to stress the model, we also correlate the merit features with the sensitive attributes by generating the former as a function of the latter. The correlation injection is $\text{edu}_i \leftarrow \text{clip}(\text{edu}_i + 0.6 \cdot \text{gender}_i + \epsilon_i,\ 0,\ 3)$ with $\epsilon_i \sim N(0, 0.09)$, and $\text{skill}_i \leftarrow \text{clip}(\text{skill}_i + 8 \cdot \text{gender}_i + \delta_i,\ 0,\ 100)$ with $\delta_i \sim N(0, 9)$. In a high bias dataset we also introduce noise: 6 additional features that carry no meaningful signal about $y$ and are made to have a small spurious correlation with the sensitive attributes. The noise is generated by $n_k^{(i)} = \epsilon_k^{(i)} + \delta_k \cdot s^{(i)}$, where $n_k^{(i)}$ is the $k$th noise feature for sample $i$, $\epsilon_k^{(i)} \sim \mathcal{N}(0,1)$, $s^{(i)} \in \{0,1\}$ is the sensitive attribute, and $\delta_k$ is the spurious correlation coefficient.
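A rough sketch of the correlation injection and the noise features follows. Several choices here are assumptions: the second argument of $N(\cdot,\cdot)$ is read as a variance (standard deviations 0.3 and 3), and the starting distributions of `edu` and `skill` and the value $\delta_k = 0.2$ are hypothetical, since the text does not give them.

```python
import numpy as np

def inject_correlation(edu, skill, gender, rng):
    """Shift education and skill by gender, add Gaussian jitter, clip to valid ranges.

    Assumption: N(0, 0.09) and N(0, 9) specify variances, i.e. std 0.3 and 3.0.
    """
    edu_new = np.clip(edu + 0.6 * gender + rng.normal(0.0, 0.3, size=edu.shape), 0.0, 3.0)
    skill_new = np.clip(skill + 8.0 * gender + rng.normal(0.0, 3.0, size=skill.shape), 0.0, 100.0)
    return edu_new, skill_new

def make_noise_features(s, delta, rng):
    """n_k = eps_k + delta_k * s with eps_k ~ N(0, 1); returns shape (n, len(delta))."""
    delta = np.asarray(delta, dtype=float)
    eps = rng.normal(0.0, 1.0, size=(s.shape[0], delta.shape[0]))
    return eps + np.outer(s, delta)

rng = np.random.default_rng(0)
n = 2000
gender = rng.integers(0, 2, size=n).astype(float)
edu = rng.integers(0, 4, size=n).astype(float)  # hypothetical education levels 0..3
skill = rng.uniform(20.0, 80.0, size=n)         # hypothetical skill scores

edu_c, skill_c = inject_correlation(edu, skill, gender, rng)
noise = make_noise_features(gender, delta=[0.2] * 6, rng=rng)  # delta_k is hypothetical
```

After injection, the merit columns themselves carry information about `gender`, which is the setting in which a pure bit-flip on $x_s$ cannot account for bias mediated through correlated features.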

### 2.4 Loss Based Regularization

The full objective is:

$$\mathcal{L}(\mathbf{w}, b) = \mathcal{L}_{\text{task}}(\mathbf{w}, b) + \lambda \mathcal{L}_{\text{sym}}(\mathbf{w}, b)$$

where $\mathcal{L}_{\text{task}}$ is the standard task loss, given by binary cross entropy:

$$\mathcal{L}_{\text{task}} = -\frac{1}{n}\sum_{i=1}^{n}\left[y_i \log f(x_i) + (1 - y_i)\log\bigl(1 - f(x_i)\bigr)\right]$$

and $f(x) = \sigma(\mathbf{w}^{\top}\tilde{x} + b)$, with $\sigma(z) = 1/(1 + e^{-z})$. We define the symmetric loss $\mathcal{L}_{\text{sym}}$ as:

$$\mathcal{L}_{\text{sym}} = \frac{1}{n}\sum_{i=1}^{n}\bigl[f(x_i) - f(Tx_i)\bigr]^2$$

Now, we derive the gradient for the new loss function $\mathcal{L}$ by defining the prediction gap for sample $i$ as $\Delta_i = f(\mathbf{x}_i) - f(T(\mathbf{x}_i))$. Applying the chain rule to $\mathcal{L}_{\text{sym}} = \frac{1}{n}\sum_{i=1}^{n}\Delta_i^2$:

$$\frac{\partial \mathcal{L}_{\text{sym}}}{\partial \mathbf{w}} = \frac{2}{n}\sum_{i=1}^{n}\Delta_i \cdot \frac{\partial \Delta_i}{\partial \mathbf{w}}$$

Since $f(\mathbf{x}) = \sigma(\mathbf{w}^{\top}\tilde{\mathbf{x}} + b)$ and $\sigma'(z) = \sigma(z)(1 - \sigma(z))$:

$$\frac{\partial \Delta_i}{\partial \mathbf{w}} = f(\mathbf{x}_i)\bigl(1 - f(\mathbf{x}_i)\bigr)\tilde{\mathbf{x}}_i - f(T(\mathbf{x}_i))\bigl(1 - f(T(\mathbf{x}_i))\bigr)\widetilde{T(\mathbf{x}_i)}$$

Substituting:

$$\frac{\partial \mathcal{L}_{\text{sym}}}{\partial \mathbf{w}} = \frac{2}{n}\sum_{i=1}^{n}\Delta_i\Bigl[f(\mathbf{x}_i)\bigl(1 - f(\mathbf{x}_i)\bigr)\tilde{\mathbf{x}}_i - f(T(\mathbf{x}_i))\bigl(1 - f(T(\mathbf{x}_i))\bigr)\widetilde{T(\mathbf{x}_i)}\Bigr]$$

The update rule for learning rate $\eta$ with the new full objective gradient is

$$\mathbf{w} \leftarrow \mathbf{w} - \eta\left(\frac{\partial \mathcal{L}_{\text{task}}}{\partial \mathbf{w}} + \lambda\frac{\partial \mathcal{L}_{\text{sym}}}{\partial \mathbf{w}}\right)$$
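One way to sanity-check the closed-form gradient of $\mathcal{L}_{\text{sym}}$ is to compare it against central finite differences. The sketch below does this on random toy data. It assumes $\tilde{\mathbf{x}}$ is simply the raw feature vector (the text does not define the tilde), and the function name, data, and weights are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sym_loss_and_grad(w, b, X, X_T):
    """Return L_sym = mean(Delta^2) and its gradient with respect to w.

    Delta_i = f(x_i) - f(T x_i), with f(x) = sigmoid(w^T x + b).
    """
    f = sigmoid(X @ w + b)
    f_T = sigmoid(X_T @ w + b)
    delta = f - f_T
    # dDelta_i/dw = f(1 - f) x_i - f_T(1 - f_T) T(x_i), one row per sample
    d_delta = (f * (1 - f))[:, None] * X - (f_T * (1 - f_T))[:, None] * X_T
    grad_w = 2.0 * np.mean(delta[:, None] * d_delta, axis=0)
    return float(np.mean(delta ** 2)), grad_w

# Hypothetical toy data: three merit features and one sensitive bit (last column).
rng = np.random.default_rng(0)
X = np.column_stack([rng.normal(size=(50, 3)), rng.integers(0, 2, size=50)])
X_T = X.copy()
X_T[:, 3] = 1.0 - X_T[:, 3]
w, b = rng.normal(size=4), 0.1

loss, grad = sym_loss_and_grad(w, b, X, X_T)

# Central finite differences, one coordinate of w at a time.
eps = 1e-6
num_grad = np.zeros_like(w)
for j in range(w.size):
    step = np.zeros_like(w)
    step[j] = eps
    loss_plus = sym_loss_and_grad(w + step, b, X, X_T)[0]
    loss_minus = sym_loss_and_grad(w - step, b, X, X_T)[0]
    num_grad[j] = (loss_plus - loss_minus) / (2 * eps)

max_err = float(np.max(np.abs(grad - num_grad)))
```

Agreement between `grad` and `num_grad` up to numerical precision indicates that the vectorized expression matches the derivation term by term.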

## 3 Experiment

We evaluate the loss regularization method across four synthetic datasets of increasing structural complexity. All experiments use $n = 2000$ samples and a 75/25 train/test split, stratified by label where the positive rate exceeds 5%. The violation metric $V$ is computed on both train and test sets using soft predicted probabilities. For the loss regularization method we sweep $\lambda \in \{0.0, 0.5, 1.0, 2.0, 5.0, 10.0\}$, where $\lambda = 0$ recovers the unregularized baseline. For dataset $D_2$ (low bias and correlated), we find the predictive accuracies of the baseline and regularized models to be comparable, as shown in Figure 1. On the violation metric, the regularized model shows a 93.2% decrease in violation on the test set for $D_2$, as visualized in Figures 2 and 3. Accuracies and violations are plotted against the $\lambda$ sweep for all datasets in Figure 4. Comparisons of violation between the baseline model and the regularized model for all datasets can be found in the Appendix.
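The $\lambda$ sweep can be sketched with plain full-batch gradient descent on $\mathcal{L}_{\text{task}} + \lambda\mathcal{L}_{\text{sym}}$. This is an illustrative setup rather than the authors' experiment: the toy data (two Gaussian merit features, one sensitive bit, hand-picked true coefficients), the learning rate, the step count, and the bias update (obtained by the same chain rule as for $\mathbf{w}$) are assumptions. For brevity, metrics are computed on the training data with no train/test split.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, X_T, lam, lr=0.1, steps=1500):
    """Full-batch gradient descent on L_task + lam * L_sym for logistic regression."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        f, f_T = sigmoid(X @ w + b), sigmoid(X_T @ w + b)
        delta = f - f_T
        s, s_T = f * (1 - f), f_T * (1 - f_T)  # sigmoid derivatives at x and T(x)
        # Binary cross-entropy gradients
        g_w = X.T @ (f - y) / n
        g_b = np.mean(f - y)
        # Symmetry-loss gradients: chain rule on mean(delta^2)
        g_w = g_w + lam * 2.0 * ((delta * s) @ X - (delta * s_T) @ X_T) / n
        g_b = g_b + lam * 2.0 * np.mean(delta * (s - s_T))
        w = w - lr * g_w
        b = b - lr * g_b
    return w, b

def evaluate(w, b, X, y, X_T):
    """Return (accuracy at threshold 0.5, empirical violation on soft probabilities)."""
    f, f_T = sigmoid(X @ w + b), sigmoid(X_T @ w + b)
    return float(np.mean((f >= 0.5) == y)), float(np.mean(np.abs(f_T - f)))

# Hypothetical toy data with a strong injected bias on the sensitive bit.
rng = np.random.default_rng(0)
n = 2000
merit = rng.normal(size=(n, 2))
sens = rng.integers(0, 2, size=n).astype(float)
X = np.column_stack([merit, sens])
X_T = X.copy()
X_T[:, 2] = 1.0 - X_T[:, 2]
true_logit = -1.0 + merit @ np.array([1.0, 0.5]) + 1.8 * sens
y = rng.binomial(1, sigmoid(true_logit)).astype(float)

results = {}
for lam in [0.0, 0.5, 1.0, 2.0, 5.0, 10.0]:
    w, b = train(X, y, X_T, lam)
    results[lam] = evaluate(w, b, X, y, X_T)
    print(f"lambda={lam:>4}: accuracy={results[lam][0]:.3f}, violation={results[lam][1]:.3f}")
```

Setting `lam=0.0` recovers ordinary logistic regression, so the first row of the printout plays the role of the unregularized baseline against which the other rows are compared.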

![Refer to caption](https://arxiv.org/html/2606.06514v1/Frame-1.png)

Figure 1: Comparison of accuracies of the baseline model vs. the regularized model on the dataset $D_2$.

![Refer to caption](https://arxiv.org/html/2606.06514v1/Frame-2.png)

Figure 2: Comparison of the violation by the baseline model vs. the regularized model.

![Refer to caption](https://arxiv.org/html/2606.06514v1/Frame-3.png)

Figure 3: Scatter plots of the outputs of the baseline model vs. the regularized model.

![Refer to caption](https://arxiv.org/html/2606.06514v1/Frame-4.png)

Figure 4: $\lambda$ vs. violation and $\lambda$ vs. accuracy for all datasets $D_1$, $D_2$, $D_3$ and $D_4$.

## Related Work

Algorithmic bias has been studied extensively through the lens of statistical criteria applied as post-hoc constraints. Hardt et al. ([2016](https://arxiv.org/html/2606.06514#bib.bib2)) introduced equalized odds, requiring equal true and false positive rates across demographics. Chouldechova ([2017](https://arxiv.org/html/2606.06514#bib.bib1)) and Kleinberg et al. ([2016](https://arxiv.org/html/2606.06514#bib.bib3)) independently prove that calibration and equalized odds cannot simultaneously hold when base rates differ across groups, motivating our approach of studying the structural invariance of an unbiased model.

In-processing methods, like the models proposed by Kamishima et al. ([2012](https://arxiv.org/html/2606.06514#bib.bib7)), add a prejudice regularizer (the mutual information between prediction and sensitive attributes) to logistic regression during training. Zhang et al. ([2018](https://arxiv.org/html/2606.06514#bib.bib8)) take an adversarial approach by training an adversary to retrieve the sensitive attributes from the predictions of the model.

Our work is most closely related to counterfactual fairness (Kusner et al., [2017](https://arxiv.org/html/2606.06514#bib.bib5)), which defines a predictor as unbiased if the output is invariant to interventions on sensitive attributes in a structural causal model. Kilbertus et al. ([2017](https://arxiv.org/html/2606.06514#bib.bib9)) extend this line of work by blocking discriminatory pathways identified via causal path analysis. Our work is a deliberate simplification intended to remain usable in contexts without a causal graph. The tradeoff is that it does not account for causal mediation through correlated features, a limitation that is stress-tested in dataset $D_2$.

## Discussion

Automated decision-making systems are being deployed rapidly all over the world to aid in critical processes like hiring, welfare allocation, and financial analysis (Okolo, [2020](https://arxiv.org/html/2606.06514#bib.bib10)). These deployments often come with less regulatory oversight for the demographics that are underrepresented in the datasets used to develop them (Joseph, [2025](https://arxiv.org/html/2606.06514#bib.bib11)), motivating research in fairness and bias mitigation methods.

In this work, we have utilized synthetic data with built-in structural bias as a low-resource tool. Popular fairness benchmarks (COMPAS, Adult Income, etc.) are predominantly Western in origin and do not effectively encode many of the prejudices and biases found in countries of the Global South (Sambasivan et al., [2021](https://arxiv.org/html/2606.06514#bib.bib12)). The symmetry violation and regularizer were developed, validated and researched on datasets with a known data generation process, providing a framework for fairness research in data-scarce environments.

The data generation process is also general by design. The framework does not require a redesign to include different protected groups, since the biases encoded in $x_s$ can be redefined and the coefficients readjusted to reflect the diverse socioeconomic landscapes of different regions. This allows it to model different industries, such as healthcare, hiring, and lending, each with different sensitive attributes. For example, age might be a sensitive factor in healthcare and lending but not in hiring; similarly, education might be a sensitive factor in hiring and lending but not in healthcare.

This bias mitigation paradigm is also computationally lightweight. Since the design requires neither causal graph knowledge nor the training of an adversary to identify and mitigate bias, it offers a significant compute discount, at the cost of ignoring causal structure.

## References

- A. Chouldechova (2017) Fair Prediction with Disparate Impact: A Study of Bias in Recidivism Prediction Instruments. Big Data 5(2), pp. 153–163. External Links: [Document](https://dx.doi.org/10.1089/big.2016.0047).
- T. S. Cohen and M. Welling (2016) Group equivariant convolutional networks. In Proceedings of the 33rd International Conference on International Conference on Machine Learning - Volume 48, ICML'16, pp. 2990–2999.
- C. Dwork, M. Hardt, T. Pitassi, O. Reingold, and R. Zemel (2012) Fairness through awareness. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, ITCS '12, New York, NY, USA, pp. 214–226. External Links: ISBN 9781450311151, [Link](https://doi.org/10.1145/2090236.2090255), [Document](https://dx.doi.org/10.1145/2090236.2090255).
- M. Hardt, E. Price, and N. Srebro (2016) Equality of opportunity in supervised learning. External Links: 1610.02413, [Link](https://arxiv.org/abs/1610.02413).
- J. Joseph (2025) Algorithmic bias in public health AI: a silent threat to equity in low-resource settings. Frontiers in Public Health 13, pp. 1643180. External Links: [Document](https://dx.doi.org/10.3389/fpubh.2025.1643180).
- T. Kamishima, S. Akaho, H. Asoh, and J. Sakuma (2012) Fairness-aware classifier with prejudice remover regularizer. In Machine Learning and Knowledge Discovery in Databases, P. A. Flach, T. De Bie, and N. Cristianini (Eds.), Berlin, Heidelberg, pp. 35–50. External Links: ISBN 978-3-642-33486-3.
- N. Kilbertus, M. Rojas-Carulla, G. Parascandolo, M. Hardt, D. Janzing, and B. Schölkopf (2017) Avoiding discrimination through causal reasoning. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS'17, Red Hook, NY, USA, pp. 656–666. External Links: ISBN 9781510860964.
- J. M. Kleinberg, S. Mullainathan, and M. Raghavan (2016) Inherent trade-offs in the fair determination of risk scores. In Information Technology Convergence and Services. External Links: [Link](https://api.semanticscholar.org/CorpusID:12845273).
- M. Kusner, J. Loftus, C. Russell, and R. Silva (2017) Counterfactual fairness. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS'17, Red Hook, NY, USA, pp. 4069–4079. External Links: ISBN 9781510860964.
- C. T. Okolo (2020) AI in the "real world": examining the impact of AI deployment in low-resource contexts. External Links: 2012.01165, [Link](https://arxiv.org/abs/2012.01165).
- N. Sambasivan, E. Arnesen, B. Hutchinson, T. Doshi, and V. Prabhakaran (2021) Re-imagining algorithmic fairness in India and beyond. External Links: 2101.09995, [Link](https://arxiv.org/abs/2101.09995).
- B. H. Zhang, B. Lemoine, and M. Mitchell (2018) Mitigating unwanted biases with adversarial learning. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, AIES '18, New York, NY, USA, pp. 335–340. External Links: ISBN 9781450360128, [Link](https://doi.org/10.1145/3278721.3278779), [Document](https://dx.doi.org/10.1145/3278721.3278779).

## Appendix A

### A.1 Bar Charts for Violation Comparison Between the Baseline Model and Loss Regularized Model

![Refer to caption](https://arxiv.org/html/2606.06514v1/BARD1D2D3D4.png)

Figure 5: Loss regularized models vs. baseline model.

### A.2 Scatter Plots for Violation Comparison Bet

![Refer to caption](https://arxiv.org/html/2606.06514v1/SYMD1D2.png)

Figure 6: Loss regularized models ($D_1$, $D_2$) vs. baseline model.

![Refer to caption](https://arxiv.org/html/2606.06514v1/SYMD3D4.png)

Figure 7: Loss regularized models ($D_3$, $D_4$) vs. baseline model.