Loss Functions
- class fairret.loss.base.FairnessLoss
Bases: abc.ABC, torch.nn.modules.module.Module
Abstract base class for fairness losses, also referred to as fairrets.
- abstract forward(pred, sens, *stat_args, pred_as_logit=True, **stat_kwargs)
Abstract method that should be implemented by subclasses to calculate the loss.
- Parameters:
pred (torch.Tensor) – Predictions of shape \((N, 1)\), since binary classification or regression is assumed.
sens (torch.Tensor) – Sensitive features of shape \((N, S)\) with S the number of sensitive features.
*stat_args (Any) – All arguments used by the statistic that this loss minimizes.
pred_as_logit (bool) – Whether the pred tensor should be interpreted as logits. Though most losses will simply take the sigmoid of pred if pred_as_logit is True, some losses benefit from improved numerical stability if they handle the conversion themselves.
**stat_kwargs (Any) – All keyword arguments used by the statistic that this loss computes.
- Returns:
The calculated loss as a scalar tensor.
- Return type:
torch.Tensor
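The numerical-stability remark for pred_as_logit can be illustrated without PyTorch. The sketch below is plain Python and is not taken from the library; the helper names sigmoid and log_sigmoid are hypothetical. It shows how converting an extreme logit to a probability first discards information that a logit-aware computation keeps.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, arranged so that exp() never overflows."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def log_sigmoid(x: float) -> float:
    """log(sigmoid(x)) computed directly from the logit (hypothetical helper)."""
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


logit = -800.0
naive_prob = sigmoid(logit)  # underflows to exactly 0.0, so its log is undefined
stable_log_prob = log_sigmoid(logit)  # stays finite
```

A loss that needs log-probabilities can therefore profit from receiving logits and handling the conversion itself, which is the situation pred_as_logit is meant to accommodate.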
Violation-based Losses
- class fairret.loss.violation.ViolationLoss
Bases: fairret.loss.base.FairnessLoss
Abstract base class for fairness losses that penalize the violation vector of a fairness constraint. The violation vector is computed as the gap between the statistics per sensitive feature and a target statistic.
Each subclass must implement the penalize_violation method.
- __init__(statistic)
- Parameters:
statistic (fairret.statistic.base.Statistic) – The statistic that should be used to calculate the violation vector. Preferably, a LinearFractionalStatistic is provided, as this allows for a straightforward calculation of the target statistic as the overall statistic.
- abstract penalize_violation(violation)
Penalize the fairness violation.
- Parameters:
violation (torch.Tensor) – The violation vector, i.e. the vector of gaps between the statistics per sensitive feature and the target statistic.
- Returns:
A scalar tensor.
- Return type:
torch.Tensor
- forward(pred, sens, *stat_args, pred_as_logit=True, target_statistic=None, **stat_kwargs)
Calculate the violation vector in relation to the target_statistic and penalize this violation using the penalize_violation() method implemented by the subclass.
- Parameters:
pred (torch.Tensor) – Predictions of shape \((N, 1)\), since binary classification or regression is assumed.
sens (torch.Tensor) – Sensitive features of shape \((N, S)\) with S the number of sensitive features.
*stat_args (Any) – All arguments used by the statistic that this loss minimizes.
pred_as_logit (bool) – Whether the pred tensor should be interpreted as logits. Though most losses will simply take the sigmoid of pred if pred_as_logit is True, some losses benefit from improved numerical stability if they handle the conversion themselves.
target_statistic (torch.Tensor | None) – The target statistic as a scalar tensor. If not provided for a LinearFractionalStatistic, the overall statistic will be used by default.
**stat_kwargs (Any) – All keyword arguments used by the statistic that this loss computes.
- Returns:
The calculated loss as a scalar tensor.
- Return type:
torch.Tensor
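As an illustration of the violation vector described for ViolationLoss, the following plain-Python sketch uses the positive rate as the statistic and the overall positive rate as the default target. The function names are hypothetical, and the gap is assumed to be a plain difference; the library may scale or normalize the gap differently.

```python
def group_statistics(probs, sens):
    """Positive rate per sensitive feature: mean of probs weighted by each sens column."""
    stats = []
    for s in range(len(sens[0])):
        weight = sum(row[s] for row in sens)
        stats.append(sum(row[s] * p for row, p in zip(sens, probs)) / weight)
    return stats


def violation_vector(probs, sens, target_statistic=None):
    """Gap between each per-feature statistic and the target (assumed: plain difference)."""
    if target_statistic is None:
        # Fall back to the overall statistic, as described for target_statistic.
        target_statistic = sum(probs) / len(probs)
    return [stat - target_statistic for stat in group_statistics(probs, sens)]


probs = [0.9, 0.7, 0.2, 0.4]  # positive-class probabilities, N = 4
sens = [[1, 0], [1, 0], [0, 1], [0, 1]]  # shape (N, S) with S = 2
violation = violation_vector(probs, sens)
```

A subclass would then reduce this vector to a scalar in penalize_violation.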
- class fairret.loss.violation.NormLoss
Bases: fairret.loss.violation.ViolationLoss
Fairness loss that penalizes the p-norm of the violation vector.
- __init__(statistic, p=1)
- Parameters:
statistic (fairret.statistic.base.Statistic) – The statistic that should be used to calculate the violation vector. Preferably, a LinearFractionalStatistic is provided, as this allows for a straightforward calculation of the target statistic as the overall statistic.
p (int) – The order of the norm. Default is 1.
- penalize_violation(violation)
Penalize the fairness violation.
- Parameters:
violation (torch.Tensor) – The violation vector, i.e. the vector of gaps between the statistics per sensitive feature and the target statistic.
- Returns:
A scalar tensor.
- Return type:
torch.Tensor
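The p-norm penalty can be written out in a few lines. This is a plain-Python sketch of the arithmetic with a hypothetical function name, not the library's tensor implementation.

```python
def p_norm(violation, p=1):
    """p-norm of the violation vector: (sum_i |v_i|^p)^(1/p)."""
    return sum(abs(v) ** p for v in violation) ** (1.0 / p)


violation = [0.3, -0.4]
l1_penalty = p_norm(violation, p=1)  # sum of absolute gaps
l2_penalty = p_norm(violation, p=2)  # Euclidean length of the gap vector
```

With p=1 every sensitive feature contributes in proportion to its gap, while larger p puts more weight on the largest gaps.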
- class fairret.loss.violation.LSELoss
Bases: fairret.loss.violation.ViolationLoss
Fairness loss that penalizes the log-sum-exp of the violation vector. The log-sum-exp is a smooth approximation of the maximum function, so it approximates the maximum violation (or its \(\Vert \cdot \Vert_\infty\) norm).
- penalize_violation(violation)
Penalize the fairness violation.
- Parameters:
violation (torch.Tensor) – The violation vector, i.e. the vector of gaps between the statistics per sensitive feature and the target statistic.
- Returns:
A scalar tensor.
- Return type:
torch.Tensor
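To see why the log-sum-exp acts as a smooth maximum, the sketch below computes it in plain Python using the usual max-shift for stability. The function name is hypothetical and the code is not taken from the library. For a vector of length n, the result always lies between the maximum and the maximum plus log(n).

```python
import math


def log_sum_exp(values):
    """log(sum_i exp(v_i)), shifted by the maximum so exp() cannot overflow."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


violation = [0.25, 0.05, 0.10]
smooth_max = log_sum_exp(violation)
upper_bound = max(violation) + math.log(len(violation))
```

Unlike the hard maximum, this quantity has a nonzero gradient with respect to every entry of the violation vector.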
Projection-based Losses
- class fairret.loss.projection.ProjectionLoss
Bases: fairret.loss.base.FairnessLoss
Abstract base class for fairness losses that penalize the statistical distance between a set of predictions and the fair projection of those predictions. The fair projection satisfies the linear fairness constraint corresponding to a LinearFractionalStatistic that is fixed to a target value (such as the overall statistic).
The projections are computed using cvxpy. Hence, any subclass is expected to implement the statistical distance between distributions in both cvxpy and PyTorch by implementing the cvxpy_distance() method and the torch_distance() method respectively.
Optionally, the torch_distance_with_logits() method can be overridden to provide a more numerically stable handling of predictions that are provided as logits. If left unimplemented, torch_distance() will be called instead, after applying the sigmoid function to the predictions.
Note
We use ‘statistical distance’ in a broad sense here, and do not require that the distance is a metric. See https://en.wikipedia.org/wiki/Statistical_distance for more information.
- __init__(statistic, force_proj_normalized=True, proj_eps=0., **solver_kwargs)
- Parameters:
statistic (fairret.statistic.linear_fractional.LinearFractionalStatistic) – The LinearFractionalStatistic that defines the fairness constraint. The projection is computed through convex optimization, so the constraint should be linear. This is achieved by requiring the LinearFractionalStatistic value of each sensitive feature to equal the overall statistic.
force_proj_normalized (bool) – Whether to force the projected distribution to be normalized. This might not be the case if the optimization does not converge to a solution that satisfies the normalization constraint. Hence, setting this to True will renormalize the projected distribution to sum to 1.
proj_eps (float) – Every probability value in the projected distribution is clamped to the interval [proj_eps, 1 - proj_eps]. Default is 0. Setting this to a small positive value helps prevent numerical instability if the optimization is not run to convergence.
**solver_kwargs (Any) – Any keyword arguments to be passed to the cvxpy solver. The default configuration is:
{ 'solver': 'SCS', 'warm_start': True, 'max_iters': 10, 'ignore_dpp': True }
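The effect of proj_eps and force_proj_normalized on a solver output that has not fully converged can be sketched as follows. This is a plain-Python illustration with a hypothetical function name; the order of the two steps (clamp first, then renormalize) is an assumption and may differ from the library.

```python
def clamp_and_normalize(proj, proj_eps=0.0, force_proj_normalized=True):
    """Post-process an (N, 2) projection given as a list of rows."""
    cleaned = []
    for row in proj:
        # Clamp every probability to [proj_eps, 1 - proj_eps].
        row = [min(max(v, proj_eps), 1.0 - proj_eps) for v in row]
        if force_proj_normalized:
            # Rescale so the two class probabilities sum to 1.
            total = sum(row)
            row = [v / total for v in row]
        cleaned.append(row)
    return cleaned


# Rows that an early-stopped solver might return: slightly outside [0, 1],
# or not summing to 1.
raw = [[0.31, 0.72], [-0.02, 1.01]]
cleaned = clamp_and_normalize(raw, proj_eps=1e-6)
```

Keeping the projected probabilities strictly inside (0, 1) matters for distances that take logarithms or divide by a probability.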
- abstract cvxpy_distance(pred, proj)
Compute the statistical distance between pred and proj in cvxpy. Used for the convex optimization problem.
- Parameters:
pred (cp.Parameter) – The predicted distribution in shape (N,2). As we assume binary classification, the first column is the probability of the negative class and the second column is the probability of the positive class.
proj (cp.Variable) – The projected distribution in shape (N,2). As we assume binary classification, the first column is the probability of the negative class and the second column is the probability of the positive class.
- Returns:
The statistical distance in shape (1,).
- Return type:
cp.Expression
- abstract torch_distance(pred, proj)
Compute the statistical distance between pred and proj in PyTorch. Used for computing the gradient of the distance between the predictions and the projection (with respect to the predictions).
- Parameters:
pred (torch.Tensor) – The predicted distribution in shape (N,1). As we assume binary classification, this is the probability of the positive class.
proj (torch.Tensor) – The projected distribution in shape (N,2). As we assume binary classification, the first column is the probability of the negative class and the second column is the probability of the positive class.
- Returns:
The statistical distance as a scalar tensor.
- Return type:
torch.Tensor
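Note that torch_distance() receives pred in shape (N,1) but proj in shape (N,2). One way to compare them, sketched here in plain Python with a hypothetical helper name, is to expand each positive-class probability into a two-column row using the column order stated above (negative class first).

```python
def to_two_columns(pred):
    """Expand (N, 1) positive-class probabilities to (N, 2) rows [P(negative), P(positive)]."""
    return [[1.0 - row[0], row[0]] for row in pred]


pred = [[0.8], [0.25]]
pred_dist = to_two_columns(pred)
```

After this expansion, each row of pred_dist lines up element-wise with the corresponding row of proj.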
- torch_distance_with_logits(pred, proj)
A more numerically stable alternative method to torch_distance(), where pred is assumed to be logits.
- Parameters:
pred (torch.Tensor) – The predicted distribution as logits, in shape (N,1). As we assume binary classification, this is the logit of the probability of the positive class.
proj (torch.Tensor) – The projected distribution in shape (N,2) as probabilities. As we assume binary classification, the first column is the probability of the negative class and the second column is the probability of the positive class.
- Returns:
The statistical distance as a scalar tensor.
- Return type:
torch.Tensor
- forward(pred, sens, *stat_args, pred_as_logit=True, **stat_kwargs)
Calculate the fairness loss by projecting the predictions onto the fair set and computing the statistical distance between the predictions and the projection.
- Parameters:
pred (torch.Tensor) – Predictions of shape \((N, 1)\), since binary classification or regression is assumed.
sens (torch.Tensor) – Sensitive features of shape \((N, S)\) with S the number of sensitive features.
*stat_args (Any) – All arguments used by the statistic that this loss minimizes.
pred_as_logit (bool) – Whether the pred tensor should be interpreted as logits. Though most losses will simply take the sigmoid of pred if pred_as_logit is True, some losses benefit from improved numerical stability if they handle the conversion themselves.
**stat_kwargs (Any) – All keyword arguments used by the statistic that this loss computes.
- Returns:
The calculated loss as a scalar tensor.
- Return type:
torch.Tensor
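The project-then-measure idea can be illustrated with a case that has a closed-form solution. The sketch below is a deliberate simplification and not the library's cvxpy-based method: it assumes disjoint groups given as integer labels (rather than an (N, S) sens matrix), uses the positive rate as the statistic, works on positive-class probabilities only, measures squared Euclidean distance, and ignores the requirement that projected values stay in [0, 1]. All names are hypothetical.

```python
def project_to_target(probs, groups, target):
    """Closest point (in squared Euclidean distance) whose group means all equal target.

    For disjoint groups this amounts to shifting every member of a group
    by the same amount, target - group_mean.
    """
    group_means = {}
    for g in set(groups):
        members = [p for p, gi in zip(probs, groups) if gi == g]
        group_means[g] = sum(members) / len(members)
    return [p + (target - group_means[g]) for p, g in zip(probs, groups)]


def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


probs = [0.9, 0.7, 0.2, 0.4]
groups = [0, 0, 1, 1]
target = sum(probs) / len(probs)  # overall positive rate
proj = project_to_target(probs, groups, target)
loss = squared_euclidean(probs, proj)
```

The loss is zero exactly when the predictions already satisfy the constraint, and it grows as they move away from the fair set.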
- class fairret.loss.projection.KLProjectionLoss
Bases: fairret.loss.projection.ProjectionLoss
Fairness loss that penalizes the Kullback-Leibler divergence between the predicted distribution and the fair projection of the predictions.
- cvxpy_distance(pred, proj)
Compute the statistical distance between pred and proj in cvxpy. Used for the convex optimization problem.
- Parameters:
pred (cp.Parameter) – The predicted distribution in shape (N,2). As we assume binary classification, the first column is the probability of the negative class and the second column is the probability of the positive class.
proj (cp.Variable) – The projected distribution in shape (N,2). As we assume binary classification, the first column is the probability of the negative class and the second column is the probability of the positive class.
- Returns:
The statistical distance in shape (1,).
- Return type:
cp.Expression
- torch_distance(pred, proj)
Compute the statistical distance between pred and proj in PyTorch. Used for computing the gradient of the distance between the predictions and the projection (with respect to the predictions).
- Parameters:
pred (torch.Tensor) – The predicted distribution in shape (N,1). As we assume binary classification, this is the probability of the positive class.
proj (torch.Tensor) – The projected distribution in shape (N,2). As we assume binary classification, the first column is the probability of the negative class and the second column is the probability of the positive class.
- Returns:
The statistical distance as a scalar tensor.
- Return type:
torch.Tensor
- torch_distance_with_logits(pred, proj)
A more numerically stable alternative method to torch_distance(), where pred is assumed to be logits.
- Parameters:
pred (torch.Tensor) – The predicted distribution as logits, in shape (N,1). As we assume binary classification, this is the logit of the probability of the positive class.
proj (torch.Tensor) – The projected distribution in shape (N,2) as probabilities. As we assume binary classification, the first column is the probability of the negative class and the second column is the probability of the positive class.
- Returns:
The statistical distance as a scalar tensor.
- Return type:
torch.Tensor
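The difference between a probability-based and a logit-based KL computation can be sketched in plain Python. The direction of the divergence (projection relative to prediction), the summation over samples, and all function names are assumptions made for illustration; the code is not the library's implementation.

```python
import math


def sigmoid(x):
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def log_sigmoid(x):
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def kl_from_probs(pred, proj):
    """Sum over samples of KL(proj_i || pred_i); pred holds positive-class probabilities."""
    total = 0.0
    for p, (q0, q1) in zip(pred, proj):
        for q, r in ((q0, 1.0 - p), (q1, p)):
            if q > 0.0:
                total += q * (math.log(q) - math.log(r))  # fails if r rounds to 0
    return total


def kl_from_logits(logits, proj):
    """Same quantity, but log-probabilities come straight from the logits."""
    total = 0.0
    for z, (q0, q1) in zip(logits, proj):
        for q, log_r in ((q0, log_sigmoid(-z)), (q1, log_sigmoid(z))):
            if q > 0.0:
                total += q * (math.log(q) - log_r)
    return total


logits = [0.5, -1.2]
proj = [[0.4, 0.6], [0.7, 0.3]]
from_probs = kl_from_probs([sigmoid(z) for z in logits], proj)
from_logits = kl_from_logits(logits, proj)
extreme = kl_from_logits([800.0], [[0.5, 0.5]])  # still finite
```

For moderate logits the two versions agree; for extreme logits only the logit-based version remains well defined, because the probability of one class rounds to exactly 0.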
- class fairret.loss.projection.JensenShannonProjectionLoss
Bases: fairret.loss.projection.ProjectionLoss
Fairness loss that penalizes the Jensen-Shannon divergence between the predicted distribution and the fair projection of the predictions.
- cvxpy_distance(pred, proj)
Compute the statistical distance between pred and proj in cvxpy. Used for the convex optimization problem.
- Parameters:
pred (cp.Parameter) – The predicted distribution in shape (N,2). As we assume binary classification, the first column is the probability of the negative class and the second column is the probability of the positive class.
proj (cp.Variable) – The projected distribution in shape (N,2). As we assume binary classification, the first column is the probability of the negative class and the second column is the probability of the positive class.
- Returns:
The statistical distance in shape (1,).
- Return type:
cp.Expression
- torch_distance(pred, proj)
Compute the statistical distance between pred and proj in PyTorch. Used for computing the gradient of the distance between the predictions and the projection (with respect to the predictions).
- Parameters:
pred (torch.Tensor) – The predicted distribution in shape (N,1). As we assume binary classification, this is the probability of the positive class.
proj (torch.Tensor) – The projected distribution in shape (N,2). As we assume binary classification, the first column is the probability of the negative class and the second column is the probability of the positive class.
- Returns:
The statistical distance as a scalar tensor.
- Return type:
torch.Tensor
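For reference, the Jensen-Shannon divergence between two discrete distributions can be computed as below. This is a plain-Python sketch for a single sample using natural logarithms, with hypothetical function names; how the library aggregates over the N samples is not shown here.

```python
import math


def kl(a, b):
    """KL(a || b) for discrete distributions; terms with a_i = 0 contribute 0."""
    return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0.0)


def jensen_shannon(a, b):
    """Average KL divergence of a and b to their midpoint distribution."""
    mid = [(x + y) / 2.0 for x, y in zip(a, b)]
    return 0.5 * kl(a, mid) + 0.5 * kl(b, mid)


pred_row = [0.2, 0.8]  # [P(negative), P(positive)]
proj_row = [0.5, 0.5]
js = jensen_shannon(pred_row, proj_row)
```

Unlike the KL divergence, this quantity is symmetric in its arguments and bounded above by log(2).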
- class fairret.loss.projection.TotalVariationProjectionLoss
Bases: fairret.loss.projection.ProjectionLoss
Fairness loss that penalizes the Total Variation Distance between the predicted distribution and the fair projection of the predictions.
- cvxpy_distance(pred, proj)
Compute the statistical distance between pred and proj in cvxpy. Used for the convex optimization problem.
- Parameters:
pred (cp.Parameter) – The predicted distribution in shape (N,2). As we assume binary classification, the first column is the probability of the negative class and the second column is the probability of the positive class.
proj (cp.Variable) – The projected distribution in shape (N,2). As we assume binary classification, the first column is the probability of the negative class and the second column is the probability of the positive class.
- Returns:
The statistical distance in shape (1,).
- Return type:
cp.Expression
- torch_distance(pred, proj)
Compute the statistical distance between pred and proj in PyTorch. Used for computing the gradient of the distance between the predictions and the projection (with respect to the predictions).
- Parameters:
pred (torch.Tensor) – The predicted distribution in shape (N,1). As we assume binary classification, this is the probability of the positive class.
proj (torch.Tensor) – The projected distribution in shape (N,2). As we assume binary classification, the first column is the probability of the negative class and the second column is the probability of the positive class.
- Returns:
The statistical distance as a scalar tensor.
- Return type:
torch.Tensor
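The total variation distance between two discrete distributions is half the sum of absolute differences. The plain-Python sketch below, with a hypothetical function name, computes it for a single sample; aggregation over samples is left out.

```python
def total_variation(a, b):
    """Total variation distance: 0.5 * sum_i |a_i - b_i|."""
    return 0.5 * sum(abs(x - y) for x, y in zip(a, b))


pred_row = [0.2, 0.8]  # [P(negative), P(positive)]
proj_row = [0.5, 0.5]
tv = total_variation(pred_row, proj_row)
```

In the binary case this reduces to the absolute difference between the two positive-class probabilities.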
- class fairret.loss.projection.ChiSquaredProjectionLoss
Bases: fairret.loss.projection.ProjectionLoss
Fairness loss that penalizes the Chi-Squared Distance between the predicted distribution and the fair projection of the predictions.
- cvxpy_distance(pred, proj)
Compute the statistical distance between pred and proj in cvxpy. Used for the convex optimization problem.
- Parameters:
pred (cp.Parameter) – The predicted distribution in shape (N,2). As we assume binary classification, the first column is the probability of the negative class and the second column is the probability of the positive class.
proj (cp.Variable) – The projected distribution in shape (N,2). As we assume binary classification, the first column is the probability of the negative class and the second column is the probability of the positive class.
- Returns:
The statistical distance in shape (1,).
- Return type:
cp.Expression
- torch_distance(pred, proj)
Compute the statistical distance between pred and proj in PyTorch. Used for computing the gradient of the distance between the predictions and the projection (with respect to the predictions).
- Parameters:
pred (torch.Tensor) – The predicted distribution in shape (N,1). As we assume binary classification, this is the probability of the positive class.
proj (torch.Tensor) – The projected distribution in shape (N,2). As we assume binary classification, the first column is the probability of the negative class and the second column is the probability of the positive class.
- Returns:
The statistical distance as a scalar tensor.
- Return type:
torch.Tensor
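A chi-squared distance between two discrete distributions can be sketched as follows. Which of the two distributions appears in the denominator is an assumption here (the second argument is used), as is the hypothetical function name; the library's convention may differ.

```python
def chi_squared(a, b):
    """Pearson chi-squared distance: sum_i (a_i - b_i)^2 / b_i (b_i must be > 0)."""
    return sum((x - y) ** 2 / y for x, y in zip(a, b))


pred_row = [0.2, 0.8]  # [P(negative), P(positive)]
proj_row = [0.5, 0.5]
chi2 = chi_squared(pred_row, proj_row)
```

Because of the division, this distance is not symmetric, and it is sensitive to probabilities near 0 in the denominator.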
- class fairret.loss.projection.SquaredEuclideanProjectionLoss
Bases: fairret.loss.projection.ProjectionLoss
Fairness loss that penalizes the Squared Euclidean Distance between the predicted distribution and the fair projection of the predictions.
- cvxpy_distance(pred, proj)
Compute the statistical distance between pred and proj in cvxpy. Used for the convex optimization problem.
- Parameters:
pred (cp.Parameter) – The predicted distribution in shape (N,2). As we assume binary classification, the first column is the probability of the negative class and the second column is the probability of the positive class.
proj (cp.Variable) – The projected distribution in shape (N,2). As we assume binary classification, the first column is the probability of the negative class and the second column is the probability of the positive class.
- Returns:
The statistical distance in shape (1,).
- Return type:
cp.Expression
- torch_distance(pred, proj)
Compute the statistical distance between pred and proj in PyTorch. Used for computing the gradient of the distance between the predictions and the projection (with respect to the predictions).
- Parameters:
pred (torch.Tensor) – The predicted distribution in shape (N,1). As we assume binary classification, this is the probability of the positive class.
proj (torch.Tensor) – The projected distribution in shape (N,2). As we assume binary classification, the first column is the probability of the negative class and the second column is the probability of the positive class.
- Returns:
The statistical distance as a scalar tensor.
- Return type:
torch.Tensor
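The squared Euclidean distance over a batch of two-column distributions is sketched below in plain Python. Summing over all N rows is an assumption about the aggregation, and the function name is hypothetical.

```python
def squared_euclidean(pred_dist, proj):
    """Sum of squared element-wise differences over all rows and both columns."""
    return sum(
        (x - y) ** 2
        for pred_row, proj_row in zip(pred_dist, proj)
        for x, y in zip(pred_row, proj_row)
    )


pred_dist = [[0.2, 0.8], [0.6, 0.4]]  # rows are [P(negative), P(positive)]
proj = [[0.5, 0.5], [0.5, 0.5]]
distance = squared_euclidean(pred_dist, proj)
```

This distance involves no logarithm or division, so it stays finite even when a probability is exactly 0 or 1.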