1sec.ai

Tag

#causal inference

Every item tagged causal inference, newest first.

1 item

Wasserstein Policy Learning for Distributional Outcomes

Offline policy learning is studied for distribution-valued outcomes, where each potential outcome is a probability measure on the real line rather than a single number. A utility functional, defined through the Wasserstein distance, maps each potential outcome to a scalar reward. The goal is to learn a policy that maximizes the empirical welfare, defined as the mean of these scalar rewards.

Key takeaways
  • Offline policy learning is studied for distribution-valued outcomes.
  • The Wasserstein distance is used to define the reward.
  • A utility functional maps each potential outcome to a scalar reward.
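
To make the summary concrete, here is a minimal sketch of how a Wasserstein-based utility and an empirical welfare could be computed for outcomes on the real line. It is not the paper's method: the reference measure `TARGET`, the choice of utility as the negative 2-Wasserstein distance to that reference, the inverse-propensity-weighted estimator, the toy data, and all function names are assumptions made for illustration. The only fact it relies on is that, in one dimension, the 2-Wasserstein distance equals the L2 distance between quantile functions.

```python
from statistics import NormalDist


def quantile_fn(sample, levels):
    """Empirical quantile function of `sample` at each level in (0, 1)."""
    xs = sorted(sample)
    n = len(xs)
    return [xs[min(int(u * n), n - 1)] for u in levels]


def wasserstein2(sample_a, sample_b, n_levels=200):
    """Approximate W_2 between two empirical measures on the real line.

    In one dimension, W_2^2 is the integral over u in (0, 1) of the squared
    difference of the quantile functions; a midpoint rule approximates it.
    """
    levels = [(k + 0.5) / n_levels for k in range(n_levels)]
    qa = quantile_fn(sample_a, levels)
    qb = quantile_fn(sample_b, levels)
    return (sum((a - b) ** 2 for a, b in zip(qa, qb)) / n_levels) ** 0.5


# Assumption: 50 standard-normal quantile points stand in for a distribution.
BASE = [NormalDist().inv_cdf((k + 0.5) / 50) for k in range(50)]
TARGET = [1.0 + z for z in BASE]  # hypothetical reference measure, N(1, 1)


def utility(outcome_sample):
    """Assumed utility functional: negative W_2 distance to TARGET."""
    return -wasserstein2(outcome_sample, TARGET)


def ipw_welfare(policy, data, propensity=0.5):
    """Inverse-propensity-weighted estimate of a policy's mean utility."""
    total = 0.0
    for x, action, outcome in data:
        if policy(x) == action:  # only units whose logged action matches
            total += utility(outcome) / propensity
    return total / len(data)


# Toy logged data (hypothetical): covariate x on a grid in (-1, 1), actions
# alternating so each arm is taken half the time. Action 1 centers the
# outcome distribution at x, action 0 centers it at -x.
data = []
for i in range(200):
    x = -1.0 + (2 * i + 1) / 200
    action = i % 2
    center = x if action == 1 else -x
    data.append((x, action, [center + z for z in BASE]))


def threshold_policy(x):
    return 1 if x > 0 else 0


def always_treat(x):
    return 1


print("threshold policy welfare:", ipw_welfare(threshold_policy, data))
print("always-treat welfare:   ", ipw_welfare(always_treat, data))
```

In this toy setup the threshold policy moves each unit's outcome distribution closer to the reference than treating everyone does, so its estimated welfare is higher. With real logged data the propensities would have to be known or estimated, and a different estimator may be preferable.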