# Side effects penalties
This is the code for the paper [Penalizing side effects using stepwise relative reachability](https://arxiv.org/abs/1806.01186) by Krakovna et al. (2019). It implements a tabular Q-learning agent with different side effects penalties. Each penalty consists of a deviation measure (none, unreachability, relative reachability, or attainable utility) and a baseline (starting state, inaction, or stepwise inaction).
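As background: for a given transition, the penalty is the chosen deviation measure evaluated against the chosen baseline state, scaled by a weight beta and subtracted from the reward. Below is a minimal illustrative sketch (not the repository's implementation) of the relative reachability deviation with the truncation summary function; the function names and reachability arrays are assumptions for illustration:

```python
import numpy as np

def relative_reachability(reach_baseline, reach_current):
  """Truncated decrease in reachability, averaged over states.

  Only reductions relative to the baseline are penalized
  (the truncation summary function, max(0, x))."""
  return np.mean(np.maximum(0.0, reach_baseline - reach_current))

def side_effects_penalty(reach_baseline, reach_current, beta):
  """Penalty subtracted from the reward, scaled by beta."""
  return beta * relative_reachability(reach_baseline, reach_current)

# Toy example: the agent's action makes one state unreachable.
reach_baseline = np.array([1.0, 1.0, 0.5])  # reachability from the baseline state
reach_current = np.array([1.0, 0.0, 0.5])   # reachability from the current state
print(side_effects_penalty(reach_baseline, reach_current, beta=0.1))  # 0.1 * 1/3
```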
## Instructions
Clone the repository (this code lives in the `side_effects_penalties` subdirectory of the `deepmind-research` repository):

```
git clone https://github.com/deepmind/deepmind-research.git
```
### Running an agent with a side effects penalty
Run the agent with a given penalty on an AI Safety Gridworlds environment (a complete example follows the argument lists below):

```
python -m side_effects_penalties.run_experiment --baseline <X> --dev_measure <Y> --env_name <Z> --suffix <S>
```
The following parameters can be specified for the side effects penalty:
- Baseline state (`--baseline`): starting state (`start`), inaction (`inaction`), stepwise inaction with rollouts (`stepwise`), or stepwise inaction without rollouts (`step_noroll`)
- Deviation measure (`--dev_measure`): none (`none`), unreachability (`reach`), relative reachability (`rel_reach`), or attainable utility (`att_util`)
- Discount factor for the deviation measure value function (`--value_discount`)
- Summary function to apply to the relative reachability or attainable utility deviation measure (`--dev_fun`): max(0, x) (`truncation`) or |x| (`absolute`)
- Weight for the side effects penalty relative to the reward (`--beta`)
Other arguments:
- AI Safety Gridworlds environment name (`--env_name`)
- Number of episodes (`--num_episodes`)
- Filename suffix for saving result files (`--suffix`)
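For example, the following runs the agent with the relative reachability penalty and the stepwise inaction baseline. The environment name (`box`) and the suffix are illustrative placeholders; check the AI Safety Gridworlds suite for the available environment names:

```
python -m side_effects_penalties.run_experiment --baseline stepwise --dev_measure rel_reach --env_name box --suffix my_run
```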
### Plotting the results
Make a summary data frame from the result files generated by `run_experiment` (an example follows the argument list below):

```
python -m side_effects_penalties.results_summary --compare_penalties --input_suffix <S>
```
Arguments:
- `--bar_plot`: make a data frame for a bar plot (`True`) or a learning curve plot (`False`)
- `--compare_penalties`: compare different penalties using the best beta value for each penalty (`True`), or compare different beta values for a given penalty (`False`)
- If `--compare_penalties=False`, specify the penalty parameters (`--dev_measure`, `--dev_fun`, and `--value_discount`)
- Environment name (`--env_name`)
- Filename suffix for loading result files (`--input_suffix`)
- Filename suffix for the summary data frame (`--output_suffix`)
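For example, the following summarizes the result files from the run above into a bar-plot data frame (the suffixes are illustrative placeholders matching the earlier example):

```
python -m side_effects_penalties.results_summary --compare_penalties --bar_plot --env_name box --input_suffix my_run --output_suffix my_run_summary
```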
Import the summary data frame into `plot_results.ipynb` to make a bar plot or a learning curve plot.
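Alternatively, a summary data frame can be plotted directly with pandas and seaborn. The file name and column names below are hypothetical placeholders; adjust them to match the data frame that `results_summary` actually produces:

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical file name and columns -- adjust to the actual summary output.
df = pd.read_csv('summary_my_run_summary.csv')

# Bar plot comparing final performance across penalties.
sns.barplot(data=df, x='penalty', y='performance')
plt.show()
```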
## Dependencies
- Python 2.7 or 3 (tested with Python 2.7.15 and 3.6.7)
- AI Safety Gridworlds suite of safety environments
- Abseil Python common libraries
- NumPy
- Pandas
- Six
- Matplotlib
- Seaborn
## Citing this work
If you use this code in your work, please cite the accompanying paper:
```bibtex
@article{srr2019,
  title   = {Penalizing Side Effects using Stepwise Relative Reachability},
  author  = {Victoria Krakovna and Laurent Orseau and Ramana Kumar and Miljan Martic and Shane Legg},
  journal = {CoRR},
  volume  = {abs/1806.01186},
  year    = {2019},
}
```