yawning_titan.envs.generic.core.reward_functions.one_per_timestep#

yawning_titan.envs.generic.core.reward_functions.one_per_timestep(args)[source]#

Give a reward for 0.1 for every timestep that the blue agent is alive.

Parameters:

args – A dictionary containing the following items: network_interface: Interface with the network blue_action: The action that the blue agent has taken this turn blue_node: The node that the blue agent has targeted for their action start_state: The state of the nodes before the blue agent has taken their action end_state: The state of the nodes after the blue agent has taken their action start_vulnerabilities: The vulnerabilities before blue agents turn end_vulnerabilities: The vulnerabilities after the blue agents turn start_isolation: The isolation status of all the nodes at the start of a turn end_isolation: The isolation status of all the nodes at the end of a turn start_blue: The env as the blue agent can see it before the blue agents turn end_blue: The env as the blue agent can see it after the blue agents turn

Returns:

0.1