yawning_titan.envs.generic.core.reward_functions.experimental_rewards

yawning_titan.envs.generic.core.reward_functions.experimental_rewards(args)

Calculate the reward for the current state of the environment.

Each action has a cost. The blue agent is rewarded for removing red from compromised nodes and for reducing the vulnerability of nodes.

Parameters:

args – A dictionary containing the following items:

network_interface: Interface with the network
blue_action: The action that the blue agent has taken this turn
blue_node: The node that the blue agent has targeted for its action
start_state: The state of the nodes before the blue agent has taken its action
end_state: The state of the nodes after the blue agent has taken its action
start_vulnerabilities: The vulnerabilities before the blue agent's turn
end_vulnerabilities: The vulnerabilities after the blue agent's turn
start_isolation: The isolation status of all the nodes at the start of a turn
end_isolation: The isolation status of all the nodes at the end of a turn
start_blue: The env as the blue agent can see it before the blue agent's turn
end_blue: The env as the blue agent can see it after the blue agent's turn

Returns:

The reward earned by the blue agent for this turn.
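
The following is a minimal sketch of how a reward of this general shape could be computed from the args dictionary. It is not the library's implementation. The function name sketch_reward, the ACTION_COSTS table, the action names, and the value shapes (node-to-0/1 maps for the states, node-to-float maps for the vulnerabilities) are all assumptions made for illustration. Only the dictionary keys come from the description above.

```python
from typing import Any, Dict

# Hypothetical per-action costs; the real actions and costs are not documented here.
ACTION_COSTS: Dict[str, float] = {
    "do_nothing": 0.0,
    "scan": 0.5,
    "restore_node": 1.0,
}


def sketch_reward(args: Dict[str, Any]) -> float:
    """Illustrative reward: pay for the action, earn for fewer compromised
    nodes and for lower total vulnerability. Not the library's implementation."""
    # Unknown actions are treated as free in this sketch.
    cost = ACTION_COSTS.get(args["blue_action"], 0.0)

    # Assumed shape: {node: 1 if compromised else 0}
    removed_red = sum(args["start_state"].values()) - sum(args["end_state"].values())

    # Assumed shape: {node: vulnerability score}
    vuln_drop = sum(args["start_vulnerabilities"].values()) - sum(
        args["end_vulnerabilities"].values()
    )

    return removed_red + vuln_drop - cost


# Example turn: blue restores node "a", which also lowers its vulnerability.
example_args = {
    "network_interface": None,  # not used by this sketch
    "blue_action": "restore_node",
    "blue_node": "a",
    "start_state": {"a": 1, "b": 0},
    "end_state": {"a": 0, "b": 0},
    "start_vulnerabilities": {"a": 0.5, "b": 0.25},
    "end_vulnerabilities": {"a": 0.25, "b": 0.25},
    "start_isolation": {"a": False, "b": False},
    "end_isolation": {"a": False, "b": False},
    "start_blue": {"a": 1, "b": 0},
    "end_blue": {"a": 0, "b": 0},
}

reward = sketch_reward(example_args)  # 1 node cleaned + 0.25 vulnerability drop - 1.0 cost
```

In this sketch, an action that changes nothing yields a reward equal to minus its cost, which reflects the statement above that actions are not free.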