yawning_titan.envs.generic.generic_env.GenericNetworkEnv#

class yawning_titan.envs.generic.generic_env.GenericNetworkEnv(red_agent, blue_agent, network_interface, print_metrics=False, show_metrics_every=1, collect_additional_per_ts_data=True, print_per_ts_data=False)[source]#

Bases: Env

Class to create a generic YAWNING TITAN gym environment.

Initialise the generic network environment.

Parameters:
  • red_agent – Object from the RedInterface class

  • blue_agent – Object from the BlueInterface class

  • network_interface – Object from the NetworkInterface class

  • print_metrics – Whether or not to print metrics (boolean)

  • show_metrics_every – Interval, in timesteps, at which summary metrics are shown (int)

  • collect_additional_per_ts_data – Whether or not to collect additional per timestep data (boolean)

  • print_per_ts_data – Whether or not to print collected per timestep data (boolean)

Note: The notes variable returned at the end of each timestep contains the per-timestep data. By default it contains the base level of information required by some of the reward functions. When collect_additional_per_ts_data is enabled, considerably more data is collected.
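A minimal construction sketch is given below. The import paths and the NetworkInterface/agent setup are assumptions and may differ between YAWNING TITAN versions; only the GenericNetworkEnv call itself follows the signature documented above.

    # Minimal sketch: import paths and NetworkInterface/agent construction are
    # assumptions and may vary between YAWNING TITAN versions.
    from yawning_titan.envs.generic.core.blue_interface import BlueInterface
    from yawning_titan.envs.generic.core.red_interface import RedInterface
    from yawning_titan.envs.generic.core.network_interface import NetworkInterface
    from yawning_titan.envs.generic.generic_env import GenericNetworkEnv

    network_interface = NetworkInterface(...)      # placeholder; see the NetworkInterface docs
    red_agent = RedInterface(network_interface)    # assumed constructor argument
    blue_agent = BlueInterface(network_interface)  # assumed constructor argument

    env = GenericNetworkEnv(
        red_agent,
        blue_agent,
        network_interface,
        print_metrics=True,
        show_metrics_every=10,
        collect_additional_per_ts_data=True,
    )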

Methods

  • calculate_observation_space_size – Calculate the observation space size.

  • close – Override close in your subclass to perform any necessary cleanup.

  • render – Render the environment using Matplotlib to create an animation.

  • reset – Reset the environment to the default state.

  • seed – Sets the seed for this env's random number generator(s).

  • step – Takes a time step and executes the actions for both the Blue RL agent and the non-learning Red agent.

Attributes

action_space = None#
observation_space = None#
reset()[source]#

Reset the environment to the default state.

Todo:

May need to add customization of cuda setting.

Returns:

A new starting observation (numpy array).
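For example, assuming env is a GenericNetworkEnv instance:

    obs = env.reset()  # new starting observation as a numpy array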

step(action)[source]#

Takes a time step and executes the actions for both the Blue RL agent and the non-learning Red agent.

Parameters:

action – The action value generated from the Blue RL agent (int)

Returns:

A four-tuple containing the next observation as a numpy array, the reward for that timestep, a boolean indicating whether the episode is complete, and additional notes containing timestep information from the environment.
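A sketch of an episode loop built on this four-tuple, sampling a random action as a stand-in for a trained Blue agent (this assumes the action space has been populated during construction):

    obs = env.reset()
    done = False
    while not done:
        action = env.action_space.sample()  # stand-in for the Blue RL agent's chosen action (int)
        obs, reward, done, notes = env.step(action)
    # `notes` holds the per-timestep data described in the class-level note above.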

render(mode='human', show_only_blue_view=False, show_node_names=False)[source]#

Render the environment using Matplotlib to create an animation.

Parameters:
  • mode – the mode of the rendering

  • show_only_blue_view – If True, shows only what the blue agent can see

  • show_node_names – Show the names of the nodes
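For example, to render only the blue agent's view with node names displayed:

    env.render(show_only_blue_view=True, show_node_names=True)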

calculate_observation_space_size(with_feather)[source]#

Calculate the observation space size.

This is done using the current active observation space configuration and the number of nodes within the environment.

Parameters:

with_feather – Whether to include the size of the Feather Wrapper output

Returns:

The observation space size
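For example, to check the flat observation-vector length without the Feather Wrapper output:

    obs_size = env.calculate_observation_space_size(with_feather=False)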

close()#

Override close in your subclass to perform any necessary cleanup.

Environments will automatically close() themselves when garbage collected or when the program exits.
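For example, to release resources explicitly rather than waiting for garbage collection:

    env.close()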

metadata = {'render.modes': []}#
reward_range = (-inf, inf)#
seed(seed=None)#

Sets the seed for this env’s random number generator(s).

Note

Some environments use multiple pseudorandom number generators. We want to capture all such seeds used in order to ensure that there aren’t accidental correlations between multiple generators.

Returns:

Returns the list of seeds used in this env’s random number generators. The first value in the list should be the “main” seed, or the value which a reproducer should pass to ‘seed’. Often, the main seed equals the provided ‘seed’, but this won’t be true if seed=None, for example.

Return type:

list<bigint>
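For example:

    env.seed(42)  # seed the environment's random number generator(s)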

spec = None#
property unwrapped#

Completely unwrap this env.

Returns:

The base non-wrapped gym.Env instance

Return type:

gym.Env
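For an environment with no wrappers applied, this property simply returns the environment itself:

    assert env.unwrapped is env  # no wrappers, so the base env is returned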