yawning_titan.envs.generic.generic_env.GenericNetworkEnv#

class yawning_titan.envs.generic.generic_env.GenericNetworkEnv(red_agent, blue_agent, network_interface, print_metrics=False, show_metrics_every=1, collect_additional_per_ts_data=True, print_per_ts_data=False)[source]#

Bases: Env

Class to create a generic YAWNING TITAN gym environment.

Initialise the generic network environment.

Parameters:
  • red_agent – Object from the RedInterface class

  • blue_agent – Object from the BlueInterface class

  • network_interface – Object from the NetworkInterface class

  • print_metrics – Whether or not to print metrics (boolean)

  • show_metrics_every – Interval, in timesteps, at which summary metrics are shown (int)

  • collect_additional_per_ts_data – Whether or not to collect additional per timestep data (boolean)

  • print_per_ts_data – Whether or not to print collected per timestep data (boolean)

Note: The notes variable returned at the end of each timestep contains the per-timestep data. By default it contains the base level of information required by some of the reward functions. When collect_additional_per_ts_data is enabled, considerably more data is collected.
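A minimal construction sketch is given below. The import paths and the NetworkInterface/agent setup are assumptions and may differ between YAWNING TITAN versions; only the GenericNetworkEnv call itself follows the signature documented above.

    # Minimal sketch: import paths and NetworkInterface/agent construction are
    # assumptions and may vary between YAWNING TITAN versions.
    from yawning_titan.envs.generic.core.blue_interface import BlueInterface
    from yawning_titan.envs.generic.core.red_interface import RedInterface
    from yawning_titan.envs.generic.core.network_interface import NetworkInterface
    from yawning_titan.envs.generic.generic_env import GenericNetworkEnv

    network_interface = NetworkInterface(...)      # placeholder; see the NetworkInterface docs
    red_agent = RedInterface(network_interface)    # assumed constructor argument
    blue_agent = BlueInterface(network_interface)  # assumed constructor argument

    env = GenericNetworkEnv(
        red_agent,
        blue_agent,
        network_interface,
        print_metrics=True,
        show_metrics_every=10,
        collect_additional_per_ts_data=True,
    )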

Methods

  • calculate_observation_space_size – Calculate the observation space size.

  • close – Override close in your subclass to perform any necessary cleanup.

  • render – Render the environment using Matplotlib to create an animation.

  • reset – Reset the environment to the default state.

  • seed – Sets the seed for this env's random number generator(s).

  • step – Takes a time step and executes the actions for both the Blue RL agent and the non-learning Red agent.

Attributes

action_space = None#
observation_space = None#
reset()[source]#

Reset the environment to the default state.

Todo:

May need to add customization of cuda setting.

Returns:

A new starting observation (numpy array).
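For example, assuming env is a GenericNetworkEnv instance:

    obs = env.reset()  # new starting observation as a numpy array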

step(action)[source]#

Takes a time step and executes the actions for both the Blue RL agent and the non-learning Red agent.

Parameters:

action – The action value generated from the Blue RL agent (int)

Returns:

A four-tuple containing the next observation as a numpy array, the reward for that timestep, a boolean indicating whether the episode is complete, and additional notes containing timestep information from the environment.
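A sketch of an episode loop built on this four-tuple, sampling a random action as a stand-in for a trained Blue agent (this assumes the action space has been populated during construction):

    obs = env.reset()
    done = False
    while not done:
        action = env.action_space.sample()  # stand-in for the Blue RL agent's chosen action (int)
        obs, reward, done, notes = env.step(action)
    # `notes` holds the per-timestep data described in the class-level note above.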

render(mode='human', show_only_blue_view=False, show_node_names=False)[source]#

Render the environment using Matplotlib to create an animation.

Parameters:
  • mode – the mode of the rendering

  • show_only_blue_view – If True, shows only what the blue agent can see

  • show_node_names – Show the names of the nodes
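For example, to render only the blue agent's view with node names displayed:

    env.render(show_only_blue_view=True, show_node_names=True)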

calculate_observation_space_size(with_feather)[source]#

Calculate the observation space size.

This is done using the current active observation space configuration and the number of nodes within the environment.

Parameters:

with_feather – Whether to include the size of the Feather Wrapper output

Returns:

The observation space size
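For example, to check the flat observation-vector length without the Feather Wrapper output:

    obs_size = env.calculate_observation_space_size(with_feather=False)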

close()#

Override close in your subclass to perform any necessary cleanup.

Environments will automatically close() themselves when garbage collected or when the program exits.
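For example, to release resources explicitly rather than waiting for garbage collection:

    env.close()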

metadata = {'render.modes': []}#
reward_range = (-inf, inf)#
seed(seed=None)#

Sets the seed for this env’s random number generator(s).

Note

Some environments use multiple pseudorandom number generators. We want to capture all such seeds used in order to ensure that there aren’t accidental correlations between multiple generators.

Returns:

Returns the list of seeds used in this env’s random number generators. The first value in the list should be the “main” seed, or the value which a reproducer should pass to ‘seed’. Often, the main seed equals the provided ‘seed’, but this won’t be true if seed=None, for example.

Return type:

list<bigint>
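For example:

    env.seed(42)  # seed the environment's random number generator(s)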

spec = None#
property unwrapped#

Completely unwrap this env.

Returns:

The base non-wrapped gym.Env instance

Return type:

gym.Env
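For an environment with no wrappers applied, this property simply returns the environment itself:

    assert env.unwrapped is env  # no wrappers, so the base env is returned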