yawning_titan.envs.generic.generic_env.GenericNetworkEnv#
- class yawning_titan.envs.generic.generic_env.GenericNetworkEnv(red_agent, blue_agent, network_interface, print_metrics=False, show_metrics_every=1, collect_additional_per_ts_data=True, print_per_ts_data=False)[source]#
Bases:
Env
Class to create a generic YAWNING TITAN gym environment.
Initialise the generic network environment.
- Parameters:
red_agent – Object from the RedInterface class
blue_agent – Object from the BlueInterface class
network_interface – Object from the NetworkInterface class
print_metrics – Whether or not to print metrics (boolean)
show_metrics_every – Interval, in timesteps, at which summary metrics are shown (int)
collect_additional_per_ts_data – Whether or not to collect additional per timestep data (boolean)
print_per_ts_data – Whether or not to print collected per timestep data (boolean)
Note: The notes variable returned at the end of each timestep contains the per timestep data. By default it contains a base level of info required for some of the reward functions. When collect_additional_per_ts_data is toggled on, a lot more data is collected.
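As a rough construction sketch (the import paths for RedInterface and BlueInterface, and the assumption that they are built from an existing NetworkInterface, are not confirmed by this page and should be checked against the rest of the documentation):

    # Sketch only: interface import paths and constructor arguments are assumptions.
    from yawning_titan.envs.generic.core.red_interface import RedInterface
    from yawning_titan.envs.generic.core.blue_interface import BlueInterface
    from yawning_titan.envs.generic.generic_env import GenericNetworkEnv

    def build_env(network_interface):
        """Build a GenericNetworkEnv from an already-constructed NetworkInterface."""
        red = RedInterface(network_interface)    # non-learning red agent
        blue = BlueInterface(network_interface)  # blue RL agent's action interface
        return GenericNetworkEnv(
            red_agent=red,
            blue_agent=blue,
            network_interface=network_interface,
            print_metrics=True,                   # print summary metrics...
            show_metrics_every=10,                # ...every 10 timesteps
            collect_additional_per_ts_data=True,  # richer notes data each timestep
        )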
Methods
calculate_observation_space_size(with_feather) – Calculate the observation space size.
close() – Override close in your subclass to perform any necessary cleanup.
render(mode, show_only_blue_view, show_node_names) – Render the environment using Matplotlib to create an animation.
reset() – Reset the environment to the default state.
seed(seed) – Sets the seed for this env's random number generator(s).
step(action) – Takes a time step and executes the actions for both the Blue RL agent and the non-learning Red agent.
Attributes
unwrapped – Completely unwrap this env.
- action_space = None#
- observation_space = None#
- reset()[source]#
Reset the environment to the default state.
- Todo:
May need to add customization of cuda setting.
- Returns:
A new starting observation (numpy array).
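For instance (assuming env is a constructed GenericNetworkEnv):

    obs = env.reset()   # a new starting observation as a numpy array
    print(obs.shape)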
- step(action)[source]#
Takes a time step and executes the actions for both the Blue RL agent and the non-learning Red agent.
- Parameters:
action – The action value generated from the Blue RL agent (int)
- Returns:
A four-tuple containing the next observation as a numpy array, the reward for that timestep, a boolean indicating whether the episode is complete, and additional notes containing timestep information from the environment.
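As a usage sketch of the standard Gym loop (assuming env is a constructed GenericNetworkEnv; the random blue policy is purely illustrative):

    def random_rollout(env, max_steps=100):
        """Roll the environment forward with randomly sampled blue actions."""
        obs = env.reset()                       # new starting observation
        total_reward = 0.0
        notes = {}
        for _ in range(max_steps):
            action = env.action_space.sample()  # random blue action (int)
            obs, reward, done, notes = env.step(action)
            total_reward += reward
            if done:                            # episode finished
                break
        return total_reward, notes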
- render(mode='human', show_only_blue_view=False, show_node_names=False)[source]#
Render the environment using Matplotlib to create an animation.
- Parameters:
mode – the rendering mode
show_only_blue_view – If true, shows only what the blue agent can see
show_node_names – Whether to show the names of the nodes
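For example (a hypothetical call, assuming env is a constructed GenericNetworkEnv that has taken at least one step):

    env.render(show_only_blue_view=True, show_node_names=True)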
- calculate_observation_space_size(with_feather)[source]#
Calculate the observation space size.
This is done using the current active observation space configuration and the number of nodes within the environment.
- Parameters:
with_feather – Whether to include the size of the Feather Wrapper output
- Returns:
The observation space size
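A quick illustration (assuming env is a constructed GenericNetworkEnv; whether the Feather Wrapper is relevant depends on the active observation space configuration):

    size_plain = env.calculate_observation_space_size(with_feather=False)
    size_feather = env.calculate_observation_space_size(with_feather=True)
    print(size_plain, size_feather)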
- close()#
Override close in your subclass to perform any necessary cleanup.
Environments will automatically close() themselves when garbage collected or when the program exits.
- metadata = {'render.modes': []}#
- reward_range = (-inf, inf)#
- seed(seed=None)#
Sets the seed for this env’s random number generator(s).
Note
Some environments use multiple pseudorandom number generators. We want to capture all such seeds used in order to ensure that there aren’t accidental correlations between multiple generators.
- Returns:
Returns the list of seeds used in this env's random number generators. The first value in the list should be the “main” seed, or the value which a reproducer should pass to ‘seed’. Often, the main seed equals the provided ‘seed’, but this won’t be true if seed=None, for example.
- Return type:
list<bigint>
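A reproducibility sketch (assuming env is a constructed GenericNetworkEnv; an env that does not override seed() may return an empty list or None):

    seeds = env.seed(42)    # list of seeds actually used, per the contract above
    print("seeds used:", seeds)
    obs = env.reset()       # start an episode under the new seed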
- spec = None#
- property unwrapped#
Completely unwrap this env.
- Returns:
The base non-wrapped gym.Env instance
- Return type:
gym.Env
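A short illustration of the wrapper contract (gym.Wrapper stands in for any Gym wrapper; assumes env is a constructed GenericNetworkEnv):

    import gym

    wrapped = gym.Wrapper(env)        # wrap the env in a plain (no-op) Gym wrapper
    assert wrapped.unwrapped is env   # .unwrapped returns the base GenericNetworkEnv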