yawning_titan.envs.generic.wrappers.graph_embedding_observations.FeatherGraphEmbedObservation#
- class yawning_titan.envs.generic.wrappers.graph_embedding_observations.FeatherGraphEmbedObservation(env, max_num_nodes=100)[source]#
Bases: ObservationWrapper
Gym Observation Space Wrapper that embeds the underlying environment graph using the Feather-G algorithm.
This wrapper uses the Feather-G Whole Graph embedding algorithm to embed the underlying environment graph and then re-creates the observation space to include the embedding and all other observation space settings from the configuration file.
Initialise a Feather-G observation space wrapper.
- Parameters:
env – the OpenAI Gym environment to be wrapped
max_num_nodes – the maximum number of nodes the observation space must support
Note
max_num_nodes defines the maximum number of nodes the agent should support within its observation space. This makes it possible to train agents that can work across a number of YAWNING TITAN environments with variable node counts.
For example, if set to 100 (like the default), the agent could be trained in an environment with 10 nodes, 50 nodes or 100 nodes.
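A minimal usage sketch is shown below. The construction of the underlying YAWNING TITAN environment is assumed and elided; only the wrapper call itself reflects the documented signature, and the variable names are illustrative.

    from yawning_titan.envs.generic.wrappers.graph_embedding_observations import (
        FeatherGraphEmbedObservation,
    )

    # `env` is assumed to be an already-constructed YAWNING TITAN gym environment.
    env = ...
    wrapped_env = FeatherGraphEmbedObservation(env, max_num_nodes=100)

    # Observations from the wrapped environment now contain the Feather-G
    # graph embedding alongside the padded node features.
    obs = wrapped_env.reset()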
Methods

close()
    Override close in your subclass to perform any necessary cleanup.
make_embedding()
    Create a Feather-G embedding from the input NetworkX graph.
observation(observation)
    Observation transformation function.
render(mode='human', **kwargs)
    Renders the environment.
reset(**kwargs)
    Resets the environment to an initial state and returns an initial observation.
seed(seed=None)
    Sets the seed for this env's random number generator(s).
step(action)
    Run one timestep of the environment's dynamics.
Attributes

metadata
    The environment's metadata (a dict).
reward_range
    The environment's reward range (a built-in immutable sequence, i.e. a tuple).
unwrapped
    Completely unwrap this env.
- property observation_space#
- observation(observation)[source]#
Observation Transformation Function.
- Generates a NetworkX graph object from the current adjacency matrix.
- Collects the current vulnerability scores and node statuses.
- Pads the returned arrays to ensure their length is 100 (currently arbitrarily set).
- Embeds the NetworkX graph using the Feather Graph algorithm from the Karate Club library.
- Concatenates the graph embedding, the padded vulnerability scores and the padded node statuses.
- Returns the new observation (an illustrative sketch of these steps follows this entry).
- Parameters:
observation – The base, unwrapped observation generated by the environment
- Returns:
A newly formatted environment observation
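The following is an illustrative sketch of the steps listed above, not the wrapper's actual implementation; the helper name transform_observation, the shape of the input arrays and the exact padding behaviour are assumptions.

    import networkx as nx
    import numpy as np
    from karateclub import FeatherGraph

    def transform_observation(adjacency_matrix, vulnerabilities, node_statuses, max_num_nodes=100):
        """Illustrative re-creation of the documented steps (not the wrapper's code)."""
        # Generate a NetworkX graph from the current adjacency matrix.
        graph = nx.from_numpy_array(np.asarray(adjacency_matrix))

        # Pad the vulnerability scores and node statuses to a fixed length.
        def pad(values):
            values = np.asarray(values, dtype=np.float32)
            return np.pad(values, (0, max_num_nodes - len(values)))

        padded_vulnerabilities = pad(vulnerabilities)
        padded_statuses = pad(node_statuses)

        # Embed the whole graph with Feather-G via the karateclub package.
        model = FeatherGraph()
        model.fit([graph])
        embedding = model.get_embedding()[0]

        # Concatenate the embedding and the padded node features into the new observation.
        return np.concatenate([embedding, padded_vulnerabilities, padded_statuses]).astype(np.float32)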
- make_embedding()[source]#
Create a Feather-G embedding from the input NetworkX graph.
- Returns:
A numpy array containing the Feather embedding
- property action_space#
- classmethod class_name()#
- close()#
Override close in your subclass to perform any necessary cleanup.
Environments will automatically close() themselves when garbage collected or when the program exits.
- compute_reward(achieved_goal, desired_goal, info)#
- property metadata#
dict() -> new empty dictionary
dict(mapping) -> new dictionary initialized from a mapping object's (key, value) pairs
dict(iterable) -> new dictionary initialized as if via: d = {}; for k, v in iterable: d[k] = v
dict(**kwargs) -> new dictionary initialized with the name=value pairs in the keyword argument list. For example: dict(one=1, two=2)
- render(mode='human', **kwargs)#
Renders the environment.
The set of supported modes varies per environment. (And some environments do not support rendering at all.) By convention, if mode is:
human: render to the current display or terminal and return nothing. Usually for human consumption.
rgb_array: Return a numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.
ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).
Note
Make sure that your class's metadata 'render.modes' key includes the list of supported modes. It's recommended to call super() in implementations to use the functionality of this method.
- Parameters:
mode (str) – the mode to render with
Example:

    class MyEnv(Env):
        metadata = {'render.modes': ['human', 'rgb_array']}

        def render(self, mode='human'):
            if mode == 'rgb_array':
                return np.array(...)  # return RGB frame suitable for video
            elif mode == 'human':
                ...  # pop up a window and render
            else:
                super(MyEnv, self).render(mode=mode)  # just raise an exception
- reset(**kwargs)#
Resets the environment to an initial state and returns an initial observation.
Note that this function should not reset the environment’s random number generator(s); random variables in the environment’s state should be sampled independently between multiple calls to reset(). In other words, each call of reset() should yield an environment suitable for a new episode, independent of previous episodes.
- Returns:
the initial observation.
- Return type:
observation (object)
- property reward_range#
Built-in immutable sequence.
If no argument is given, the constructor returns an empty tuple. If iterable is specified the tuple is initialized from iterable’s items.
If the argument is a tuple, the return value is the same object.
- seed(seed=None)#
Sets the seed for this env’s random number generator(s).
Note
Some environments use multiple pseudorandom number generators. We want to capture all such seeds used in order to ensure that there aren’t accidental correlations between multiple generators.
- Returns:
Returns the list of seeds used in this env's random number generators. The first value in the list should be the "main" seed, or the value which a reproducer should pass to 'seed'. Often, the main seed equals the provided 'seed', but this won't be true if seed=None, for example.
- Return type:
list<bigint>
- property spec#
- step(action)#
Run one timestep of the environment’s dynamics. When end of episode is reached, you are responsible for calling reset() to reset this environment’s state.
Accepts an action and returns a tuple (observation, reward, done, info).
- Parameters:
action (object) – an action provided by the agent
- Returns:
observation (object): agent's observation of the current environment
reward (float): amount of reward returned after previous action
done (bool): whether the episode has ended, in which case further step() calls will return undefined results
info (dict): contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
- Return type:
tuple (observation, reward, done, info)
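For context, a standard OpenAI Gym interaction loop is sketched below; this is generic Gym usage rather than behaviour specific to this wrapper, and wrapped_env refers to the earlier usage sketch.

    # Standard OpenAI Gym interaction loop (4-tuple step API, as documented above).
    obs = wrapped_env.reset()
    done = False
    while not done:
        action = wrapped_env.action_space.sample()  # placeholder random policy
        obs, reward, done, info = wrapped_env.step(action)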
- property unwrapped#
Completely unwrap this env.
- Returns:
The base non-wrapped gym.Env instance
- Return type:
gym.Env