yawning_titan.integrations.dcbo.dcbo_agent.DCBOAgent

class yawning_titan.integrations.dcbo.dcbo_agent.DCBOAgent(action_space, initial_probabilities)

Bases: object

An agent class that provides the supporting methods for a DCBO-based (Dynamic Causal Bayesian Optimisation) learner.

Methods

act – Act within the environment.

predict – Predict what action should be used next.

reset – Reset the agent to its initial configuration by resetting isolated_nodes.

update_probabilities – Update the DCBO action probabilities.

update_probabilities(new_probabilities)

Update the DCBO action probabilities.

Parameters:

new_probabilities – The action probabilities output by a DCBO optimisation step.
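The update step can be pictured with a toy stand-in. This is a minimal sketch, not the DCBOAgent implementation: the class name `ToyProbabilityAgent`, the `probabilities` attribute, and the sanity checks are all assumptions made for illustration, and the real method may store or validate its input differently.

```python
import math


class ToyProbabilityAgent:
    """Hypothetical stand-in for illustration; not the DCBOAgent implementation."""

    def __init__(self, action_space, initial_probabilities):
        self.action_space = action_space
        self.initial_probabilities = list(initial_probabilities)
        self.probabilities = list(initial_probabilities)

    def update_probabilities(self, new_probabilities):
        """Replace the stored action probabilities with a new set."""
        new_probabilities = list(new_probabilities)
        # Assumed sanity checks; the real method may not perform them.
        if len(new_probabilities) != len(self.probabilities):
            raise ValueError("expected one probability per action")
        if any(p < 0 for p in new_probabilities):
            raise ValueError("probabilities must be non-negative")
        if not math.isclose(sum(new_probabilities), 1.0, abs_tol=1e-9):
            raise ValueError("probabilities must sum to 1")
        self.probabilities = new_probabilities


# Made-up values standing in for the output of an optimisation step.
agent = ToyProbabilityAgent(action_space=3, initial_probabilities=[1 / 3] * 3)
agent.update_probabilities([0.1, 0.6, 0.3])
print(agent.probabilities)
```

Rejecting malformed input at update time is a design choice of this sketch: it keeps any later sampling step from silently working with weights that are not a probability distribution.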

act(observation, reward, done)

Act within the environment.

This method is not fully implemented. It is named act to support taking random actions within OpenAI Gym environments.

reset()

Reset the agent to its initial configuration by resetting isolated_nodes.

predict(observation, reward, done, env)

Predict what action should be used next.

Like act, this method shares its name with the equivalent method of an RL-based learner but operates differently under the hood.

Because DCBO calculates the action probabilities in time slices, the predict step samples an action according to the action probabilities returned by the most recent DCBO step.
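Sampling an action in proportion to a set of probabilities can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `sample_action` is a hypothetical helper, actions are assumed to be integer indices, and the probability values are made up rather than taken from a real DCBO step.

```python
import random
from collections import Counter


def sample_action(action_probabilities, rng=random):
    """Sample an action index in proportion to its probability."""
    actions = list(range(len(action_probabilities)))
    # random.choices draws with replacement using the given weights.
    return rng.choices(actions, weights=action_probabilities, k=1)[0]


# Made-up probabilities standing in for the most recent DCBO output.
probabilities = [0.1, 0.6, 0.3]
rng = random.Random(0)  # seeded so repeated runs draw the same samples

# Draw many samples to show that frequencies track the probabilities.
counts = Counter(sample_action(probabilities, rng) for _ in range(10_000))
print(counts.most_common())
```

Over many draws, action 1 should be chosen most often and action 0 least often, mirroring the weights. A single predict call corresponds to one such draw.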