cherry.nn

cherry.nn.policy.Policy

[Source]

Description

Abstract Module to represent policies.

Subclassing this module helps retain a unified API across codebases and automatically provides some helper functions (you only need to ensure that forward returns a Distribution instance).

Example
class RandomPolicy(Policy):

    def __init__(self, num_actions=5):
        super(RandomPolicy, self).__init__()
        self.num_actions = num_actions

    def forward(self, state):  # must return a density
        probs = torch.ones(self.num_actions) / self.num_actions
        density = cherry.distributions.Categorical(probs=probs)
        return density

# We can now use some predefined functions:
random_policy = RandomPolicy()
actions = random_policy.act(states, deterministic=True)
log_probs = random_policy.log_prob(states, actions)

__init__(self) -> None inherited special

act(self, state, deterministic = False)

Description

Given a state, samples an action from the policy.

If deterministic=True, the action is the mode of the policy distribution.

Arguments
  • state (Tensor) - State to take an action in.
  • deterministic (bool, optional, default=False) - Whether the action is sampled from the policy (False) or is the mode of the policy (True).

forward(self, state)

Description

Should return a Distribution instance corresponding to the policy density for state.

Arguments
  • state (Tensor) - State where the policy should be computed.
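
A policy for continuous control follows the same pattern. Below is a minimal sketch of a diagonal Gaussian policy; the MLP sizes, the learnable log-std parameter, and the use of torch.distributions.Normal are illustrative assumptions rather than part of this API.

class GaussianPolicy(Policy):

    def __init__(self, state_size, action_size):
        super(GaussianPolicy, self).__init__()
        self.mean = MLP(state_size, action_size, [64, 64])  # cherry.nn.MLP, documented below
        self.log_std = torch.nn.Parameter(torch.zeros(action_size))

    def forward(self, state):  # returns a Distribution instance
        return torch.distributions.Normal(
            loc=self.mean(state),
            scale=self.log_std.exp(),
        )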

log_prob(self, state, action)

Description

Computes the log probability of action given state, according to the policy.

Arguments
  • state (Tensor) - A tensor of states.
  • action (Tensor) - The actions of which to compute the log probability.
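
For instance, log_prob pairs naturally with a policy-gradient objective. A minimal sketch, assuming policy is a parameterized Policy subclass and that states, actions, and returns are pre-collected batch tensors (none of these are defined by this module):

# Hypothetical REINFORCE-style objective.
log_probs = policy.log_prob(states, actions)
loss = -(log_probs * returns).mean()
loss.backward()  # backpropagates through the policy's parameters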

cherry.nn.action_value.ActionValue

[Source]

Description

Abstract Module to represent Q-value functions.

Example
class QValue(ActionValue):

    def __init__(self, state_size, action_size):
        super(QValue, self).__init__()
        self.mlp = MLP(state_size+action_size, 1, [1024, 1024])

    def forward(self, state, action):
        return self.mlp(torch.cat([state, action], dim=1))

qf = QValue(128, 5)
qvalue = qf(state, action)

forward(self, state, action = None)

Description

Returns the scalar value of taking action in state.

If action is not given, should return the value for all actions (useful for DQN-like architectures).

Arguments
  • state (Tensor) - State to be evaluated.
  • action (Tensor, optional, default=None) - Action to be evaluated.
Returns
  • value (Tensor) - Value of taking action in state. Shape: (batch_size, 1)
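
As noted in the description above, DQN-like architectures can return one value per action when action is None. A minimal sketch, where the layer sizes and the gather-based indexing are illustrative assumptions:

class DiscreteQValue(ActionValue):

    def __init__(self, state_size, num_actions):
        super(DiscreteQValue, self).__init__()
        self.mlp = MLP(state_size, num_actions, [256, 256])

    def forward(self, state, action=None):
        values = self.mlp(state)  # shape: (batch_size, num_actions)
        if action is None:
            return values  # one value per action
        return values.gather(1, action.long())  # shape: (batch_size, 1); assumes action is (batch_size, 1)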

cherry.nn.action_value.Twin

[Source]

Description

Helper class to implement Twin action-value functions as described in [1].

References
  1. Fujimoto et al., "Addressing Function Approximation Error in Actor-Critic Methods". ICML 2018.
Example
qvalue = Twin(QValue(), QValue())
values = qvalue(states, actions)
values1, values2 = qvalue.twin(states, actions)

__init__(self, *action_values) special

Arguments
  • qvalue1, qvalue2, ... (ActionValue) - Action value functions.

forward(self, state, action)

Description

Returns the minimum value computed by the individual value functions wrapped by this class.

Arguments
  • state (Tensor) - The state to evaluate.
  • action (Tensor) - The action to evaluate.

twin(self, state, action)

Description

Returns the values of each individual value function wrapped by this class.

Arguments
  • state (Tensor) - State to be evaluated.
  • action (Tensor) - Action to be evaluated.
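
In practice, forward (the minimum over twins) is typically used for the bootstrapped target, while twin provides each critic's prediction for the regression loss. A brief sketch; target_qvalue (assumed to also be a Twin), next_states, next_actions, rewards, dones, and gamma are assumed to exist and are not part of this class:

next_values = target_qvalue(next_states, next_actions)  # minimum over the two critics
targets = (rewards + gamma * (1.0 - dones) * next_values).detach()
values1, values2 = qvalue.twin(states, actions)  # each critic's prediction
critic_loss = (values1 - targets).pow(2).mean() + (values2 - targets).pow(2).mean()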

cherry.nn.robotics_layers.RoboticsLinear

[Source]

Description

Akin to nn.Linear, but with proper initialization for robotic control.

Credit

Adapted from Ilya Kostrikov's implementation.

Example
linear = ch.nn.RoboticsLinear(23, 5, bias=True)
action_mean = linear(state)

__init__(self, *args, **kwargs) special

Arguments
  • gain (float, optional) - Gain factor passed to robotics_init_ initialization.
  • This class extends nn.Linear and supports all of its arguments.
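
A short sketch of RoboticsLinear as a drop-in replacement for nn.Linear inside a small control policy, assuming cherry is imported as ch (as in the example above); the layer sizes and gain value are illustrative:

actor = torch.nn.Sequential(
    ch.nn.RoboticsLinear(23, 64),
    torch.nn.Tanh(),
    ch.nn.RoboticsLinear(64, 5, gain=1.0),  # custom gain for the output layer
)
action_mean = actor(state)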

cherry.nn.epsilon_greedy.EpsilonGreedy

[Source]

Description

With probability epsilon, samples actions uniformly at random; with probability 1 - epsilon, returns the actions that maximize the input values.

References
  1. Sutton, Richard, and Andrew Barto. 2018. Reinforcement Learning, Second Edition. The MIT Press.
Example
egreedy = EpsilonGreedy()
q_values = q_value(state)  # NxM tensor
actions = egreedy(q_values)  # Nx1 tensor of longs

__init__(self, epsilon = 0.05, learnable = False) special

Arguments
  • epsilon (float, optional, default=0.05) - The epsilon factor.
  • learnable (bool, optional, default=False) - Whether the epsilon factor is a learnable parameter or not.
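
EpsilonGreedy composes naturally with an action-value module that returns one value per action. A brief sketch; DiscreteQValue refers to the illustrative ActionValue sketch earlier on this page, and the sizes are assumptions:

qf = DiscreteQValue(state_size=128, num_actions=6)
egreedy = EpsilonGreedy(epsilon=0.1)
states = torch.randn(32, 128)
actions = egreedy(qf(states))  # (32, 1) tensor of long action indices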

cherry.nn.mlp.MLP

[Source]

Description

Implements a simple multi-layer perceptron.

Example
net = MLP(128, 1, [1024, 1024], activation=torch.nn.GELU)

__init__(self, input_size, output_size, hidden_sizes, activation = None, bias = True) special

Arguments
  • input_size (int) - Input size of the MLP.
  • output_size (int) - Number of output units.
  • hidden_sizes (list of int) - Each int is the number of hidden units of a layer.
  • activation (callable, optional, default=None) - Activation function to use for the MLP.
  • bias (bool, optional, default=True) - Whether the MLP uses bias terms.
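
Continuing the example above, a quick sketch of a forward pass through the resulting network (the batch size is illustrative):

net = MLP(128, 1, [1024, 1024], activation=torch.nn.GELU)
states = torch.randn(32, 128)  # hypothetical batch of 32 inputs
values = net(states)           # shape: (32, 1)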

cherry.nn.misc.Lambda

[Source]

Description

Turns any function into a PyTorch Module.

Example
double = Lambda(lambda x: 2 * x)
out = double(torch.tensor([23]))  # out == tensor([46])

__init__(self, fn) special

Arguments
  • fn (callable) - Function to turn into a Module.
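
Lambda is convenient for dropping simple transformations into torch.nn.Sequential pipelines. A brief sketch; the flattening use case and sizes are illustrative:

model = torch.nn.Sequential(
    Lambda(lambda x: x.view(x.size(0), -1)),  # flatten (N, C, H, W) -> (N, C*H*W)
    torch.nn.Linear(3 * 32 * 32, 10),
)
out = model(torch.randn(8, 3, 32, 32))  # shape: (8, 10)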