OpenAI Gym: Walk through all possible actions in an action space

I want to build a brute-force approach that tests all actions in a Gym action space before selecting the best one. Is there a simple, straightforward way to get all possible actions?

Specifically, my action space is a MultiDiscrete space.

I know I can sample a random action with action_space.sample() and also check whether an action is contained in the action space, but I want to generate a list of all possible actions within that space.

Is there anything more elegant (and performant) than just a bunch of for loops? The problem with for loops is that I want it to work with any size of action space, so I cannot hard-code 4 for loops to walk through the different actions.

Answer

In a Gym environment with a discrete action space, actions are represented by integers. This means that once you know the total number of possible actions, you can create an array of all of them.

How to get the total number of possible actions depends on the type of action space. Yours is a MultiDiscrete action space, so you can read its nvec attribute (action_space.nvec), as mentioned by @Valentin Macé.
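As a minimal sketch of what nvec contains: the list below is a hypothetical stand-in for a three-dimensional space, so the example runs without Gym installed. The product of its entries gives the number of joint actions a brute-force search would have to visit.

```python
import math

# Hypothetical stand-in for env.action_space.nvec. With Gym installed, the
# same values would come from: gym.spaces.MultiDiscrete([3, 2, 2]).nvec
nvec = [3, 2, 2]

# One entry per dimension: by default, dimension i accepts 0 .. nvec[i] - 1.
num_dimensions = len(nvec)

# Total number of joint actions across all dimensions.
num_joint_actions = math.prod(int(n) for n in nvec)

print(num_dimensions, num_joint_actions)  # 3 12
```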

Note that nvec stands for "n vector": it is a vector with one entry per dimension, giving the number of possible actions in that dimension. Also note that the attribute is a numpy array.

Now that we have the array, we can convert it into a list of lists of actions. This follows the layout that action_space.sample() uses: it returns a numpy array holding one randomly chosen action from each dimension of the MultiDiscrete action space.
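To illustrate the shape of a sampled action, here is a pure-Python imitation of that behaviour. sample_like is a hypothetical helper, not part of Gym, and it returns a plain list rather than a numpy array.

```python
import random

nvec = [3, 2, 2]  # hypothetical stand-in for env.action_space.nvec


def sample_like(nvec, rng=random):
    """Imitate MultiDiscrete sampling: one random integer per dimension."""
    return [rng.randrange(int(n)) for n in nvec]


action = sample_like(nvec)
print(action)  # one value per dimension; the values vary between runs
```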

To convert the array into a list of lists holding the possible actions in each dimension, we can use a list comprehension.
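The original snippet is not preserved here, so the following is a sketch of one such comprehension, again using a hypothetical list in place of action_space.nvec.

```python
nvec = [3, 2, 2]  # hypothetical stand-in for env.action_space.nvec

# One inner list per dimension, holding every valid integer for that dimension.
possible_actions = [list(range(int(n))) for n in nvec]

print(possible_actions)  # [[0, 1, 2], [0, 1], [0, 1]]
```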

Note that this scales to any number of dimensions and is also efficient.

Now you can loop over the possible actions in each dimension using only two nested loops.
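A minimal sketch of that pair of loops, assuming possible_actions is built as a list of per-dimension lists (the visited list is only there to make the traversal order easy to inspect):

```python
nvec = [3, 2, 2]  # hypothetical stand-in for env.action_space.nvec
possible_actions = [list(range(int(n))) for n in nvec]

visited = []
for dimension, actions in enumerate(possible_actions):  # outer loop: dimensions
    for action in actions:                              # inner loop: values
        visited.append((dimension, action))
        print(f"dimension {dimension}: action {action}")
```

This walks through each dimension on its own, visiting sum(nvec) values in total. It does not yet build joint action vectors that combine one value from every dimension.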

For more information, I would also suggest visiting this thread on GitHub, where a somewhat similar issue is discussed, in case you find it useful.

EDIT: As per your comment, @CGFoX, I assume you want all possible combination vectors of the actions generated as a list, for any number of dimensions.

This can be achieved using recursion and only two loops, and it extends to as many dimensions as provided.
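The answer's original function is not preserved here. The sketch below is one possible recursive formulation with two loops; the name all_combinations is hypothetical.

```python
def all_combinations(possible_actions):
    """Return every joint action as a flat list of lists."""
    if not possible_actions:
        return [[]]  # base case: a single empty combination
    first, rest = possible_actions[0], possible_actions[1:]
    tails = all_combinations(rest)  # combinations of the remaining dimensions
    combinations = []
    for action in first:      # loop 1: values of the first dimension
        for tail in tails:    # loop 2: every combination of the rest
            combinations.append([action] + tail)
    return combinations


print(all_combinations([[0, 1], [0, 1]]))  # [[0, 0], [0, 1], [1, 0], [1, 1]]
```

The recursion depth equals the number of dimensions, so the same function handles any number of them without extra hard-coded loops.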

Once this function is defined, it can be applied to the previously generated per-dimension lists of possible actions to get all possible combinations.
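As an alternative that is not taken from the answer itself, itertools.product from the standard library produces the same combinations without a hand-written recursive function. The sketch below also shows the brute-force selection the question asks about; score is a hypothetical placeholder for however an action is actually evaluated.

```python
import itertools

nvec = [3, 2, 2]  # hypothetical stand-in for env.action_space.nvec
possible_actions = [list(range(int(n))) for n in nvec]


def score(action):
    """Hypothetical placeholder objective; replace with a real evaluation."""
    return sum(action)


# itertools.product yields one tuple per joint action, in lexicographic order.
all_actions = [list(a) for a in itertools.product(*possible_actions)]

best_action = max(all_actions, key=score)
print(len(all_actions), best_action)  # 12 [2, 1, 1]
```

The number of combinations is the product of the entries of nvec, so exhaustive search of this kind is only practical for small action spaces.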

EDIT-2: I have fixed some code that previously returned a nested list. The returned list now contains the pairs directly and is no longer nested within another list.

EDIT-3: Fixed my spelling mistakes.

User contributions licensed under: CC BY-SA