GOAP - A form of AI planner

Hi everyone!

The topic of GOAP (Goal Oriented Action Planning) has been discussed on BlenderArtists.org at least once before, but I’ve not seen any resources associated with the concept.

GOAP is a specific example of an AI planner - one which determines a series of actions that an AI might undertake in order to achieve an end goal. In essence, GOAP is just a means of expressing actions as a pair of preconditions (which must hold for the action to occur) and consequences (effects) that the action will introduce to the world state. The role of the planner is simply to find a self-consistent combination of actions that fulfils the goal.
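For example, a hypothetical "chop wood" action (the symbol names below are purely illustrative) is fully described by two small dictionaries of world-state symbols:

chop_wood_preconditions = {"has_axe": True}       # must hold for the action to occur
chop_wood_effects = {"has_firewood": True}        # introduced into the world state afterwards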

Whilst FSMs and behavior trees are often used to implement forms of AI, they are limited by the complexity of the web of transitions required to ensure a believable AI. Planners can be used to alleviate this complexity.

A planner can delegate the implementation of an action to an FSM, BTree, HFSM or even another planner, though the GOAP planner is suited to flat action lists; nesting a planner within a planner would be a separate step, unlike some hierarchical planners (see SOAP). The important point is simply that the action’s effects and preconditions can be described sufficiently for the planner to make use of the action.

Linked below is an implementation of a planner. It uses A* to find the least costly sequence of actions, searching from the goal back to the first action in the sequence. Each ActionNode wraps an Action class instance, which has a cost and a precedence attribute: the cost is the rough resource cost of performing the action, and the precedence is used to break ties when two nodes are equally costly (where a node's cost for ordering purposes is the sum of the action costs so far plus the heuristic cost to the goal).

To run the planner with an FSM / BTree, one may first determine the plan, then pop actions off and run their corresponding implementations (e.g. pop the FindCover action and run the corresponding FSM state).
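As a minimal sketch of that loop (the state_machine object here is hypothetical - any FSM or behavior tree wrapper exposing an enter() method and an is_finished flag would do):

from collections import deque


def run_plan(planner, goal_state, state_machine):
    # Build the plan up front, then consume it one action at a time.
    plan = deque(planner.build(goal_state))

    while plan:
        action = plan.popleft()
        # e.g. pop the FindCover action and enter the corresponding FSM state
        state_machine.enter(type(action).__name__)

        # Yield once per logic tick until the FSM reports the action as done.
        while not state_machine.is_finished:
            yield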

A number of useful links can be found here

Thanks to Jeff Orkin for answering a few questions I had when revisiting this demo.

Update 29/07/2016: Fixed some bugs, improved planner API and performance

pathfinder.py


from collections import deque, namedtuple
from heapq import heappop, heappush




class PathNotFoundException(Exception):
    pass




class AStarAlgorithm:


    def __init__(self, get_neighbours, get_h_score, get_g_score, check_completed):
        self.get_h_score = get_h_score
        self.get_g_score = get_g_score
        self.get_neighbours = get_neighbours
        self.is_finished = check_completed


    @staticmethod
    def reconstruct_path(node, path):
        result = deque()
        while node:
            result.appendleft(node)
            node = path.get(node)


        return result


    def find_path(self, start):
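        """Search from `start` using A*; returns a deque of nodes ordered from the start node to the node that satisfied the completion test."""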
        open_set = {start}
        closed_set = set()


        f_scored = [(0, start)]
        g_scored = {start: 0}


        get_h_score = self.get_h_score
        get_g_score = self.get_g_score
        get_neighbours = self.get_neighbours
        is_complete = self.is_finished


        path = {}


        while open_set:
            current = heappop(f_scored)[1]
            if is_complete(current, path):
                return self.reconstruct_path(current, path)


            open_set.remove(current)
            closed_set.add(current)


            for neighbour in get_neighbours(current):
                if neighbour in closed_set:
                    continue


                tentative_g_score = g_scored[current] + get_g_score(current, neighbour)


                if neighbour not in open_set or tentative_g_score < g_scored[neighbour]:
                    path[neighbour] = current
                    g_scored[neighbour] = tentative_g_score
                    heappush(f_scored, (tentative_g_score + get_h_score(neighbour), neighbour))


                    if neighbour not in open_set:
                        open_set.add(neighbour)


        raise PathNotFoundException("Couldn't find path for given nodes")

goap.py

from pathfinder import AStarAlgorithm
from collections import defaultdict
from copy import deepcopy


def total_ordering(cls):
    'Class decorator that fills-in missing ordering methods'
    convert = {
        '__lt__': [('__gt__', lambda self, other: other < self),
                   ('__le__', lambda self, other: not other < self),
                   ('__ge__', lambda self, other: not self < other)],
        '__le__': [('__ge__', lambda self, other: other <= self),
                   ('__lt__', lambda self, other: not other <= self),
                   ('__gt__', lambda self, other: not self <= other)],
        '__gt__': [('__lt__', lambda self, other: other > self),
                   ('__ge__', lambda self, other: not other > self),
                   ('__le__', lambda self, other: not self > other)],
        '__ge__': [('__le__', lambda self, other: other >= self),
                   ('__gt__', lambda self, other: not other >= self),
                   ('__lt__', lambda self, other: not self >= other)]
    }
    if hasattr(object, '__lt__'):
        roots = [op for op in convert if getattr(cls, op) is not getattr(object, op)]
    else:
        roots = set(dir(cls)) & set(convert)
		
    assert roots, 'must define at least one ordering operation: < > <= >='
    root = max(roots)       # prefer __lt__ to __le__ to __gt__ to __ge__
    for opname, opfunc in convert[root]:
        if opname not in roots:
            opfunc.__name__ = opname
            opfunc.__doc__ = getattr(int, opname).__doc__
            setattr(cls, opname, opfunc)
			
    return cls




class Action:


    cost = 0
    precedence = 0


    effects = {}
    preconditions = {}


    def procedural_precondition(self):
        return True




class GoalAStarNode:


    def __init__(self, current_state, goal_state):
        self.current_state = current_state
        self.goal_state = goal_state


    @property
    def finished(self):
        return not self.unsatisfied_state


    @property
    def unsatisfied_state(self):
        current_state = self.current_state
        return [k for k, v in self.goal_state.items() if not current_state[k] == v]




@total_ordering
class ActionAStarNode:


    def __init__(self, action):
        self.action = action
        self.cost = action.cost
        self.current_state = action.effects.copy()
        self.goal_state = action.preconditions.copy()


    def __lt__(self, other):
        return self.action.precedence < other.action.precedence


    @property
    def unsatisfied_state(self):
        current_state = self.current_state
        return [k for k, v in self.goal_state.items() if not current_state[k] == v]


    @property
    def finished(self):
        return self.current_state == self.goal_state


    def update_state(self, agent_state, parent):
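        """Regressive update: inherit the parent's state, apply this action's effects and the agent's current values for its preconditions, then extend the goal state with those preconditions."""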
        action_preconditions = self.action.preconditions
        current_precondition_state = {k: agent_state.get(k) for k in action_preconditions}


        self.current_state = parent.current_state.copy()
        self.current_state.update(self.action.effects)
        self.current_state.update(current_precondition_state)


        self.goal_state = parent.goal_state.copy()
        self.goal_state.update(action_preconditions)


    def __repr__(self):
        return "<ActionNode {}>".format(self.action.__class__.__name__)
 


class Planner:


    def __init__(self, actions, agent_state):
        self.actions = actions
        self.agent_state = agent_state


        self.effects_to_actions = self.get_actions_by_effects(actions)


        self.finder = AStarAlgorithm(self.get_neighbours, self.get_heuristic_cost, self.get_g_cost, self.check_complete)


    def check_complete(self, node, path):
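        """The plan is complete when this node has no unsatisfied symbols and replaying the whole path over the agent's state satisfies the accumulated goal state."""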
        if not node.finished:
            return False
        
        ordered_path = self.finder.reconstruct_path(node, path)
        current_state = self.agent_state.copy()


        for path_node in ordered_path:
            current_state.update(path_node.current_state)

        for key, goal_value in node.goal_state.items():
            current_value = current_state[key]
            if current_value != goal_value:
                return False
             
        return True


    @staticmethod
    def get_actions_by_effects(actions):
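        """Build a mapping from each effect symbol to the actions that can provide it."""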
        mapping = defaultdict(list)


        for action in actions:
            for effect in action.effects:
                mapping[effect].append(action)


        return mapping


    def get_g_cost(self, node_a, node_b):
        node_b.update_state(self.agent_state, node_a)
        
        return node_b.cost


    @staticmethod
    def get_heuristic_cost(node):
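        """Heuristic: the number of goal symbols not yet satisfied by the node's current state."""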
        return len(node.unsatisfied_state)


    def get_neighbours(self, node):
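        """Yield an action node for every action whose effects address one of this node's unsatisfied symbols (and whose procedural precondition passes)."""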
        unsatisfied_effects = node.unsatisfied_state
        effects_to_actions = self.effects_to_actions
        
        for effect in unsatisfied_effects:
            try:
                actions = effects_to_actions[effect]


            except KeyError:
                continue
            
            yield from [ActionAStarNode(a) for a in actions if a.procedural_precondition()]
        
    def build(self, goal_state):
        agent_state = self.agent_state
        current_state = {k: agent_state.get(k) for k in goal_state}


        goal_node = GoalAStarNode(current_state, goal_state)
        print(current_state, goal_state)
        node_path = self.finder.find_path(goal_node)


        path = [node.action for node in list(node_path)[1:]]
        path.reverse()


        return path
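

# Illustrative example actions so that main() below has something to plan with.
# The symbol names ("has_axe", "has_firewood") are hypothetical and exist only
# for demonstration purposes.
class GetAxe(Action):
    cost = 2
    effects = {"has_axe": True}


class ChopWood(Action):
    cost = 1
    preconditions = {"has_axe": True}
    effects = {"has_firewood": True}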




def main():
    actions = [GetAxe(), ChopWood()]  # the illustrative actions defined above
    agent_state = {"has_parents": True, "has_axe": True}
    planner = Planner(actions, agent_state)
    path = planner.build({"has_firewood": True})
    print(path)




if __name__ == "__main__":
    main()

Example file
goap.blend (601 KB)

I like it, very powerful and fast.
The result is a very intelligent AI, who will go and get an axe and then cut down a tree to get wood, but only if they can see the axe, don’t already have an axe or firewood, and there are trees nearby.

Another idea could be that if the player is immune to lasers they will switch to rockets, and if they run out of rockets they will run away, unless they have grenades.

I wonder how many blender users have the ability to implement something like this though. I know I can understand most of it, but I think I’d have trouble integrating it in my own project because if something went wrong I wouldn’t know how to fix it.

BTW:
Do you use two monitors when developing on Blender? Whenever I download one of your example Blends I get an extra Blender window, which is kind of weird. I can press space on it to get the contextual menu, but it displays only rendering ghosts and strange artifacts.

I get that as well…

I am having the same problem with it.

I use multiple monitors, and sometimes keep an additional window open. I’ll check it tonight.

It’s not a problem as such, it just always confuses me. You can close the window and it doesn’t have any effect on Blender.

I think this is a great resource. Thanks for putting it together. :slight_smile:

Awesome script, you rock buddy! The logic is understood, but if it’s not too much effort, could you tag the logic parts and variables in the script with one or two words of explanation for better understanding? It would be easier to read for beginners then…

When I press the escape button, the Blender game engine does not quit with GOAP. Why does that happen?

Anyone can do this - just go to window - duplicate window in the header options.

Concerning the escape key, I have written a proprietary addon which modifies the exit key to f12

Or you can go to the game properties tab (usually render properties tab) and find an option to change the exit key. I usually use escape, but some people prefer F4 or F12

No, I mean that I need to ensure that the user doesn’t modify that setting, to be able to successfully intercept the ESC key - it forces it to F12, though I can just as easily do this in game now.

@Smoking Mirror,
“Anyone can do this - just go to window - duplicate window in the header options.” refers to the additional window you’re seeing, though in this case that window does seem corrupted for some reason.

I’ve updated this resource with a new, functional planner.

The example included is a very basic shooter AI, which would benefit from many different changes (but demonstrates some useful principles)

Hello agoose77,

I was just reading some slides on SlideShare where the Monolith team described their AI system, and the next moment I see your implementation in the BGE. The Python you used is way over my head. Right now I’m learning the basics of Python, so I’m eager to return to your example and wrap my mind around it. Good job, and again thanks a lot for your example blend.

ParnoidAndroid

The new demo worked very well, when all the enemies are dead he does just stand around a bit though. Maybe he should go and find somewhere to hide, although camping on the ammo seems a next best attitude.

What would be the next step? I guess you could collect data from bots to find out which action is more/less valuable than others. Then you could adjust weighting on the fly to create a graph which is better weighted to profitable actions.

What I mean is, the AI doesn’t know which type of gun is best at start, but tries each one with the same frequency. They soon learn which one is better by adjusting weighting to avoid the gun used by the most often killed bot and use the one which was used by the AI most successfully. Over a number of generations you could create an AI which is very efficient at eliminating the player.

In this demo, the interface for actions is still rather clunky. In my GitHub repo, I’ve passed in a Controller object instance to the action instead of the blackboard, to allow more functionality.

In addition, I only enabled the AI to complete a single goal, whereas in reality there would be goals for Idle, Patrol and other “goal-like” behaviors. It may not always be desirable for an action to actually modify the world symbol state after it completes (e.g. rather than setting target_is_dead to True when an enemy is killed, leave it unset so that the goal remains valid and other enemies will be targeted in future). I also added support for this in my repo (as well as improved the planner interface for interrupting actions).

Anyway, the basics are in the planner. Quite a lot of logic is still required to make the planner effective - feeding the planner appropriate information (from a sensory system) and allowing the planner to modify the world (through things like an FSM for managing different states: Animate / GOTO / [GOTO & Animate]). The sensory system is interesting, as it allows AIs to fail to understand the entire world as a human would (locality of knowledge).

Ideally, you would manually assign appropriate costs for actions in the planner, and avoid any kind of learning behavior within the planner itself. It doesn’t matter to the planner how a player is killed, just that there exists an action which may do so. Hence, the weapons manager may well use some kind of learning algorithm to achieve its goals - the planner is just one component!

Impressive … maybe the AI could also learn / create new goals? Each game object with a tag could be added as an ontological entity when the AI encounters it, allowing the AI to modify the game world … this would make a good rogue-like RPG dungeon opponent for a major enemy character - the problem would then be in designing the worlds … however I guess taking a puzzle/adventure game approach to segmenting levels would work out.

I was considering writing an A* algorithm for creating PCB-style layouts for pipes or level art (Halo-style panels, for example). The pipes concept would be a simple PCB design algorithm where the pipes have to route through a maze between two locations; this involves representing the game level as a graph data structure, and also creating a tree search that sorts through the different solutions to avoid intersections between multiple pipes. The idea of moving to 3D to create Halo-style structures via interleaving solid geometry is also interesting, however I have to make some steps toward understanding the 3D art first.

This is amazing.
But the code is like a brick: it’s robust and really useful, but like a brick to digest.

It seems like the only way to join the goap club is to build your own goap first…

I tried to do a simple version, with a plain Dijkstra search moving from start to goal (the opposite direction is more effective).
The code was surprisingly short and performed awfully.
That was with only 8 actions to find a 9-step solution for how to bake a cake.

The questions I am trying to ask are:

  • Given proper optimization, how much of a performance hit is there when adding additional actions? (Mine was most likely n!.)

  • How do you do heuristics?

    • If there are n smart objects an action could interact with, how do you know which one is the closest/best? For example, which weapon rack should be used - the closest in Manhattan distance, or in path distance?

    • Do you rate the actions which satisfy the most prerequisites higher?

    • etc.

  • This is probably a feature which could actually benefit from multiple cores and faster languages. Should I go straight to Java or C++ and use sockets?