In-depth explanation on how STUP works with code snippets

On this page, I will explain how STUP works with the help of code snippets. I will do it in the same way as the detailed STUP explanation with examples with a flowchart that shows the steps and what is happening. When making STUP me and another programmer made a UML for how we wanted to structure the STUP code.

In order for STUP to work we need to get the current worldstate. The worldstate is a set of variables that say what the state of the world is which will be used as a start for GOAP to make the plans. There will be a better explanation of how this is used in the GOAP section

With STUP we make 1 goal for every action we can do on every object in the game we can do actions on. Every goal has an action which has preconditions which need to be fulfilled which applies certain effects on the worldstate. This is what is used to make a plan for the goal and will be explained in more detail in the GOAP section. The Score and Multipier will end up being used to weigh the goals increase/decreasing the chance of them being chosen and is important in step 1, 3, and 4 of Utility. Rank will be used to eliminate goals in the 2nd step of Utility

This is pretty simple we just call OnUpdate() from blueprints to start the STUP algorithm. The reason for the Plan being returned the way it is will be explained in the "Return plan to blueprint section"

Now we start with the first step of Utility. What we want to do is eliminate any goal that is impossible or undesired. This is down by removing every goal with a Score or Multiplier of 0 or lower as we don't want to make a plan for those.

To do Utility step 2 we first check which goal has the highest rank and then we eliminate every goal with a lower rank. The reason we do this is that certain actions will have a higher priority in general than others. Let's say you are in a battle but somewhere in the environment is a TV. You would never go watch TV while there is an enemy trying to kill you so watching TV has a lower rank and will thus be eliminated by the presence of the enemy.

Now that we have eliminated the goals that we cannot do or do not want to do we can make plans for those we have left. In this function we use the A* algorithm to find a path from our current worldstate to the goal worldstate if possible. Each time we finish the loop in the image below we change the current worldstate and check if it meets the preconditions for the goal. If it does then we have made a plan and we can add the action to reach the goal and the cost. If we don't find the goal however we continue checking which actions we can do on the current worldstate before switching it. 

In the loop to the left, we check which actions can be done on the current worldstate. In this function we have 2 vectors: OpenList and ClosedList. The ClosedList contains all the world states we have already visited so we don't end up with a lot of duplicate worldstates. The OpenList has all the worldstates we have yet to finish and will be expanded as we reach new nodes. This list is ordered based on Heurestic which is a value that tells you how close you are to the goal with the worldstates with the lowest Heurestic being at the beginning of the list. Before this algorithm we remove the first worldstate from the OpenList and make it our Current worldstate. After that we go over each action and see if the preconditions are fulfilled. If they are fulfilled we check if this worldstate is on the ClosedList and if it is we exit. Next we check if it is member of the OpenList and if it isn't we add it, if it is on there we check if the cost of the current plan is cheaper and if it is we update the worldsate in the OpenList and make sure it is still in the right location.

We now have a cost for every goal which we are going to use in the next step of Utility. In this step we first get the highestUtilityPerCost with Utility being Score * Multiplier. Once we have this we check every goal to see if the UtilityPerCost of the goal is lower than the highestUtilityPerCost * CutOffRatio(number between 0-1). If it is we will put it on a list of actions to potentially remove. Afterward, we check every goal and add up the UtilityPerCost off all the to be removed goals and if that total is higher than the highestUtilityPerCost * CutOffRatio the goals will not be removed. The reason we do this is that we want to eliminate the goals that normally wouldn't be taken even though they are an option but if they together are significant we shouldn't remove them.

An example is lets say you are a sniper and you shoot at a battalion of troops. The leader will be a very important target and thus have a high UtilityPerCost so lets say it is 10. Now there are also 20 soldiers which each has a UtilityPerCost of 0.5. With a CutOffRatio of 0.2 all the soldiers would not be up for consideration anymore leaving the leader as the guaranteed target every time. This is why the last part is there because now we can check if 10 * 0.2 > 0.5 * 20 and because the combined total is still relevant they will still be considered and not removed.

Now that we have a plan for every goal and only significant goals left we just need to randomly pick one. Now if we just randomly pick one the whole scoring at the start would have been for nothing. To avoid that we are going to use the UtilityPerCost as a weight to give a higher chance for some goals to be picked. First we need the totalUtilityPerCost of all the goals. Next we get a random number between 0 and totalUtilityPerCost. Now that we have a random number we will check if the UtilityPerCost of the first goal is higher than the random number. If it isn't we add the first goals UtilityPerCost to the currentCombinedUtilityPerCost of all the goals we have already checked. Now we do the same for the next goal but we add the currentCombinedUtilityPerCost to the goals UtilityPerCost and check if this is higher. Repeat this until it is higher and now we have our goal.

Now we return the plan to the blueprint and execute it. The reason why we return a string and the affected actor and not the enum and not the affected actor is that we couldn't figure out how to point to a single part of an enum. We made STUP easy to be used by designers so the enum we use is stored as a uasset and not in C++. Because of this, I made some conversions for being able to convert the string back to the enum so it can be used by the Behavior Tree.

We have enums that will be done after each other and to execute the actions we use a behavior tree. This tree will only be called when the next action is called which happens at the end of all the tasks.