Would it be possible to make a 4X game AI that plays well with MILP?

4X games like Civilization, Stellaris, Endless Legend, etc… are known to have bad AI. These games often paper over the issue by giving the AI a massive handicap. This tends to push players into strategies that can negate the handicap, or it backfires by letting players trade with, manipulate, or steal from the AI, turning the handicap into a player advantage.

And if you play these games and are familiar with mixed-integer linear programming (MILP), you might wonder whether it could make the AI play well. It looks like a natural fit: from population-growth constraints to building constraints, much of the game could be encoded in a MILP.
There are some unpredictable factors like exploration, but these could be modeled as a flat positive term in the objective that the MILP optimizes for.
So, would it be possible? I think the reason this hasn't been done is that MILP, while not an uncommon tool in many industries, is not a standard tool in game developers' toolkits. Still, it would be funny if an issue that has drawn complaints from the player base of virtually every 4X game could be solved with some clever math tricks.
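
To make this concrete, here's a toy sketch in JuMP of the kind of encoding I have in mind. Everything in it (the three building types, the growth rule, all the numbers) is invented for illustration:

```julia
using JuMP, HiGHS

T = 10                                 # planning horizon in turns
model = Model(HiGHS.Optimizer)
set_silent(model)

# One binary per (building type, turn): 1 = farm, 2 = barracks, 3 = library
@variable(model, build[1:3, 1:T], Bin)
@constraint(model, [t in 1:T], sum(build[:, t]) <= 1)  # one project per turn

# Toy pop-growth constraint: growth each turn equals the number of farms so far
@variable(model, pop[1:T] >= 0)
@constraint(model, pop[1] == 2)
@constraint(model, [t in 2:T], pop[t] == pop[t-1] + sum(build[1, 1:t-1]))

# Toy building constraint: a barracks requires at least 4 population
@constraint(model, [t in 1:T], 4 * build[2, t] <= pop[t])

# Reward libraries and final population; exploration etc. would add more terms
@objective(model, Max, 5 * sum(build[3, :]) + pop[T])

optimize!(model)
value.(build)   # the optimal toy build order
```

Even in this toy, the solver trades early farms against later libraries over the whole horizon instead of deciding greedily turn by turn.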

There’s only one way to find out :smile:

p.s. As a comment: you might find this thread more productive if you ask a concrete question, like "Given input data A and decision variables B, how do I model constraint C?", rather than a general one like "Is it possible to…?".


This question is a bit awkward no matter where you ask it. Ask it on game forums and few would understand linear programming. Ask it on a forum of mathematical optimization enthusiasts and not many would play such games.

Encoding individual constraints into the solver shouldn't be too difficult. The problem is that such games often involve many interacting mechanisms. Some mechanisms are also slightly random or involve unknown variables, so you would need to approximate these random events with expected values and the amount of risk you're willing to accept. This would require a lot of tuning to make sure the AI neither plays too conservatively nor takes too many unnecessary risks, and it could become a nightmare really quickly. I only have a feeling that this should be theoretically possible. To test it, I would need to:

  1. Get access to a 4X game's AI. This could be done by modding a game, working with the game's devs, finding an open-source 4X game, or creating my own.
  2. Hook the AI part up to Julia and JuMP. Since most games aren't written in Julia, this would require a fair amount of language interop.
  3. Encode the game mechanics as constraints. This would take a lot of time, and it also needs tricks to evaluate uncertain potential risks and rewards (see the sketch after this list).
  4. Tune the risk/reward parameters. Not trivial either.
  5. Finally hopefully get a good AI!
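
As a sketch of what steps 3 and 4 could look like (every number here is invented): approximate a random event by its expected payoff and express the risk as a linear shortfall penalty with a single tunable weight:

```julia
using JuMP, HiGHS

model = Model(HiGHS.Optimizer)
set_silent(model)

@variable(model, explore >= 0, Int)            # scouts to build
@variable(model, defend >= 0, Int)             # defenders to build
@constraint(model, 3explore + 5defend <= 30)   # shared production budget

# Risk term: we "want" at least 4 defenders, so penalize the shortfall.
# shortfall >= max(0, 4 - defend), written linearly:
@variable(model, shortfall >= 0)
@constraint(model, shortfall >= 4 - defend)

λ = 2.0   # risk-aversion weight: exactly the parameter that needs tuning
# 4 = assumed expected payoff per scout (goody huts, map knowledge, ...)
@objective(model, Max, 4explore - λ * shortfall)

optimize!(model)
(value(explore), value(defend))
```

With λ = 2 the solver leaves the cities open; raise λ and the solution shifts toward defending first. That single knob already hints at how much tuning step 4 hides.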

And the project may fail.

I like the idea. In general I think the branching factor is really high in such games, so this might be a problem. Otherwise, reinforcement learning is a nice, although not guaranteed optimal, alternative. Bertsimas has a paper called "Online Mixed-Integer Optimization in Milliseconds", which seems appropriate here. And describing the constraints will be a nightmare :sweat_smile:

I am on my phone right now, so no fancy links, sorry :melting_face:

The branching factor can become an issue. Some MILP problems with hundreds or even thousands of variables can be solved optimally in less than a second, but the hard cases can be really difficult. I'm not sure whether optimizing the build order in these games is an easy case or a hard one. Human players tend to do it quite well, but I'm not sure how a machine would fare with the problem formulated as a MILP.

I’ve also toyed with the idea of trying to make a Civ 6 AI - the built-in heuristic-based “AI” follows the rules, but that’s the kindest thing you can say about it.

But your idea won’t work. The main problem is that Civ is a multiplayer game of incomplete/imperfect information. A key element of Civ is the “fog of war” (even during peace), so you can’t account for what’s happening elsewhere using classical optimization. There’s also some RNG (combat, goody huts, natural disasters, etc.).

But here’s an interesting idea for you if you want to see Julia contribute to a better Civ AI. Develop a package that encodes all the rules of Civ 6: define structs that describe the complete game state and the potential decisions in each state, then get it to simulate a complete game from beginning to end using 100% random decisions for each player.
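
A bare-bones sketch of the shape I mean; every type, field, and rule below is an invented stand-in for the real Civ 6 rules:

```julia
struct Unit
    kind::Symbol
    hex::Tuple{Int,Int}
end

mutable struct GameState
    turn::Int
    units::Vector{Unit}
    alive::Vector{Bool}            # one entry per player
end

legal_moves(gs::GameState, player::Int) = [:move, :build, :pass]   # stub

function apply!(gs::GameState, player::Int, move::Symbol)
    # the real game rules would go here; no-op in this sketch
    return gs
end

function random_playout!(gs::GameState; max_turns = 500)
    while gs.turn < max_turns && count(gs.alive) > 1
        for p in eachindex(gs.alive)
            gs.alive[p] && apply!(gs, p, rand(legal_moves(gs, p)))
        end
        gs.turn += 1
    end
    return gs
end

random_playout!(GameState(1, Unit[], [true, true]))
```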

Yes, that’s utter chaos, but if you can get it to run a full game in milliseconds then you have a perfect sandbox for experimentation with AI strategies and tactics. As they say, “If you build it they will come”. Slap an MIT license on it, announce it on Civfanatics and /r/civ on Reddit and call it a day. A performant Civ game framework like this would be used by academia, game AI devs (maybe even Firaxis itself), and might even form a better core for human multiplayer games than the built-in multiplayer server. You could change the (Civ) world.


I know about that issue. My idea for making it possible is to encode the potential risk/reward in the objective function. Cutting corners, spending every last bit of your money on the plan, or leaving your cities undefended would increase the risk score; making exploring units would increase the potential-reward score. Here's an example of what can be done:

  1. Pre-optimization: assess the situation and set the duration of the plan as well as the reward terms. More uncertainty and more unexplored territory mean shorter plans and bigger rewards for exploration. For example, if you do not yet have planned locations for your first 3-4 cities, you add production of exploration units to the objective; if a neighbor is likely to be aggressive, you add defensive buildings.
  2. Optimization: formulate the objective and constraints as a MILP and solve it.
  3. Execution: take the result for the current turn and discard the rest (a sketch of this loop follows below).

I am not saying that the variables optimized in the MILP are what will actually happen. Rather, they constitute the plan, and the objective function is the assessed feasibility of that plan.
Also, I'm not talking about Civilization in particular but about 4X games in general, so the details would have to change from game to game.
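
A minimal sketch of that plan/execute loop (essentially receding-horizon control); `build_milp` and `apply_first_turn!` are hypothetical placeholders, not functions from any existing package:

```julia
using JuMP

# Hypothetical helpers: build_milp(state, horizon) returns a JuMP model plus
# its matrix of binary action variables; apply_first_turn! mutates the game.
function play_turn!(state; horizon = 15)
    model, actions = build_milp(state, horizon)   # actions[a, t], t = 1..horizon
    optimize!(model)
    plan = value.(actions) .> 0.5
    apply_first_turn!(state, plan[:, 1])   # execute turn 1, discard turns 2..horizon
    return state   # next turn: reassess, rebuild the MILP, solve again
end
```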

You call this pre-optimization, but what you are describing are heuristics. That’s exactly how the current AI works, so it’s probably difficult to improve significantly upon.

As for solving the rest as a MILP: every turn of Civ (or of a 4X game in general) involves on the order of hundreds of discrete decisions. E.g., every city can put something in its build queue, every unit can move to a few different hexes and take some action, and then you have empire-wide research and civics to choose, neighbors to negotiate with, etc. Those integer variables quickly add up to a MILP that can take hours to solve on a beefy machine (if it solves at all), but your time budget is around 1-5 seconds per civilization if the game is to flow nicely. Even neglecting the issue of imperfect information, solving a large optimization problem several times per turn is probably intractable. That's why I claim it won't work.

That said, I hope you play with the idea and report your experience here. I would definitely follow your progress with interest.


The worry about the MILP taking hours is understandable. Honestly, I don't know whether that would be the case in practice. Easy MILP instances with thousands or perhaps even millions of variables can be solved in a few seconds, and I don't know whether the Civilization case is an easy one or a hard one.

Heuristics are unavoidable in an AI as complicated as this. Consider chess engines. You could absolutely write a chess program that says "look at the board and use some rule-based heuristics to pick the next move." That would be a valid program, but it wouldn't be very strong. Compare this with how chess engines actually work: they search ("what if I do this, then what could be my opponent's response, then what could be my response to that, and so on") and only apply heuristic evaluation perhaps 20 moves down the line, to conclude "after all those moves, I'd be in an advantageous position." That search is what makes chess engines really strong.

Imperfect-information games have imperfect information, but that doesn't mean you can't have a plan. The MILP reduction is the equivalent of "search over all possible plans, evaluate the feasibility of each, and pick the best one", which is likely more powerful than "look at the current game information and decide right now what to do".

So, these are the reasons I think it can work.

My first worry wouldn't be performance but whether it's possible to design a useful MILP at all, i.e., to decide what your variables mean and to determine whether you can formulate all constraints as linear inequalities in those variables and the objective as a linear function. I'm not saying it can't be done for this application, but more than once I've thought I could solve some problem with a MILP, only to find that I couldn't in fact formulate linear constraints.
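
Many logical conditions do admit standard linearizations. For instance, a made-up rule like "a city gets a trade bonus only if it has both a market and a harbor" can be written with an auxiliary binary:

```julia
using JuMP, HiGHS

model = Model(HiGHS.Optimizer)
@variable(model, market, Bin)
@variable(model, harbor, Bin)
@variable(model, bonus, Bin)

# bonus = market AND harbor, expressed as linear inequalities:
@constraint(model, bonus <= market)
@constraint(model, bonus <= harbor)
@constraint(model, bonus >= market + harbor - 1)
```

The trouble starts with genuinely nonlinear couplings, such as a yield that multiplies two continuous quantities, where no exact linear reformulation may exist.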

To get a proof of concept, I would recommend scaling the problem down as far as possible without losing its relevant characteristics, possibly by designing a toy-scale 4X game. If the MILP approach works for that game, it's time to scale things back up.
