My thesis was completed and handed in last month, and I had my viva last week. Academically it has exceeded every expectation, but no one is interested in that; what's important is the results and analysis from the project. I don't think I can share the full thesis until after my mark is finalised (at which point I plan to), but I can summarise my findings now. Essentially, the computer-tuned AI sucked, but it wasn't a complete failure. This first attempt was a shot in the dark though, and honestly things worked better than they had any right to. Details below.
The final conclusion was that everything worked great except the GA fitness algorithm. This basically just tracked a few statistics about each game and then turned them into predicted survey responses. The algorithm was based entirely on guesswork: I had no previous studies to work with and no experience with game analytics. This study has helped with both of those. So to realise the benefits, I think all we need to do is sort out the analytics side and the automated analysis of that data. The analytics used in this study were all homemade and done on an extremely tight schedule (i.e. a couple of days), and I know for sure there are better ways of doing it; Unity has everything you could want here. That just leaves the automated analysis.
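To make the "statistics in, predicted survey responses out" idea concrete, here is a minimal sketch in Python. It is not the algorithm from the thesis: the statistics (`duration_s`, `player_deaths`, `shots_fired`, `shots_hit`), the two survey questions, and every weight are hypothetical stand-ins, guessed in the same spirit as the original.

```python
from dataclasses import dataclass


@dataclass
class GameStats:
    # Hypothetical per-game statistics, chosen only for illustration.
    duration_s: float
    player_deaths: int
    shots_fired: int
    shots_hit: int


def clamp(value, low, high):
    """Keep a predicted answer inside the survey scale."""
    return max(low, min(high, value))


def predict_survey(stats):
    """Map raw statistics to predicted survey answers on a 1-5 scale."""
    accuracy = stats.shots_hit / stats.shots_fired if stats.shots_fired else 0.0
    minutes = stats.duration_s / 60.0
    deaths_per_min = stats.player_deaths / minutes if minutes > 0 else 0.0
    # Guessed weights (assumption): more deaths and lower accuracy read as harder.
    difficulty = clamp(1.0 + 2.0 * deaths_per_min + 2.0 * (1.0 - accuracy), 1.0, 5.0)
    # Assumption: predicted enjoyment peaks at mid difficulty (3 out of 5).
    fun = clamp(5.0 - 1.5 * abs(difficulty - 3.0), 1.0, 5.0)
    return {"difficulty": difficulty, "fun": fun}


def fitness(games):
    """Score one candidate tuning as the mean predicted 'fun' over its games."""
    if not games:
        return 0.0
    return sum(predict_survey(g)["fun"] for g in games) / len(games)
```

A GA would call something like `fitness` once per candidate tuning, after playing a batch of games with it. The weak point is exactly the guessed weights inside `predict_survey`.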
On the automated side there's no standard system for doing it, but some survey metrics were predicted with some level of success here; certainly nothing particularly accurate, but something at least. I'm certain that with decent analytics info and some calibration tests, we could produce a solid fitness test. The thing is that the fitness doesn't have to be extremely accurate or detailed to produce valuable results. If the automated system can get an AI this complicated half tuned in an afternoon, then that's a big improvement over what we have now. Additionally, there's not just one "perfect" tuning: there are going to be multiple generally good solutions, and countless more specialised tunings for different tastes. You just can't experiment that much doing it manually.
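One way the calibration tests could work is to fit each predicted survey answer against real responses instead of guessing the weights. The sketch below uses plain least squares on a single statistic. This is an assumption about how calibration might be done, not something the study did, and the names `fit_line` and `make_predictor` are made up for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired samples")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("the statistic does not vary across samples")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def make_predictor(slope, intercept, low=1.0, high=5.0):
    """Build a function that predicts a survey answer from one statistic."""
    def predict(x):
        # Clamp to the survey scale (assumed here to be 1-5).
        return max(low, min(high, slope * x + intercept))
    return predict
```

Here `xs` would be one tracked statistic per calibration game (say, deaths per minute) and `ys` the matching survey answers from the people who played those games. The fitted predictor then replaces the guessed formula inside the fitness test. A rough fit is fine for this purpose, since the fitness only needs to rank tunings sensibly.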
The end result is that I'm excited for this technology, and I'll be pushing ahead with it at some point in the future for sure. Do I think a GA tuning is necessary or the right idea for the early Hunter's Moon releases? No. Do I think it'll be invaluable at a later stage, especially for the general simulator project (whatever we end up naming it)? Yes, absolutely.
I'm hoping the thesis will be up in the next month or so, whenever I get my final grade. There are a couple of issues, typos, etc. with it, but it's presentable enough.