(solved) Script instruction execution limit in combat mode?



#1
DarthGizka


I've heard rumours about an instruction limit imposed on scripts and now I may finally have run afoul of those limits... Does anyone know specifics?

In particular, I've written a script that enumerates hostiles and dumps some info to the log file. It works like a charm outside of combat but in combat mode it just craps out and stops in the earliest stages. Sometimes it works a bit if there are only two enemies left and I restrict the scan radius but not always. The same enumeration and computation logic works perfectly fine in another script that puts similar info into floaties above each enemy, even in mass battle zones like the Dead Trenches.

The only salient difference is that the logging script gathers slightly more information and that it makes heavy use of structures.

The fact that combat mode b0rkens things reduces the utility of the script a lot, since there are quite a few areas that start in combat mode right away.

When I get home I'll do the job of the compiler, eliminating all redundancy and most structure uses via manual optimisation and heavy inlining. That may improve things a bit but probably not a whole lot, depending on how tight the limits are and how heavily structure use is penalised. Without a clear idea of what the limits are and without an easy way to gauge/measure them, things are going to be extremely tricky...

In case anyone's curious, the script dumps info about enemy rank, level and combat XP, and all the data needed for computing combat XP for any player level. With a bit of SQL fu this information can be used to compute kill counts and exact XP for any mission/quest/area depending on player level, with near zero manual work except for factoring in enemy pools and enemies that are initially non-hostile. With a bit of manual tagging it can tell things about critical paths, MinXP runs, or XP loss/gain depending on chosen quest order or the presence of a proficient lockpicker in the party.

Log lines example:

Script   log_xp_info_nss__20140707 >>> 3 1887 (DAO_PRC_GIB) [0] gib100ar_smithy 21000 [15,35,1,25,25] (Fiona) [25]
Script   log_xp_info_nss__20140707 XPInfo,301281,gib_cr_revenant,gib_cr_revenant,-61.393009186,-29.866451263,-0.021365324,123.970001221,4,26,1,-1,0,0,26,0,-1
Script   log_xp_info_nss__20140707 XPInfo,301277,gib_cr_arcane_horror,gib_cr_arcane_horror,-46.686077118,-22.440023422,-0.02136532,57.67250061,3,25,0,-1,0,0,12,0,-1
... 70 lines snipped ...
Script   log_xp_info_nss__20140707 XPInfo,301607,pool_gib_skeleton_boss0,gib_skeleton_boss.utc,65505,65505,0,123.970001221,4,26,1,15,0,0,50002,0,-1
Script   log_xp_info_nss__20140707 <<< 3 1887 (DAO_PRC_GIB) [0] gib100ar_smithy 21000 [15,35,1,25,25] (Fiona) [25]


#2
DarthGizka


Okay, I've optimised the hell out of the script and now it manages to log more than a hundred critters.
 
Eliminating intermediate structures by inlining brought absolutely no change. There doesn't seem to be much of a difference between combat mode and exploration mode, not even with several dozen darkspawn beating on the Warden (Dead Trenches, PC_IMMORTAL). Floaties get even more unreliable than usual in that situation but there are no such problems with the log file.
 
The biggest difference came from drastically reducing the work load of a preprocessing stage that scanned all critters in the area, by using things like hashing and early-out strategies.
 
Logging a counter showed that the engine crapped out after about 395 calls to the following function (a condensed version of the code the game uses to scale enemies according to player level), with no difference between exploration mode and combat mode with several dozen enemies:
 

int rank_is_normal_or_lower_ (int rank);
 
int compute_target_level_ (object target, int player_level = 0)
{
   if (player_level == 0)
   {
      player_level = GetLevel(GetHero());
   }

   ///// AB_GetAreaTargetLevel() in sys_areabalance.nss /////

   int area_id = GetAreaId(GetArea(target));
   int min_level = GetM2DAInt(TABLE_AREA_DATA, "MinLevel", area_id);  // areadata.xls
   int max_level = GetM2DAInt(TABLE_AREA_DATA, "MaxLevel", area_id);

   int area_level = Max(1, Max(min_level, Min(player_level, max_level)));

   ///// AS_GetCreatureLevelToScale in sys_autoscale_h.nss /////

   // Diff_GetAutoScaleTable() == TABLE_AUTOSCALE == [creatureranks.xls]autoscale
   int rank = GetCreatureRank(target);
   int rank_delta = GetM2DAInt(TABLE_AUTOSCALE, "nLevelScale", rank);

   int level = Max(1, area_level + rank_delta);

   int ignore_max_A = GetM2DAInt(225, "IgnoreAppMaxLevel", area_id);               // toolset: none
   int ignore_max_M = GetLocalInt(GetModule(), DISABLE_APPEARANCE_LEVEL_LIMITS);   // toolset: none

   if (!IsSummoned(target) && !ignore_max_A && !ignore_max_M)
   {
      // inlist(rank, CREATURE_RANK_ONE_HIT_KILL, CREATURE_RANK_CRITTER, CREATURE_RANK_WEAK_NORMAL, CREATURE_RANK_NORMAL)
      if (rank_is_normal_or_lower_(rank))
      {
         // APR_base.xls
         int max_level = GetM2DAInt(TABLE_APPEARANCE, "MaxScaleLevel", GetAppearanceType(target));

         if (max_level > 0 && level > max_level)
         {
            level = max_level;
         }
      }
   }

   // 14 Marjolaine
   // 15 Flemeth
   // 15 'Andraste'
   // 20 Gaxkang

   level = Max(level, GetLocalInt(target, MIN_LEVEL));

   int climax_army = GetLocalInt(target, CLIMAX_ARMY_ID);

   // doesn't work right for some ogres in the market district...

   if(climax_army > 0 && GetTag(target) != "cli000cr_army_legion")
   {
      int max_level = GetM2DAInt(TABLE_PLOTACTIONS, "MaxLevel", climax_army);  // plotactions.xls

      if (max_level > 0 && level > max_level)
      {
         level = max_level;
      }
   }

   if (rank == CREATURE_RANK_ONE_HIT_KILL)
   {
      level = Min(level, 10);
   }

   return level;
}
 
//--------------------------------------------------------------------------------------------------
 
int rank_is_normal_or_lower_ (int rank)
{
   switch (rank)
   {
      case CREATURE_RANK_ONE_HIT_KILL:
      case CREATURE_RANK_CRITTER:
      case CREATURE_RANK_WEAK_NORMAL:
      case CREATURE_RANK_NORMAL:
      {
         return TRUE;
      }
   }

   return FALSE;
}



#3
Sunjammer


The most likely candidate is the TMI (Too Many Instructions) limit, which is a hard limit of 131,072 instructions. In the past it was referred to as the "20K rule" because 131,072 in hex is 0x20000. Also in the past there were ways to get round it by using DelayCommand, AssignCommand, etc. to change the context and give you a fresh 20k to play with. However, there are other limits on the virtual machine, such as the number of recursion levels (a maximum of 8), which may be relevant.

 

Also, at least in NWN, there was a concept of an "AI frame" and any command not performed within that time slice was simply lost into the ether. I'm not sure if Dragon Age has the same or similar concept.



#4
DarthGizka


Thanks for the pointer, it confirmed a suspicion that I got by farting around with small bits of code and it sped up my research tremendously by giving me a working hypothesis that turned out to be correct. Consider the following:

void main ()
{
   int n = 1;

   for ( ; ; )
   {
      SetLocalInt(GetMainControlled(), "CREATURE_SPAWN_DEAD", n++);
   }
}

This code advanced the counter to 16383 before biting the dust. In order to get a clearer picture I needed the actual instruction counts involved. Hence I fed a DAOified version of nwscript.nss to the KotOR edition of nwnnsscomp and pointed the program at the NCS:

00000008 42 00000074              T 00000074

0000000D 1E 00 00000008           JSR fn_00000015
00000013 20 00                    RETN

00000015 02 03                    RSADDI
00000017 04 03 00000001           CONSTI 00000001
0000001D 01 01 FFFFFFF8 0004      CPDOWNSP FFFFFFF8, 0004
00000025 1B 00 FFFFFFFC           MOVSP FFFFFFFC

0000002B 04 03 00000001           CONSTI 00000001
00000031 1F 00 0000003B           JZ off_0000006C
00000037 03 01 FFFFFFFC 0004      CPTOPSP FFFFFFFC, 0004
0000003F 24 03 FFFFFFF8           INCISP FFFFFFF8
00000045 04 05 0013 str           CONSTS "CREATURE_SPAWN_DEAD"
0000005C 05 00 00A8 00            ACTION GetMainControlled(00A8), 00
00000061 05 00 0040 03            ACTION SetLocalInt(0040), 03
00000066 1D 00 FFFFFFC5           JMP off_0000002B

0000006C 1B 00 FFFFFFFC           MOVSP FFFFFFFC
00000072 20 00                    RETN

The loop contains 8 instructions, and 2^17 / 8 happens to be 2^14 or 16384, which is certainly suggestive. Two tests with inserted statements confirmed the basic relation, but testing with the toolset turned out to be extremely cumbersome and time-consuming. Hence I used my favourite Swiss Army knife (FoxPro) to reduce the code to its essentials, like what a real compiler would have produced:

00000008 42 00000048              T 00000048

0000000D 04 03 00000001           CONSTI 00000001

00000013 03 01 FFFFFFFC 0004      CPTOPSP FFFFFFFC, 0004
0000001B 04 05 0013 str           CONSTS "CREATURE_SPAWN_DEAD"
00000032 05 00 00A8 00            ACTION GetMainControlled(00A8), 00
00000037 05 00 0040 03            ACTION SetLocalInt(0040), 03
0000003C 24 03 FFFFFFFC           INCISP FFFFFFFC
00000042 1D 00 FFFFFFD1           JMP off_00000013

The same surgical tool made it easy to produce variations - for example with no-op instructions like "JMP $+6" inserted in strategic places - and to write the resulting NCS to the override directory, ready to be invoked from the running DAO. That made the total round-trip time from test to test little more than two Alt-TABs and two or three key presses in each place.

The path to the first execution of SetLocalInt() contains 5 instructions, one cycle around the loop contains 6. Hence the expected number of completed calls to SetLocalInt() should be floor(1 + (0x1FFFF - 5) / 6), or 21845. With one or two no-ops before the loop the result should remain unchanged, and with three of them it should drop to 21844. That's exactly what happened when I ran the experiment.
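The arithmetic behind these predictions can be checked with a few lines of C. This is only a sketch: the function name and its parameters are made up for illustration, and it assumes (as the experiment suggests) that exactly limit - 1 instructions get executed before the script is aborted.

```c
#include <assert.h>

/* Hypothetical helper: number of completed engine calls before the abort.
 *   limit      - value the engine compares its counter against (0x20000 here)
 *   first_path - instructions up to and including the first call
 *                (5 in the reduced script)
 *   loop_len   - instructions per trip around the loop (6 in the reduced script)
 * Assumption: exactly limit - 1 instructions are executed. */
long completed_calls(long limit, long first_path, long loop_len)
{
    assert(loop_len > 0);

    long budget = limit - 1;        /* 0x1FFFF for the stock limit */

    if (budget < first_path)
        return 0;                   /* not even the first call completes */

    /* one call for the initial path, then one per full loop that still fits */
    return 1 + (budget - first_path) / loop_len;
}
```

With these assumptions, completed_calls(0x20000, 5, 6) gives 21845, adding one or two no-ops (first_path 6 or 7) leaves it at 21845, and a third (first_path 8) drops it to 21844. Counting 12 instructions from the JSR to the first SetLocalInt in the compiler's original listing, with 8 per loop, gives 16383.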

Armed with this information I (he)x-rayed the executable with IDA Pro and within seconds I homed in on the following instructions:

.text:00A0B37B 83 46 60 01                    add     [esi+$CScrEng.executed_insns], 1
.text:00A0B37F 81 7E 60 00 00 02 00           cmp     [esi+$CScrEng.executed_insns], 20000h
.text:00A0B386 0F 8D E9 18 00 00              jge     @@return_error_278

The structure field was named after I found the code, not before. Just in case anyone was wondering. The instruction limit is checked before an instruction gets executed, which explains why my earlier formula contained 0x1FFFF instead of 0x20000.
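A toy model of that check makes the off-by-one visible. It is a sketch of the three instructions above and nothing more; the function name is invented and the rest of the interpreter loop is assumed.

```c
#include <assert.h>

/* Toy model of the check: bump the counter first, bail out once it has
 * reached the limit, and only otherwise execute the instruction.
 * Returns how many instructions actually get executed. */
long executed_before_abort(long limit)
{
    assert(limit > 0);

    long executed_insns = 0;   /* models the counter field in the engine object */
    long executed = 0;         /* instructions that really ran */

    for (;;)
    {
        executed_insns += 1;              /* add [esi+...], 1        */
        if (executed_insns >= limit)      /* cmp ..., limit ; jge    */
            break;                        /* abort before executing  */
        executed += 1;                    /* the instruction runs    */
    }

    return executed;
}
```

Under this model a limit of 0x20000 lets exactly 0x1FFFF instructions run, which is the figure used in the formula.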

I switched back to my Swiss Army knife to poke the value 0x40000 into the immediate operand field of the comparison instruction. Re-running test scripts produced 32767 and 43690 respectively, and Bob was my uncle.

Patching the executable on disk would open a can of legal worms, hence the best way to adjust the limit would be in memory. E.g. FindWindow(), OpenProcess(), compute section hashes to verify that the binary is exactly as expected, then WriteProcessMemory().
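The verify-then-patch step could look roughly like the C sketch below. It works on a plain byte buffer and leaves out the process access entirely (FindWindow(), OpenProcess() and WriteProcessMemory() are not shown). Instead of section hashes it does a simpler comparison against the seven bytes of the cmp instruction quoted above; the function name and the idea of passing the offset in by hand are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes of "cmp [esi+60h], 20000h" as shown in the disassembly. */
static const uint8_t EXPECTED_CMP[7] = { 0x81, 0x7E, 0x60, 0x00, 0x00, 0x02, 0x00 };

/* Hypothetical helper: replace the 32-bit immediate of the cmp at `offset`.
 * Returns 0 on success, -1 if the bytes found there are not the expected
 * instruction (wrong offset, different build, or already patched). */
int patch_insn_limit(uint8_t *image, size_t size, size_t offset, uint32_t new_limit)
{
    assert(image != NULL);

    if (offset > size || size - offset < sizeof EXPECTED_CMP)
        return -1;                                   /* would run off the end */

    if (memcmp(image + offset, EXPECTED_CMP, sizeof EXPECTED_CMP) != 0)
        return -1;                                   /* not the code we expect */

    /* The immediate follows opcode, ModRM and disp8; x86 stores it little-endian. */
    uint8_t *imm = image + offset + 3;
    imm[0] = (uint8_t)(new_limit & 0xFF);
    imm[1] = (uint8_t)((new_limit >> 8) & 0xFF);
    imm[2] = (uint8_t)((new_limit >> 16) & 0xFF);
    imm[3] = (uint8_t)((new_limit >> 24) & 0xFF);

    return 0;
}
```

Refusing to write unless the existing bytes match is the whole point: a different build of the executable would have the check at a different address, and a blind write would corrupt whatever lives there.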

If adjusting the limit is not an option for some reason - e.g. for mods - then there are only two ways out. One: reduce the instruction count by optimising, manual inlining and pushing processing into the engine, regardless of casualties. Macro preprocessors and manual assembly could help in really tight places. Two: split the processing into packets and chain by posting events, exactly like eons ago under 16-bit Windows. ExecuteScript() and HandleEvent() do not help here; they execute synchronously and do not grant a new quantum.
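The second way out, splitting the work into packets, boils down to keeping a cursor between runs and stopping well before the quantum is used up. The C sketch below only simulates the pattern: the struct, the function names and the loop standing in for the engine's event queue are all hypothetical, and the actual posting of an event to oneself is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical work state carried from one packet to the next
 * (in a real script this would live in local variables on some object). */
typedef struct
{
    int next_item;     /* index of the first item not yet processed */
    int total_items;   /* number of items to process overall */
} WorkState;

/* Process at most `per_packet` items, then stop.
 * Returns 1 if a follow-up event needs to be posted, 0 if all work is done. */
int run_packet(WorkState *st, int per_packet)
{
    assert(st != NULL && per_packet > 0);

    int done = 0;

    while (st->next_item < st->total_items && done < per_packet)
    {
        st->next_item++;   /* placeholder for the real per-item work */
        done++;
    }

    return st->next_item < st->total_items;
}

/* Stand-in for the engine's event loop: every iteration models a separate
 * script run that starts with a fresh instruction quantum.
 * Returns the number of packets that were needed. */
int run_until_done(WorkState *st, int per_packet)
{
    int packets = 1;

    while (run_packet(st, per_packet))
        packets++;

    return packets;
}
```

The packet size has to be chosen from the measured cost per item. For instance, if about 395 calls fit into one quantum, a packet size comfortably below that leaves headroom for the bookkeeping.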
 

#5
DarthGizka


Sunjammer wrote: (... TMI 0x20000 ...) However there are other limits on the virtual machine such as the number of recursion levels (a maximum of 8) which may be relevant.


I found a limit of 9 for nested script invocations (not counting the outer script), and 124 levels of nested function calls below the level of main(). Invoked scripts run on the same stack and see the remaining space reduced by two, one for main() and one for the hidden "JSR main" that the broken compiler generates instead of putting main() at offset 0 or generating a JMP.

 

Adding 2 for the main() of the outer script and its hidden entry point function gives 126, which is somewhat odd seeing that the hard-coded limit is 128. Even postulating a fence-post error one would expect 127 here...

 

Both limits stem from fixed-size arrays embedded in the script engine object, meaning there are no easy fixes.
 

Sunjammer wrote: Also, at least in NWN, there was a concept of an "AI frame" and any command not performed within that time slice was simply lost into the ether. I'm not sure if Dragon Age has the same or similar concept.

 

Probably not. A time limit was my first guess but calling engine functions with a bit of a workload simply resulted in the game becoming unresponsive for a long time, up to a quarter of an hour, without the script getting aborted.

I used the moral equivalent of FindSubString(Space(N), Space(K) + "*") to create arbitrary delays in the bowels of the engine, since such parameters elicit the O(N * K) worst-case behaviour of the function. For example, with N = 2^22 and K = 2^11, one call took 3.77 seconds.
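Why these parameters are the worst case is easy to see with a naive search written out in C. This is a sketch under the assumption that the engine's FindSubString() behaves like a plain character-by-character search; its actual implementation is not known, and the function name below is invented.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Count the character comparisons a naive substring search makes when looking
 * for K spaces followed by '*' in a haystack of N spaces.
 * Returns -1 if memory allocation fails. */
long naive_search_comparisons(size_t n, size_t k)
{
    assert(k < n);

    char *hay = malloc(n);
    char *needle = malloc(k + 1);

    if (hay == NULL || needle == NULL)
    {
        free(hay);
        free(needle);
        return -1;
    }

    memset(hay, ' ', n);        /* Space(N)       */
    memset(needle, ' ', k);     /* Space(K) + "*" */
    needle[k] = '*';

    long comparisons = 0;

    for (size_t start = 0; start + k + 1 <= n; start++)
    {
        for (size_t j = 0; j <= k; j++)
        {
            comparisons++;
            if (hay[start + j] != needle[j])
                break;          /* only ever happens at j == k: ' ' versus '*' */
        }
    }

    free(hay);
    free(needle);
    return comparisons;
}
```

Every start position matches all K spaces and fails only on the final '*', so the count is (N - K) * (K + 1). For N = 2^22 and K = 2^11 that would be roughly 8.6 billion comparisons under this model.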

It is my impression that the whole script thing has not advanced a whole lot since the times of NWN; on the contrary, it seems to have suffered quite a bit of decay...