As it turns out, these games exploit several tricks and undefined behaviors that make emulating them challenging. This appears to be a deliberate attempt to dissuade copying these games. In the interest of accuracy, I have painstakingly investigated, implemented and chronicled all of the unusual things I've found these games to do. [...]
The processor in the Game Boy Advance has a pipeline that has three relevant stages for accurate emulation: fetching, decoding and executing. In the fetching stage, the memory bus is queried for the memory associated with an instruction. This is then passed to the decoding stage, where the processor figures out which instruction it is. Finally, the processor actually executes the instruction. A naïve interpreter may merge all three stages, either for hypothesized speed reasons, or just an uninformed idea of how processors work. mGBA was actually assuming the decoding and execution stages were combined until recently. However, an important observation was made while digging through the Classic NES Series games' code: the game was modifying an instruction that was very close in proximity to where code was being executed already. [...]
What's imperative to understanding what's going on in this block of code is to realize that, once the instruction has been fetched by the pipeline, changing the memory that backs that address is irrelevant. This is similar to how cache coherence works, but is even more stringent. This means that if your pipeline is long enough, the instruction that enters into the pipeline during the write is the one that stores 255. If it's too short, it stores 0. As it turns out, the games will fail to boot if it finds the value 0 in register r1, but boots fine if it's 255.
Emulator countermeasures are so vicious because the stakes are so low.