I know it is an old topic but I was searching around as I had the same question.
This is what I can tell so far, and I could well be wrong.
The 68000 has 24 address lines so it can address 16 MB. From what I can tell it technically has 23 address pins (A1 to A23) plus a pair of upper/lower data strobes (UDS/LDS) that select which byte of the 16-bit word is accessed (a bank select of sorts). Between them they are essentially another address line; I am not sure why they did this rather than just label it A0.
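To make the arithmetic concrete, here is a rough Python sketch of how I picture it. The function names are my own invention, and it only models the numbers, not the real bus timing: a byte address is split into the value carried on A1 to A23 and the strobe that picks the byte (even addresses sit on the upper half of the data bus, odd addresses on the lower half).

```python
ADDRESS_BITS = 24  # 23 address pins (A1..A23) plus the byte select done by UDS/LDS


def addressable_bytes(bits):
    """Number of distinct byte addresses reachable with `bits` address bits."""
    return 1 << bits


def split_byte_address(addr):
    """Split a byte address into (value on A1..A23, strobe that selects the byte)."""
    addr &= (1 << ADDRESS_BITS) - 1        # only 24 bits ever reach the bus
    word_index = addr >> 1                 # this is what A1..A23 carry
    strobe = "LDS" if addr & 1 else "UDS"  # odd byte = lower half, even byte = upper half
    return word_index, strobe


if __name__ == "__main__":
    print(addressable_bytes(ADDRESS_BITS) // (1024 * 1024), "MB")
    print(split_byte_address(0xFF8240))
    print(split_byte_address(0xFF8241))
```

Both example addresses land on the same word; only the strobe differs, which is why the pair of strobes behaves like a missing A0.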
The MMU on the STE only has 20 address lines (well, it seems to; the pins are labelled funny), so 4 MB. Interestingly, I think the MMU on the standard ST only has 19 lines so can only support 2 MB. From what I can tell this is also the case for several of the custom chips in the Amiga, so samples, screen, sprites, etc. had to be in the lower 2 MB (I understand later chipsets “addressed” these limitations), although I think the MMU supported 4 MB for the CPU. The STE DMA/chip hardware could address all 4 MB.
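The data bus is 16 bits wide, so a word (two bytes) is moved at a time. I am not sure whether a move.b does only one read from the bus or is forced to do two, although the upper/lower strobes suggest a single read with just one half of the bus selected.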
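As far as I understand the 68000 bus, a byte access is one bus cycle with one strobe asserted, a word is one cycle with both strobes, and a long word takes two cycles. The sketch below is only a toy model of that under those assumptions (bus_cycles is a made-up name, and real cycle timing is ignored):

```python
def bus_cycles(addr, size):
    """List the (word_address, strobes) bus cycles for one access of `size` bytes."""
    if size not in (1, 2, 4):
        raise ValueError("size must be 1 (byte), 2 (word) or 4 (long)")
    if size == 1:
        # A byte needs a single cycle; only one half of the data bus is selected.
        strobe = "LDS" if addr & 1 else "UDS"
        return [(addr & ~1, strobe)]
    if addr & 1:
        # The 68000 raises an address error for word/long accesses at odd addresses.
        raise ValueError("word/long access at an odd address")
    # A word is one cycle with both strobes; a long is two such cycles back to back.
    return [(addr + offset, "UDS+LDS") for offset in range(0, size, 2)]


if __name__ == "__main__":
    for size in (1, 2, 4):
        print(size, bus_cycles(0x1000, size))
```

So in this model a move.b costs one bus cycle, the same as a move.w, while a move.l costs two.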
The 68030 has 32 address lines so could address 4 GB, so I am guessing the Falcon MMU only had 24 address lines; however, it could only have 14 MB as the last 2 MB were masked off. I suspect this was done for DMA or DSP reasons. I am not sure what the TT was capable of, but I think some of the chips were constrained to the lower 2 MB.
That said, it may be that you could write software for the ST where a move.b #$000,$ff8240 had the same result as move.b #$000,$ffff8240 (as the upper 8 bits were masked off), so it could be for compatibility. This is speculation; I have not done the math/research on this.
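The masking part at least is easy to check on paper. A minimal sketch, assuming the only thing that happens is that the top 8 bits of a 32-bit address never reach a 24-bit bus (the names here are mine):

```python
ADDRESS_MASK_24 = (1 << 24) - 1  # 0x00FFFFFF, the 24 bits that actually reach the bus


def address_on_bus(addr):
    """Return the address as seen on a 24-bit bus (upper 8 bits dropped)."""
    return addr & ADDRESS_MASK_24


if __name__ == "__main__":
    for addr in (0x00FF8240, 0xFFFF8240):
        print(f"${addr:08x} -> ${address_on_bus(addr):06x}")
```

Both addresses come out as $ff8240, so on a CPU with 24 address lines the two move.b instructions would hit the same register. On a CPU with a full 32-bit bus they would only match if the hardware decodes them to the same place.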
There were upgrades for both the ST and STE to go beyond their limits, and I assume they tapped directly into the CPU's extra address lines and added some bodged external MMU logic. Any memory above 4 MB (2 MB on the ST) was classed as TT RAM (Fast RAM), so only programs and their data could be stored there (as the other chips did not have the address lines). Now I am unsure if this was technically “fast RAM”: as I understand it, the reason for the name is that when the CPU accesses this memory it does not have to share bus cycles with the other chips, and I wonder if that may not be true in this situation.
If anyone wants to point out anything I am happy to learn.