With all the supplementary operations out of the way, time to analyze the arithmetic floating point operations: MR, DR, AR and SR. First up is MR.
To understand the code, let's first look at the math involved. Suppose we have two real numbers N1 and N2. In the IBM360 format these will be expressed as
S1 x 0.M1 x 16 ^ E1
S2 x 0.M2 x 16 ^ E2
The product will be:
S1 x 0.M1 x 16 ^ E1 x S2 x 0.M2 x 16 ^ E2
which is the same as:
(S1 x S2) x (0.M1 x 0.M2) x (16 ^ E1 x 16 ^ E2)
which is the same as:
(S1 x S2) x (0.M1 x 0.M2) x 16 ^ (E1 + E2)
This last formula is what the code calculates.
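As a quick sanity check of the math above, here is a minimal Python sketch. It assumes the usual IBM 360 short format (1 sign bit, 7-bit excess-64 exponent, 24-bit fraction); the function names are made up for illustration and do not appear in the disassembly.

```python
from fractions import Fraction

def decode_ibm360(word):
    """Split a 32-bit IBM 360 short float into (sign, fraction, exponent)."""
    sign = -1 if word & 0x80000000 else 1
    exponent = ((word >> 24) & 0x7F) - 64             # remove the excess-64 bias
    fraction = Fraction(word & 0x00FFFFFF, 1 << 24)   # 0.M as an exact fraction
    return sign, fraction, exponent

def value(word):
    """Exact value S x 0.M x 16^E of the encoded number."""
    sign, fraction, exponent = decode_ibm360(word)
    return sign * fraction * Fraction(16) ** exponent

def product_by_parts(w1, w2):
    """(S1 x S2) x (0.M1 x 0.M2) x 16^(E1 + E2), the last formula above."""
    s1, m1, e1 = decode_ibm360(w1)
    s2, m2, e2 = decode_ibm360(w2)
    return (s1 * s2) * (m1 * m2) * Fraction(16) ** (e1 + e2)
```

For example, >41100000 encodes +1.0 and >C1200000 encodes -2.0; multiplying the parts separately gives the same value as multiplying the decoded numbers.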
The code begins with:
; entry point for MR
08F4 C138 MOV *R8+, R4 ; is multiplier equal to zero?
08F6 1357 JEQ >09A6 ; yes: set FPAC to zero & finish
This handles the case where the accumulator is multiplied by zero: the result is zero.
Next comes a subroutine that handles the exponents and the sign bits:
08F8 06A0 BL @>0A4C ; separate & add exponents
08FC A187 DATA >A187 ; = "A R7, R6" (for MR add exponents)
08FE FFC0 DATA -64 ; = subtract double excess 64
The subroutine call is followed by two data words, which make the subroutine usable for both multiplication and division. The function of the data words will become clear when walking through the subroutine code.
; subroutine for MR and DR: calculate result exponent and sign
0A4C C000 MOV R0, R0 ; is FPAC zero?
0A4E 13C4 JEQ >09D8 ; yes: set flags & finish
0A50 C158 MOV *R8, R5 ; fetch 2nd word of operand
The subroutine starts by checking whether FPAC is zero (i.e. the multiplicand or the numerator is zero); in that case the result is zero too. Next, it fetches the second word of the operand, which had not been fetched earlier. The code can now rely on both FPAC (R0,R1) and the operand (R4,R5) being in standard normalized format. The first thing it does is separate the mantissas from the sign bits and exponents:
0A52 C180 MOV R0, R6 ; save exponents in R6 and R7
0A54 C1C4 MOV R4, R7
0A56 7000 SB R0, R0 ; remove exponents from mantissas
0A58 7104 SB R4, R4
The next thing is multiplying the sign bits:
0A5A C207 MOV R7, R8 ; figure out sign of result in R8
0A5C 2A06 XOR R6, R8
Multiplying two signs is the same as taking the exclusive OR of their sign bits: the result is negative only when exactly one of the inputs is negative. Note that the top bit in R8 will have the sign of the result, but the other 15 bits are not zero; they are simply meaningless to the multiplication. This is followed by placing the excess-64 exponents as proper integers in R6 and R7:
0A5E 06C6 SWPB R6 ; place FPAC exponent in R6
0A60 0246 ANDI R6, >007F
0A64 06C7 SWPB R7 ; place operand exponent in R7
0A66 0247 ANDI R7, >007F
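The field separation above can be mimicked in a few lines of Python. This is only a sketch of what the instructions do to the first word of each number; the helper names are hypothetical, and registers are modelled as plain 16-bit integers.

```python
def split_first_word(word):
    """Model MOV / SB / SWPB / ANDI on the first word of a number.

    Returns (mantissa_high_byte, sign_exp_copy, exponent).
    """
    sign_exp_copy = word & 0xFFFF            # MOV R0,R6: copy still holds sign + exponent
    mantissa_high = word & 0x00FF            # SB R0,R0: high byte minus itself gives zero
    exponent = (sign_exp_copy >> 8) & 0x7F   # SWPB + ANDI >007F: excess-64 exponent only
    return mantissa_high, sign_exp_copy, exponent

def result_sign_word(copy_a, copy_b):
    """Model MOV R7,R8 / XOR R6,R8: only the top bit of the result is meaningful."""
    return (copy_a ^ copy_b) & 0xFFFF
```

With >C120 (first word of -2.0) and >4110 (first word of +1.0), the exponents both come out as >41 and the top bit of the XOR result is set, so the product is negative.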
Now we are ready to add the two exponents together (or subtract them for division). This is where the two data words that followed the subroutine call are used:
0A6A 04BB X *R11+ ; MR: "A R7,R6", DR: "S R7,R6"
0A6C A1BB A *R11+, R6 ; MR: -64, DR: +64
0A6E 0286 CI R6, >007F ; exponent in range?
0A72 15D7 JGT >0A22 ; jump on overflow
0A74 1B9D JH >09B0 ; jump on underflow
First it executes the instruction in the first data word. For MR this is "A R7,R6", which adds the exponents. However, adding two excess-64 exponents includes the excess twice, so it must be removed once. The next data word contains -64, which is added to the sum, leaving the correct excess-64 exponent in R6. This is followed by a range check. Here the utility of the excess-64 encoding becomes clear: a valid result lies between 0 and >007F, so a single compare catches both errors. A sum above >007F is an overflow, and a negative sum (which is "higher" than >007F when read as an unsigned number) is an underflow. I'm not sure why the jump to overflow at >097C is done via >0A22: the real target is (just) within range. The subroutine finishes by merging the result sign bit back into the result exponent:
0A76 0A18 SLA R8, 1 ; put sign bit back in exponent
0A78 1702 JNC >0A7E
0A7A 0226 AI R6, >80
0A7E 045B RT
END OF SUBROUTINE
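The exponent handling of the subroutine can be summarized in a short Python sketch. The function name and the status strings are made up for illustration; the DR variant (subtract, then add 64) is taken from the comments in the listing above.

```python
def combine_exponents(exp_fpac, exp_operand, sign_word, divide=False):
    """Model the tail of the subroutine at >0A4C.

    exp_fpac, exp_operand: excess-64 exponents (0..127), as left in R6 and R7.
    sign_word: the XOR result in R8; only bit 15 matters.
    Returns (status, value), where value is the merged sign/exponent byte.
    """
    if divide:
        r6 = exp_fpac - exp_operand + 64   # DR: "S R7,R6", then add +64
    else:
        r6 = exp_fpac + exp_operand - 64   # MR: "A R7,R6", then add -64
    if r6 > 0x7F:                          # CI R6,>007F / JGT
        return ("overflow", None)
    if r6 < 0:                             # JH: negative is "higher" as unsigned
        return ("underflow", None)
    if sign_word & 0x8000:                 # SLA R8,1 moves the sign into carry
        r6 += 0x80                         # AI R6,>80
    return ("ok", r6)
```

For instance, multiplying two numbers that both have exponent >41 gives >42, and >C2 when the result is negative.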
Now we can go back to the main MR routine at >0900. This happens to be the 32 x 32 -> 64 bit multiplication routine that we already saw as part of the MM instruction:
0900 C085 MOV R5, R2 ; long multiply in four 16x16 bit steps
0902 3881 MPY R1, R2
0904 C205 MOV R5, R8
0906 3A00 MPY R0, R8
0908 C284 MOV R4, R10
090A 3A81 MPY R1, R10
090C 3804 MPY R4, R0
090E 002A AM R10, R8 ; add the partial results
0912 1701 JNC >0916
0914 0580 INC R0
0916 002A AM R8, R1
091A 1701 JNC >091E
091C 0580 INC R0
I'll not discuss it again; simply scroll up to the analysis of the MM instruction for details on the above code. In essence, it multiplies R0,R1 by R4,R5, leaving its result in R0..R3.
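For readers who do not want to scroll up, the following Python sketch mirrors the register flow of the listing: four 16x16 partial products, two 32-bit additions and two carry fix-ups. The function name is hypothetical, and registers are modelled as ordinary integers.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF

def mul32x32(a, b):
    """32 x 32 -> 64 bit multiply built from four 16 x 16 -> 32 bit products."""
    a_hi, a_lo = a >> 16, a & MASK16    # R0, R1
    b_hi, b_lo = b >> 16, b & MASK16    # R4, R5
    low = a_lo * b_lo                   # MPY R1,R2  -> R2:R3
    cross = a_hi * b_lo                 # MPY R0,R8  -> R8:R9
    cross2 = a_lo * b_hi                # MPY R1,R10 -> R10:R11
    high = a_hi * b_hi                  # MPY R4,R0  -> R0:R1
    r0, r1 = high >> 16, high & MASK16
    r2, r3 = low >> 16, low & MASK16
    cross += cross2                     # AM R10,R8
    if cross > MASK32:                  # JNC / INC R0
        cross &= MASK32
        r0 += 1
    middle = ((r1 << 16) | r2) + cross  # AM R8,R1
    if middle > MASK32:                 # JNC / INC R0
        middle &= MASK32
        r0 += 1
    r1, r2 = middle >> 16, middle & MASK16
    return (r0 << 48) | (r1 << 32) | (r2 << 16) | r3
```

The two cross products are aligned with R1:R2, which is why a carry out of either 32-bit addition lands in R0.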
Next comes the bit of MR code that was skipped in the MM discussion. That code is:
091E D186 MOVB R6, R6 ; is this a MR or MM instruction?
0920 1607 JNE >0930 ; jump if MM
0922 D001 MOVB R1, R0 ; MR: prenormalize mantissa
0924 06C0 SWPB R0
0926 D042 MOVB R2, R1
0928 06C1 SWPB R1
092A 06C2 SWPB R2
092C C0C6 MOV R6, R3
092E 10C2 JMP >08B4
First the flag byte in the upper half of R6 is checked. For MR this will be zero, as the exponent cannot be larger than >007F and, even with the sign bit merged in, the value fits in the lower byte.
Next the code pre-normalizes the mantissa by moving it two hex digits (one byte) to the left. The simple way to think about this is that we are multiplying two 24 bit mantissas into a 48 bit result. We are only interested in the top 24 bits of that result, and moving two digits to the left places these 24 bits in R0,R1, properly aligned for combination with the sign and exponent. The more precise way to think about this is that we are doing fixed point arithmetic here, and that a six digit shift right is needed to keep the radix point in the right place; shifting two hex digits to the left and taking the high two words is functionally the same (and leaves some extra digits available).
However, we are not done as it is possible that the first hex digit is still zero. This is easy to see when using two decimal examples:
0.10 x 0.10 = 0.01 and 0.99 x 0.99 = 0.98
Even though we have kept the decimal point in the right place, the first digit can still be zero in some cases. To normalize this there is a routine that is shared by the other arithmetical operations. This routine expects the result sign/exponent in R3 and so it is moved there first. It also expects the next hex digit in the top of R2.
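The pre-normalization can be checked with a small Python sketch. It treats R0..R3 as one 64-bit integer holding the 48-bit product and models the byte shuffling as a plain shift; the function name is hypothetical.

```python
def prenormalize(product48):
    """Model the MOVB/SWPB shuffle: shift the 64-bit result left one byte.

    Returns (mantissa, guard_digit): the 24-bit mantissa that ends up in
    R0,R1 and the next hex digit, which ends up in the top nibble of R2.
    """
    shifted = (product48 << 8) & 0xFFFFFFFFFFFFFFFF
    mantissa = shifted >> 32              # high two words: 00 p1 p2 p3
    guard_digit = (shifted >> 28) & 0xF   # first hex digit below the mantissa
    return mantissa, guard_digit
```

The hexadecimal counterpart of 0.10 x 0.10 is the mantissa pair >100000 and >100000: the result is >010000, with a leading zero digit that still has to be normalized away.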
The shared tail routine is:
; normalize FPAC mantissa (leftward)
08B4 0280 CI R0, >000F ; is the highest nibble 0?
08B8 1509 JGT >08CC ; no: mantissa is normalized
08BA 24E0 CZC @>0BD6, R3 ; exponent already 0?
08BE 1378 JEQ >09B0 ; yes: underflow
08C0 0603 DEC R3 ; reduce exponent & shift mantissa one nibble
08C2 001D SLAM R0,4
08C6 09C2 SRL R2, 12 ; shift in one nibble extra precision
08C8 A042 A R2, R1
08CA 10F4 JMP >08B4
0BD6 007F DATA >007F ; exponent bits
First it checks whether the first mantissa digit is zero. If it is not, the mantissa is already normalized. If it is, the code checks the exponent. If that is already zero, the mantissa cannot be shifted further: doing so would require reducing the exponent by one, which puts it out of range (the true exponent would move from -64 to -65). In that case an underflow is reported.
In the other case, the exponent is reduced by one and the mantissa is shifted left by one hex digit. To keep accuracy, a 'spare' extra digit of precision kept in R2 is shifted in. Because it is a common tail, the routine will check if further shifts are necessary, but in the case of MR it will only ever perform one shift: both normalized mantissas are at least 1/16, so their product is at least 1/256 and has at most one leading zero digit. After that, only merging the result exponent back in remains:
08CC 06C3 SWPB R3 ; merge exponent back in
08CE D003 MOVB R3, R0
08D0 1071 JMP >09B4 ; store FPAC & set status bits
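Putting the normalization loop and the final merge together, a rough Python model looks as follows. The name is made up for illustration, and returning None for underflow is a simplification of this sketch, not something the original code does (it sets a status bit instead).

```python
def normalize(mantissa, guard_digit, sign_exp):
    """Model the shared tail at >08B4 and the merge at >08CC.

    mantissa: 24-bit value in R0,R1 (top byte of R0 is zero).
    guard_digit: the extra hex digit held in the top nibble of R2.
    sign_exp: sign bit plus excess-64 exponent, as held in R3.
    Returns the packed 32-bit result, or None on underflow.
    """
    while mantissa <= 0x000FFFFF:        # CI R0,>000F: top digit still zero?
        if sign_exp & 0x7F == 0:         # CZC @>0BD6,R3: exponent bits all zero
            return None                  # underflow
        sign_exp -= 1                    # DEC R3
        mantissa = (mantissa << 4) | guard_digit   # SLAM R0,4 then A R2,R1
        guard_digit = 0                  # SRL R2,12 leaves no further digits
    return ((sign_exp & 0xFF) << 24) | mantissa    # SWPB R3 / MOVB R3,R0
```

As an example, 1.0 x 1.0 arrives here with mantissa >010000 and exponent >42; one shift turns that into >41100000, which is 1.0 again.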
The code for underflow is simple, and very similar to the code for overflow:
; underflow: additionally set AF status bit
09B0 026F ORI R15, >0800
<continues with normal exit code at >09B4>
Underflow only sets the arithmetic fault (AF) status bit. This allows the user program to distinguish overflow (C bit also set) from underflow.