Part 6 of 11 -- Simple Assembly for Atari BASIC - Various Bit Manipulations
Bit Operations
==============================================================
Part 1 - Introduction
http://atariage.com/forums/blog/576/entry-13175-part-1-of-11-simple-assembly-for-atari-basic/
Part 2 - Learn 82.7% of Assembly Language in About Three Pages
http://atariage.com/forums/blog/576/entry-13176-part-2-of-11-simple-assembly-for-atari-basic/
Part 3 - The World Inside a USR() Routine
http://atariage.com/forums/blog/576/entry-13177-part-3-of-11-simple-assembly-for-atari-basic/
Part 4 - Implement DPEEK()
http://atariage.com/forums/blog/576/entry-13178-part-4-of-11-simple-assembly-for-atari-basic/
Part 5 - Implement DPOKE
http://atariage.com/forums/blog/576/entry-13180-part-5-of-11-simple-assembly-for-atari-basic/
Part 6 - Various Bit Manipulations
http://atariage.com/forums/blog/576/entry-13181-part-6-of-11-simple-assembly-for-atari-basic/
Part 7 - Convert Integer to Hex String
http://atariage.com/forums/blog/576/entry-13182-part-7-of-11-simple-assembly-for-atari-basic/
Part 8 - Convert Integer to Bit String
http://atariage.com/forums/blog/576/entry-13183-part-8-of-11-simple-assembly-for-atari-basic/
Part 9 - Memory Copy
http://atariage.com/forums/blog/576/entry-13184-part-9-of-11-simple-assembly-for-atari-basic/
Part 10 - Binary File I/O Part 1 (XIO is Broken)
http://atariage.com/forums/blog/576/entry-13185-part-10-of-11-simple-assembly-for-atari-basic/
Part 11 - Binary File I/O Part 2 (XIO is Broken)
http://atariage.com/forums/blog/576/entry-13186-part-11-simple-assembly-for-atari-basic-the-end/
==============================================================
The fun and games of programming for the Atari environment often require manipulating hardware registers. Many features are enabled and disabled by individual bits in a byte. Additionally, the data used for many graphics features in the Atari 8-bit environment makes more sense when dealt with at the bit level. For instance, a character on the Atari's normal text screen (ANTIC Mode 2) can be flipped between inverse video and normal video by switching the high bit in the character byte on and off. The same bit in ANTIC modes 4 and 5 chooses between the COLPF2 and COLPF3 colors for certain bit patterns in the character data. The text color in ANTIC modes 6 and 7 is controlled by manipulating the two highest bits of the character byte. Pixels in a map mode are specified by single bits, pairs of bits, or groups of four bits.
Bit-wise operations, simple in 6502 machine code, are painful and very slow in BASIC. Isolating a byte's individual bits in BASIC code is a horrible sight to behold. This is nothing like the two-byte DPeek and DPoke that BASIC can duplicate (slowly) with a multiply or divide. Bit work in BASIC requires repeated value testing and computation to identify which bits are set and which are clear. A simple example of the terror is presented here:
100 REM DISSECT BITS IN 16-BIT WORD
105 REM SHOWBITS.BAS
110 REM
115 WORD=35235:REM $89A3
120 BVAL=32768:REM $8000
125 ? "WORD = ";WORD
130 ? "BITS =";
135 FOR BIT=0 TO 15
140 IF INT(BIT/4)=BIT/4 THEN ? " ";
145 IF WORD>=BVAL THEN ? "1";:WORD=WORD-BVAL:GOTO 155
150 ? "0";
155 BVAL=BVAL/2
160 NEXT BIT
165 ?
170 END
The code starts with the value of the highest (that is, 16th) bit of a word, 32768. If the remaining word value is greater than or equal to the bit value then the bit is set, and the bit value is subtracted from the word so the lower bits can be tested the same way. If not, then the bit is 0. Then the value of the bit position is divided by 2 to get the value of the next position. This evaluation and test action loops 16 times until all the bits are evaluated. One optional extra is inserted: before every four bits it outputs a space to separate the bits into groups corresponding to nybbles (half bytes).
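For readers who want to trace the logic away from the Atari, the same compare, subtract, and halve scan can be modeled in a few lines of Python. This is only an illustrative sketch of the algorithm in SHOWBITS.BAS, not Atari code, and the function name show_bits is made up for the example.

```python
def show_bits(word):
    """Dissect a 16-bit word the way SHOWBITS.BAS does: compare, subtract, halve."""
    bval = 32768  # value of the highest bit ($8000)
    out = ""
    for bit in range(16):
        if bit % 4 == 0:
            out += " "  # space before each nybble, like line 140
        if word >= bval:
            out += "1"
            word -= bval  # remove this bit's value before testing the next one
        else:
            out += "0"
        bval //= 2  # move to the next lower bit position
    return out


print("WORD = 35235")
print("BITS =" + show_bits(35235))
```

Even in this form the loop does a comparison, a subtraction, and a division for every bit, which is the work the machine language routine avoids.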
In terms of code this is about as simple as it gets. The routine could be made faster by packaging the bit values in an array and testing from the array. But the same problem remains: BASIC is looping and performing multiple operations in each loop. In this simple example it is scanning one 16-bit integer. Implementing useful bit operations requires dissecting two 16-bit integers and assembling output in another 16-bit integer. We really don't want to go there in BASIC.
In defense of Atari BASIC, most other BASIC languages also lack bit operations, because the original purpose of BASIC was to protect beginner programmers from the hardware. But we're working on an Atari here and want to embrace the hardware.
OSS BASIC XL provides operator symbols for bitwise AND, OR, and EOR.
The 6502 supports these operations and also supports other bit manipulations that BASIC XL (and other more modern BASICs) do not support. Shifting bits and rotating bits in a byte are simple in the 6502. These could be interesting in a utility for BASIC. So, we will plan for machine language routines that do bitwise AND, OR, EOR operations and also support left and right shift, and left and right rotate.
Since the USR() environment automatically provides 16-bit argument values to the machine language code, a choice needs to be made: implement the routines to work on 16-bit values or just a single byte. In the case of AND, OR, and EOR there's little difference between supporting a byte or 16-bit operation. If the supplied argument values are only 8 bits (with 0 as the high byte) then the resulting computation will also appear to be 8 bits.
Rotating and shifting bits is a different situation. The result of shifting or rotating an 8-bit value is different from the same action on a 16-bit value. In most situations the Atari environment uses 8 bits as data (character sets, map graphics, player/missile data), so these bit operations are more likely to be useful on just a byte.
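The difference between byte-wide and word-wide shifting is easy to see with a small Python model. This is a sketch for illustration only; shift_left and its width parameter are hypothetical names, not part of the BITS utility.

```python
def shift_left(value, width):
    """Shift left by one bit, discarding anything that falls off the top."""
    mask = (1 << width) - 1
    return (value << 1) & mask


v = 0x81  # 129, bits 7 and 0 set

print(shift_left(v, 8))    # treated as a byte: bit 7 is lost
print(shift_left(v, 16))   # treated as a word: bit 7 moves into the high byte
print(0x0081 & 0x00FF)     # AND of two byte-sized values stays byte-sized
```

The same input gives 2 as a byte and 258 as a word, while the logical operation gives the same answer either way.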
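Below is the list of operations to design. Each operation has two values, an operator, and a return value:
Result = Value1 OR Value2 (16-bit values)
Result = Value1 AND Value2 (16-bit values)
Result = Value1 EOR Value2 (16-bit values)
Result = Value1 shifted right (LSR) by Value2 bits (8-bit value)
Result = Value1 shifted left (LSL) by Value2 bits (8-bit value)
Result = Value1 rotated right (ROR) by Value2 bits (8-bit value)
Result = Value1 rotated left (ROL) by Value2 bits (8-bit value)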
Given the earlier DPeek and DPoke examples, most readers are probably not looking forward to the tedious repetition of the argument setup and validation code for seven similar, separate routines. Nothing like that will happen. Instead, all the operations can be combined into one utility. This is simple (and sensible) to do, because all these operations use the same number of arguments and all return a value to BASIC.
Since BASIC converts all arguments to 16-bit values, the operations that work on 8-bit values will use only the low byte of the value and ignore the high byte. The reader will be relieved to learn there will be just one instance of the initialization code for managing the stack and arguments. This is where the perceived overkill factor of the generic initialization code pays off.
The next problem is how to choose one of seven different operations in one machine language utility. There are always several ways to solve a problem. One possibility is that each routine will have its own entry point relative to the start. Each bit operation routine would have to call the initialization and then somehow find its way back to the proper bit operation code. This has the problem that it is difficult to share the initialization code between all the operations and keep the routines relocatable. Simple relative branching keeps the code relocatable, but using JSR (Jump To Subroutine) to call the initialization code or the actual routines makes the code non-relocatable without a heap of other work.
Another possibility is that the machine language routine uses the value identifying the bit operation as the basis for branching to the specific routine. This has more promise. It means the machine language code will test the argument for each bit operation identifier and then branch accordingly. A table of addresses for JMP or JSR would be faster, but it would not be easily relocatable.
Keeping the code relocatable requires that the entry for every routine fall within branching range of the decision-making code. A 6502 branch can reach 128 bytes backward or 127 bytes forward, measured from the instruction that follows the branch. The assembler is always more than happy to tell the programmer when target code is out of range of a branch. These bit-wise features are simple concepts for 6502 machine language, so the routines will not be large, but even if they were, there are ways around the problem of the half-page branch distance.
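To make the branch range concrete, here is a small Python sketch of how a relative branch operand is computed. The function name branch_offset and the sample addresses are assumptions chosen for illustration; a real assembler does this calculation for you.

```python
def branch_offset(branch_addr, target_addr):
    """Return the operand byte for a 6502 relative branch, or raise if out of range."""
    # The offset is measured from the address after the 2-byte branch instruction.
    delta = target_addr - (branch_addr + 2)
    if not -128 <= delta <= 127:
        raise ValueError("branch out of range: %d" % delta)
    return delta & 0xFF  # stored as a two's complement byte


print(branch_offset(0x9507, 0x9534))  # forward branch
print(branch_offset(0x9511, 0x950F))  # backward branch
# Moving both addresses by the same amount leaves the operand unchanged.
print(branch_offset(0x0607, 0x0634))
```

Because only the distance is stored, the operand is the same wherever the code is loaded. That is why branches keep a routine relocatable while JMP and JSR, which store absolute addresses, do not.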
BITS in Mac/65 Assembler Code
1000 ; BITS.M65
1005 ;
1010 ; Perform bit level operations
1015 ; for Atari BASIC.
1020 ;
1025 ; USR 3 arguments:
1030 ; Oper == Operation (1,2,...7)
1035 ; Val1 == First value.
1040 ; Val2 == Second Value.
1045 ;
1050 ; Operations:
1055 ; 1 = 16-bit Val1 OR 16-bit Val2
1060 ; 2 = 16-bit Val1 AND 16-bit Val2
1065 ; 3 = 16-bit Val1 EOR 16-bit Val2
1070 ; 4 = 8-bit Val1 LSR 8-bit num bits Val2
1075 ; 5 = 8-bit Val1 LSL 8-bit num bits Val2
1080 ; 6 = 8-bit Val1 ROR 8-bit num bits Val2
1085 ; 7 = 8-bit Val1 ROL 8-bit num bits Val2
1090 ;
1095 ; USR return value is the result.
1100 ;
1105 ; Use the FR0/FRE FP registers.
1110 ; The return value for BASIC
1115 ; goes in FR0.
1120 ; No FP is used so all of FR0
1125 ; (and more FP registers) can
1130 ; be considered available.
1135 ;
1140 ZRET = $D4 ; FR0 $D4/$D5 Return value
1145 ZARGS = $D5 ; $D6-1 for arg Pulldown loop
1150 ZVAL2 = $D6 ; FR0 $D6/$D7 Value
1155 ZVAL1 = $D8 ; FR0 $D8/$D9 Value
1160 ZOPER = $DA ; FRE $DA/$DB Operation
1165 ;
1170 ; Define operations
1175 ;
1180 OPER_OR = $01
1185 OPER_AND = $02
1190 OPER_EOR = $03
1195 OPER_LSR = $04
1200 OPER_LSL = $05
1205 OPER_ROR = $06
1210 OPER_ROL = $07
1215 ;
1220   .OPT OBJ
1225 ;
1230 ; Arbitrary. This is relocatable.
1235 ;
1240   *= $9500
1245 ;
1250 INIT
1255   LDA #$00 ; Make sure return
1260   STA ZRET ; value is cleared
1265   STA ZRET+1 ; by default.
1270   PLA ; Get argument count
1275   TAY
1280   BEQ EXIT ; Shortcut for no args
1285   ASL A ; Now number of bytes
1290   TAY
1295   CMP #$06 ; Value1, Value2, Oper
1300   BEQ PULLDOWN
1305 ;
1310 ; Bad args. Clean up for exit.
1315 ;
1320 DISPOSE ; any number of args
1325   PLA
1330   DEY
1335   BNE DISPOSE
1340   RTS ; Abandon ship
1345 ;
1350 ; This code works the same
1355 ; for 1, 4, 8 ... arguments.
1360 ;
1365 PULLDOWN
1370   PLA
1375   STA ZARGS,Y
1380   DEY
1385   BNE PULLDOWN
1390 ;
1395 ; Like a Switch/Case
1400 ;
1405   LDY ZOPER
1410   BEQ EXIT ; Zero operator
1415 ;
1420   DEY ; #$01 OR
1425   BEQ DO_OR
1430 ;
1435   DEY ; #$02 AND
1440   BEQ DO_AND
1445 ;
1450   DEY ; #$03 EOR
1455   BEQ DO_EOR
1460 ;
1465   DEY ; #$04 LSR
1470   BEQ DO_LSR
1475 ;
1480   DEY ; #$05 LSL
1485   BEQ DO_LSL
1490 ;
1495   DEY ; #$06 ROR
1500   BEQ DO_ROR
1505 ;
1510   DEY ; #$07 ROL
1515   BEQ DO_ROL
1520 ;
1525 EXIT
1530   RTS ; bye.
1535 ;
1540 ; 1 = 16-bit Val1 OR 16-bit Val2
1545 DO_OR
1550   LDA ZVAL1
1555   ORA ZVAL2
1560   STA ZRET
1565   LDA ZVAL1+1
1570   ORA ZVAL2+1
1575   STA ZRET+1
1580   RTS
1585 ;
1590 ; 2 = 16-bit Val1 AND 16-bit Val2
1595 DO_AND
1600   LDA ZVAL1
1605   AND ZVAL2
1610   STA ZRET
1615   LDA ZVAL1+1
1620   AND ZVAL2+1
1625   STA ZRET+1
1630   RTS
1635 ;
1640 ; 3 = 16-bit Val1 EOR 16-bit Val2
1645 DO_EOR
1650   LDA ZVAL1
1655   EOR ZVAL2
1660   STA ZRET
1665   LDA ZVAL1+1
1670   EOR ZVAL2+1
1675   STA ZRET+1
1680   RTS
1685 ;
1690 ; 4 = 8-bit Val1 LSR 8-bit num bits Val2
1695 DO_LSR
1700   LDA ZVAL1
1705   STA ZRET
1710   LDX ZVAL2
1715   BEQ EXIT
1720 LSR_LOOP
1725   LSR A
1730   DEX
1735   BNE LSR_LOOP
1740   STA ZRET
1745   RTS
1750 ;
1755 ; 5 = 8-bit Val1 LSL 8-bit num bits Val2
1760 DO_LSL
1765   LDA ZVAL1
1770   STA ZRET
1775   LDX ZVAL2
1780   BEQ EXIT
1785 LSL_LOOP
1790   ASL A
1795   DEX
1800   BNE LSL_LOOP
1805   STA ZRET
1810   RTS
1815 ;
1820 ; 6 = 8-bit Val1 ROR 8-bit num bits Val2
1825 DO_ROR
1830   LDA ZVAL1
1835   STA ZRET
1840   LDX ZVAL2
1845   BEQ EXIT
1850 ROR_LOOP
1855   CLC
1860   ROR A
1865   BCC OVER_ROR_INC
1870   ORA #$80
1875 OVER_ROR_INC
1880   DEX
1885   BNE ROR_LOOP
1890   STA ZRET
1895   RTS
1900 ;
1905 ; 7 = 8-bit Val1 ROL 8-bit num bits Val2
1910 DO_ROL
1915   LDA ZVAL1
1920   STA ZRET
1925   LDX ZVAL2
1930   BEQ EXIT
1935 ROL_LOOP
1940   CLC
1945   ROL A
1950   BCC OVER_ROL_INC
1955   ORA #$01
1960 OVER_ROL_INC
1965   DEX
1970   BNE ROL_LOOP
1975   STA ZRET
1980   RTS
1985 ;
1990   .END
Now for the take-apart. The common initialization has these extra lines of code:
1255   LDA #$00 ; Make sure return
1260   STA ZRET ; value is cleared
1265   STA ZRET+1 ; by default.
The program starts by clearing the response data. Since early exit may occur for different reasons in different places of the code, it makes sense to do this once rather than in several different places. When the code finds an error it can just return (RTS) to BASIC.
There is nothing new with the DISPOSE and PULLDOWN loops – they are the same as prior examples.
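As a reminder of what the PULLDOWN loop does with three arguments, here is a Python model of it. This sketch assumes the stack order used in the earlier parts of this series: the argument count first, then each argument with its high byte pulled before its low byte. The function name pulldown and the dictionary standing in for page zero are made up for the example.

```python
def pulldown(stack):
    """Model PLA / STA ZARGS,Y / DEY / BNE for a USR() call."""
    ZARGS = 0xD5                  # one byte below the first argument slot
    zero_page = {}
    count = stack.pop(0)          # PLA: argument count
    y = count * 2                 # ASL A / TAY: number of bytes to pull
    while y != 0:
        zero_page[ZARGS + y] = stack.pop(0)  # PLA / STA ZARGS,Y
        y -= 1                    # DEY / BNE PULLDOWN
    return zero_page


# Assumed stack contents for USR(BITS,7,129,3): count, then high/low byte pairs.
zp = pulldown([3, 0x00, 0x07, 0x00, 0x81, 0x00, 0x03])
for addr in sorted(zp):
    print("$%02X = $%02X" % (addr, zp[addr]))
```

Counting Y down from 6 to 1 places the first argument at the highest address, so Oper lands in $DA/$DB, Val1 in $D8/$D9, and Val2 in $D6/$D7, each in normal low byte, high byte order.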
The next section is a simple switch/case-like decision block that identifies which operator to use:
1395 ; Like a Switch/Case
1400 ;
1405   LDY ZOPER
1410   BEQ EXIT ; Zero operator
1415 ;
1420   DEY ; #$01 OR
1425   BEQ DO_OR
1430 ;
1435   DEY ; #$02 AND
1440   BEQ DO_AND
. . .
1510   DEY ; #$07 ROL
1515   BEQ DO_ROL
Zero is not a valid operation identifier, so that can be discarded quickly.
Ordinarily, testing a list of values would look something like this, which requires four bytes for each test:
  CPY #$01
  BEQ DO_OR
Since operator identifiers are sequential beginning from 1 the code can use a shortcut – it decrements the operator identifier and branches when reaching zero to the appropriate routine. This is still two instructions, but only three bytes (for a total of 21 bytes rather than 28 bytes to test 7 values).
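The decrement-and-branch chain can be modeled in Python to see how each operator value falls through to its routine. This is only a sketch of the control flow; the function name dispatch is hypothetical, and the routine names are simply the labels from the listing.

```python
def dispatch(oper):
    """Model the LDY / BEQ / DEY / BEQ chain; return the routine label or None."""
    labels = ["DO_OR", "DO_AND", "DO_EOR", "DO_LSR", "DO_LSL", "DO_ROR", "DO_ROL"]
    y = oper & 0xFF          # LDY ZOPER loads only the low byte
    if y == 0:               # BEQ EXIT
        return None
    for label in labels:
        y = (y - 1) & 0xFF   # DEY
        if y == 0:           # BEQ DO_xxx
            return label
    return None              # no match: falls through to EXIT


# Byte cost of the two approaches for seven tests.
dey_chain = 7 * (1 + 2)      # DEY is 1 byte, BEQ is 2 bytes
cpy_chain = 7 * (2 + 2)      # CPY #n is 2 bytes, BEQ is 2 bytes

for n in range(0, 9):
    print(n, dispatch(n))
print(dey_chain, cpy_chain)
```

Values 0 and 8 match nothing, so the routine returns to BASIC with the cleared return value.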
The OR, AND, and EOR routines are similar:
1540 ; 1 = 16-bit Val1 OR 16-bit Val2
1545 DO_OR
1550   LDA ZVAL1
1555   ORA ZVAL2 ; <- OR
1560   STA ZRET
1565   LDA ZVAL1+1
1570   ORA ZVAL2+1 ; <- OR
1575   STA ZRET+1
1580   RTS
The only difference among the three routines is the 6502 instruction (ORA, AND, EOR) that combines Value 1 and Value 2. Each routine stores the result in the return value for BASIC.
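The byte-at-a-time approach can be checked with a short Python model. This is an illustrative sketch, not the utility itself; bits_logic is a made-up name, and the values 257 and 258 are taken from the test program's table.

```python
def bits_logic(op, val1, val2):
    """Combine two 16-bit values one byte at a time, as DO_OR/DO_AND/DO_EOR do."""
    ops = {
        "OR": lambda a, b: a | b,
        "AND": lambda a, b: a & b,
        "EOR": lambda a, b: a ^ b,
    }
    f = ops[op]
    lo = f(val1 & 0xFF, val2 & 0xFF)                # LDA ZVAL1 / op ZVAL2 / STA ZRET
    hi = f((val1 >> 8) & 0xFF, (val2 >> 8) & 0xFF)  # same again for the +1 bytes
    return (hi << 8) | lo


for op in ("OR", "AND", "EOR"):
    print(257, op, 258, "=", bits_logic(op, 257, 258))
```

Since these operations never carry from one bit to the next, handling the low and high bytes separately gives the same answer as a true 16-bit operation.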
The shift routines must loop since the 6502 shift instructions move the data by only one bit:
1690 ; 4 = 8-bit Val1 LSR 8-bit num bits Val2
1695 DO_LSR
1700   LDA ZVAL1
1705   STA ZRET
1710   LDX ZVAL2
1715   BEQ EXIT
1720 LSR_LOOP
1725   LSR A ; <- Can only move bits one position at a time
1730   DEX
1735   BNE LSR_LOOP
1740   STA ZRET
1745   RTS
Each routine loops to shift only the low byte of Value 1 by the number of bits in Value 2 with the result output in the low byte of the return value for BASIC. Since the initialization cleared the entire return value, these routines do not need to worry about clearing the return value's high byte.
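A Python model of the two shift loops shows what BASIC should get back. This is a sketch under the assumption that only the low bytes of Value 1 and Value 2 matter, as in the listing; shift8 and its left flag are hypothetical names.

```python
def shift8(val1, count, left):
    """Model DO_LSL (left=True) or DO_LSR (left=False) on the low byte of val1."""
    a = val1 & 0xFF               # LDA ZVAL1: low byte only
    x = count & 0xFF              # LDX ZVAL2: low byte only
    while x != 0:                 # BEQ EXIT skips the loop for a zero count
        if left:
            a = (a << 1) & 0xFF   # ASL A: bit 7 falls off the top
        else:
            a >>= 1               # LSR A: bit 0 falls off the bottom
        x -= 1                    # DEX / BNE loop
    return a                      # STA ZRET; the high byte was already cleared


for n in range(9):
    print(129, "LSR", n, "=", shift8(129, n, False),
          "  ", 129, "LSL", n, "=", shift8(129, n, True))
```

A count of zero returns the byte unchanged, and a count of eight or more always returns zero because every original bit has been shifted out.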
The bit rotation routines:
1820 ; 6 = 8-bit Val1 ROR 8-bit num bits Val2
1825 DO_ROR
1830   LDA ZVAL1
1835   STA ZRET
1840   LDX ZVAL2
1845   BEQ EXIT
1850 ROR_LOOP
1855   CLC
1860   ROR A
1865   BCC OVER_ROR_INC
1870   ORA #$80
1875 OVER_ROR_INC
1880   DEX
1885   BNE ROR_LOOP
1890   STA ZRET
1895   RTS
This is a little more complicated. The 6502 bit rotations can be thought of as operations using nine bits, not eight. The Carry bit does double duty. It provides the source bit value to move into the byte and then accepts the value of the bit that was rotated out of the byte. The problem here is that the bit moving out of the byte has to immediately be moved back into the other end of the byte. The code solves this as follows: it forces the Carry bit to zero (CLC), ensuring a zero bit always rotates into the byte. After the rotation the code checks the bit that rotated out of the byte (in Carry). If the Carry bit is set then the code turns on the bit that should have rotated into the other end of the byte (here, ORA #$80).
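The clear-carry-then-patch trick can be traced step by step in Python. This is a sketch that models the Carry flag as a plain variable; ror8, rol8, and rotate8 are made-up names for illustration.

```python
def ror8(a):
    """One pass of ROR_LOOP: CLC, ROR A, then patch bit 7 if Carry came out set."""
    carry_in = 0                      # CLC
    carry_out = a & 0x01              # bit 0 rotates out into Carry
    a = (carry_in << 7) | (a >> 1)    # ROR A: the old Carry enters bit 7
    if carry_out:                     # BCC skips the patch when Carry is clear
        a |= 0x80                     # ORA #$80
    return a


def rol8(a):
    """One pass of ROL_LOOP: CLC, ROL A, then patch bit 0 if Carry came out set."""
    carry_in = 0                      # CLC
    carry_out = (a >> 7) & 0x01       # bit 7 rotates out into Carry
    a = ((a << 1) & 0xFF) | carry_in  # ROL A: the old Carry enters bit 0
    if carry_out:
        a |= 0x01                     # ORA #$01
    return a


def rotate8(val1, count, left):
    a = val1 & 0xFF
    for _ in range(count & 0xFF):
        a = rol8(a) if left else ror8(a)
    return a


for n in range(9):
    print(129, "ROR", n, "=", rotate8(129, n, False),
          "  ", 129, "ROL", n, "=", rotate8(129, n, True))
```

Unlike the shifts, no bits are lost, so eight rotations in either direction bring the byte back to its starting value.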
Testing BITS
The Atari BASIC program below, TESTBITS.BAS, demonstrates the routines:
100 REM TEST BIT OPERATIONS UTILITY
105 REM
110 GRAPHICS 0:POKE 710,0:POKE 82,0
115 DIM B(3,1),N$(21)
120 GOSUB 10000:REM BITS UTILITY
125 RESTORE 155:REM BIT PATTERNS
130 FOR X=0 TO 1
135 FOR Y=0 TO 3
140 READ D:B(Y,X)=D
145 NEXT Y
150 NEXT X
155 DATA 0,257,0,257,0,0,258,258
160 REM ENUMERATE BIT OPERATIONS
165 READ BOR,BAND,BEOR,BLSR,BLSL,BROR,BROL
170 DATA 1,2,3,4,5,6,7
175 N$="OR ANDEORLSRLSLRORROL"
180 REM
185 REM TEST OR, AND, EOR
190 REM
195 FOR OPER=BOR TO BEOR
200 FOR Y=0 TO 3
205 V=USR(BITS,OPER,B(Y,0),B(Y,1))
210 ? B(Y,0);" ";
215 ? N$(OPER*3-2,OPER*3);" ";
220 ? B(Y,1);" = ";V
225 NEXT Y
230 ?
235 NEXT OPER
240 REM
245 REM TEST SHIFT AND ROTATE
250 REM
255 V=129:REM $81
260 FOR OPER=BLSR TO BROL
265 FOR Y=0 TO 8
270 W=USR(BITS,OPER,V,Y)
275 ? V;" ";
280 ? N$(OPER*3-2,OPER*3);" ";
285 ? Y;" = ";W
290 NEXT Y
295 ?
300 NEXT OPER
305 END
9997 REM
9998 REM SETUP BITS ML UTILITY
9999 REM
10000 DIM BT$(162)
10001 BITS=ADR(BT$)
10002 RESTORE 27000
10003 FOR I=0 TO 161
10004 READ D:POKE BITS+I,D
10005 NEXT I
10006 RETURN
26996 REM H1:BITS.OBJ
26997 REM Size = 162
26998 REM Start = 38144
26999 REM End = 38305
27000 DATA 169,0,133,212,133,213,104,240
27001 DATA 43,10,168,201,6,240,5,104
27002 DATA 136,208,252,96,104,153,213,0
27003 DATA 136,208,249,164,218,240,21,136
27004 DATA 240,19,136,240,29,136,240,39
27005 DATA 136,240,49,136,240,61,136,240
27006 DATA 73,136,240,90,96,165,216,5
27007 DATA 214,133,212,165,217,5,215,133
27008 DATA 213,96,165,216,37,214,133,212
27009 DATA 165,217,37,215,133,213,96,165
27010 DATA 216,69,214,133,212,165,217,69
27011 DATA 215,133,213,96,165,216,133,212
27012 DATA 166,214,240,208,74,202,208,252
27013 DATA 133,212,96,165,216,133,212,166
27014 DATA 214,240,193,10,202,208,252,133
27015 DATA 212,96,165,216,133,212,166,214
27016 DATA 240,178,24,106,144,2,9,128
27017 DATA 202,208,247,133,212,96,165,216
27018 DATA 133,212,166,214,240,158,24,42
27019 DATA 144,2,9,1,202,208,247,133
27020 DATA 212,96
The program starts by building a table of bit combinations for testing OR, AND, and EOR. The non-zero decimal values, 257 ($0101) and 258 ($0102), each have one bit set in the high byte and one bit set in the low byte. The high byte bit is the same in both values, but the low byte bits are different. This helps demonstrate that the bit operations are working on both the high bytes and low bytes, and that the bits are combined as expected by the operation.
The code loops through the OR, AND, and EOR operations, and for each operation loops through the table of bits displaying the results of the operations.
The next section loops through the shift and rotate operations, and for each operation it performs shifts of zero through eight bits, displaying the results.
Test program output:
That shows the operations are working as expected. That is, assuming one is willing to sit and visualize the bit patterns corresponding to each decimal value. For these purposes the output is less than ideal. Atari BASIC prints numbers as decimals. There is no built-in option to display bytes and integer values in ways more meaningful for 8-bit computer programming. There must be a better way… (Stay tuned for the next episode.)
Below are the source files and examples of how to load the machine language routine into BASIC included in the disk image and archive:
BITS File List:
BITS.M65 Saved Mac/65 source
BITS.L65 Mac/65 source listing
BITS.T65 Mac/65 source listed to H6: (linux)
BITS.ASM Mac/65 assembly listing
BITS.TSM Mac/65 assembly listing to H6: (linux)
BITS.OBJ Mac/65 assembled machine language program (with load segments)
BITS.BIN Assembled machine language program without load segments
BITS.LIS LISTed DATA statements for BITS.BIN routine.
BITS.TLS LISTed DATA statements for BITS.BIN routine to H6: (linux)
MAKEBITS.BAS BASIC program to create the BITS.BIN file. This also contains the BITS routine in DATA statements.
MAKEBITS.LIS LISTed version of MAKEBITS.BAS
MAKEBITS.TLS LISTed version of MAKEBITS.BAS to H6: (linux)
SHOWBITS.BAS Example BASIC program dissecting bits in a 16-bit integer.
SHOWBITS.LIS LISTed version of SHOWBITS.BAS
SHOWBITS.TLS LISTed version of SHOWBITS.BAS to H6: (linux)
TESTBITS.BAS BASIC program that tests the BITS USR() routines.
TESTBITS.LIS LISTed version of TESTBITS.BAS.
TESTBITS.TLS LISTed version of TESTBITS.BAS to H6: (linux)
ZIP archive of files:
Bits_Disk.zip
Tar archive of files (remove the .zip after download)
Bits_Disk.tgz.zip
For everyone who has been born of God overcomes the world. And this is the victory that has overcome the world—our faith.
1 John 5:4