Leaderboard

APX Pascal Architecture, part one

Bill Lange has been blogging about Atari Pascal since early February at https://insideataripascal.blogspot.com/, so here's my own small contribution after spending an afternoon poking around in APX Pascal and looking for the core interpreter.

If we look at the PASCAL runtime on the APX Pascal disk, it's a simple enough image. It loads itself from disk to $3300-$59FF and then starts running from $3300. So what does that initial bootstrap code do? Here's the preamble:

    3300: A2 00     LDX #0
    3302: A9 0C     LDA #$0C
    3304: 9D 42 03  STA ICCMD,X
    3307: 20 56 E4  JSR CIOV
    330A: AD A7 33  LDA $33A7
    330D: 85 F0     STA $F0
    330F: AD A8 33  LDA $33A7+1
    3312: 85 F1     STA $F0+1
    3314: AD A9 33  LDA $33A9
    3317: 85 F2     STA $F2
    3319: AD AA 33  LDA $33A9+1
    331C: 85 F3     STA $F2+1
    331E: AD AB 33  LDA $33AB
    3321: 85 F4     STA $F4
    3323: AD AC 33  LDA $33AB+1
    3326: 85 F5     STA $F4+1
    3328: 20 73 33  JSR $3373

This code closes IOCB #0, then sets up $F0-$F5 using values at $33A7-$33AC and then calls a subroutine. Those values are:

    33A7: 00 3A     .WORD $3A00 ; source
    33A9: 00 A0     .WORD $A000 ; destination
    33AB: 00 20     .WORD $2000 ; count
    33AD: 00 1D     .WORD $1D00

And the subroutine looks like:

    3373: A0 00     LDY #0
    3375: B1 F0     LDA ($F0),Y
    3377: 91 F2     STA ($F2),Y
    3379: A5 F0     LDA $F0
    337B: 18        CLC
    337C: 69 01     ADC #1
    337E: 85 F0     STA $F0
    3380: A5 F1     LDA $F0+1
    3382: 69 00     ADC #0
    3384: 85 F1     STA $F0+1
    3386: A5 F2     LDA $F2
    3388: 18        CLC
    3389: 69 01     ADC #1
    338B: 85 F2     STA $F2
    338D: A5 F3     LDA $F2+1
    338F: 69 00     ADC #0
    3391: 85 F3     STA $F2+1
    3393: A5 F4     LDA $F4
    3395: 38        SEC
    3396: E9 01     SBC #1
    3398: 85 F4     STA $F4
    339A: A5 F5     LDA $F4+1
    339C: E9 00     SBC #0
    339E: 85 F5     STA $F4+1
    33A0: A5 F4     LDA $F4
    33A2: 05 F5     ORA $F4+1
    33A4: D0 CF     BNE $3375
    33A6: 60        RTS

This is just a block copy routine, which relocates all the code at $3A00-$59FF to $A000-$BFFF. This block of code (which is most of the PASCAL executable) is the actual runtime. A simpler way to do this would have been to use a multi-segment load file, but this works well enough.
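For readers who don't read 6502 fluently, here is a rough Python model of the copy loop at $3373. It is only a sketch: the function name `block_copy`, the flat 64K `bytearray` and the test pattern are my own illustration, not anything from the disk image. The three arguments stand in for the words held at $F0, $F2 and $F4.

```python
def block_copy(mem, src, dst, count):
    """Byte-at-a-time copy modelled on the loop at $3373.

    src, dst and count stand in for the 16-bit words at $F0, $F2 and $F4.
    Like the original, the count is only tested after a byte has been
    copied, so the loop always copies at least one byte.
    """
    while True:
        mem[dst] = mem[src]              # LDA ($F0),Y / STA ($F2),Y
        src = (src + 1) & 0xFFFF         # 16-bit increment of $F0,$F1
        dst = (dst + 1) & 0xFFFF         # 16-bit increment of $F2,$F3
        count = (count - 1) & 0xFFFF     # 16-bit decrement of $F4,$F5
        if count == 0:                   # LDA $F4 / ORA $F5 / BNE $3375
            break
    return src, dst


# Hypothetical 64K address space with a recognisable pattern at $3A00.
mem = bytearray(0x10000)
mem[0x3A00:0x5A00] = bytes(i & 0xFF for i in range(0x2000))

# The parameters found at $33A7-$33AC: source, destination, count.
end_src, end_dst = block_copy(mem, 0x3A00, 0xA000, 0x2000)
```

With the parameters from the table, the loop finishes with the source pointer at $5A00 and the destination pointer at $C000, which matches the $3A00-$59FF to $A000-$BFFF relocation described above.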
$A000-$BFFF is the cartridge address space for an 8K cart, so clearly this was intended at one point to be shipped as a cartridge. What happens next:

    332B: AD AD 33  LDA $33AD
    332E: 85 80     STA $80
    3330: AD AE 33  LDA $33AE
    3333: 85 81     STA $81
    3335: A9 00     LDA #0
    3337: 85 82     STA $82
    3339: 85 83     STA $83
    333B: 20 00 A2  JSR $A200
    ...
    A200: 4C 12 B8  JMP $B812

This copies the word in $33AD ($1D00) to $80,$81 and zeros $82,$83 before invoking a routine at $A200, which vectors to $B812. That routine does several things, including:

    B834: A5 80     LDA $80
    B836: 85 D0     STA $D0
    B838: A5 81     LDA $81
    B83A: 85 D1     STA $D1
    B83C: AD 78 A2  LDA $A278
    B83F: 85 CE     STA $CE
    B841: AD 79 A2  LDA $A278+1
    B844: 85 CF     STA $CE+1
    B846: AD 7A A2  LDA $A27A
    B849: 85 D2     STA $D2
    B84B: AD 7B A2  LDA $A27A+1
    B84E: 85 D3     STA $D2+1
    B850: 20 6B AE  JSR $AE6B

where:

    A278: 00 A0     .WORD $A000
    A27A: 78 02     .WORD $0278

So we move the word at $80,$81 ($1D00) to $D0,$D1, and set $CE,$CF to $A000 and $D2,$D3 to $0278, before calling another block copier.

    AE6B: A0 00     LDY #0
    AE6D: A6 D3     LDX $D3
    AE6F: F0 0E     BEQ $AE7F
    AE71: B1 CE     LDA ($CE),Y
    AE73: 91 D0     STA ($D0),Y
    AE75: C8        INY
    AE76: D0 F9     BNE $AE71
    AE78: E6 CF     INC $CF
    AE7A: E6 D1     INC $D1
    AE7C: CA        DEX
    AE7D: D0 F2     BNE $AE71
    AE7F: A6 D2     LDX $D2
    AE81: F0 08     BEQ $AE8B
    AE83: B1 CE     LDA ($CE),Y
    AE85: 91 D0     STA ($D0),Y
    AE87: C8        INY
    AE88: CA        DEX
    AE89: D0 F8     BNE $AE83
    AE8B: 60        RTS

So we relocate the first $0278 bytes of the "cartridge" to address $1D00-$1F77. The first $200 bytes are just a series of addresses (more on those soon), the next $78 bytes are a set of JMP vectors, e.g.

    1F00: 4C 12 B8  JMP $B812
    1F03: 4C B1 AB  JMP $ABB1
    1F06: 4C B6 AB  JMP $ABB6
    ...
    1F72: 4C 87 B8  JMP $B887
    1F75: 4C 5F BC  JMP $BC5F

After we return from this, the code continues with:

    B853: A5 80     LDA $80
    B855: 85 82     STA $82
    B857: A5 81     LDA $81
    B859: 85 83     STA $83
    B85B: E6 83     INC $83
    B85D: E6 83     INC $83
    B85F: 20 EB B9  JSR $B9EB

The word at $80,$81 gets moved to $82,$83 and incremented by $200, so the word at $82,$83 is now $1F00. The subroutine called looks like:

    B9EB: A2 21     LDX #$21
    B9ED: A0 00     LDY #0
    B9EF: B9 CA B9  LDA $B9CA,Y
    B9F2: 99 92 00  STA $0092,Y
    B9F5: C8        INY
    B9F6: CA        DEX
    B9F7: D0 F6     BNE $B9EF
    B9F9: A5 81     LDA $81
    B9FB: 85 AD     STA $AD
    B9FD: 85 B2     STA $B2
    B9FF: E6 B2     INC $B2
    BA01: 60        RTS

This copies the code at $B9CA into page zero, patches the value at $81 ($1D) into $AD and $B2, and then increments $B2, so we end up with the following:

    0092: 18        CLC
    0093: 65 A4     ADC $A4
    0095: 85 A4     STA $A4
    0097: 90 0A     BCC $00A3
    0099: E6 A5     INC $A5
    009B: B0 06     BCS $00A3
    009D: E6 A4     INC $A4
    009F: D0 02     BNE $00A3
    00A1: E6 A5     INC $A5
    00A3: AD FF FF  LDA $FFFF
    00A6: 0A        ASL A
    00A7: B0 05     BCS $00AE
    00A9: 85 AC     STA $AC
    00AB: 6C 00 1D  JMP ($1D00)
    00AE: 85 B1     STA $B1
    00B0: 6C 00 1E  JMP ($1E00)

This is the core of the Pascal interpreter, similar to the Forth NEXT routine I discussed in http://atariage.com/forums/blog/734/entry-15007-dealer-demo-part-4-some-forth-at-last/. It has three parts, and self-modifies its code as it runs. If you enter at $0092, it increments the current p-code pointer (located at $A4,$A5, which is the operand of the LDA at $00A3) by the accumulator. If you enter at $009D, it increments the current p-code pointer by 1. In both cases, it then proceeds to the third part (which can be called directly as well), which reads the p-code value, multiplies the value by two and then patches the low byte of one of two jump vectors with that value, depending on whether the multiply overflowed or not. This allows us to dispatch all 256 possible p-codes, and each code will then jump back into this routine, keeping the interpreter running forever.
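The dispatch step can be sketched in Python to show where each opcode's handler address lives. This is a model of the arithmetic only, not the self-modifying code itself. The names `vector_address` and `handler_address` and the flat `bytearray` are my own; the two sample table entries use the handler addresses for opcodes $90 and $E0 that come up in part two.

```python
TABLE_BASE = 0x1D00  # where the 512-byte address table is copied at startup


def vector_address(opcode):
    """Address of the table entry the page-zero code jumps through."""
    doubled = (opcode << 1) & 0xFF      # ASL A keeps only the low 8 bits
    carry = (opcode >> 7) & 1           # bit 7 falls into the carry flag
    # Carry clear: patch the low byte of JMP ($1D00); carry set: JMP ($1E00).
    page = TABLE_BASE + (0x100 if carry else 0)
    return page + doubled


def handler_address(mem, opcode):
    """Read the little-endian handler address for an opcode."""
    vec = vector_address(opcode)
    return mem[vec] | (mem[vec + 1] << 8)


# Hypothetical memory image holding just two table entries.
mem = bytearray(0x10000)
mem[0x1E20:0x1E22] = bytes([0x65, 0xAA])   # opcode $90 -> $AA65
mem[0x1EC0:0x1EC2] = bytes([0xF5, 0xA2])   # opcode $E0 -> $A2F5

branch_handler = handler_address(mem, 0x90)
```

The net effect is the same as indexing a 256-entry table of words at `TABLE_BASE + 2 * opcode`; splitting it across two JMP instructions is what lets an 8-bit shift cover all 256 opcodes.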
Of course, we haven't actually gotten into the interpreter yet, only set it up. We'll discuss that in a future post, but we've made decent progress towards separating the runtime from the monitor. In particular, it's clear we could move the 8K runtime in $A000-$BFFF into a cartridge image and modify the PASCAL file to skip the initial relocation and rely on the cartridge. That focuses our attention on the remaining 1.8K of PASCAL to isolate the code that sets up the runtime and loads the MON program. Hopefully we can adapt that code to load another program directly, and thus produce binaries that can run without loading the monitor.

January 3, 2021


APX Pascal Architecture, part two

In the last post, we worked through layers of the APX Pascal runtime to find the main interpreter loop, which in fact resides entirely in page zero. In this post, we're going to dig into some of the opcodes to get a flavor for the runtime implementation.

As we discussed last time, each opcode is represented by a JMP value in a 512-byte table that is copied into $1D00 when the runtime starts. If you peruse through the table, the most common JMP target is $B9B5, in 81 entries. This is the not-implemented opcode; hitting any of these in code would be an error. The code seems to be an infinite loop.

    B9B5: 38        SEC
    B9B6: A0 00     LDY #0
    B9B8: A9 67     LDA #$67
    B9BA: 20 EA B8  JSR $B8EA
    B9BD: 4C B5 B9  JMP $B9B5

The next most common opcode is a 4-way tie, for opcodes $90-$97 ($AA65), $98-$9F ($AA7A), $E0-$E7 ($A2F5) and $E8-$EF ($A8EC). Let's investigate each in turn.

    AA65: 4A        LSR A
    AA66: 29 07     AND #7
    AA68: 48        PHA
    AA69: A0 01     LDY #1
    AA6B: B1 A4     LDA (IP),Y
    AA6D: 18        CLC
    AA6E: 65 C8     ADC $C8
    AA70: 85 A4     STA IP
    AA72: 68        PLA
    AA73: 65 C9     ADC $C8+1
    AA75: 85 A5     STA IP+1
    AA77: 4C A3 00  JMP $00A3

This appears to be some kind of unconditional branch/jump opcode. The opcode is shifted right and masked with 7, yielding a number 0-7. This number becomes the high byte and the byte following the opcode the low byte of an offset, which is added to the value in $C8,$C9. We set the IP to the result and continue execution.

    AA7A: A8        TAY
    AA7B: BD 00 06  LDA $0600,X
    AA7E: E8        INX
    AA7F: E8        INX
    AA80: 4A        LSR A
    AA81: B0 03     BCS $AA86
    AA83: 98        TYA
    AA84: 90 DF     BCC $AA65
    AA86: A9 02     LDA #2
    AA88: 4C 92 00  JMP $0092

This pulls the top of the data stack (the stack is at $0600 and indexed by X). If it's odd we're done and just skip over the two-byte instruction; otherwise we fall into the branch function above. So it's a conditional branch.

Both of these codes are very odd. The only reason I can think of to encode part of the branch offset into the opcode is to extend the range beyond 256 bytes, but in that case, why not just have a separate opcode for long branches? Also, using both BCC and BCS isn't optimal.
A little thought shows removing the BCS achieves the same result, but faster. In general the runtime code looks like it could have used a little more optimization. This project started out as a port from the 8080; perhaps the author never developed enough 6502 experience to tighten up the code in the time allowed.

The next two routines are similar:

    A2F5: 29 0F     AND #$0F
    A2F7: A8        TAY
    A2F8: B1 C6     LDA ($C6),Y
    A2FA: C8        INY
    A2FB: CA        DEX
    A2FC: CA        DEX
    A2FD: 9D 00 06  STA $0600,X
    A300: B1 C6     LDA ($C6),Y
    A302: 9D 01 06  STA $0601,X
    A305: 4C 9D 00  JMP $009D

and:

    A8EC: 29 0F     AND #$0F
    A8EE: A8        TAY
    A8EF: B1 B6     LDA ($B6),Y
    A8F1: C8        INY
    A8F2: CA        DEX
    A8F3: CA        DEX
    A8F4: 9D 00 06  STA $0600,X
    A8F7: B1 B6     LDA ($B6),Y
    A8F9: 9D 01 06  STA $0601,X
    A8FC: 4C 9D 00  JMP $009D

Both of these routines move a value to the top of the stack, just using different pointers to source the value ($C6 and $B6).

Most of the remaining opcodes have unique implementations. Some of the interesting ones to look at are "load string" ($2C at $A88F), load small constants ($F0-$F7), load 1, 2 and 4 bytes ($24, $25, $26), call ($A2 at $AB2C) and return ($A6 at $AC96).

What's most interesting to me is that having identified these, you might notice they don't match the "Functional Specification" (https://archive.org/details/AtariPascalFunctionalSpecification) at all. Apparently the paper design for the interpreter presented to Atari underwent major revisions by the time it was published. I expected some revisions, but it appears little of the original opcode design survived.

In our next post, we'll examine the p-code that exists in the PASCAL runtime object, and write a very basic p-code disassembler.
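To make the branch encoding in opcodes $90-$9F concrete, here is a small Python sketch of the target calculation done at $AA65 and of the test done at $AA7A. The function names are mine, and `base` simply stands for whatever word is held in $C8,$C9; I haven't established what that pointer represents, so treat it as an assumption.

```python
def branch_target(opcode, operand, base):
    """Model of $AA65: new IP = base + ((opcode & 7) << 8 | operand).

    `base` stands for the word at $C8,$C9 (its meaning is an assumption).
    """
    high = opcode & 0x07                  # LSR A / AND #7 on the doubled opcode
    return (base + (high << 8) + operand) & 0xFFFF


def conditional_branch(opcode, operand, base, ip, top_low_byte):
    """Model of $AA7A: fall through if the popped value is odd, else branch."""
    if top_low_byte & 1:                  # LSR A / BCS: low bit set, no branch
        return (ip + 2) & 0xFFFF          # skip the opcode and its operand byte
    return branch_target(opcode, operand, base)


# Hypothetical values chosen only to exercise the arithmetic.
near = branch_target(0x90, 0x10, 0x2000)
far = branch_target(0x97, 0xFF, 0x2000)
```

Three bits from the opcode plus the operand byte give an 11-bit offset, so a two-byte instruction can reach up to $7FF bytes past the base.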

January 3, 2021


APX Pascal Architecture, part four

The last blog entry introduced the tools I'm using to explore the Pascal runtime, and included a preliminary (i.e. rough) disassembly. Now we'll start refining that disassembly and start discussing more of the opcodes.

Firstly, the last listing was erroneous around $B959 to $B991. There are strings there I somehow missed when spot checking the disassembly, so I've fixed up that part of the disassembly. There were also a couple of missing $9B's after strings, and the p-code disassembly had a couple of errors, which I've now fixed.

Now let's discuss some more opcodes. The simplest opcode in the listing is opcode DB. It is just:

    AF9D: E8        INX
    AF9E: E8        INX
    AF9F: 4C 9D 00  JMP NEXT_OP1

Since X is the current evaluation stack pointer, and the stack grows downwards, this opcode drops the topmost entry of the stack, so let's call it DROP. Another simple opcode is $DA, which disassembles as:

    AF8C: CA        DEX
    AF8D: CA        DEX
    AF8E: BD 03 06  LDA EVALPAGE+3,X
    AF91: 9D 01 06  STA EVALPAGE+1,X
    AF94: BD 02 06  LDA EVALPAGE+2,X
    AF97: 9D 00 06  STA EVALPAGE,X
    AF9A: 4C 9D 00  JMP NEXT_OP1

This adds one entry to the stack, and copies the (previous) top element to it, so we can call this DUP. Opcode D2 is a bit longer, but just involves moving things around the stack so that the top two elements are exchanged, so let's call it SWAP.

    AF5F: BC 00 06  LDY EVALPAGE,X
    AF62: BD 02 06  LDA EVALPAGE+2,X
    AF65: 9D 00 06  STA EVALPAGE,X
    AF68: 98        TYA
    AF69: 9D 02 06  STA EVALPAGE+2,X
    AF6C: BC 01 06  LDY EVALPAGE+1,X
    AF6F: BD 03 06  LDA EVALPAGE+3,X
    AF72: 9D 01 06  STA EVALPAGE+1,X
    AF75: 98        TYA
    AF76: 9D 03 06  STA EVALPAGE+3,X
    AF79: 4C 9D 00  JMP NEXT_OP1

Some other simple stack-only opcodes are 30 (AND), 32 (OR), 34 (NOT), 36 (EOR), 38 (NEG), 40 (ADD) and 44 (SUB). All of these replace their operands on the stack with the result of the operation (the top two values for the binary operations; NOT and NEG presumably take just the top value).
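The three stack opcodes are easy to model in Python, which also makes the "grows downwards" convention explicit. This is a sketch under assumptions: the class name `EvalStack`, the `push` and `peek` helpers and the starting value of X are all mine. Only DROP, DUP and SWAP follow the listings above.

```python
EVALPAGE = 0x0600  # base of the evaluation stack, indexed by X


class EvalStack:
    """Toy model of the evaluation stack: 16-bit entries, low byte first."""

    def __init__(self, x=0xF0):
        self.mem = bytearray(0x10000)
        self.x = x                        # hypothetical starting stack index

    def push(self, value):                # helper, not one of the opcodes above
        self.x = (self.x - 2) & 0xFF      # DEX / DEX
        self.mem[EVALPAGE + self.x] = value & 0xFF
        self.mem[EVALPAGE + self.x + 1] = (value >> 8) & 0xFF

    def peek(self, depth=0):              # helper: read an entry without popping
        addr = EVALPAGE + self.x + 2 * depth
        return self.mem[addr] | (self.mem[addr + 1] << 8)

    def drop(self):                       # opcode DB
        self.x = (self.x + 2) & 0xFF      # INX / INX

    def dup(self):                        # opcode DA
        self.x = (self.x - 2) & 0xFF
        base = EVALPAGE + self.x
        self.mem[base + 1] = self.mem[base + 3]
        self.mem[base] = self.mem[base + 2]

    def swap(self):                       # opcode D2
        m, base = self.mem, EVALPAGE + self.x
        m[base], m[base + 2] = m[base + 2], m[base]
        m[base + 1], m[base + 3] = m[base + 3], m[base + 1]


stack = EvalStack()
stack.push(0x1234)
stack.push(0xABCD)
stack.swap()                              # top is now $1234, next is $ABCD
```

Because each entry is two bytes, every push or pop moves X by two, which is why the listings always pair up their INX and DEX instructions.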
Opcodes 60 and 70 oddly point to the same code, which looks like this:

    B185: BD 01 06  LDA EVALPAGE+1,X
    B188: DD 03 06  CMP EVALPAGE+3,X
    B18B: D0 5C     BNE $B1E9
    B18D: BD 00 06  LDA EVALPAGE,X
    B190: DD 02 06  CMP EVALPAGE+2,X
    B193: D0 54     BNE $B1E9
    B195: F0 5F     BEQ $B1F6
    ...
    B1E9: E8        INX
    B1EA: E8        INX
    B1EB: A9 00     LDA #0
    B1ED: 9D 00 06  STA EVALPAGE,X
    B1F0: 9D 01 06  STA EVALPAGE+1,X
    B1F3: 4C 9D 00  JMP NEXT_OP1
    B1F6: E8        INX
    B1F7: E8        INX
    B1F8: A9 01     LDA #1
    B1FA: 9D 00 06  STA EVALPAGE,X
    B1FD: A9 00     LDA #0
    B1FF: 9D 01 06  STA EVALPAGE+1,X
    B202: 4C 9D 00  JMP NEXT_OP1

If the top two values are equal, we replace them with a 1, otherwise we replace them with a 0. So let's call them EQU. Opcodes 62 and 72 reverse this, so let's call them NEQ. Now why are there two equivalent opcodes? Well, let's look at opcodes 64 and 74. 64 is simply:

    B1A9: 20 2F BE  JSR $BE2F
    B1AC: F0 3B     BEQ $B1E9
    B1AE: 30 39     BMI $B1E9
    B1B0: 10 44     BPL $B1F6

and 74 is similar:

    B1C9: 20 2F BE  JSR $BE2F
    B1CC: F0 1B     BEQ $B1E9
    B1CE: 90 19     BCC $B1E9
    B1D0: B0 24     BCS $B1F6

with

    BE2F: BD 02 06  LDA EVALPAGE+2,X
    BE32: DD 00 06  CMP EVALPAGE,X
    BE35: F0 0B     BEQ $BE42
    BE37: BD 03 06  LDA EVALPAGE+3,X
    BE3A: FD 01 06  SBC EVALPAGE+1,X
    BE3D: 09 01     ORA #1
    BE3F: 70 0A     BVS $BE4B
    BE41: 60        RTS
    BE42: BD 03 06  LDA EVALPAGE+3,X
    BE45: FD 01 06  SBC EVALPAGE+1,X
    BE48: 70 01     BVS $BE4B
    BE4A: 60        RTS
    BE4B: 49 80     EOR #$80
    BE4D: 09 01     ORA #1
    BE4F: 60        RTS

The difference seems to be whether the 16-bit comparisons are done signed or unsigned. The 6x opcodes are signed comparisons, and the 7x opcodes are unsigned comparisons. 60 is EQU and 70 is UEQU, which happen to have identical implementations, and 62 and 72 are similarly NEQ and UNEQ. 64, 66, 68 and 6A seem to be greater than (GT), less than (LT), greater than or equal (GTE) and less than or equal (LTE) respectively. 74, 76, 78 and 7A appear to be the same, only unsigned.
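A quick Python sketch shows why equality needs only one implementation while the ordering comparisons need two. It models what the opcodes appear to compute, not the flag-level logic in $BE2F. The function names and operand order (`nxt` is the entry below the top, `top` is the top of stack) are my reading of the listings, so treat them as assumptions.

```python
def to_signed(word):
    """Interpret a 16-bit word as a two's-complement signed value."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def equ(nxt, top):
    """Opcodes 60 and 70: equality is the same for signed and unsigned."""
    return int((nxt & 0xFFFF) == (top & 0xFFFF))


def gt(nxt, top):
    """Opcode 64 as I read it: signed next-to-top > top."""
    return int(to_signed(nxt) > to_signed(top))


def ugt(nxt, top):
    """Opcode 74 as I read it: unsigned next-to-top > top."""
    return int((nxt & 0xFFFF) > (top & 0xFFFF))


# $FFFF is -1 when signed but 65535 when unsigned, so the answers differ.
signed_result = gt(0xFFFF, 0x0001)
unsigned_result = ugt(0xFFFF, 0x0001)
```

The two results disagree only when the operands differ in their top bit, which is exactly the case the BVS fix-up and the BMI/BPL versus BCC/BCS tests are there to separate.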
To further complicate matters, the 8x opcodes also implement comparisons (the same six EQU, NEQ, GT, LT, GTE, LTE operations), but for types other than signed and unsigned integers. The byte after the opcode determines the type, with 00 => bool, 01 => string (both from the stack, so both of these sequences consume 2 bytes), and 02, 03 and 04 being various byte comparisons consuming an additional 2 bytes after the type byte. So our simple p-code disassembler, which assumes all opcodes but 2C are fixed size, needs to be modified to handle these opcodes a little differently.

That's enough for this post. The runtime disassembly is certainly starting to make a bit more sense, but there are plenty of mysteries left to explore.

pascal3.zip

January 3, 2021


London Blitz (Avalon Hill)

It is pretty much the same game with a larger map that is possibly an accurate map of London, better graphics, pretty much the same sounds with a few new ones like a bomb dropping sound, and different bomb mechanics (it uses dials instead of sliders). All of my control complaints hold true with this game as well, except everything is now extremely sluggish, even on the maze screen. Honestly, the 2600 version seems more playable, as you are able to move along at a much faster pace.

March 7, 2019

