enthusi Posted August 23, 2018

Hi,
I am trying to fully understand what's going on there and write my own loader. I based the first tests on Karri's nice small loader:

    *=$200
        stz mapctl
        lda #$04
        sta serctl
        ldx #$00
    loop
        lda cart0
        sta $f000,x
        inx
        bne loop
        jmp $f000
        .dsb $200+50-*,0

This loads a full page (the ripple counter is supposedly somewhere in the middle of Bank0 now). Stage2 is currently only ~100 bytes long. However, my own stage2 loader starts by setting block 1 (512-byte block size), so the ripple counter should be reset and the further data is taken from block 1, offset 0, i.e. at 0x200 in terms of ROM and at 0x240 in terms of LNX (correct??). Is there any error in my thinking here (as it doesn't work)?

I encrypt the above code (stage1 only) with lynxenc to:

    00000000 ff cc e7 22 43 9e 5a e6 4e c1 47 ba 12 48 0d ff |..."C.Z.N.G..H..|
    00000010 f6 ed 22 8e 00 6d 47 57 ac cb c1 6f 79 82 87 99 |.."..mGW...oy...|
    00000020 42 e7 71 9c aa de 7f f6 75 a6 fa 1a 3d 01 97 75 |B.q.....u...=..u|
    00000030 22 99 43 11 |".C.|

lynxdec agrees on the content at least. It is hard to tell in handysdl or mednafen whether that part worked at all. I thought a BRK also exits handysdl, but even starting the stage1 loader with BRK results in 'nothing' happening.

My LNX header is pretty certainly OK:

    00000000 4c 59 4e 58 00 02 00 00 01 00 41 73 73 65 6d 62 |LYNX......Assemb|
    00000010 6c 6f 69 64 73 00 00 00 00 00 00 00 00 00 00 00 |loids...........|
    00000020 00 00 00 00 00 00 00 00 00 00 50 72 69 6f 72 41 |..........PriorA|
    00000030 72 74 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |rt..............|
    00000040 ff cc e7 22 43 9e 5a e6 4e c1 47 ba 12 48 0d ff |..."C.Z.N.G..H..|
    00000050 f6 ed 22 8e 00 6d 47 57 ac cb c1 6f 79 82 87 99 |.."..mGW...oy...|
    00000060 42 e7 71 9c aa de 7f f6 75 a6 fa 1a 3d 01 97 75 |B.q.....u...=..u|
    00000070 22 99 43 11 a9 1a 8d 8a fd a9 0b 8d 8b fd a9 02 |".C.............|
    00000080 8d 87 fd a9 01 85 10 20 39 f0 a9 14 85 14 a0 02 |....... 9.......|
    ....
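The offset question (ROM 0x200 versus LNX 0x240) can be checked with a few lines of Python. This is only a sketch: it assumes the LNX header is 64 bytes long (the dump shows stage1 starting at 0x40) and that the bank 0 block size is the little-endian 16-bit value at header offset 4 (the bytes `00 02` in the dump). The function names are made up for this example.

```python
import struct

LNX_HEADER_SIZE = 64  # assumption: stage1 starts at 0x40 in the dump


def bank0_block_size(header: bytes) -> int:
    """Read the bank 0 block size from the first bytes of an LNX header."""
    magic, bank0, _bank1 = struct.unpack_from("<4sHH", header, 0)
    if magic != b"LYNX":
        raise ValueError("not an LNX image")
    return bank0


def lnx_offset(block: int, block_size: int, offset_in_block: int = 0) -> int:
    """File offset in the LNX image for a given cartridge block."""
    return LNX_HEADER_SIZE + block * block_size + offset_in_block


# First ten header bytes copied from the dump, padded to 64 bytes.
header = bytes.fromhex("4c594e58 00020000 0100") + bytes(54)
size = bank0_block_size(header)
print(size, hex(lnx_offset(1, size)))
```

With the header from the dump this gives a block size of 512 and places block 1 at file offset 0x240, which agrees with the numbers in the post.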
The stage2 loader starts right after the last byte of the encrypted stage1 code. Stage2 is not encrypted anymore. My stage2 loader does this to set a block address (not optimized at all, I know):

    set_block
    .(
        lda iodat
        and #%11111101
        sta tmpiodat
        lda block
        sta tmp
        ;pre-roll once into bit 0 (not yet 1)
        rol tmp
        rol tmp+1
        ldx #7
    loop_block
        rol tmp
        rol tmp+1 ;bit 7 of block -> bit 1 of tmp+1
        lda tmp+1
        and #%00000010
        ora tmpiodat
        sta iodat
        lda #%00000011 ;0
        sta sysctl1
        lda #%00000010 ;1
        sta sysctl1
        dex
        bpl loop_block
        rts
    .)

and then it starts loading. The payload starts in the LNX image at 0x240.

Can any of you spot an error (in thinking or code) here? That would be most appreciated.
Thanks, and thanks for all the help/documentation so far!
Martin
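The bit order that the set_block loop is aiming for can be modelled in Python. This is a sketch of the intended serialization only: it assumes the eight bits of the block number are presented most significant bit first, one per strobe, and it does not model IODAT, SYSCTL1 or the real cartridge shift register. Both function names are hypothetical.

```python
def msb_first_bits(block: int) -> list[int]:
    """Bits of an 8-bit block number in the order the loop presents them."""
    return [(block >> i) & 1 for i in range(7, -1, -1)]


def shift_in(bits: list[int]) -> int:
    """Model of an 8-bit shift register that takes one bit per strobe."""
    value = 0
    for bit in bits:
        value = ((value << 1) | bit) & 0xFF
    return value


for block in (0, 1, 0xA5, 0xFF):
    # A register fed MSB first ends up holding the original block number.
    print(block, msb_first_bits(block), shift_in(msb_first_bits(block)))
```

Comparing the bit sequence from this model with the values actually written to iodat while single-stepping is one way to hand-debug the routine.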
42bs Posted August 23, 2018

If you use handybug you can set a breakpoint in your stage2 loader and then step through your code.

And why reinvent the wheel (meaning the block select)? Do you think you can make a rounder one?
enthusi Posted August 23, 2018

I am running on Linux only, so no handydebug, unfortunately, as far as I know?
This is all part of the learning process. I like loaders anyway, wrote several for the C64, and would hate to use a black box on the Lynx now. There is always room for improvement, either in code or at least in one's own understanding of things. In fact I even consider it vital for a community to keep up knowledge about the 'inner workings'.
BLL is great (in fact it is pretty awesome), but I don't intend to make use of it (other than, of course, learning from it).
Probably worth digging up that handydebug somehow and, worst case, setting up some notebook with it. Thanks for the hint!
42bs Posted August 23, 2018

Well, a VM with Win XP should be enough.

BLL's select block is the "official" one from the boot ROM, with only register saving added. Its first action is to reset the strobe and output 0.
Maybe "hand"-debug your code. Be sure to restore IODAT on leave!
enthusi Posted August 23, 2018

Ah, MS even has a free XP image for VirtualBox, it seems. Will try. I also went for 'hand debug', but even BRK didn't really do what I had hoped it would, so I have doubts that it is even the set_block code that fails (even though I spotted an error). I will report back here if I make progress.
enthusi Posted August 23, 2018

Hm, XP (32) complains that the handybug.exe I used (from the testers wanted thread) is not a proper Win32 application.
42bs Posted August 23, 2018

Maybe try mine: http://www.monlynx.de/download/handybug.7z
(Only the exe)

Edit: Tried it on Ubuntu with Wine: no GFX output (as a Linux addict you might fix this), but you can single-step etc.

Edited August 23, 2018 by 42bs
enthusi Posted August 23, 2018

Thanks! Yours works in VirtualBox at least. Seems my Wine is somehow borked on top of that, but I'm fine with something working somewhere.
enthusi Posted August 24, 2018

Splendid, I got it fixed. Will improve it a little and post here.
Thanks 42BS and Sage!
enthusi Posted August 24, 2018

Find attached a minimal loader that loads a 5500-byte test program. It can't be much faster, I guess. Stage2 fits into the stack, and the payload then starts at $0200.
Stage1 is this now:

    *= $0200
        stz mapctl ;BIOS seems to set this to 3 which I think is fine for most cases
        lda #$04
        sta serctl
        ldx #162 ;(256-94 size of stage2)
    loop
        lda cart0
        sta !$005e,x
        inx
        bne loop
        jmp $100

I assume 64KB LNX files are not that common?
I haven't tested this on real hardware yet.
Thanks for your help.
Cheers,
Martin

love.zip
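The `ldx #162` / `sta !$005e,x` pair relies on X wrapping to zero exactly when the last stage2 byte has been stored. A small Python sketch of that arithmetic, assuming a stage2 size of 94 bytes and a destination of $0100 as in the listing (the helper name is invented here):

```python
def copy_loop_params(size: int, target: int) -> tuple[int, int]:
    """Start value for X and base address so that an inx/bne loop
    copies exactly `size` bytes (1..256) to `target`."""
    if not 1 <= size <= 256:
        raise ValueError("an 8-bit index can cover at most 256 bytes")
    start_x = (256 - size) & 0xFF   # X counts up and wraps to 0 at the end
    base = target - start_x         # base + start_x must equal target
    return start_x, base


start_x, base = copy_loop_params(94, 0x0100)
print(start_x, hex(base))
```

For 94 bytes at $0100 this reproduces the 162 and $005e used in the stage1 code.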
karri Posted August 24, 2018

I kind of see the value in minimizing code. But Stage1 cannot be smaller than 51 bytes. Would it be possible to use the extra zero bytes for something useful?

There is also the case of AUDIN. As the Lynx I and Lynx II start up with AUDIN in different states, I thought it would be a good idea to set it in Stage1.

One thing that I was concerned about was being able to load data to any place. That is why I put my 2nd loader at the same spot as the Mikey ROM and screen buffers. That area is a bit wasted anyway, as you cannot easily run code there because of the registers. Suzy, on the other hand, does not care. So the screen space and the registers can reside at the same memory locations without problems.

    FFF8-FFFF sacred registers
    FC00-FFF7 registers
    E018-FFF7 Screen buffer
    FB68-FBFF 2nd loader 151 bytes

Just thought of sharing this as well. Of course using the stack area is pretty safe too.

OMG!!! Your <3 love <3 is soooo cute!

Edited August 24, 2018 by karri
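The sizes behind that memory map are easy to check. The sketch below assumes a 160x102 screen at 4 bits per pixel and treats each listed range as inclusive; the variable and function names exist only for this example.

```python
def span(first: int, last: int) -> int:
    """Number of bytes in an inclusive address range."""
    return last - first + 1


screen_bytes = 160 * 102 // 2            # two 4-bit pixels per byte
screen_span = span(0xE018, 0xFFF7)       # "Screen buffer" row
loader_span = span(0xFB68, 0xFBFF)       # "2nd loader" row

print(screen_bytes, screen_span, loader_span)
```

Under those assumptions the screen buffer row is exactly one frame (8160 bytes), and the loader row spans 152 bytes, so a 151-byte loader fits with one byte to spare.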
42bs Posted August 24, 2018

I see no benefit in a 2nd stage loader. The first stage loader (the one which is decrypted) can be large enough to load the game unless the reason is to start it at $200. But since every game needs variables, it is easier to put those from $200 upward and let the code start later.

Currently, you are "wasting" 29 bytes in stage1.
enthusi Posted August 24, 2018

Yes, of course stage 1 can set up all kinds of things, including a small logo sprite/registers.
I will see if I can fit in a full loader as well. I think it is a bit too tight, but worth an attempt. Going for a 2nd stage is certainly faster than decoding a 2nd 51-byte chunk.
Currently I even waste most of Block0, too.
42bs Posted August 24, 2018

There is usually enough ROM space available. But I agree that decoding takes some time. So the challenge is to fit a complete loader into 51 bytes.

You do not need the last JMP if you load to $200+52.
enthusi Posted August 24, 2018

Into 50 bytes, as far as I know? But good point about the JMP.
42bs Posted August 24, 2018

Right, 50.
enthusi Posted August 24, 2018

Got it! Would be cool if someone could test this on a real Lynx. A 1-shot loader. I was stupid not to think of this earlier. There is even still room for improvement, but it is no longer required:

        lda #1
        sta block
        jsr $fe00
        ;hardcoded target length
        lda #11
        sta blocks2load
    load_loop
    load_a_full_block
        ldy #2 ;2 pages = 512 Bytes Blocksize
        ldx #0
    pageloop
        lda cart0
    target
        sta $0300,x
        inx
        bne pageloop
        inc target+2
        dey ;pages
        bne pageloop
        inc block
        lda block
        jsr $fe00
        dec blocks2load
        bne load_a_full_block
    ready
        jmp $0300

1shotload.zip
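What the 1-shot loader does can be mirrored in a few lines of Python, which is handy for checking a ROM image on the PC side before trying it in an emulator. This is a rough model, not the loader itself: it assumes 512-byte blocks, a raw ROM image without the LNX header, and that each block select restarts reading at the beginning of that block. `load_blocks` is a name invented for this sketch.

```python
def load_blocks(rom: bytes, first_block: int, count: int,
                target: int, block_size: int = 512) -> bytearray:
    """Copy `count` whole blocks from `rom` into a 64 KB memory model."""
    if target + count * block_size > 0x10000:
        raise ValueError("payload would run past the end of memory")
    mem = bytearray(0x10000)
    for n in range(count):
        start = (first_block + n) * block_size
        # Pad with zeros if the image ends in the middle of a block.
        chunk = rom[start:start + block_size].ljust(block_size, b"\x00")
        mem[target:target + block_size] = chunk
        target += block_size       # corresponds to the two `inc target+2`
    return mem


# Block 0 is filler here; blocks 1..11 hold a recognisable pattern.
rom = bytes(512) + bytes(range(256)) * 2 * 11
mem = load_blocks(rom, first_block=1, count=11, target=0x0300)
print(mem[0x0300:0x0304].hex(), hex(0x0300 + 11 * 512))
```

The 11 blocks used in the post cover 5632 bytes, enough for the 5500-byte test program mentioned earlier in the thread.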
42bs Posted August 24, 2018

You could store the number of blocks to load at byte 52 in the ROM and start the loader by reading it.

        lda cart0 // get number of full blocks to load
        sta blocks2load
        stz block
    load_a_full_block:
        inc block
        lda block
        jsr $fe00
        tay // a == 0 after fe00, x == 2
    pageloop
        lda cart0
    target
        sta $0300,y
        iny
        bne pageloop
        inc target+2
        dex ;pages
        bne pageloop
        dec blocks2load
        bne load_a_full_block
    ready
        jmp $0300

=> 37 bytes, 13 to go
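The "37 bytes, 13 to go" figure can be reproduced by tallying instruction lengths. The sketch below assumes that `block` and `blocks2load` live in zero page, that `cart0` and `target` are absolute addresses, and that the budget is 50 bytes; the addressing-mode labels are just names used in this example.

```python
# Encoded length in bytes for each addressing mode used in the listing.
SIZES = {"implied": 1, "zeropage": 2, "relative": 2,
         "absolute": 3, "absolute_y": 3}

LISTING = [
    ("lda cart0", "absolute"), ("sta blocks2load", "zeropage"),
    ("stz block", "zeropage"), ("inc block", "zeropage"),
    ("lda block", "zeropage"), ("jsr $fe00", "absolute"),
    ("tay", "implied"), ("lda cart0", "absolute"),
    ("sta $0300,y", "absolute_y"), ("iny", "implied"),
    ("bne pageloop", "relative"), ("inc target+2", "absolute"),
    ("dex", "implied"), ("bne pageloop", "relative"),
    ("dec blocks2load", "zeropage"),
    ("bne load_a_full_block", "relative"), ("jmp $0300", "absolute"),
]

total = sum(SIZES[mode] for _text, mode in LISTING)
remaining = 50 - total
print(total, remaining)
```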
enthusi Posted August 24, 2018

Enough space left to try the same for the target. Right now I assemble and encrypt everything in a Makefile anyway, but it might be more generic this way indeed.
Works! Now not that much space left (0 in the code below):

    block = $02
    blocks2load = $03
    exe = $04
        lda cart0 // get number of full blocks to load
        sta blocks2load
        ldx #1
    l1
        lda cart0
        sta target+1,x
        sta exe,x
        dex
        bpl l1
        stz block
    load_a_full_block:
        inc block
        lda block
        jsr $fe00
        tay // a == 0 after fe00, x == 2
    pageloop
        lda cart0
    target
        sta $0300,y
        iny
        bne pageloop
        inc target+2
        dex ;pages
        bne pageloop
        dec blocks2load
        bne load_a_full_block
    ready
        jmp (exe)

That's the new generic loader now:

    00000000 ff 27 8e df 7a ec e5 9d 40 62 6c 4e 39 0a 36 05 |.'..z...@blN9.6.|
    00000010 23 a0 00 ff 7c 51 78 34 3a de d4 da 96 17 3a 61 |#...|Qx4:.....:a|
    00000020 6c 26 91 20 be e5 41 e5 51 f5 52 b2 1f 68 ae ed |l&. ..A.Q.R..h..|
    00000030 4d ec cb 31 |M..1|

I use it as:

        .bin 0,0,"1shotload.enc"
        .byte (end_of_game-start_of_game)/BLOCKSIZE+1 ;size of game in blocks
        .byte $03,$00 ;big endian!
        .dsb (BLOCKSIZE*STARTBLOCK)-*,0 ;align to a full block for my own loader
    start_of_game
        .bin 0,0,"game.bin"
    end_of_game
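The three bytes that follow the encrypted loader (block count, then the target address high byte first) could also be produced by a build script. A minimal sketch, assuming 512-byte blocks and the same `size/BLOCKSIZE+1` rule as the `.byte` line; `loader_descriptor` is a made-up helper, not part of any existing tool.

```python
def loader_descriptor(game_size: int, target: int,
                      block_size: int = 512) -> bytes:
    """Block count followed by the big-endian load/start address."""
    blocks = game_size // block_size + 1   # same rule as the .byte line
    if blocks > 255:
        raise ValueError("block count must fit into one byte")
    return bytes([blocks, (target >> 8) & 0xFF, target & 0xFF])


print(loader_descriptor(5500, 0x0300).hex())
```

For the 5500-byte test program this gives 11 blocks, matching the hardcoded `lda #11` of the earlier 1-shot loader. Note that with this rule a game whose size is an exact multiple of the block size gets one extra block loaded.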
karri Posted August 24, 2018

This is so cool. Congrats!
enthusi Posted August 24, 2018

Thank you very much.
Here is a version that runs itself from the stack and loads to $0200 (hardcoded, though), but I like the more generic one more.

        ;A and X are 0 on entry
        ;ldx #0
    l2
        lda code,x
        sta $100,x
        inx
        bpl l2
        jmp $100
    code
        *=$100
        lda cart0 // get number of full blocks to load
        sta blocks2load
        stz block
    load_a_full_block:
        inc block
        lda block
        jsr $fe00
        tay // a == 0 after fe00, x == 2
    pageloop
        lda cart0
    target
        sta $200,y
        iny
        bne pageloop
        inc target+2
        dex
        bne pageloop
        dec blocks2load
        bne load_a_full_block
    ready
        jmp $200
42bs Posted August 24, 2018

*hmm* I would place the code at $200-sizeof(loader) to remove the last JMP.

Based on Monty Python: Every byte that's wasted ...
enthusi Posted August 24, 2018

Considered that, but it's a bit less trivial due to the JSRs in the code (and us hogging the stack). You'd need to fiddle with TXS for that, which is uglier than 3 jump bytes in my book.

EDIT: this will do, but no bytes are 'saved' anywhere, just a bit more obfuscation at work:

        ldx #$ff
        txs
        ldx #33
    l2
        lda code,x
        pha
        dex
        bpl l2
        jmp $1de
    code
        *=$1de
        lda cart0 // get number of full blocks to load
        sta blocks2load
        stz block
    load_a_full_block:
        inc block
        lda block
        jsr $fe00
        tay // a == 0 after fe00, x == 2
    pageloop
        lda cart0
    target
        sta $200,y
        iny
        bne pageloop
        inc target+2
        dex
        bne pageloop
        dec blocks2load
        bne load_a_full_block
    ready
        ;jmp $200
    code_size=*-$1de
    init_size=code-$200
    #print code_size
    #print init_size
    #print 50-(code_size+init_size)
        .dsb 50-(code_size+init_size),0
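Why pushing the code backwards with `pha` lands it at $1de can be shown with a small model of the stack page. This sketch assumes the stack pointer starts at $ff, that the loader body is 34 bytes (matching `ldx #33`), and that `pha` stores at $0100+SP and then decrements SP; `push_code` is a name chosen only for this example.

```python
def push_code(code: bytes, sp: int = 0xFF) -> tuple[bytearray, int]:
    """Push `code` last byte first; return memory and the code's start."""
    mem = bytearray(0x200)
    for byte in reversed(code):        # x runs from len-1 down to 0
        mem[0x100 + sp] = byte         # pha stores at $0100+SP ...
        sp = (sp - 1) & 0xFF           # ... then decrements SP
    return mem, 0x100 + ((sp + 1) & 0xFF)


code = bytes(range(34))                # stand-in for the 34 loader bytes
mem, start = push_code(code)
print(hex(start), mem[start:0x200] == code)
```

The code ends at $1ff, so execution falls through into the payload at $200 without the final JMP, and the stack pointer is left just below the loader for the return addresses of the JSRs.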
42bs Posted August 24, 2018

:-) I thought of "pha" also ...

Since X is 0 on entry, you can use "dex" instead of "ldx #$ff":

        txs
        ldx #ready-stack_code
    copy:
        lda code,x
        pha
        dex
        bpl copy
        bra $1de

or

        ldx #ready-stack_code
    copy:
        lda code,x
        pha
        dex
        bpl copy
        lda cart0 ; get number of full blocks to load
        sta blocks2load
        stz block
        bra $1e5
    code
        ;.org $1e5
    stack_code:
    load_a_full_block:
        inc block
        lda block
        jsr $fe00
        tay ; a == 0 after fe00, x == 2
    pageloop
        lda cart0
    target
        sta $200,y
        iny
        bne pageloop
        inc target+2
        dex
        bne pageloop
        dec blocks2load
        bne load_a_full_block
    ready

Fewer pushes. (I like these nonsense optimizations .... :-) )

Edited August 24, 2018 by 42bs
enthusi Posted August 24, 2018

Sweet! See? Reinventing the wheel can be most productive and/or/eor fun.

Attached is a little speed showcase, loading 64512 bytes as the first file. However, this already shows a downside: it currently only loads full blocks, and the stack-running version does not hide the screen. No time now for a proper example. Still, 160x800 pixels loaded impressively fast, I think.

This was fun.
Cheers,
Martin

bigpic.zip