Lee Stewart

+AtariAge Subscriber
  • Content Count

    5,030

Community Reputation

3,151 Excellent

3 Followers

About Lee Stewart

  • Rank
    Quadrunner

Profile Information

  • Gender
    Male
  • Location
    Silver Run, Maryland

Recent Profile Visitors

19,168 profile views
  1. Each field is preceded by a length byte, giving the extra 4 bytes of record length. The first field is a string, padded on the end with spaces. I do not remember whether the length byte is always 10 or counts the length up to the first space. Nonetheless, the first field is always 11 bytes. The last three fields are floating point numbers always preceded with an ‘8’ because floating point numbers are always 8 bytes long, so each of those fields is 9 bytes wide. ...lee
  2. I pay $50 every 2 years, which amounts to ~$2.08 per month. I used to pay $30 every other year, but this forum is so much a part of my life and everyday enjoyment that a regular subscription is a no-brainer. ...lee
  3. Working on it. I probably won’t be able to do much until this evening. ...lee
  4. In post #1234, I forgot to change the AORG directive to what Rich (@RXB) was using. I corrected it in the previous post and here it is again, just in case: ...lee
  5. OK, here is the comparison of my rolling routines (LESROLL, see post #1234) with Rich’s (@RXB's) corrected rolling routines (RXBROLL, see post #1236):

     LESROLL:
       UP (1000 rows)..........22 seconds
       DOWN (1000 rows)........21 seconds
       RIGHT (1000 columns)....22 seconds
       LEFT (1000 columns).....21 seconds

     RXBROLL:
       UP (1000 rows)..........47 seconds
       DOWN (1000 rows)........46 seconds
       RIGHT (1000 columns)....48 seconds
       LEFT (1000 columns).....48 seconds

     Rich’s routines take a hair more than twice as long as mine. The reason for this is that every byte that RXBROLL moves involves 2 VDP address writes, whereas LESROLL moves 32 bytes for each row to RAM and back using 2 VDP address writes for vertical rolls; 32 bytes for each row to RAM and back using 3 VDP address writes for horizontal rolls. Here are the number of VDP address writes used to roll 1 column or 1 row:

     LESROLL:
       UP.......50
       DOWN.....50
       RIGHT....72
       LEFT.....72

     RXBROLL:
       UP.....1600
       DOWN...1600
       RIGHT..1630
       LEFT...1630

     The RXBROLL code is pretty tight at 280 bytes. I could probably tighten up LESROLL a bit from its 422 bytes, but surely not to 280 bytes. The efficiency costs a little room. ...lee
  6. Actually, your code does not work for Right and Down scrolling because you are copying overlapping code in the wrong direction, i.e., each copy destroys the source for the next copy. There were a couple of other errors, as well. You also had a couple of redundant routines. I fixed all of that in the spoiler below: ...lee
  7. Actually, all my code was fine except for one small problem. I was reusing R0 from one row to the next, while forgetting that writing the VDP write address would change it by ORing >4000 every time. All I needed to do was compensate for that by stripping that change with ANDI R0,>3FFF just before returning from writing the VDP write/read address. The corrected code is in the spoiler: Now, I need to compare my code to Rich’s for speed. ...lee
  8. I have tested my routines. Up and down rolls work as expected. I have work to do on the left and right rolls—I obviously made some wrong row and column calculations. I will have them fixed later tonight or early tomorrow. ...lee
  9. I will take a closer look at your routines later. For now, a couple of comments:

     Your DROLL routine’s return should be

            B    *R8               * RETURN

     Your MIT routine does not need to save the R11 return because it does not contain any BL instructions that would trash it:

     MIT    MOVB @>83E7,*R15       * Write out read address
            MOVB R3,*R15
            MOVB @>8800,R7         * Read a byte
            MOVB @>83EB,*R15       * Write out write address
            ORI  R5,>4000          * Enable VDP write
            MOVB R5,*R15
            MOVB R7,@>8C00         * Write the byte
            RT                     * RETURN

     ...lee
  10. I think you will find the increase in speed is not minimal. I am copying 32 characters with only one VDP-Write-Address set of instructions. You are doing 32 of those, once for every byte you read/write from/to VRAM. That is certainly much slower. ...lee
  11. As Rich mentioned, the E/A VMBW, etc. are invoked with BLWP, which is difficult to do with RXB because extended RAM is not required and there is not enough room in scratchpad RAM for both the extra registers and the row buffer I wanted—one or the other, but not both. When you use BL branching, you must worry about register use and saving returns if BLing more than one level. I generally use my own VMBW, etc. (practically identical to the E/A versions) in fbForth, except when I want very fast transfers. Then I waste the space for faster inline code. BL/RT is a good bit faster than BLWP/RTWP. You do lose a little of that advantage when you must save/restore returns for multiple levels, but it is still faster. You also need to ensure that interrupts are disabled when accessing VRAM. I presume that Rich has handled that prior to calling the roll routines. That is why I did not use LIMI instructions in the above code. ...lee
  12. OK, Rich (@RXB), here is the complete suite: I have not had time to test it. The roll routines can probably be tightened up because there is quite a bit of redundancy, especially in UROLL and DROLL. ...lee
  13. Here is my first pass at your roll routines. The following spoiler has copy routines for VRAM to RAM and RAM to VRAM as well as the first roll routine, RROLL, that uses them: ...lee
  14. Indeed. I kind of thought of that after I posted (I think). I was a little punchy and needed to get back to bed. 😴 ...lee
  15. At first blush, you cannot use MV05 for overlapping copies to lower destination addresses because it will destroy the overlap region. This means you cannot use it for LROLL. UROLL is fine because there is no overlap, but there is a more efficient way I will work on. To use MV05 to copy more than one byte, you must pass the end source and destination addresses, not the beginning. You need to add one less than the byte count to each address. You are using R3 and R5 without realizing that MV05 is corrupting them. In RROLL, you are adding the saved column to the end (where you got it) rather than the beginning (where it belongs). I will work on this some more, but first this question: What is the largest block of free RAM in scratchpad for RXB? Can we use the FAC – ARG area (>834A – >836D)? If so, we could buffer a row there so we could use VDP multibyte copies for UROLL and DROLL. ...lee
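
The overlap problem raised in items 6 and 15 can be illustrated with a small model. This is a minimal sketch in Python rather than TMS9900 assembly, and it is not the MV05 routine itself: `copy_ascending`, `copy_descending`, and `demo` are hypothetical names. It assumes only what item 15 states, namely that a copy which starts from the end addresses (one less than the byte count added to each start address) is unsafe when the destination is lower than the source.

```python
def copy_ascending(mem, src, dst, count):
    """Copy count bytes, starting at the first byte and walking upward."""
    for i in range(count):
        mem[dst + i] = mem[src + i]


def copy_descending(mem, src, dst, count):
    """Copy count bytes, starting at the last byte and walking downward."""
    src_end = src + count - 1  # start address plus one less than the count
    dst_end = dst + count - 1
    for i in range(count):
        mem[dst_end - i] = mem[src_end - i]


def demo(copy, src, dst, count):
    """Run one copy on a fresh 8-byte 'row' and return the result."""
    row = bytearray(b"ABCDEFGH")
    copy(row, src, dst, count)
    return bytes(row)


# Shift right by one (destination higher than source): walk downward.
shift_right_ok = demo(copy_descending, 0, 1, 7)
shift_right_bad = demo(copy_ascending, 0, 1, 7)

# Shift left by one (destination lower than source): walk upward.
shift_left_ok = demo(copy_ascending, 1, 0, 7)
shift_left_bad = demo(copy_descending, 1, 0, 7)

print(shift_right_ok, shift_right_bad, shift_left_ok, shift_left_bad)
```

In the two "bad" cases each copied byte overwrites the source of the next copy, so a single byte smears across the whole row. Buffering a row in RAM, as proposed in item 15, sidesteps the issue because source and destination no longer overlap.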
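
The R0 fix described in item 7 comes down to two bit operations. The sketch below models them in Python; `set_vdp_write_address` and its `strip_flag` parameter are hypothetical names for illustration, and the function only tracks the register value, not any real VDP port access.

```python
VDP_WRITE_FLAG = 0x4000  # >4000: ORed into the address to request a write
VDP_ADDR_MASK = 0x3FFF   # >3FFF: keeps only the 14-bit VRAM address


def set_vdp_write_address(r0, strip_flag):
    """Return (value sent to the VDP, value left in R0 afterwards)."""
    sent = r0 | VDP_WRITE_FLAG
    # Equivalent of ANDI R0,>3FFF just before returning, when enabled.
    r0_after = (sent & VDP_ADDR_MASK) if strip_flag else sent
    return sent, r0_after


row_start = 0x0020  # hypothetical VRAM address of a screen row

sent_buggy, r0_buggy = set_vdp_write_address(row_start, strip_flag=False)
sent_fixed, r0_fixed = set_vdp_write_address(row_start, strip_flag=True)

# Advancing to the next 32-byte row from the leftover register value:
next_buggy = r0_buggy + 32  # still carries the >4000 flag
next_fixed = r0_fixed + 32  # a clean VRAM address

print(hex(sent_buggy), hex(next_buggy), hex(next_fixed))
```

Both versions send the same value to the VDP; the difference is only in what remains in the register for the next row.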
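
The record layout described in item 1 adds up to 11 + 3 × 9 = 38 bytes. The sketch below splits such a record in Python under stated assumptions: `split_record` is a hypothetical helper, the record is assumed to contain exactly those four fields in that order, and the 8-byte floating point values are returned as raw bytes without decoding. Because item 1 leaves open whether the string's length byte is always 10 or counts up to the first space, the code honors the length byte and then strips trailing spaces, which gives the same result either way.

```python
NAME_WIDTH = 10    # string field, space padded, after its length byte
FLOAT_WIDTH = 8    # floating point fields are always 8 bytes
FLOAT_COUNT = 3
RECORD_LENGTH = (1 + NAME_WIDTH) + FLOAT_COUNT * (1 + FLOAT_WIDTH)  # 38


def split_record(record):
    """Split one record into (name, [raw 8-byte float fields])."""
    if len(record) != RECORD_LENGTH:
        raise ValueError("unexpected record length")
    name_len = record[0]
    name_field = record[1:1 + NAME_WIDTH]
    name = name_field[:name_len].decode("ascii").rstrip(" ")
    floats = []
    offset = 1 + NAME_WIDTH
    for _ in range(FLOAT_COUNT):
        if record[offset] != FLOAT_WIDTH:
            raise ValueError("float field not preceded by length byte 8")
        floats.append(record[offset + 1:offset + 1 + FLOAT_WIDTH])
        offset += 1 + FLOAT_WIDTH
    return name, floats


# Made-up sample data: a padded name and three zero-filled float fields.
sample = bytes([10]) + b"HELLO     " + (bytes([8]) + bytes(8)) * 3
name, floats = split_record(sample)
print(name, len(floats))
```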