Qwe

Let's talk about double buffering


Hi,

On the Atari 8-bit, double buffering is not commonly used.
What is the best way to implement it?
On the Commodore Amiga you poll the scan line register and then change the video memory pointer in the copper list. That is all you need on the Amiga; is it similar on the Atari 8-bit?
On the Atari, is it better to poll VCOUNT or to wait for a VBI before swapping video memory?
Do I need to stop DMA, as I do when I start a new display list?

If I remember correctly, ANTIC has a 4 KB addressing limit; any suggestions for not wasting memory are welcome.


Sure, you can swap the DLIST pointer in the VBI or just change the LMS addresses in the DLIST. But you can have double buffering without any "manual" swapping. You just need two DLISTs (for 50 Hz), with the end of each DLIST pointing to the other: the last instruction of DLIST1 is JVB DLIST2, and the last instruction of DLIST2 is JVB DLIST1.
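
As a rough sketch of the two-DLIST layout described above, the Python below builds the raw bytes of two display lists that end by jumping to each other. The addresses ($2000 and $2400 for the lists, $4010 and $6010 for the screens), the helper name build_dlist and the choice of one LMS per mode F line are assumptions for illustration, not taken from anyone's actual code. The usual blank lines at the top of a display list are left out for brevity.

```python
LMS_MODE_F = 0x4F  # ANTIC mode F line with LMS, the opcode used in the listings in this thread
JVB = 0x41         # jump and wait for vertical blank


def build_dlist(screen, bytes_per_line, lines, next_dlist):
    """Return the bytes of a display list that shows `screen`
    and then jumps to `next_dlist` at the end of the frame."""
    out = bytearray()
    for n in range(lines):
        addr = screen + n * bytes_per_line
        out += bytes([LMS_MODE_F, addr & 0xFF, addr >> 8])
    out += bytes([JVB, next_dlist & 0xFF, next_dlist >> 8])
    return bytes(out)


# Hypothetical memory map: two lists, two screens.
DLIST1, DLIST2 = 0x2000, 0x2400
VRAM1, VRAM2 = 0x4010, 0x6010

dlist1 = build_dlist(VRAM1, 40, 192, DLIST2)  # ends with JVB DLIST2
dlist2 = build_dlist(VRAM2, 40, 192, DLIST1)  # ends with JVB DLIST1

print(len(dlist1), dlist1[-3:].hex(), dlist2[-3:].hex())
```

Because ANTIC follows the JVB on its own, the displayed buffer alternates on every frame, so the drawing code has to finish each buffer within one frame, which is presumably what the "(for 50 Hz)" remark refers to.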

Edited by MaPa


same options as on the Amiga...

you can wait on the VCOUNT register, you can do the buffer swap in the VBL interrupt or in a DLI, you can chase the beam, etc. You can alter the display list pointers (like the copper list plane pointers), or you can double buffer the complete display list.

the only issue is that you need to take care of the ANTIC 4K limit per scanline...
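
To make the 4K limit concrete, here is a small sketch that checks a linear screen layout. It relies on the fact that ANTIC's memory scan counter only counts within a 4K block, so a line whose data crosses a 4K boundary cannot be fetched correctly, and a line that starts exactly on a boundary needs its own LMS. The function name check_layout and the example base addresses are assumptions for illustration.

```python
BLOCK = 0x1000  # ANTIC's memory scan counter wraps inside a 4K block


def check_layout(base, bytes_per_line, lines):
    """Return (on_boundary, straddling) line numbers for a linear buffer.

    on_boundary: lines after the first that start exactly on a 4K boundary
                 (fine, as long as they get an LMS).
    straddling:  lines whose data would cross a 4K boundary in mid-line
                 (the layout has to be changed).
    """
    on_boundary = []
    straddling = []
    for n in range(lines):
        start = base + n * bytes_per_line
        last = start + bytes_per_line - 1
        if start // BLOCK != last // BLOCK:
            straddling.append(n)
        elif n > 0 and start % BLOCK == 0:
            on_boundary.append(n)
    return on_boundary, straddling


print(check_layout(0x2010, 40, 192))
print(check_layout(0x2000, 40, 192))
```

With 40-byte lines, a buffer that starts $10 bytes into a 4K block puts line 102 exactly on the boundary, while a buffer that starts on the block boundary itself has line 102 straddling it.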

 

I guess your code is not fast enough to update vram at 50 fps? if so, I do something like this in the main loop:

 

main_loop
    lda frame_counter
wait_loop
    cmp frame_counter
    beq wait_loop       ; wait until the VBL has bumped frame_counter
    lda #<vram2         ; show vram2 ...
    sta dlist_ad
    lda #>vram2
    sta dlist_ad+1
    ...
    ; ... and now do the fx into vram1
    ...
    lda frame_counter
wait_loop2
    cmp frame_counter
    beq wait_loop2      ; wait for the next frame
    lda #<vram1         ; show vram1 ...
    sta dlist_ad
    lda #>vram1
    sta dlist_ad+1
    ...
    ; ... and now do the fx into vram2
    ...
    jmp main_loop

and in the VBL interrupt
...
    inc frame_counter
...

the display list would be something like
...
    .byte $4f ;lms command
dlist_ad .word vram1
...


re: memory layout

it depends on what you need for your code... I assume a bitmap layout...

I always put bitmaps at $xx10 addresses.

now... in Arsantica 2 I used the double buffering technique from Rescue on Fractalus...

say you have, for example, a video mode with 40 bytes per line...

luckily, on the Atari you can "interleave" 2 video RAMs.

so per scanline, bytes 0-39 belong to vram1 and bytes 40-79 to vram2,

and the display lists looked like

dlist1
    .byte $4f
    .word vram1+line*80
    ...

dlist2
    .byte $4f
    .word vram2+line*80   ; vram2 = vram1+40 in this layout
    ...

 

so my code worked with a vram offset to point into the correct video RAM...

the drawing code could easily be pointed at the right video RAM with the index register...

so

    ldx #79             ; x = 79..40 addresses the vram2 half of each line
loop
    ...
    sta vram1,x
    ...
    dex
    cpx #39
    bne loop

just an idea.
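
The interleaved layout can be sketched as a small address calculation. The base address $4000 and the helper names are assumptions for illustration; only the 40 visible bytes per line and the 80-byte stride come from the description above.

```python
def interleaved_lms(vram, visible_bytes, lines, buffer_index):
    """LMS addresses for one of two buffers interleaved line by line.

    Each memory line holds buffer 0 followed by buffer 1, so the stride
    between display lines is twice the visible width.
    """
    stride = 2 * visible_bytes
    offset = buffer_index * visible_bytes
    return [vram + n * stride + offset for n in range(lines)]


def index_range(visible_bytes, buffer_index):
    """Index register values that address one buffer from the common base."""
    first = buffer_index * visible_bytes
    return range(first, first + visible_bytes)


VRAM = 0x4000  # hypothetical base address
print([hex(a) for a in interleaved_lms(VRAM, 40, 3, 0)])
print([hex(a) for a in interleaved_lms(VRAM, 40, 3, 1)])
print(index_range(40, 1))
```

The point of the layout is that both buffers share one base address, so the same drawing routine serves either buffer and only the starting value of the index register changes.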

 

in the Arsantica 3 intro

 

I used 256-byte scanlines... so pixel addressing was easier (base video RAM address + x + y*256), and I could again interleave 2 screens, including offscreen buffers on the left and right... so my 3D object did not need to be clipped ;).
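
A minimal sketch of why 256-byte lines make addressing easy: with the formula quoted above, y goes straight into the high byte of the address and x into the low byte, so no multiplication is needed. The base address $4010 is a hypothetical example, and x is taken here to be a byte column rather than a pixel position, which is an assumption.

```python
def byte_address(base, x, y):
    """Address of byte column x on line y with 256-byte scanlines."""
    return base + x + (y << 8)


def split(addr):
    """Return (low byte, high byte) of a 16-bit address."""
    return addr & 0xFF, addr >> 8


BASE = 0x4010  # hypothetical base address
addr = byte_address(BASE, 5, 3)
lo, hi = split(addr)
print(hex(addr), hex(lo), hex(hi))
```

As long as the low byte of the base plus x stays below 256, the high byte is simply the base high byte plus y.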

 

all of the above are tradeoffs, because ANTIC steals DMA cycles from the CPU due to the long dlists...


on the Atari you have the same tons of options as on the Amiga... while on, for example, the C64, Spectrum, CPC or Atari ST you are more limited in terms of "free-form" screen mode design... that's why Amiga and Atari are child and parent ;)

Edited by Heaven/TQA


Hi Quadrunner,

This is an interesting way to implement double buffering, but I do not quite understand the advantage of this technique.
Could you explain in more detail?

Thank you.


which technique do you mean? the interleaved one that doubles the scanlines horizontally?

well... it reduces the code size of the unrolled loops I had... and I only needed to write the code once for both buffers... if you separate the screens you might struggle with the RAM footprint, as you need separate routines (think of clear screen, EOR fillers, column drawers, etc.) for each vram...

so... it is really easier if you say what you are trying to achieve, so we can have a proper look at what might be the best solution...

have a look at my Voxel Planet 5200... it uses the interleave method, too... because ROM space was tight... when porting that voxel engine to the C64, I needed to duplicate each of the drawing routines, because the C64 can double buffer but only with a "fixed" layout (not talking about border-opening tricks here on the C64 or Atari ST).

think of the Amiga... you can set up the modulo offset per line, so you virtually extend the lines.

if you look at my newest child (sneak preview):

 

the double buffer technique is completely different... I went for a low DMA footprint here by keeping the display list short, and used the so-called VSCROL trick to reduce ANTIC's CPU cycle stealing...

ps... guys, don't spread this link... it's not officially released yet, as I am fighting with running it from a 5.25" disk.

Edited by Heaven/TQA


main_loop
    lda #0 ;counter for angle is in main_loop+1
    sta angle

    lda:cmp:req 20 ;wait 1 frame
    lda #<(vram2+2*line_length)
    sta vramad+1
    lda #>(vram2+2*line_length)
    sta vramad+2

    lda #0
    sta fps
    ...

dlist
    .byte $80
vramad .byte $6f,<(vram+1*line_length),>(vram+1*line_length)
    .rept 34
;   .byte $00
    .byte $8f
    .byte $2f
    .endr

    .byte $41
    .word dlist

    .proc NMI
    bit $d40f       ;NMIST: bit 7 set means DLI
    bpl VBL
    jmp dummy_dli   ;DLI vector, patched through dliv
dliv equ *-2

VBL
    sta regA
    stx regX
    sty regY
    jsr mpt_player.play
    sta $d40f ;reset NMI flag
    inc 20          ;frame counter polled in main_loop
    inc fps
    lda #<top_dli   ;restart the DLI chain every frame
    sta dliv
    lda #>top_dli
    sta dliv+1
    .if col8_flag=0
    lda #0
    sta $d01a       ;COLBK
    lda #$71
    sta $d01b       ;PRIOR
    .else
    lda #$b1
    sta $d01b       ;PRIOR
    .endif
quit
    lda regA
    ldx regX
    ldy regY
    rti

 

Edited by Heaven/TQA


Code snipped from:

 

 

 

org $2000
dlist1 
:3 dta $4f,$10,>vram  
.rept 73
.byte $4f,$10,>vram+1+#
.byte $4f,$10,>vram+1+#
.byte $4f,$10,>vram+1+#
; .byte 0
.endr
.byte $a0,$42
text_pointer1 .word text
 
.byte $41,<dlist1,>dlist1 
 
.align $200
 
dlist2 
:3  dta $4f,$50,>vram  ;* self generating: + (3+4)*2 = 14bytes
.rept 73
.byte $4f,$50,>vram+1+#
.byte $4f,$50,>vram+1+#
.byte $4f,$50,>vram+1+#
; .byte 0
.endr
.byte $a0,$42
text_pointer2 .word text
.byte $41,<dlist2,>dlist2 
 

loop
lda cloc
cmp cloc
beq *-2 
 
lda screen
bne _flp
shakr_test
lda #<dlist1
sta $d402
lda #>dlist1 
sta $d403
lda #34
sta fade_flag
 
lda demo_state
cmp #2
beq @+
jsr clr_screen2
bne _flp2
@
jsr clr_stars2
bne _flp2
_flp 
lda #<dlist2
sta $d402
lda #>dlist2
sta $d403
lda demo_state
cmp #2
beq @+
jsr clr_screen1
bne _flp2
@
jsr clr_stars1
 
_flp2 lda screen
eor #$40
sta screen
lda demo_state
cmp #2
bne @+
jsr render_stars
@
jsr render_scene
 
lda demo_state
cmp #3
beq @+ 
cmp #5
beq @+1
cmp #7
beq @+2
cmp #9
bne last_part
jmp @+3
last_part
cmp #11
bne continue
jmp @+4
continue
cmp #13
bne continue2
jmp @+5
continue2
jmp loop


and here is an example of the interleaved screen I was talking about... it's from the fractal flight at the beginning of AD:6502

 

dlist 
.byte $f0
.byte $4e
.word topplanetgfx
:15 .byte $0e
;debug line
; .byte $46
; .word charline
.byte $80
.rept 64
.byte $4f,<(vram+#*$60),>(vram+#*$60)
.byte $4f,<(vram+#*$60),>(vram+#*$60)
.endr 
.byte $80,$30,$4e
.word topplanetgfx+16*40
:54 .byte $0e
.byte $41
.word dlist
 
dlist2
.byte $f0
.byte $4e
.word topplanetgfx
:15 .byte $0e
;debug line
; .byte $46
; .word charline
.byte $80
.rept 64
.byte $4f,<(vram+#*$60+48),>(vram+#*$60+48)
.byte $4f,<(vram+#*$60+48),>(vram+#*$60+48)
.endr
.byte $80,$30,$4e
.word topplanetgfx+16*40
:54 .byte $0e
.byte $41
.word dlist2

and here is the column EOR filler of that fx... as you see, the Y register is used to point to the correct video buffer:

 

 

filler2
 
lda #0
.rept 64
eor vram+[#]*$60,y
sta vram+[#]*$60,y
.endr
rts
Edited by Heaven/TQA

> ... used the so called VSCROL trick to achieve low-CPU cycle stealing by Antic...

Sorry, is there any information on this trick? This is the first I have heard of it.


here you go... Fox's answer:

 

Hi,

>hope you are fine. can you remember how you discovered the VSCROL trick
>(mode9++) for Numen? Just for historical reasons.

That was during reverse-engineering ANTIC for better emulation in Atari800.

BTW. if I knew about C64 coding, I'd call it FLI instead. :)

0xF


Regarding double buffering: I did something similar, involving two DLISTs, for the picture in this post on my Super IRG thread:

http://atariage.com/forums/topic/188370-doing-pictures-using-super-irg-2-and-other-ice-modes/?p=3181643

In this case, however, I was swapping not only two DLISTs but also two DLIs every VBLANK. The screen memory is unchanged, but the font data (8 sets of it) and the DLI pointers get changed.

The idea here was to try to get a Graphics 9 screen with 256 colors per 8 scanlines (and about 4,096 colors onscreen at once, but that's just a rough estimate; there are 136 unique color palettes to choose from) by alternating DLIs each cycle. I am wondering whether anything could be done to improve the performance by using the VSCROL tricks from the demos above and cutting the resolution in half. As it stands now, I am using text mode, and the color changes interleave every 8 scanlines. It would be great to cut that down and maybe have the color changes interleave every scanline.

Edited by Synthpopalooza

