New GUI for the Atari 8-bit


flashjazzcat

That's why I mentioned the duality.

 

A type (1,2,5) LUT of multiple ROL instructions is equivalent to a type (7,6,3) (= (3,6,7)) LUT of multiple ROR instructions.

 

So it doesn't really matter. But yes, when applying the mod 8 rule, we should express everything as a number of ROR steps.

 

Then 7 * ROR = 1 * ROL, which can be executed directly with the {CMP #$80 : ROL @} or {ASL @ : ADC #$00} sequence. Likewise, the 2 * ROL table translates to a 6 * ROR table, and so on.
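
The duality can be checked with a small model of the 8-bit arithmetic. This is only a sketch in Python (not 6502 code) so that it can be run directly; the helper names `rol8`, `ror8`, `cmp_rol` and `asl_adc` are made up for illustration and model nothing beyond the accumulator and the carry flag.

```python
def rol8(value, count=1):
    """Rotate an 8-bit value left by count places (carry not involved)."""
    count %= 8
    return ((value << count) | (value >> (8 - count))) & 0xFF

def ror8(value, count=1):
    """Rotate an 8-bit value right by count places (carry not involved)."""
    return rol8(value, 8 - (count % 8))

def cmp_rol(a):
    """Model of CMP #$80 : ROL @ acting on the accumulator."""
    carry = 1 if a >= 0x80 else 0      # CMP #$80 sets carry when A >= $80
    return ((a << 1) | carry) & 0xFF   # ROL shifts the carry into bit 0

def asl_adc(a):
    """Model of ASL @ : ADC #$00 acting on the accumulator."""
    carry = a >> 7                     # ASL moves bit 7 into the carry
    a = (a << 1) & 0xFF
    return (a + carry) & 0xFF          # ADC #$00 adds the carry back into bit 0

# Check every byte value: 7 * ROR == 1 * ROL, and 6 * ROR == 2 * ROL.
for a in range(256):
    assert ror8(a, 7) == rol8(a, 1) == cmp_rol(a) == asl_adc(a)
    assert ror8(a, 6) == rol8(a, 2)
```

Both two-instruction sequences work because they copy bit 7 into the carry first and then feed the carry into bit 0, which is exactly an 8-bit rotate left.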


Well... I'm lost. :)

 

I had a thought. Say the character data is:

 

1111 1100

 

I want to rotate this, say, 5 places to the right across TEMP and TEMP+1. If the bit shift table is set up so that low-order bits fall off the right and go straight back into the high-order bits of a byte, I could use a mask to take care of TEMP+1.

 

ldx data

lda shift_5,x

 

This would result in:

 

1110 0111

 

What I want to see is:

 

TEMP TEMP+1

 

0000 0111 1110 0000

 

How about using a table of masks to take the high-order bits of TEMP and place them where they need to go:

 

sta temp

ldx pixels ; "pixels" being the number of shifts we want, in this case 5

and masktab-1,x ; mask out low order bits for TEMP+1

sta temp+1

lda temp

and masktab2-1,x ; mask out high order bits for TEMP

sta temp

 

masktab:

 

%1000 0000

%1100 0000

%1110 0000

%1111 0000

 

%1111 1000

%1111 1100

%1111 1110

 

masktab2:

 

%0111 1111

%0011 1111

%0001 1111

%0000 1111

 

%0000 0111

%0000 0011

%0000 0001

 

In the above example:

 

ldx data (x = 1111 1100)

lda shift_5,x (a = 1110 0111)

sta temp

ldx pixels (x = 5)

and masktab-1,x (1110 0111 & 1111 1000 = 1110 0000)

sta temp+1

lda temp

and masktab2-1,x (1110 0111 & 0000 0111 = 0000 0111)

sta temp

 

This still doesn't take care of how we quickly arrive at "lda shift_5,x" without some kind of jump table or branching.
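
As a sanity check on the rotate-then-mask idea, here is a minimal Python model of the steps above. It is a sketch, not the actual routine: `ror8` stands in for the `shift_N` table lookup, `split_shift` is a hypothetical name, and the two lists simply restate `masktab` and `masktab2` in hex.

```python
def ror8(value, count):
    """Rotate an 8-bit value right by count places (stands in for shift_N,x)."""
    count %= 8
    return ((value >> count) | (value << (8 - count))) & 0xFF

# Entry n-1 is used for a shift of n pixels (1..7), as in "masktab-1,x".
MASKTAB = [0x80, 0xC0, 0xE0, 0xF0, 0xF8, 0xFC, 0xFE]   # wrapped-round bits, for TEMP+1
MASKTAB2 = [0x7F, 0x3F, 0x1F, 0x0F, 0x07, 0x03, 0x01]  # bits that stayed put, for TEMP

def split_shift(data, pixels):
    """Return (TEMP, TEMP+1) for data shifted right by 1..7 pixels."""
    rotated = ror8(data, pixels)
    temp1 = rotated & MASKTAB[pixels - 1]
    temp = rotated & MASKTAB2[pixels - 1]
    return temp, temp1

# The worked example: 1111 1100 shifted right 5 places.
assert split_shift(0b11111100, 5) == (0b00000111, 0b11100000)

# Cross-check every byte and shift count against a plain 16-bit shift.
for data in range(256):
    for pixels in range(1, 8):
        temp, temp1 = split_shift(data, pixels)
        assert (temp << 8) | temp1 == (data << 8) >> pixels
```

The cross-check passes because a rotate right by n leaves the n wrapped-round bits in the top of the byte, which is exactly where they belong in TEMP+1.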

Edited by flashjazzcat

Yes, the table-lookup isn't exactly automatic, you'd either need self-modifying code or a branch decision taking you to the right place.

 

The mask tables like you described are used to discriminate the parts of the rotated byte you put in each bitmap cell. No point doing 2 shift operations on your source data when you can just do the one and use masking.

 

 

Another shortcut from the 6502 Killer hacks thread that might come in handy:

 

Perform a double-rotate left (with carry) :

asl a

adc #$80

rol a

 

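On its own that is equivalent to a 6 bit rotate right, cost = 6 cycles (three 2-cycle instructions). No extra rol a is needed: it would only rotate the leftover carry into bit 0.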

 

To swap nybbles (same as a 4-bit rotate without carry) :

asl a

adc #$80

rol a

asl a

adc #$80

rol a
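
The two sequences can be modelled bit by bit to see what they do. The following Python sketch assumes binary (non-decimal) mode and tracks only the accumulator and carry; the function names are invented for the example.

```python
def asl_adc80_rol(a):
    """Model ASL A : ADC #$80 : ROL A; returns (accumulator, carry)."""
    carry = a >> 7                      # ASL: bit 7 -> carry, 0 -> bit 0
    a = (a << 1) & 0xFF
    total = a + 0x80 + carry            # ADC #$80: old bit 7 lands in bit 0
    carry = total >> 8                  # carry out is the old bit 6
    a = total & 0xFF
    result = ((a << 1) | carry) & 0xFF  # ROL: carry -> bit 0
    carry = a >> 7
    return result, carry

def swap_nybbles(a):
    """Apply the three-instruction sequence twice, as in the nybble swap."""
    a, _ = asl_adc80_rol(a)
    a, _ = asl_adc80_rol(a)             # the second ASL ignores the incoming carry
    return a

for a in range(256):
    rotated, _ = asl_adc80_rol(a)
    assert rotated == ((a << 2) | (a >> 6)) & 0xFF      # rotate left 2 = rotate right 6
    assert swap_nybbles(a) == ((a << 4) | (a >> 4)) & 0xFF
```

In this model the three-instruction sequence already rotates the byte left by two places without involving the carry in the result, so two passes give the 4-bit rotate.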

 

 

I suppose the method you employ in the end will come down to cycle-counting and memory considerations. The other headache would be that you have the proportional font spacing calculations ongoing, not to mention having to do bounds testing.

Edited by Rybags

Yes, the table-lookup isn't exactly automatic, you'd either need self-modifying code or a branch decision taking you to the right place.

It gets less and less enticing by the minute, although I suspect the saving might start to show up with 24 bit wide fonts.

 

The mask tables like you described are used to discriminate the parts of the rotated byte you put in each bitmap cell. No point doing 2 shift operations on your source data when you can just do the one and use masking.

I know this: that's why I wrote the routine and posted it here. :)

 

That nybble swap is pretty neat. I could probably use that in the fixed-width 4-bit 80 column routine somewhere.

 

I suppose the method you employ in the end will come down to cycle-counting and memory considerations. The other headache would be that you have the proportional font spacing calculations ongoing, not to mention having to do bounds testing.

Indeed so. I also considered simply unrolling the bitshifting loop and jumping into the code at a point which would yield the desired number of in-line shifts.

 

Now I have MADS compiler trouble... it doesn't like my structs. :(


With the bit rotating for character renders, you have to calculate the # of rotations anyway, regardless of the method you choose to do the rotation and bit extraction.

 

So since you have that number, it'd be a simple case of just doing something to it then using it as a Branch offset, or modifier to a JMP instruction.


I was thinking of using indirect indexed mode, and storing #shifts (plus a base offset) into the MSB of the pointer. This assumes each shift table is 256 bytes long and starts on a page boundary. Then we could say:

 

lda #< shift_0

sta ptr

 

...

 

lda #> shift_0

clc

adc shifts

sta ptr+1

ldy char

lda (ptr),y
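
To show how the pointer arithmetic picks a table, here is a rough Python model. Everything concrete in it is an assumption for the sake of the example: `TABLE_BASE` is a hypothetical page-aligned address for `shift_0`, and the eight tables are laid out one per page, each holding every byte value rotated right by that table's shift count.

```python
def ror8(value, count):
    """Rotate an 8-bit value right by count places."""
    count %= 8
    return ((value >> count) | (value << (8 - count))) & 0xFF

TABLE_BASE = 0x4000  # hypothetical page-aligned address of shift_0

# Build a 64K memory image holding shift_0..shift_7, one 256-byte table per page.
memory = bytearray(0x10000)
for shifts in range(8):
    for value in range(256):
        memory[TABLE_BASE + shifts * 256 + value] = ror8(value, shifts)

def lookup(shifts, char):
    """Model of the pointer setup followed by lda (ptr),y."""
    ptr_lo = TABLE_BASE & 0xFF                     # lda #< shift_0 : sta ptr
    ptr_hi = ((TABLE_BASE >> 8) + shifts) & 0xFF   # lda #> shift_0 : clc : adc shifts : sta ptr+1
    address = ((ptr_hi << 8) | ptr_lo) + char      # lda (ptr),y with y = char
    return memory[address & 0xFFFF]

assert lookup(5, 0b11111100) == 0b11100111
assert lookup(0, 0x5A) == 0x5A
```

Adding the shift count to the pointer's high byte only works because each table occupies exactly one page; with any other layout the selection would need a multiply or a table of addresses.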

 

Also, the masking technique we discussed earlier extends nicely across 16 bit and 24 bit wide characters; one just continues to mask the current byte's high order bits and place them in the next byte along. I really think this will come into its own when dealing with the larger characters.

 

I was thinking about italic characters, too. With those, the background mask changes on every alternate line of the character, so the fact I've already eliminated the bit shifts in the background masking is good news. When italicised, the upper area of a seven bit character can extend past the right hand side of a sixteen bit range, so I guess I'll have the chance to try out these optimisations soon.


OK. I've implemented the shift tables using the following method:

 

; set up bit shift pointer at top of character render routine
lda xpix
clc
adc #> shift_table
sta shiftptr+1 ; index into shift tables

 

And this code replaces the bit shifting loop:

 

ldy temp ; character bitmap data
lda (shiftptr),y
sta temp
ldy xpix
and xpixtab-1,y
sta temp+1
lda temp
and rightpixtab-1,y
sta temp

 

This code is skipped if there are no bit shifts required. I calculate that at best this code executes in 31 cycles, and at worst 34.
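
The 31 and 34 cycle figures can be reproduced by adding up the instruction timings. The tally below is a sketch that assumes standard NMOS 6502 timings and that temp, xpix and shiftptr all live in zero page; the third column is the extra cycle an indexed read costs when it crosses a page boundary.

```python
# (instruction, base cycles, extra cycle if the indexed read crosses a page)
ROUTINE = [
    ("ldy temp",            3, 0),
    ("lda (shiftptr),y",    5, 1),
    ("sta temp",            3, 0),
    ("ldy xpix",            3, 0),
    ("and xpixtab-1,y",     4, 1),
    ("sta temp+1",          3, 0),
    ("lda temp",            3, 0),
    ("and rightpixtab-1,y", 4, 1),
    ("sta temp",            3, 0),
]

best = sum(base for _, base, _ in ROUTINE)
worst = best + sum(extra for _, _, extra in ROUTINE)

assert best == 31
assert worst == 34
```

If the shift tables are page-aligned and the pointer's low byte is zero, the `lda (shiftptr),y` read should never cross a page, so the practical worst case may be a cycle lower.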

 

Contrasted with the bit shifting loop:

 

lda temp
shiflp
lsr temp
lsr
ror temp+1
dex
bne shiflp
sta temp

 

With TEMP in zero page, each pass through this loop takes 17 cycles when the branch is taken (18 if the branch crosses a page boundary) and 16 on the final pass, when it falls through. Assuming no page crossing, and counting only the loop itself, at best it's 16 cycles (1 shift right), and at worst (7 shifts right) it's 118 cycles. So - if my math is correct - that's an average of 67 cycles.
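
For comparison with the table method, the loop's effect can be emulated step by step. This Python sketch makes two assumptions that are not shown in the listing: TEMP+1 is zero on entry, and X already holds the shift count (1..7). The function name is made up for the example.

```python
def shift_loop(temp, shifts):
    """Emulate the posted loop for 1..7 shifts; returns (TEMP, TEMP+1)."""
    a = temp                 # lda temp
    temp1 = 0                # assumption: TEMP+1 was cleared before the loop
    x = shifts               # assumption: X already holds the shift count
    while True:
        temp >>= 1                           # lsr temp (overwritten by the final sta)
        carry = a & 1                        # lsr: bit 0 of A goes to the carry
        a >>= 1
        temp1 = (carry << 7) | (temp1 >> 1)  # ror temp+1: carry enters bit 7
        x -= 1                               # dex
        if x == 0:                           # bne shiflp
            break
    return a, temp1          # sta temp

# Same worked example as before: 1111 1100 shifted right 5 places.
assert shift_loop(0b11111100, 5) == (0b00000111, 0b11100000)

# The loop agrees with a plain 16-bit shift for every byte and shift count.
for data in range(256):
    for shifts in range(1, 8):
        temp, temp1 = shift_loop(data, shifts)
        assert (temp << 8) | temp1 == (data << 8) >> shifts
```

Both approaches produce the same two bytes; the difference is that the loop's cost grows with the shift count while the table lookup's cost stays fixed.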

 

Already a considerable improvement, I think, and it will be even more pronounced when the source data is sixteen bits (or more) wide, requiring two "passes" through the shift logic.

 

fontrender1.wmv

Edited by flashjazzcat

Interesting.

 

Now that we see it works, the next step could be saving some memory by replacing the 6 and 7 tables with the fast rotate tricks:

 

 

• Rotating 7 to the right is equivalent to rotating 1 to the left, and we can do this with

 

 cmp #$80
rol

 

(if I'm correct this will take 4 cycles each time: 2 for the cmp and 2 for the rol)

 

 

• Rotating 6 to the right is equivalent to rotating 2 to the left, and we can do this with

 

 asl @
adc #$80
rol @

 

as Rybags pointed out.


I'm not sure whether you need the x-register there, but if not you can replace

 

ldy temp ; character bitmap data
lda (shiftptr),y
sta temp
ldy xpix
and xpixtab-1,y
sta temp+1
lda temp
and rightpixtab-1,y
sta temp

 

by

 

ldy temp ; character bitmap data
lda (shiftptr),y
tax
ldy xpix
and xpixtab-1,y
sta temp+1
txa
and rightpixtab-1,y

 

to gain some cycles.

 

("sta temp" at the end removed, as it's a temp value anyway. Can as well be saved in x-reg)


Obviously, this thread is considerably beyond my comprehension, and I possess merely enough intelligence to recognize that you three are bordering on genius.

 

But I wanted to ask - for the GUI - where the fonts are coming from? Would it be possible to allow it to use Windows TTF? Maybe a utility could be written to pare TTF files down to work on the Atari? Then you could use lots of fonts, and not have to create them?

 

.....and now, superior intellects, please resume.... :)


Where the fonts are coming from? Would it be possible to allow it to use Windows TTF? Maybe a utility could be written to pare TTF files down to work on the Atari? Then you could use lots of fonts, and not have to create them?

The only fonts that will be "created", so to speak, are ones that will be tailored to the GUI and its desired dimensions and style. Other than that it's only a matter of taste as to whether any existing fonts used need to be "edited", as there are innumerable fonts that can easily be parsed straight in. Yes, the whole thing could be done without creating a single font. :)


I might see about implementing this later if the tables prove too expensive in terms of RAM. I should be able to save some space by no longer having to pre-render the mouse pointer (which is drawn in the VBL) now that we have very fast bit shifting routines. In fact, all this stuff is re-usable for icon and UI element rendering. Things are really flying now. :)

 

Don't forget that TEMP is actually the left hand half of the character data to be ORed into the screen RAM. It has to find its way into a ZP register at some point.

 

If anyone wants to write a utility to convert TTF/GEM/Mac fonts to work with this system, I doubt Mr Fish or I would raise any objections. We're still finalizing the font format, however (and I still need to suggest to Mr Fish that we abandon the Atari internal character sequence), and once we have a reasonable collection of fonts in a few sizes (to prove the font renderer), the sky's the limit. However, creating the fonts is no trivial task: Mr Fish is doing some amazing work creating bespoke fonts which will make the most of screen space and still look great. I've seen the screen mock-ups!

 

ldy temp ; character bitmap data
lda (shiftptr),y
tax
ldy xpix
and xpixtab-1,y
sta temp+1
txa
and rightpixtab-1,y

 

ldy temp ; character bitmap data
lax (shiftptr),y
ldy xpix
and xpixtab-1,y
sta temp+1
txa
and rightpixtab-1,y

 

 

Illegals for the win ;)

Nice use of LAX! :)

 

By the way: I'd like to thank all the experts for their insight and for helping me brainstorm this problem. We now have a font renderer 100% faster than the original.

Edited by flashjazzcat

Through better management of the mouse pointer enable/disable routine (which waits a jiffy to sync with the interrupt), I've doubled the speed of the menu renderer:

 

 

I don't think pre-rendered bitmap menu panels would be much faster than that. :)

Edited by flashjazzcat

Wow, looks fabulous.

 

 

By the way: I'd like to thank all the experts for their insight and for helping me brainstorm this problem.

Well, I don't feel addressed as an "expert" here ;). I didn't really write similar code before.

 

However, I'm not sure whether the routine will be that fast when Italic is included. I expect you'd need to recompute the 16bit index register multiple times then. But, possibly it's not really a problem, as usually Italic isn't used that much.


The menus work exactly like their Windows counterparts at the moment: they only pull down when hovered over if you've already pulled a menu down with a click. I don't want to use the old Mac method of making the menus roll up as soon as the button is released. However, I see no reason why the hover-pull-down behaviour can't be made configurable. :)

 

Usually, italics are shifted right one bit every alternate line (looking at the character from the bottom up). The two line rule applies to any font size, to keep the slant uniform. We then have the slightly unusual situation of potentially shifting a byte more than seven places to the right. It may be worth re-evaluating the bit shift and dividing it into bytes, and loading the character data directly into the adjacent byte and then applying a smaller shift.

 

For example, if the top two lines of an 8-point character are to be shifted right three places, and the "xpix" offset of the character is already 7, 10 / 8 = 1 byte. So we load the top two lines into TEMP+1, then apply a 2 place right shift. The character data may then shift beyond 16 bits, of course, which is where TEMP+2, etc, come into play. Then - if we're drawing in a window - we need to do some clipping to make sure the shifted parts don't overwrite the scroll bar. I used a lot of counters in The Last Word's block marking routines, and they'd probably work well for clipping (which will always be on byte boundaries).
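
The byte/bit split in that example is just a divide by eight with remainder. Here is a small Python sketch of the idea; `split_offset` and `place_row` are hypothetical names, the three-byte row stands in for TEMP, TEMP+1 and TEMP+2, and dropping bits that fall past the last byte is an assumption standing in for the clipping step.

```python
def split_offset(xpix, slant):
    """Split a total right shift into (whole bytes to skip, remaining bit shift)."""
    return divmod(xpix + slant, 8)

def place_row(data, xpix, slant, width=3):
    """Place one 8-bit row of character data into a row buffer of width bytes."""
    byte_offset, bit_shift = split_offset(xpix, slant)
    row = [0] * width                      # TEMP, TEMP+1, TEMP+2
    shifted = (data << 8) >> bit_shift     # 16-bit result of the smaller shift
    if byte_offset < width:
        row[byte_offset] |= shifted >> 8
    if byte_offset + 1 < width:            # bits beyond the buffer are clipped
        row[byte_offset + 1] |= shifted & 0xFF
    return row

# xpix = 7 plus a slant of 3 gives 10 = 1 whole byte and a 2 place shift.
assert split_offset(7, 3) == (1, 2)

# A full row of pixels then starts in TEMP+1 and spills into TEMP+2.
assert place_row(0xFF, 7, 3) == [0x00, 0x3F, 0xC0]
```

The remaining bit shift is always in the range 0..7, so the existing shift tables and masks can be reused unchanged once the starting byte has been chosen.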

 

Similar provision will need to be made for objects partly crossing the left hand side of a window. That's a headache for later, though. :)

Edited by flashjazzcat

Through better management of the mouse pointer enable/disable routine (which waits a jiffy to sync with the interrupt), I've doubled the speed of the menu renderer:

 

 

I don't think pre-rendered bitmap menu panels would be much faster than that. :)

That looks incredibly responsive! Really looking forward to this.
