

Math FPU


46 replies to this topic

#1 Heaven/TQA OFFLINE  

Heaven/TQA

    Quadrunner

  • 10,074 posts
  • Location:Baden-Württemberg, Germany

Posted Wed May 17, 2017 1:43 AM

We have seen a lot of additional hardware expansions like video cards, disk adapters, dual POKEY, etc...

 

But is there not an FPU which would work with the 6502 somehow? (I don't mean having an Arduino board as a slave ;))

 

Something like what the 8088 etc. have?

 



#2 TXG/MNX OFFLINE  

TXG/MNX

    River Patroller

  • 3,604 posts

Posted Wed May 17, 2017 1:46 AM

we have seen a lot of additional hardware expansions like video cards, disk adapters, dual pokey etc...
 
But is there not an FPU which would work with the 6502 somehow? (I don't mean having an Arduino board as a slave ;))
 
something what 8088 etc have?
 

I think the Veronica cartridge is ideal for this.

#3 Rybags ONLINE  

Rybags

    Quadrunner

  • 14,838 posts
  • Location:Australia

Posted Wed May 17, 2017 2:12 AM

I forget the game/s, but there was a cartridge that had a maths chip which performed simple integer mul/div for 3D wireframe games (?)

 

In theory, given there's a VGA core and a sound core for VBXE, it'd be possible to have a reduced-function core that frees up some space and devotes it to maths functions... like maybe extra blitter functions that do array processing.

 

But really - the best option is probably something like a Veronica cart.



#4 Heaven/TQA OFFLINE  

Heaven/TQA

    Quadrunner

  • Topic Starter
  • 10,074 posts
  • Location:Baden-Württemberg, Germany

Posted Wed May 17, 2017 2:37 AM

I just thought about all the Motorola FPUs or Intel FPUs.

 

I am wondering because even in the 80s the A800 had a realtime clock, 80-column cards, RAM expansions, but no FPU...

 

i guess basic 16x16 signed mul would help a lot...

 

Because IMHO, my experience at the moment with all this nice new hardware is that the 6502 is the bottleneck when it comes to feeding the cards...



#5 thorfdbg OFFLINE  

thorfdbg

    Dragonstomper

  • 729 posts

Posted Wed May 17, 2017 3:55 AM

I just thought about all the Motorola FPUs or Intel FPUs.

 

At least as far as BASIC is concerned, this wouldn't help much. The math pack uses a very strange number format (a BCD-encoded mantissa with a base-100 exponent), compared to the Motorola and Intel FPUs, which support the old-but-still-state-of-the-art IEEE (binary, base-2) format.
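To make the format difference concrete, here is a minimal Python sketch of decoding a math-pack number. It assumes the commonly documented 6-byte layout: one byte holding the sign (bit 7) and an excess-64 exponent counting powers of 100, followed by five bytes of packed BCD with the decimal point after the first mantissa byte. The function name `decode_atari_bcd` is made up for this example.

```python
from fractions import Fraction


def decode_atari_bcd(raw: bytes) -> Fraction:
    """Decode a 6-byte math-pack style number into an exact fraction.

    Assumed layout: byte 0 = sign bit + excess-64 exponent (power of 100),
    bytes 1..5 = ten packed BCD digits, point after the first byte.
    """
    if len(raw) != 6:
        raise ValueError("expected exactly 6 bytes")
    sign = -1 if raw[0] & 0x80 else 1
    exponent = (raw[0] & 0x7F) - 64
    mantissa = 0
    for byte in raw[1:]:
        hi, lo = byte >> 4, byte & 0x0F
        if hi > 9 or lo > 9:
            raise ValueError("invalid BCD digit")
        mantissa = mantissa * 100 + hi * 10 + lo
    # The mantissa is five base-100 digits; scale so the first one is the integer part.
    return sign * Fraction(mantissa, 100 ** 4) * Fraction(100) ** exponent


one = decode_atari_bcd(bytes.fromhex("400100000000"))
half = decode_atari_bcd(bytes.fromhex("3F5000000000"))
```

A binary FPU would need every operand converted out of this representation and every result converted back, which is the overhead the post is pointing at.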
 



#6 Heaven/TQA OFFLINE  

Heaven/TQA

    Quadrunner

  • Topic Starter
  • 10,074 posts
  • Location:Baden-Württemberg, Germany

Posted Wed May 17, 2017 3:59 AM

i am talking about assembler of course... ;)

 

But I haven't found any FPU-like device compatible with the 6502. I am not a hardware guy, but I find stuff like PIAs and CPUs from Western Design Center but no FPU... so I guess one was never developed?

 

What about the Z80?



#7 Heaven/TQA OFFLINE  

Heaven/TQA

    Quadrunner

  • Topic Starter
  • 10,074 posts
  • Location:Baden-Württemberg, Germany

Posted Wed May 17, 2017 4:06 AM

Ah... just my 2 cents on why I came up with this topic... I am simply doing 3 projects on the Lynx... especially when doing 3D or calculation stuff... the built-in FPU comes in handy... if you have seen my Elements demo you see the difference to the A8...

 

- 4 MHz helps a lot, even though it is a 6502 and not a 65816 like the SNES.

- FPU with accumulation helps all the 3d calcs a lot

- blitter support (ok... sprites) for all the rest... parallel to CPU

- hardware clipping/virtual world 

- 512k HDD-like cart

- Sound is weak though

- 4096 colors, 4bit per R,G,B

 

the Lynx is imho one of the most balanced 6502 8bit systems (would make great stand alone console imho).

 

In the voxel landscape, tunnels and voxel-ball stuff you see the difference to the 1.77 MHz A8... the gfx mode is the same (4 bit / 2 pixels per byte).

The 3D uses the FPU extensively, plus the blitter for rendering the polys in the flat shading... the Gouraud shading relies heavily on the CPU for interpolation...

 

Now back on the A8, I am surprised that the VBXE might help with the blitting stuff... but I feel the CPU sweat... doing everything on its own...



#8 Rybags ONLINE  

Rybags

    Quadrunner

  • 14,838 posts
  • Location:Australia

Posted Wed May 17, 2017 4:17 AM

Yep, I don't think anything out there was easily compatible.

 

The "proper" implementation, before FPUs were incorporated into the CPU, was to have some dedicated instructions: if the FPU was present they would simply pass the parameters along, and if it was not present they would generate an exception and the operation was slowly performed by the handler.

68000 has the line-A and line-F instruction types, line-F were intended for coprocessor use (I think ST used A-line for graphics operations that could be offloaded to the blitter if installed).

 

A 6502 implementation - there's not much helpful interface hardware in our CPU to help.  There is the SYNC pin which external hardware can monitor and snoop the instruction to add extra functionality though whether we'd want to do it this way, doubtful.  The arcade Missile Command uses this pin with external hardware which senses (ind,X) instructions to allow bitmap access that puts a pixel per byte though in reality it's stored as packed 2bpp in memory.

 

A better implementation might be where the numbers are just loaded into hardware registers then a mode register initiates the operation and optionally generates an IRQ when the result is ready.



#9 Heaven/TQA OFFLINE  

Heaven/TQA

    Quadrunner

  • Topic Starter
  • 10,074 posts
  • Location:Baden-Württemberg, Germany

Posted Wed May 17, 2017 4:45 AM

Yeah... I am thinking of the same approach as on the Lynx or VBXE... not that you get FMUL, FADD etc...

 

Feed values into the device's registers via an I/O port, like the RGB color registers... then trigger the operation, wait for the copro to finish, and read the results... same as VBXE... same as on the Lynx.
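The write-operands, trigger, poll, read-result sequence described here can be modelled in a few lines. This is only a toy simulation in Python: the register offsets, the `START` and `BUSY` bits and the `latency` parameter are all invented for illustration and do not describe the Lynx, VBXE or any real device.

```python
class MathUnit:
    """Toy model of a register-driven 16x16 signed multiplier.

    Every offset and bit below is invented for this sketch.
    """
    OP_A = 0      # 2 bytes, little-endian
    OP_B = 2      # 2 bytes, little-endian
    CTRL = 4      # write START here to begin
    STATUS = 5    # bit 7 is set while the unit is busy
    RESULT = 6    # 4 bytes, little-endian
    START = 0x01
    BUSY = 0x80

    def __init__(self, latency=3):
        self.regs = bytearray(10)
        self.latency = latency        # status polls before the result is ready
        self._remaining = 0

    def write(self, offset, value):
        self.regs[offset] = value & 0xFF
        if offset == self.CTRL and value & self.START:
            self.regs[self.STATUS] |= self.BUSY
            self._remaining = self.latency

    def read(self, offset):
        if offset == self.STATUS and self.regs[offset] & self.BUSY:
            self._remaining -= 1      # each poll stands in for elapsed time
            if self._remaining <= 0:
                self._finish()
        return self.regs[offset]

    def _finish(self):
        a = int.from_bytes(self.regs[0:2], "little", signed=True)
        b = int.from_bytes(self.regs[2:4], "little", signed=True)
        self.regs[6:10] = (a * b).to_bytes(4, "little", signed=True)
        self.regs[self.STATUS] &= ~self.BUSY & 0xFF


def mul16_signed(dev, a, b):
    """Driver side: load operands, start, busy-wait, read back 32 bits."""
    for base, value in ((dev.OP_A, a), (dev.OP_B, b)):
        dev.write(base, value & 0xFF)
        dev.write(base + 1, (value >> 8) & 0xFF)
    dev.write(dev.CTRL, dev.START)
    while dev.read(dev.STATUS) & dev.BUSY:
        pass
    raw = bytes(dev.read(dev.RESULT + i) for i in range(4))
    return int.from_bytes(raw, "little", signed=True)
```

On real hardware the busy-wait loop is where the CPU could do other work in parallel, or the device could raise an IRQ instead of being polled.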



#10 Wrathchild OFFLINE  

Wrathchild

    Stargunner

  • 1,786 posts
  • Location:Reading, UK.

Posted Wed May 17, 2017 10:28 AM

IMHO, the UNO Cart is a veritable blank-canvas that can be easily utilised for this sort of task. Feel free to PM me and I can probably work with you to create a PoC.



#11 Mclaneinc OFFLINE  

Mclaneinc

    River Patroller

  • 4,733 posts
  • Location:Northolt, UK

Posted Wed May 17, 2017 10:40 AM

I forget the game/s but there was a cartridge that had a maths chip, performed simple integer multi/div for 3D wireframe games (?)

 

 

Assault Force 3D with the Supercharger 3D



#12 CharlieChaplin OFFLINE  

CharlieChaplin

    River Patroller

  • 2,448 posts

Posted Wed May 17, 2017 12:27 PM

Yep,

Assault Force 3D, Fandal created a file version of the game that worked without the cart but with the same speed: http://atariage.com/...-2#entry1562507

 

But since Atarians are pirates, let's steal some stuff:

- steal the SID chip from the C64

- steal the AY chip from the ST, Spectrum and Amstrad CPC

- steal the 65816 from the SNES

- steal the FPU from the Lynx

- steal the (second, third, fourth) Pokey from the 7800 Ballblazer carts

- steal the custom chips from the Amiga (e.g. use them instead of VBXE, Covox, etc.)

and use all these chips inside the A8...

 

Master plan: If all Atarians unite and steal millions of these chips and therefore kill all former competitors (since they are missing major chips) - then Atari 8Bit will finally rule the world, muhahaha...


Edited by CharlieChaplin, Wed May 17, 2017 12:28 PM.


#13 thorfdbg OFFLINE  

thorfdbg

    Dragonstomper

  • 729 posts

Posted Wed May 17, 2017 1:42 PM

 

But I haven't found any FPU-like device compatible with the 6502.

 

No, there isn't anything like this. The 6502 really addressed the low-budget market - FPUs were for number crunching and high-performance computing, a completely different market sector.

 

The 6502 lacks any sort of coprocessor interface that would allow an external chip to take over the instruction flow. The best one could do is to provide an FPU that operates on the basis of I/O registers, to be filled with source data, and a second set where the CPU could read the results.



#14 Rybags ONLINE  

Rybags

    Quadrunner

  • 14,838 posts
  • Location:Australia

Posted Wed May 17, 2017 10:25 PM

Another alternative could be a vector/array processor that works using DMA - but the problem is that Atari didn't implement DMA properly in that Antic accesses aren't predictable by other devices also wanting to do DMA.



#15 Heaven/TQA OFFLINE  

Heaven/TQA

    Quadrunner

  • Topic Starter
  • 10,074 posts
  • Location:Baden-Württemberg, Germany

Posted Wed May 17, 2017 10:30 PM

Nowadays I would think of a "math pack" inside existing cards and mapped to dxxx.

#16 TangentAudio OFFLINE  

TangentAudio

    Chopper Commander

  • 211 posts
  • Location:USA

Posted Thu May 18, 2017 8:00 AM

Another alternative could be a vector/array processor that works using DMA - but the problem is that Atari didn't implement DMA properly in that Antic accesses aren't predictable by other devices also wanting to do DMA.

 

Could apply the technique I am using in my PBI WiFi design, using dual port RAM in an FPGA.  One could probably design a nice math coprocessor in an FPGA and have it operate on sections of memory mapped to the Atari.  Just needs a little bit of handshaking to signal back and forth that data is available or locked.



#17 phaeron OFFLINE  

phaeron

    River Patroller

  • 2,166 posts
  • Location:USA

Posted Thu May 18, 2017 10:38 PM

 

Could apply the technique I am using in my PBI WiFi design, using dual port RAM in an FPGA.  One could probably design a nice math coprocessor in an FPGA and have it operate on sections of memory mapped to the Atari.  Just needs a little bit of handshaking to signal back and forth that data is available or locked.

 

IMO a "Veronica II" solution with a coprocessor would be more useful. If it had interrupts, on-board bootable flash, and more flexible memory windowing it would be possible to use it both as a math accelerator and a general-purpose program accelerator. Veronica's 14MHz 65C816 is already more than 8 times faster than the 6502; the main issues are that getting data in and out of it is awkward and you need the whole program written around using it (versus just a driver).



#18 Heaven/TQA OFFLINE  

Heaven/TQA

    Quadrunner

  • Topic Starter
  • 10,074 posts
  • Location:Baden-Württemberg, Germany

Posted Thu May 18, 2017 11:03 PM

With a 14 MHz 16-bit CPU it might be overkill to use a maths coprocessor, but it helps where it helps... ;)

#19 Matej OFFLINE  

Matej

    Chopper Commander

  • 245 posts

Posted Fri May 19, 2017 3:37 AM

And what about SNES enhancement chips???

 

WIKI

https://en.wikipedia...hancement_chips

 

As you know SNES was based on 65816...

 

Here is a DSP-1 (aka NEC uPD77C25) tech demo:

 

Datasheet:

http://www.fabiomont...-Data-Sheet.pdf

 

Or another DSP???



#20 Matej OFFLINE  

Matej

    Chopper Commander

  • 245 posts

Posted Fri May 19, 2017 3:42 AM

SA-1 demo:



#21 peteym5 OFFLINE  

peteym5

    Stargunner

  • 1,564 posts
  • Location:Buffalo NY USA

Posted Sat May 20, 2017 6:04 PM

For games like Tempest and a few other ongoing games I use table-based routines that do simple integer multiply and divide in under 20 clock cycles.

 

http://codebase64.or...6502_6510_maths
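One widely used table-based multiply on the 6502 is the quarter-square method, which relies on the identity a*b = f(a+b) - f(a-b) with f(x) = floor(x^2 / 4). The Python sketch below shows only the arithmetic; it is not necessarily the routine used in the games mentioned above, and the names `QUARTER_SQUARES` and `mul8` are made up for this example.

```python
# f(x) = floor(x*x / 4) for every possible sum of two 8-bit values.
# On a 6502 this would typically be split into low-byte and high-byte tables.
QUARTER_SQUARES = [(x * x) // 4 for x in range(511)]


def mul8(a, b):
    """Unsigned 8x8 -> 16 bit multiply using two table lookups and a subtract."""
    if not (0 <= a <= 255 and 0 <= b <= 255):
        raise ValueError("operands must be 8-bit unsigned")
    # a+b and a-b always have the same parity, so the rounding
    # errors of the two floor divisions cancel exactly.
    return QUARTER_SQUARES[a + b] - QUARTER_SQUARES[abs(a - b)]
```

The table costs about 1 KB when stored as 16-bit entries, which is the usual trade of memory for cycles on this CPU.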

 

I have some code that calculates a full arc tangent very quickly.

 

I am not sure if these can be adapted to run faster on a 65816 CPU.

 

I had an idea of creating a cartridge chip that supports XEGS bank switching and a math co-processor. Can this Veronica cartridge do this?



#22 David_P OFFLINE  

David_P

    Dragonstomper

  • 767 posts
  • Location:Canada

Posted Sat May 20, 2017 9:50 PM

A cheaper option would be to precalculate the info you need and store it in banks of the cart. The big Atarimax carts give you one megabyte to play with; lots of room for lookup tables there...
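As a rough illustration of the banked-table idea, the Python sketch below lays out a full 8x8 product table (65,536 two-byte entries, 128 KB) across banks and computes which bank and offset hold a given product. The 8 KB bank size and the row-major layout are assumptions chosen for the example, not a description of any particular cartridge.

```python
BANK_SIZE = 8 * 1024                      # assumed size of the bank window
ENTRY_BYTES = 2                           # 16-bit product per entry
ROW_BYTES = 256 * ENTRY_BYTES             # one row = all products for a fixed a
ROWS_PER_BANK = BANK_SIZE // ROW_BYTES    # 16 rows fit in one bank


def locate(a, b):
    """Return (bank, offset) of the entry holding a*b."""
    return a // ROWS_PER_BANK, (a % ROWS_PER_BANK) * ROW_BYTES + b * ENTRY_BYTES


def build_banks():
    """Precalculate the whole table, as a build script for the cart image might."""
    banks = [bytearray(BANK_SIZE) for _ in range(256 // ROWS_PER_BANK)]
    for a in range(256):
        for b in range(256):
            bank, off = locate(a, b)
            banks[bank][off:off + ENTRY_BYTES] = (a * b).to_bytes(ENTRY_BYTES, "little")
    return banks


def lookup(banks, a, b):
    """What the 6502 side does: select the bank, then read two bytes."""
    bank, off = locate(a, b)
    return int.from_bytes(banks[bank][off:off + ENTRY_BYTES], "little")
```

With these assumptions the table takes 16 banks, so a one-megabyte cart would still have most of its space free for other tables or program code.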

#23 peteym5 OFFLINE  

peteym5

    Stargunner

  • 1,564 posts
  • Location:Buffalo NY USA

Posted Sun May 21, 2017 8:36 AM

The Arc Tangent routine I use is here: http://codebase64.or...an2_8-bit_angle (it returns an 8-bit angle)

 

I know Trigonometry is tough to do in assembly and that is a common use of tables. This is a common Sine / Cosine lookup routine I use that returns based on 8-bit angle in the A register.

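; Returns |cos| or |sin| of the 8-bit angle in A (256 steps = full circle) as 0..255.
; The sign is not returned; the caller has to derive it from the quadrant.
GETCOS
    AND #127        ; the magnitude repeats every half circle
    CMP #64
    BCS NOTRIGINV   ; 64..127: index = angle - 64
    JMP DOTRIGINV   ; 0..63: mirror the index (63 - angle)
GETSIN
    AND #127
    CMP #64
    BCC NOTRIGINV   ; 0..63: index = angle
DOTRIGINV
    EOR #63         ; low six bits become 63 - index
NOTRIGINV
    AND #63         ; keep a 0..63 table index
    TAY
    LDA TABTRIG,Y
    RTS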

TABTRIG 
  dta 000, 006, 012, 018, 025, 031, 037, 043, 049, 056, 062, 068, 074, 080, 086, 092, 097, 103, 109, 115, 120, 126, 131, 136, 142, 147, 152, 157, 162, 167, 171, 176
  dta 181, 185, 189, 193, 197, 201, 205, 209, 212, 216, 219, 222, 225, 228, 231, 234, 236, 238, 241, 243, 244, 246, 248, 249, 251, 252, 253, 254, 254, 255, 255, 255, 255
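For readers who want to check the quadrant mirroring without an assembler, here is a Python model of the routine above using the same table values. `get_sin` and `get_cos` mirror the assembly and return magnitudes only; `signed_sin` and `signed_cos` are additions for this sketch that derive the sign from the top bit of the angle, which the posted routine leaves to the caller.

```python
TABTRIG = [
    0, 6, 12, 18, 25, 31, 37, 43, 49, 56, 62, 68, 74, 80, 86, 92,
    97, 103, 109, 115, 120, 126, 131, 136, 142, 147, 152, 157, 162, 167, 171, 176,
    181, 185, 189, 193, 197, 201, 205, 209, 212, 216, 219, 222, 225, 228, 231, 234,
    236, 238, 241, 243, 244, 246, 248, 249, 251, 252, 253, 254, 254, 255, 255, 255, 255,
]


def get_sin(angle):
    """Model of GETSIN: magnitude of sine, 256 angle steps per full circle."""
    a = angle & 127
    if a >= 64:          # second quadrant: read the table backwards
        a ^= 63
    return TABTRIG[a & 63]


def get_cos(angle):
    """Model of GETCOS: magnitude of cosine."""
    a = angle & 127
    if a < 64:           # first quadrant: cosine falls while sine rises
        a ^= 63
    return TABTRIG[a & 63]


def signed_sin(angle):
    """Extension for this sketch: sine is negative in the second half circle."""
    magnitude = get_sin(angle)
    return -magnitude if angle & 128 else magnitude


def signed_cos(angle):
    """Extension for this sketch: cosine is sine shifted by a quarter circle."""
    return signed_sin((angle + 64) & 255)
```

Only indices 0 to 63 are ever read, so the last table entry is never used by either routine.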

I have experimented with fast line drawing algorithms also. 

 

I admit it would be great to have a math co-processor inside the Atari or on a cartridge.

 

Anybody know where to get the chips for Supercharger 3D that was used in Assault Force 3D?



#24 Wrathchild OFFLINE  

Wrathchild

    Stargunner

  • 1,786 posts
  • Location:Reading, UK.

Posted Sun May 21, 2017 10:01 AM

Anybody know where to get the chips for Supercharger 3D that was used in Assault Force 3D?

 

I would wonder if that would be as cost effective as a board with an ARM/PIC 32 bit micro on it?

This is the UNO Cart doing the work of rendering the VRAM area a GameBoy would use into its own memory which is then 'fed out' via a Dlist approach similar to what the Tomek cart proposed. All the A8 is doing here is setting the top/left window coordinate to produce the effect of scrolling and setting the character data for the flower and water to produce the animation.
Therefore having the same architecture do anything from a replacement MathPak to even taking shape definitions & viewpoint and internally use 3D math calculations to produce the resulting scene is not going to sweat the microprocessor that much.



#25 Heaven/TQA OFFLINE  

Heaven/TQA

    Quadrunner

  • Topic Starter
  • 10,074 posts
  • Location:Baden-Württemberg, Germany

Posted Sun May 21, 2017 11:09 AM

Mark... anything helps... ;) I thought having such a tool inside a cart (whether a VBXE maths edition or a Pico) would be fine by me.

Assume 16x16 signed mul with MAC or even floating point.... in hardware done parallel to CPU.
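To show why a 16x16 signed multiply with accumulate is so useful for 3D work, here is a small Python model of such a unit and a dot product built on it. The 32-bit wrapping accumulator and the 8.8 fixed-point convention are assumptions picked for the example; `Mac16` and `dot3` are hypothetical names.

```python
def to_s16(x):
    """Interpret the low 16 bits of x as a signed value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def to_s32(x):
    """Interpret the low 32 bits of x as a signed value."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


class Mac16:
    """Model of a 16x16 signed multiplier with a 32-bit wrapping accumulator."""

    def __init__(self):
        self.acc = 0

    def clear(self):
        self.acc = 0

    def mac(self, a, b):
        self.acc = to_s32(self.acc + to_s16(a) * to_s16(b))
        return self.acc


def dot3(row, vec):
    """One matrix row times a vector, all values in 8.8 fixed point."""
    unit = Mac16()
    for m, v in zip(row, vec):
        unit.mac(m, v)
    return unit.acc >> 8      # 8.8 * 8.8 gives 16.16; shift back to 8.8
```

Rotating one vertex takes three such dot products, so nine multiply-accumulates in hardware would replace nine software multiplies and six additions on the CPU.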



