Math FPU


48 replies to this topic

#26 DavidMil OFFLINE  

DavidMil

    Moonsweeper

  • 438 posts
  • Location:Kingwood, Texas

Posted Sun May 21, 2017 6:02 PM

Correct me if I'm wrong, but didn't the 800 have a floating point math chip built onto the ROM board?  Seems like I remember one,

and then somebody came out with a 'Fast Chip' that wasn't really that much faster.

 

DavidMil



#27 Stephen OFFLINE  

Stephen

    Quadrunner

  • 6,513 posts
  • A8 Gear Head
  • Location:No longer in Crakron, Ohio

Posted Sun May 21, 2017 6:25 PM

Correct me if I'm wrong, but didn't the 800 have a floating point math chip built onto the ROM board?  Seems like I remember one,

and then somebody came out with a 'Fast Chip' that wasn't really that much faster.

 

DavidMil

Yes, but that was only a set of software floating-point routines in ROM, executed by the main CPU. It was not a separate math chip.



#28 Matej OFFLINE  

Matej

    Moonsweeper

  • 323 posts

Posted Mon May 22, 2017 7:19 AM

What happened to the Tomek-8 cart?

 

Nice inspiration...


Edited by Matej, Mon May 22, 2017 7:27 AM.


#29 rpiguy9907 OFFLINE  

rpiguy9907

    Star Raider

  • 92 posts
  • Location:New Jersey, USA

Posted Mon May 22, 2017 8:33 AM

Both the Motorola 68881 and Weitek FPUs had a memory-mapped I/O mode, so you could theoretically hook one up to an 8-bit system and address it as a peripheral.

 

It would be very interesting. I wonder if the I/O mapping would even fit in 64K of address space (I have no recollection of how much I/O space you needed to memory map to get them to work).

 

In any case, the short version is that any FPU that can be memory mapped can be added as a peripheral. If you want to add an FPU that does not have a memory-mapped mode, then you have to build an interface and potentially include another CPU on the board.

 

I would bet the SNES accelerators were memory mapped somehow.



#30 TXG/MNX OFFLINE  

TXG/MNX

    River Patroller

  • 3,627 posts

Posted Mon May 22, 2017 11:54 AM

I would love it if someone made use of the Veronika cartridge for this, because there is almost no real support for this nice cartridge.



#31 peteym5 OFFLINE  

peteym5

    Stargunner

  • 1,806 posts
  • Location:Buffalo NY USA

Posted Mon May 22, 2017 5:58 PM

Table-driven multiply and divide routines calculate in about 50 clock cycles, versus 150+ cycles for the standard bit-rotating (shift-and-add) multiply and divide. That includes loading registers and the time for JSR and RTS.
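
The post does not say which tables its routines use. As a minimal sketch, assuming the common quarter-square method (a*b = f(a+b) - f(|a-b|) with f(x) = floor(x*x/4)), the idea can be compared against a plain shift-and-add multiply. The names `QUARTER_SQUARES`, `table_multiply` and `shift_add_multiply` are made up for this illustration:

```python
# Quarter-square multiplication: a*b = f(a+b) - f(|a-b|), with f(x) = x*x // 4.
# Assumption: this is one common table-driven method, not necessarily the one
# the routines discussed in this thread use.

# Table of floor(x^2 / 4) for x in 0..510 (510 is the largest sum of two bytes).
QUARTER_SQUARES = [(x * x) // 4 for x in range(511)]


def table_multiply(a: int, b: int) -> int:
    """Multiply two unsigned 8-bit values with two table lookups and a subtract."""
    if not (0 <= a <= 255 and 0 <= b <= 255):
        raise ValueError("operands must be unsigned 8-bit values")
    return QUARTER_SQUARES[a + b] - QUARTER_SQUARES[abs(a - b)]


def shift_add_multiply(a: int, b: int) -> int:
    """Reference: the bit-by-bit shift-and-add multiply that the tables replace."""
    result = 0
    for _ in range(8):      # one iteration per bit of the 8-bit multiplier
        if b & 1:           # low bit set: add the shifted multiplicand
            result += a
        a <<= 1
        b >>= 1
    return result
```

The table version does a fixed amount of work per multiply, while the loop runs once per multiplier bit, which is roughly where the cycle difference on a 6502 comes from.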

 

I am not sure how many cycles the SuperCharger 3D required. You needed to write to 3 registers in D5xx with the command and then execute 4 NOPs, so it maybe takes between 25 and 35 cycles to set up the registers, issue the command, and wait for the result.

 

We talked about Fast Line Drawing algorithms here.

http://atariage.com/...ine-algorithms/

That routine gets you about 4 or 5 lines drawn per VBI cycle.

 

If someone were to make a fast math chip today, it could perhaps have the result ready by the time the 6502 reads the result register.

 

Can we get something to assist with fast line drawing?

 

Edit. I needed to factor in setting up registers and the time it takes to do JSR + RTS.


Edited by peteym5, Mon May 22, 2017 6:16 PM.


#32 Wrathchild ONLINE  

Wrathchild

    Stargunner

  • 1,876 posts
  • Location:Reading, UK.

Posted Mon May 22, 2017 7:24 PM

You'd need to decide up front the delineation of work, e.g. Is the A8 going to draw the lines or the cart? 

Are you working in integer, fixed point or floating point number representations?

 

Rein that back a little and the cart could, given the start/end co-ords, calc all the offsets and 8-bit values to poke to the screen memory.

Rein that back a little more and the cart could instead return all the X,Y points of the resulting line and the A8 plots those.

Alternatively the cart could build the assembly code to do the plotting and the A8 can call that (i.e. exposed through cartridge memory)

etc etc

right back to the A8 copying operands to the D5xx, instigating the operation and awaiting the result, flagged when ready by the cart, which is then copied for further use.
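
The last option (copy operands to D5xx, start the operation, wait for a ready flag, copy the result) can be sketched as a toy model. Everything specific here is an assumption for illustration: the register addresses, the command code, the ready bit, and the names `MathCart` and `host_multiply` are hypothetical and do not describe any real cartridge:

```python
class MathCart:
    """Toy model of a cartridge exposing registers in the $D5xx page.

    The register layout below is hypothetical, chosen only for this sketch.
    """

    OP_A, OP_B, COMMAND, STATUS, RESULT_LO, RESULT_HI = range(0xD500, 0xD506)
    CMD_MULTIPLY = 0x01   # hypothetical command code
    READY = 0x80          # hypothetical "result ready" bit in STATUS

    def __init__(self):
        self.regs = {addr: 0 for addr in range(0xD500, 0xD506)}

    def write(self, addr, value):
        self.regs[addr] = value & 0xFF
        if addr == self.COMMAND:
            self.regs[self.STATUS] = 0      # busy until the result is stored
            self._execute(value & 0xFF)

    def _execute(self, command):
        if command == self.CMD_MULTIPLY:
            product = self.regs[self.OP_A] * self.regs[self.OP_B]
            self.regs[self.RESULT_LO] = product & 0xFF
            self.regs[self.RESULT_HI] = product >> 8
        self.regs[self.STATUS] = self.READY

    def read(self, addr):
        return self.regs[addr]


def host_multiply(cart, a, b):
    """What the A8 side would do: store operands, start, poll, fetch result."""
    cart.write(MathCart.OP_A, a)
    cart.write(MathCart.OP_B, b)
    cart.write(MathCart.COMMAND, MathCart.CMD_MULTIPLY)
    while not cart.read(MathCart.STATUS) & MathCart.READY:
        pass                                # on real hardware: a polling loop
    return cart.read(MathCart.RESULT_LO) | (cart.read(MathCart.RESULT_HI) << 8)
```

For example, `host_multiply(MathCart(), 200, 123)` walks through three register writes, one status poll, and two result reads. That fixed per-call traffic is the overhead being weighed against the time saved on the calculation itself.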

 

At a simplistic level, the overheads of the last method may not provide the advantages sought but on the whole it would.

e.g. an integer-based 'sin' table lookup versus a multiplication of two floating point numbers (replacing the 'setup in zeropage and call mathpak' approach).

 

Following on from this, pushing higher level functions into the cart, such as vector math, would also result in big time gains.

This way the A8 would be able to render an object, say the Utah teapot, by performing all the viewport transform math with the assistance of the cart.

 

But the argument then moves on to "why stop there". Supply the object definition, world co-ordinates and viewport definition and the cart can do all that for us.

In a way, this becomes similar to what the VBXE offers. The next step would be to delegate operations for changing the scene, e.g. through rotating or re-positioning the object, camera, even light source(s).

 

At the far end of things, the A8 becomes more of a Human Interface Device, passing joystick or key press details onto the cart to effect a change to the scene.

I'm OK with that, and confident the UNO Cart can help provide it. If anyone has code (preferably in C) for a neat 3D demo, either lines or polygons that can be adapted I'm happy to try.

(Along the lines of a "floating point" example, I was looking at a minimal ray-tracer which could be adapted to do something simple at, say, 160*192@4 colors or 80*192@16 shades)

 



#33 Heaven/TQA OFFLINE  

Heaven/TQA

    Quadrunner

  • Topic Starter
  • 10,362 posts
  • Location:Baden-Württemberg, Germany

Posted Tue May 23, 2017 8:13 AM

What makes table-driven muls weak at some point, even though they are fast, is accuracy... so I see benefits here in accuracy as well as speed.

A table-driven 16x16 signed mul becomes necessary at the latest when handling 3D objects. The torus I used in the Lynx demo is rendered flat shaded at the desired speed with 128 faces and 64 vertices... the same code on the A8... a slide show, even without counting the rendering.

So now think of better accuracy when doing subpixel positioning for floating vectors... in a 3D world... now you are dealing with 24-bit signed numbers... see how quickly your 8-bit CPU gets into trouble...
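
To make the cost concrete, here is a minimal sketch of how a signed 16x16 multiply decomposes into four 8x8 partial products, which is the work an 8-bit CPU has to do for every 8.8 fixed-point multiply. The function names are made up for this illustration, and `mul8` stands in for whatever 8x8 primitive is available (a table lookup or a hardware register):

```python
def mul8(a, b):
    """Stand-in for one 8x8 -> 16-bit multiply (table lookup or hardware op)."""
    return (a & 0xFF) * (b & 0xFF)


def mul16_signed(a, b):
    """Signed 16x16 -> 32-bit multiply built from four 8x8 partial products."""
    negative = (a < 0) != (b < 0)           # result sign from operand signs
    a, b = abs(a), abs(b)
    a_lo, a_hi = a & 0xFF, a >> 8
    b_lo, b_hi = b & 0xFF, b >> 8
    product = (mul8(a_lo, b_lo)
               + ((mul8(a_lo, b_hi) + mul8(a_hi, b_lo)) << 8)
               + (mul8(a_hi, b_hi) << 16))
    return -product if negative else product


def fixed_mul_8_8(a, b):
    """Multiply two signed 8.8 fixed-point values and return an 8.8 result.

    The intermediate is a 16.16 value; shifting right by 8 drops the extra
    fractional bits (rounding toward negative infinity).
    """
    return mul16_signed(a, b) >> 8
```

As a usage example, `fixed_mul_8_8(0x0180, 0x0200)` multiplies 1.5 by 2.0 and gives `0x0300`, which is 3.0 in 8.8 format. Moving to 24-bit values raises the partial-product count from four to nine.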

#34 thorfdbg OFFLINE  

thorfdbg

    Dragonstomper

  • 739 posts

Posted Tue May 23, 2017 1:13 PM

Both the Motorola 68881 and Weitek FPUs had a memory-mapped I/O mode, so you could theoretically hook one up to an 8-bit system and address it as a peripheral.

 

Not exactly. The 68881/82 implements the Motorola 68020 coprocessor interface. With some additional glue logic, you can make it appear as if it were a 16-bit data port plus a control port. Yes, it can be done, but it is not straightforward.

 

There once was an expansion card for the Amiga with a 68881 on it operating as an I/O device with such glue logic involved. It was not exactly fast because the CPU had to manually feed the FPU, and had to manually check for results.



#35 thorfdbg OFFLINE  

thorfdbg

    Dragonstomper

  • 739 posts

Posted Tue May 23, 2017 1:17 PM


We talked about Fast Line Drawing algorithms here.

http://atariage.com/...ine-algorithms/

That routine gets you about 4 or 5 lines drawn per VBI cycle.

Line drawing doesn't require an FPU. Line drawing only requires integer add, subtract, compare and increment/decrement.

 

Yes, it can be done in hardware, of course. Look at the Amiga blitter, it does exactly that.
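
The integer-only claim can be illustrated with a minimal sketch of Bresenham's line algorithm (the function name `bresenham` is just for this example). The loop uses nothing beyond add, subtract, compare and increment/decrement:

```python
def bresenham(x0, y0, x1, y1):
    """Return the pixels of a line using only integer add, subtract, compare."""
    points = []
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1       # step direction along x
    sy = 1 if y0 < y1 else -1       # step direction along y
    err = dx + dy                   # running error term
    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = err + err              # doubling is an add (or shift), not a mul
        if e2 >= dy:                # error says: step in x
            err += dy
            x0 += sx
        if e2 <= dx:                # error says: step in y
            err += dx
            y0 += sy
    return points
```

For example, `bresenham(0, 0, 5, 2)` returns `[(0, 0), (1, 0), (2, 1), (3, 1), (4, 2), (5, 2)]`.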



#36 Heaven/TQA OFFLINE  

Heaven/TQA

    Quadrunner

  • Topic Starter
  • 10,362 posts
  • Location:Baden-Württemberg, Germany

Posted Tue May 23, 2017 3:10 PM

Thorfdbg

Not exactly... if you are talking about Bresenham, yes. But if you start using DDA you already have a mul involved... now add subpixel precision for better quality and voilà... an FPU comes in handy.
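
A minimal sketch of a DDA line in 8.8 fixed point, to show where the extra arithmetic comes from. This version assumes integer pixel endpoints and computes the per-step increments with a division per line (a reciprocal table plus a multiply is a common substitute); `FRAC_BITS`, `ONE` and `dda_line` are names made up for this example:

```python
FRAC_BITS = 8               # 8.8 fixed point: 8 integer bits, 8 fractional bits
ONE = 1 << FRAC_BITS


def dda_line(x0, y0, x1, y1):
    """Walk a line by repeatedly adding fixed-point increments to x and y."""
    dx, dy = x1 - x0, y1 - y0
    steps = max(abs(dx), abs(dy))
    if steps == 0:
        return [(x0, y0)]
    # Per-line setup: this is the division (or reciprocal multiply) that
    # Bresenham avoids. The results are 8.8 increments per step.
    x_inc = (dx * ONE) // steps
    y_inc = (dy * ONE) // steps
    # Start at the pixel centre so that dropping the fraction rounds sensibly.
    x = x0 * ONE + ONE // 2
    y = y0 * ONE + ONE // 2
    points = []
    for _ in range(steps + 1):
        points.append((x >> FRAC_BITS, y >> FRAC_BITS))
        x += x_inc          # the truncation error in the increments builds up
        y += y_inc          # here, so long lines need more fractional bits
    return points
```

With only 8 fractional bits, the truncated increment can drift by up to a pixel over long lines, which is one reason wider formats come up once subpixel accuracy matters.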

#37 CharlieChaplin OFFLINE  

CharlieChaplin

    River Patroller

  • 2,570 posts

Posted Tue May 23, 2017 3:49 PM


(Along the lines of a "floating point" example, I was looking at a minimal ray-tracer which could be adapted to do something simple at, say, 160*192@4 colors or 80*192@16 shades)

 

 

Hmmm,

 

if I remember correctly (but I might be wrong here), there are some raytracing programs for the A8, e.g. one from the German C'T magazine (includes a demo program and the editor/creation program) either in Gr. 8 or Gr. 15, and another one from the Abbuc PD library (with several demos and a creation program) in Gr. 9+11. The demos look fine, however the editor/creation programs take some hours to calculate one single picture. With a math FPU or something similar this could be done much faster. Will take a look at my PD collection and post the ATR images tomorrow...



#38 peteym5 OFFLINE  

peteym5

    Stargunner

  • 1,806 posts
  • Location:Buffalo NY USA

Posted Wed May 24, 2017 9:51 AM

Believe me, I looked into how things like a 65816, undocumented opcodes, VBXE, or some other co-processor can improve line drawing speed. Maybe the 65816 can do the slope compare faster, or add/subtract the slope faster.



#39 CharlieChaplin OFFLINE  

CharlieChaplin

    River Patroller

  • 2,570 posts

Posted Wed May 24, 2017 11:47 AM

Alright, here is the Raytracing program from C'T magazine 1/1986. It includes the creation program (Kugel.TUR or Kugel3.LST; Kugel means sphere) that runs with TB XL on 64k machines and a demo program (Demo.Tur or Demo.LST) that runs with TB XL on 128k machines...  CT_1_1986.zip (contains Kugelraytracer -tur.ATR)

 

The creation program creates/calculates the Raytracing picture with the sphere and a selectable background (triangles, checkered/chessboard, etc.) and then draws a Gr. 8 picture in b/w on the screen. With a real A8 this takes quite a while. I tested the creation program with the given/pre-defined values under Atari 800 Win at full speed (dual-core CPU with 3.1 GHz; screen-size 640x480, full speed = 1800%) and it took some minutes until the picture was fully generated/drawn. Since C'T is a German magazine (still available today, but PC only), this type-in listing from 1/1986 is also in German. You may need a dictionary or some online German-to-English translation...

 

As said before, there is also a demo on the disk-image, just run Demo.Tur on your min. 128k machine.

Also attached you will find other Raytracing demos:

 

- RayAnimA.ATR and RayAnimB.ATR: These 3 demos in Gr. 8 were done by Eisbaer Corp./Karl Pelzer (he once was and maybe still is an Abbuc member) and they are based on the algorithm of the above C'T program from 1986. Boot the diskette, press 1,2, or 3 to choose one of the three available demos, then press 1,2,3 or 4 for the speed of the animation.

 

- Ray128k.ATR: This Gr. 9+11 raytracing animation demo (named Landscape) was also done by Karl Pelzer; there even was an editor/creator program for this Gr.9+11 raytracing animation, but alas, I cannot find it anymore...

 

Maybe these demos/animations and/or the editor/creation program are useful to demonstrate the CPU and/or math FPU speed. There are also some fractal generation programs available for the A8, e.g. "Fractal Express", that could be used for some calculation speed tests...


Edited by CharlieChaplin, Wed May 24, 2017 11:53 AM.


#40 Heaven/TQA OFFLINE  

Heaven/TQA

    Quadrunner

  • Topic Starter
  • 10,362 posts
  • Location:Baden-Württemberg, Germany

Posted Wed May 24, 2017 1:12 PM

Believe me, I looked into how things like a 65816, undocumented opcodes, VBXE, or some other co-processor can improve line drawing speed. Maybe the 65816 can do the slope compare faster, or add/subtract the slope faster.

 

sure?

 

the dest Y step of the VBXE will add your slope at 14 MHz speed... not sure how you would do that faster on a 6502...

 

but forget 8bit integer non fractional bresenham....

 

think of subpixel lines... where you need at least 8.8 or 3d world coords where you deal with 24bit at least...



#41 Matej OFFLINE  

Matej

    Moonsweeper

  • 323 posts

Posted Wed May 24, 2017 2:35 PM

Which FPU was in the Lynx? Can it be replicated in an FPGA? Another way is using an MCU for doing the math stuff or matrices for 3D. Some kind of simple 3D card...

#42 Heaven/TQA OFFLINE  

Heaven/TQA

    Quadrunner

  • Topic Starter
  • 10,362 posts
  • Location:Baden-Württemberg, Germany

Posted Wed May 24, 2017 11:59 PM

The Lynx maths functions are inside the Suzy "Blitter". So as emulators are out there... would that be enough?

#43 Matej OFFLINE  

Matej

    Moonsweeper

  • 323 posts

Posted Thu May 25, 2017 1:47 AM

Hmm Best Electronics are selling Suzy for 20USD...

C302284 / VC5139

http://www.best-elec...om/custom-i.htm

August 2016...

 

But don't make something that has already been done!!!

 

Here on the forum there is sio2pi (Raspberry Pi)...

So maybe an RPi can emulate an FDD plus a second 6502...

 

Here is a similar project: a 150 MHz 6502 for the BBC Micro.

 

So I can imagine an SIO cable with an RPi 1/2/3 or RPi Zero in a nice case on the end.

 

BBCmicro Turbocard

http://stardot.org.u...php?f=3&t=11325

 

A Pi Zero costs about as much as that Suzy... so it is the cheapest solution. Plus you can buy an old RPi1 or RPi2 from eBay for a few $.

 

It would also be an easy DIY project (just an SIO cable).

 

Sio2Pi:

http://atariage.com/...pi-as-a-floppy/

 

The RPi is also a good choice for the future, as millions have already been manufactured.

 

Here is a screenshot of the 6502 BBC Micro test...

FPU - 165MHz vs. 2MHz original

intro4.jpg


Edited by Matej, Thu May 25, 2017 1:56 AM.


#44 Matej OFFLINE  

Matej

    Moonsweeper

  • 323 posts

Posted Thu May 25, 2017 2:00 AM

Here is 287MHz version:

https://github.com/h...eleases/tag/boa

SCREENSHOT

http://stardot.org.u...29791&mode=view

 

Video of RPi co-processor ELITE GAME, GEM:

 

ANOTHER VIDEO:


Edited by Matej, Thu May 25, 2017 2:13 AM.


#45 Heaven/TQA OFFLINE  

Heaven/TQA

    Quadrunner

  • Topic Starter
  • 10,362 posts
  • Location:Baden-Württemberg, Germany

Posted Thu May 25, 2017 3:07 AM

Nah isn't that real cheating? ;)

#46 Matej OFFLINE  

Matej

    Moonsweeper

  • 323 posts

Posted Thu May 25, 2017 4:54 AM

Isn't an STM32 (ARM), an FPGA, or a 16 MHz blitter/FPU cheating too?

Isn't 1 MB or VBXE cheating??? I say NO! It's just technological progress. How would the A800 have evolved without the Atari ST range, for example...

Basically it would be a 100-200 MHz 6502 :D plus maybe an FDD emulator + ethernet.

 

I will be happy to see:

- real doomlike 3D games

- real car 3D games

- real flight simulators

- street fighter like games with huge player sprites

- games using lots of sprites and big enemy sprites

- 512 or 1024 sprites r-type/project x like shooters

- 3D demos with high FPS (new effects possible)

- 8 channel MODs as music or 4 channel sampled effects plus 4 channel pokey music

- with ethernet online gaming will be possible

- with 150mhz+ethernet mmorpg or mmofps will be possible or rts too

- playing movies in game intro

- webbrowser (html4 to 320x192 graphics)

- VOXEL engine like Minecraft

- etc etc etc

 

Also, the RPi can be a universal upgrade:

- fdd

- ethernet

- 6502 emulation (100mhz or more)

-...(something else in future??? USB??? WIFI???)

For 5Euro (PiZero)-50Euro(Pi3) thats pretty cool!

- you can buy RPi in any country (distribution logistics) + re-use old models (ecology)

- the Atari community already uses an RPi solution (Atari ST Cosmos-EX)

 

Here is what an RPi cartridge should look like.

(This one is for Acorn Electron)

http://imgworld.cz/WSijj2K2wN.png


Edited by Matej, Thu May 25, 2017 4:57 AM.


#47 Matej OFFLINE  

Matej

    Moonsweeper

  • 323 posts

Posted Sun May 28, 2017 12:55 AM

Here is the FPE card for the Apple 2:

http://apple2.org.za...urces/FPE.CARD/

 

Here is the owner's manual:

http://www.whatisthe...gine-Manual.pdf

 

It uses the MC68881/68882...

 

Here is what the card looks like:

Floating-Point-Engine.jpg

 

Here is MC6888x on ebay:

http://www.ebay.de/s...c68881&_sacat=0

http://www.ebay.de/s...c68882&_sacat=0

 

It's Motorola... It's working on the Apple 2.

 

Datasheet:

http://www.alldatash...LA/MC68881.html



#48 Matej OFFLINE  

Matej

    Moonsweeper

  • 323 posts

Posted Thu Jun 29, 2017 3:00 AM

WOLF 3D on GAMEBOY using CPLD MMU from SNES:

http://www.happydaze.se/wolf/



#49 Stormtrooper of Death OFFLINE  

Stormtrooper of Death

    Moonsweeper

  • 398 posts
  • Location:The Netherlands

Posted Thu Jun 29, 2017 8:39 AM

Playing movies as game intro can also be done with SIDE2 and good programming skills.





