Z80 vs. 6502

276 replies to this topic

#51 TMR OFFLINE  

TMR

    River Patroller

  • 3,473 posts
  • Beeping the horn on the data bus
  • Location:Leeds, U.K.

Posted Sat Dec 1, 2012 7:35 AM

Changing attributes on the fly ? Don't know is it ever used - maybe in some demos.


The technique is called rainbow processing and as far as i'm aware there's only a couple of recent-ish homebrews using it during an actual game; the excellent Buzzsaw is the first that springs to mind but there's a decent Sokoban clone i can't remember the name of right now as well.

The Spectrum released by Timex and Sinclair in the US has a mode where the attribute cells are 8x1 pixels, which is replicated in the SAM Coupe and Pentagon. On the plus side there's no CPU hammering required like rainbow processing needs, but it does beef the screen RAM up to 12K.
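To put numbers on that 12K figure, here is a minimal sketch of the screen memory arithmetic. It assumes the usual 256x192 one-bit-per-pixel Spectrum bitmap with one attribute byte per cell; the function name is only illustrative.

```python
# Screen memory for a 256x192 Spectrum-style display (assumed layout).
WIDTH, HEIGHT = 256, 192

def screen_bytes(cell_height: int) -> int:
    """Bitmap bytes plus one attribute byte per 8 x cell_height cell."""
    bitmap = WIDTH * HEIGHT // 8                          # 1 bit per pixel
    attributes = (WIDTH // 8) * (HEIGHT // cell_height)   # 1 byte per cell
    return bitmap + attributes

standard = screen_bytes(8)    # normal 8x8 attribute cells
hi_colour = screen_bytes(1)   # 8x1 attribute cells
print(standard, hi_colour, hi_colour / 1024)
```

With 8x1 cells the attribute area grows to the same size as the bitmap, which is where the doubling comes from.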

#52 JamesD OFFLINE  

JamesD

    Quadrunner

  • 8,461 posts
  • Location:Flyover State

Posted Sat Dec 1, 2012 12:28 PM

I think the main reason for such 'attribute' based graphics systems was the cost of RAM - so they made color graphics with low RAM usage. A machine like the Spectrum 16K could not have been done with better (palette based) graphics, because the video RAM alone would be more than 16K.
Something similar holds for the C64, where sprite logic and other hardware was present - likely no less complex than some better graphics logic.
And additionally, more RAM means more CPU time to calculate and write graphics. That was the problem with the Amstrad CPC and MSX - the Z80 was just too slow for their higher graphics modes.

Actually, wait states were a problem for the MSX which slowed down the CPU.
I couldn't find a CPC page that said anything about wait states but it might have a similar issue.
Here is a page talking about added wait states with the MSX engine:
MSX Link

FWIW, wait states are what killed the NEC Trek, but on later machines they were acceptable?
I never understood that. Many of the same things that ran poorly on the Trek would run poorly on later machines like the MSX.
The programmer writing the game said the Trek was too slow to include the zoom window for Mega Bug and I don't find anything like it on the MSX. Go figure.
Even the 1MHz Apple II with its messed up screen had the same game under the name Dung Beetles.

Really, if you start trying to drive too much data with too slow of a CPU it catches up with you. If you add wait states you compound the problem.
Even though the IIgs was clocked faster than any other 6502 machine of that time, it's a bit slow for driving full screen 320x200 16 color animation.
Some of the IIgs games with large sprites are a bit sluggish unless you have an accelerator. I think even 4MHz would have made a huge improvement and is what the IIgs should have been. An 8MHz accelerator on a IIgs is pretty fast and some people have much faster.
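A rough back-of-the-envelope sketch of why that is. It assumes a 2.8MHz IIgs clock and a 60 frames per second target, neither of which is stated above, and it ignores everything except raw clock cycles available per screen byte.

```python
# 320x200 at 16 colors is 4 bits per pixel.
FRAME_BYTES = 320 * 200 * 4 // 8

def cycles_per_screen_byte(clock_hz: float, fps: float = 60.0) -> float:
    """CPU clock cycles available per screen byte if the whole frame is redrawn."""
    return clock_hz / fps / FRAME_BYTES

for label, clock in [("2.8MHz (assumed stock IIgs)", 2_800_000),
                     ("4MHz", 4_000_000),
                     ("8MHz accelerator", 8_000_000)]:
    print(label, round(cycles_per_screen_byte(clock), 2))
```

Since a single load or store already costs several cycles, a budget of well under two cycles per byte means full screen updates every frame are out of reach, so games have to limit how much of the screen changes.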

The 6502 world really needed to transition to CPUs that were at least 4MHz if they were going to upgrade the graphics and sound significantly.
They would also need 16 bit support so the 65802/65816 would be a must.
The Z80 world would have to go to 8MHz or you have to start optimizing the Z80.
The prefetch and wider ALU in the 64180/Z180 made it much faster (20%?) on the same code than the Z80 in native mode.
At that point the speed difference between the 6502 and Z80 narrows considerably. Then the Z80 might need to run at 6MHz.
After that you have to go to fully pipelined architectures, cache memory, etc... to really improve performance and things like self modifying code and timing loops start to break. You have to offer some sort of compatibility mode that adds wait states or disables features to run old software.
<edit>
I do think the 64180/Z180 enhancements would have been sufficient for the CPC and MSX, or the Trek for that matter.

Edited by JamesD, Sat Dec 1, 2012 12:33 PM.


#53 JamesD OFFLINE  

JamesD

    Quadrunner

  • 8,461 posts
  • Location:Flyover State

Posted Sat Dec 1, 2012 12:37 PM

Tumble Bugs was another name used for Mega Bug or Dung Beetles on other platforms.

#54 ParanoidLittleMan OFFLINE  

ParanoidLittleMan

    Stargunner

  • 1,632 posts

Posted Mon Dec 3, 2012 3:45 AM

I meant that the higher screen modes were slow because of their size - the CPU was just too slow to fill them and calculate everything needed fast enough. But yes, there were usually additional wait states too, because the bus was slow as well, so video generation itself loaded the RAM pretty heavily.
I remember my first true color graphics card for PC: a Trident 8800 or 8900 - something like that, with 1MB RAM. In true color mode it was terribly slow - updating the screen took a couple of seconds.

#55 youki OFFLINE  

youki

    River Patroller

  • 2,451 posts

Posted Mon Dec 3, 2012 4:17 AM

I haven't read all of the thread yet.

But the title Z80 vs 6502 is very interesting.

Personally, I have programmed a lot on 6502 machines (mainly C64 and Oric) and Z80 machines (ColecoVision, Spectrum and MSX).

I have to say that I don't like programming the Z80. Not for technical reasons, because the Z80 is more or less as good as the 6502. But the RISC of the 6502 is more comfortable and more pleasant than the CISC of the Z80. That's purely subjective.

Same thing if you compare 80x86 and 68000 programming. Having done both... I love the 68000 and hate the 80x86. (Unfortunately I did more 80x86...)

#56 Rybags OFFLINE  

Rybags

    Gridrunner

  • 16,084 posts
  • Location:Australia

Posted Mon Dec 3, 2012 7:51 AM

The definition of RISC or CISC is a grey area and the terms didn't even come into use until the 6502 had been around nearly 10 years.

In its prime it would have been considered "complex" - the contemporaries were the likes of the 8080 which wasn't much more complex.

There's also the 2 key contradictions of RISC - few registers and lots of instructions that act directly upon memory.

#57 Mr SQL OFFLINE  

Mr SQL

    River Patroller

  • 2,099 posts

Posted Mon Dec 3, 2012 10:23 AM

Great thread, lots of interesting posts! :)

Interesting that the 6502's direct page was designed to allow that memory to be used like registers; the 64 byte memory of the Fairchild VES was comprised entirely of registers.

As pointed out, CPUs like the 6809 can slide the direct page window over the entire 64k, but I don't think direct addressing and other advantages of the 6809 are what prevented Mega-Bug from being developed on the NEC TREK with its Z-80. That system has a 6847 VDG like the CoCo but there is no SAM support chip - seems to me it would leave the CPU pretty much left to race the beam, since there's no buffer on the VDG and every line of input must be synchronised and accounted for like on the 2600.

Rybags, what do you think?

#58 JamesD OFFLINE  

JamesD

    Quadrunner

  • 8,461 posts
  • Location:Flyover State

Posted Mon Dec 3, 2012 11:09 AM

I haven't read all of the thread yet.

But the title Z80 vs 6502 is very interesting.

Personally, I have programmed a lot on 6502 machines (mainly C64 and Oric) and Z80 machines (ColecoVision, Spectrum and MSX).

I have to say that I don't like programming the Z80. Not for technical reasons, because the Z80 is more or less as good as the 6502. But the RISC of the 6502 is more comfortable and more pleasant than the CISC of the Z80. That's purely subjective.

Same thing if you compare 80x86 and 68000 programming. Having done both... I love the 68000 and hate the 80x86. (Unfortunately I did more 80x86...)

I agree on 80x86 vs 68000. I loved the 68000. The Coldfire chips are a pretty nice evolution of the 68000 too.

RISC of the 6502?
<sigh>

Someone started this 6502 being RISC because:
A) The 6502 is pretty simple
B) It makes the 6502 sound advanced somehow
C) People clearly don't understand what RISC is

So what is RISC if the 6502 isn't?

Have a look at the ARM register set.
You don't have special index registers, you don't have an accumulator, instead you have 13 general purpose registers (R0-R12) plus stack pointer, etc...

These are the addressing operations.
Load, Load Multiple, Soft preload, Load Exclusive, Store, Store Multiple, Store Exclusive, Swap

You also have a series of simple indexed addressing involving:
A single register and a constant, a register and another register, a register and a scaled register. That's it.

Operator instructions like math, logic, test, compare, shifts, etc... only deal with registers; none access memory directly.

Load, do your operations, store... aka load and store architecture plus lots of general purpose registers.
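As a toy illustration of that difference (not real ARM or 6502 semantics, just a sketch with made-up helper names), here is the same "add one to a byte in memory" job done the load/store way and with a 6502-style instruction that operates on memory directly:

```python
def increment_load_store(memory: dict, address: int) -> list:
    """Load/store style: the ALU only ever touches registers."""
    regs = {"r0": 0}                                   # toy register file
    trace = []
    regs["r0"] = memory[address]
    trace.append("LDR r0, [addr]")                     # load
    regs["r0"] = (regs["r0"] + 1) & 0xFF
    trace.append("ADD r0, r0, #1")                     # operate on a register
    memory[address] = regs["r0"]
    trace.append("STR r0, [addr]")                     # store
    return trace

def increment_memory_operand(memory: dict, address: int) -> list:
    """6502 style: one read-modify-write instruction acts on memory itself."""
    memory[address] = (memory[address] + 1) & 0xFF
    return ["INC addr"]

ram_a = {0x10: 0xFF}
ram_b = {0x10: 0xFF}
print(increment_load_store(ram_a, 0x10), ram_a[0x10])
print(increment_memory_operand(ram_b, 0x10), ram_b[0x10])
```

Both end up with the same value in memory; the point is only that on a load and store design the arithmetic instruction never has a memory operand.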

#59 Thomas Jentzsch OFFLINE  

Thomas Jentzsch

    Thrust, Jammed, SWOOPS!, Boulder Dash, THREE·S, Star Castle

  • 24,031 posts
  • Always left from right here!
  • Location:Düsseldorf, Germany, Europe, Earth

Posted Mon Dec 3, 2012 12:01 PM

RISC of the 6502?
<sigh>

Someone started this 6502 being RISC because:
A) The 6502 is pretty simple
B) It makes the 6502 sound advanced somehow
C) People clearly don't understand what RISC is

I don't think statements like this help a rational discussion at all.

#60 JamesD OFFLINE  

JamesD

    Quadrunner

  • 8,461 posts
  • Location:Flyover State

Posted Mon Dec 3, 2012 1:09 PM

I don't think statements like this help a rational discussion at all.

Rational discussion? You threw rational out the window when you stated you were going to believe what you wanted to believe.
There's no point even discussing anything with you at that point.
<edit>
FWIW, I wasn't referring to Thomas in that post, I was referring to the first people that started calling the 6502 RISC.

Edited by JamesD, Mon Dec 3, 2012 1:33 PM.


#61 Thomas Jentzsch OFFLINE  

Thomas Jentzsch

    Thrust, Jammed, SWOOPS!, Boulder Dash, THREE·S, Star Castle

  • 24,031 posts
  • Always left from right here!
  • Location:Düsseldorf, Germany, Europe, Earth

Posted Mon Dec 3, 2012 1:10 PM

unsubscribed

#62 JamesD OFFLINE  

JamesD

    Quadrunner

  • 8,461 posts
  • Location:Flyover State

Posted Mon Dec 3, 2012 1:12 PM

unsubscribed

Blocked

#63 JamesD OFFLINE  

JamesD

    Quadrunner

  • 8,461 posts
  • Location:Flyover State

Posted Mon Dec 3, 2012 1:28 PM

Great thread, lots of interesting posts! :)

Interesting that the 6502's direct page was designed to allow that memory to be used like registers; the 64 byte memory of the Fairchild VES was comprised entirely of registers.

As pointed out, CPUs like the 6809 can slide the direct page window over the entire 64k, but I don't think direct addressing and other advantages of the 6809 are what prevented Mega-Bug from being developed on the NEC TREK with its Z-80. That system has a 6847 VDG like the CoCo but there is no SAM support chip - seems to me it would leave the CPU pretty much left to race the beam, since there's no buffer on the VDG and every line of input must be synchronised and accounted for like on the 2600.

Rybags, what do you think?

Steve Bjork the author of the CoCo version was writing the Trek version. He's discussed this before, probably on the coco mailing list.

The TREK actually has sort of a custom chip but I don't know that the CPU and VDG go through it to access RAM. The TREK actually has the ability to switch the palette selection on the fly with a block of parameters in memory. Graphics like Dragon Fire on the CoCo can even be done from BASIC on the TREK.
Ultimately, the 6502/680X machines interleave memory access between the CPU and graphics chips, the Z80 machines don't. If a Z80 machine doesn't isolate video memory from the CPU, the CPU will have wait states.

I suppose Steve might not have had much experience on the Z80 and his code might not have been optimal, but by the way he talked it wasn't even close.
His port of Canyon Climber is certainly as good as the CoCo version, I have both and they are pretty much identical.

#64 JamesD OFFLINE  

JamesD

    Quadrunner

  • 8,461 posts
  • Location:Flyover State

Posted Mon Dec 3, 2012 2:36 PM

I got to thinking about whether the Z80 not being interleaved was a RAM speed issue or a bus timing issue. I still have no idea, but I started thinking about bus signals and remembered one other thing the Z80 has that would be attractive to system designers that I haven't mentioned.

The Z80 has I/O ports and the 6502/680X use memory mapped hardware. Technically I/O ports are memory mapped hardware, they just respond in a 256 byte region overlapping memory. When you access memory, the hardware at the same address doesn't respond and vice versa, due to separate bus signals for accessing them.
It makes decoding the addresses for the hardware very simple since you only need to decode an 8 bit address. If you don't do more complete decoding on the 6502/680X, you end up with something responding at multiple addresses. This becomes a problem if you ever want to add something in an address range that has something mirrored.
Ultimately, I/O ports save a few parts in a design, space on a board, etc... which saves money.
Technically, the Z80 could actually place an I/O port anywhere in memory. It was an undocumented feature. But then you would need more complete address decoding to select the right hardware.
If I remember right, one of the things the 64180/Z180 added was official support of this undocumented feature.
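The mirroring problem is easy to see in a small sketch. The masks and addresses below are made up for illustration; a chip select is modelled as "the address lines I bother to look at match this pattern".

```python
def responds(address: int, select_mask: int, select_value: int) -> bool:
    """True if a device decoding only the bits in select_mask is selected."""
    return (address & select_mask) == select_value

def selected_addresses(select_mask: int, select_value: int, address_bits: int) -> list:
    """Every address in the space at which the device would answer."""
    return [a for a in range(1 << address_bits)
            if responds(a, select_mask, select_value)]

def lines_decoded(select_mask: int) -> int:
    """How many address lines the decoder has to examine."""
    return bin(select_mask).count("1")

# Hypothetical 4-register device in a 16-bit memory map, decoding only A15/A14.
cheap = selected_addresses(0xC000, 0xC000, 16)
# Same device fully decoded down to its 4 registers.
full = selected_addresses(0xFFFC, 0xC000, 16)
# Same device on an 8-bit I/O port space, fully decoded.
port = selected_addresses(0xFC, 0x10, 8)

print(len(cheap), lines_decoded(0xC000))   # answers all over the top 16K
print(len(full), lines_decoded(0xFFFC))    # 4 addresses, 14 lines to decode
print(len(port), lines_decoded(0xFC))      # 4 addresses, only 6 lines to decode
```

The cheap memory mapped decode leaves each register mirrored 4096 times across the top 16K, while the port version gets a unique location with far fewer address lines to look at.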

#65 Rybags OFFLINE  

Rybags

    Gridrunner

  • 16,084 posts
  • Location:Australia

Posted Mon Dec 3, 2012 7:04 PM

Doesn't Z80 perform refresh cycles from the CPU (off topic) ?

The interleave or not isn't really that important - the 6502 has wait equivalents anyway. On the C64 there's interleaved access where VIC always gets every second cycle whether it needs it or not, and it steals some of the remaining cycles from the CPU.
There's waste in that from a possible 126-130 cycles per scanline (model dependent), Vic will only ever use a maximum of 80+24+8+5 = 117 on a scanline and the CPU will only get to use some of the remaining ones.

As we know, Atari uses the /HALT signal to stall the 6502 for video accesses.

Acorn Electron I believe actually uses 4-bit Ram in a singular configuration and uses 2 memory cycles and latching to transfer data to/from CPU/memory.

There's compromises and limitations in all our old computers - the CPU in use and the advertised clockspeed aren't necessarily indicators of what sort of performance or types of limitations to expect.

A good example there is the BBC Micro - the CPU runs at the full 2 MHz, video accesses also at 2 MHz effective, due to the Ram being able to access at 4 MHz.

#66 Mr SQL OFFLINE  

Mr SQL

    River Patroller

  • 2,099 posts

Posted Tue Dec 4, 2012 8:02 AM

Steve Bjork the author of the CoCo version was writing the Trek version. He's discussed this before, probably on the coco mailing list.

The TREK actually has sort of a custom chip but I don't know that the CPU and VDG go through it to access RAM. The TREK actually has the ability to switch the palette selection on the fly with a block of parameters in memory. Graphics like Dragon Fire on the CoCo can even be done from BASIC on the TREK.
Ultimately, the 6502/680X machines interleave memory access between the CPU and graphics chips, the Z80 machines don't. If a Z80 machine doesn't isolate video memory from the CPU, the CPU will have wait states.

I suppose Steve might not have had much experience on the Z80 and his code might not have been optimal, but by the way he talked it wasn't even close.
His port of Canyon Climber is certainly as good as the CoCo version, I have both and they are pretty much identical.

James,
Canyon Climber is another great game! Bjork is a fantastic programmer, I didn't know he was working on the Mega-Bug port for the Z-80 too but IMO whatever memory management scheme the TREK used for the VDG is suspect here, not the CPU:

The Synchronous Address Multiplexer (SAM) chip feeds the VDG instead of allowing the VDG to directly access memory; this is crucial because although the VDG has this functionality it forces a wait state on the CPU - with SAM interleaving the access this never happens.

Without SAM, I suspect the NEC Trek like the Imagination Machine incurred performance limitations from wait states unless their custom solutions were commensurate.

#67 ClausB OFFLINE  

ClausB

    Stargunner

  • 1,596 posts
  • Location:Michigan

Posted Tue Dec 4, 2012 10:44 AM

I programmed both in the day and I can't pick a favorite. Each felt different when coding. 6502 is more RISCy.

BTW did you know the Z80 only has a 4-bit ALU inside? There's no performance penalty for 8-bit operations because all instructions have enough clock cycles anyway, but 16-bit operations do suffer. See p. 10:
http://archive.compu...58073.05.01.pdf
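For anyone wondering how an 8-bit add falls out of a 4-bit ALU, here is a behavioural sketch: two nibble-wide passes with the carry from the low nibble fed into the high one. This only models the arithmetic, not the Z80's actual internal sequencing, and the function names are mine.

```python
def add4(a: int, b: int, carry_in: int) -> tuple:
    """One pass through a 4-bit adder: returns (4-bit sum, carry out)."""
    total = (a & 0xF) + (b & 0xF) + carry_in
    return total & 0xF, total >> 4

def add8_via_nibbles(a: int, b: int) -> tuple:
    """8-bit add as two 4-bit passes: returns (result, half carry, carry)."""
    low, half_carry = add4(a, b, 0)                 # low nibbles first
    high, carry = add4(a >> 4, b >> 4, half_carry)  # then high nibbles
    return (high << 4) | low, half_carry, carry

print(add8_via_nibbles(0x0F, 0x01))
print(add8_via_nibbles(0xF0, 0x20))
```

The carry between the two passes is the same thing the Z80 exposes as its half-carry (H) flag. A 16-bit add needs four such passes, which fits with the point above about 16-bit operations being the ones that suffer.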

<verbosity>


Wow. I really thought that the 4-bit ALU part of my post was the interesting bit for discussion.

I do know what RISC means. 6502 does feel more RISCy to me than Z80, subjectively speaking.

Osborne's 1979 "Introduction to Microcomputers" objectively summarizes many CPUs of the day. Their instruction summary tables for 6502 and Z80 are 5 and 11 pages long, respectively.

Zero page does feel like a large register set. Indirect zero page modes are similar to Z80's (HL) instructions.

Mr. D seems to think that the more you write, the more you're right.

He did manage to alienate Herr Jentzsch, one of the real stars of this forum.

#68 nanochess OFFLINE  

nanochess

    Processorus Polyglotus

  • 5,869 posts
  • Coding something good
  • Location:Mexico City

Posted Tue Dec 4, 2012 12:24 PM

It is the problem of any thread using the "vs" (versus) word in title (check C64 versus Atari)

People start thinking that one of the two should win. But the reality is that though I understand both Z80 and 6502, I prefer the Z80. And everyone here has their own preference.

That's all folks! :)

Now go in peace and write some good code.

#69 potatohead OFFLINE  

potatohead

    River Patroller

  • 4,404 posts
  • Location:Portland, Oregon

Posted Tue Dec 4, 2012 1:07 PM

Yeah too bad. I really like these discussions. Too bad it breaks down so often.

#70 JamesD OFFLINE  

JamesD

    Quadrunner

  • 8,461 posts
  • Location:Flyover State

Posted Tue Dec 4, 2012 2:55 PM

Ok, fine... I'll start quoting experts.

You can read all about this in "Computer Architecture: A Quantitative Approach" by John L. Hennessy & David A. Patterson.
It's the book from my computer architecture class in college.
This comes from the section titled "Historical Perspective and References"

RISC (Reduced Instruction Set Computers) is actually first discussed in a paper by Patterson and Ditzel in 1980. There were papers published back and forth over closer coupling between compilers and computer architecture or the simplifying computer architecture which Patterson and Ditzel advocated.

Research was also going on with compilers. Now, it's important to understand as this goes hand in hand with RISC.
The entire RISC concept arose out of an argument over whether to have the architecture better support compilers, or to go with a simple architecture that was easy to support from a compiler and with efficient pipelining.

Patterson (co-author of the book) and his colleagues built two projects, RISC-I and RISC-II, which is where the technology got its name.
MIPS was a project out of Stanford. But here is the tie in to compilers. "Efficient pipelining and compiler-assisted scheduling of the pipeline were both key aspects of the original MIPS design."

"These three early machines had much in common. ... All three machines... used a simple load/store architecture, fixed-format 32 bit instructions, and emphasized efficient pipelining."

"In 1985, Hennessy published an explanation of the RISC performance advantage and traced its roots to a substantially lower CPI" (CPI = Cycles Per Instruction) "under two for a RISC machine and over ten for a VAX..."
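To see why CPI matters so much, here is a minimal sketch of the usual cycles = instruction count x CPI relation. The CPI values of 2 and 10 are the bounds quoted above; the instruction counts are purely hypothetical, chosen so the simpler machine needs twice as many instructions for the same job.

```python
def execution_cycles(instruction_count: int, cpi: float) -> float:
    """Total clock cycles = instructions executed x average cycles per instruction."""
    return instruction_count * cpi

# Hypothetical workload: the RISC machine executes 2x the instructions.
risc_cycles = execution_cycles(2_000_000, 2.0)
vax_cycles = execution_cycles(1_000_000, 10.0)
speedup = vax_cycles / risc_cycles
print(risc_cycles, vax_cycles, speedup)
```

Even after paying for twice the instruction count, the low-CPI machine comes out ahead at the same clock rate under these assumed numbers.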

"Since the university projects finished up, in the 1983-84 timeframe, the technology has been widely embraced by industry. Many of the early computers (before 1986) laid claim to being RISC machines. However, these claims were often born more of marketing ambition than of engineering reality."

The book goes on and on about the history, different chips, etc...


From Chapter 6, Pipelining:
"Pipelining is an implementation technology whereby multiple instructions are overlapped in execution."
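A quick sketch of what that overlap buys in the ideal case. It assumes every instruction passes through the same number of one-cycle stages and that there are no stalls or branches, which real pipelines never quite manage.

```python
def unpipelined_cycles(instructions: int, stages: int) -> int:
    """Each instruction runs start to finish before the next one begins."""
    return instructions * stages

def pipelined_cycles(instructions: int, stages: int) -> int:
    """Ideal pipeline: fill it once, then one instruction completes per cycle."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1)

n, k = 100, 5   # hypothetical: 100 instructions, 5 stages
print(unpipelined_cycles(n, k), pipelined_cycles(n, k))
```

Each individual instruction still takes all five cycles; the throughput gain comes purely from having several of them in flight at once.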

So, no the 6502 isn't pipelined because it only executes one instruction at a time and I was correct in the first place.
While the wiki doesn't recognize pipelining as a key element of RISC, the designers clearly did.

How does the 6502 measure up to those standards?
The 6502 isn't pipelined, it's not designed to support a compiler in any way shape or form, it doesn't use a load and store architecture, it doesn't support a fixed format 32 bit instruction set, it has instructions taking more than 2 clock cycles... face it, the 6502 fails on every count of what RISC was to the people who invented it.

If you want to argue that the 6502 is RISC, I suggest you talk to Hennessy and Patterson and let us know how that goes.

#71 JamesD OFFLINE  

JamesD

    Quadrunner

  • 8,461 posts
  • Location:Flyover State

Posted Tue Dec 4, 2012 3:06 PM

ClausB,
The 4 bit ALU of the Z80 is interesting.
It is largely responsible for the long cycle counts of Z80 instructions.

I think I already mentioned the 64180 and Z180 having an 8 bit ALU and a prefetch which makes them about 20% faster than a Z80 when they are in native mode. For exact Z80 timing they had added wait states.
FWIW, the Z80 was NMOS while the 64180/Z180 were CMOS technology. And the Z80 was dynamic while the 64180/Z180 were static. If you disable the clock to the Z80 for very long, the registers will lose their contents. This is true of many CPUs from that time, supposedly including the 6809, though I've never tried it. The 64180/Z180 even has a sleep mode where the clock is halted to save power and its register contents stay intact. Other Hitachi chips like the 6309 also seem to be CMOS and static in design. I would guess the 6303 is as well but I'd have to look it up in my docs.

If you follow the MSX link I posted, it has a comparison of cycle times between the Z80 and R800 from the MSX Turbo R machines.
The R800 had a 16 bit ALU and it runs all Z80 code with the exception of a few undocumented features.
Many of the instructions dropped from 15 or more clock cycles to 2.
Instructions that took 19 clocks take 5 on the R800.
The slowest Z80 instructions take up to 23 clocks... on the R800 they take 4 or 7.
A few instructions even dropped from 11 to 1 clock cycle.
That page also discusses an improved memory interface saving clock cycles.
The ROM in the Turbo R has to be copied to high speed RAM to be fast enough to keep up with the R800.

So ALU size has a huge impact on performance.
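Putting the figures quoted above side by side, as clock counts only (this says nothing about the clock rates the two chips actually ran at, and the group labels are mine):

```python
def speedup(z80_clocks: int, r800_clocks: int) -> float:
    """How many times fewer clocks the R800 needs for the same instruction."""
    return z80_clocks / r800_clocks

# (description, Z80 clocks, R800 clocks) taken from the figures in this post.
cases = [
    ("15-clock instructions", 15, 2),
    ("19-clock instructions", 19, 5),
    ("slowest, best case", 23, 4),
    ("slowest, worst case", 23, 7),
    ("11-clock instructions", 11, 1),
]

for name, z80, r800 in cases:
    print(f"{name}: {z80} -> {r800} clocks, {speedup(z80, r800):.2f}x")
```

How much of each gain is down to the wider ALU versus the improved memory interface isn't something these numbers can separate.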

#72 JamesD OFFLINE  

JamesD

    Quadrunner

  • 8,461 posts
  • Location:Flyover State

Posted Tue Dec 4, 2012 3:17 PM

Doesn't Z80 perform refresh cycles from the CPU (off topic) ?

Yes. Already mentioned as one of the reasons companies chose the Z80 to build a machine around.
That may actually be part of the reason Z80 machines don't interleave memory access for video.

The interleave or not isn't really that important - the 6502 has wait equivalents anyway. On the C64 there's interleaved access where VIC always gets every second cycle whether it needs it or not, and it steals some of the remaining cycles from the CPU.
There's waste in that from a possible 126-130 cycles per scanline (model dependent), Vic will only ever use a maximum of 80+24+8+5 = 117 on a scanline and the CPU will only get to use some of the remaining ones.

As we know, Atari uses the /HALT signal to stall the 6502 for video accesses.

Acorn Electron I believe actually uses 4-bit Ram in a singular configuration and uses 2 memory cycles and latching to transfer data to/from CPU/memory.

There's compromises and limitations in all our old computers - the CPU in use and the advertised clockspeed aren't necessarily indicators of what sort of performance or types of limitations to expect.

A good example there is the BBC Micro - the CPU runs at the full 2 MHz, video accesses also at 2 MHz effective, due to the Ram being able to access at 4 MHz.

I understand what you are saying. It's certainly not important if you write games within the limitations of a machine. Sprite hardware definitely helps many machines. But the wait states also mean you just don't see certain CPU intensive games on those machines. The wait states eat what appears to be 8% of the cycle time on the C64, on the MSX it's about 1 wait state for every 4 clock cycles depending on the instruction. It ends up being 20% - 25%. On the TREK it could be worse. Really, it depends on how many wait states. I do think MSX could have been designed with fewer wait states and it would have opened it up to more ports from machines like the Spectrum. A lot of games on the Spectrum depended on the 1 bit per pixel display and minimal wait states.
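The 20% - 25% range is the same overhead measured two ways, as this small sketch shows. The 3.58MHz MSX clock used for the effective speed is my assumption, not something stated above.

```python
def wait_state_overhead(useful_cycles: int, wait_cycles: int) -> tuple:
    """Returns (share of total time spent waiting, how much longer the code runs)."""
    total = useful_cycles + wait_cycles
    return wait_cycles / total, wait_cycles / useful_cycles

def effective_clock_mhz(clock_mhz: float, useful_cycles: int, wait_cycles: int) -> float:
    """Clock rate the CPU behaves as if it had once waits are counted."""
    return clock_mhz * useful_cycles / (useful_cycles + wait_cycles)

share, slowdown = wait_state_overhead(4, 1)     # 1 wait state per 4 clocks
print(share, slowdown)                          # 20% of the time, 25% longer
print(round(effective_clock_mhz(3.58, 4, 1), 2))
```

So one wait per four clocks costs a fifth of the machine's time, or equivalently makes the code take a quarter longer to run.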

The slower CoCo and Apple II didn't have any wait states. That allowed the slower machines to run their own version of Mega Bug.
Any wait states are compensated for by the higher clock speed on the Atari and it was able to run the game.
The game didn't make it to any more platforms in spite of the fact Datasoft supported other machines.
Maybe the TREK experience just discouraged Datasoft from trying to port games to multiple platforms, but you have to think Datasoft would have at least tried to port the game to the C64 if they thought it was capable. I would think the zoom window could be overlaid on the screen with sprites and that would cut some of the work the CPU had to do. Maybe they tried and failed but I've never heard a story about it.

In spite of the slowness of the TREK, it has a version of XEVIOUS that is pretty impressive and I've seen a lot of demos that duplicate cut scenes from later machines.

#73 JamesD OFFLINE  

JamesD

    Quadrunner

  • 8,461 posts
  • Location:Flyover State

Posted Tue Dec 4, 2012 3:23 PM

Does the C64 have the option to double size the sprites like some other machines? I'm wondering if that would be a cheap way to create a Mega Bug clone without a lot of processing time.

#74 Rybags OFFLINE  

Rybags

    Gridrunner

  • 16,084 posts
  • Location:Australia

Posted Tue Dec 4, 2012 11:41 PM

C64 sprites can be doubled in h/v direction. With hardware exploits they can be expanded further vertically but it's more of a CPU expensive trick than anything useful.

#75 Thomas Jentzsch OFFLINE  

Thomas Jentzsch

    Thrust, Jammed, SWOOPS!, Boulder Dash, THREE·S, Star Castle

  • 24,031 posts
  • Always left from right here!
  • Location:Düsseldorf, Germany, Europe, Earth

Posted Wed Dec 5, 2012 6:25 AM

With hardware exploits they can be expanded further vertically but it's more of a CPU expensive trick than anything useful.

IIRC you can use interrupts here (every 21 lines), so it is not THAT CPU expensive.




