popmilo Posted September 28, 2017 Share Posted September 28, 2017 I tried to make some kind of table to avoid the classical division .. and now I've been finally successful ! Let's say we divide A/B. As I said, result is always <1 .. so A<B. So when I'm scratching the leading zeros, B always ends up with 1 in the top bit. Before last 8 bit division, B is always in range 128-256. That's significant simplification, as it's inversion (in this case 256/x) can go from only from 2 to 1. Still not very good for table. But if I subtract 1, I get range 1..0 .. or 255 to 0. Let's call this number Q. Now to get the division, I have to do A*(1+Q) .. which is same as A+A*Q .. which is simple. It gets my division from average of 251 cycles to 130 cycles, and I think there is still lot of room of improvement by low level optimization (for example at the moment I call the mul simply by jsr). On the global it lowers time consumption of division from 5% to under 3%, at the price of 128 bytes table (well that and huge scroll tables, but I have them anyway). Could you share code for this division ? Your explanation sounds logical but I can't wrap my head around it 1 Quote Link to comment Share on other sites More sharing options...
Irgendwer Posted September 28, 2017 Share Posted September 28, 2017 Was there any specific reason why you used 16-bit? I understand your use case is quite different from mine, as I don't care for code size currently. I have coordinates up to 256 (this is not Antic 13), so I need at least 9 bits to cover the +- range: Draw_SetMode(Draw_Set); line.startPoint.nX = 0; line.startPoint.nY = 0; line.endPoint.nX = 255; line.endPoint.nY = 1; Draw_Line(&line); line.startPoint.nX = 255; line.startPoint.nY = 0; line.endPoint.nX = 254; line.endPoint.nY = 231; Draw_Line(&line); line.startPoint.nX = 255; line.startPoint.nY = 231; line.endPoint.nX = 0; line.endPoint.nY = 230; Draw_Line(&line); line.startPoint.nX = 0; line.startPoint.nY = 231; line.endPoint.nX = 1; line.endPoint.nY = 0; Draw_Line(&line); 16 bit error (correct): 8 bit error (no comment): Ideas for improvements welcome! 1 Quote Link to comment Share on other sites More sharing options...
R0ger Posted September 28, 2017 Share Posted September 28, 2017 Well typically you always draw from left to right, you just swap the coordinates when they come in the wrong order. And then you have 2 variants of the life for up and down .. so you never have one code handling 2 directions. Quote Link to comment Share on other sites More sharing options...
Irgendwer Posted September 28, 2017 Share Posted September 28, 2017 (edited) Well typically you always draw from left to right, you just swap the coordinates when they come in the wrong order. And then you have 2 variants of the life for up and down .. so you never have one code handling 2 directions. Yes, but I have only one variant (the central patched routine) and so no swapping. Maybe I should give the fixed point variant a try... On the other hand the code belongs to a frozen project and would be fast enough for the use case... Edit: I'm unsure how swapping the coordinates would help to reduce the needed bit-depth for the error estimation?!? Edited September 28, 2017 by Irgendwer Quote Link to comment Share on other sites More sharing options...
R0ger Posted September 28, 2017 Share Posted September 28, 2017 Any line can be drawn left to right .. so you will never have negative delta X. That's how you can do it with just 8 bits. And for up and down, you have 2 variants, and again, each variant only works with positive delta Y. Quote Link to comment Share on other sites More sharing options...
VladR Posted September 29, 2017 Author Share Posted September 29, 2017 I believe he means that the Bresenham error value needs more than 8 bits once the X coordinates approach value of 256. At some older project, I was experimenting with different error formulas, which gives you option of keeping just 8 bits for the error value. Also, he mentioned he needs to have just one central code path, so the optimizations we use are unavailable to him. On the other hand, he doesn't have to tweak and test 8+ code paths , which is something I would appreciate 2 Quote Link to comment Share on other sites More sharing options...
Irgendwer Posted September 29, 2017 Share Posted September 29, 2017 I believe he means that the Bresenham error value needs more than 8 bits once the X coordinates approach value of 256. Since the error value is signed, it's enough when 127 is crossed and that is the case in the sample. It's just this border case - in all others (shorter or more steep lines) a signed 7 bit error would be sufficient. Quote Link to comment Share on other sites More sharing options...
VladR Posted September 29, 2017 Author Share Posted September 29, 2017 Btw. kinda relevant and very good video on Elite, by big B. himself .. http://www.gdcvault.com/play/1014628/Classic-Game-Postmortem Damn, how did I miss that! But, I don't think I would have appreciated this talk the same before I embarked on the 8-bit flatshader journey Most excellent timing of you posting this link Together with this thread I again can't sleep and I transform and draw lines in my head all the time. Yeah, I keep thinking on those algorithms all the time myself too It's funny that all those fastest graphics algorithms have converged in the PC/console arena long time ago, but Atari doesn't get that research love. While A800 sure gets much more dev exposure than, say, Atari Jaguar, it's still not fully explored! I just returned from the ~5,000 km trip driving through pathetically slow U.S. interstate highways (daily avg speed across 12+hrs is unbelievably [when compared to driving in Europe] low 64 km/h), and thus had shitload of time when my brain had to do something (hell, anything, really) , as the brutally slow driving doesn't really take more than few % of your capacity anyway. I did, thus, come up with few ideas, that I'll be experimenting with during weekend: 1. Scanline traversal is NOT line computation - this is a no brainer, really - why bother computing pixels, intermediate error values, and other bullshit, when we really only need 1 pixel per each scanline ? - this gets us to idea 2 -> 2. Use the fact that Steep line computation matches scanline traversal (e.g. each new point computed is guaranteed to be on a new scanline) - this is a prime example for trying out fixed-point calculations on 6502 (I'll go try 8. - I did a lot of them on jaguar (where they were almost free & very simple & very precise (32-bit registers)), so I'm curious to see how they'll work out on 6502 - even if they don't turn out to be fast, the implementation still might spark some other ideas (and exchange on this forum), so I think it's worth it - this gets us to idea 3 -> 3. Use a separate algorithm for Non-Steep lines - why ? Because we need only 1 point of the edge per scanline - why is is a problem ? Because a non-steep line that is 41 pixels long, across 4 scanlines, only needs 4 pixels computed, yet we compute remaining 37 that we don't need ! - 35/40 = 0.902 -> 90.2% of computations are thrown away - 90% ? Hmm, if *that* does not qualify for triple-facepalm.gif, I don't know what does... 4. I swear I saw a short glimpse of the look-up-table that just might completely bypass line computation - this is a variation of what I came up with for Klax demo on jaguar (which was in C) - I'll play this one out in excel first, then C, then ASM 2 Quote Link to comment Share on other sites More sharing options...
Heaven/TQA Posted September 29, 2017 Share Posted September 29, 2017 did I not mention scanline traversal (dda)? my first poly fillers worked in 8bit integer so no 8.8 etc... I only went to more precise approach when dealing with gouraud and subpix rendering. Quote Link to comment Share on other sites More sharing options...
Irgendwer Posted September 29, 2017 Share Posted September 29, 2017 so I need at least 9 bits to cover the +- range Actually I'm busy with other stuff, but this kept bothering me. The "solution" is so simple: The carry flag is already the ninth bit. Lots of lines can be removed now: "el" is max(dx,dy) nextPoint: lda err sec SMC SubEs, { sbc #SMC_Value } sta err bcs quickStep ; was: bcs testErr ; not needed: dec err+1 ; not needed: testErr: ; if (err<0) ; not needed: lda err+1 ; not needed: bpl quickStep SMC AddEl, { lda #SMC_Value } ; not needed: clc adc err sta err ; not needed: bcc noHighInc ; not needed: inc err+1 noHighInc: SMC ChangeSlowPosition, { inc SMC_ZpAdr } ; All (inc/dec and x0/y0) is subject of change! quickStep: SMC ChangeQuickPosition, { inc SMC_ZpAdr } ; All (inc/dec and x0/y0) is subject of change! PlotPixel: ## PIXEL IS PLOTTED HERE ## dec el bne nextPoint Thank you all for let this rolling again in my "think-bowl"! 2 Quote Link to comment Share on other sites More sharing options...
VladR Posted September 29, 2017 Author Share Posted September 29, 2017 I did a quick proof-of-concept with 8.8 fixed-point. The inner loop is generic, I'm currently only using manually-computed fractional step value (to be implemented later, probably via some 256 Byte table). Example line: dx = 8 dy = 5 step = 8/5 = 1.6 -> Step.Hi = INT (1.6) = 1; Step.Lo = 153 (0.6*256) Point [0] = 1:153 (no need to compute, as it is equal to Step) Point [1] = 3:50 Point [2] = 4:203 Point [3] = 6:100 Point [4] = EndPoint (no need to compute,just copy dx2) LDA ( #1 ) STA ( _StepHi ) CLC ADC ( _xp ) STA ( _ValueHi ) LDA ( #153 ) STA ( _StepLo ) STA ( _ValueLo ) LDY ( _dy ) // Y = dy (Loop Counter) DEY // We don't compute last point DEY // We only do inner loop (dy-2) times LDX ( #0 ) // X = scanline (0-based) // 42c per point R_LE_NonSteepScanlineLoop03: LDA ( _ValueHi ) STA ( _edgeL,X ) // edgeL [scanline] = ValueHi // ValueHi += StepHi CLC ADC ( _StepHi ) STA ( _ValueHi ) // ValueLo += StepLo CLC LDA ( _ValueLo ) ADC ( _StepLo ) BCC ( R_LE_NSSLBypass03 ) INC ( _ValueHi ) R_LE_NSSLBypass03: STA ( _ValueLo ) // Loop Next INX // ToDo: Merge both indices(and reverse the order of writing) DEY BNE ( R_LE_NonSteepScanlineLoop03 ) // Store the last value that only got computed LDA ( _ValueHi ) STA ( _edgeL,X ) // edgeL [scanline] = ValueHi INX // Store Endpoint to avoid holes LDA ( _xp2 ) STA ( _edgeL,X ) // edgeL [scanline] = xp2 - So, the inner loop gets executed only dy-2 times (as we don't have to compute first and last point). - This is in huge contrast with Bresenham, where if the line was 256 pixels long (yet, just across 5 scanlines), its inner loop would get executed way too many useless times (despite us only needing 5 xpos values for each scanline's xpos). - Yet, I retain the full precision,as 8 bits give me precision of 256, which is totally enough for my resolution. - To my surprise, I didn't need to do any bitshifting (as I was initially expecting), it's really just a straightforward adding. - the line is really nice - not exactly 100% identical to Bresenham, but way nicer than my older double-step approach and I actually consider this full quality and don't believe one could spot the visual difference in motion at all - this also answers the question for sub-pix precision implementation costs I had earlier I should be able to slightly optimize it further by: - using just 1 index register for the loop, though this needs reversed order of writing and computation - reusing the other register for hi or lo value, thus saving couple cycles compared to reading from page-0 3 Quote Link to comment Share on other sites More sharing options...
VladR Posted September 29, 2017 Author Share Posted September 29, 2017 did I not mention scanline traversal (dda)? my first poly fillers worked in 8bit integer so no 8.8 etc... I only went to more precise approach when dealing with gouraud and subpix rendering. I know, but in all honesty, I don't really know the DDA algorithm from the top of my head, so it's entirely possible I just reinvented the wheel here I'm just fumbling my way through this, but discovering this stuff is 90% of fun! Actually I'm busy with other stuff, but this kept bothering me. The "solution" is so simple: The carry flag is already the ninth bit. Lots of lines can be removed now: Thank you all for let this rolling again in my "think-bowl"! Yes, using the Carry flag as a 9-th bit doubled a precision for me, when I was doing the 8-bit world-space transforms, so it was way more than worth doubling the codepath size. I think I should start counting the number of bytes too (not just the number of cycles), as I just keep doubling each codepath left and right, like there's no tomorrow (but, it still should all fit within 128-512 KB) 3 Quote Link to comment Share on other sites More sharing options...
Irgendwer Posted September 29, 2017 Share Posted September 29, 2017 I think I should start counting the number of bytes too (not just the number of cycles), as I just keep doubling each codepath left and right, like there's no tomorrow (but, it still should all fit within 128-512 KB) Just reduced some more: "el" is max(dx,dy) nextPoint: lda err sec SMC SubEs, { sbc #SMC_Value } bcs quickStep SMC AddEl, { adc #SMC_Value } noHighInc: SMC ChangeSlowPosition, { inc SMC_ZpAdr } ; All (inc/dec and x0/y0) is subject of change! quickStep: sta err SMC ChangeQuickPosition, { inc SMC_ZpAdr } ; All (inc/dec and x0/y0) is subject of change! PlotPixel: ## PIXEL IS PLOTTED HERE ## dec el bne nextPoint Line routine is now ~25% faster. (Lots of cycles wasted due to size restriction by JSRing to the central, somewhat slow Plot routine) 1 Quote Link to comment Share on other sites More sharing options...
Heaven/TQA Posted September 30, 2017 Share Posted September 30, 2017 (edited) just to give you an idea here is the lua/pico-8 of my polydrawer: function dopoly(facecol) maxy=0 miny=128 if tflag==0 then -- scanedge xx1=x1 yy1=y1 xx2=x2 yy2=y2 doline() xx1=x2 yy1=y2 xx2=x3 yy2=y3 doline() xx1=x3 yy1=y3 xx2=x1 yy2=y1 doline() for y=miny,maxy-1 do line(ledge[y],y,redge[y],y,facecol) end end function doline() local buffid=0 -- right buffer if yy2<yy1 then -- swap points temp=xx2 xx2=xx1 xx1=temp temp=yy2 yy2=yy1 yy1=temp buffid=1 --left buffer end local deltax=(xx2-xx1) local deltay=(yy2-yy1) if deltay<=0 then return end local prestep=flr(yy1+0x0.ffff)-yy1 local xstep=deltax/deltay xx1=xx1+prestep*xstep local sy1=flr(yy1) local sy2=flr(yy2) if sy1<miny then miny=sy1 end if sy2>maxy then maxy=sy2 end if buffid==0 then for y=sy1,sy2-1 do local sx=xx1 redge[y]=sx xx1=xx1+xstep end else for y=sy1,sy2-1 do local sx=xx1 ledge[y]=sx xx1=xx1+xstep end end end function dopyramid() zoomstep=(zoomstep+1)%180 if tflag==0 then zoom=sin((zoomstep+180)/360)/3.0 --1.7 --1.85 end if tflag==1 then zoom=sin((zoomstep+180)/360)/4.0 --1.7 --1.85 end -- local zoom=0.50 local ssinx=sin(rotanglex) local ccosx=cos(rotanglex) local ssiny=sin(rotangley) local ccosy=cos(rotangley) local ssinz=sin(rotanglez) local ccosz=cos(rotanglez) for v=1,max_vertices3 do local px=pxtab3[v]*zoom local py=pytab3[v]*zoom local pz=pztab3[v]*zoom --z axis local stxx=ccosz*px-ssinz*py local styy=ssinz*px+ccosz*py --y axis local stxxx=ccosy*stxx+ssiny*pz local stzz=-ssiny*stxx+ccosy*pz --x axis local styyy=ccosx*styy-ssinx*stzz local stzzz=ssinx*styy+ccosx*stzz xxtab[v]=256*stxxx/(stzzz+5)+mx yytab[v]=256*styyy/(stzzz+5)+my zztab[v]=stzzz end for f=1,max_faces3 do x1=xxtab[facetab3[f]] y1=yytab[facetab3[f]] x2=xxtab[facetab31[f]] y2=yytab[facetab31[f]] x3=xxtab[facetab32[f]] y3=yytab[facetab32[f]] -- 2d cull test doculltest() if culltest>0 then c1=zztab[facetab3[f]]*15+16 c2=zztab[facetab31[f]]*15+16 c3=zztab[facetab32[f]]*15+16 dopoly(sget(f,7)) end end end so should be clear. it works in subpix. and this was backported to pico-8 from my 6502 engine . Edited September 30, 2017 by Heaven/TQA Quote Link to comment Share on other sites More sharing options...
Heaven/TQA Posted September 30, 2017 Share Posted September 30, 2017 here is just a part from Lynx "Elements" demo: CalcEdge_new ; linebufid.a=0 ;left buffer ; If y2<y1 ; Swap x1,x2 ; Swap y1,y2 ; linebufid.a=1 ;right buffer ; EndIf ; deltax=(x2-x1) ; deltay=(y2-y1) ;>>5 ;subpix fraction ;sort already done so y2>y1 lda x2Edge sec sbc x1Edge sta dx lda x2Edge+1 sbc x1Edge+1 sta dx+1 bpl signcheck2 jmp CalcEdge_new_signed signcheck2 lda y2Edge sec sbc y1Edge sta dy lda y2Edge+1 sbc y1Edge+1 sta dy+1 ; If deltay<1 ; ProcedureReturn ; EndIf cmp #1 bcc calcexit ;dynegative ; xstep.w=deltax<<8/deltay ;32bit div 16 bit = 16bit lda dx ;dx/dy sta T1 lda dx+1 sta T1+1 lda dy sta T2 lda dy+1 sta T2+1 jsr do_math_copo_div lda product sta xstep lda product+1 sta xstep+1 ; subpix.a=((y1!$ff)&$ff) lda y1Edge eor #$ff sta subpix ; x1=x1+((subpix*xstep)>> ;prestep; lda subpix ;subpix*xstep sta T1 stz T1+1 lda xstep sta T2 lda xstep+1 sta T2+1 jsr do_math_copo_mul lda x1Edge clc adc product+1 sta x1Edge lda x1Edge+1 adc product+2 sta x1Edge+1 ; screeny1.a=y1>>8 ;not necessary hi byte oly ; screeny2.a=y2>>8 ;not necesshi byte only ldy y1Edge+1 ldx x1Edge nnewedgeloop lda x1Edge+1 nbp sta ledge,y txa nbplo sta ledge_lo,y ;sta $fda0 clc adc xstep tax lda x1Edge+1 adc xstep+1 sta x1Edge+1 iny cpy y2Edge+1 bne nnewedgeloop calcexit rts Quote Link to comment Share on other sites More sharing options...
VladR Posted September 30, 2017 Author Share Posted September 30, 2017 Generally, I intend to have 2 versions of scanline traversal: 1. Nice (currently Bresenham, but would like to replace it with fixed point) - for cutscenes 2. Fast (currently, unrolled Step algo) - for combat I am absolutely hooked on the simplicity, elegance, speed and quality of the 8.8 fixed point line calculation in the inner loop. This does seem like a bull's eye hit, indeed, as it just checks all the boxes. There's just one remaining problem - the initialization - obtaining the fractional part (e.g. Step.Lo) of the 8.8 in a fast and memory-reasonable way. I'll use same example from before: Example line: dx = 8 dy = 5 step = 8/5 = 1.6 -> Step.Hi = INT (1.6) = 1; Step.Lo = 153 (0.6*256) Point [0] = 1:153 (no need to compute, as it is equal to Step) Point [1] = 3:50 Point [2] = 4:203 Point [3] = 6:100 Point [4] = EndPoint (no need to compute,just copy xp2) So, how do we get to the Step.Lo = 153 ? dy = 5 1 / dy = 0.20 (floating point) 0.20 * 256 = 51 (8-bit fixed point) dy5LUT [] = {51, 102, 153, 204} The index to to dy5LUT is the remainder from the division of 8/5 minus 1 -> e.g. (3-1) = 2 dy5LUT [2] = 153 These are the options that I can see at this point: Option 1: Use huge table, where for each X (by X, I mean deltaY of the line), we store X-1 values (the fixed point step value) - This however takes up 12.6 KB (range <0, 159>), which is just too much for something like this. Option 2: Use full-precision table for smaller lines, and progressively lower precision, the longer the line is - 100% coverage: the range <0,47> takes up 1,128 Bytes, and that's a line that's half the screen height - 50% coverage: the range <48,95> takes up 1,740 Bytes (instead of 3,480), at the cost of 1 bitshift, and very minor error (around 0.7%) - 25% coverage: the range <96,159> takes up 2,056 Bytes (instead of 8,224), at the cost of 2 bitshifts, and still small error (around 2.3%) Option 3: Store just 1 fixed point value, and multiply it by the remainder - This however takes only 160 Bytes, as we store just the 1/X value per each X - I played it out in excel and the error begins very small, but gets to 9% around line length 32, so I didn't really continue with the table, as ~10% feels like too much too soon - this also means doing multiplication via adding, which is a problem for lines that have big discrepancy between dx and dy - thus, the greatest downside of this is unpredictability, where it suddenly might take up way more cycles than regular Bresenham - I don't know exactly where that performance threshold is, but it's clear you don't need very long lines for this to happen and I don't like unpredictable algorithms Option 4: Split the line into 2 halves - this halves the memory cost, but adds code complexity and cycles to handle the transition - While I haven't implemented that yet, I've seen it somewhere,and it just looked too ugly (around the midpoint) - I think I'd rather use my current step codepath, than do this - this seems like too much work for too little visual benefit I'm currently heavily leaning towards Option 2, where I'd have full precision for range <0,47> and double-bitshift precision for range <48,159> for less than 4 KB of RAM (which is really about max I'm willing to devote for this - but I get a feeling that 4 KB of LUT of the 4th option I mentioned few posts above will give me better results - but won't really know till I work it out in greater detail). I think I should go and examine the decimal mode of 6502 - it just might provide some additional features that might render the fixed-point useless (but that's just a guess at this point). I welcome any ideas on the matter above. 1 Quote Link to comment Share on other sites More sharing options...
Heaven/TQA Posted September 30, 2017 Share Posted September 30, 2017 (edited) ... Edited October 1, 2017 by Heaven/TQA 2 Quote Link to comment Share on other sites More sharing options...
VladR Posted October 1, 2017 Author Share Posted October 1, 2017 (edited) I just googled PICO-8 and found out what it is. Now I understand why your code is in LUA (which was quite a major WTF for me ). Out of curiosity - why do you prefer doing stuff for virtual console, rather than , say, physical Atari ? What exactly hooked you on ? EDIT: I just browsed through the forums and noticed the compos and such - I can now see the retro appeal - just can't get over the lua thing... Edited October 1, 2017 by VladR 2 Quote Link to comment Share on other sites More sharing options...
Heaven/TQA Posted October 1, 2017 Share Posted October 1, 2017 (edited) Released yesterday.... For me it was just fun... and no overhead in terms of directx etc and as I dont know c++ etc I can not code PCs So I stick to retro consoles and Popmilo mentioned pico some years ago and now suddenly it was time to play around and I really like it... built in basic 128x128 16 colors 32k... well And lua sux big but kept pico going once you get around its specialities.... As you see the complete code was coded in several weeks while all the 3d shit was done in days. Even the fractalus part... no comparison to 6502 Edited October 1, 2017 by Heaven/TQA 2 Quote Link to comment Share on other sites More sharing options...
Sheddy Posted October 1, 2017 Share Posted October 1, 2017 (edited) Sorry for the offtopic, but I don't get the pico8 choice of 128x128. Surely 256x256 would be much closer to early 8bit and console and you start to get some nice sprite definition to screen size about that point. (We all know that even 160x is not really that great for sprite games)... Edited October 1, 2017 by Sheddy Quote Link to comment Share on other sites More sharing options...
VladR Posted October 1, 2017 Author Share Posted October 1, 2017 Released yesterday.... For me it was just fun... and no overhead in terms of directx etc and as I dont know c++ etc I can not code PCs So I stick to retro consoles and Popmilo mentioned pico some years ago and now suddenly it was time to play around and I really like it... built in basic 128x128 16 colors 32k... well And lua sux big but kept pico going once you get around its specialities.... As you see the complete code was coded in several weeks while all the 3d shit was done in days. Even the fractalus part... no comparison to 6502 I do really enjoy seeing the Fractalus in a clean, smooth motion ! What kind of performance / instruction throughput do you have available on PICO ? I didn't find any performance specifics. Sorry for the offtopic, but I don't get the pico8 choice of 128x128. Surely 256x256 would be much closer to early 8bit and console and you start to get some nice sprite definition to screen size about that point. (We all know that even 160x is not really that great for sprite games)... I guess they just wanted something closer to A2600, or perhaps a handheld type of thing. 128x128 is basically close to a narrow mode of 160x192 on A800... Quote Link to comment Share on other sites More sharing options...
+MrFish Posted October 1, 2017 Share Posted October 1, 2017 (edited) 128x128 is basically close to a narrow mode of 160x192 on A800... Closer to narrow mode for Antic D: 128 x 96... Edited October 1, 2017 by MrFish 2 Quote Link to comment Share on other sites More sharing options...
popmilo Posted October 1, 2017 Share Posted October 1, 2017 Sorry for the offtopic, but I don't get the pico8 choice of 128x128. Surely 256x256 would be much closer to early 8bit and console and you start to get some nice sprite definition to screen size about that point. (We all know that even 160x is not really that great for sprite games)... My guess because it was enough ? Limits like this made it so much more productive. For 256x256 you need a good artist to make a likable game. In smaller resolution you can draw ten pixels and it looks like a small human 1 Quote Link to comment Share on other sites More sharing options...
Sheddy Posted October 1, 2017 Share Posted October 1, 2017 Hmm, guess it's enough to be educational, but if it's about limits why do you get 128 sprites! Still, looks fun. 3 Quote Link to comment Share on other sites More sharing options...
emkay Posted October 1, 2017 Share Posted October 1, 2017 Hmm, guess it's enough to be educational, but if it's about limits why do you get 128 sprites! Still, looks fun. Looks a lot inspired by the Gameboy specs. 2 Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.