
Volker Barthelmann's/Frank Wille's VASM assembler for the JRISC


JagChris


Jaguar related vasm updates, Sept 2016:

 

http://sun.hasenbraten.de/vasm/

 


  • mot-syntax: Retain the possibility to use multiple ORG directives and output a real binary file, as before V1.7e. Only an ORG directive after a SECTION directive is now relocated within that section.
  • mot-syntax: Added IFMI (=IFLT) and IFPL (=IFGE).
  • mot-syntax: Fixed problems when macro arguments are followed by blanks and the -spaces option is given.
  • mot-syntax: New directives inline and einline, to define an isolated block for local labels.
  • mot-syntax: New directives ifmacrod and ifmacrond to conditionally assemble a block when a macro is defined or undefined (compatible with the same directives on the Barfly assembler).
  • madmac-syntax: Make ORG behave like in mot-syntax.
Edited by JagChris

Sept 2016 vlink updates:

 

http://sun.hasenbraten.de/vlink/

 

 


  • Reworked reading and writing of relocations. Support more complex relocations.
  • Fixed inserting little-endian relocations, which only worked before when the reloc-fields started and ended on byte-boundaries.
  • Do not include zero-bytes at the end of a section into the initialized file size, for those file formats which support that.
  • (elf) Fixed illegal memory access when linking an ELF program without a linker script.
  • (elf32jag) Fixed MOVEI-relocation.
  • (vobj) For little-endian the bit position in a byte of a reloc-field is now counted from right to left. VOBJ files by vasm-versions older than V1.7f may be incompatible for complex little-endian relocations (e.g. ARM).

  • 3 months later...

http://sun.hasenbraten.de/vasm/

 

 

02-Nov-2016: vasm 1.7g.
  • Avoid a crash or internal error in some output modules, when an equate refers to an undefined (imported) symbol.
  • m68k: Optimizing/translating to 68020+ addressing modes for MOVEM didn't work. For example MOVEM (40000,A0) was not automatically translated into a bd32 addressing mode, but had to be explicitly written as (40000.l,A0).
  • m68k: Optimizing MOVEM was not attempted, when using a register list symbol in one operand and a label in the other.
  • m68k: Fixed -opt-movem & -opt-speed optimization of MOVEM with two registers, which saw a wrong instruction size and moved following labels.
  • aout-output: Instructions with more than one relocation were no longer supported since V1.7f.

 

 

http://sun.hasenbraten.de/vlink/

 

 

01-Nov-2016: vlink 0.15c.
  • Fixed problems with relocation addend sign-extension, introduced in the last version.

Now we need vbcc supporting risc like vasm....

Up until recently, I was in line waiting for something like this (for a looooong time). But 2 weeks ago I bit the bullet and dug deep into GPU syntax during evenings.

 

From this [very short, so far] experience: in that timeframe I came up with about 2 dozen macros that emulate various aspects of the most basic C - all within the limits of smac's available syntax, of course.

 

So, it's entirely realistic to have just the core, tight loops in pure ASM, and the rest of the code in easy-to-read high-level macros that handle the local stack, subroutines, loops, data transfers, conditions, reading/writing/incrementing/decrementing variable pointers, vblank and calling your own functions (despite the GPU not directly supporting anything like that, as there is no stack).

 

Which means, that at that point, you don't actually really need the C-to-JRISC anymore.

 

What would be immensely helpful, on the other hand, is being able to debug the inner tight loops. That in itself is, like, 2 orders of magnitude more important than full C syntax.

 

Actually, I just realized, that I could use my VRBasic backend, and tweak it for the JRISC syntax, expand those macros I have written so far, and quickly implement the 3 most important remaining C features in their proper C syntax:

- loops (for (i = 0; i < 10; i++)),

- conditions (if (R17<27) R18++)

- simple math expressions (R11 = ((R7 + R13 - 1)*R5) >> 7) that in ASM take up lots of lines
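
To make the "lots of lines" point concrete, here is a rough sketch of how a preprocessor could lower a left-to-right chain of operations into two-operand assembly text. Everything in it is hypothetical: `lower_chain` and the `OPS` table are made-up names, the mnemonic spellings are assumptions, and real constraints (immediate ranges, the operand width of the multiply, signedness) are not modelled at all.

```python
# Hypothetical sketch: turn dest = (((first op a1) op a2) ...) into
# two-operand assembly text. Mnemonics are emitted as plain strings and
# are NOT checked against any real assembler.

# operator -> (register form, immediate form); None means "not covered here"
OPS = {
    "+": ("add", "addq"),
    "-": ("sub", "subq"),
    "*": ("mult", None),
    ">>": (None, "shrq"),
    "<<": (None, "shlq"),
}


def lower_chain(dest, first, steps):
    """Return assembly lines that evaluate the chain into register `dest`."""
    lines = [f"{'move':<7} {first},{dest}"]
    for op, arg in steps:
        reg_form, imm_form = OPS[op]
        # Integers select the immediate form, strings the register form.
        mnemonic = imm_form if isinstance(arg, int) else reg_form
        if mnemonic is None:
            raise ValueError(f"operand {arg!r} not supported for {op!r} in this sketch")
        lines.append(f"{mnemonic:<7} {arg},{dest}")
    return lines


# R11 = ((R7 + R13 - 1) * R5) >> 7
EXAMPLE = lower_chain("r11", "r7", [("+", "r13"), ("-", 1), ("*", "r5"), (">>", 7)])
```

One C-style line becomes five assembly lines here, one per operation plus the initial move, because every instruction overwrites its destination register.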

 

 

Hmmm, great thread! I think I just might go and do the above, and make my life much simpler for the GPU coding :)


As I've said many times in the past, I have no more interest in publicly supporting a language that doesn't seem to attract many coders. This is an obscure platform, so that's to be expected. It was fun as a dream a few years ago, but it's clear now that the effort is wasted.

 

The time I would spend on adding any new VRBasic features is much better spent on 3D experimenting with the jag, or like now - learning the GPU.

 

That being said, the VRBasic backend can indeed make my GPU debugging much faster and more productive: with a few minor tweaks to the C# codebase, I can introduce C-style syntax, as the backend already does all the heavy language-parsing lifting (parsing, labels, copy-pasting pass-through code, expressions ...). I don't even need to implement the GPU functionality, as the backend will just copy-paste all unknown commands as-is, so all I really need to do is implement those 3 features (conditions, loops, math expressions). It's already integrated in my build config, so I'll just exchange the parser executable and it'll work as it is with smac.

 

I'm currently thinking that if I actually bothered to implement the virtual machine (there aren't that many instructions in the GPU), I could use VisualStudio's debugger to step through the GPU code inside VS and see all the registers and variables live (extremely handy for what I'm doing right now - checking the contents of registers after a loop has run a couple hundred times). Since the GPU's instructions only work on 2 operands, the implementation of the VM would be very easy - just adjust the register or memory value after executing each instruction. And I wouldn't even need all instructions. ~20 would be enough (LOAD/STORE/MOVE, ADD/SUB, DIV/MULT, SHRQ/SHLQ, NEG/ABS, AND/OR/NOT/XOR, JUMP/JR).
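
A minimal sketch of that idea, in Python rather than C#, and only as a toy: `TinyVM`, its tuple-based program format and the `jr_ne` pseudo-branch are all invented for illustration. It ignores flags, delay slots, memory and the real instruction encodings, so it shows the "adjust a register after each instruction, break after N hits" loop and nothing more.

```python
class TinyVM:
    """Toy two-operand register machine; NOT a faithful JRISC model."""

    MASK = 0xFFFFFFFF  # registers are treated as 32-bit values

    def __init__(self, program):
        self.program = program   # list of (mnemonic, src, dst) tuples
        self.regs = [0] * 32     # one bank of 32 registers
        self.pc = 0              # index into self.program
        self.hits = {}           # pc -> number of times executed

    def step(self):
        """Execute one instruction and update registers and pc."""
        op, src, dst = self.program[self.pc]
        self.hits[self.pc] = self.hits.get(self.pc, 0) + 1
        self.pc += 1
        r = self.regs
        if op == "movei":        # src is an immediate value here
            r[dst] = src & self.MASK
        elif op == "move":
            r[dst] = r[src]
        elif op == "add":
            r[dst] = (r[dst] + r[src]) & self.MASK
        elif op == "sub":
            r[dst] = (r[dst] - r[src]) & self.MASK
        elif op == "jr_ne":      # hypothetical: go to index dst if r[src] != 0
            if r[src] != 0:
                self.pc = dst
        else:
            raise ValueError(f"unknown instruction {op!r}")

    def run_until(self, break_pc, hit_count):
        """Stop just before break_pc executes for the hit_count-th time.

        Returns True if the breakpoint was reached, False if the program ended.
        """
        while self.pc < len(self.program):
            if self.pc == break_pc and self.hits.get(break_pc, 0) == hit_count - 1:
                return True
            self.step()
        return False


# A loop that adds 3 to r0 two hundred times.
PROGRAM = [
    ("movei", 0, 0),     # 0: r0 = 0   (accumulator)
    ("movei", 200, 1),   # 1: r1 = 200 (loop counter)
    ("movei", 1, 2),     # 2: r2 = 1
    ("movei", 3, 3),     # 3: r3 = 3
    ("add", 3, 0),       # 4: r0 += r3
    ("sub", 2, 1),       # 5: r1 -= r2
    ("jr_ne", 1, 4),     # 6: loop while r1 != 0
]
```

Calling `TinyVM(PROGRAM).run_until(4, 100)` stops right before the 100th pass through the loop body, at which point the register list can be inspected directly instead of being printed on screen.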

 

I already have the C# backend hooked up to the XNA window, so I can already right now display the contents of Framebuffer anyway, so I'd actually have a debugging set-up for jag GPU work inside VisualStudio (well, at least for software-rasterizing functionality that does not use jag's blitter, but that's what I'm working on right now anyway).

 

I never thought of this use case while working on VRBasic codebase, but looks like it can be reused for something much better. This would be actually much more productive than gdb, that I don't even have hooked up to jag yet.

 

I just spent 2 hrs chasing one texturing bug that manifests only in corner cases, after the code has executed a couple hundred times. If I had it running in VisualStudio, I suspect it would take about 3 minutes to find (just set a breakpoint to break after 100 hits), so I think I've crossed the threshold: rather than keep printing values on the jag's screen, I'd spend a weekend implementing a simple VM for those ~20 instructions.


You should also read about the known bugs in the GPU. Some instructions don't give the result back right away, so you must add a NOP before reading the result or write your code interleaved. There are also some rules to follow when using external RAM, because there are situations where it does not work.

Write your experiences and questions here, of course; this thread is very interesting to read.


It's too early for that. But, I'm going to look into gpu-in-main very soon, as I wrote lots of support/debugging code (like displaying text, converting to hex/binary/decimal, various statistics) that is absolutely not needed to run at full performance of local cache (unlike, say, texturing routine, that absolutely must run at full performance of cache), it just needs to be run when I'm hunting for bugs - but it's occupying precious cache. Right now, my support / debugging code with all variables takes up almost half of the cache, so it would be really nice to just stash it somewhere into main memory. Also, that functionality could get really fancy very soon, and I could add some local disassembler / memory dump at run-time (upon, say, a joystick press).

 

If I remember correctly (it was a long time ago I read it), it just needs to be double-phrase-aligned (for the jumps to work), or something like that. Plus, there's .gpumain directive in smac.

 

 

And I think I already ran into the problem you mention (scoreboarding) a few times - when I display the contents of registers in certain situations, they hold different values than they should (even after I quadruple-check on paper, in Excel, or by displaying values via my debug routines), so I guess I should revisit those text files with the known GPU bugs, issues and glitches...

 

As for other experiences, I think the most shocking one, performance-wise, is that you can get, literally, free division if you play with interleaving. I've benchmarked one tight loop (repeat 100,000 times) and the 16-cycle overhead of the division was completely eliminated by proper interleaving. Move it one spot further, and bam, the performance visibly suffers.

 

By comparison, division is extremely slow on the 68k. The asm code for it is, like, 4 pages (or similar). I can now just brute-force everything on the GPU, and use bit shifting only occasionally. The only ugly thing about the division is that it's unsigned, so you have to write your own version that handles negative numbers (but that's not a big deal with ABS and NEG).
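
The wrapper logic is short enough to sketch. This is Python standing in for the assembly, with made-up names (`udiv32`, `sdiv32`): take the absolute value of both operands, do the unsigned divide, then negate the quotient if the signs differed. Division by zero and the hardware's exact overflow behaviour are not modelled.

```python
MASK32 = 0xFFFFFFFF


def udiv32(a, b):
    """Stand-in for an unsigned 32-bit divide (divide-by-zero not modelled)."""
    return (a & MASK32) // (b & MASK32)


def sdiv32(a, b):
    """Signed divide built on the unsigned one: ABS both, NEG if signs differ."""
    negative = (a < 0) != (b < 0)
    quotient = udiv32(abs(a), abs(b))
    return -quotient if negative else quotient
```

Written this way the result truncates toward zero, the same as integer division in C, so `sdiv32(-7, 2)` gives -3 rather than -4.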


  • 1 month later...

06-Feb-2017: vlink 0.15d.

  • Section-trimming, introduced with V0.15b, did not work well with ELF executables. Fixed that.
  • New options -gc-all and -gc-empty for section garbage-collection.
  • New option -Z to prevent the linker from automatically removing a section's trailing zero-bytes.
  • New output format: jagsrv. Absolute raw binary output, similar to rawbin1, but with a header to make it load and execute via the Atari Jaguar SkunkBoard or the VirtualJaguar emulator.
  • (ados/ehf): Fixed a memory leak in relocation output and optimized it.
  • (elf) Allow N_FUN stabs without a relocatable label in n_value.

http://sun.hasenbraten.de/vlink/

Edited by JagChris

 

 

14-Feb-2017: vasm 1.7h.

  • Implemented a dynamic line buffer. No limitations on line lengths anymore.
  • Octal escape sequences are limited to a maximum of three digits.
  • Allow assembler text output (echo, printv) in offset sections.
  • Print a warning for initialized data in a bss-type section. This already worked in the past (1.2c and later), but has been lost somewhere.
  • Some single-character labels and symbols will be rejected (depending on the syntax module).
  • -maxerrors=0 should print all errors in the source.
  • Print expressions in the listing file and the test output in decimal and hexadecimal form.
  • m68k: Immediate- and PC-relative destination addressing modes for 68851 PMOVE are not allowed. PMOVE ea,PCSR doesn't exist.
  • 6502: Perform zero-page optimization with a known label from an absolute section.
  • std-syntax: Fixed problem with parentheses in character constants.
  • oldstyle-syntax: New option -org=<address> to set the absolute base address of the program from the command line.
  • oldstyle-syntax: Implemented some listing file directives, but without any function yet: nam, subttl, page, space.
  • bin-output: Fixed output section sorting, which didn't work with some implementations of qsort().
  • elf-output: Fixed external references in stabs.
  • elf-output: Use a hash table for ELF symbols to speed up the output.
  • hunk-output: Optimization to make it faster with many sections.
  • test-output: Fixed crash when printing stabs without a value.

 

http://sun.hasenbraten.de/vasm/


  • 2 weeks later...

I hope they do more than just "target" the Jaguar. The whole "GPU in Main" thing seems too complicated to have real world benefit - especially when you'd have to manage switching bits and pieces in and out while avoiding gotchas. If this becomes fully usable maybe someone would demonstrate if this assumption is correct.

 

Sorry to hear you dropped your BASIC project, VladR. I know from experience that real-life issues and motivation can kill a project.


 

Which compiler would you use in conjunction with this, then? I'm not talking about the list of compatible syntax in the docs. I'm talking about what one would actually choose to use.

Well, for the 68k you've got a wide choice of compilers.

 

For the GPU you only have one choice.

 

http://www.3do.cdinteractive.co.uk/viewtopic.php?f=35&t=3356#p37117


  • 2 months later...
Although it won't be able to assemble both m68k and gpu code at the same time.

 

Can you explain that a bit? Do you mean that you'll have to issue two different commands or that you'll have to use a completely separate program to do the m68k?

 

Edit: Nevermind, I think I understand now. One specifies the architecture when compiling vasm itself, as per this document: http://sun.hasenbraten.de/vasm/index.php?view=compile

Edited by rocky1138

 

No, he can't.

 

[attachment: vasm.wankobanego.jpg]

 

Hey, we all start somewhere :)

 

Edit: At first I thought you were poking at my newbieness, but after reading this thread all the way through, I realized that you're just a bully trying your hardest to make fun of my boy JagChris. Grow up. If you don't like something, just don't reply. You know the old adage: If you don't have anything nice to say, say nothing at all.

Edited by rocky1138

 

Can you explain that a bit? Do you mean that you'll have to issue two different commands or that you'll have to use a completely separate program to do the m68k?

 

Edit: Nevermind, I think I understand now. One specifies the architecture when compiling vasm itself, as per this document: http://sun.hasenbraten.de/vasm/index.php?view=compile

Yeah you have to assemble and link them separately.


 

 

16-May-2017: vasm 1.8.

  • External references in ORG or RORG sections are allowed.
  • Option -depend only prints relative include file names, while the new option -dependall prints all included file names, also with absolute paths.
  • m68k: Support for Apollo Core 68080 and AMMX ISA.
  • m68k: MSP, ISP and MMUSR are not valid 68060 control registers.
  • 6502: Fixed potential segfault during zero-page optimization (new since last version).
  • jagrisc: Fixed SHLQ instruction.
  • mot-syntax: Make NREF directive work for PhxAss compatibility. Allows optimization of absolute references to base-relative.
  • std-syntax: Labels ending on '$' are only local when all preceding characters are digits.
  • madmac-syntax: Fixed .long directive (which only aligned to even bytes).
  • oldstyle-syntax: New options -i (ignore everything in the operand after a blank), -noc (no C-style constant prefixes) and -noi (no intel-style constant suffixes).
  • oldstyle-syntax: Z80 supports multiple directives or instructions per line, separated by a ':' character.
  • oldstyle-syntax: Fixed parser problem with nested repeat/endrepeat blocks.
  • output-hunk: -kick1hunks must not forbid base relative relocs and references. It was supported by some 1.3 linkers (blink for example).

 

Two Jaguar-specific updates for this month in VASM.

 

http://sun.hasenbraten.de/vasm/

Edited by JagChris

General Vlink updates for this month:

 

 

 

16-May-2017: vlink 0.16.
  • Fixed a potential crash when linking with empty object files, while using a linker script.
  • (ados/ehf): Support blink/slink linker symbols _RESLEN, _RESBASE, _NEWDATAL for generating resident (pure) programs.
  • (ados/ehf): Fixed SAS/C-compatibility linker symbol __BSSLEN. Now it represents the number of long words instead of the number of bytes. WARNING! Make sure to check your code, if you used __BSSLEN before!
  • (ados/ehf): AmigaOS LoadSeg() (up to V40) has a problem with allocating data-bss sections, which have an initialized size of 0. Implemented a workaround for this case.
  • (elf) Fixed crash in dynamic linking due to section-trimming.
  • (elf,aout) Malformed library archive files are no longer fatal, but will be ignored.
  • (rawseg) Do not write output sections marked with NOLOAD.

 

http://sun.hasenbraten.de/vlink/


  • 3 months later...
  • 4 weeks later...

I'm running vasm 1.8a and the MOVE PC,Rn instruction of the jagrisc CPUs is giving me an internal error.

 

I copied this macro from some Atari example code:

    .MACRO    JSR            ; Jump to subroutine macro
                        ;    trashes R6
    subq    4,SP        ; adjust the stack pointer
    nop
    move    PC,r6        ; determine current program address
    addq    16,r6        ; new address for after this macro
    store    r6,(SP)        ; push return address onto stack
    movei    \1,r6        ; load up subroutine address

    jump    (r6)        ; jump to subroutine
    nop                    ; jump doesn't occur until this instr

    .ENDM
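
As a side note, the `addq 16,r6` constant in the macro can be sanity-checked with a little arithmetic. The sketch below rests on assumptions that are not stated anywhere in this thread: that `move PC,r6` yields the address of the MOVE instruction itself, that every instruction is one 16-bit word, and that MOVEI is followed by a 32-bit immediate. The names `SIZES` and `return_offset` are made up for the example.

```python
# Assumed byte sizes: one 16-bit word per instruction, except MOVEI,
# which is assumed to carry an extra 32-bit immediate (2 + 4 bytes).
SIZES = {"move": 2, "addq": 2, "store": 2, "movei": 6, "jump": 2, "nop": 2}

# Instructions from "move PC,r6" through the delay-slot NOP, in macro order.
FROM_MOVE_PC = ["move", "addq", "store", "movei", "jump", "nop"]


def return_offset(instructions):
    """Bytes from the MOVE PC instruction to the first instruction after the macro."""
    return sum(SIZES[name] for name in instructions)
```

Under those assumptions `return_offset(FROM_MOVE_PC)` comes out to 16, which matches the constant the macro adds to r6.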

And when I try to use it, I get this error in vasm:

vasm 1.8 (c) in 2002-2017 Volker Barthelmann
vasm Jaguar RISC cpu backend 0.4b (c) 2014-2017 Frank Wille
vasm madmac syntax module 0.4b (c) 2015-2017 Frank Wille
vasm vobj output module 0.8 (c) 2002-2016 Volker Barthelmann

fatal error 5 in line 4 of "JSR": internal error 0 in line 476 of cpus/jagrisc/cpu.c
        called from line 7 of "FIXED_MMULT_PRODUCT"
        called from line 349 of "dsp_matrix.jerry.s"
>       move    PC,r6           ; determine current program address

aborting...

make: *** [Makefile:43: obj/dsp_matrix.o] Error 1

Edited by Luigi301