Willsy Posted December 2, 2017
What's the accepted way to disassemble an op-code into an instruction mnemonic? I mean, given an op-code, how does one first identify which instruction format the op-code belongs to? There must be an efficient sequence to go through, but starting at the op-code formats and their op-code fields is not inspiring me. Anyone?

+mizapf Posted December 2, 2017
In TIImageTool's disassembler, I keep the mnemonics with their base opcode in a table, together with a number specifying the format, and the number is an index in a table with masks. For each presented opcode, I run through the list of mnemonics, applying the associated mask to the opcode, and compare the result to the base opcode. If they match, I know the command and the format and can then continue with the operands.

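The mask-and-compare lookup described in this post could be sketched in Python roughly as follows. This is only an illustration: the names `MASKS`, `MNEMONICS` and `decode`, the format index numbers, and the small subset of instructions are assumptions made for the example and are not taken from TIImageTool's source.

```python
# Hypothetical mask table: one mask per format index (indices are arbitrary here).
MASKS = [
    0xF000,  # 0: two-operand instructions (opcode plus byte bit in the top 4 bits)
    0xFF00,  # 1: jumps, single-bit CRU and shift instructions (8-bit opcode)
    0xFFC0,  # 2: single-operand instructions (10-bit opcode)
    0xFFE0,  # 3: immediate instructions (11-bit opcode)
]

# Partial mnemonic table: (mnemonic, base opcode, format index).
MNEMONICS = [
    ("mov", 0xC000, 0),
    ("a",   0xA000, 0),
    ("jmp", 0x1000, 1),
    ("sbo", 0x1D00, 1),
    ("srl", 0x0900, 1),
    ("b",   0x0440, 2),
    ("clr", 0x04C0, 2),
    ("li",  0x0200, 3),
]

def decode(word):
    """Return (mnemonic, format index) for a 16-bit word, or None if unknown."""
    for mnemonic, base, fmt in MNEMONICS:
        # Keep only the opcode bits for this format and compare with the base opcode.
        if (word & MASKS[fmt]) == base:
            return mnemonic, fmt
    return None
```

For example, `decode(0x0460)` masks the word with `0xFFC0`, gets `0x0440`, and so reports `b`. The operand fields would then be decoded according to the returned format.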
+mizapf Posted December 2, 2017
In MAME's emulation of the TMS 99xx, I create a kind of B-tree with four levels. Every node may have up to 16 children. The tree is traversed in up to four steps, according to the four hex digits. Commands may appear more than once in that tree. Suppose that the machine instruction is 0460; in that case, we start with child 0 of the root node, then go to its child 4. In the next level, children 4, 5, 6, and 7 all point to the B microprogram. Since B is defined to have a 10-bit opcode, the search is terminated at that point, and control is transferred to the B microprogram. I had to do this tree search because I could not afford a linear search through the list every time the emulated 99xx encounters an operation. This is only possible because the opcode is a contiguous bit string starting at the left; also, we know that every opcode is at least 4 bits long.

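A rough Python sketch of such a four-level, 16-way lookup tree is shown below. It is not MAME's code: the function names, the string leaves standing in for microprograms, and the handful of opcodes are assumptions chosen for illustration. An opcode that ends partway through a hex digit is stored under every child that shares its leading bits, which is why B lands in children 4 to 7.

```python
def new_node():
    # Each node has 16 slots: None (invalid), a str (leaf), or a list (child node).
    return [None] * 16

def insert(root, name, base, opcode_bits):
    """Store `name` for every word whose top `opcode_bits` bits equal those of `base`."""
    node, depth, remaining = root, 0, opcode_bits
    while True:
        nibble = (base >> (12 - 4 * depth)) & 0xF
        if remaining <= 4:
            # The opcode ends inside this hex digit: fill all children that
            # share the significant bits (e.g. 4 children for 2 remaining bits).
            for child in range(nibble, nibble + (1 << (4 - remaining))):
                node[child] = name
            return
        if node[nibble] is None:
            node[nibble] = new_node()
        node = node[nibble]
        depth += 1
        remaining -= 4

def lookup(root, word):
    """Walk the tree one hex digit at a time; stop at the first leaf."""
    node = root
    for depth in range(4):
        entry = node[(word >> (12 - 4 * depth)) & 0xF]
        if entry is None or isinstance(entry, str):
            return entry
        node = entry
    return None

# Illustrative subset: (mnemonic, base opcode, opcode length in bits).
ROOT = new_node()
for name, base, bits in [("mov", 0xC000, 4), ("jmp", 0x1000, 8),
                         ("b", 0x0440, 10), ("clr", 0x04C0, 10),
                         ("li", 0x0200, 11)]:
    insert(ROOT, name, base, bits)
```

With these entries, `lookup(ROOT, 0x0460)` follows child 0, then child 4, then finds `b` in child 6 and stops after three steps.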
Stuart Posted December 2, 2017
If you want to do it manually, then look at section 24.9 of the E/A manual, which lists op-codes and instructions. Then looking at your code, for many instructions you can identify the op-code just by looking at the first 1, 2 or 3 hex digits, without having to get into instruction formats or fields. For example, if the op-code starts with a C then it's a MOV instruction. A 1D is SBO. 09 is SRL. 020 is LI. And so on.

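The same hex-prefix shortcut can be written down as a tiny Python lookup. The sketch below covers only the four examples given in this post; the names `PREFIXES` and `guess_mnemonic` are made up for the illustration, and a full table would have to come from the op-code list in the manual.

```python
# Hex-digit prefixes taken from the examples in the post (not a complete list).
PREFIXES = {"C": "MOV", "1D": "SBO", "09": "SRL", "020": "LI"}

def guess_mnemonic(word):
    """Try the first 1, 2 and 3 hex digits of a 16-bit word against the table."""
    text = f"{word:04X}"
    for length in (1, 2, 3):
        name = PREFIXES.get(text[:length])
        if name is not None:
            return name
    return None
```

For instance, `guess_mnemonic(0x0205)` looks at "0", "02" and then "020", and returns `LI`.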
insomnia Posted December 2, 2017
There's a working implementation in the tms9900 binutils package in binutils-2.19.1/opcodes/tms9900-dis.c. Look for the print_insn_tms9900 function. This function does basically the same thing mizapf describes. Here's some pseudocode:

    index = (opcode >> 12) & 0x0F
    switch(index) {
      case 0:
        index = (opcode >> 8) & 0x0F
        switch(index) {
          case 0, 1, 12, 13, 14, 15:
            // no valid instructions in these ranges
            format[] = {"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", ""}
            break
          case 2, 3:
            // immediate and control instructions, 0x0200..0x03FF
            index = (opcode >> 4) & 0x1F
            format[] = {"li", "", "ai", "", "andi", "", "ori", "",
                        "ci", "", "stwp", "", "stst", "", "lwpi", "",
                        "limi", "", "", "", "idle", "", "rset", "",
                        "rtwp", "", "ckon", "", "ckof", "", "lrex", ""}
            break
          case 4, 5, 6, 7:
            // single-operand instructions, 0x0400..0x07FF
            index = (opcode >> 6) & 0x0F
            format[] = {"blwp", "b", "x", "clr", "neg", "inv", "inc", "inct",
                        "dec", "dect", "bl", "swpb", "seto", "abs", "", ""}
            break
          case 8, 9, 10, 11:
            // shifts, 0x0800..0x0BFF
            format[] = {"", "", "", "", "", "", "", "", "sra", "srl", "sla", "src"}
            break
        }
        break
      case 1:
        // jumps and single-bit CRU instructions, 0x1000..0x1FFF
        index = (opcode >> 8) & 0x0F
        format[] = {"jmp", "jlt", "jle", "jeq", "jhe", "jgt", "jne", "jnc",
                    "joc", "jno", "jl", "jh", "jop", "sbo", "sbz", "tb"}
        break
      case 2, 3:
        index = (opcode >> 10) & 0x07
        format[] = {"coc", "czc", "xor", "xop", "ldcr", "stcr", "mpy", "div"}
        break
      default:
        // two-operand instructions, indexed by the top hex digit
        format[] = {"", "", "", "", "szc", "szcb", "s", "sb",
                    "c", "cb", "a", "ab", "mov", "movb", "soc", "socb"}
        break
    }
    if(format[index] != "")
      decode as format[index]
    else
      invalid instruction

There's more to it than this, but this should give you somewhere to start. Good Luck!

ralphb Posted December 4, 2017
It's not a straight lookup, unless you want a lookup table with 65536 entries. Each instruction format has a particular opcode length associated with it, so you have to iterate over all formats and see if the word you have, ANDed with a mask for that format's opcode length, matches any opcode with that format. To see an implementation, see xda99.py in the xdt99 suite, search for "def decode".

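One way to picture this iteration over opcode lengths is the Python sketch below. It is not the code from xda99.py: the name `OPCODES_BY_LENGTH`, the function `decode_by_length`, and the partial opcode tables are assumptions for illustration only.

```python
# Hypothetical, partial tables: opcode length in bits -> {base opcode: mnemonic}.
OPCODES_BY_LENGTH = {
    4:  {0xC000: "mov", 0xA000: "a"},
    6:  {0x2000: "coc", 0x2800: "xor"},
    8:  {0x1000: "jmp", 0x0900: "srl"},
    10: {0x0440: "b", 0x04C0: "clr"},
    11: {0x0200: "li"},
}

def decode_by_length(word):
    """Mask the word to each opcode length in turn and look for a match."""
    for length, table in OPCODES_BY_LENGTH.items():
        # Mask that keeps the top `length` bits of a 16-bit word.
        mask = (0xFFFF << (16 - length)) & 0xFFFF
        name = table.get(word & mask)
        if name is not None:
            return name, length
    return None
```

Because each length needs only one dictionary lookup, the loop runs at most once per distinct opcode length rather than once per mnemonic.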
matthew180 Posted December 13, 2017
9900 instruction decoding is pretty straightforward, especially when compared to the 8-bit CPUs of the day. The 9900 is very orthogonal and uses 7 (depending on how you count) so-called "instruction formats":

          0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |10 |11 |12 |13 |14 |15 |
         ---------------------------------------------------------------+
1 arith   1 |opcode | B |  Td   |       D       |  Ts   |       S       |
2 arith   0   1 |opc| B |  Td   |       D       |  Ts   |       S       |
3 math    0   0   1 |  opcode   |    D or C     |  Ts   |       S       |
4 jump    0   0   0   1 |    opcode     |      signed displacement      |
5 shift   0   0   0   0   1 |  opcode   |       C       |       W       |
6 pgm     0   0   0   0   0   1 |    opcode     |  Ts   |       S       |
7 ctrl    0   0   0   0   0   0   1 |    opcode     |     not used      |
7 ctrl    0   0   0   0   0   0   1 | opcode & immd | X |       W       |

To isolate the format you use a priority encoder. Notice that within the first 7 bits, the position of the first set bit identifies the format. After that the opcode field is 1..4 bits depending on the format, which means there are no more than 16 instructions in the most complex format. The other bits are used in various ways to identify what registers are being operated on, what kind of memory operations, etc. Notice, for example, that the source (S) field is *always* in bits 12..15 and the source-mode (Ts) is always bits 10..11, in all the formats that use Ts and S. Same for Td, D, and W.

If you approach this from a hardware perspective instead of a functional model (i.e. don't think like a programmer), it is really very easy to take apart the instructions and figure out what they are, and what they are operating on.

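In software, the priority encoder amounts to finding the first set bit among bits 0..6 (TI numbering, bit 0 being the most significant). The Python sketch below is one possible rendering under that assumption; the function names are hypothetical, and the format numbers 1 to 7 follow the rows of the table in this post rather than any official numbering.

```python
def instruction_format(word):
    """Return 1..7 according to the first set bit in the top 7 bits, or None."""
    top7 = (word >> 9) & 0x7F          # bits 0..6 in TI numbering
    if top7 == 0:
        return None                    # no format bit set: not a valid instruction
    leading_zeros = 7 - top7.bit_length()
    return leading_zeros + 1

def source_fields(word):
    """Extract Ts (bits 10..11) and S (bits 12..15), shared by several formats."""
    return (word >> 4) & 0x3, word & 0xF

def dest_fields(word):
    """Extract Td (bits 4..5) and D (bits 6..9) for the two-operand formats."""
    return (word >> 10) & 0x3, (word >> 6) & 0xF
```

For the word 0460 used earlier in the thread, `instruction_format(0x0460)` gives 6 (the "pgm" row), and `source_fields(0x0460)` gives Ts = 2 and S = 0.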
Willsy Posted December 13, 2017
Thank you Matthew. That is the clearest explanation I've ever seen. I was wondering how the 9900 did it. That's how.