Proposed database format for 8-bit media, need comments!

+gnusto · September 22, 2019

Hey folks, long story short I'm putting together a database for everything Atari 8-bit media. Goals:

* Establish a format by which we can exchange high resolution info on the appropriate settings (emulation) or environment (hardware) needed to run anything Atari 8-bit flavored. One that is durable for decades and removes some of the need for awkward naming conventions.

* Reasonably efficient in space, but not so stingy with bytes that it invites errors such as hash collisions

* Laid out so it would work in utf8 CSV, or XML, so it can cross language borders with ease

* Eventual purpose of the DB is to be a data file that can be consulted with physical hardware or run with a shim/plugin to establish exactly the right settings for an emulator

Here is the layout I am proposing, I'm interested if anyone sees modifications that need to be made. If we can get agreement that this is complete I'll write the simple code to fill in the deterministic fields (I am a C++/C#/python/bash programmer), and then we'll tackle all the "settings" stuff - I can seed it with a lot of values out of my front end setup that exists. With each category here I'll add a bit of explanation.

Categories:

Hash - (64 bytes hex SHA256) This is a checksum of the file in question, for instance a CAS image, ATR, or Basic program. Why SHA and not MD5? While MD5 is still very prevalent, it is deprecated for secure use and collisions can be produced trivially. Some joker could seed alternate versions of files to collide with values already in the DB, and SHA256 isn't prone to that failure.
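
As a rough illustration of how the deterministic Hash field could be filled in, here is a minimal Python sketch. The function name `sha256_hex` and the chunk size are assumptions made for this example, not part of the proposal.

```python
import hashlib

def sha256_hex(path, chunk_size=65536):
    """Return the SHA-256 of a file as 64 lowercase hex characters."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so a large image never has to fit in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The result is always 64 hex characters, which matches the field width given above.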

Common name - (256 bytes, utf8) The filesystem name of this file. For reference I would probably go with this order of authority : a8preservation -> CSS crack -> TOSEC -> Others. So as you go through that list, the first one with a match "wins". The others could still be added as duplicate variants though. Preference here is to always keep the original name, even if it has naming convention features that are redundant to this DB, e.g. "OS-B" as part of the name would be preserved.

Root Media Hash - (64 bytes hex SHA256) In the case of grouped disks/cassettes/files, this value indicates the "start" media. For instance if we are looking at a disk 3 of 4, this value would indicate Disk 1. This + the values immediately below mean you can be assured of finding all of the files necessary to run something given any single instance of those files.

Media Sequence - (1 byte). In this case 0 means the file is standalone, e.g. an xex, and a non zero value indicates where in the sequence of needed media this file exists, e.g. disk 3 of a set would be "3".

Media Count - (1 byte). Again 0 means standalone, anything else is the count of total files required. 6 disk set would mean this value would be "6"

Next Media Hash - (64 bytes hex SHA256). Hash of the media that would go after this one in sequence. This allows you to find an entire sequence of media, by going to the root, then reading forward a file at a time.
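
The four media fields above could be walked roughly as follows. This is only a sketch: the dictionary layout, the key names, and the short placeholder strings standing in for 64-character hashes are all assumptions for illustration.

```python
def media_chain(records, any_hash):
    """Return the ordered hashes of the full set that contains any_hash."""
    record = records[any_hash]
    if record["media_count"] == 0:
        return [any_hash]  # standalone file, e.g. an xex
    chain = []
    current = record["root_media_hash"]
    # Follow next_media_hash from the root. The count bounds the walk, so a
    # malformed loop in the data cannot spin forever.
    while current and len(chain) < record["media_count"]:
        chain.append(current)
        current = records[current]["next_media_hash"]
    return chain

# Hypothetical three-disk set plus one standalone file.
records = {
    "aaa": {"root_media_hash": "aaa", "media_sequence": 1,
            "media_count": 3, "next_media_hash": "bbb"},
    "bbb": {"root_media_hash": "aaa", "media_sequence": 2,
            "media_count": 3, "next_media_hash": "ccc"},
    "ccc": {"root_media_hash": "aaa", "media_sequence": 3,
            "media_count": 3, "next_media_hash": None},
    "xex": {"root_media_hash": None, "media_sequence": 0,
            "media_count": 0, "next_media_hash": None},
}
```

Given any one disk of the set, `media_chain(records, "ccc")` walks back to the root and returns the whole sequence in order.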

Cartmapper - (1 byte value). If this is a cart, its cart mapper value (banks).

CPU Type - 1 byte enumeration. 0=6502, 1=65C02, 2=65C816 @ 1.79Mhz, 3=65C816 @ 3.58Mhz, 4=65C816 @ 7.14Mhz, 5=65C816 @ 10.74Mhz, 6=65C816 @ 14.28Mhz, 7=65C816 @ 17.90Mhz, 8=65C816 @ 21.48Mhz. Note that many methods of displaying the data would provide the human readable version, and even with the pure number, the pattern would be fairly obvious.

Host Machine Type - 1 byte enumeration. 0=400/800, 1=1200xl, 2=800xl, 3=130xe, 4=5200, 5=xegs, rest for future use. This *could* be a bitfield to show all the different configurations a given file can run on.

Host Memory - 4 bytes integer (in KB). This could arguably be 3 bytes * 8, but 1 byte is worth human readable.

Bank Selection type - 1 byte enumeration. 0=normal/none, 1=Atari (130XE), 2=3 bit PIA/192K (Compy, Innovative), 3=4 bit PIA/256K (Newell), 4=4 bit PIA/256K (Rambo), 5=4 bit PIA (Atari Magazin/320K), 6=5 bit PIA/576K (Compy), 7 = 6 bit PIA(1088K), 8 = 8 bit PIA/4MB "0"

Kernel ROM hash - (64 bytes hex SHA256). Why not just an enumeration? Well, first there are in fact a fair number of official ROMs. But having it be a hash means that all future modified/hacked/optimized versions are accounted for. I thought about having another field here, a bitfield of all of the standard kernels that the file will run on, which might be helpful in addition to this. It would be interesting to write an automated tester that rolled through every rom + basic combination!

Basic ROM hash - (64 bytes hex SHA256). Same logic as above. "0" means not basic compatible file.

Video Standard - 1 byte bitfield (bit 0 = future use, bit 1 = ntsc, bit 2 = pal, bit 3 = secam, bit 4 = VBXE, bit 5 = VBXE shared memory, bit 6 = VBXE $D7xx register window)
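
A reader for a bitfield like this one might look like the sketch below. The class and function names are assumptions for illustration; the bit positions follow the list above, with bit 0 left reserved.

```python
from enum import IntFlag

class VideoStandard(IntFlag):
    # Bit 0 is reserved for future use in the proposal.
    NTSC = 1 << 1
    PAL = 1 << 2
    SECAM = 1 << 3
    VBXE = 1 << 4
    VBXE_SHARED_MEMORY = 1 << 5
    VBXE_D7XX_WINDOW = 1 << 6

def decode_video_standard(value):
    """Return the names of the flags set in a 1-byte Video Standard field."""
    return [flag.name for flag in VideoStandard if value & flag]
```

The same pattern would apply to the other bitfields in the list (HW Patch, SIO Patch modes, and so on).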

Refresh Rate - 2 digits in Hz. Allows for NTSC/PAL 50/60 mismatches, or arbitrary weirdness.

HW Patch compatible - 1 byte bitfield (bit 0 = future use, bit 1 = cassette SIO patch, bit 2 = disk SIO patch, bit 3 = PRT SIO Patch, bit 4 = disk burst I/O, bit 5 = PRT burst I/O, bit 6 = fast boot, bit 7 = fast float). Note all bits independent.

SIO Patch modes - 1 byte bitfield (bit 0 = future use, bit 1 = SIOV patch, bit 2 = PBI device, bit 3 = SIO override detection). Bits 1 & 2 can be combined.

CIO device patches - 1 byte bitfield (bit 0 = future use, bit 1 = H:, bit 2 = P:, bit 3 = R:, bit 4 = T:, bit 5 = CIO burst)

Speed - 3 byte integer (0=fast as possible, otherwise value = % speed recommended)

Video features - 1 byte bitfield (bit 0 = future use, bit 1 = artifacting recommended, bit 2 = high artifacting recommended, bit 3 = frame blending recommended, bit 4 = interlace recommended, bit 5 = scanlines recommended, bit 6 = enhanced video hardware intercept, 7 = enhanced video CIO intercept)

Audio Features - 1 byte bitfield (bit 0 = stereo, bit 1 = non linear mixing, bit 2 = serial noise, bit 3 = audio monitor, bits 4-7 are channel on/off for channels 1-4)

Audio HW - 1 byte bitfield (bit 0 = future use, bit 1 = covox, bit 2 = D2C0, bit 3 = D500, bit 4 = DC600)

I thought about separating categories into hardware capabilities and emulator capabilities, but it seemed redundant as you can just handle it in the parsing of data. For instance, real hardware can have stereo POKEYs, but real hardware doesn't need to know non linear mixing, it's all just how the electronics work. However if you had a field that said "non linear mixing" and you were examining it for what hardware to run on, you could just filter that field as uninteresting to actual hardware.

Having this database filled out would be a great boon to me. I play games on real hardware, but more often I'm running Altirra, and frankly there are enough permutations to make your average launch a mess. There are multiple front end attempts to get all the settings (including the great games set here which we're really thankful for), but they aren't consolidated, and sometimes emulation features have evolved past when they were authored in the first place. When complete I would write simple transforms to take the database and express it in whatever front end systems are popular (I've taken to using Launchbox these days myself for instance).

So anyway, thoughts or recommendations on changes? Fields that I am missing? Strong disagreement on format, and the reasons why on that?

phaeron · September 22, 2019

I guess you're not aware that Altirra already has a compatibility database system (not that it's utilized much currently)....

The argument for SHA256 is unpersuasive. It is true that MD5 is now considered unsafe for crypto and other secure systems, but the published attacks are primarily collision attacks with only a theoretical preimage attack. More importantly, what is the threat model -- that someone would intentionally use a large amount of computing power to cause collisions in an Atari compatibility database, to do... what? At worst, this would just cause a tool to misidentify an image and launch it with the wrong settings. This is of little consequence as long as the emulation environment is properly sandboxed. In the meantime, CRC32 has long been used to identify Atari ROM images even though it is very easy to manipulate a CRC and there have been no problems with forgeries. The only major issue with CRC32 is the collision rate due to the birthday problem, and MD5 is a suitable replacement due to its 128-bit hash and very widespread support.

There is another consideration, which is the speed of hashing. For the majority of images that are within 4K-180K, hashing speed is not a problem even with SHA256, but Altirra deals with some cartridge images that are as large as 128MB. At this size there can be a noticeable delay just verifying the cartridge sum, which is just a simple sum of all bytes. For this reason, it is undesirable for such a database to require applying an expensive hash to all images. This is even worse if there are scenarios where it might be required to match such an image as it is being incrementally written -- unlike simpler hashes like CRC32, it is intentionally impractical to compute an MD5 or SHA256 incrementally or in parallel because that would be an attack vector. For this reason, the cartridge match entries in Altirra's compatibility database only apply to images 1MB or smaller, and if it needed to support larger images a multi-hash solution would be used (size + partial hash + full hash).
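
The multi-hash idea (size + partial hash + full hash) could be sketched along these lines. This is not Altirra's implementation; the entry layout, the 4096-byte prefix, and the choice of CRC32 for the partial hash are assumptions made for the example.

```python
import hashlib
import zlib

PARTIAL_BYTES = 4096  # assumed prefix length for the cheap partial hash

def match_image(data, entries):
    """Return the title of the first entry matching data, or None.

    Each entry is a dict with 'size', 'partial_crc32', 'md5', and 'title'.
    """
    # Stage 1: size costs nothing to compare.
    candidates = [e for e in entries if e["size"] == len(data)]
    if not candidates:
        return None
    # Stage 2: hash only a small prefix of the image.
    partial = zlib.crc32(data[:PARTIAL_BYTES])
    candidates = [e for e in candidates if e["partial_crc32"] == partial]
    if not candidates:
        return None
    # Stage 3: only now pay for hashing the whole image.
    full = hashlib.md5(data).hexdigest()
    for entry in candidates:
        if entry["md5"] == full:
            return entry["title"]
    return None
```

Most non-matching images are rejected at the size or prefix stage, so the full hash is only computed for likely matches.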

For the remainder of the proposed information, there is the general problem that much of it is hardcoded to current emulation capabilities and makes a lot of assumptions about hardware features being exclusive that often aren't in practice. This includes extended RAM, for which usually only a minimum is required; video standard, for which you will almost never find software that requires PAL or SECAM but cannot work with another; hardware patch, for which there are many cases where Altirra can successfully boot titles under SIO patch that older emulators cannot; kernel ROM hash, which ignores that many titles will work with both the XL/XE (v2) and XEGS (v4) ROMs; and cartridge types, for which Altirra regularly supports new cartridge types that have no public mapper values. It's going to be very difficult to maintain the viability of such a rigid database schema as there's no way it would stay complete over time.

The approach taken by Altirra's compatibility database is that the titles are associated with a series of tags which indicate constraints on the required system configuration. This includes requiring a specific cartridge mapper, internal BASIC on/off, SIO patch off, etc. The tags are just strings, so they can be ignored when not supported and extended with new tags as required. This leads to a relatively simple database layout:

* A matching table, which is keyed by value and hashing/match type (e.g. cartridge checksum = VALUE). Each one points to an alias, which is matched if all match rules used by the alias are satisfied. A program wanting to match an alias computes initial hashes, checks the matching table for matches, and then computes additional hashes as needed if there are partially matched aliases that might be completed that way. For instance, a match could be composed of both a CRC32 and an MD5, for which a program could use one or both, or even avoid computing the MD5 unless the CRC32 matches. I'd been considering upgrading the existing internal table from CRC64 to MD5 this way.

* The alias table points to an entry in the title table. The motivation for this mapping is that identical images can be encoded in multiple file formats, so using aliases allows this to be handled based on matching the raw file rather than requiring a content hash, e.g. matching both an ATR and a DCM without having to parse either. It also allows for the same titles to be matched under different sets of matching rules.

* The title entry in the title table points to a list of tags, which then indicates the specific configuration conditions that the emulator should check/enforce.

* A tag table maps from tag IDs to tag strings, just for compactness. This table also has a human-readable description so that tools can identify the tag even if they don't support it.
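
To make the matching flow concrete, here is a small sketch of the lazy, multi-rule lookup described above. It is not Altirra's code; the table shapes, the `HASHERS` registry, and all names are assumptions for illustration.

```python
import hashlib
import zlib

# Hypothetical hashers keyed by match type, listed cheapest first.
HASHERS = {
    "crc32": lambda data: format(zlib.crc32(data), "08x"),
    "md5": lambda data: hashlib.md5(data).hexdigest(),
}

def find_title(data, match_table, aliases, titles):
    """Return the title entry for data, or None.

    match_table maps (match_type, value) -> alias id.
    aliases maps alias id -> {"rules": set of match types, "title": title id}.
    """
    satisfied = {}  # alias id -> set of match types matched so far
    for match_type, hasher in HASHERS.items():
        pending = [a for a, done in satisfied.items()
                   if match_type in aliases[a]["rules"] - done]
        if satisfied and not pending:
            continue  # no partially matched alias needs this hash
        alias_id = match_table.get((match_type, hasher(data)))
        if alias_id is not None:
            satisfied.setdefault(alias_id, set()).add(match_type)
        for candidate, done in satisfied.items():
            # An alias matches only when every one of its rules is satisfied.
            if done == aliases[candidate]["rules"]:
                return titles[aliases[candidate]["title"]]
    return None
```

An alias with only a CRC32 rule returns after the first hash, while an alias that also lists MD5 triggers the second hash only once its CRC32 rule has matched.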

Here's an example of how one of the existing entries actually works:

* Current disk image has a CRC64 of 2247FA69ABD5A091.

* The DiskCRC64 table has a row with key 2247FA69ABD5A091, which points to one alias. That alias only has that rule, so it is a match.

* The alias points to a title called Jenny of the Prairie.

* The title entry has one tag, "basic". This tag indicates that the title requires internal BASIC or a BASIC cartridge. The emulator checks if BASIC is enabled, and if not, it pops up UI suggesting to the user that BASIC is needed for this title to work.
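
The walk-through above can be mirrored in a few lines of Python. The CRC64 value, title, and tag come from the example; the table layout, the `config` dictionary, and the function names are assumptions, and the CRC64 is taken as already computed rather than calculated here.

```python
# Values from the example; the dictionary layout itself is an assumption.
DISK_CRC64 = {"2247FA69ABD5A091": "alias-1"}
ALIASES = {"alias-1": {"rules": {"DiskCRC64"}, "title": "Jenny of the Prairie"}}
TITLES = {"Jenny of the Prairie": ["basic"]}

def check_basic(config):
    """Return a warning when the 'basic' tag is not satisfied."""
    if not config.get("basic_enabled"):
        return "BASIC is needed for this title to work"
    return None

# Tags this hypothetical tool understands.
TAG_CHECKS = {"basic": check_basic}

def warnings_for_disk(crc64_hex, config):
    """Return warnings for a disk image identified by its CRC64."""
    alias_id = DISK_CRC64.get(crc64_hex)
    if alias_id is None:
        return []
    alias = ALIASES[alias_id]
    if alias["rules"] != {"DiskCRC64"}:
        return []  # alias needs rules this sketch does not evaluate
    warnings = []
    for tag in TITLES[alias["title"]]:
        check = TAG_CHECKS.get(tag)
        if check is None:
            continue  # unsupported tags are simply ignored
        message = check(config)
        if message:
            warnings.append(message)
    return warnings
```

Because unknown tags fall through the `TAG_CHECKS` lookup, a newer database with extra tags would still load in an older tool.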

I used JSON as the interchange format, because XML is annoying and CSV doesn't handle more than one table well.

Altirra primarily uses this system to handle the ambiguous and difficult to detect 16K cartridge mappings, where it uses hashes to match cartridge images to titles and then looks up the tags for the titles, which then indicate if 16K one-chip or 16K two-chip is required. The tag system allows the database to be both backward and forward compatible, however. If for some reason in the future it turns out that there was an emulation problem that invalidates the existing tagging, I can add v2 versions of those tags with the new correct values without impacting older versions of the emulator, which will just ignore them and continue using the original tags. It also means that entries can be very minimal, just associating a known compatibility issue with a title without having to specify a full configuration. It's rare that someone would go through the full testing to establish all of the fields listed in the OP for a particular title, but it is much more common to know that a single specific configuration option must be set a particular way.

+gnusto · September 22, 2019

2 minutes ago, phaeron said:

that someone would intentionally use a large amount of computing power to cause collisions in an Atari compatibility database, to do... what

Well "large" is a relative thing. MD5 can be collided with a laptop and an hour these days. And the case is exactly as you describe, trolls will be trolls and somebody would pollute the table with invalid data (I see it as a publicly exchanged thing, maybe here or on a Google sheet etc). But I get the consideration, sure, from a code perspective either hash is identical work, and as you say there are unusual cases where the emulated version deals with larger files.

5 minutes ago, phaeron said:

I guess you're not aware that Altirra already has a compatibility database system (not that it's utilized much currently).

I know Altirra will often guess with carts specifically, but I didn't know it went as far as disks and executables. There is also the case of file assemblies, say a basic program with a set of data files. More commonly encountered as disk images, but still. Is there a user facing way to edit this? Should I give up on this idea and just augment what exists?

8 minutes ago, phaeron said:

extended RAM, for which usually only a minimum is required; video standard, for which you will almost never find software that requires PAL or SECAM but cannot work with another; hardware patch, for which there are many cases where Altirra can successfully boot titles under SIO patch that older emulators cannot; kernel ROM hash

With respect, "almost never" is one of the cases I want to expressly deal with. And while I'm aware Altirra is quite flexible in a large number of ways, I saw this as a resource for hardware users as well, and users of other emulators (maybe on a host platform Altirra doesn't support), which may have varying capabilities. So being explicit about some cases that Altirra can work its way through was quite intentional.

10 minutes ago, phaeron said:

It's going to be very difficult to maintain the viability of such a rigid database schema as there's no way it would stay complete over time.

The fields listed now don't need to be inclusive of everything ever listed...and again, I was hoping to help hardware people as well. "Should I add file foo.atr to my 130XE NTSC Ultimate cart?" is a question this was meant to answer. There have been heroic efforts by many people to tag names appropriately, but sometimes you're dealing with homebrew or just poorly tagged/unknown files.

There are a couple of places I see where I could change to be alternate choices instead of mutually exclusive - host machine type I noted, memory is one you note (it was meant to be a minimum, not an exclusive choice). I should probably re-examine with that in mind, I would want future flexibility there.

17 minutes ago, phaeron said:

It also means that entries can be very minimal, just associating a known compatibility issue with a title without having to specify a full configuration

I understand the efficiency as more of a constraint system, but from a usage standpoint this might fail some cases. Examples that come to mind are titles that *work* but really need artifacting to properly play or understand, or for instance a game that might work fine under NTSC but was marketed as PAL and when played in NTSC has the wrong pacing/music (this is actually an issue that drives me bonkers personally).

It sounds like you're against this idea, and that's disappointing as an avid user of your emulator, which I am tremendously thankful for, especially your focus on accuracy. I saw this as inroads against problems I often have using both my real machines and Altirra (not problems with Altirra itself, but matching the data correctly to settings). Assuming we mostly patterned after the behavior you have now in Altirra (keeping in mind I'm looking for a pure data format that benefits more than just the one program), can you suggest the path to solve these cases:

1) As described, a game that works under NTSC but has a different play experience PAL (or vice versa) which is counter to intent of author, in the case that can be attributed.

2) Having file 1 of a set and either not knowing where the others are or not knowing which of the others are matched to the one you have.

3) The very common current case where a launch with Altirra results in failure, and you have to choose amongst a set of several dimensions which changes to make (basic, OS, memory, kernel). Maybe this just needs more value in your current DB? Maybe an optional smart parse feature that can pick constraints out of the file tags?

4) Multi-disk chains is something I was hoping to solve so a front end or plugin could just automatically arrange all the required disks on the Altirra command line (or copy them for hardware). Files are often stored in a way locally that might make them difficult to find in a dialog box quickly, and knowing ahead of time to just put all three in the Altirra disk rotation would make things really smooth.

phaeron · September 22, 2019

2 hours ago, gnusto said:

Well "large" is a relative thing. MD5 can be collided with a laptop and an hour these days. And the case is exactly as you describe, trolls will be trolls and somebody would pollute the table with invalid data (I see it as a publicly exchanged thing, maybe here or on a Google sheet etc). But I get the consideration, sure, from a code perspective either hash is identical work, and as you say there are unusual cases where the emulated version deals with larger files.

As far as I know, the rapid MD5 attacks allow producing a collision, but not with a specific desired hash. This means a troll could create new images that hashed to colliding entries in the database, but currently not reasonably force a collision with an existing title. I'm not sure this is really any more serious than the troll just submitting garbage in general, especially for an existing title -- which doesn't need an attack on the hash, just an attack on whoever's maintaining the database. And in the event that such entries did sneak in, it would be more effective to just delete the garbage rather than ensuring the lack of a collision.

Quote

I know Altirra will often guess with carts specifically, but I didn't know it went as far as disks and executables. There is also the case of file assemblies, say a basic program with a set of data files. More commonly encountered as disk images, but still. Is there a user facing way to edit this? Should I give up on this idea and just augment what exists?

I wouldn't try to hook into Altirra's system, as I'm not sure when I would work on it further. It'd be better just to think of it as a possible import/export path and a source for ideas.

The compatibility system doesn't currently have support for managing multiple images, though it could be extended to do so. The emulator does support mounting a single image from a .zip file and I have thought about extending it to automounting image sets.

Quote

With respect, "almost never" is one of the cases I want to expressly deal with. And while I'm aware Altirra is quite flexible in a large number of ways, I saw this as a resource for hardware users as well, and users of other emulators (maybe on a host platform Altirra doesn't support), which may have varying capabilities. So being explicit about some cases that Altirra can work its way through was quite intentional.

The fields listed now don't need to be inclusive of everything ever listed...and again, I was hoping to help hardware people as well. "Should I add file foo.atr to my 130XE NTSC Ultimate cart?" is a question this was meant to answer. There have been heroic efforts by many people to tag names appropriately, but sometimes you're dealing with homebrew or just poorly tagged/unknown files.

There are a couple of places I see where I could change to be alternate choices instead of mutually exclusive - host machine type I noted, memory is one you note (it was meant to be a minimum, not an exclusive choice). I should probably re-examine with that in mind, I would want future flexibility there.

Sure, but taking those additional use cases into account only increases the need for constraints rather than specific values, precisely because they don't necessarily support the same set of configuration modes that Altirra does. Even Altirra doesn't support the same modes that Altirra does, after a year has gone by.

Taking memory as an example, just changing to a minimum won't be enough. There are some titles that don't require exactly 128K or at least 128K -- they require separate ANTIC access, which is a side effect of the extended memory configuration. There are also titles that need not to have extended memory, because they inappropriately twiddle PORTB bits and crash.

Quote

I understand the efficiency as more of a constraint system, but from a usage standpoint this might fail some cases. Examples that come to mind are titles that *work* but really need artifacting to properly play or understand, or for instance a game that might work fine under NTSC but was marketed as PAL and when played in NTSC has the wrong pacing/music (this is actually an issue that drives me bonkers personally).

It sounds like you're against this idea, and that's disappointing as an avid user of your emulator, which I am tremendously thankful for, especially your focus on accuracy. I saw this as inroads against problems I often have using both my real machines and Altirra (not problems with Altirra itself, but matching the data correctly to settings). Assuming we mostly patterned after the behavior you have now in Altirra (keeping in mind I'm looking for a pure data format that benefits more than just the one program), can you suggest the path to solve these cases:

I'm not at all opposed to this idea. The pacing issue is one of the reasons I refuse to make the emulator default to PAL under the idea that it "just makes things work", because it also has the effect of misrepresenting a lot of signature titles with the wrong speed and aspect ratio.

What I simply believe is problematic is trying to hardcode these into fixed columns or fixed values. It's clear that the proposed properties list was modeled after Altirra's feature set as an example, but I can tell you that even before considering other use cases what's here is already out of date. For instance, current versions of Altirra don't just have a VBXE enable, they support specific core versions because of breaking changes in the different FX cores that cause software to require a specific core version. It also now supports PAL60 and NTSC50 modes, which makes the NTSC/PAL/SECAM bits ambiguous because there are both frame rate compatibility and color accuracy concerns. So, out of the gate, I can already suggest a bunch of additional columns that would be needed to cover what I've already had to deal with in the wild.

This isn't to say that you can't represent all of this in the proposed format, as you absolutely can -- you just need more columns and more enumerations. The problem is that the way it's being proposed, every time a new piece of information is needed someone needs to figure out what bits within what column it ends up in. In other words, that they need to be part of the core format specification to begin with is the problem. This is the reason I ended up with the design that I did for the compat DB, so that each time I ran into some weird problem I could just add a tag for it instead of having to modify the schema to fit in a new field or enum value, and not have to design the kitchen sink up front. Same goes for bit packing, which is an implementation detail not exposed in the JSON source format.

Quote

1) As described, a game that works under NTSC but has a different play experience PAL (or vice versa) which is counter to intent of author, in the case that can be attributed.

This would probably best be a "Published in NTSC region" or "Published in PAL region" tag (as opposed to "requires 50Hz," which would be the hard compatibility rule).

Quote

2) Having file 1 of a set and either not knowing where the others are or not knowing which of the others are matched to the one you have.

This is not something I've thought about or needed, but this would probably be served by adding an additional level of indirection: aliases -> images -> titles. Alias matches would identify the images, the set of images being mounted would then be matched to a title and indicate any other missing images. I don't think it's that common of a case that you have images that are part of a multi-disk set but can't obviously associate them by name -- ATR disk images tend to be more straightforwardly named than, say, arcade ROM image files, and Atari cartridge dumps aren't stored multi-file.
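
The extra level of indirection (aliases -> images -> titles) could report missing media roughly like this. It is only a sketch; the image identifiers and the title table are hypothetical, and alias matching is assumed to have already produced the mounted image ids.

```python
def missing_images(mounted_image_ids, titles):
    """Map each partially mounted title to the image ids still missing.

    titles maps a title name to the ordered list of image ids it needs.
    """
    mounted = set(mounted_image_ids)
    report = {}
    for title, required in titles.items():
        required_set = set(required)
        # Report only titles that are started but incomplete.
        if mounted & required_set and not required_set <= mounted:
            report[title] = [i for i in required if i not in mounted]
    return report

# Hypothetical three-disk title.
titles = {"Example Adventure": ["disk1", "disk2", "disk3"]}
```

With only `disk2` mounted, the report names `disk1` and `disk3` as the images still to be located.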

Quote

3) The very common current case where a launch with Altirra results in failure, and you have to choose amongst a set of several dimensions which changes to make (basic, OS, memory, kernel). Maybe this just needs more value in your current DB? Maybe an optional smart parse feature that can pick constraints out of the file tags?

Altirra actually has the framework to handle those, it just doesn't have the full set of tags and no one's seen it because the internal database doesn't have many entries and the UI for the external database kind of sucks. For instance, you can already create an external database with an entry for a disk or cartridge image that has tags specifying that it requires the XL/XE OS and any revision of BASIC:

[Attached screenshot: image.png]

Note that the compatibility DB doesn't hardcode the signature for the XL/XE OS. It just has the XL/XE OS tag, and it's the firmware database that associates a ROM image for it, which can be changed to a different ROM image. This is because in most cases requiring a specific ROM image is overspecification and modded ROMs that have Option inverted or a fast math pack will work. Same goes for BASIC, there are a lot more titles that just need a BASIC and not many that require a specific Atari BASIC revision (though they do exist).

Parsing constraints out of filenames is possible but a bit of a mess, and there are some known conflicts in the wild -- such as 'b' for either requires BASIC or bad dump.

Quote

4) Multi-disk chains is something I was hoping to solve so a front end or plugin could just automatically arrange all the required disks on the Altirra command line (or copy them for hardware). Files are often stored in a way locally that might make them difficult to find in a dialog box quickly, and knowing ahead of time to just put all three in the Altirra disk rotation would make things really smooth.

In most of the cases I've seen, just allowing multi-select/multi-drag and using some sorting rules on the filenames for common tagging would work for this. Beyond that it's going to be highly specific to the file layout used. There are some cases I can think of where the only viable way to do this is to manage the set ahead of time by path, it can't be done by signature because the images change. This happens any time one of the images is an auto-save disk or the program writes back to its own disk (The Tail of Beta Lyrae). Some front ends I've seen handle this by implicitly creating a profile associated with the launch image to store associated settings like additional media and save states. That having been said, I don't think it's a bad idea to have an image database that can identify images that are meant to be used together.

Edited September 22, 2019 by phaeron
undoing bad auto-formatting

+CharlieChaplin · September 22, 2019

There are some dozens of A8 games that do require e.g. Basic Rev. A and will not work with Rev. B or C. Atarimania does list some of them, alas, it's just a note under the addendum "instructions" and thus not searchable thru their database. Example:

http://www.atarimania.com/game-atari-400-800-xl-xe-compute-4-reversi_1204.html

You can also see with this example entry, that there are versions especially made for PAL (THE11014P) or NTSC (THE11014N), which will NOT work at all with the other standard (again, not searchable in the atarimania database). At least Thorn EMI made it easy to distinguish the program version by adding an N or P at the end of the serial number, other vendors did not do that...

Then there is the OS-A and OS-B note, only available in the instructions, not searchable in the database and the RAM requirements note in the additional comments (e.g. for Bomb Jack or Numen) that is also not part of the database and therefore not searchable. Fandal's webpage (database) does include OS-A and OS-B, as well as XRAM, but not Basic revision or PAL/NTSC standard...

Edited September 22, 2019 by CharlieChaplin

+gnusto · September 22, 2019

15 hours ago, phaeron said:

Sure, but taking those additional use cases into account only increases the need for constraints rather than specific values, precisely because they don't necessarily support the same set of configuration modes that Altirra does.

Ah, well my original thought on this (perhaps wasteful in duplicate though) was to have multiple entries. The index of the db is the file hash and it doesn't have to be exclusive; you could have several lines that pertain to different configurations that work. Take your own BASIC implementation; I use it whenever possible due to the speed advantages, but "needs basic" doesn't indicate "works with Altirra BASIC". So in the case that it did work, there would be two configs: "needs basic" with the Atari version that works, and then one with Altirra BASIC. If there isn't one with Altirra BASIC, then that isn't proven to work yet and might not be the first choice to launch.

Perhaps better with a bitfield again. Bitfields are generally hard to read as humans, but they have some advantages: they can be incrementally added to in the future (without invalidating previous versions), with one copy they can generate up to N bits of a single logical operation (OR/AND), and with two copies they can cover exclusions as well. If bitfields are too opaque they could be JSON or XML enumerated values though, and treated as non exclusive. If you included proven positive and proven negative as distinct enumerations, it would both be extendable and flexible. Sound better to your approach? It's basically signalling "Here are all the known variations of this thing that work, and here is the set of variations that don't. If you have something that isn't in the list, we just aren't sure".
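
The "proven positive / proven negative / not sure" idea might be expressed like this with enumerated values rather than bits. The option strings and the entry layout are made up for the example.

```python
def compatibility(entry, option):
    """Return 'works', 'fails', or 'unknown' for one configuration option."""
    if option in entry.get("works", ()):
        return "works"
    if option in entry.get("fails", ()):
        return "fails"
    return "unknown"  # not in either list: nobody has tested it yet

# Hypothetical entry: proven with one BASIC, proven broken under one OS.
entry = {
    "works": ["basic:atari-rev-c", "video:ntsc"],
    "fails": ["os:os-a"],
}
```

Here `compatibility(entry, "basic:altirra")` comes back as "unknown", which a front end could treat as "try it, but not as the first choice".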

Separately I would stress it wasn't my intent to describe the ONLY config that would work originally; just a proven config. I saw it as guidance on those worst case scenarios, "You need OS-B, it has to be NTSC, and BASIC". A smart front end or emulator would probably guess the hardest constraint is OS-B, BASIC might work with multiple versions (so if you don't have the specific one, substitute), and NTSC/PAL might be interchangeable, but maybe not.

16 hours ago, phaeron said:

It also now supports PAL60 and NTSC50 modes, which makes the NTSC/PAL/SECAM bits ambiguous because there are both frame rate compatibility

I tried to capture that one by having refresh rate distinct from video format (it's a separate value in the table). But I get your general point about being less exclusive whenever possible - oddly enough that's why bitfields are so present. When I say something is 1 byte now, that's because I can encapsulate all of the known states in a byte. A better future proof designation might be "byte stride bitfield with the top field set to 1 to determine width". That is forwards compatible, since a reader can infer from the value the potential range, and even if it is wider than expected, mask out the unknown flags for the known set. Or go with enumerations as mentioned earlier.
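
One possible reading of the "top bit marks the width" idea, sketched in Python: the highest set bit is treated as a sentinel, everything below it is flag data, and a reader masks off whatever it does not recognize. The `KNOWN_FLAGS` mask and the function name are assumptions for illustration.

```python
KNOWN_FLAGS = 0b0111  # flags this hypothetical reader understands

def read_flags(value):
    """Split a sentinel-terminated bitfield into (known, unknown) flags."""
    if value <= 0:
        raise ValueError("the sentinel bit is missing")
    width = value.bit_length() - 1      # number of flag bits below the sentinel
    flags = value & ((1 << width) - 1)  # strip the sentinel itself
    return flags & KNOWN_FLAGS, flags & ~KNOWN_FLAGS
```

A one-byte field would carry its sentinel in bit 7 and a two-byte field in bit 15, so an older reader handed a wider value still recovers the flags it knows and can see that others were present.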

16 hours ago, phaeron said:

In most of the cases I've seen, just allowing multi-select/multi-drag and using some sorting rules on the filenames for common tagging would work for this.

Yes! That would solve a good number. Another case I've seen is a couple of front ends have the behavior of "unarchive this thing into a temp directory then run with that output". The archive is typically an entire disk set, and often there aren't sorting rules to provide the lexical first file as "first", so you either get a random one or more often, the LAST one, as that was the most recent file written. I understand this may not be a case you even want to deal with, though. For myself I wrote a little shim launcher that sorts everything and always presents the first file, then loads the others as further /disk parameters. Saved a ton of time when I did that with LaunchBox, which shows this behavior.
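
A shim along those lines could be as small as the sketch below. It is not the actual launcher; the executable name is an assumption, and only the `/disk` parameter mentioned above is used. The function just builds the argument list, which could then be handed to `subprocess.run`.

```python
import re

def natural_key(name):
    """Sort key that orders 'Disk 2' before 'Disk 10'."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]

def build_command(emulator, image_paths):
    """Return an argv list: first image to boot, the rest as /disk parameters."""
    ordered = sorted(image_paths, key=natural_key)
    command = [emulator]
    if ordered:
        command.append(ordered[0])
    for path in ordered[1:]:
        command += ["/disk", path]
    return command

# Hypothetical file names, as they might come out of an unpacked archive.
example = build_command(
    "Altirra64.exe",
    ["Game (Disk 10).atr", "Game (Disk 2).atr", "Game (Disk 1).atr"],
)
```

The natural sort keeps a ten-disk set in numeric order, so disk 10 does not end up mounted right after disk 1.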
