WFDis Documentation

What is this?

WFDis is an AI-based automated reverse-engineering project for binary executables. It is currently in development.

Because of the large resources required, it uses a client/server model with a web browser as the front end. With the addition of a simpler tracing disassembler, the front end has become useful in its own right, and is available for public testing and use with 6502 binaries. The full back end is not yet public.

The current focus is creating all the UI elements and interactions the system will need to express. This public version offers "Human Mode" reverse engineering, in which the user manually discovers traits of the binary and triggers UI elements to describe them. The server-based version is "AI Mode", which automatically fleshes out a descriptive understanding of the code and reports it through the same presentation substrate.

Why 6502?

These 8-bit environments are much simpler to instantiate than modern platforms, but offer far broader challenges in reverse engineering:

This keeps development effort focused on WFDis's unique "smarts" rather than on reimplementing myriad platform details. However, the infrastructure is already designed to express modern CPUs, memory systems, and abstract OS interfaces. Although this low-level code style is less common today, analysis that can tackle it still applies directly to malware, kernels, drivers, embedded systems, and debugging compiler output.

There is also strong general interest in these older computing platforms, with a generational wave of nostalgia for popular home computers and video games from a few decades back. The 6502 was used in many systems from Commodore, Atari, Apple, and Nintendo, just to name a few.

Supported File Formats

The specific load behavior is based on the extension of the provided file's name.

Files without a recognized filename extension are assumed to be in .prg format. The .prg loader is also used for BASIC and SID file formats, autodetecting based on file contents.
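As a rough illustration of content-based detection, the sketch below splits a .prg-style file into its load address and payload, then guesses the file kind. It relies on general Commodore 64 conventions (a two-byte little-endian load address, BASIC programs loading at $0801, SID files starting with the magic bytes "PSID" or "RSID"). The function names and the heuristics are assumptions made for this example, not WFDis's actual loader logic.

```python
import struct


def split_prg(data: bytes) -> tuple:
    """Return (load_address, payload) for a .prg-style file.

    A .prg file starts with a 2-byte little-endian load address.
    """
    if len(data) < 2:
        raise ValueError("a .prg file needs at least a 2-byte load address")
    (load_address,) = struct.unpack_from("<H", data, 0)
    return load_address, data[2:]


def guess_kind(data: bytes) -> str:
    """Guess the file kind from its contents (assumed heuristics only)."""
    # SID tunes begin with a 4-byte magic identifier.
    if data[:4] in (b"PSID", b"RSID"):
        return "sid"
    # C64 BASIC programs conventionally load at $0801.
    if len(data) >= 2 and split_prg(data)[0] == 0x0801:
        return "basic"
    return "prg"
```

A real loader would need stronger checks, for example validating the BASIC line links instead of trusting the load address alone.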

BASIC programs and SID files are automatically disassembled from their known entry points. Raw bin/rom images are disassembled from their $fffx vectors, if the image overlaps those locations. Otherwise, the disassembly must be manually begun by selecting an address and pressing Shift-a.
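The vector lookup for raw images can be sketched as follows. The 6502 keeps its NMI, RESET, and IRQ/BRK vectors at $fffa, $fffc, and $fffe, each as a little-endian 16-bit address. The function name and the assumption that the image's base address is already known are hypothetical; this is a minimal sketch, not the disassembler's real code.

```python
# Standard 6502 hardware vector locations.
VECTORS = {"NMI": 0xFFFA, "RESET": 0xFFFC, "IRQ/BRK": 0xFFFE}


def vector_entry_points(image: bytes, base: int) -> dict:
    """Return vector targets whose two bytes both lie inside the image.

    `base` is the address of the first byte of `image` (assumed known).
    """
    found = {}
    for name, address in VECTORS.items():
        offset = address - base
        # Both the low and the high byte must be present in the image.
        if offset >= 0 and offset + 1 < len(image):
            found[name] = image[offset] | (image[offset + 1] << 8)
    return found


# Usage: an 8 KiB ROM mapped at $e000 whose RESET vector points to $e000.
rom = bytearray(0x2000)
rom[0x1FFC:0x1FFE] = bytes([0x00, 0xE0])
entry_points = vector_entry_points(bytes(rom), 0xE000)
```

An image that does not reach $fffa yields an empty result, which corresponds to the case where disassembly has to be started manually.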

If a Commodore 64 file format is loaded, labels for various ROM routines and I/O locations are automatically created.

Numeric Formats

Inputs that require an address or value can accept multiple formats:

Importing Labels

A file defining labels can be imported. The import affects only the current overlay. VICE label files (which can be generated by ca65) are supported, as well as a more freeform syntax. Semicolons indicate comments.

Sample lines:

The regex for separating label and address is (=|:| \.?[eE][qQ][uU]? ), which should support enough variation for common cases.
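To show how that separator behaves, here is a minimal sketch that applies the regex above to freeform lines. The function name, the input lines, and the address parsing (hexadecimal with an optional $ prefix) are assumptions for illustration only. VICE label files are not handled here.

```python
import re

# Separator regex quoted from the documentation above.
SEPARATOR = re.compile(r"(=|:| \.?[eE][qQ][uU]? )")


def parse_label_line(line: str):
    """Return (label, address) for one freeform line, or None if unusable."""
    # Semicolons start a comment; drop it and surrounding whitespace.
    text = line.split(";", 1)[0].strip()
    match = SEPARATOR.search(text)
    if match is None:
        return None
    label = text[:match.start()].strip()
    value = text[match.end():].strip()
    if not label or not value:
        return None
    try:
        # Assumption: addresses are hexadecimal, optionally prefixed by '$'.
        address = int(value.lstrip("$"), 16)
    except ValueError:
        return None
    return label, address


# Hypothetical input lines, not taken from the WFDis documentation.
print(parse_label_line("screen = $0400"))     # ('screen', 1024)
print(parse_label_line("border .equ $d020"))  # ('border', 53280)
```

Note that the final [uU]? makes the "u" optional, so "eq" is accepted as well as "equ".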

Emulation

Many media-loaded programs contain decompression or relocation code before actually getting to the software itself. A rudimentary emulator is included to attempt to run such routines and capture their output for further disassembly.

Select the entire code block to emulate and press Shift-r. Emulation exits when the PC reaches the instruction following the last selected instruction. All reads (including read-modify-write) must be from known loaded bytes, or from bytes that the emulated code has already written. This rule often causes failures when routines update visuals to show progress (e.g. inc $d020), because such an instruction reads an I/O register that is not part of the loaded data.
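The read rule can be pictured with a small memory model. This is only a sketch under assumed names (TrackedMemory, UnknownReadError), not the emulator that WFDis ships: a byte is readable if it was loaded from the file or written earlier by the emulated code, and any other read aborts the run.

```python
class UnknownReadError(Exception):
    """Raised when emulated code reads a byte with no known value."""


class TrackedMemory:
    def __init__(self, load_address: int, data: bytes):
        # Only bytes loaded from the file are known at the start.
        self.known = {load_address + i: b for i, b in enumerate(data)}

    def read(self, address: int) -> int:
        if address not in self.known:
            raise UnknownReadError(f"read from unknown byte ${address:04x}")
        return self.known[address]

    def write(self, address: int, value: int) -> None:
        # Writing makes the byte known for all later reads.
        self.known[address] = value & 0xFF

    def inc(self, address: int) -> int:
        # Read-modify-write: the read half is subject to the same rule.
        value = (self.read(address) + 1) & 0xFF
        self.write(address, value)
        return value


memory = TrackedMemory(0x0801, bytes([0x01, 0x02, 0x03]))
memory.write(0x0400, 0x20)          # written by emulated code, so readable
screen_byte = memory.read(0x0400)
try:
    memory.inc(0xD020)              # the border register was never loaded or written
except UnknownReadError as error:
    failure = str(error)
```

A plain store to $d020 would succeed under this model; it is the read half of inc that trips the check.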

Saving

The disassembled context can be saved to browser localStorage. Ensure that browser privacy settings, or plugins like Self-Destructing Cookies, do not automatically wipe out your saved information.

It can also be downloaded as a .wfdis file to your computer. Because browser security measures require a manual step on every save, this is not the default.

Bugs & Upcoming Features

Fixed Limitations

This Human Mode version does not support the following features. Support will only come through the full AI Mode version.

Credits

NMOS illegal opcode naming conventions are sourced from All About Your 64. There are many different mnemonics for these opcodes; this is the list I'm most familiar with.

PETSCII to Unicode mappings are taken from the work on Recode.

Changelog

2024-04-19:

2024-04-14:

2019-02-17:

2018-12-24:

2018-11-25:

2018-07-14:

2017-11-04:

2017-10-14:

2017-09-12:

2017-08-29:

2017-08-10:

2017-07-26:

2017-07-07:

2017-04-02:

2017-03-25:

2017-03-11:

2017-03-01:

2017-02-19:

Contact

The WFDis thread on the 6502.org forums is a good place to post, or PM user White Flame there.