Architecture of the Auto-Sync framework

This document is split into four parts.

  1. An overview of the update process and which subcomponents of auto-sync do what.
  2. Instructions for updating an architecture which already supports auto-sync.
  3. Instructions for refactoring an architecture to use auto-sync.
  4. Notes about how to add a new architecture to Capstone with auto-sync.

Please read the section about Capstone module design in ARCHITECTURE.md before proceeding. That architectural background is needed for everything that follows.

Update procedure

As already described in the ARCHITECTURE document, Capstone uses translated and generated source code from LLVM.

Because LLVM is written in C++ and Capstone in C, the update process is complicated internally, but it is almost completely automated.

auto-sync categorizes source files of a module into three groups. Each group is updated differently.

| File type | Update method | Edits by hand |
|---|---|---|
| Generated files | Generated by patched LLVM backends | Never (not allowed) |
| Translated LLVM C++ files | CppTranslator and Differ | Only changes which are too complicated for automation |
| Capstone files | By hand | All |

Let's look at the update procedure for each group in detail.

Note: The only permitted way to modify generated files is via git patches. This is a last resort for cases where something is broken in LLVM and we cannot generate correct files.

Generated files

Generated files always have the file extension .inc.

There are generated files for the LLVM code and for Capstone. They can be distinguished by their names:

  • For Capstone: <ARCH>GenCS<NAME>.inc.
  • For LLVM code: <ARCH>Gen<NAME>.inc.
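
The naming convention above can be checked mechanically. The following is a minimal sketch, not part of auto-sync: `classify_inc` and its regular expression are hypothetical names introduced only for illustration. It assumes that the architecture prefix is alphanumeric and that no LLVM file has a `<NAME>` starting with `CS`, since such a file would be indistinguishable from a Capstone one by name alone.

```python
import re

# Hypothetical helper, not part of auto-sync. It only encodes the naming
# convention described above:
#   <ARCH>GenCS<NAME>.inc -> generated for Capstone
#   <ARCH>Gen<NAME>.inc   -> generated for the LLVM code
_INC_NAME = re.compile(
    r"^(?P<arch>[A-Za-z0-9]+?)Gen(?P<cs>CS)?(?P<name>\w+)\.inc$"
)


def classify_inc(filename):
    """Return (consumer, arch, name) or None if the name does not match."""
    match = _INC_NAME.match(filename)
    if match is None:
        return None
    consumer = "capstone" if match.group("cs") else "llvm"
    return consumer, match.group("arch"), match.group("name")
```

For example, `classify_inc("ARMGenCSMappingInsn.inc")` yields `("capstone", "ARM", "MappingInsn")`, while `classify_inc("ARMGenAsmWriter.inc")` yields `("llvm", "ARM", "AsmWriter")`.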

The files are generated by refactored LLVM TableGen emitter backends.

The procedure looks roughly like this:

                                                                   ┌──────────┐
    1               2                 3                4           │CS .inc   │
┌───────┐     ┌───────────┐     ┌───────────┐     ┌──────────┐  ┌─►│files     │
│ .td   │     │           │     │           │     │ Code-    │  │  └──────────┘
│ files ├────►│ TableGen  ├────►│  CodeGen  ├────►│ Emitter  ├──┤
└───────┘     └──────┬────┘     └───────────┘     └──────────┘  │  ┌──────────┐
                     │                                 ▲        └─►│LLVM .inc │
                     └─────────────────────────────────┘           │files     │
                                                                   └──────────┘
  1. LLVM architectures are defined in .td files. They describe instructions, operands, features and other properties of an architecture.

  2. LLVM TableGen parses these files and converts them to an internal representation.

  3. Next, a TableGen component called CodeGen abstracts these properties even further. The result is a representation which is not specific to any architecture (e.g. the CodeGenInstruction class can represent a machine instruction of any architecture).

  4. The Code-Emitter uses the abstract representation of the architecture (provided by CodeGen) to generate state machines for instruction decoding. Architecture-specific information (think of register names, operand properties etc.) is taken from TableGen's internal representation.

The result is emitted to .inc files. Those are included in the translated C++ files or Capstone code where necessary.

Translation of LLVM C++ files

We use two tools to translate the C++ files to C: first the CppTranslator, and afterward the Differ.

The CppTranslator parses the C++ files and patches C++ syntax with its equivalent C syntax.

Note: For details about this, check out suite/auto-sync/CppTranslator/README.md.
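
To give a feel for what "patching C++ syntax with its C equivalent" means, here is a deliberately simplified sketch. It is not the CppTranslator's implementation: the real tool parses the C++ files, whereas this stand-in uses regular expressions on raw text. The `PATCHES` list and the `translate` function are hypothetical names.

```python
import re

# Hypothetical patch list: each entry pairs a C++ construct with a C
# replacement. Regular expressions stand in for real parsing here.
PATCHES = [
    # nullptr -> NULL
    (re.compile(r"\bnullptr\b"), "NULL"),
    # static_cast<T>(x) -> (T)(x), for simple non-nested template arguments
    (re.compile(r"\bstatic_cast<([^<>]+)>\("), r"(\1)("),
]


def translate(cpp_source):
    """Apply every patch in order and return the partially translated text."""
    for pattern, replacement in PATCHES:
        cpp_source = pattern.sub(replacement, cpp_source)
    return cpp_source
```

Applied to `unsigned Reg = static_cast<unsigned>(Op.getReg());`, the sketch returns `unsigned Reg = (unsigned)(Op.getReg());`. The cast is now valid C, but the method call `Op.getReg()` is not, which illustrates why translated output still needs further fixes.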

Because the output of the CppTranslator is not perfect, many syntax problems remain. Some of them need to be fixed by hand.

Differ

In order to ease this process we run the Differ after the CppTranslator.

The Differ compares the two versions of the C files we now have. One version is the set of C files currently used by the architecture module. The other is the set of newly translated C files, which are still faulty and need to be fixed.

Most fixes address syntactical problems. Those were almost always resolved already, during the last update. The Differ helps you compare the files and lets you select which version to accept.

Occasionally the newly translated C files contain important changes. In most cases, though, the old files are already correct.

The Differ parses both files into an abstract syntax tree and compares certain nodes with the same name (mostly functions).

The user can choose whether to accept the version from the translated file or from the old file. This decision is saved for every node. If a saved decision exists for two nodes, and the nodes did not change since the last time, the Differ applies the previous decision again automatically.
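
The decision reuse described above can be sketched as follows. This is an illustration under stated assumptions, not the Differ's actual code: nodes are represented by their name and source text, "did not change" is approximated by comparing SHA-256 digests of both texts, and `resolve`, `saved` and `ask` are hypothetical names. Writing `saved` to disk between runs is left out.

```python
import hashlib


def _digest(text):
    """Fingerprint of a node's source text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def resolve(name, old_text, new_text, saved, ask):
    """Return the text to keep for one named node (e.g. a function).

    saved: dict mapping node name -> {"key": ..., "choice": "old" | "new"}
    ask:   callback that asks the user and returns "old" or "new"
    """
    if old_text == new_text:
        return old_text  # nothing to decide
    key = (_digest(old_text), _digest(new_text))
    entry = saved.get(name)
    if entry is not None and entry["key"] == key:
        # Neither node changed since the last decision: reuse it.
        choice = entry["choice"]
    else:
        choice = ask(name, old_text, new_text)
        saved[name] = {"key": key, "choice": choice}
    return old_text if choice == "old" else new_text
```

With this scheme the user is asked once per differing pair of nodes. They are only asked again when either the old or the translated version of that node changes.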

The Differ is far from perfect. It only helps to automatically apply "known to be good" fixes and gives the user a better interface to solve the other problems. But there will still be syntax errors left afterward. These must be fixed by hand.