MAME Debugger¶
Introduction¶
MAME includes an interactive low-level debugger that targets the emulated system. This can be a useful tool for diagnosing emulation issues, developing software to run on vintage systems, creating cheats, ROM hacking, or just investigating how software works.
Use the -debug command line option to start MAME with the debugger
activated. By default, pressing the backtick/tilde (~) during
emulation breaks into the debugger (this can be changed by reassigning
the Break in Debugger input).
The exact appearance of the debugger depends on your operating system and the options MAME was built with. All variants of the debugger provide a multi-window interface for viewing the contents of memory and disassembled code.
The debugger console window is a special window that shows the contents of CPU registers and disassembled code around the current program counter address, and provides a command-line interface to most of the debugging functionality.
Debugger commands¶
Debugger commands are described in the sections below. You can also type help <topic> in the debugger console, where <topic> is the name of a command, to see documentation directly in MAME.
Specifying devices and address spaces¶
Many debugger commands accept parameters specifying which device to operate on. If a device is not specified explicitly, the CPU currently visible in the debugger is used. Devices can be specified by tag, or by CPU number:
Tags are the colon-separated paths that MAME uses to identify devices within a system. You see them in options for configuring slot devices, in debugger disassembly and memory viewer source lists, and various other places within MAME’s UI.
CPU numbers are monotonically incrementing numbers that the debugger assigns to CPU-like devices within a system, starting at zero. The cpunum symbol holds the CPU number for the currently visible CPU in the debugger (you can see it by entering the command print cpunum in the debugger console).
If a tag starts with a caret (^) or dot (.), it is interpreted relative to the CPU currently visible in the debugger, otherwise it is interpreted relative to the root machine device. If a device argument is ambiguously valid as both a tag and a CPU number, it will be interpreted as a tag.
Examples:
maincpuThe device with the absolute tag
:maincpu.^melodypsgThe sibling device of the visible CPU with the tag
melodypsg..:adcThe child device of the visible CPU with the tag
adc.2The third CPU-like device in the system (zero-based index).
Commands that operate on memory extend this by allowing the device tag or CPU number to be optionally followed by an address space identifier. Address space identifiers are tag-like strings. You can see them in debugger memory viewer source lists. If the address space identifier is omitted, a default address space will be used. Usually, this is the address space that appears first for the device. Many commands have variants with d, i and o (data, I/O and opcodes) suffixes that default to the address spaces at indices 1, 2 and 3, respectively, as these have special significance for CPU-like devices.
In ambiguous cases, the default address space of a child device will be used rather than a specific address space.
Examples:
ramThe default address space of the device with the absolute tag
:ram, or theramspace of the visible CPU..:ioThe default address space of the child device of the visible CPU with the tag
io, or theiospace of the visible CPU.:programThe default address space of the device with the absolute tag
:program, or theprogramspace of the root machine device.^vdpThe default address space of the sibling device of the visible CPU with the tag
vdp.^:dataThe default address space of the sibling device of the visible CPU with the tag
data, or thedataspace of the parent device of the visible CPU.1:romThe default address space of the child device of the second CPU in the system (zero-based index) with the tag
rom, or theromspace of the second CPU in the system.2The default address space of the third CPU-like device in the system (zero-based index).
If a command takes an emulated memory address as a parameter, the address may optionally be followed by an address space specification, as described above.
Examples:
0220Address 0220 in the default address space for the visible CPU.
0378:ioAddress 0378 in the default address space of the device with the absolute tag
:io, or theiospace of the visible CPU.1234:.:romAddress 1234 in the default address space of the child device of the visible CPU with the tag
:rom, or theromspace of the visible CPU.1260:^vdpAddress 1260 in the default address space of the sibling device of the visible CPU with the tag
vdp.8008:^:dataAddress 8008 in the default address space of the sibling device of the visible CPU with the tag
data, or thedataspace of the parent device of the visible CPU.9660::ramAddress 9660 in the default address space of the device with the absolute tag
:ram, or theramspace of the root machine device.1883:vram.mAddress 1883 in the memory region with the absolute tag
:vram.1923:sprites.sAddress 1923 in the memory share with the absolute tag
:sprites.
The examples here include a lot of corner cases, but in general the debugger should take the most likely meaning for a device or address space specification.
Debugger expression syntax¶
Expressions can be used anywhere a numeric or Boolean parameter is expected. The syntax for expressions is similar to a subset of C-style expression syntax, with full operator precedence and parentheses. There are a few operators missing (notably the ternary conditional operator), and a few new ones (memory accessors).
The table below lists all the operators, ordered from highest to lowest precedence:
()Standard parentheses
++--Postfix increment/decrement
++--~!-+b@w@d@q@b!w!d!q!Prefix increment/decrement, binary complement, logical complement, unary identity/negation, memory access
*/%Multiplication, division, modulo
+-Addition, subtraction
<<>>Bitwise left/right shift
<<=>>=Less than, less than or equal, greater than, greater than or equal
==!=Equal, not equal
&Bitwise intersection (and)
^Bitwise exclusive or
|Bitwise union (or)
&&Logical conjunction (and)
||Logical disjunction (or)
=*=/=%=+=-=<<=>>=&=|=^=Assignment and modifying assignment
,Separate terms, function parameters
Major differences from C expression semantics:
All numbers are unsigned 64-bit values. In particular, this means negative numbers are not possible.
The logical conjunction and disjunction operators
&&and||do not have short-circuit properties – both sides of the expression are always evaluated.
Numbers¶
Literal numbers are prefixed according to their bases:
Hexadecimal (base-16) with
$or0xDecimal (base-10) with
#Octal (base-8) with
0oBinary (base-2) with
0bUnprefixed numbers are hexadecimal (base-16).
Examples:
123is 123 hexadecimal (291 decimal)$123is 123 hexadecimal (291 decimal)0x123is 123 hexadecimal (291 decimal)#123is 123 decimal0o123is 123 octal (83 decimal)0b1001is 1001 binary (9 decimal)0b123is invalid
Boolean values¶
Any expression that evaluates to a number can be used where a Boolean
value is required. Zero is treated as false, and all non-zero values
are treated as true. Additionally, the string true is treated as
true, and the string false is treated as false.
An empty string may be supplied as an argument for Boolean parameters to debugger commands to use the default value, even when subsequent parameters are specified.
Memory accesses¶
The memory access prefix operators allow reading from and writing to emulated address spaces. The memory prefix operators specify the access size and whether side effects are disabled, and may optionally be preceded by an address space specification. The supported access sizes and side effect modes are as follows:
bspecifies an 8-bit access (byte)wspecifies a 16-bit access (word)dspecifies a 32-bit access (double word)qspecifies a 64-bit access (quadruple word)@suppress side effects!do not suppress side effects
Suppressing side effects of a read access yields the value reading from address would, with no further effects. For example reading a mailbox with side effects disabled will not clear the pending flag, and reading a FIFO with side effects disabled will not remove an item.
For write accesses, suppressing side effects doesn’t change behaviour in most cases – you want to see the effects of writing to a location. However, there are some exceptions where it is useful to separate multiple effects of a write access. For example:
Some registers need to be written in sequence to avoid race conditions. The debugger can issue multiple writes at the same point in emulated time, so these race conditions can be avoided trivially. For example writing to the MC68HC05 output compare register high byte (OCRH) inhibits compare until the output compare register low byte (OCRL) is written to prevent race conditions. Since the debugger can write to both locations at the same instant from the emulated machine’s point of view, the race condition is not usually relevant. It’s more error-prone if you can accidentally set hidden state when all you really want to do is change the value, so writing to OCRH with side effects suppressed does not inhibit compare, it just changes the value in the output compare register.
Writing to some registers has multiple effects that may be useful to separate for debugging purposes. Using the MC68HC05 as an example again, writing to OCRL changes the value in the output compare register, and also clears the output compare flag (OCF) and enables compare if it was previously inhibited by writing to OCRH. Writing to OCRL with side effects disabled just changes the value in the register without clearing OCF or enabling compare, since it’s useful for debugging purposes. Writing to OCRL with side effects enabled has the additional effects.
The size may optionally be preceded by an access type specification:
porlpspecifies a logical address defaulting to space 0 (program)dorldspecifies a logical address defaulting to space 1 (data)iorlispecifies a logical address defaulting to space 2 (I/O)3orl3specifies a logical address defaulting to space 3 (opcodes)ppspecifies a physical address defaulting to space 0 (program)pdspecifies a physical address defaulting to space 1 (data)pispecifies a physical address defaulting to space 2 (I/O)p3specifies a physical address defaulting to space 3 (opcodes)rspecifies direct read/write pointer access defaulting to space 0 (program)ospecifies direct read/write pointer access defaulting to space 3 (opcodes)mspecifies a memory regionsspecifies a memory share
Finally, this may be preceded by a tag and/or address space name
followed by a dot (.).
That may seem like a lot to digest, so let’s look at the simplest examples:
b@<addr>Refers to the byte at <addr> in the program space of the current CPU while suppressing side effects
b!<addr>Refers to the byte at <addr> in the program space of the current CPU, not suppressing side effects such as reading a mailbox clearing the pending flag, or reading a FIFO removing an item
w@<addr>andw!<addr>Refer to the word at <addr> in the program space of the current CPU, suppressing or not suppressing side effects, respectively.
d@<addr>andd!<addr>Refer to the double word at <addr> in the program space of the current CPU, suppressing or not suppressing side effects, respectively.
q@<addr>andq!<addr>Refer to the quadruple word at <addr> in the program space of the current CPU, suppressing or not suppressing side effects, respectively.
Adding access types gives additional possibilities:
dw@300Refers to the word at 300 in the data space of the current CPU while suppressing side effects
id@400Refers to the double word at 400 in the I/O space of the current CPU while suppressing side effects
ppd!<addr>Refers to the double word at physical address <addr> in the program space of the current CPU while not suppressing side effects
rw@<addr>Refers to the word at address <addr> in the program space of the current CPU using direct read/write pointer access
If we want to access an address space of a device other than the current CPU, an address space beyond the first four indices, or a memory region, we need to include a tag or name:
ramport.b@<addr>Refers to the byte at address <addr> in the
ramportspace of the current CPUaudiocpu.dw@<addr>Refers to the word at address <addr> in the data space of the CPU with absolute tag
:audiocpumaincpu:status.b@<addr>Refers to the byte at address <addr> in the
statusspace of the CPU with the absolute tag:maincpumonitor.mb@78Refers to the byte at 78 in the memory region with the absolute tag
:monitor..md@202Refers to the double word at address 202 in the memory region with the same tag path as the current CPU.
Some combinations are not useful. For example physical and logical
addresses are equivalent for some CPUs, and direct read/write pointer
accesses never have side effects. Accessing a memory region (m
access type) requires a tag to be specified.
Memory accesses can be used as both lvalues and rvalues, so you can write
b@100 = ff to store a byte in memory.
Functions¶
The debugger supports a number of useful utility functions in expressions.
- min(<a>, <b>)
Returns the lesser of the two arguments.
- max(<a>, <b>)
Returns the greater of the two arguments.
- if(<cond>, <trueval>, <falseval>)
Returns <trueval> if <cond> is true (non-zero), or <falseval> otherwise. Note that the expressions for <trueval> and <falseval> are both evaluated irrespective of whether <cond> is true or false.
- abs(<x>)
Reinterprets the argument as a 64-bit signed integer and returns the absolute value.
- bit(<x>, <n>[, <w>])
Extracts and right-aligns a bit field <w> bits wide from <x> with least significant bit position <n>, counting from the least significant bit. If <w> is omitted, a single bit is extracted.
- s8(<x>)
Sign-extends the argument from 8 bits to 64 bits (overwrites bits 8 through 63, inclusive, with the value of bit 7, counting from the least significant bit).
- s16(<x>)
Sign-extends the argument from 16 bits to 64 bits (overwrites bits 16 through 63, inclusive, with the value of bit 15, counting from the least significant bit).
- s32(<x>)
Sign-extends the argument from 32 bits to 64 bits (overwrites bits 32 through 63, inclusive, with the value of bit 31, counting from the least significant bit).
Source-level debugging¶
Source-level debugging allows you to step through and reference symbols from the original source of a program, rather than the disassembly. This feature is intended for use when emulating a vintage microcomputer and running software on that emulated machine for which you have access to the original source code. For example, if you are developing new software for a vintage machine, source-level debugging allows you to view your own source code while debugging.
Enabling source-level debugging¶
You will need to generate a MAME Debugging Information File. You can then enable source-level debugging by launching MAME with the -src_debug_info command-line option. You may also want to specify -src_debug_search_path and / or -src_debug_prefix_map.
Once source-level debugging is enabled, you will then be able to access the Options menu, Show Source command from the main debugger window. You may switch back and forth between source and disassembly view, and open separate disassembly windows even while the main window shows source.
The source view has a drop-list at the top for selecting which source file to view. The list is populated with source file paths from the MAME Debugging Information File. When the debugger is paused, you can change the selection to view the file you wish. When stepping, the selection is automatically updated to show the file associated with the current PC address.
Source-level breakpoints¶
The expression evaluator includes syntax for specifying a source
file and line number. This can be used in conjunction with the
bpset command to set
a breakpoint on a particular line number,
rather than specifying the address manually. Source-level symbols
evaluate to addresses, and so may also be used in bpset commands.
When specifying a
source file, you may specify either the full path to the source file
as it existed on the machine that built the binary, the full path
to the source file as it exists on the host running MAME, or any
non-ambiguous substring at the end of the full path (such as
just the filename). The filename or path is surrounded by single
backticks `, with the line number immediately following
the closing backtick.
Examples:
bpset `c:\full\path\to\file.c`3Set a breakpoint on the first instruction associated with file.c, line 3
bpset `file.c`3Set a breakpoint on the first instruction associated with file.c, line 3. Note the use of just the filename instead of a full path. This is fine so long as it is unambiguous relative to the other paths present in the MAME Debugging Information File.
print `to\file.c`3Print the address of the first instruction associated with file.c, line 3. Note the use of the path ending rather than the full path. This is fine so long as it is unambiguous.
bpset MainSet a breakpoint on the label named
Main
When the main window is showing source, you may also click on any source line and hit F9 to set a breakpoint on that line.
Source-level stepping¶
The stepping commands step, over, and out have source-level debugging
versions steps,
overs, and
outs, respectively, which operate
at the source level rather than at the disassembly level.
To take step as an example, if
the PC points to an address associated with line 4, step (with
no parameters) will advance
to the next instruction, whereas steps will advance to the next instruction
that is associated with a source line other than 4. When the original
source is assembly language, step and steps generally behave the same.
But when the original source is in a higher-level language like C or BASIC, steps
executes the remainder of a block of instructions associated with
the current source line.
When executing Step Into, Step Over, and Step Out from the menu or keyboard
shortcuts, the behavior depends on what the main window is showing.
If the main window shows disassembly, then step, over, or out
would be invoked. If the main window shows source,
steps, overs, or outs would be invoked.
Source-level symbol evaluation¶
MAME defines its own built-in symbols based on the current
CPU (e.g., register names) and from its global symbol table
(e.g., beamx, frame, names of built-in functions, etc.).
When source-level debugging is enabled, symbols from the source code will
be imported from the MAME Debugging Information File and added
to the list of symbols recognized by MAME's expression evaluator.
A symbol from the MAME Debugging Information File representing a source-code
variable evaluates to the variable's address, not its value. So,
for example, a 16-bit C compiler that compiles int harry = 4; will
create a symbol named harry whose value is the address at which
the variable is stored. To view the value you would issue a command
to the debugger console such as print w@harry.
Symbol collisions and priority: It is possible that source-level symbols present in the MAME Debugging Information File will conflict with built-in symbols. When source-level debugging is enabled, source-level symbols take precedence. You may always force a reference to the built-in symbol by prefixing the symbol with
ns\("not source"). For example, suppose the MAME Debugging Information File includes the source-level symbolywhich conflicts with the symbol for the registery. References toywill be interpreted as the source-level symboly. References tons\y, such as:print ns\yprint b@ns\ywpset ns\y
will be interpeted as the register
y.Case sensitivity: When the debugger evaluates expressions, symbols are generally interpreted case-insensitively when source-level debugging is not active. Many programming languages treat symbols case-sensitively. This can lead to source-level symbols like
Fooandfoobeing present in the MAME Debugging Information File simultaneously. When source-level debugging is active and MAME's expression evaluator encounters a symbol, it will give precendence to a case-sensitive match. If no such match is found, it will look for a case-insensitive match.
Address offsets¶
In some cases, the assembler or compiler does not know where the generated code will be loaded at run-time. For example, the program might be written in position-independent code to allow an operating system to decide where to load the code at run-time. In such cases, the assembler or compiler will be unable to provide the correct addresses in the MAME Debugging Information File. A build tool that supports run-time relocation could then start numbering its addresses at 0, assuming that the operating system will apply the appropriate offset when the binary is loaded. Similarly, the MAME debugger may also apply an offset to the addresses from the debugging info so that the resulting addresses match where the code is loaded at run-time. You can do this in two ways:
On the command-line, specify -src_debug_offset with the offset to apply. This approach makes sense if the code to be debugged is reliably loaded at an offset you can predict.
At any time during the debugging of the program, use the command sdoffset from the debugger console with the offset to apply. This approach can be used by users or LUA scripts that need to inspect memory to determine where the program was loaded. Any breakpoints set before
sdoffsetwas executed will need to be removed and re-added so the new offset can be applied. Any source-level symbols loaded from the MAME Debugging Information File will automatically evaluate to values with the new offset the next time a command references them.
Generating MAME Debugging Information Files¶
MAME Debugging Information Files (or .mdi files) are generated by
assemblers or compilers that target machines emulated by MAME. Currently
.mdi files adhere to a single binary format called "Simple". In the
future, new formats may be created as the need arises. The Simple format includes:
Full or relative paths to the source files input to the build tool
Mappings from source file and line numbers to blocks of 16-bit addresses where the corresponding instructions reside
Mappings from symbol names to 16-bit addresses
These symbols can either be global or scoped
Scoped symbols can either have fixed values or values dependent on register values (e.g., stack local variables)
The recommended way for build tools to generate .mdi files is to use
the small static library libmame_srcdbg_static.a. The source code resides
in the MAME tree at src/lib/srcdbg, and the library gets built to
a configuration-specific subdirectory, such as
build/linux_gcc/bin/x64/Release/libmame_srcdbg_static.a. The header
file srcdbg_api.h includes declarations and documentation for the C
functions comprising the API, along with comments that describe how to use
them.
Danger
Tools must not rely on any functionality
other than that declared in srcdbg_api.h
and srcdbg_format.h. Other files will
change without warning.
Tools written in C++ can include srcdbg_api.h and link to
libmame_srcdbg_static.a without any further makefile changes. Since
the library's API is pure C, tools
written in C can also include srcdbg_api.h and link to
libmame_srcdbg_static.a, but will need to add the C++ standard library
to the link line (as the library's implementation is C++). For example,
cc -m64 -o mytool mytool.c -L/path/to/lib/dir -lmame_srcdbg_static -lstdc++.
Note that -lstdc++ must appear at the end.
Tools not written in C or C++ may be able to use the shared library
libmame_srcdbg_shared.so or mame_srcdbg_shared.dll, assuming the tool
is written in a language that supports interfacing with shared libraries.
Linux shared library versioning
On Linux the shared library follows the recommended versioning names, with
the initial version of the library's real name being
libmame_srcdbg_shared.so.1.0 and soname being libmame_srcdbg_shared.so.1.
MAME does not have a setup program, so the responsibility is on tools
redistributing the shared library to perform the usual shared library installation
steps. For example, a tool might want to insulate itself from the machine's
environment, and tuck its own copy of the shared library into a tool-specific
folder, and use the -rpath linker option to declare where the tool can
find libmame_srcdbg_shared.so at run-time. Alternatively, a tool could
install libmame_srcdbg_shared.so into a machine-wide folder like
/usr/lib and run sudo ldconfig to register the library and create
a symbolic link from the soname to the real name. In any case, the tool
will need to manually create a symbolic link from the linker name
libmame_srcdbg_shared.so to either the soname
(if ldconfig was run) or directly to the real
name to ensure that the build-time linker can find libmame_srcdbg_shared.so.
Before proceeding with any of these options, it's recommended that you read up
on Linux shared library versioning, for example:
https://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html
If consuming neither the static nor shared version of the MAME srcdbg library is
feasible, tools may also manually generate the binary format directly. The format
is defined in src/lib/srcdbg/srcdbg_format.h. Because this is error-prone,
tools should prefer using the static or shared library over generating the binary
format directly.
Viewing MAME Debugging Information Files with srcdbgdump¶
A small console executable, srcdbgdump is built along with other MAME tools. It may be used to view the contents of MAME Debugging Information files.