This PR is the first of two PRs that replace earlier PRs #2589 and #2590.
Due to a git branching mishap it was decided to re-partition the new
functionality in two sequential PRs that offer self-contained, new
functionality to sim65.
The functionality in this first PR extends the sim65 simulator in the following ways:
(1) It provides tracing functionality, i.e., the possibility of printing one line of simulator state information per instruction executed.
(2) It provides a memory mapped "sim65 control" peripheral that allows control of (a) the tracing functionality, and (b) the cpu mode.
(3) It provides command-line options to sim65 to enable the tracing, and to override the CPU mode as specified in the program file header.
More detailed information and some discussion can be found in the discussions with the (now retracted) PRs #2589 and #2590.
This PR provides the technical infrastructure inside the sim65 simulator program itself. Once this PR is accepted, a follow-up PR will be posted that adds C and assembly-language support for the new tracing and peripheral features so they can be easily accessed from the CC65 compiler and the CA65 assembler; some examples; and the documentation for these features. The lack of the latter, in this pull request, will be addressed then.
This PR fixes all discrepancies of sim65 instruction timings, for both the 6502 and the 65C02 processors.
The timings as implemented in this PR have been verified against actual hardware (Atari 800 XL for 6502; and WDC 65C02 for 65C02).
These timings can also be verified against the 65x02 test suite. In that case, a single discrepancy arises: the 65x02 test suite suggests that the 65C02 opcode 0x5c should take 4 clocks. However, tests on a hardware 65C02 have conclusively shown that this instruction takes 8 clock cycles. The 8-clock-cycle duration of the 65C02 0x5c opcode is also confirmed by other sources, e.g. Section 9 of http://www.6502.org/tutorials/65c02opcodes.html.
This PR makes sim65 correct both in terms of functionality (all opcodes now do what they do on hardware) and in terms of timing (all instructions take as long as they would on real hardware).
The one discrepancy that remains is that on a real 6502/65C02, some instructions issue R or W cycles on the bus while the instruction processing is being done. Those spurious bus cycles are not replicated in sim65. Sim65 is thus an instruction-level simulator rather than a bus-cycle-level simulator. In other words, while the clock cycle counts for each instruction are now correct, not all clock cycles are individually simulated.
This PR fixes the implementation of 5 illegal opcodes
in the 6502, which the 6502X supports:
* $93 SHA (zp),y
* $9B TAS abs,y
* $9C SHY abs,x
* $9E SHX abs,x
* $9F SHA abs,y
The common denominator of the previous implementation was that it didn't correctly handle the case when the Y or X indexing induced a page crossing. In those cases, the effective address calculation of the instructions becomes truly messed up (with the high byte of the address equal to the value being written).
The correctness of the implementations in this PR was verified using the 65x02 test suite, and corresponds to a (detailed) reading of the "No More Secrets" document.
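As an illustration of the page-crossing behavior described above, here is a minimal sketch of SHY abs,x. It is not the code in this PR: the flat `mem` array and the function name `shy_abs_x` are hypothetical, and the stored value (Y AND high byte of the operand plus one) follows the usual reading of the "No More Secrets" document.

```c
#include <stdint.h>

/* Hypothetical flat 64 KiB memory; sim65's real memory interface differs. */
static uint8_t mem[0x10000];

/* Sketch of SHY abs,x ($9C). 'base' is the 16-bit operand, 'x' and 'y' are
 * the index registers. Returns the address that was actually written. */
static uint16_t shy_abs_x(uint16_t base, uint8_t x, uint8_t y)
{
    uint8_t  hi    = (uint8_t)(base >> 8);
    uint8_t  value = (uint8_t)(y & (hi + 1));   /* Y AND (high byte + 1) */
    uint16_t addr  = (uint16_t)(base + x);      /* nominal effective address */

    if ((base & 0xFF) + x > 0xFF) {
        /* Page crossing: the high byte of the address is replaced by the
         * value being written. */
        addr = (uint16_t)((value << 8) | (addr & 0xFF));
    }
    mem[addr] = value;
    return addr;
}
```

For example, with operand $12F0, X = $20 and Y = $03, the write does not go to $1310 but to $0310, because the stored value $03 replaces the high byte of the address.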
Stylistically, there is room for improvement in these implementations, specifically in factoring out common behavior into macros. However, for now the "explicit" coding style will suffice. It is clear enough, and we want to reach a situation soon where the sim65
code is able to pass the full '65x02' testsuite. Once we get to that point, we can refactor this code with a lot more confidence, since we will have the benefit of a working exhaustive test to make sure we don't break stuff.
This PR adds support for 32 65C02-specific instructions
to sim65: BBRx, BBSx, RMBx, SMBx, with x = 0..7.
These instructions are implemented using two macros:
* The "ZP_BITOP" macro implements the RMBx and SMBx instructions.
* The "ZP_BIT_BRANCH" macro implements the BBRx and BBSx instructions.
The implementation of these instructions has been verified using the 65x02 test suite.
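The semantics of these four instruction families can be sketched as plain C functions. This is not the ZP_BITOP / ZP_BIT_BRANCH macro code from the PR; the `zp` array and all function names below are hypothetical and only show what each instruction does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical zero page; sim65 goes through its own memory accessors. */
static uint8_t zp[0x100];

/* RMBx zp (opcodes $07, $17, ..., $77): clear bit x of a zero-page byte. */
static void rmb(unsigned bit, uint8_t addr)
{
    assert(bit < 8);
    zp[addr] &= (uint8_t)~(1u << bit);
}

/* SMBx zp (opcodes $87, $97, ..., $F7): set bit x of a zero-page byte. */
static void smb(unsigned bit, uint8_t addr)
{
    assert(bit < 8);
    zp[addr] |= (uint8_t)(1u << bit);
}

/* BBRx zp,rel (opcodes $0F ... $7F): branch if bit x is clear. */
static bool bbr_taken(unsigned bit, uint8_t addr)
{
    assert(bit < 8);
    return (zp[addr] & (1u << bit)) == 0;
}

/* BBSx zp,rel (opcodes $8F ... $FF): branch if bit x is set. */
static bool bbs_taken(unsigned bit, uint8_t addr)
{
    assert(bit < 8);
    return (zp[addr] & (1u << bit)) != 0;
}

/* Branch target: 'next_pc' is the address of the instruction following the
 * 3-byte BBRx/BBSx, 'offset' is the signed 8-bit displacement. */
static uint16_t branch_target(uint16_t next_pc, uint8_t offset)
{
    int disp = (offset >= 0x80) ? (int)offset - 0x100 : (int)offset;
    return (uint16_t)(next_pc + disp);
}
```

The bit number is encoded in the high nibble of the opcode, which is why a single macro per family can cover all eight variants.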
After a lot of preparatory work, we are now in position to finally tighten
the types of the 6502 registers defined in the CPURegs struct of sim65.
All registers were previously defined as bare 'unsigned', leading to subtle
bugs where the bits beyond the 8 or 16 "true" bits in the register could
become non-zero. Tightening the types of the registers to uint8_t and
uint16_t as appropriate gets rid of these subtle bugs once and for all,
assisted by the semantics of C when assigning an unsigned value to an
unsigned type with fewer bits: the high-order bits are simply discarded,
which is precisely what we'd want to happen.
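The effect of the narrower types can be shown with a small sketch. The two structs below are illustrative stand-ins, not the actual CPURegs definition, and the field name `AC` is used here only as an example.

```c
#include <stdint.h>

/* Illustrative "before" layout: registers as bare unsigned. */
typedef struct { unsigned AC; } LooseRegs;

/* Illustrative "after" layout: registers with their true width. */
typedef struct { uint8_t AC; } TightRegs;

/* Increment the accumulator without any explicit masking. */
static unsigned inc_loose(LooseRegs *r)
{
    r->AC += 1;          /* 0xFF becomes 0x100: a stray ninth bit appears */
    return r->AC;
}

static unsigned inc_tight(TightRegs *r)
{
    r->AC += 1;          /* 0xFF becomes 0x00: the assignment truncates */
    return r->AC;
}
```

With the loose layout, every operation has to remember to mask with 0xFF; with the tight layout, the conversion on assignment does it implicitly.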
This change cleans up a lot of spurious failures of sim65 against the
65x02 test-set. For the 6502 and 65C02, we're now *functionally*
compliant. For timing (i.e., clock cycle counts for each instruction),
some work remains.
ANE (0x8b) is an unstable illegal opcode that depends on a "constant" value that isn't
really constant. It varies between machines, with temperature, and so on. Original sim65
behavior was to use the constant value 0xEF. To get the behavior in line with the 65x02
testsuite, we now use the value 0xEE instead, which is also a reasonable choice that can
be observed in practice.
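For reference, the commonly documented formula for ANE is A = (A OR constant) AND X AND immediate. A minimal sketch with the constant set to 0xEE follows; the macro and function names are hypothetical, and flag updates are left out.

```c
#include <stdint.h>

/* The "magic" constant; 0xEE matches the 65x02 test suite. */
#define ANE_MAGIC 0xEE

/* Sketch of ANE #imm ($8B): returns the new accumulator value.
 * Setting the N and Z flags from the result is omitted here. */
static uint8_t ane(uint8_t a, uint8_t x, uint8_t imm)
{
    return (uint8_t)((a | ANE_MAGIC) & x & imm);
}
```

With A = $00 and X = immediate = $FF the result is the constant itself, which is how the choice between 0xEF and 0xEE becomes visible in a test.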
The obvious way to implement JSR for the 6502 is to (a) read the target address,
and then (b) push the return address minus one. Or do (b) first, then (a).
However, there is a non-obvious case where this conflicts with the actual order
of operations that the 6502 does, which is:
(a) Load the LSB of the target address.
(b) Push the MSB of the return address, minus one.
(c) Push the LSB of the return address, minus one.
(d) Load the MSB of the target address.
This can make a difference in a pretty esoteric case, if the JSR target is located,
wholly or in part, inside the stack page (!). This won't happen in normal code
but it can happen in specifically constructed examples.
To deal with this, we load the LSB and MSB of the target address separately, with
the pushing of the return address sandwiched in between, to mimic the order of the
bus operations on a real 6502.
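The order of operations (a) to (d) can be sketched as follows. This is a simplified model, not the sim65 source: `mem`, `pc`, `sp`, `push` and `jsr` are hypothetical names, and `pc` is assumed to point at the JSR opcode on entry.

```c
#include <stdint.h>

/* Hypothetical CPU state for the sketch. */
static uint8_t  mem[0x10000];
static uint16_t pc;            /* points at the JSR opcode on entry */
static uint8_t  sp;

/* Push one byte onto the stack in page 1; sp wraps within 8 bits. */
static void push(uint8_t v)
{
    mem[0x0100 + sp] = v;
    sp--;
}

static void jsr(void)
{
    /* Return address minus one: the address of the last byte of the JSR. */
    uint16_t ret = (uint16_t)(pc + 2);

    uint8_t lo = mem[(uint16_t)(pc + 1)];   /* (a) LSB of target */
    push((uint8_t)(ret >> 8));              /* (b) MSB of return address - 1 */
    push((uint8_t)(ret & 0xFF));            /* (c) LSB of return address - 1 */
    uint8_t hi = mem[(uint16_t)(pc + 2)];   /* (d) MSB of target, read last */

    pc = (uint16_t)((hi << 8) | lo);
}
```

As an example of the esoteric case: with a `JSR $1234` placed at $0180 and SP = $82, push (b) overwrites the operand byte at $0182 with $01 before step (d) reads it, so execution continues at $0134 rather than $1234.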
It provides access to a handful of 64-bit counters that count different things:
- clock cycles
- instructions
- number of IRQs processed
- number of NMIs processed
- nanoseconds since 1-1-1970.
This is not yet ready to be pushed as a merge request into the upstream CC65
repository. What's lacking:
- documentation
- tests
And to be discussed:
- do we agree on this implementation direction and interface in principle?
- can I include inttypes.h for printing a 64-bit unsigned value?
- will clock_gettime() work on a Windows build?
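To make the two open questions concrete, here is a minimal sketch of what the counters and the wall-clock read could look like on a POSIX system. The `Counters` struct and both function names are hypothetical; the sketch does not settle whether `clock_gettime()` is available in a Windows build.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime() in strict C modes */
#endif

#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical layout of the 64-bit event counters. */
typedef struct {
    uint64_t clock_cycles;
    uint64_t instructions;
    uint64_t irqs;
    uint64_t nmis;
} Counters;

/* Nanoseconds since 1970-01-01 (POSIX only); returns 0 on failure. */
static uint64_t wallclock_ns(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_REALTIME, &ts) != 0) {
        return 0;
    }
    return (uint64_t)ts.tv_sec * UINT64_C(1000000000) + (uint64_t)ts.tv_nsec;
}

/* Format "name=value" using the PRIu64 macro from inttypes.h, which expands
 * to the right printf conversion for uint64_t on the current platform. */
static int format_counter(char *buf, size_t n, const char *name, uint64_t v)
{
    assert(buf != NULL && name != NULL);
    return snprintf(buf, n, "%s=%" PRIu64, name, v);
}
```

Using `PRIu64` avoids guessing between `%lu` and `%llu`, which differ between platforms.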
The linkage of the 'Regs' variable in 6502.c was changed from static
to extern. This makes the Regs variable visible (and even alterable) from
the outside.
This change helps tools to inspect the CPU state. In particular, it
was implemented to facilitate a tool that verifies opcode
functionality using the '65x02' testsuite. But the change is also
potentially useful for e.g. an online debugger that wants to inspect
the CPU state while the 6502 is being simulated.