NikonEmulator Internals

=Emulator source code=

The code of the Emulator is freely available. If you want to take a look at it, build your own version or understand how something works, please help yourself (and all feedback is welcome of course). Here's how:

Tools and links

 * If you just want to browse sources online, they are hosted on the Google Code site that Simeon set up.
 * If you want to get sources, you can use any Mercurial (Hg) command-line or graphical tool. For Windows, I recommend TortoiseHg. You just have to "clone" the following repository, which will get you all source files (Emulator as well as other tools used in the project): https://code.google.com/p/nikon-firmware-tools/
 * If you want to go further, you need a Java IDE (Integrated Development Environment). I personally love IntelliJ IDEA by JetBrains, and the free Community Edition is all you need to build, develop and debug the Emulator. I've set up a screencast showing how to load sources and build the project in a few clicks.

=General Considerations=

So what is an emulator anyway?
Basically, an emulator (or simulator as some call it) is software that makes a "system" behave as another "system". By "system", we often mean a "computer", but in the general sense it's any digital device executing software to perform operations. Most well-known emulators run on a modern computer (PC, Mac) to simulate the behaviour of old microcomputers or gaming devices, such as this Commodore 64 Emulator or this Virtual Gameboy - check out http://www.emulator-zone.com/ for more.

In this project, the target platform we want to emulate is a camera, but the idea is the same.

How does it work?
The principle is simple: take each instruction of the original software (in this case the camera firmware), decode it, and perform the same thing as the camera's microcontroller would do with it. To get a grasp of how this translates into code, I would invite you to read the excellent How To Write a Computer Emulator article by Marat Fayzullin, author of several emulators.

What is different compared to other emulators
These are the differences:
 * Cameras have many specific devices (sensor, secondary LCD, auto-focus module, light metering module, orientation sensor, etc)
 * Several devices, or microcontrollers themselves, are proprietary and, as such, have no public documentation available.
 * The control of Nikon DSLRs (like others) is shared between 2 microprocessors that have different roles, so we have to emulate two systems at once instead of one, and make them communicate
 * That last point means that timing is a crucial issue
 * This emulator is written in Java, which is not a good choice from a performance point of view (although nowadays, speed difference compared to C is less than one might think). No language war here, this choice was made because Java is the language I (Vicne) am most used to...

Side note - Java and unsigned integer types
This is probably one of the strangest things to grasp for people used to C (and other) languages:
 * Java has no "unsigned" int.
 * Java also has no "signed" int.
 * Java just has int.
 * And it's ok.

For all frequent uses (add/subtract/multiply, equality comparison), it makes no difference, thanks to the two's complement way of storing values. Most of the time, you only care about signed vs unsigned when:
 * 1) you display the values for human readability.
 * 2) you compare values with less than / greater than operators

It may sound weird - and I also found it weird until I really started working on binary data in Java - but that's the way it is.

You'll find many people answering in forums that due to Java's inability to cope with int values between 2^31 and 2^32, you have to use a larger type for storing unsigned ints (e.g. long, which is 64-bit (*)). That is plain wrong. Java ints handle that range just fine; the value is just considered negative. 32-bit is 32-bit. So just use an int and take care of the 2 above points. The solutions for the above are:
 * 1) Format them as hex using Integer.toHexString(value), or better yet my Format.asHex method that left-pads values with as many zeroes as needed (**).
 * 2) Create distinct cases according to the first bit of those two numbers, or just write the right code once and reuse it. For example, this is a method to compare two positive numbers n1 and n2 in the range [0, 2^32-1], stored as ints:

public static boolean isLessThanUnsigned(int n1, int n2) { return (n1 < n2) ^ ((n1 < 0) != (n2 < 0)); }

I admit calling "Utils.isLessThanUnsigned(n1, n2)" instead of "n1<n2" is somewhat annoying, but really, most of the time, checking for equality is what you want.
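
To make the difference concrete, here is a minimal sketch comparing the signed and unsigned views of the same two ints. The class name UnsignedCompareSketch and the sample values are made up for illustration. Integer.compareUnsigned is a standard JDK method (Java 8 and later) that gives the same answer as the helper above.

```java
final class UnsignedCompareSketch {
    // Helper from the text: unsigned "less than" on values stored in ints.
    static boolean isLessThanUnsigned(int n1, int n2) {
        return (n1 < n2) ^ ((n1 < 0) != (n2 < 0));
    }

    public static void main(String[] args) {
        int low = 0x00000010;
        int high = 0xBFC00000; // negative when read as a signed int

        System.out.println(high < low);                             // signed view: true
        System.out.println(isLessThanUnsigned(high, low));          // unsigned view: false
        System.out.println(Integer.compareUnsigned(high, low) < 0); // JDK equivalent: false
        System.out.println(Integer.toHexString(high));              // bfc00000
    }
}
```

Equality, addition and subtraction need no such helper: they give the same bit pattern whichever way you read the sign.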

If you need to be convinced, take a look at the emulator code, you'll see that all 32-bit registers are implemented with ints. For example, the CPU's pc register (which we would normally consider as unsigned) is an int. In the TX19, all code is located at BFCxxxxx, which would be printed as a negative integer if you just did a "System.out.println(pc)". But we don't care, and relative jumps work great as-is in that "negative memory space".

(*) Contrary to C, in Java, the size of the primitive types is fixed no matter the architecture you run on: byte = 8-bit, short = 16-bit, int = 32-bit, long = 64-bit. char (which can be seen as unsigned 16-bit) should really be reserved for characters.

(**) If you really want to print them as unsigned in decimal, then you might have to promote them to a larger type and mask them. For example, printing a byte "n" representing a number in the decimal range [0-255] is done with System.out.println(n & 0xFF). I agree that is not very intuitive, but for those interested, let me explain why this "& 0xFF". What happens is as follows:
 * 1) unless further specified, all literals are treated as int, so 0xFF is in fact 0x000000FF.
 * 2) "&", as a binary operator, cannot work on types smaller than int, so the byte "n" is promoted to int, and promotion occurs with sign extension. So bit 7 of n is copied to all bits [31:8], making sure that the signed interpretation of n is preserved. In other words, if n was in [0x00-0x7F], it becomes 0x000000NN. If it was in [0x80-0xFF], it becomes 0xFFFFFFNN.
 * 3) the & operation occurs, cancelling the sign extension by only keeping the 8 lower bits, and the result is an int which only holds the 8 bits that were in n, but will now be printed as a positive 32-bit number (highest bit being 0).
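
The three steps above can be observed directly with a small sketch. The class and method names (ByteMaskSketch, unsignedByte, unsignedInt) are hypothetical; the second helper applies the same promote-and-mask idea to a 32-bit value by using a long mask.

```java
final class ByteMaskSketch {
    /** Unsigned value of a byte, as an int in [0, 255]. */
    static int unsignedByte(byte n) {
        return n & 0xFF; // n is sign-extended to int first, then the mask clears bits 31..8
    }

    /** Unsigned value of an int, as a long in [0, 2^32-1]. */
    static long unsignedInt(int n) {
        return n & 0xFFFFFFFFL; // the L suffix makes the mask a long, so n is promoted to long
    }

    public static void main(String[] args) {
        byte n = (byte) 0xF0;
        int promoted = n; // plain promotion keeps the sign
        System.out.println(Integer.toHexString(promoted)); // fffffff0
        System.out.println(promoted);                      // -16
        System.out.println(unsignedByte(n));               // 240
        System.out.println(unsignedInt(0xBFC00000));       // 3217031168
    }
}
```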

=Highlight of a few features=

Problem
In a classic emulator, as explained in the article by Marat Fayzullin (see "How does it work ?" above), you basically let the emulated system run freely in a while loop. The first "dual-CPU" emulator versions worked that way by simply having each microcontroller run in its own thread "as fast as it could".

However, when trying to interconnect them, it became evident that this couldn't work, as one processor could run at 1/10 of real time while the other ran at 1/50 of real time, for example. In other words, one ran 5 times faster than it should relative to the other, and timeouts were triggered in the code because of this.

So now one MasterClock class is responsible for "distributing the clock" and making each processor run in turn, at the speed its clock was programmed to. Of course both processors don't run at exactly the same clock speed. Moreover, the speed of the processors can be dynamically changed by the firmware code.

Model
The way it works now is that all elements that have to be synchronized together must be registered with the MasterClock, and they must implement the "Clockable" interface, which requires a "getFrequency" method (to return the device frequency in Hz) and an "onClockTick" method (that the MasterClock will call when it's that device's turn).

To give the correct ticks, the master clock has to "run" at a frequency that is the least common multiple of all device frequencies, and it associates a counter and a threshold with each registered device. When the counter reaches the threshold, it gets reset and the onClockTick method is called.

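So for example, if we have CPU1 at 6 MHz and CPU2 at 4 MHz, the clock loop will run at a virtual 12 MHz and set the threshold of CPU1 to 2 and that of CPU2 to 3. With all counters starting at 0 and master ticks numbered from 1, CPU1 then runs on ticks 2, 4 and 6 and CPU2 on ticks 3 and 6, after which the 6-tick pattern repeats.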

Of course, that can be extended to many devices that have to be called at regular intervals or after a given delay (e.g. timers, DMA, serial interfaces, etc.)
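
The counter-and-threshold logic described above can be sketched as follows. This is a simplified illustration, not the Emulator's actual MasterClock: the names SketchClockable, MasterClockSketch, prepare, tick and demo are assumptions, and only the getFrequency/onClockTick pair comes from the description.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for the Clockable interface described in the text. */
interface SketchClockable {
    int getFrequency();  // device frequency in Hz
    void onClockTick();  // called when it is this device's turn
}

final class MasterClockSketch {
    private static final class Entry {
        final SketchClockable device;
        int counter;
        int threshold;
        Entry(SketchClockable device) { this.device = device; }
    }

    private final List<Entry> entries = new ArrayList<>();

    void add(SketchClockable device) { entries.add(new Entry(device)); }

    static long gcd(long a, long b) { return b == 0 ? a : gcd(b, a % b); }
    static long lcm(long a, long b) { return a / gcd(a, b) * b; }

    /** Computes one threshold per device and returns the virtual master frequency. */
    long prepare() {
        long master = 1;
        for (Entry e : entries) master = lcm(master, e.device.getFrequency());
        for (Entry e : entries) {
            e.threshold = (int) (master / e.device.getFrequency());
            e.counter = 0;
        }
        return master;
    }

    /** One master tick: bump every counter and run the devices that reach their threshold. */
    void tick() {
        for (Entry e : entries) {
            if (++e.counter == e.threshold) {
                e.counter = 0;
                e.device.onClockTick();
            }
        }
    }

    private static SketchClockable device(String name, int hz, StringBuilder out) {
        return new SketchClockable() {
            @Override public int getFrequency() { return hz; }
            @Override public void onClockTick() { out.append(name).append(' '); }
        };
    }

    /** Runs CPU1 at 6 MHz and CPU2 at 4 MHz and records which devices ran on each tick. */
    static List<String> demo(int ticks) {
        StringBuilder current = new StringBuilder();
        MasterClockSketch clock = new MasterClockSketch();
        clock.add(device("CPU1", 6_000_000, current));
        clock.add(device("CPU2", 4_000_000, current));
        clock.prepare();
        List<String> ran = new ArrayList<>();
        for (int t = 0; t < ticks; t++) {
            current.setLength(0);
            clock.tick();
            ran.add(current.toString().trim());
        }
        return ran;
    }

    public static void main(String[] args) {
        List<String> ran = demo(6);
        for (int t = 0; t < ran.size(); t++) {
            System.out.println("tick " + (t + 1) + ": " + (ran.get(t).isEmpty() ? "-" : ran.get(t)));
        }
    }
}
```

With these two frequencies, the sketch runs CPU1 on ticks 2, 4 and 6 and CPU2 on ticks 3 and 6. If the firmware reprograms a clock, calling prepare() again would recompute the thresholds.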

Optimization
While the above logic worked flawlessly, profiling the code showed that much time was spent iterating on steps that didn't have anything to "run" or, inside a given step, iterating on devices that shouldn't run on that "tick". Consequently, an optimization was done by post processing the table above and converting it to a list of Steps, each having a list of Clockable objects to run and a duration (which can be the sum of several masterclock ticks if some of them had nothing to run). So the 6-tick table above is "compressed" into the following list of steps:
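
A minimal sketch of such a compression is shown here, under one assumption that the text does not settle: a tick with nothing to run is folded into the step that follows it, so a step's duration counts the master ticks elapsed since the previous step. The names StepCompressionSketch, Step and compress are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class StepCompressionSketch {
    /** One compressed step: the master ticks it spans and the devices to run at its end. */
    static final class Step {
        final int duration;
        final List<String> devices;

        Step(int duration, List<String> devices) {
            this.duration = duration;
            this.devices = devices;
        }

        @Override
        public String toString() {
            return duration + "x" + devices;
        }
    }

    /**
     * Builds the step list for one full cycle.
     * thresholds[i] is the number of master ticks between two runs of names[i].
     */
    static List<Step> compress(String[] names, int[] thresholds, int cycleLength) {
        List<Step> steps = new ArrayList<>();
        int elapsed = 0; // master ticks since the last emitted step
        for (int tick = 1; tick <= cycleLength; tick++) {
            elapsed++;
            List<String> due = new ArrayList<>();
            for (int i = 0; i < names.length; i++) {
                if (tick % thresholds[i] == 0) {
                    due.add(names[i]);
                }
            }
            if (!due.isEmpty()) { // idle ticks produce no step, only a longer duration
                steps.add(new Step(elapsed, due));
                elapsed = 0;
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        // CPU1 every 2 ticks, CPU2 every 3 ticks, cycle of 6 ticks
        System.out.println(compress(new String[] {"CPU1", "CPU2"}, new int[] {2, 3}, 6));
        // prints [2x[CPU1], 1x[CPU2], 1x[CPU1], 2x[CPU1, CPU2]]
    }
}
```

The run loop then only walks four steps per cycle instead of testing two counters on each of six ticks, while the durations still add up to the full cycle.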

Problem
To emulate a complete system, several elements must be able to talk to each other. For "high-level" components (such as serial ports), it can be done at the logic level. For example, a serial interface sending a byte can just call the "write" method of the connected serial interface.

But to have generic I/O ports (e.g. the "~SELECT" signal of a serial device), it can be useful to have a model going down to the pin level, and having an API to connect pins together, just like a copper track or wire would connect an output pin of a device to an input pin of another. That way, connections can be made configurable (like with a physical panel) and we have a good point to connect a "spy" probing what's the level of a given pin.

Model
Here's the current model for pin interconnection:

Explanations
(all class names are in blue).
 * Each microcontroller (or "Platform") has an array of IoPort objects (FrIoPort or TxIoPort according to the processor type).
 * Each IoPort has an array of 8 Pin objects of subclass "VariableFunctionPin"
 * Each VariableFunctionPin has a PinFunction object

Upon initialization, the static method "Pin.interconnect(Pin pinA, Pin pinB)" can be called to indicate which pins are connected together. For example, the "~SELECT" pin of the flash eeprom is connected to the P46 of the Tx19 microcontroller using the code:

Pin.interconnect(txIoPorts[IoPort.PORT_4].getPin(6), eeprom.getSelectPin());

A PinFunction object can be assigned dynamically when the emulated code configures the function registers of the ports, allowing a pin to behave differently according to the running code.

A. microcontroller A outputs a value
(follow the green flow)
 * Platform A stores a value at an address port, by calling the setValue method of the corresponding IoPort object (that is done for example in the TxIoListener).
 * the IoPort iterates on each bit, and if it is configured as output, calls the setOutputValue method of the corresponding Pin
 * Pins that are configured as output store that value locally then call setInputValue on their connected Pin, if any
 * The connected Pin handles that input value. In the case of a VariableFunctionPin, it delegates the handling to the PinFunction currently assigned to it.
 * That PinFunction can have different behaviours such as triggering an interrupt (implemented), triggering a timer capture (not yet implemented), etc.

B. microcontroller B reads a value
(follow the red flow)
 * Platform B loads a value from an address port, by calling the getValue method of the corresponding IoPort object (that is done for example in the ExpeedIoListener).
 * the IoPort iterates on each bit: a) if it is configured as input, it calls the getInputValue method of the corresponding Pin; b) if it is configured as output, it calls the getOutputValue method of the corresponding Pin
 * Pins that are configured as input call getOutputValue on their connected Pin, if any.
 * The connected Pin replies by returning its stored outputValue.
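
Both flows can be condensed into one small class. This is a sketch only, not the Emulator's Pin class: the name PinSketch, the use of int for levels, and the value returned for an unconnected pin are assumptions, while the method names follow the description above.

```java
final class PinSketch {
    private PinSketch connectedPin; // the pin at the other end of the connection, if any
    private int outputValue;        // level this pin is currently driving
    private int lastInputValue;     // last level pushed to this pin by its peer

    /** Connects two pins together, in both directions. */
    static void interconnect(PinSketch pinA, PinSketch pinB) {
        pinA.connectedPin = pinB;
        pinB.connectedPin = pinA;
    }

    /** Write flow: store the value locally, then push it to the connected pin. */
    void setOutputValue(int value) {
        outputValue = value;
        if (connectedPin != null) {
            connectedPin.setInputValue(value);
        }
    }

    /** Receiving end of the write flow; a VariableFunctionPin would delegate to its PinFunction here. */
    void setInputValue(int value) {
        lastInputValue = value;
    }

    /** Read flow for an output pin: return the locally stored level. */
    int getOutputValue() {
        return outputValue;
    }

    /** Read flow for an input pin: ask the connected pin what it drives (0 if unconnected, an arbitrary choice). */
    int getInputValue() {
        return connectedPin == null ? 0 : connectedPin.getOutputValue();
    }

    int getLastInputValue() {
        return lastInputValue;
    }

    public static void main(String[] args) {
        PinSketch p46 = new PinSketch();
        PinSketch select = new PinSketch();
        interconnect(p46, select);
        p46.setOutputValue(1);                      // flow A: microcontroller A outputs a value
        System.out.println(select.getInputValue()); // flow B: the other side reads it back, prints 1
    }
}
```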

Spying on a connection
If we want to show or otherwise "spy" on what happens to that pin's signal, we can "disconnect" the platform pins and insert a "Wire" in between. This Wire will transmit information from one pin to the other, but will also be able to inform a "Listener" that a new value has been set. The model then becomes:
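
As a rough illustration of the idea, a wire can be modelled as two back-to-back pin ends that forward every level and notify listeners on the way. Everything in this sketch (SpiedPin, WireSketch, Listener, insertBetween) is hypothetical and only mirrors the description; the Emulator's own Wire may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical minimal pin: pushes every level it drives to whatever it is connected to. */
class SpiedPin {
    SpiedPin connectedPin;
    int lastInputValue;

    static void interconnect(SpiedPin a, SpiedPin b) {
        a.connectedPin = b;
        b.connectedPin = a;
    }

    void setOutputValue(int value) {
        if (connectedPin != null) {
            connectedPin.setInputValue(value);
        }
    }

    void setInputValue(int value) {
        lastInputValue = value;
    }
}

final class WireSketch {
    interface Listener {
        void onValueSet(int value);
    }

    private final List<Listener> listeners = new ArrayList<>();
    private final End endA = new End();
    private final End endB = new End();

    /** One end of the wire: reports each incoming level, then re-emits it from the other end. */
    private final class End extends SpiedPin {
        End other;

        @Override
        void setInputValue(int value) {
            super.setInputValue(value);
            for (Listener listener : listeners) {
                listener.onValueSet(value);
            }
            other.setOutputValue(value);
        }
    }

    WireSketch() {
        endA.other = endB;
        endB.other = endA;
    }

    void addListener(Listener listener) {
        listeners.add(listener);
    }

    /** Inserts this wire between two pins that would otherwise be connected directly. */
    void insertBetween(SpiedPin a, SpiedPin b) {
        SpiedPin.interconnect(a, endA);
        SpiedPin.interconnect(endB, b);
    }

    public static void main(String[] args) {
        SpiedPin p46 = new SpiedPin();
        SpiedPin select = new SpiedPin();
        WireSketch wire = new WireSketch();
        wire.addListener(value -> System.out.println("wire level: " + value));
        wire.insertBetween(p46, select);
        p46.setOutputValue(1);                     // the listener is told, then the level reaches the other pin
        System.out.println(select.lastInputValue); // 1
    }
}
```

Neither platform pin needs to know that it is being observed: each still sees a single connected pin.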

Dynamically assignable pins
Some pins have a predefined behaviour, like the "SELECT" pin of a device. But others belong to processors that can dynamically change the function associated to that pin.

For example, a pin can be configured as an A/D converter trigger input at one moment, and as a plain output at another. To allow that flexibility, we use a specific implementation of the Pin class in which the behaviour is not coded, but is delegated to a PinFunction object that is associated with it. When a port reconfiguration occurs, the corresponding pin remains connected as it was before, but it is just given a new PinFunction subclass which will have another behaviour.
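
The delegation can be sketched in a few lines. PinFunctionSketch and VariableFunctionPinSketch are simplified stand-ins with assumed method names; the point is only that the pin object stays the same while the function behind it is swapped.

```java
/** Simplified stand-in for PinFunction: what to do when the connected pin drives a new level. */
interface PinFunctionSketch {
    void onInputValue(int value);
}

final class VariableFunctionPinSketch {
    private PinFunctionSketch function = value -> { }; // default function ignores input

    /** Called when the emulated code reconfigures the port; connections are left untouched. */
    void setFunction(PinFunctionSketch function) {
        this.function = function;
    }

    /** Incoming levels are not handled here but handed to the current function. */
    void setInputValue(int value) {
        function.onInputValue(value);
    }

    public static void main(String[] args) {
        VariableFunctionPinSketch pin = new VariableFunctionPinSketch();
        pin.setFunction(value -> System.out.println("A/D trigger input saw " + value));
        pin.setInputValue(1);
        pin.setFunction(value -> System.out.println("plain output: input ignored"));
        pin.setInputValue(1);
    }
}
```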

Call Stack Logger
The current "call stack" was written based on the logic of procedural code, at the time we were only using FR80 and not handling tasks. So when a "CALL" occurs, that call is added to the call stack, and when a "RET" occurs, the last call is removed from the call stack. At that time, it worked pretty well. The first thing that somewhat broke its usefulness was tasks: if task 1 starts, then dispatching occurs and task 2 starts because of higher priority, then task 2 waits and task 1 restarts, the call stack will show them as nested, like

task1.fn1 task1.fn2 task1.fn3 task2.fn1 task2.fn2 task2.fn3 task1.fn4 task1.fn5 task1.fn6

Moreover, on FR80, task switching doesn't use "call" but pure "jmp", which are not logged on the stack, so task switching is hard to follow. I don't remember how it works on TX19, but it could be that it also messes things up.

Now with TX19, calls are just a special kind of jump (jalr), and returns are standard "jrc" jumps with $ra as operand. The convention is to use $ra for return, but nothing prevents the code from using another register than $ra, or from copying registers in between.

This is much harder to follow.
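
To see why the procedural model produces nested tasks, here is a minimal sketch of a push-on-CALL, pop-on-RET logger. The class NaiveCallStackSketch and its methods are hypothetical and much simpler than the real logger.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

final class NaiveCallStackSketch {
    private final Deque<String> stack = new ArrayDeque<>();

    /** A CALL was executed: remember the callee. */
    void onCall(String function) {
        stack.push(function);
    }

    /** A RET was executed: forget the most recent call. */
    void onReturn() {
        if (!stack.isEmpty()) {
            stack.pop();
        }
    }

    /** Current stack as text, outermost call first. */
    String dump() {
        List<String> frames = new ArrayList<>();
        for (Iterator<String> it = stack.descendingIterator(); it.hasNext(); ) {
            frames.add(it.next());
        }
        return String.join(" > ", frames);
    }

    public static void main(String[] args) {
        NaiveCallStackSketch logger = new NaiveCallStackSketch();
        logger.onCall("task1.fn1");
        logger.onCall("task1.fn2");
        // A task switch happens here through a plain jump, so the logger sees nothing.
        logger.onCall("task2.fn1");
        System.out.println(logger.dump()); // task1.fn1 > task1.fn2 > task2.fn1
    }
}
```

Since the switch itself is invisible to the logger, task2.fn1 ends up displayed as if task1.fn2 had called it.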

Model
Each line of disassembled code is kept by the Emulator in a Statement object (a Statement is an Instruction (opcode) plus its operands, if any). By always keeping these "high-level" objects instead of the String (text) form, further processing is possible.

To convert a Statement to String, you first decode it using "statement.formatOperandsAndComment(context, false, outputOptions)". Then you perform the conversion to String using "toString(outputOptions)" (output options are normally "prefs.getOutputOptions(chip)").

Doing the reverse (parsing text to Statement) is not done currently. It will have to be done one day if we want to do assembly, but for the time being, working with Statement objects is much better.

The key class is CodeStructure (there's an array codeStructure[2] in EmulationFramework - one for each CPU). If it's not null, then disassembly with the 'structure' option has occurred, and it contains everything you need to browse the code:
 * "statements" is a map by which you can find the statement at a given address
 * "labels" is a map by which you can find the defined label at a given address
 * "functions" is a map by which you can find the function starting at a given address.
 * "returns" is a map by which you can find the start of the function corresponding to a given "return"

All those maps are sorted by address (TreeMap class), so codeStructure has methods like "getAddressOfStatementBefore(address)" or "getStatementEntryAfter(address)" to navigate the code.
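
Navigation methods of that kind map naturally onto TreeMap's own lookups. The sketch below is an assumption about how they could be written, not a copy of CodeStructure: statements are plain Strings here, and the unsigned comparator is my choice so that addresses such as 0xBFC00000 sort after low addresses (a TreeMap without a comparator orders Integer keys as signed values).

```java
import java.util.Map;
import java.util.TreeMap;

final class CodeStructureSketch {
    // Address -> statement text, ordered by unsigned address.
    private final TreeMap<Integer, String> statements = new TreeMap<>(Integer::compareUnsigned);

    void putStatement(int address, String text) {
        statements.put(address, text);
    }

    /** Address of the closest statement strictly before the given address, or null if none. */
    Integer getAddressOfStatementBefore(int address) {
        return statements.lowerKey(address);
    }

    /** Closest statement strictly after the given address, or null if none. */
    Map.Entry<Integer, String> getStatementEntryAfter(int address) {
        return statements.higherEntry(address);
    }

    public static void main(String[] args) {
        CodeStructureSketch code = new CodeStructureSketch();
        code.putStatement(0xBFC00000, "statement A"); // placeholder texts, not real disassembly
        code.putStatement(0xBFC00004, "statement B");
        code.putStatement(0xBFC00008, "statement C");

        System.out.println(Integer.toHexString(code.getAddressOfStatementBefore(0xBFC00004))); // bfc00000
        System.out.println(code.getStatementEntryAfter(0xBFC00004).getValue());                // statement C
    }
}
```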

The second most important structure is the "Function" class itself, which contains:
 * the list of Statements (the function listing)
 * the list of jumps starting from this function
 * the list of calls to other functions from this function (the "down" arrows in the graph)
 * the calledBy map of calls made to this function by other functions (the "up" arrows in the graph)

EEPROM
All changes to EEPROM are saved in Emulator preference file, and reloaded upon restart if the corresponding preference is set. In the prefs file, it is serialized in plain XML format by the XStream library: ... /f3+/v/+/uoAIwBh/a7//BsaGRkaGhkYGBkaGRkZGgECAADAAAEAAAAAAAMAgADAAACQAAAAAAAA AAAAAAAA+BkKCR0BLAAAF7MqjAADAAAAAAgBAAB2nwAAAAAAvgAAAwAAAAAAFAABLAEBFxgAAAAA AACkpKSkpGAAAACAAgADAFoAGB4AAAADAwAAAAAAAQABAXAAAAABAwAAAAQDAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAJKACc AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABQJGK8AApZmAAAAAAAAAACHTEGDM0GEM0GBM0GE M0GCU0GBM0GBM0GCNEGCNEGDNEGDNEGCNEGCNEGDNEGCNEGDNEGDNEGDNEGDNEGCNEGBNEEAAAAA AAAAAAAAAAAAAAAAIwAABK0FHQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMAAAAAAAAAAA=  It is base64 encoded. Just remove "/.../" part. So above you see first bytes of EEPROM changed at offset +0.
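
To look at the bytes behind such a value, the standard java.util.Base64 class is enough. A minimal sketch, assuming the text is copied as it appears above: the MIME decoder is used because it skips the spaces and line breaks between the groups. The class name EepromBase64Sketch is hypothetical, and the sample string is just the first 16 characters of the value shown.

```java
import java.util.Base64;

final class EepromBase64Sketch {
    /** Decodes Base64 text, ignoring any whitespace between groups. */
    static byte[] decode(String base64Text) {
        return Base64.getMimeDecoder().decode(base64Text);
    }

    /** Formats bytes as two-digit hex values, using the "& 0xFF" trick to avoid sign extension. */
    static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02X ", b & 0xFF));
        }
        return hex.toString().trim();
    }

    public static void main(String[] args) {
        byte[] firstBytes = decode("/f3+/v/+/uoAIwBh");
        System.out.println(toHex(firstBytes)); // FD FD FE FE FF FE FE EA 00 23 00 61
    }
}
```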