Saturday, August 30, 2008

Image Loading in Potato

The Squeak Image Format
In order to understand the image loading process we first take a look at the image concept and the image format.
The concept of an image might sound strange for Java programmers, but has its advantages. Instead of having multiple class files containing algorithms and type description, everything resides in one file - the image. But the image is not just one mega class file, it is a snapshot of the memory of a running application. Thus it also contains all objects and data of the running application.

The picture shows the structure of a Squeak Image containing the objects of the application (blue) as well as meta data (green).



The objective of the image loading process is to read the bytes from the image file and reconstruct the objects they represent. The JSqueak approach to accomplish this involved a lot of handwork for the byte handling. This is now done in a more simple (and more Java-like) way by using the tools available in the Java class library, e.g., the FileCachedImageInputStream class. (Despite the word "Image" in its name this is a class from the Java library and has nothing to do with Squeak images in general.)

Handling Byte Order
When a Squeak image is saved to a file, it will either be in the big-endian or little-endian format, depending on the architecture. Similarly, when loading the image file, the byte order may has to be converted. Reading individual 32-bit words can be done with the help of the class FileCacheImageInputStream. It provides the possibility to adjust the byte order automatically to the big-endian byte order that is used as Java's default byte order.

Here you can see the detection of the image byte order with the help of the Java library:
int MAGIC_NUMBER = 6502;

ByteOrder detectAndSetEndianess() {
ByteOrder result = null;
try {
int firstWordinImage = super.readInt();

if (firstWordinImage == MAGIC_NUMBER) {
result = ByteOrder.BIG_ENDIAN; // Java default
} else if (Integer.reverseBytes(
firstWordinImage) == MAGIC_NUMBER) {
result = ByteOrder.LITTLE_ENDIAN;
}
// ...

super.seek(0); // reset stream
} catch (IOException ex) { ... }
return result;
}

The first word in the image file is compared to a fixed value: 6502. In the original version of Potato, these values were created from four bytes manually, after which their order was inverted manually, too.

In particular, when reading the image, one has to take into account that, for certain binary data, such as byte code and strings, the byte order must not be inverted during byte order conversion. Since the adjustment to the Java byte order, for reasons of simplicity, is done initially for all content,the byte order must be fixed later. This is done in the method decodeBytes of the class SqueakObject. Whether or not such a conversion is required for a particular object can be decided on the base of the format field of the SqueakObject-Header. This recovery step is required when the value of the field is greater than or equal to eight.

Extraction of Objects
In the course of studying the image reading procedure for extracting the individual elements from the Squeak image file we revised and simplified the code. The classes SqueakImageInputStream, SqueakImageHeader and SqueakObjectHeader were introduced.


The class SqueakImageInputStream has methods for reading complex data structures from the image file. The method readImageHeader reads all values from the image header and the methods readSqueakObject and readSqueakObjectHeader read complete objects from the image.


Squeak Object Layout
A Squeak object is represented by a header describing the object and a body part containing the actual object data. The header may consist of 1 to 3 header words, where the rightmost (mandatory) header contains information on the format of the object.
The header of a Squeak object consists of three fields. The first two fields are optional and indicate the size or the class identification in the case that these values are too big to be stored in the small standard header field. We realize this via a switch statement, where the case blocks are not ending with a break statement for the optional headers. Therefore the following header fields are read, too.

The processing of the different optional headers:
switch (headerType) {
case HeaderTypeSizeAndClass:
readSizeHeader(currentHeaderWord);
currentHeaderWord = imageInputStream.readInt();
case HeaderTypeClass:
readClassHeader(currentHeaderWord);
currentHeaderWord = imageInputStream.readInt();
case HeaderTypeShort:
readBaseHeader(currentHeaderWord);
break;
default:
throw new RuntimeException("unknown header");
}

In order to deal with the complex object header structure we introduced the class SqueakObjectHeader. The class contains fields for all header elements (e.g., object size and format) as well as utility methods for reading the header directly from a SqueakImageInputStream. After reading the object header, the rest of the object is read from the file and decoded into a SqueakObject instance according to the type indicated by the form field in the header.

Sunday, July 27, 2008

Large Integer Handling in Potato

Different Kinds of Squeak Objects

In Squeak, there exist different kinds of objects that differ in how their contents are addressed. An object can, for example, hold references to other objects that are addressed by names that are mapped to slot indices. Most "ordinary" objects are shaped this way. An object can also hold elements (e.g., machine words, integers, or bytes) that are addressed by slot index. Strings and arrays are an example of this: each character or array element is stored at a given position. A third possibility is for an object to hold both references and arrayed elements; e.g., CompiledMethod instances store the literals used in the corresponding method as references, and the bytecodes representing the method code as an indexable byte array.

The format field in an object's header controls the nature of the object's contents. For instance, a value of 12 and greater implies that the object is a CompiledMethod. Depending on the format field's value, the virtual machine interprets the contiguous memory area representing the object in different ways.

In JSqueak, and, consequently, in Potato, it is not straightforwardly possible to simply interpret a collection of slots in such strongly different ways. To the end of supporting all kinds of Squeak objects nevertheless, any Potato object holds two references that are used to store different kinds of data:

class SqueakObject {
...
public Object bits;
...
public Object[] pointers;
...
}


The pointers field is used to store pointers of all kinds. The bits field is used to store indexable binary data. Note that its type is Object: that way, it can store arrays of int values (e.g., for machine-word arrays) and arrays of byte values (e.g., for strings) equally well.

Representing Large Numbers

Squeak features two classes for representing integers that do not fit the 30+1-bit SmallInteger space: LargePositiveInteger and LargeNegativeInteger. Both share the property that they store, in the bits field, a byte array whose elements represent the absolute value of the corresponding number - the signum of the number is indeed determined by the object's type. A Large****tiveInteger's byte array can have any length that is required to represent the number in question. The first array element is the least significant byte of the number's binary representation.

The original JSqueak did not feature primitive implementations for arithmetic operations on Large****tiveIntegers. Instead, in-image functionality would be executed for each such operation. As each and every single bytecode of such in-image functionality has to be evaluated by the interpreter, the lack of primitives for Large****tiveInteger arithmetics most certainly implies a slowdown. This seems unnecessary given that Java, the platform on which JSqueak and Potato are implemented, has strong support for large integers in the form of the java.math.BigInteger class.

Since Large****tiveInteger objects are of a kind that only stores indexed binary data, the pointers reference is unused in such objects. The pointers array hence looks like the natural place to cache a Java BigInteger object shadowing the Squeak Large****tiveInteger instance in question. The concept of shadows was also used in the SPy project, an implementation of the Squeak VM using the PyPy infrastructure. Potato maintains, for Large****tiveInteger objects, a pointers array with size 1, which contains exactly one object: a reference to the shadow BigInteger representing the respective Squeak object.

The large integer primitives implemented in Potato create shadows on demand, i.e., the first time they are actually required to perform an arithmetic operation, for Large****tiveInteger instances that were stored in the image. All Large****tiveIntegers created by the Potato VM are equipped with shadows right away.

Creating a BigInteger shadowing a Large****tiveInteger involves reversing the byte array stored in the Squeak object as Java's BigInteger requires a big-endian representation of the number in question. This functionality is implemented in the potato.objects.LargeInteger class.

Example: Adding Large Integers

Here is how a typical implementation of a large integer primitive looks. The example represents the primitive for adding two large integers; most large integer primitives look very similar.

boolean primitiveLargeIntegerAdd() {
return pop2andPushPossiblyCoercedBigIfOK(
getBig((SqueakObject)stack.stackValue(1)).add(
coerceToBig(stack.stackValue(0))));
}


The two objects in question can be found on the stack. The object at index 1 is self, which is guaranteed to be a Large****tiveInteger. The object at index 0 is the second (right-hand) parameter to the operation; it may be something entirely different, e.g., a SmallInteger, or even an illegal object.

The getBig() method interprets its parameter as a Large****tiveInteger and returns its Java BigInteger representation. This method has a side effect: if the Squeak object does not already hold a cached BigInteger shadow, it is created and stored in the Squeak object. This is arguably not very good style and may be changed in the future.

The coerceToBig() method checks whether the passed object is a SmallInteger and, if so, returns a BigInteger representation thereof. Otherwise, it falls back to getBig().

Having obtained two BigInteger instances, the two are added using java.math.BigInteger's add() method. The result is passed to pop2andPushPossiblyCoercedBigIfOK(), which checks whether the result is actually a Large****tiveInteger or whether it can be coerced to SmallInteger instead. Eventually, the two parameters are popped off the stack, and the correct result object is pushed onto it. If it is a Large****tiveInteger, a shadow BigInteger is created right away and cached in it.

More on Shadows

The shadow concept requires some care. While most primitives yield new Large****tiveInteger objects, two of them deserve special attention. These primitives are #digitAt:put: and #replaceFrom:to:with:startingAt: - they both directly access the byte array stored in the bits field of Large****tiveInteger instances. In case such a primitive is executed, the shadow BigInteger must be updated accordingly, otherwise the two representations - the Squeak one in the byte array and the Java one in the BigInteger stored in the pointers array - are out of sync, which implies all kinds of trouble for the arithmetic primitives that rely on the cached BigInteger.

To overcome this issue, the implementations of said two primitives check whether the object on which they operate is a Large****tiveInteger and update the shadow BigInteger if necessary.

This important issue was pointed out by Carl Friedrich Bolz.

Results

Some superficial benchmarks were implemented to assess the performance of two pieces of functionality: the creation of successive LargePositiveIntegers, and adding up a large number of such objects.

The first benchmark creates 1,000,000 LargePositiveIntegers by successively adding 1. It exhibits an almost ninefold speedup over Potato without the LargeInteger primitives.

The second benchmark fills an array with 500 elements with the sums of these elements. The first two slots contain the two smallest possible LargePositiveIntegers; slot 3, the sum of the first two slots; slot 4, the sum of the first three slots, and so forth. This involves huge amounts of large integer primitives being executed (albeit only addition). This benchmark exhibits a roughly 22-fold speedup.

While these results were obtained using rather simple and straightforward benchmarks that are likely not sustainable, they still show that using the capabilities of the underlying platform in Potato is worthwhile.

Besides, it was fun implementing them.

Status

As of this writing, the large integer support can be found in the LargeInteger branch of the Potato SVN repository at SourceForge. It will be merged into the trunk once it will have undergone some more testing, benchmarking, and refactoring.

Edit: Shadows were not originally used in PyPy, but in SPy. Thanks to Adrian Kuhn for pointing this out.

First Post

Welcome to the Potato blog!

The Potato virtual machine project is an implementation of the Squeak virtual machine in the Java programming language. Potato is a direct derivative of JSqueak, which was developed by Dan Ingalls at Sun.

Potato started out as a coursework assignment in a virtual machines course taught at the Software Architecture Group of the Hasso-Plattner-Institut in Potsdam, Germany. At the start of the assignment, JSqueak was able to run Squeak 2.2 mini images in black and white. Two students were given the assignment to implement colour support in particular, and support for more recent Squeak images in general. After some intense hacking, some of which involved Dan Ingalls, they managed to enable Potato to load Squeak 2.2 images with 32 bit colour depth.

The idea of the Potato project is to evaluate the possibilities of implementing a virtual machine for a high-level dynamic programming language such as Smalltalk on top of another high-level programming language virtual machine, namely the Java VM. This allows for various interesting optimisations, such as exploitation of the rich standard Java API.

This blog will feature posts by the Potato developers and will serve the purpose of providing introductions to the structure of Potato, and to interesting ongoing development in the project.