Friday, April 3, 2015

E-book version

The Kindle e-book version of the book has turned out to be pretty good. Most of the people who have read it have given it favourable reviews. In some places I thought that the diagrams could have been bigger. However, this is a limitation of the technology. With newer versions of the Kindle this should not be an issue.

I am increasingly turning towards e-books for other courses as well. They are readily available, convenient, and there is no fear of losing them!

Half a Semester with the Book

My computer architecture book, "Computer Organisation and Architecture", is now available through Amazon. I have received a lot of suggestions and corrections from different readers. I am grateful to all of them.

I have also been teaching a course around this book at IIT Delhi to predominantly second-year students. The experience has been good so far. I would like to share the experience of the first few assignments and exams.

Lab work is a very important part of this course. The first assignment that I gave to the students was to write SimpleRisc assembly code to multiply two matrices. They used the interpreter provided on the companion website of the book. Nobody reported any problem with the interpreter, and the students' code ran without problems for small matrices.

However, with large matrices there is a problem. Our interpreter only supports memory mapped on the stack, and I had not taught the students the concept of a heap. As a result, if we need to initialize and multiply large matrices, all of this needs to be done in a single function. This is not desirable. Consequently, I wrote two additional pages at the end of the assignment to explain the concept of a heap to the students. Recall that a heap is a dynamic memory region that is shared across functions. Programmers typically use the malloc function in C to allocate memory on the heap. We need to do something similar in our assembly programs.

I thus introduced an additional assembly directive called .alloc that tells the interpreter to allocate a given amount of memory at a given address. The interpreter can use any kind of underlying data structure to manage this memory. I further encouraged students to use .alloc statements as frequently as possible to allocate chunks of memory on the heap. Their assembly code, needless to say, has to be aware of the locations of these chunks of memory and use them appropriately. By using the .alloc directive properly, the students were able to make their assembly code far more modular and easier to write.
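To make the idea concrete, here is a minimal sketch in Python of what a directive like .alloc might do inside an interpreter. Everything in it is an assumption made for illustration: the class ToyHeap, the helpers init_matrix and multiply, word addressing, and the overlap check are hypothetical and are not taken from the actual SimpleRisc interpreter.

```python
class ToyHeap:
    """Hypothetical model of memory managed by an .alloc-like directive."""

    def __init__(self):
        # Maps a base address to the list of words stored in that region.
        self.regions = {}

    def alloc(self, base, num_words):
        """Reserve num_words zeroed words starting at address base."""
        for start, words in self.regions.items():
            if base < start + len(words) and start < base + num_words:
                raise ValueError("region overlaps an existing allocation")
        self.regions[base] = [0] * num_words

    def _locate(self, addr):
        # Find the region that contains addr and the offset inside it.
        for start, words in self.regions.items():
            if start <= addr < start + len(words):
                return words, addr - start
        raise IndexError("address %d is not allocated" % addr)

    def load(self, addr):
        words, offset = self._locate(addr)
        return words[offset]

    def store(self, addr, value):
        words, offset = self._locate(addr)
        words[offset] = value


def init_matrix(heap, base, values):
    """Write a matrix (row-major, one word per element) at address base."""
    for idx, value in enumerate(values):
        heap.store(base + idx, value)


def multiply(heap, a, b, c, n):
    """Multiply the n x n matrices at addresses a and b into address c."""
    for i in range(n):
        for j in range(n):
            total = 0
            for k in range(n):
                total += heap.load(a + i * n + k) * heap.load(b + k * n + j)
            heap.store(c + i * n + j, total)
```

The point of the sketch is that init_matrix and multiply are separate functions that only exchange addresses, which is the kind of modularity a shared heap makes possible.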

The second part of the assignment was tricky. The students were told to optimize the assembly code as much as possible. They were supposed to use advanced tricks such as loop unrolling, constant folding, and strength reduction to speed up their code. These optimizations sped up the assembly code by roughly 2-3X.
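As an illustration of two of these tricks, the sketch below shows the inner dot product of a matrix multiplication before and after strength reduction and loop unrolling. It is written in Python rather than SimpleRisc assembly, so it shows only the shape of the transformation and makes no claim about speed; the function names and the unrolling factor of 4 are assumptions chosen for the example. Matrices are stored as flat row-major lists.

```python
def dot_baseline(a, b, i, j, n):
    """Row i of a times column j of b, recomputing both indices each time."""
    total = 0
    for k in range(n):
        total += a[i * n + k] * b[k * n + j]  # two multiplies just for indexing
    return total


def dot_optimized(a, b, i, j, n):
    """Same result, with strength reduction and 4-way loop unrolling."""
    total = 0
    pa = i * n  # running index into row i of a, advances by 1 per element
    pb = j      # running index into column j of b, advances by n per element
    n2, n3, n4 = 2 * n, 3 * n, 4 * n  # strides computed once, outside the loop
    k = 0
    # Unrolled body: four multiply-accumulate steps per loop iteration.
    while k + 4 <= n:
        total += (a[pa] * b[pb]
                  + a[pa + 1] * b[pb + n]
                  + a[pa + 2] * b[pb + n2]
                  + a[pa + 3] * b[pb + n3])
        pa += 4
        pb += n4
        k += 4
    # Remainder loop for the elements left over when n is not a multiple of 4.
    while k < n:
        total += a[pa] * b[pb]
        pa += 1
        pb += n
        k += 1
    return total
```

In assembly, the index multiplications in the baseline become additions to a running pointer, and the unrolled body executes fewer compare and branch instructions per element.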

For some of the students, who were not able to do the first assignment, I gave an additional assignment, which asked them to multiply two matrices using the blocking method. The students haven't learnt about the memory system yet. Nevertheless, they found blocking interesting.
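For readers unfamiliar with blocking (also called tiling), here is a minimal sketch in Python. The function names and the default block size of 2 are assumptions made for illustration. The idea is to multiply the matrices one small sub-block at a time, so that the data being worked on stays small.

```python
def matmul_naive(a, b, n):
    """Reference triple-loop multiplication of two n x n matrices."""
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_blocked(a, b, n, block=2):
    """Multiply two n x n matrices one block x block tile at a time."""
    c = [[0] * n for _ in range(n)]
    # The three outer loops pick a tile of c, and matching tiles of a and b.
    for ii in range(0, n, block):
        for jj in range(0, n, block):
            for kk in range(0, n, block):
                # The three inner loops multiply the two small tiles.
                for i in range(ii, min(ii + block, n)):
                    for j in range(jj, min(jj + block, n)):
                        total = c[i][j]
                        for k in range(kk, min(kk + block, n)):
                            total += a[i][k] * b[k][j]
                        c[i][j] = total
    return c
```

The min(...) bounds handle matrix sizes that are not a multiple of the block size. The result is identical to the naive version; only the order in which elements are visited changes.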

In the second assignment, we looked at circuit design. It was about designing a divider (both restoring and non-restoring) using Logisim. Note that Logisim is a simple tool; still, students need some amount of training to understand all the features of Logisim and to use it properly. One of my TAs arranged such a training session and it was very well received.
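The two division algorithms can be sketched in software before building them as circuits. The Python functions below are an illustrative model for unsigned operands, not the Logisim design used in the course; the function names and the bits parameter are assumptions. Each loop iteration corresponds to one shift-and-subtract step of the hardware.

```python
def restoring_divide(dividend, divisor, bits=8):
    """Unsigned restoring division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    mask = (1 << bits) - 1
    rem = 0
    quo = dividend & mask
    for _ in range(bits):
        # Shift the (rem, quo) register pair left by one bit.
        rem = (rem << 1) | ((quo >> (bits - 1)) & 1)
        quo = (quo << 1) & mask
        rem -= divisor            # trial subtraction
        if rem < 0:
            rem += divisor        # restore: the subtraction did not fit
        else:
            quo |= 1              # it fit: this quotient bit is 1
    return quo, rem


def nonrestoring_divide(dividend, divisor, bits=8):
    """Unsigned non-restoring division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    mask = (1 << bits) - 1
    rem = 0
    quo = dividend & mask
    for _ in range(bits):
        msb = (quo >> (bits - 1)) & 1
        quo = (quo << 1) & mask
        # Subtract if the partial remainder is non-negative, otherwise add.
        if rem >= 0:
            rem = 2 * rem + msb - divisor
        else:
            rem = 2 * rem + msb + divisor
        if rem >= 0:
            quo |= 1
    # A single correction at the end replaces the per-step restore.
    if rem < 0:
        rem += divisor
    return quo, rem
```

The non-restoring version never undoes a subtraction inside the loop. It lets the partial remainder go negative and compensates by adding the divisor in the next step, with one final correction at the end.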

After this, the students started with their assignment. I did not see any problems. After the initial Logisim training, students were able to build a divider that divided two unsigned 8-bit numbers. In this particular assignment, we went for automated grading. We built a Logisim template file, which the students had to take and modify. They needed to insert their circuit at a given place. This allowed us to grade all the assignments instantaneously using a script.

The last assignment is to implement a processor with pipelining. Students can start with the processor that is available on the course website. I will discuss that effort further after all the assignments are submitted.