Thesis Part II

So after a “small” hiatus, I’ve gone back to work on the thesis. First, I’ll begin with additional details that were omitted in the previous post, then discuss the selection and installation of a real-time OS on the Raspberry Pi, which is being used as the eXternal Benchmarking Host (XBH).

The secondary goal of having the power measurement run fast enough to perform power analysis for side channel attacks tasks was shelved. The hardware interfaces on the Pi are not capable of this, and analog to digital converters capable of this task were expensive and/or did not provide very high resolution. In addition, target devices would have to be stripped of capacitors to provide meaningful data. Finally this was an example of scope creep that did not fit with the primary goal of this project, which is low cost automated power and performance benchmarking of cryptographic algorithms.

To measure power, we exploit Ohm’s Law (V=IR) and insert a resistor with a known value(R) into the power supply and measure the voltage drop (V) across the resistor. From this, we can derive the current (I=V/R). Using Kirchoff’s Law we know that current flowing into the resistor is the same as current flowing out of the resistor and into the target device. The power can be derived by multiplying the current by the voltage after the resistor, relative to ground.

In order to measure voltage, we need a sensor. Analog to digital converters (ADC) usually take as input a voltage signal, so they can be used directly as a sensor, given a proper biasing network of resistors to bring the voltage down to a usable level. There are multiple possible interfaces from the converter to the pi, including general purpose digital IO pins (GPIO), SPI, and I²C, and asynchronous serial. For our purpose, SPI and I²C are preferable, as they reduce the wiring that is required and are faster than asynchronous serial.

The sensor we used is the INA219, which is a delta-sigma ADC specifically for the purpose of providing current, voltage, and power measurements by performing all the calculations above in-chip, and is a common power sensor in motherboards, although a generic ADC could have been used with calculations done inside the Pi with an appropriate resistor network for scaling. The INA219 interfaces using I²C and is capable of 12-bit resolution every 532 microseconds or 9-bit every 84 microseconds, and various compromises between resolution and sampling rate in between.

As we want to perform measurements at a consistent rate, a realtime OS is desirable. The stock Linux kernel that comes w/ Raspbian (a version of Debian Linux targeted towards the Pi) with PREEMPT (which is considered soft-realtime) should meet deadlines for sampling the power sensor as frequently as possible most of the time, but this is not guaranteed, which means the occasional dropped result. A hard-realtime OS is thus desirable.

For this purpose, three hard-realtime OS options were evaluated for the Raspberry Pi, as well as looking at programming the pi directly. 

The first and simplest-looking option was Linux with the PREEMPT_RT patches applied, which makes the kernel itself fully preemptible by realtime tasks. This has the benefit of having a likelyhood of eventually being merged into the mainline Linux kernel, removing the need to patch in the future. Unfortunately, this patchset breaks certain drivers, such as the one for MMC (SD card), which is problematic, as the SD card holds the OS. Workarounds that were used was to run the operating system off of a USB stick attached to the Pi. This was considered inelegant and undesirable.

The second option for a realtime OS was to use Xenomai, which virtualizes Linux and a hard realtime microkernel for realtime tasks. Processes can migrate between these two domains and written to run in userspace using the POSIX API like Linux with PREEMPT_RT, although other interfaces (known as “skins”) allow the use of APIs belonging to VxWorks and other RTOSes. The advantage of using Xenomai was that the lack of issues with drivers and somewhat greater performance.

The third option was to use ChibiOS. This had the advantage of probably having less overhead than Linux. However, driver support is limited and much functionality would have to be written from scratch (i.e. communications over ethernet). Interfacing to the Pi

Programming directly without an operating system (“baremetal”) has the same problems as using ChibiOS, with more functionality having to be dealt with manually.

Out of all the options, Xenomai seemed to be the most mature and well supported of the available hard-realtime options. It does not have the driver problems of PREEMPT_RT, and as it runs Linux, utilities such as SSH, compilers, and a web server can be run on the Pi, making the XBH self-hosting to a degree.

The specific version of Xenomai being used is adc3d for Linux kernel 3.5.7. There are certain compile issues due to missing dma code from commit 64ccc that was incompletely backported from the 3.6 kernel. Additional patches are needed in order to get the debian package to cross compile and build properly on a ubuntu 13.04 system. These issues will be addressed in a future post and patches will be submitted to upstream after being cleaned up.

A feature that was not mentioned in the previous post was providing a power supply to the target device (eXternal Benchmarking Device [XBD]). As current must pass through the XBH to be measured, it makes sense to perform voltage regulation here. Target XBDs will generally be either 3.3 or 5V, and possibly 1.2V. This circuit will be discussed in a future post.


So other projects are winding down and I’ll finally have time to work on my thesis soon.

The topic of my thesis will be extending the eXternal Benchmarking eXtension (XBX) to support energy measurements, in addition to the current functionality of measuring speed and memory/rom usage. An additional goal is to implement this cheaply, i.e. without utilizing expensive oscilloscopes. This would facilitate performing many of measurements in parallel on multiple platforms.

With the proliferation of electronic devices everywhere into ever-smaller form factors, energy consumption of algorithms and their implementations is an important part in deciding what to use for a particular application. 

The XBX project is itself (as the name would indicate) an extension to the SUPERCOP project. The SUPERCOP project aims to benchmark cryptographic algorithms on self-hosting platforms such as PCs and embedded Linux. The XBX project is targeted towards platforms which are incapable of hosting their own compiler, such as microcontrollers and smaller embedded Linux platforms.

XBX currently consists of the eXternal Benchmark Harness (XBH) based off of a AVR-based home automation board, which receives code from a PC over a network and programs the XBD,, as well as records timing signals and other information emitted by the XBD,and the eXternal Benchmark Device (XBD), which is the microcontroller to be measured. We intend to replace the AVR-based XBH platform with a more capable ARM-based Raspberry Pi, which would allow frequent polling of power measurements over time in addition to the current functionality.

If the polling rate is fast enough, the platform could also be used to perform side channel attacks, i.e. deriving the key via power measurements and testing particular implementations for resistance against this.

I’ll add further updates as I progress along with this project.

Brute Forcing

Type any number into a calculator and then divide it by 7, 11 and 13. Why do the first six decimal places always sum to 27? #mathspuzzle

So @musegarden retweeted the above and I got interested. Technically the above isn’t correct, as it is only applicable to numbers that do not factor into 7, 11, or 13- 1001 divided by any of these has the sum equal to 0. But then this comes to mind.

What other numbers can be divided into anything not a factor and have a consistent sum of digits to a certain point, after the decimal point? So I wrote a program to brute force this search, with numbers (i) up to 1001, and between 2 to 20 digits to be summed (x).

It should be necessary to only test 1->(i-1) as divisible, as the digits after the decimal point repeat themselves after i digits.

The results:

i: 2, sumX: 5, x: 2
i: 11, sumX: 9, x: 2
i: 2, sumX: 5, x: 3
i: 2, sumX: 5, x: 4
i: 11, sumX: 18, x: 4
i: 101, sumX: 18, x: 4
i: 2, sumX: 5, x: 5
i: 2, sumX: 5, x: 6
i: 7, sumX: 27, x: 6
i: 11, sumX: 27, x: 6
i: 13, sumX: 27, x: 6
i: 77, sumX: 27, x: 6
i: 91, sumX: 27, x: 6
i: 143, sumX: 27, x: 6
i: 2, sumX: 5, x: 7
i: 2, sumX: 5, x: 8
i: 11, sumX: 36, x: 8
i: 73, sumX: 36, x: 8
i: 101, sumX: 36, x: 8
i: 137, sumX: 36, x: 8
i: 2, sumX: 5, x: 9
i: 2, sumX: 5, x: 10
i: 11, sumX: 45, x: 10
i: 2, sumX: 5, x: 11
i: 2, sumX: 5, x: 12
i: 7, sumX: 54, x: 12
i: 11, sumX: 54, x: 12
i: 13, sumX: 54, x: 12
i: 77, sumX: 54, x: 12
i: 91, sumX: 54, x: 12
i: 101, sumX: 54, x: 12
i: 143, sumX: 54, x: 12
i: 2, sumX: 5, x: 13
i: 2, sumX: 5, x: 14
i: 11, sumX: 63, x: 14
i: 2, sumX: 5, x: 15
i: 2, sumX: 5, x: 16
i: 2, sumX: 5, x: 17
i: 2, sumX: 5, x: 18
i: 2, sumX: 5, x: 19

sumX is the sum of all digits up to the x’th digit after the decimal. i is the number being tested.

The program:

#! /usr/bin/python
import math

#messy, maybe better to use string manipulation?
def sumXdec(value,x):
    s = 0
    for i in range(1,x+1):
        r = value*(10**i) 
        r -= math.trunc(value*(10**(i-1)))*10
        r = math.trunc(r)
        s += r
    return s

for x in range(2,20):
    for i in range(1,1001):
        sumX = sumXdec(1./i,x)
        valid = True
        for j in range(1,i):
            if sumX != sumXdec(j*1./i,x):
                valid = False
        if valid and sumX != 0:
            print 'i: '+str(i)+', sumX: '+str(sumX)+', x: '+str(x)

Automated solver to SetGame

Possible set members are hardcoded for Nov 7

Not the most efficient since it makes redundant computations, as it compares permutations instead of combinations. It’d probably be more efficient to precompute the possible combinations, then do the validity check, instead of sorting the tuple and using a set to uniqify the results.

Of course, this is a silly game and the search space is small enough that efficiency doesn’t matter. Code size does.

#! /usr/bin/python
p = "purple"
g = "green"
r = "red"
d = "diamond"
s = "squiggly"
o = "oval"
so = "solid"
l = "lined"
e = "empty"
setList = [
output = set()
for i in setList:
    for j in setList:
        for k in setList:
            if (i == j or j == k or i == k):
            valid = True 
            for l in range(0,4):
                if not ( (i[l] != j[l] and k[l] != j[l] and k[l] != i[l]) 
                        or (i[l] == j[l] and k[l] == j[l] and k[l] == i[l])
                    valid = False
            if valid:
                data = [i,j,k]
                data.sort(key=lambda tup: tup[0])
                data.sort(key=lambda tup: tup[1])
                data.sort(key=lambda tup: tup[2])

for i in output:
    print i

Android and troubles with the NDK

I’m currently working on implementing and benchmarking erasure codes on Android devices. As most implementations are in C/C++, the NDK must be used in order to use these without translation to Java, which would be a significant effort.

JNI must be used in order for the standard Android Java code to communicate with the erasure codes implemented in C. As writing wrappers for various types and data structures is tedious, I am using SWIG to automate this process.

The Eclipse environment for Android development (ADT) appears to be somewhat buggy and feature-incomplete for native code development. It has the following problems:

  • Partially incompatible with the latest eclipse. The current version of the ADT thows errors with C code due to not finding the header files in the appropriate location. See
  • It is not possible to debug JNI code that has been separated out into its own library project.
  • Statically loading the JNI C library from the wrapper causes problems with debugging, causing the symbols to not be recognized. In order to get debugging to work, the C library has to be loaded with the main Android activity, and use of the functions must be delayed until the debugger has started. See