Some thoughts on microkernels

Systems research is really hard to understand because there are so many 'abstractions', and they mean quite different things in different contexts: kernel, process, IPC, and so on. There are some genuinely confusing concepts like the user-space kernel and the kernel thread. I have to say this field is somewhat self-satisfied, happily chasing its own dreams and "big picture" while much of the output is stuff that no one wants to read except paper chasers like me. However, I don't want to be stuck in this game for too long; I have to conquer these concepts by understanding them.

So what's a microkernel? A kernel is usually the middle layer: the privileged entity that manages hardware resources while serving user-space applications. But what makes this an 'entity'? And why do we need IPC in a microkernel? In a monolithic kernel there is a single kernel address space; all of the kernel code is organized there, and a plain function call (stack push, jump) reaches whatever code we want. The address space is central to any executable entity (a process, if that term still fits; in this sense the kernel is also a process, an event-driven one). Separate processes mean fault isolation, an amazing but intrinsic feature of the microkernel: if one user-space process takes a segmentation fault, other processes and the kernel are unaffected, since the faulting process cannot overwrite their data.

That's why we need IPC to communicate in a microkernel: shared memory and function calls are the natural ways to communicate only within the same address space; across address spaces, data has to be copied through a kernel-mediated channel.
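A minimal sketch of the contrast, using two host processes and a pipe as the kernel-mediated channel (names like `server` and `rpc_demo` are my own illustration, not any real microkernel's API): within one address space the "call" would be a function call, but across address spaces the request must be marshalled, copied, and replied to as a message.

```python
# Message-passing "RPC" between two address spaces, sketched with
# multiprocessing. The request is copied through a pipe instead of
# being reached by a direct function call.
import multiprocessing as mp

def server(conn):
    # The "server" process: receive one request message, send a reply.
    req = conn.recv()  # data is copied across address spaces here
    conn.send(("reply", req["payload"].upper()))
    conn.close()

def rpc_demo():
    parent, child = mp.Pipe()
    p = mp.Process(target=server, args=(child,))
    p.start()
    parent.send({"op": "echo", "payload": "hello"})  # marshalled request
    tag, data = parent.recv()                        # marshalled reply
    p.join()
    return tag, data
```

The `.recv()`/`.send()` pair is exactly the copy cost that makes microkernel IPC performance such a long-running research topic.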

So I am thinking about my research direction, possibly how to build a specialized unikernel for cloud applications? That might be interesting. I read about a Columbia OS course assignment: "create an x86 kernel from scratch optimized for a specific application". But we still need POSIX for compatibility; does that mean we can "emulate the interface" while freely doing our own internal design?

Performance isolation

Link

This paper talks about the lack of mechanisms to provide fairness to higher-level logical entities, such as users, groups of users, or groups of processes. I feel that indicates the need for cgroups in Linux.

The paper defines the concept of performance isolation:

If the resource requirements of an SPU are less than its allocated fraction of the machine, the SPU should see no degradation in performance, regardless of the load placed on the system by others

Another thing worth mentioning: what if there are idle resources in the machine? It would be nice if our processes were able to utilize those idle resources to improve their response time and throughput.

The paper also proposes a framework for understanding performance isolation and sharing. Since we want to reuse idle resources across entities (the so-called SPUs in the paper), we need to consider the following factors:

For isolation, we need counters to monitor the resource usage of each entity, and a mechanism to deal with entities that request more resources than they are entitled to (kill them, possibly).

For sharing, we need to be able to transfer idle resources to the entities that need them, and also to revoke those resources if the original SPU comes under load. The revocation cost is the key factor in sharing.
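The isolation half can be sketched as a per-entity usage counter with a hard cap; the `Meter` name and the refuse-on-overrun reaction are my own illustration (refusing the request is the mild form of the "kill them" option above):

```python
class Meter:
    """Per-SPU usage counter with a hard cap (illustrative only)."""
    def __init__(self, cap):
        self.cap = cap   # resources the entity is entitled to
        self.used = 0    # monitored usage so far

    def charge(self, amount):
        # Account for resource use; reject requests beyond entitlement.
        if self.used + amount > self.cap:
            raise RuntimeError("over entitlement")
        self.used += amount
```

The interesting policy questions all live in what replaces that `raise`: throttle, bill, borrow from an idle SPU, or kill.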

The framework tracks three levels for each SPU. The first level is the amount of resources that the SPU is entitled to initially; this level is decided by the division of system resources based on the sharing contract for the system. The second level is the amount of resources that the SPU is allowed to use currently. The third level is the amount of resources currently used by the SPU.

Sharing is implemented by changing the allowed level for SPUs based on resource requirements and availability. In a system under load where all SPUs are utilizing their share of the resources, all three levels will be at about the same value and no sharing will happen. At some point one or more SPUs may go idle or become under-utilized. Their used level will now be much less than their entitled level, indicating idle resources. The sharing policy can transfer some of these idle resources from the under-utilized SPUs to the others by increasing the value of the allowed level for the latter. When the SPUs want their resources back, the sharing policy lowers the allowed level of the borrowing SPUs, potentially back down to the entitled level.
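The three levels and the lend/revoke moves above can be written down as a toy model; the class and the simple transfer policy are my own illustration, not the paper's implementation:

```python
class SPU:
    """An SPU with the paper's three levels: entitled, allowed, used."""
    def __init__(self, name, entitled):
        self.name = name
        self.entitled = entitled   # share fixed by the sharing contract
        self.allowed = entitled    # what it may use right now
        self.used = 0              # what it is actually using

    def idle_surplus(self):
        # Resources this SPU is entitled to but not using.
        return max(self.entitled - self.used, 0)

def share(donor, borrower, amount):
    # Lend idle resources by raising the borrower's allowed level.
    amount = min(amount, donor.idle_surplus())
    donor.allowed -= amount
    borrower.allowed += amount
    return amount

def revoke(donor, borrower):
    # Donor is loaded again: pull allowed back toward entitled.
    lent = borrower.allowed - borrower.entitled
    borrower.allowed -= lent
    donor.allowed += lent
```

Note that `revoke` is where the revocation cost hides: in a real system, lowering `allowed` below `used` forces preemption, which is exactly the expensive part the paper flags.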

The next task is revisiting the cgroup implementation.

A blog posting system

As a researcher you always need to learn new things, take notes, and look back constantly to gain new understanding. The same rule applies to reading papers: some interesting papers are very rich in meaning, and many devils hide in the details.

Remzi proposes an interesting idea in his famous OSTEP book: the mental model matters in systems research. The mental model in a systems paper is mainly procedural; by asking questions we get a systematic picture of how the system works. I also want to attach a design model to this mental model, one that asks why a particular design was chosen. That idea is motivated by Yiying's design-doc format, which lets us revisit design choices and understand the trade-offs.

I have to say a newbie in systems research (like me, of course!) has a lot of confusion the first time reading papers. So I will try an asynchronous way of writing blog posts (about the important things!). I write many notes, most of them gists of ideas; periodically I revisit these notes and rethink them to see what I forgot and what my new impressions are. Then I pick some of the interesting ideas and post them on my blog. It's a little like the LSM-tree background writing thread: regularly fetch notes from my "note pool" and write them to persistent storage.

Bitnami