
Linux Kernel titbits and debugging tricks


A collection of interesting facts about the Linux kernel, along with some debugging lessons I learnt from experience.

1. Loadable kernel modules are NOT loaded into the kernel's usual address range (at or above the kernel base), contrary to the general notion!
LKMs are allocated memory (vmalloc'ed) in a dedicated window just below the kernel base (on a 3G/1G split with the upper 1G for the kernel, the base is at 0xC000_0000) and above TASK_SIZE (the upper limit of user-space addresses). This window was 16MB in the 3.0.35 kernel I used, and it is the address space shared by all LKMs.
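
As a rough illustration of the arithmetic above, here is a minimal sketch that checks whether an address falls in the module window. The macro names mirror the ones the 32-bit ARM kernel uses, but the values are assumptions taken from the text (3G/1G split, 16MB window); configurations such as Thumb-2 or HIGHMEM may shrink or move the window, and `in_module_region()` is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: 32-bit ARM, 3G/1G split, 16MB module window. */
#define PAGE_OFFSET   0xC0000000u             /* kernel base */
#define SZ_16M        0x01000000u
#define MODULES_VADDR (PAGE_OFFSET - SZ_16M)  /* start of window: 0xBF000000 */
#define MODULES_END   PAGE_OFFSET             /* window ends at the kernel base */

/* Hypothetical helper: true if addr lies inside the module window. */
static bool in_module_region(uint32_t addr)
{
    return addr >= MODULES_VADDR && addr < MODULES_END;
}
```

On a running target, the load addresses listed in /proc/modules (readable as root) can be compared against this range.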

2. Requesting a threaded IRQ (request_threaded_irq()) lets us register a top-half IRQ handler (the quick-check handler) in the regular way, and also register a (threaded) bottom-half handler that can sleep!
   I always wondered why we could not simply schedule a workqueue worker from the top-half quick-check handler: workers can sleep too, so that would be akin to the threaded IRQ concept. I never got a proper answer anywhere, not even in the original article on threaded IRQs (http://lwn.net/Articles/302043/).
But here is my take on why a threaded IRQ might be the better design:
With a threaded IRQ, the kernel's core IRQ handling wakes a dedicated per-IRQ kernel thread as soon as the top half finishes, and that thread runs at real-time priority by default. Hence the bottom half is less vulnerable to scheduling vagaries than a workqueue worker, whose execution depends on ordinary scheduling delays.

3. On devices where DDR size is critical, allocate only as much space for DMA as is needed, i.e., size the DMA zone (ZONE_DMA) to a little over what your drivers actually use for DMA. This matters because memory set aside for DMA is not movable, and whenever the kernel is in dire need of pages it looks at the other zones before falling back to the DMA zone. A chunk of the DMA zone may therefore go unused by the kernel even though it is lying there free, leading to ugly consequences such as out-of-memory kills and thrashing.

4. ATAGS are a nice way to pass board, memory and kernel parameters from U-Boot to the kernel. ATAGS are essentially a list of tags, where each tag carries a header (its size and an ID) followed by the specific data to be passed to the kernel. We can define our own custom tag and append it to the list. ATAGS are usually placed at (dram_phys_start+0x100).
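
The tag-list idea can be sketched in plain user-space C. The header layout (size in 32-bit words including the header, then the tag ID), the ATAG_CORE/ATAG_MEM/ATAG_NONE values and the zero-size terminator follow the ARM tagged-list convention. `ATAG_MYBOARD`, its payload, the example list and the helper functions are made up for illustration, not taken from any real board.

```c
#include <stddef.h>
#include <stdint.h>

#define ATAG_NONE    0x00000000u
#define ATAG_CORE    0x54410001u
#define ATAG_MEM     0x54410002u
#define ATAG_MYBOARD 0x5441F001u  /* hypothetical custom tag ID */

struct tag_header {
    uint32_t size;  /* tag length in 32-bit words, header included */
    uint32_t tag;   /* tag identifier */
};

/* Step to the next tag; size counts 32-bit words from the tag start. */
static const struct tag_header *tag_next(const struct tag_header *t)
{
    return (const struct tag_header *)((const uint32_t *)t + t->size);
}

/* Pointer to the payload words that follow the two-word header. */
static const uint32_t *tag_payload(const struct tag_header *t)
{
    return (const uint32_t *)(t + 1);
}

/* Return the first tag with the given ID, or NULL if the list ends first. */
static const struct tag_header *find_tag(const uint32_t *list, uint32_t id)
{
    const struct tag_header *t = (const struct tag_header *)list;

    for (; t->size != 0; t = tag_next(t))
        if (t->tag == id)
            return t;
    return NULL;
}

/* Example list, laid out the way a bootloader would place it in memory. */
static const uint32_t example_atags[] = {
    5, ATAG_CORE, 0, 4096, 0,               /* core: flags, pagesize, rootdev */
    4, ATAG_MEM, 0x20000000u, 0x10000000u,  /* mem: size, start */
    3, ATAG_MYBOARD, 0x42,                  /* custom: one payload word */
    0, ATAG_NONE,                           /* terminator */
};
```

Calling `find_tag(example_atags, ATAG_MYBOARD)` walks past the core and memory tags and returns the custom one; on the kernel side a real custom tag would need a matching parser registered for its ID.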

5. It has probably been mentioned many times already, but container_of() is such a beautiful trick: given a pointer to a member embedded in a structure, the kernel uses it to get back to the enclosing (parent) structure, which is reminiscent of object-oriented programming and the this pointer of C++. Linked lists are implemented this way, and tons of drivers and core kernel code use this trick.
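
Here is a minimal user-space sketch of the trick. The macro is a simplified version (the kernel's own also type-checks the pointer), and `struct list_node`, `struct my_device`, `demo_dev` and `node_to_device()` are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Simplified container_of: subtract the member's offset from its address. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Bare-bones list node, standing in for the kernel's struct list_head. */
struct list_node {
    struct list_node *next;
};

/* Hypothetical driver-private structure with an embedded list node. */
struct my_device {
    int id;
    struct list_node node;
};

/* Recover the enclosing my_device from a pointer to its embedded node. */
static struct my_device *node_to_device(struct list_node *n)
{
    return container_of(n, struct my_device, node);
}

/* Sample instance for trying the helper out. */
static struct my_device demo_dev = { .id = 7 };
```

Code that only ever sees `&demo_dev.node`, as list-walking code does, can call `node_to_device()` to get back to `demo_dev` and its fields.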

6. Scheduling cleanup work (delegated either to a workqueue or to a kernel thread) as part of driver de-registration/exit leads to interesting results. During quick testing of a feature being switched on and off, which involves loading and unloading the associated driver, weird driver crashes and errors can show up. These are mainly caused by timing variations in how the kernel schedules workqueues and threads. In this case, it's possible that the work was queued, scheduled and executed AFTER the driver got unloaded, leaving an incomplete and messed-up state when the driver was loaded again. The exit path should therefore wait for (or cancel) any outstanding work before it returns, for example with flush_workqueue() or cancel_work_sync().

7. The concept of platform devices and drivers in the Linux kernel is a powerful way to write meaningful, well-structured kernel drivers. A platform device is a logical/hypothetical device or peripheral sitting on a logical 'platform bus', and the driver for such a device is a 'platform driver'. Typical examples are the devices or controllers inside an SoC that have a direct interconnect to the CPU core.
Say you want to power-control the UART peripheral, i.e., switch the PLL for some duration/activity so the UART runs at high speed, turn the UART off, etc. Hardware-wise, the UART controller sits within the SoC on a direct internal bus interconnect to the CPU core. No software driver is needed to drive this internal bus, so there is no hardware bus, in the driver-model sense, for our power control. Note that there will still be a UART driver (serial driver) for the physical UART controller/hw module within the SoC. We can treat the logical 'power controller' as a component external to the UART hw module. So, in software, we can imagine the 'UART power controller' as a 'platform device' on a logical platform bus connected to the actual UART hw module within the SoC. This then requires a 'platform driver' to control it.
And since this is platform-specific (you may not need power control at all, or may want to do it differently on different platforms), the name platform-* is apt.
(https://www.kernel.org/doc/Documentation/driver-model/platform.txt)


8. Debugging watchdogs was our team's constant nightmare. We spent about a year collectively reducing watchdog resets and improving overall system stability, and the time and effort we spent root-causing and fixing the various issues was immense! First of all, dynamic voltage and frequency scaling is hard to implement. Add flaky hardware components (did I say DDR?) to the equation. Debugging becomes a nightmare especially once watchdog occurrences have dropped and become more sporadic.
If you are part of the initial hardware design, at least make sure the warm reset feature is available (our Power Management IC didn't support it, so any instrumentation was lost during a watchdog reset).

9. Every kernel engineer and device driver writer should read this 101 on writing portable kernel and driver code (and maybe reread it several times; that applies to me too):
www.linuxjournal.com/article/5783
 

