Saturday, November 25, 2006

Suse 10.1 - is this what Vista should look like?

After seeing the XGL magic in Suse 10.1, I really must try Vista.
I installed Suse 10.1 over the weekend because I had heard of the XGL magic. XGL was not installed by default, so I used yast2 to install the right packages.

I could then run gnome-xgl-settings, but it couldn't enable XGL because my graphics card (NVIDIA GeForce FX5200) was not supported by the default driver. This post suggested that I download the driver from NVIDIA.

The package I downloaded from NVIDIA built a kernel module, installed the driver, and prompted me to log out and back in again, and voila, I had XGL.

[BTW, I couldn't run the installer while X was running, so I simply changed the default run level in /etc/inittab to 3, rebooted, and installed from the console]

Lots of fun using the 3D desktop!

Wednesday, November 15, 2006

debugging the linux kernel

I'm taking baby steps today, debugging the kernel.
This seems a very useful resource.

I tested the COM ports between the 2 machines using the instructions there and they work great.

A tiny missing step that would be useful to a newbie like me unfamiliar with patching the kernel:

Under "Applying the kgdb patch", you first need to change into the linux directory before executing the patch command. So between steps 1 and 2, there has to be a step:

cd ${BASE_DIR}/linux-2.6.7

Also, the path to the patches is missing a sub-directory. The patches are all unzipped into ${BASE_DIR}/patch-kgdb/linux-2.6.7-kgdb-2.2/ so the patch commands in steps 2 to 7 should read this way:

patch -p1 < ${BASE_DIR}/patch-kgdb/linux-2.6.7-kgdb-2.2/core-lite.patch

Snag 2:

I realized I needed the qt-devel package to run 'make xconfig', which is fully graphical. So I decided to run 'make menuconfig' instead (which I've done before, and which uses a text-based menu). Then I ran into this compiler error. After applying the simple patch (remove static from the declaration of 'current_menu'), I could proceed with the build. Now I had the menu where I could make the kernel selections. Progress!

Snag 3:

I realized that the 2.6.7 kernel was written for GCC 3 and does not compile with GCC 4 (which is what most newer machines have). I decided to be macho and 'fix' the kernel code to get the compiler to sing. Still going strong after 1 day...
This is a good article on the GCC 4.0 changes that affect Linux 2.6.7 as well.

Snag 4:

I managed to fix all the compiler and linker errors (they were due to the stricter way GCC 4.0 treats the inline and static keywords). I transferred the image over to the test machine and changed grub, but it wouldn't boot: there was a problem mounting the file system.

Upon investigating, I found that Linux needs the initrd image to boot (the initrd image loads the drivers into RAM). But to build the initrd image, you need to install the modules as well: mkinitrd, which builds the initrd image, takes the kernel version and looks inside /lib/modules/ for the drivers. Since I didn't have the modules for this kernel version installed on the test machine, I just gave mkinitrd a valid kernel version that was already on the box, and it successfully built me an initrd image. I moved it over to /boot, edited grub to point to it, and rebooted. No joy! It was still failing to mount.

Over the weekend, I was chatting with a good friend who is now hacking away in the Linux kernel (he was previously an engineer in the Windows kernel), and he said that this is probably because Linux checks the driver versions against the kernel. So I decided to install the modules on the dev machine, make a correct initrd image, and copy it over to the test machine. This time it actually booted!

Here is where you can read how to do a regular kernel compilation (for the modules install part).

Snag 5:

SIGSEGV in the kernel! Well, I was happy. We came this far, and here's my chance to look at some kernel code and see what's going wrong.

Here's the immediate output from gdb:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1]
0x00000000 in ?? ()
(gdb) bt
#0 0x00000000 in ?? ()
#1 0xc03051db in psmouse_interrupt (serio=0xc048cde0, data=250 '\372', flags=0,
regs=0x0) at drivers/input/mouse/psmouse-base.c:206


And here is line 206 from psmouse-base.c:

rc = psmouse->protocol_handler(psmouse, regs);

Frame #0 is at address 0x00000000, which means line 206 called through a null function pointer. So it seems that somehow the protocol handler for the mouse is not set. This is a USB mouse, so perhaps I forgot to set an option in make menuconfig...

hacking inside the grub prompt

Yesterday, I was setting up Linux on a new PC and went ahead with a custom partition table using Disk Druid. I didn't set up the /boot partition, and as a result the grub configuration file (/etc/grub.conf) was incorrect. At boot, grub stopped at the grub prompt (thus giving me a chance to temporarily correct the problem).

I found this article very helpful in telling grub what I wanted, so that it proceeded with finding the Linux image and loading it.

Incidentally, this machine had a SATA drive, and ata_piix used to complain and cause a kernel panic at boot. I removed the SSC pin (the spread spectrum clocking feature) and it booted fine after that.

sudo without password

You can execute root-level commands with sudo without typing a password by editing the /etc/sudoers file to have a line like this (if your user name is alice):

alice ALL=(ALL) NOPASSWD: ALL

The file is read-only. You edit this file with the following command (which opens the file in vi):

visudo -f /etc/sudoers

Thursday, November 09, 2006

gcc offsetof macro warning

Nathan Sidwell had this really neat trick that gets us around the gcc compiler warning on the usage of the offsetof macro.

Rather than using the offsetof macro, use the one below:

#define myoff(T,m) ((size_t)(&((T *)1)->m) - 1)

The trick is to use the address '1' rather than '0' (which the offsetof macro uses) and thus pacify gcc.

Whether gcc should issue this warning at all is hotly debated.