Linux on the ThinkPad T40


Updates


I recently bought a new laptop, an IBM ThinkPad T40 (model 2373-92U), featuring the new Pentium-M processor. This model is pretty cool, with great battery life and a very fast processor. The amount of L2 cache (1MB), compared to other Intel processors (128 or 512KB), certainly explains part of the speed gain for a relatively modest clock speed of 1.6 GHz.

Look 'n' feel

The T40 is a very slim machine. The cover borders around the LCD screen are asymmetric, which surprised me a bit, but this allowed the engineers to shave a few millimeters off the thickness of the laptop. The air outlet is located on the left of the machine, and I can feel the hot air being blown out when using AC power, but it slowly becomes cooler and quieter once the laptop is switched back to battery. When I boot Linux with AC power off, I see that the CPU speed is set to 600MHz instead of the regular 1600MHz, which probably explains why the machine doesn't need much cooling in this mode. I'll criticize the small plastic door covering the two PCMCIA slots. This little element is very fragile, and is heavily stressed when inserting and removing PCMCIA cards. The external laptop battery also has only two locking points, which seems a bit light compared to the weight of the battery itself. Even when locked, the battery still moves slightly.

Kernel compilation time comparison

I timed the compilation of a 2.4.21-pre7 kernel, with this configuration file, needed for my T40. The results are quite impressive :

T40, Intel Pentium-M, 1.6GHz, 1GB SDRAM PC2100 : 208.260u 9.620s 3:41.49 98.3% 0+0k 0+0io 864968pf+0w
Desktop, Intel P4 2.54GHz, 1GB SDRAM PC2700, 80GB IDE133 HDD : 257.490u 17.770s 4:44.05 96.9% 0+0k 0+0io 835051pf+0w
Desktop, AMD MP1900+, 1.5GB SDRAM PC2100, 34GB SCSI160 HDD : 312.930u 22.450s 5:37.99 99.2% 0+0k 0+0io 898531pf+0w
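
To put the three timings on a common scale, the elapsed (wall-clock) time, which is the third field of each line, can be converted to seconds and divided by the T40's. This is only a minimal sketch, assuming a POSIX shell with awk; the helper names to_seconds and ratio are mine and not part of any tool.

```shell
#!/bin/sh
# Convert an elapsed time in "M:SS.ss" form to seconds.
to_seconds() {
    echo "$1" | awk -F: '{ printf "%.2f\n", $1 * 60 + $2 }'
}

# Ratio of two elapsed times (first / second), rounded to two decimals.
ratio() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", a / b }'
}

# Elapsed times taken from the three lines above.
echo "P4 2.54GHz vs T40: $(ratio 4:44.05 3:41.49)"
echo "MP1900+ vs T40:    $(ratio 5:37.99 3:41.49)"
```

With the figures above this gives roughly 1.28 and 1.53: the P4 desktop needs about 28% more wall-clock time than the T40 for the same build, and the Athlon MP about 53% more.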

I installed a basic RedHat Linux 9 on the machine as a starting point. In my case the BIOS configuration completely masks the IBM recovery partition, so there's no risk of accidentally overwriting this hidden partition at the partitioning stage, even if I doubt I'll ever need this recovery method. The boot messages hint that there's a protected area.

hda: host protected area => 1
hda: setmax LBA 156301488, native 150198690
hda: 150198690 sectors (76902 MB) w/7884KiB Cache, CHS=9933/240/63,
UDMA(100)
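
The size of the protected area follows from the two sector counts in these messages. The sketch below assumes 512-byte sectors, takes the larger figure as the drive's full capacity and the smaller one as what the kernel ends up using (it matches the 150198690 sectors on the following line), and needs a shell with 64-bit arithmetic. The function name hpa_bytes is hypothetical.

```shell
#!/bin/sh
# Size of the protected area in bytes: sectors the drive has but does
# not report, multiplied by the (assumed) 512-byte sector size.
hpa_bytes() {
    native_sectors=$1     # full capacity of the drive, in sectors
    reported_sectors=$2   # capacity visible to the OS, in sectors
    echo $(( (native_sectors - reported_sectors) * 512 ))
}

# Values from the boot messages above.
bytes=$(hpa_bytes 156301488 150198690)
echo "protected area: $bytes bytes (about $(( bytes / 1000000 )) MB)"
```

So roughly 3.1 GB of the 80GB disk is set aside for the recovery data.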

The remaining area holds a single Linux ext3 filesystem, plus the traditional swap area, sized at twice the amount of installed memory.

$ fdisk -l /dev/hda

Disk /dev/hda: 76.9 GB, 76901729280 bytes
240 heads, 63 sectors/track, 9933 cylinders
Units = cylinders of 15120 * 512 = 7741440 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/hda1   *         1      9655  72991768+  83  Linux
/dev/hda2          9656      9932   2094120   82  Linux swap

Customizations

You'll have to install a recent kernel, for at least two reasons:

You can grab my .config file.

BIOS settings

Suspend and resume through the BIOS is currently still broken and should be avoided. Specifically, I had to change some default BIOS settings related to power management: I had to disable all automatic suspend actions.

I defined the Custom mode as follows:

I also followed the BIOS settings advice found in Klaus Weidner's Linux on T40 document.

Security Chip

The ThinkPad T40 contains a TCPA chip. A GPL driver is available from the IBM Watson Research web site. The current version of the driver is tpm-1.1b.tar.gz. Another useful resource is an article in the August 2003 issue of Linux Journal, which gives some examples of how to program the chip. The latest version of the tpm driver includes the related example programs described in this article.

Bios settings

HDD

Once UDMA is enabled for this drive, performance is rather good. My model shipped with an 80GB disk, model IC25N080ATMR04-0.

# hdparm -i -v /dev/hda

/dev/hda:
multcount = 16 (on)
IO_support = 3 (32-bit w/sync)
unmaskirq = 0 (off)
using_dma = 1 (on)
keepsettings = 0 (off)
readonly = 0 (off)
readahead = 8 (on)
geometry = 9933/240/63, sectors = 150198690, start = 0

Model=IC25N080ATMR04-0, FwRev=MO4OAC0J, SerialNo=MRG400K4G0Y05A
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
BuffType=DualPortCache, BuffSize=7884kB, MaxMultSect=16, MultSect=16
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=150198690
IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5
AdvancedPM=yes: mode=0x80 (128) WriteCache=enabled
Drive conforms to: ATA/ATAPI-6 T13 1410D revision 3a: 2 3 4 5 6

# hdparm -t /dev/hda

/dev/hda:
Timing buffered disk reads: 64 MB in 2.40 seconds = 26.67 MB/sec

Update 2004/01/11: I received a new 60GB 7200RPM hard disk to replace my built-in 80GB 4200RPM one. I'm a bit disappointed by the performance of this new drive compared to the 4200RPM one. Here is some information:


# hdparm -i -v /dev/hda
/dev/hda:
multcount = 16 (on)
IO_support = 3 (32-bit w/sync)
unmaskirq = 0 (off)
using_dma = 1 (on)
keepsettings = 0 (off)
readonly = 0 (off)
readahead = 256 (on)
geometry = 16383/255/63, sectors = 111107442, start = 0
Model=HTS726060M9AT00, FwRev=MH4OA60A, SerialNo=MRH400M4G1589A
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
BuffType=DualPortCache, BuffSize=7877kB, MaxMultSect=16, MultSect=16
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=111107442
IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5
AdvancedPM=yes: mode=0x80 (128) WriteCache=enabled
Drive conforms to: ATA/ATAPI-6 T13 1410D revision 3a:

* signifies the current active mode
# hdparm -t /dev/hda
/dev/hda:
Timing buffered disk reads: 104 MB in 3.05 seconds = 34.14 MB/sec

tiobench provides more realistic measurements. This first test shows results for the 80GB 4200RPM hard disk on a 2.6.1-mm2 kernel.


Unit information
================
File size = megabytes
Blk Size = bytes
Rate = megabytes per second
CPU% = percentage of CPU used during the test
Latency = milliseconds
Lat% = percent of requests that took longer than X seconds
CPU Eff = Rate divided by CPU% - throughput per cpu load
Sequential Reads
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.1-mm2 1792 4096 1 26.73 3.985% 0.146 417.05 0.00000 0.00000 671
2.6.1-mm2 1792 4096 2 22.32 3.047% 0.349 475.69 0.00000 0.00000 732
2.6.1-mm2 1792 4096 4 19.63 2.643% 0.792 794.71 0.00000 0.00000 743
2.6.1-mm2 1792 4096 8 20.04 2.675% 1.551 562.11 0.00000 0.00000 749
Random Reads
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.1-mm2 1792 4096 1 0.69 0.335% 5.664 36.79 0.00000 0.00000 206
2.6.1-mm2 1792 4096 2 0.71 0.145% 10.920 117.36 0.00000 0.00000 488
2.6.1-mm2 1792 4096 4 0.69 0.150% 22.064 205.43 0.00000 0.00000 460
2.6.1-mm2 1792 4096 8 0.73 0.233% 40.649 458.29 0.00000 0.00000 313
Sequential Writes
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.1-mm2 1792 4096 1 22.44 8.927% 0.128 1612.57 0.00000 0.00000 251
2.6.1-mm2 1792 4096 2 21.97 8.851% 0.255 1380.63 0.00000 0.00000 248
2.6.1-mm2 1792 4096 4 21.49 8.797% 0.530 2146.41 0.00087 0.00000 244
2.6.1-mm2 1792 4096 8 20.89 8.479% 0.888 4415.89 0.02245 0.00000 246
Random Writes
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.1-mm2 1792 4096 1 0.62 0.257% 0.011 0.10 0.00000 0.00000 240
2.6.1-mm2 1792 4096 2 0.60 0.229% 0.011 0.10 0.00000 0.00000 260
2.6.1-mm2 1792 4096 4 0.61 0.271% 0.011 0.11 0.00000 0.00000 226
2.6.1-mm2 1792 4096 8 0.67 0.279% 0.011 0.09 0.00000 0.00000 240

And this second test shows results for the 60GB 7200RPM hard disk. As expected, this disk provides a lower latency, and a better throughput for random reads. Other values are very similar.


Unit information
================
File size = megabytes
Blk Size = bytes
Rate = megabytes per second
CPU% = percentage of CPU used during the test
Latency = milliseconds
Lat% = percent of requests that took longer than X seconds
CPU Eff = Rate divided by CPU% - throughput per cpu load
Sequential Reads
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.1-mm2 1792 4096 1 22.97 3.567% 0.170 387.24 0.00000 0.00000 644
2.6.1-mm2 1792 4096 2 19.81 2.635% 0.385 349.83 0.00000 0.00000 751
2.6.1-mm2 1792 4096 4 19.16 2.629% 0.807 752.50 0.00000 0.00000 729
2.6.1-mm2 1792 4096 8 18.66 2.561% 1.637 462.07 0.00000 0.00000 728
Random Reads
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.1-mm2 1792 4096 1 0.91 0.543% 4.273 27.66 0.00000 0.00000 168
2.6.1-mm2 1792 4096 2 0.95 0.175% 7.784 112.97 0.00000 0.00000 539
2.6.1-mm2 1792 4096 4 0.96 0.252% 15.766 150.34 0.00000 0.00000 381
2.6.1-mm2 1792 4096 8 0.93 0.173% 30.318 357.22 0.00000 0.00000 539
Sequential Writes
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.1-mm2 1792 4096 1 21.76 8.807% 0.133 2364.58 0.00109 0.00000 247
2.6.1-mm2 1792 4096 2 20.44 8.368% 0.268 3570.25 0.00392 0.00000 244
2.6.1-mm2 1792 4096 4 21.03 8.449% 0.528 7065.97 0.01068 0.00000 249
2.6.1-mm2 1792 4096 8 20.19 8.328% 1.010 7164.87 0.02529 0.00000 242
Random Writes
File Blk Num Avg Maximum Lat% Lat% CPU
Identifier Size Size Thr Rate (CPU%) Latency Latency >2s >10s Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
2.6.1-mm2 1792 4096 1 0.56 0.246% 0.011 0.10 0.00000 0.00000 226
2.6.1-mm2 1792 4096 2 0.55 0.214% 0.011 0.14 0.00000 0.00000 256
2.6.1-mm2 1792 4096 4 0.57 0.217% 0.011 0.12 0.00000 0.00000 260
2.6.1-mm2 1792 4096 8 0.57 0.208% 0.011 0.14 0.00000 0.00000 274

Update 2004/02/21: I upgraded the firmware of my 60GB/7200 7K60 Hitachi disk (model HTS726060M9AT00) a few days ago. Now, hdparm reports FwRev=MH4OA6BA. This upgrade is advertised by IBM as a mandatory firmware upgrade. Unfortunately, after the upgrade, the predesktop area is no longer accessible, neither by the Access IBM setup, nor by Linux, nor by the IBM restore tools. The HPA (Host Protected Area, often called the hidden protected area) now fully justifies its nickname, because this area is really, really well hidden... Someone else reported the same state on a T40p. The problem has been submitted to both IBM and Hitachi customer services. It is probably worth waiting until another firmware fix is available. Alas, IBM didn't explain why this firmware upgrade is mandatory...

Update 2004/02/28: Yesterday I received a set of recovery CDs from IBM, which didn't help me regenerate the predesktop area. At least they preserved the existing Linux partition, and simply reinstalled WinXP in its previous location. Neat.

I still think that the HDD firmware upgrade broke something related to the SET_MAX ATA command. This command changes the Reported Max of the drive, as opposed to its Native Max. See this document from Phoenix, which explains how they use this drive feature to store recovery data. When the security mode of the predesktop area is set to normal in the BIOS, it is possible to change the reported max value by software, for example with a program called setmax.c.

Here is the output of setmax, on my two HDDs :


# ./setmax /dev/hdc
Using device /dev/hdc
native max address: 156301487
that is 80026361856 bytes, 80.0 GB
lba capacity: 150198690 sectors (76901729280 bytes)
# ./setmax /dev/hda
Using device /dev/hda
native max address: 117210239
that is 60011642880 bytes, 60.0 GB
lba capacity: 111107442 sectors (56887010304 bytes)

The SET_MAX command is accepted only by my other, unmodified 80GB drive (hdc), and fails on the patched 7K60 Hitachi drive. I think this is the root of the problem: both drives should behave the same way here, but they don't.


# ./setmax -d0 /dev/hdc
Using device /dev/hdc
setting delta=0
nativemax=156301487 (0x950f8af)
lba capacity: 156301488 sectors (80026361856 bytes)
# ./setmax -d0 /dev/hda
Using device /dev/hda
setting delta=0
nativemax=117210239 (0x6fc7c7f)
HDIO_DRIVE_CMD_AEB failed SET_MAX: Input/output error
81 = 0x51
4 = 0x4
0 = 0x0
127 = 0x7f
124 = 0x7c
252 = 0xfc
230 = 0xe6
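
The seven bytes printed after the failure look like a dump of the drive's task file registers. The sketch below rests on that assumption: it treats the first byte as the ATA status register, the second as the error register, and the last four as the sector, cylinder-low, cylinder-high and device registers. The assumption is at least consistent with the output, since those last four bytes spell out the LBA 0x6fc7c7f that setmax tried to set. The function names are hypothetical; the bit names are the classic ATA ones.

```shell
#!/bin/sh
# Print the names of the bits set in an ATA status register value.
decode_status() {
    out=""
    for pair in 0x80:BSY 0x40:DRDY 0x20:DF 0x10:DSC 0x08:DRQ 0x04:CORR 0x02:IDX 0x01:ERR; do
        mask=${pair%%:*}
        name=${pair#*:}
        if [ $(( $1 & $mask )) -ne 0 ]; then
            out="${out:+$out }$name"
        fi
    done
    echo "$out"
}

# Rebuild a 28-bit LBA from the sector, cylinder-low, cylinder-high
# and device registers (only the low nibble of the device register
# carries address bits).
lba_from_regs() {
    echo $(( (($4 & 0x0f) << 24) | ($3 << 16) | ($2 << 8) | $1 ))
}

echo "status 0x51: $(decode_status 0x51)"
echo "LBA: $(lba_from_regs 0x7f 0x7c 0xfc 0xe6)"
```

Read this way, status 0x51 means the drive is ready and reports an error, and the error value 0x04 is the ABRT (command aborted) bit: the patched drive simply refuses the command.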

The specifications of this 7K60 disk are available online.

Update 2004/04/22: My 7K60 hard drive has finally been replaced by Hitachi. The replacement drive that I received has a slightly different model number, and shows the following specifications (manufactured Feb 04):


Model=HTE726060M9AT00, FwRev=MH4OA6AA, SerialNo=MRH4X3M4G9J09B
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
BuffType=DualPortCache, BuffSize=7877kB, MaxMultSect=16, MultSect=16
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=111107442
IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5
AdvancedPM=yes: mode=0x80 (128) WriteCache=enabled
Drive conforms to: ATA/ATAPI-6 T13 1410D revision 3a:

The moral of this story is that the firmware upgrade I applied was apparently not suitable for my Hitachi drive, because this drive was not built in by IBM but purchased separately, although it had the right model number to support this new firmware, and although the firmware upgrade program didn't complain at all during the upgrade process.

CDRW/DVD

The T40 is shipped with a Matshita UJDA745 DVD/CDRW combo drive. Unfortunately, as expected, this model cannot be converted to RPC-1, because no patched firmware seems to be available on the usual web sites. With the help of the rpcmgr.c utility, I succeeded in setting my current DVD zone. The drive came with no region pre-configured.

The XVideo extension is available and works fine with my usual DVD player, xine.

3D

I recompiled XFree86-DRI from CVS at the beginning of May 03 and, helped by the ICH4 support in AGPGART, I have a working 3D configuration, though somewhat slow compared to similar graphics cards in desktop machines. As far as I remember, 3D worked fine with the XFree86-4.3 shipped by RedHat, but I preferred to use DRI to have a chance to test the latest acceleration code and Mesa 5.0. You just have to change the ProjectRoot variable in dri/xc/xc/config/cf/host.def to install XFree86-DRI in parallel with a regular XFree86-4.3.0 in /usr/X11R6. My model 237392U is shipped with a Radeon 9000 Mobility (R250 Lf). The Intel chipset is correctly detected.

Linux agpgart interface v0.99 (c) Jeff Hartmann
agpgart: Maximum main memory to use for agp memory: 816M
agpgart: Detected Intel(R) 855PM chipset
agpgart: AGP aperture is 256M @ 0xd0000000

glxgears scores 1700 fps with the ATI-built Radeon 8500 in my desktop machine (24 bpp, AGP 4x, Intel P4 2.5 GHz). I obtain 1500 fps with a Radeon 9000 on another machine (24 bpp, AGP 4x, AGPFastWrite enabled, AMD Athlon MP1900+), and only 900 fps on my T40 (24 bpp, AGP 4x). AGPFastWrite doesn't work there and freezes the machine. I expected some performance loss between the Mobility and the desktop version of the same chip, but I hope there's still a way to improve 3D performance a bit with this card.

Update 2003/05/14: I changed the default Depth from 24 to 16 bpp, and obtained a better score of 1500 fps with glxgears. Playing FlightGear is pleasant and fluid, even if it locks the screen randomly. The machine is still reachable remotely, and you can see that the 3D application process has disappeared and that X is using all the CPU time. I have already experienced this kind of problem on my desktop machine, so I would say it is related to the driver implementation (the ATI closed source driver solved this issue on my desktop machine).

Update 2003/05/16: I ran the viewperf benchmark and obtained the following results, in 16 bpp for all tests. These tests give rather uniform performance across the ATI series, regardless of whether the card sits in a desktop box or in a laptop. The performance gap seen with glxgears in 24 bpp is closed here. We also have to keep in mind that glxgears is *not* a benchmark in any way.

                           3dsmax-02  drv-09  dx-08  light-06  proe-02  ugs-03
Radeon 9000                5.314      10.42   26.27  5.935     4.348    2.226
T40, Radeon Mobility 9000  5.516      12.54   27.74  7.365     5.361    2.449
Radeon 8500 LE             5.483      9.578   27.59  4.410     3.837    x

Update 2003/08/10: My graphics card has been behaving strangely since yesterday: a kind of screen corruption that persisted even over a cold reboot. I took some photos here, before the problem vanished as silently as it appeared. On the first cold reboot, the initial boot screen was corrupted, like on this photo (look at the I and M letters), and the next GRUB screen was damaged too. Issuing a warm reboot with CTRL-ALT-DEL caused worse screen corruption: look at the ThinkPad boot screen and the GRUB screen :-(. All 3D applications, like FlightGear in this example, in 16bpp, were filled with white moving dots and other image artifacts, like a dark translucent area moving fast up and down over the 3D drawing area. The integrated PC-Doctor diagnostic tool showed errors in the Video Adapter tests. Now all tests pass fine again.

This corruption disappeared in a second, while I was running FlightGear to take the photo you saw above: the 3D image became clear again, and the white dots disappeared. It showed up yesterday, while rebooting the machine after a kernel crash. I also played FlightGear yesterday before the problem appeared, but I only noticed the damage at reboot. So I have the feeling that this problem may be DRI-related, but that's just my intuition. There are no errors in the XFree86 log. (My current radeon driver settings are 24bpp, AGPMode "4", AgpFastWrite "off", EnableDepthMoves "on", and EnablePageFlip "on".)

Update 2004/06/10: The screen corruption observed here happened again last week. The machine got serviced, and the motherboard has been replaced. The problem was related to bad video RAM. I have no idea what caused this damage. Kudos to the IBM technical service, which repaired the laptop within two days!

Internal Ethernet

Works fine with the e1000.o driver. I made some tests with my gigabit AT network switch, and the results are blazingly fast. I obtained a sustained bandwidth of 300 Mbit/s while pulling data from several NFS servers plugged into the same switch.

Update 2003/06/15: A peculiarity of the e1000 driver is that it falls back to half-duplex if the link partner doesn't do auto-negotiation. So if you have a 100 Mbps Ethernet switch, your link will probably be detected at 100 Mbps half-duplex, even if the switch can do full duplex. Module parameters can disable auto-negotiation: add "options e1000 Duplex=2 Speed=100". Both speed and duplex must be set to disable auto-negotiation. See Intel's driver documentation for further details.
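
One way to add that options line without duplicating it on repeated runs is sketched below. The target file is an assumption on my side: /etc/modules.conf is where module options usually live on 2.4-era systems such as RH9, but check your distribution. The helper name add_line_once is hypothetical.

```shell
#!/bin/sh
# Append a line to a config file unless the exact line is already there.
add_line_once() {
    file=$1
    line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Typical use, as root (the path is an assumption, see above):
#   add_line_once /etc/modules.conf "options e1000 Duplex=2 Speed=100"
```

The new options only take effect the next time the e1000 module is loaded.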

Soundcard

Works equally well with either the i810_audio.o driver or the ALSA snd-intel8x0.o driver.

Internal Modem

No success currently, but the Smart Link driver looks promising, and almost works for me in version 2.7.14. The driver recognizes the hardware. With minicom, the modem responds to the ATZ command, but dialing usually fails with a "NO_DIALTONE" error. This may be related to the fact that I'm in France, and the kind of telecom modulation used here is not (yet) properly supported. So, until I discover a better solution, I'll keep my trusted MegaHertz 3CXEM556 card, with a real modem inside and a useful XJack connector.

Update 2003/08/06: Wow, the hack to successfully use the SmartLink driver came from this message on the linux-thinkpad mailing list, relating the experience of a Toshiba laptop user with the same kind of winmodem in his machine. The original news is here, in the Modem section. The key is to use a particular version of the modem driver, slmdm-2.7.10.tar.gz (all later versions suffer from the "twice dialtones" problem already described on the linux-thinkpad list), and to tweak the amrmo_init.c file, changing the value of the #define PCI_DEVICE_ID_ICH3 from 0x2486 to 0x24C6 (which is also the value of PCI_DEVICE_ID_ICH4, on the line below). The installation and configuration process is otherwise unchanged. As you can see further down in this file, support for ICH4 was previously disabled (line 218); this hack just re-enables it.
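
The edit can also be scripted instead of done by hand. This is only a sketch: it assumes the define sits on a single line of the form "#define PCI_DEVICE_ID_ICH3 0x2486" with spaces or tabs as separators, which I have not checked against every copy of the file, so compare the result with the original before building. The helper name patch_ich3_id is hypothetical.

```shell
#!/bin/sh
# Print the given file with the ICH3 device ID define changed
# from 0x2486 to 0x24C6; all other lines pass through untouched.
patch_ich3_id() {
    sed 's/^\(#define[[:space:]][[:space:]]*PCI_DEVICE_ID_ICH3[[:space:]][[:space:]]*\)0x2486/\10x24C6/' "$1"
}

# Typical use, wherever amrmo_init.c lives in the unpacked
# slmdm-2.7.10 tree (keeps a backup of the original):
#   cp amrmo_init.c amrmo_init.c.orig
#   patch_ich3_id amrmo_init.c.orig > amrmo_init.c
#   diff amrmo_init.c.orig amrmo_init.c
```

The diff should show exactly one changed line; if it shows none, the define is formatted differently from what the pattern expects.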

I can confirm that the modem works on my T40, at least in the V.90 mode that I tested: I connected to my regular dial-up ISP without problem, at 45000 bps, as usual. Then I tried to use a service specific to France, called Minitel, connecting at 1200/75 baud in V.23 mode, without success this time. Maybe the init sequence (at+ms=23) was incomplete. The connection is established and incoming data from the server is correct, but keystrokes sent to the server and echoed back to the screen are somewhat garbled during the trip. Fax and voice modes are untested.

Update 2003/09/09: I found on the web the SmartLink Modems AT Commands Set and S-Registers documentation.

Update 2003/10/25: SmartLink released a new Linux driver for the AMR modem, in their unsupported directory. This driver doesn't implement all the expected features of the modem yet (Fax, Voice modem and V.23 are missing, for example), but this new release supports the V.90/V.92 data mode, and doesn't suffer from the previous dial-twice problem. It works with the T40 and its ICH4 modem without code modifications, which is good too. I tested slmodem-2.9.2.tar.gz, and I could connect to my ISP successfully.

Update 2003/11/29: While browsing IBM's "Software and Device Drivers" web page for my T40, I found a new link to an Agere driver for the softmodem of my laptop. You can download the driver here. I tweaked the code a bit to make it compile on my Fedora kernel, but unfortunately the driver generates an oops while running pppd. The crash occurred after the modem received the "CONNECT" token, and pppd died with the error "pppd[4377]: Couldn't set tty to PPP discipline: Invalid argument". You may have better luck with one of the officially supported kernels: RH7.3 (2.4.18-3, 2.4.18-24), RH8.0 (2.4.18-14, 2.4.18-27), RH9 (2.4.20-8), and Suse8.1 (2.4.19-4GB).

Update 2004/02/10: Yes! The 2.6 linux kernel on my laptop is no longer tainted. The last proprietary module (slamr.o) for the AMR softmodem can be removed, and replaced by an equivalent driver from the ALSA project. You have to install at least version 1.0.2 of ALSA, and configure drivers intel8x0m (for the modem) and intel8x0 (for the regular audio device, I82801DBICH4).

Then, install a recent 2.9 SmartLink driver, and compile it to use ALSA instead of the proprietary slamr module. The README file explains this setup in detail. The proprietary knowledge of this modem driver is now completely located in the userland daemon, which has to be started during boot. The kernel is no longer tainted: no proprietary module is needed in kernel space anymore (note: the radeon card is a Mobility 9000, not a FireGL one, so 3D is handled successfully by the DRI project, and the ATI proprietary driver is not required).

Touchpad support / 3rd mouse button support

The synaptics XFree86 mouse driver supports the TouchPad, including tap-to-click, and the two buttons located below the touchpad. The third button is not handled, so you have to emulate it in this configuration. This driver currently works fine with the latest DRI from CVS, but YMMV.

With the standard XFree86 mouse driver, everything works except the third button between the trackpoint and the touchpad.

Another touchpad driver could provide a way to maybe enable the third button (tpconfig -3 ?). The program identifies a Synaptics Touchpad, but the detected firmware version (single-byte mode) disables this interesting option :

# ./tpconfig -3 

========================================================================
= =
= tpconfig version: 3.1.3 =
= =
= Synaptics Touchpad and ALPS GlidePad/Stickpointer configuration tool =
= =
= Copyright (C) 1997 C. Scott Ananian<cananian@alumni.princeton.edu> =
= Copyright (C) 1998-2001 Bruce Kall <kall@compass.com> =
= Last Modified (Version 3.1.3) by Bruce Kall, 2/22/2002 =
= =
= tpconfig comes with ABSOLUTELY NO WARRANTY. This is free software, =
= and you are welcome to redistribute it under the terms of the GPL. =
= =
========================================================================

Found Synaptics Touchpad.
Firmware: 5.9 (single-byte mode).
Button mode not supported on this TouchPad.

With this driver, you can enable the tap gesture (--tapmode=1): a tap on the pad generates button 1 press events. Moreover, the sensor type is not recognized by the program (it reports "Sensor type: unknown (44).") and is not documented in the docs provided by Synaptics.

Update 2003/05/14: I've been able to get a working 3rd button, together with the touchpad and its extra features, with gpm and the help of its "repeater" feature. This option allows gpm to gather mouse event information from the real mouse device (potentially several devices), and to rewrite it to a fifo for use with X11. In this configuration, you don't have conflicts between X and gpm anymore, for the simple reason that only gpm is reading the real mouse device. The interest of this approach is that the latest gpm does have support for the UltraNav Synaptics device. The remaining restriction is that the "repeater" feature doesn't forward mouse wheel events to X11. So if you have an additional external wheel mouse, you'll have to configure it in X directly if you want to use its wheel with X. I can live with this glitch.

I have two mice in my configuration, the first one is the integrated pointing device, and the second one is an external USB wheel mouse. I installed latest gpm version 1.20.1 from ftp://arcana.linux.it/pub/gpm/ because the RH9 installed version (gpm-1.19.3-27) was too old to recognize my touchpad properly, and I launched it with the options : /usr/local/sbin/gpm -m /dev/psaux -t synps2 -R.

On the XF86Config side, the interesting part of the configuration file is:

Section "ServerLayout"
Identifier "Default Layout"
Screen 0 "Screen0" 0 0
InputDevice "Mouse0" "CorePointer"
InputDevice "Mouse1" "AlwaysCore"
InputDevice "Keyboard0" "CoreKeyboard"
EndSection

Section "InputDevice"
# Internal Trackpoint/Touchpad through gpm
Identifier "Mouse0"
Driver "mouse"
Option "Protocol" "MouseSystems"
Option "Device" "/dev/gpmdata"
Option "Emulate3Buttons" "no"
EndSection

Section "InputDevice"
# USB mouse
Identifier "Mouse1"
Driver "mouse"
Option "Protocol" "IMPS/2"
Option "Device" "/dev/input/mice"
Option "ZAxisMapping" "4 5"
Option "Emulate3Buttons" "no"
EndSection

Update 2003/07/20: If the /dev/gpmdata communication device between gpm and X doesn't exist on your system, you can create it with mknod /dev/gpmdata p. This is a simple named pipe that allows communication between both processes.
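
A slightly more careful version of that step, usable from an init script, only creates the pipe when it is missing and refuses to touch anything else that might already sit at that path. This is a minimal sketch; the function name ensure_fifo is hypothetical.

```shell
#!/bin/sh
# Create a named pipe at the given path unless one already exists there.
ensure_fifo() {
    if [ -p "$1" ]; then
        return 0                      # already a FIFO, nothing to do
    elif [ -e "$1" ]; then
        echo "$1 exists but is not a FIFO" >&2
        return 1
    fi
    mknod "$1" p
}

# Typical use, as root, before starting gpm and X:
#   ensure_fifo /dev/gpmdata
```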

Update 2003/12/05: I was probably on crack when I wrote the above entries, because without this patched version of synaptics.c.gz, originally available from the gpm development mailing list, the buttons located above the touchpad exhibit rather erratic behaviour during drag-and-drop. Just replace the file src/synaptics.c in the gpm source code with this new one, recompile and install. You may have to adapt the file /etc/gpm-syn.conf, because some options are no longer available. Anyway, this updated version of the synaptics driver works fine, with both the touchpad and the trackpoint enabled in the BIOS (configured as "automatic"). All buttons are available and functional, and both pointing devices can be used concurrently.

Update 2004/01/05: Thank you to Luca Gugelmann, who provided me a way to forward USB wheel events from gpm to X, by forcing gpm to talk to X using the IntelliMouse protocol (ms3) instead of the default Mouse-Systems protocol (msc). The first one knows about wheel events, so it can forward them properly. The ultranav device (trackpoint + touchpad) is still handled by gpm only. This new gpm/X mouse configuration is very simple. gpm is configured to listen to both input devices, and to merge and forward them to /dev/gpmdata : gpm -Rms3 -m /dev/psaux -t synps2 -M -m /dev/input/mice -t imps2 (the second device definition may have to be adapted).

And the XF86Config file just needs a single input section:


Section "InputDevice"
Identifier "Mouse0"
Driver "mouse"
Option "Protocol" "IntelliMouse"
Option "Device" "/dev/gpmdata"
Option "ZAxisMapping" "4 5"
Option "Buttons" "5"
EndSection

Bluetooth

Not tested.

IR

Not tested.

Wireless

The wireless card included in the T40 is NOT the Intel one. That's why this model is not advertised as a "Centrino" laptop: it doesn't have all the components required to use this label. The wireless card is currently unsupported, and you'll have to rely on an auxiliary PCMCIA card to go wireless.

Update 2003/05/23: Promising developments are worth following: a GPL driver is going to be written for the Atheros cards (pci id 168c:0012 on the T40). This is still at an early stage of development. A mailing list is available too. Thanks to Valient Gough for finding this information.

Update 2003/10/18: This project doesn't show much activity, and the best place now to get drivers for the Atheros cards that come with the T40 is the Madwifi project.

Update 2003/06/26: With the help of Ted Ts'o web page on the T40, I finally swapped the unsupported internal IBM 802.11 a/b wireless card for a Cisco mini-PCI card (IBM P/N 31P8301) instead. This card seems to be the only solution to have a working wireless connection currently, except using an external PCMCIA card. In fact, the BIOS refuses to boot if the mini-PCI wireless card is not the Intel one, the IBM one or the Cisco one, for wireless regulatory reasons. The manual of the Cisco card is clear on the subject : Attention for the BIOS Lock Protection: The ThinkPad computers listed in the above IBM site are designed to operate with the proper wireless options. If you install an unauthorized wireless LAN Mini-PCI Card, your ThinkPad does not start and emits beeps with the BIOS lock out.

The T40 Hardware Maintenance Manual is of great help to know how to access the mini-PCI connector. I had to remove the keyboard, the hard disk, and the palm rest for this operation. As usual, some care has to be taken when dealing with fragile elements, for example, the keyboard ribbon connector, or the antenna connectors.

The mpi350 driver builds fine in a 2.4.21 kernel. The wireless configuration tool is unfortunately a proprietary GUI. WEP encryption worked only in the "enterprise configuration profile", not in the "home network configuration". You can get some information in /proc/driver/mpu350. But unfortunately, this driver doesn't handle the wireless extensions, so you don't have the possibility to use the iwconfig tool. You also don't have access to the strength of the signal through /proc/net/wireless.

Update 2003/06/27: A new driver is out for the IBM dual band wireless card on sourceforge. Currently, this driver still fails, because it does not power the card on. The wireless light on the panel remains off. Feedback from this driver's author informs us that this bug will soon be corrected.

Update 2003/07/02: The airo-linux driver from the sourceforge CVS provides preliminary support for the Cisco 350 mini-pci card (pci id 14b9:a504). As shipped, the tx mode is not working. I hacked the driver a bit, based on a close comparison between the airo code and the code of the driver provided by Cisco, and the tx mode is now working. You can grab the original airo_mpi.c file, and the patched airo_mpi.c.new version. Just add obj-m += airo_mpi.o to the Makefile in linux/drivers/net/wireless. You can now happily use the wireless extensions with this card. So far I have only tested association, with and without WEP, successfully. Feedback welcome.

Update 2003/07/04: The patch described above is now integrated into the airo-linux driver in sourceforge CVS. Thanks to Benjamin Reed for his work on this project. I now have a working wifi Cisco card, and working wireless extensions too. I can happily say goodbye to the Cisco proprietary configuration tool. But I have to give Cisco credit for one thing: they published the source code of the driver for their card, so a comparison between their driver and the one from the airo-linux project helped a lot to find the remaining bugs.

Update 2003/07/20: I hacked around airo_mpi.c again: I merged the locking changes proposed by Daniel Ritz on LKML last week, and merged more stuff from the working mpi350.c driver from Cisco: transmit functions, Tx fid and Rx fid descriptors, debug helper functions, a different way of computing the quality level, and the noise level. The goal was to obtain the same stability with this driver as with the one from Cisco. You can download my patched version here: airo_mpi.c-20030719. RFMON is still not working: packets are captured without their 802.11 header. Help is welcome to debug this problem, because I have no documentation for this chipset, and the Cisco driver doesn't implement RFMON, so I cannot find what's going wrong. MIC is also disabled in this driver and completely untested.

Update 2003/08/07: A message on the linux-thinkpad mailing list reported success with the IBM dual band wireless card of a T40p, using the latest version (20030802) of the madwifi driver.

Update 2003/08/08: I wrote a short HOWTO, airo_mpi.HOWTO.txt, to set up the Cisco wireless mini-PCI card, using my modified version of the airo-linux driver. Thank you to Dan Borello and to Alexis de Lattre for providing the initial material for this doc.

Update 2003/08/18: The driver shows flaky behaviour when handling a high network load for a long time (typically more than 1 hour). This problem has been reported by several people, which rules out an isolated hardware problem. Ironically, the same bug also affects the Cisco driver...

Update 2003/09/26: This snapshot airo_mpi.c-20030926 of the airo driver for mpi 350 cards seems to work better than previous versions, for me at least. The card worked fine with a high sustained bandwidth (and also with interleaved calls to iwconfig and ifconfig to stress RID accesses) for more than 15 hours without any lockup. See this MRTG graph... The airo_mpi.HOWTO.txt file still describes the installation procedure.

Update 2003/10/11: I added basic power management support in airo_mpi.c-20031011, so the card can now enter and leave suspend mode without losing its configuration, and without crashing. I also removed the wifi0 interface (dedicated to monitor mode), until I find a way to properly access the whole 802.11b header when monitor mode is enabled.

Update 2003/10/23: I merged the 2.6 patch from Peter Johanson, fixed compilation problems with the RedHat kernel, fixed the suspend/resume problem with 2.6 kernel and acpi, and packaged all the needed files in a single tarball.

Update 2003/11/06: Thanks to Nickolai Zeldovich's suggestion, I tested the driverloader from Linuxant. Primarily targeted towards Broadcom (AirForce) card users, this driver also works with other NICs, including the Cisco mini-pci MPI350 that comes with the T40. This driver acts as a compatibility layer between the Windows XP regular driver of your card and the linux kernel. The technical challenge is interesting.

I tested the current version of the driverloader, and I got a copy of the Windows XP drivers for my card from Cisco. I chose version 3.4.9 from the Cisco web site, because it comes with firmware version 5.00.03, which is the closest to my own firmware version, 5b00.08. The driverloader is currently provided under a restrictive 30-day trial license. You have to register at linuxant.com and provide the Ethernet address of the card you wish to use driverloader with.

The driverloader is configured through a regular browser: you select the .INF file of the driver, and the corresponding .SYS and .VDX files, and hopefully your card will be configured and will show up as interface eth1. The wireless extensions are basically available, except the signal quality information. The driver apparently uses the WEP key stored in the EEPROM, because I didn't have to enter it manually. Nickolai told me that the driver has been stable for several days. I tested the RFMON monitor mode, but it didn't work. According to the FAQ, this feature is not officially supported by NDIS, so I cannot tell if this is a limitation of driverloader, of the Cisco driver, or of the firmware of the card.

Is driverloader the right solution for you? If you have a totally unsupported wireless card, and if you can't or don't want to use a well supported Prism2 PCMCIA card instead, it might be a solution worth considering. But remember that using binary-only modules in the kernel has several drawbacks, including these:

Update 2003/11/08: A new Makefile provided by Peter Johanson should compile properly on a 2.6 kernel. No changes to the driver code in this new release. I added a note about the Ad-Hoc mode, which has been reported to be broken by several people. You can grab the tarball.

Update 2003/11/29: I haven't updated the airo_mpi driver for a long time, because I have been trying to improve its stability. It is not acceptable to have a driver crash twice a day or more. So I profiled the driver code to find which functions are called, and when, to try to isolate bottlenecks in the code that could cause unexpected behaviour.

When the driver is strictly only used to transmit and receive packets, its stability is noticeably better. The interrupt handler executes very quickly: basically 3 usecs to ack a transmitted packet, and 10 usecs to copy an incoming packet to the network layer. The handler used to tell the card that there's a packet to be transmitted executes in 10 usecs too. So the interrupt handler doesn't waste much time, and interrupt processing is not delayed too much for my taste. In this scheme, the driver is relatively stable, and I could transfer several GB of data over the radio, although I wouldn't say that it is totally crash-free. The symptom of a crash is always the same: all the registers of the card get garbled with random values, and the card no longer acknowledges any command, even the reset one. This kind of crash is unrecoverable: the card needs to be powered off (by suspend, hibernation, or reboot for example).

When statistics are requested from the card while it is handling high radio traffic, it becomes a lot less stable. This happens when /proc/net/wireless is read by the gnome wireless applet, when iwconfig is called, or simply when gkrellm, snmpd or ifconfig ask for the amount of data transmitted/received on this interface. A typical command to request a RID from the card needs 500 usecs to complete. Interrupts are not disabled during this time, but this is still a long time compared to the delay required to ack an interrupt. Most of the time is spent in the issuecommand() function, which is consequently a very sensitive part of the driver.

Yesterday, I reinstalled my laptop, due to strange uncorrectable "ECC Error" and "Bad address match" errors on my hard disk. A low level format with PC-Doctor solved this problem, which seemed to be software related. This time, I didn't make the same mistake twice, and I didn't remove my Windows XP partition. Instead, I just resized it with parted before restoring my linux filesystem. So I could upgrade to the latest firmware from Cisco for my mpi350 card (5.30.17) with the Windows XP tools. The airo_mpi driver doesn't crash at startup with this new firmware, but it cannot link to the access point anymore (no EV_LINK interrupt is received). I'll have to investigate this point. I downgraded the firmware to version 5.00.01 from Cisco, and it works (5.00.03 works too, but 5.20.17 doesn't). So the good news is that the custom version 5b00.08 is not the only one supported by the linux driver. I don't think I will go much further with this driver, it is too demotivating. I already ordered a tri band IBM 802.11 a/b/g card to replace this Cisco one...

By the way, the hack that consists in hiding the file windows/system32/convert.exe to prevent the conversion of the WinXP partition from FAT32 to NTFS didn't work. The conversion occurred anyway.

Update 2003/12/04: Okay, I made a new snapshot of my work-in-progress on the airo_mpi driver. You can grab the version of the day tarball. This version is still unstable, but it should fill in some missing features: ad-hoc mode works, and the driver now wakes up from hibernation.

Update 2003/12/16: Here is another update of the airo_mpi driver, you can grab the tarball airo_mpi-20031216.tar.gz. This time, I'm rather satisfied with the stability of this version. Specifically, I got rid of the infamous txreclaim function, which seemed to be the cause of all the stability issues observed previously. This card is finally becoming usable under linux, provided that you have a supported firmware (5.00.xx).

Update 2003/12/17: New update of airo_mpi (airo_mpi-20031217.tar.gz) with small fixes.

Update 2003/12/20: New update of airo_mpi (airo_mpi-20031220.tar.gz) with small fixes mainly for the 2.6 kernel.

Update 2004/01/06: I have made a first attempt to merge the MPI350 support back in airo.c. You can grab the patch airo.c-2.6.1-rc1-mm2.diff.

Update 2004/01/11: Updated patches for the MPI350 support in airo.c. Depending on your kernel version, you can use airo.c-2.6.1-mm2.diff, airo.c-2.4.25-pre4.diff, or airo.c-2.4.22-1.2149.nptl.diff for the latest Fedora kernel. This patch only contains a small change related to the maximum and average values of the signal quality encountered with the MPI350 card, which seem to differ slightly from the ones seen with the other aironet cards.

Update 2004/02/07: The support for the MPI350 card has been integrated in the kernel 2.6.3-rc1 !

Update 2004/02/19: I backported the support for the MPI350 card, now in the stock 2.6 kernel, to the 2.4.25 kernel too. You can apply the patch airo.c-2.4.25.diff.

Update 2004/04/01: Here is a patch that should solve locking issues observed on 2.6 kernels, with preemption disabled, when resuming from power management. This patch also contains a fix for ACPI suspend that properly resets the card (why does "echo 3 > /proc/acpi/sleep" invoke the related suspend callback with a state value of 2 ???). You can apply the patch airo.c-2.6.5-rc3.diff.

Update 2004/04/22: The patch to fix the APM suspend freeze problem is now in 2.6.6-rc2.

Update 2004/05/20: This week, Cisco released an updated Linux driver for the MPI350 card. This version has some exciting new features including : support of recent firmwares (up to 5.30.17), support of wireless extensions, and support of RFMONITOR mode. If you want to use this driver, on a 2.4 kernel, apply the patch mpi350-v21-fix-for-rfmon.diff to mpi350.c, before the compilation. It will enable both wireless extensions, and RFMON, and will also correct a small offset problem in RFMON mode.

The cool news is that this new driver allowed me to port RFMON and support for recent firmwares to the regular airo.c kernel driver. You can apply the patch airo.c-2.6.6-20040520.diff. Note that this patch also works on a 2.4.25 kernel, after applying the patch airo.c-2.4.25.diff.

The fix to support recent firmwares is the hunk located at @@ -3934,7 +4049,7 @@ in the patch file. Note that the RFMON mode doesn't work with firmware 5.00.03, so I suggest upgrading to the most recent version supported by the driver (5.30.17). Use the ACU program under Windows to upgrade the firmware. You can test RFMON capabilities with the following commands. Remember that the airo driver creates a new network interface, wifi0, to deal with raw 802.11 frames:


# iwconfig eth1 mode monitor
# echo "Mode: y" > /proc/driver/aironet/eth1/Config
# ifconfig wifi0 up
# tcpdump -i wifi0 -c 50

Update 2004/05/21: I reworked my patch to airo.c a bit to fix this problem: when the interface had been configured in RFMONITOR mode and the laptop resumed from APM suspend, the card could no longer associate back in managed mode. I removed a bunch of CMD_SOFTRESET commands, which presumably were the cause of this buggy behaviour. You can apply the patch airo.c-2.6.6-20040521.diff.

Update 2004/06/10: I restored a card reset command when the module is loaded, which I had hastily removed in my previous patches. Without it, the driver was broken when running kudzu on FC2 at boot time. The driver should also no longer complain when leapscript calls writeConfigRid() via the driver ioctl. Don't hesitate to report your success or failure stories in a LEAP-enabled environment to me, as I cannot test this configuration myself. You can apply the patch airo.c-2.6.7-rc2-20040610.diff.

Update 2004/06/27: I slightly modified the way the signal quality is reported through the wireless extensions. The latest Cisco driver provides some code that converts the raw value differently depending on the model of the card (350 model or not). I merged this part of the code. You can apply the patch airo.c-2.6.7-bk10-20040627.diff.

A small patch to gnome-applets handles correctly the link average and max values as returned by iw_get_range_info() from the wireless extensions library. You can apply the patch gnome-applets-wireless-link-range-20040627.diff.

Update 2004/06/28: This patch fixes a race condition in the kernel thread, that causes the thread to hold and keep a lock by mistake. You can apply the patch airo.c-2.6.7-bk10-20040628.diff.

Update 2004/09/22: A patch to airo.c has been added since the kernel 2.6.9-rc2-bk7 snapshot. This new patch should resolve issues when using the driver in a MIC and LEAP enabled environment.

CPUID support

While playing with valgrind, and more precisely with the cachegrind skin of valgrind, I found that the CPUID values returned by the Pentium-M processor were unknown to it, so the program could not determine the CPU cache configuration.

--5991-- warning: Unknown Intel cache config value (0xB0), ignoring
--5991-- warning: Unknown Intel cache config value (0xB3), ignoring
--5991-- warning: Unknown Intel cache config value (0x87), ignoring
--5991-- warning: Unknown Intel cache config value (0x30), ignoring
--5991-- warning: Unknown Intel cache config value (0x2C), ignoring

The site cpuid.com appears to be a valuable source of information about the cache characteristics of the new Intel Pentium-M processor: the L2 cache size is 1MB, and both the L1 instruction cache and the L1 data cache are 32KB, 8-way, with 64B lines. With the help of the information found on sandpile.org too, I could add the missing information about the CPUID values 0xb0, 0xb3, 0x87, 0x30 and 0x2c with a certain amount of confidence. With this patch to valgrind sources, I made valgrind Pentium-M aware.

A similar patch has to be applied to calltree, part of the kcachegrind package too.

Currently, you can also patch your kernel to report the correct cache information in /proc/cpuinfo, or else just add cachesize=1024 to the kernel command line in grub.conf or lilo.conf to have the exact size of the L2 cache reported in /proc/cpuinfo.
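
After rebooting with the patch or with cachesize=1024, the reported value can be checked from a shell. This is only a minimal sketch: the cache_size_kb helper is a name made up for this example, and it assumes the usual "cache size : N KB" line format of /proc/cpuinfo on x86.

```shell
#!/bin/sh
# cache_size_kb [FILE]: print the cache size (in KB) found on the first
# "cache size" line of FILE (default: /proc/cpuinfo).
# Hypothetical helper, not part of any package mentioned on this page.
cache_size_kb() {
    awk -F: '/^cache size/ { gsub(/[^0-9]/, "", $2); print $2; exit }' \
        "${1:-/proc/cpuinfo}"
}

# On the T40 this should print 1024 once the kernel knows the right size.
if [ -r /proc/cpuinfo ]; then
    cache_size_kb
fi
```

The helper takes an optional file argument so that it can be tried on a saved copy of /proc/cpuinfo from another machine.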

Suspend/Resume

The suspend and resume procedure can be tricky, but here are some clues on how to get fairly reproducible and successful behaviour when putting the laptop in suspend mode:

Update 2003/06/08: As far as I can tell, suspension via APM is broken in console mode. The machine cannot be brought back to life when resume is invoked. Fn-F3 also locks the machine. Until this issue is resolved, I strongly advise you to totally disable console blanking (especially if you enabled "Console blanking using APM" in the APM section), with the command setterm -blank 0, at the end of your init scripts.

Update 2003/06/08: Recently (at least since kernel 2.4.21-rc7-ac1), a patch has been included to support the Radeon 9000 Mobility in the console driver radeonfb.o. According to the source code of this driver, it doesn't currently provide any acceleration. Although this driver is functional, I stayed with vesafb, because the radeonfb driver seems to interfere badly with X, and suspend no longer works in X when radeonfb is loaded. Using vesafb and vga=834 still works fine, and doesn't harm the XFree86 driver.

Update 2003/06/08: In X, when resuming from suspend, the gpm mouse driver is confused, and doesn't recognize the synaptics device events anymore. The hda drive also remains unresponsive for a limited period of time.


hda: dma_timer_expiry: dma status == 0x04
hda: DMA interrupt recovery
hda: lost interrupt

I resolved this issue with a customization of my APM scripts. In /etc/sysconfig/apm-scripts, I added two files, containing commands to be launched before suspending, and after resuming: the first one is called apmcontinue:


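#!/bin/sh
# Commands launched before the machine goes to sleep.
PROG="$1"
case "$PROG" in
    suspend|standby)
        rmmod ide-scsi          # unload the SCSI emulation before sleeping
        alsactl power off
        service gpm stop        # gpm gets confused by the resume otherwise
        hdparm -d0 -q /dev/hda  # turn DMA off on the hard disk
        ;;
esac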

And the second one is named apmcontinue-pre:


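#!/bin/sh
# Commands launched after the machine wakes up.
PROG="$1"
case "$PROG" in
    resume)
        hdparm -m16 -d1 -c3 -q /dev/hda  # restore multcount, DMA, 32-bit I/O
        service gpm start
        alsactl power on
        modprobe ide-scsi                # reload the SCSI emulation
        ;;
esac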

Update 2003/06/15: Frederic Gaus on the linux-thinkpad mailing list found how to obtain a real poweroff of the machine when shutting it down. Local APIC support on uniprocessors has to be disabled in the kernel configuration. Moreover, if you run a RedHat distribution, and if you compiled your own kernel with apm support as a module, be sure to remove the line "modprobe -r apm" in /etc/init.d/apmd. The apmd shutdown script tries to remove the apm.o module, so the final poweroff instructions, which are executed after all shutdown scripts, no longer have apm support available, and fail. This bug has been corrected in the recent apmd RPM package in Rawhide at the time of this writing.
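
For those who prefer not to edit the init script by hand, the "modprobe -r apm" line can be commented out with sed. This is only a sketch: the comment_out_apm_unload name is made up for this example, and it assumes the line appears in the script exactly as quoted above, possibly indented.

```shell
#!/bin/sh
# comment_out_apm_unload [FILE...]: copy the input to stdout, commenting
# out any line that starts with "modprobe -r apm" (after indentation).
# Hypothetical helper; it never modifies a file in place.
comment_out_apm_unload() {
    sed 's/^\([[:space:]]*\)\(modprobe -r apm\)/\1# \2/' "$@"
}

# Example: review the result before replacing the real script by hand.
# comment_out_apm_unload /etc/init.d/apmd > /tmp/apmd.new
# diff -u /etc/init.d/apmd /tmp/apmd.new
```

Writing to stdout rather than editing in place leaves the original init script untouched until the diff has been checked.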

Update 2003/08/02: I added rmmod ide-scsi and modprobe ide-scsi to my APM scripts, hoping to prevent these SCSI timeout errors in my logs when going to sleep mode: scsi : aborting command due to timeout : pid 6453, scsi0, channel 0, id 0, lun 5 Request Sense a0 00 00 40 00, followed by an hdc: lost interrupt, and a SCSI bus reset. It appears that the SCSI emulation code behaves in a somewhat strange manner when coming back from suspend mode. Naturally, removing the module is only possible when the drive is not in use. Also be sure not to compile the IDE/ATAPI CD-ROM support into your kernel. I discovered recently that both ide-cd and ide-scsi were loaded concurrently, which is unnecessary.
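
A quick way to spot the ide-cd / ide-scsi situation is to look for both names in /proc/modules. The function below is a hedged sketch, not something from the packages above: its name is invented for this page, and it assumes the 2.4 module naming with hyphens (ide-cd, ide-scsi) as the first field of each line.

```shell
#!/bin/sh
# ide_cd_and_scsi_loaded [FILE]: succeed if both ide-cd and ide-scsi are
# listed as module names in FILE (default: /proc/modules).
# Returns 2 when FILE cannot be read.
ide_cd_and_scsi_loaded() {
    modfile="${1:-/proc/modules}"
    [ -r "$modfile" ] || return 2
    awk '$1 == "ide-cd"   { cd = 1 }
         $1 == "ide-scsi" { scsi = 1 }
         END { exit !(cd && scsi) }' "$modfile"
}

if ide_cd_and_scsi_loaded; then
    echo "ide-cd and ide-scsi are both loaded"
fi
```

Only the first field is compared, so a module that merely lists ide-cd among its users is not counted.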

Update 2003/09/13: The DRI-resume patch for the Radeon driver has finally been integrated into the DRI CVS repository.

ACPI

I tested the ACPI patches dated 2003/05/22 on a 2.4.21-rc7-ac1 kernel, and on a 2.5.70-bk14 kernel. An additional proprietary module from Intel provides the *enhanced speedstep* features through ACPI. From what I've read, enhanced stands here for *more than two power levels* for the processor. With the 2.5.70 kernel, ACPI causes a kernel oops at boot time, and cannot be tested further. With the 2.4 kernel, most sensor information is available through the /proc/acpi pseudo filesystem as expected. Suspend is not working. The Intel proprietary driver for enhanced speedstep loads, and allows me to customize the performance by writing into /proc/acpi/processor/cpuname/performance. Unfortunately, the frequency, power and latency associated with each state are not displayed correctly (all values are zero).

Without the possibility of properly entering suspend mode with ACPI, I gave up and reinstalled the APM modules (the two cannot coexist, so a choice has to be made).

Update 2003/06/11: Using regular ACPI (20030522) from the 2.4.21-rc7-ac1 kernel, I configured each ACPI element as a module, and rebuilt my kernel. The kernel logs are verbose about ACPI:

ACPI: have wakeup address 0xc0001000
On node 0 totalpages: 229376
zone(0): 4096 pages.
zone(1): 225280 pages.
zone(2): 0 pages.
ACPI: RSDP (v002 IBM ) @ 0x000f6ba0
ACPI: XSDT (v001 IBM TP-1R 00000.04176) @ 0x3ff6d7e6
ACPI: FADT (v003 IBM TP-1R 00000.04176) @ 0x3ff6d900
ACPI: SSDT (v001 IBM TP-1R 00000.04176) @ 0x3ff6dab4
ACPI: ECDT (v001 IBM TP-1R 00000.04176) @ 0x3ff78ea3
ACPI: TCPA (v001 IBM TP-1R 00000.04176) @ 0x3ff78ef5
ACPI: BOOT (v001 IBM TP-1R 00000.04176) @ 0x3ff78fd8
ACPI: DSDT (v001 IBM TP-1R 00000.04176) @ 0x00000000
ACPI: BIOS passes blacklist
ACPI: MADT not present
[...]
tbxface-0117 [03] acpi_load_tables : ACPI Tables successfully acquired
Parsing all Control Methods:....................................................................................................................................................................................................................................................................................................................................................................................................
Table [DSDT](id F005) - 1308 Objects with 63 Devices 388 Methods 20 Regions
Parsing all Control Methods:.
Table [SSDT](id F003) - 1 Objects with 0 Devices 1 Methods 0 Regions
ACPI Namespace successfully loaded at root c02e3ddc
evxfevnt-0093 [04] acpi_enable : Transition to ACPI mode successful
evgpeblk-0743 [06] ev_create_gpe_block : GPE 00 to 31 [_GPE] 4 regs at 0000000000001028 on int 9
evgpeblk-0262 [08] ev_save_method_info : Registered GPE method _L18 as GPE number 0x18
ACPI: Found ECDT
Completing Region/Field/Buffer/Package initialization:.........................................................................................................................................................................................................................................................
Initialized 19/20 Regions 123/123 Fields 61/61 Buffers 46/46 Packages (1317 nodes)
Executing all Device _STA and_INI methods:.............................<3>schedule_task(): keventd has not started
......................... exfldio-0129 [22] ex_setup_region : Field [PWKI] access width (4 bytes) too large for region [U7CS] (length 2)
exfldio-0140 [22] ex_setup_region : Field [PWKI] Base+Offset+Width 0+0+4 is beyond end of region [U7CS] (length 2)
exfldio-0129 [22] ex_setup_region : Field [PWKI] access width (4 bytes) too large for region [U7CS] (length 2)
exfldio-0140 [22] ex_setup_region : Field [PWKI] Base+Offset+Width 0+0+4 is beyond end of region [U7CS] (length 2)
psparse-1121: *** Error: Method execution failed [\_SB_.PCI0.USB7._INI] (Node c19eb788), AE_AML_REGION_LIMIT
nsinit-0397 [06] ns_init_one_device : \_SB_.PCI0.USB7._INI failed: AE_AML_REGION_LIMIT
....
58 Devices found containing: 58 _STA, 6 _INI methods
ACPI: Interpreter enabled
ACPI: Using PIC for interrupt routing
ACPI: System [ACPI] (supports S0 S3 S4 S5)
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 *11)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 10 *11)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 *11)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 10 *11)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 9 10 11, disabled)
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 9 10 11, disabled)
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 9 10 11, disabled)
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 7 9 10 *11)
ACPI: PCI Root Bridge [PCI0] (00:00)
[...]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: Embedded Controller [EC] (gpe 28)
schedule_task(): keventd has not started
nseval-0152: *** Error: ut_remove_allocation: Empty allocation list, nothing to free!
ACPI: Power Resource [PUBS] (on)
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.AGP_._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCI1._PRT]
PCI: Probing PCI hardware
ACPI: PCI Interrupt Link [LNKE] enabled at IRQ 10
ACPI: PCI Interrupt Link [LNKF] enabled at IRQ 9
ACPI: PCI Interrupt Link [LNKG] enabled at IRQ 5
PCI: Using ACPI for IRQ routing
PCI: if you experience problems, try using option 'pci=noacpi' or even 'acpi=off'
[...]
hwregs-0753 [27] hw_low_level_read : Unsupported address space: 4

Later, when insmod()ing each ACPI module separately : fan.o, ac.o, battery.o, button.o, processor.o, thermal.o :

ACPI: Fan [FIR] (off)
ACPI: AC Adapter [AC] (on-line)
ACPI: Battery Slot [BAT0] (battery present)
ACPI: Battery Slot [BAT1] (battery absent)
ACPI: Power Button (FF) [PWRF]
ACPI: Lid Switch [LID]
ACPI: Sleep Button (CM) [SLPB]
acpi_processor-0899 [43] acpi_processor_get_per: Unsupported address space [127] (control_register)
ACPI: Processor [CPU] (supports C1 C2 C3, 8 throttling states)
ACPI: Thermal Zone [THM0] (52 C)

From the processor module, I can read the following information in /proc/acpi/processor:

[root@bonobo acpi]# more processor/CPU/*
::::::::::::::
processor/CPU/info
::::::::::::::
processor id: 0
acpi id: 1
bus mastering control: yes
power management: yes
throttling control: yes
performance management: no
limit interface: yes
::::::::::::::
processor/CPU/limit
::::::::::::::
active limit: P0:T0
platform limit: P0:T0
user limit: P0:T0
thermal limit: P0:T0
::::::::::::::
processor/CPU/performance
::::::::::::::
<not supported>
::::::::::::::
processor/CPU/power
::::::::::::::
active state: C2
default state: C1
bus master activity: ffffffff
states:
C1: promotion[C2] demotion[--] latency[000] usage[00000010]
*C2: promotion[C3] demotion[C1] latency[001] usage[00082343]
C3: promotion[--] demotion[C2] latency[085] usage[00000000]
::::::::::::::
processor/CPU/throttling
::::::::::::::
state count: 8
active state: T0
states:
*T0: 00%
T1: 12%
T2: 25%
T3: 37%
T4: 50%
T5: 62%
T6: 75%
T7: 87%

Warning: the processor.o module from Intel will not work properly if the corresponding module from the vanilla kernel has previously been loaded (even if it has been rmmod()ed before insmod()ing Intel's one). When I insmod the Intel module, I obtain the following warnings:

acpi_processor-0924 [35] acpi_processor_get_per: FFH address non-zero, setting to zero
acpi_processor-2137 [33] acpi_processor_get_inf: Error evaluating processor object

/proc/acpi/processor/CPU/performance now provides more information, and a way to change the active state, as advertised in the readme.txt. Notice that the corresponding frequencies and power values are not reliable :-)

[root@bonobo root]# more /proc/acpi/processor/CPU/performance
state count: 6
active state: P0
states:
*P0: 0 MHz, 0 mW, 500 uS
P1: 0 MHz, 0 mW, 500 uS
P2: 0 MHz, 0 mW, 500 uS
P3: 0 MHz, 0 mW, 500 uS
P4: 0 MHz, 0 mW, 500 uS
P5: 0 MHz, 0 mW, 500 uS
[root@bonobo root]# echo 3 > /proc/acpi/processor/CPU/performance
[root@bonobo root]# cat /proc/acpi/processor/CPU/performance
state count: 6
active state: P3
states:
P0: 0 MHz, 0 mW, 500 uS
P1: 0 MHz, 0 mW, 500 uS
P2: 0 MHz, 0 mW, 500 uS
*P3: 0 MHz, 0 mW, 500 uS
P4: 0 MHz, 0 mW, 500 uS
P5: 0 MHz, 0 mW, 500 uS

Strange behaviour related to the values provided here: when I boot up on battery, the processor is supposed to run at 600MHz (Speedstep mode in BIOS for battery mode: maximum battery life). /proc/cpuinfo tells me that the CPU is running at this frequency, but both /proc/acpi/processor/CPU/performance and /proc/acpi/processor/CPU/throttling tell me, on the contrary, that the active states are P0 and T0 respectively, i.e. the maximum power mode...

I couldn't put the laptop in sleep mode, echo 5 > /proc/acpi/sleep even caused an oops in the underlying bash process. (/proc/acpi/sleep provides values S0 S3 S4 S5).

Update 2003/06/12: Make sure to disable any call to /sbin/hwclock during the boot sequence when ACPI is enabled in the kernel, because this program invariably freezes the machine when invoked.

Update 2003/06/12: The output of /proc/acpi/processor/CPU/performance is wrongly initialized. In fact, when you start Linux on battery, the processor is in P5 mode (600MHz), although the performance file says that the processor is in P0 mode. At startup, the performance file always reports P0. Changing the state works fine, though. For example, you can come back to the P0 state, and get the full processor power back, just by writing the desired value: "echo 0 > /proc/acpi/processor/CPU/performance".
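
To script this, the active state can be read back from the performance file after writing to it. The snippet is a small sketch: active_pstate is a hypothetical helper name, and it assumes the file format shown in the listings above ("active state: Pn"). Keep in mind that the value is only meaningful once a state has been written, since the file always claims P0 at startup.

```shell
#!/bin/sh
# active_pstate [FILE]: print the active state (e.g. "P3") from a file in
# the format of /proc/acpi/processor/CPU/performance.
active_pstate() {
    awk '/^active state:/ { print $3; exit }' \
        "${1:-/proc/acpi/processor/CPU/performance}"
}

# Typical use, as root, with the Intel processor.o module loaded:
# echo 0 > /proc/acpi/processor/CPU/performance   # back to full speed
# active_pstate
```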

With a sample program doing crypt() invocations in a loop, the elapsed time for the same number of iterations varies in the following range:

Update 2003/06/12: Bill Nottingham proposed a patch on the LKML to add Centrino support to the cpufreq API. Two patches, linux-2.4.20-cpufreq.patch and cpufreq-centrino.patch, apply to a 2.4.21 kernel, and make it possible to tweak the CPU frequency through the /proc/cpufreq interface. Have a look at the documentation in linux/Documentation/cpufreq in your favorite kernel sources for details about the cpufreq API. This API for the enhanced speedstep technology of the Centrino works for me, like the proprietary Intel module did. The advantage is that ACPI is not required for cpufreq.

Update 2004/07/15: ACPI support improved in kernel 2.6. It is now mostly usable, and both suspend modes (S3, aka suspend to RAM and S4 aka suspend to disk) work. I enabled CONFIG_SOFTWARE_SUSPEND in my kernel config file, coupled with other features needed for ACPI : CONFIG_X86_UP_APIC and CONFIG_X86_UP_IOAPIC.

Some problems remain: the time required to enter the S4 state can be long (too long?), typically 2 minutes, and the machine still consumes a significant amount of power while in S3 state (much more than with APM suspend to RAM). As an example, with my fully charged battery (45 Wh), the machine almost completely discharged in 8 hours 30 minutes in S3 mode, see this mrtg graph. So we can conclude that the machine drew about 5 W in this S3 state, which is much more than expected, given that it typically draws 10-11 W in normal operation...
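
The S3 figure above is simply the battery capacity divided by the discharge time, which gives an average power in watts. As a worked example (the avg_power_w name is made up for this page, and the battery is assumed to have been drained completely):

```shell
#!/bin/sh
# avg_power_w CAPACITY_WH HOURS: average power in watts needed to drain
# CAPACITY_WH watt-hours in HOURS hours, rounded to one decimal.
avg_power_w() {
    LC_ALL=C awk -v wh="$1" -v h="$2" 'BEGIN { printf "%.1f\n", wh / h }'
}

avg_power_w 45 8.5   # 45 Wh drained in 8h30 -> prints 5.3
```

Since the battery was only almost empty, the real average is slightly below this value.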

As a final note, you should also definitely read the power management kernel documentation in kernel-tree/Documentation/power/swsusp.txt.

Patches

Update 2003/06/24: I installed kernel 2.4.21-ac2. Some patches have been integrated, others are still pending. The new cpufreq stuff now works with Pentium-M processors, and provides an interface to the enhanced speedstep capabilities of this processor, without the need to use ACPI. The kernel driver is called speedstep-centrino.o, and the typical content of /proc/cpufreq is as follows:


% more /proc/cpufreq
minimum CPU frequency - maximum CPU frequency - policy
CPU 0 600000 kHz ( 37 %) - 1600000 kHz (100 %) - performance

This processor normally runs at full speed (1.6GHz) on AC power, and slows down to 600MHz on battery; this is the frequency that you can read in /proc/cpuinfo. The difference is that the value displayed in cpuinfo only reflects the CPU frequency measured at boot time, which depends on how your laptop is powered at that moment, and it will not change afterwards.

If you prefer to enter a mode that saves more power, just write to this interface: echo "0%0%100%powersave" > /proc/cpufreq


% more /proc/cpufreq
minimum CPU frequency - maximum CPU frequency - policy
CPU 0 600000 kHz ( 37 %) - 1600000 kHz (100 %) - powersave

You can also choose to explicitly set the CPU frequency, by setting the lower and upper bounds to the same value : echo "0%50%50%performance" > /proc/cpufreq


% more /proc/cpufreq
minimum CPU frequency - maximum CPU frequency - policy
CPU 0 800000 kHz ( 50 %) - 800000 kHz ( 50 %) - performance
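The strings written to /proc/cpufreq in the examples above all follow the same pattern: CPU number, lower bound, upper bound and policy, separated by percent signs. A minimal sketch of a helper that builds such a string is shown here; the name cpufreq_policy is hypothetical and the pattern is inferred only from the examples on this page.

```shell
#!/bin/sh
# Build a policy string of the form cpu%min%max%policy.
cpufreq_policy() {
    # $1: CPU number, $2: lower bound (%), $3: upper bound (%), $4: policy name
    printf '%s%%%s%%%s%%%s\n' "$1" "$2" "$3" "$4"
}

# Only prints the string; as root on a 2.4 -ac kernel it could be
# redirected to /proc/cpufreq instead.
cpufreq_policy 0 0 100 powersave    # prints 0%0%100%powersave
```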

More information about cpufreq is available in the sources of the kernel in /usr/src/linux-2.4.21-ac2/Documentation/cpu-freq/

The pending patches for kernel 2.4.21-ac2 are patch-2.4.21-rc7-dri.patch (this patch is still required to work with latest radeon DRM from the DRI project), and patch-2.4.21-rc7-pentiumm-cache.patch.

Update 2003/06/27: TODO : test this fix (from the LKML).

Update 2003/08/02: I installed kernel-2.4.22-pre10-ac1. I recommend using Alan Cox's patches, because they carry the Enhanced SpeedStep patch (for Centrino processors). The pending patch for this kernel version is patch-2.4.21-rc7-pentiumm-cache.patch. I dropped patch-2.4.21-non-transparent-pci-to-pci-bridge.patch, as it seems to concern only machines with more than 1GB of RAM installed.

I also applied the patch send-to-self-2.4.21-1.diff from Julian Anastasov, which allows traffic between two local network interfaces to be routed externally. You can download the related documentation here : send-to-self.txt, or directly from his home page. This patch is useful to me, as it lets me test the internal wireless card with a single access point whose wired interface is connected to the wired interface of my laptop. In this configuration, I don't need another test machine to simulate traffic flowing through the wireless access point.

speedstep-centrino is currently broken in this version; I had to revert some initialization code with this patch: patch-2.4.22-pre10-ac1-speedstep-centrino.patch.

Update 2003/09/11: I installed and tested the cpudyn program, which automatically adapts the CPU speed to the processor load. The goal of this program is to maximize battery life, by slowing down the CPU only when it's idle. cpudyn works with both 2.5 and 2.4 kernels. Another program, speedfreq, addresses the same issue but only works with the cpufreq interface of the 2.5 kernels (through sysfs). cpudyn also works with the /proc/cpufreq interface for Enhanced SpeedStep, available in the -ac 2.4 kernels from Alan Cox, so cpudyn is better suited to my current setup. The cpudyn package integrates nicely with the RedHat distribution (the only modification needed is to add chkconfig init levels to the /etc/init.d/cpudyn script once the installation has finished). Also note that /proc/cpuinfo is now updated when the CPU frequency changes.
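For reference, chkconfig reads its init levels from two comment lines near the top of the init script. The fragment below is only a sketch of what could be added to /etc/init.d/cpudyn: the runlevels (2345), the start and stop priorities (90 and 10) and the description text are illustrative assumptions, not the values I actually used.

```shell
#!/bin/sh
# Lines to add near the top of /etc/init.d/cpudyn (values are assumptions):
#
# chkconfig: 2345 90 10
# description: cpudyn adapts the CPU speed to the processor load
#
# Then, as root, the service can be registered with:
#   chkconfig --add cpudyn
```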

Update 2003/09/16: The gkx86info plugin for gkrellm can be used to display the current CPU frequency as cpudynd changes it.

Console framebuffer

Since kernel-2.4.21-ac2, the radeon framebuffer is working just fine with my Radeon 9000 Mobility chipset. Just disable vesafb support at boot time (vga=normal in the boot options), and load the radeonfb module instead. The advantage of using the radeonfb instead of the vesafb framebuffer, is that you have a noticeable speed improvement (scrolling is a lot faster) in console mode. Also note that suspend/resume in console mode is still broken currently, whatever framebuffer driver is in use.

Update 2003/06/27: radeonfb still interacts badly with X. Using radeonfb prevents X from resuming from suspend properly.

Kernel 2.6 test

My current test of the 2.6 kernel series uses version 2.6.0-test8, with this config file. ACPI works better now: most of the sensor information is available through ACPI, except the fan status. I can now put the laptop in a kind of suspend mode with an echo 3 > /proc/acpi/sleep command, which is big progress compared with the previous buggy versions. I say "a kind of" because it seems that the screen is not completely blanked, and the crescent-shaped symbol on the lid panel doesn't light up like it does with apm.

The radeonfb driver still fails to change the console screen resolution on the fly.

And I couldn't enable 3D acceleration with my current version of DRI and the built-in radeon DRM and AGP drivers from the kernel.

Update 2003/12/18: I switched my default Fedora Core installation to a new exciting 2.6.0-test11-mm1 kernel yesterday. With success. I could keep all the features that were previously working fine with the stock Fedora Core 2.4 kernel, and gained some new bonus features :

Here are some notes about this migration

Memory problems

Lately, I experienced all kinds of memory problems (unreproducible gcc internal errors, sig 11, random kernel crashes), which made me suspect a problem with the memory modules. My machine runs with 1GB of SDRAM: I purchased an additional memory module from IBM (512MB PC2100, CL2.5, FRU 10K0033, 2003-01-23) to add to the built-in 512MB module. This module is correctly referenced by IBM, and is supposed to work in my T40.

memtest86 quickly found several memory errors, in tests 3 and 4; all the failing addresses were above 512MB. Similar errors were found both in battery mode (CPU clocked at 600MHz) and in AC mode (CPU clocked at full speed, 1600MHz). The errors disappeared when running memtest86 with only the 512MB built-in module. The additional memory module is stamped with an IBM sticker, and its memory chips come from Micron Technologies (P/N MT16VDDS6464HG-265). Running a full memory test with the shipped PC-Doctor program didn't find anything wrong with either module. Should I suspect random errors due to overheating? The back of the laptop, where the memory module is inserted, is not really air cooled, and the module felt rather hot when I removed it for my investigations. Any similar feedback welcome.

Update 2003/06/14: Some more diagnostics about the memory problems. The memory errors are currently only detected with the memtest86 program. This program is free software, and released under the GNU GPL License. This patch has to be applied on version 3.0, so the cache size of the Pentium M processor is correctly detected by CPUID. Note: gcc-2.9x is required to properly compile memtest86. On a RedHat distribution, this old compiler version is available in the compat-gcc package, and its name is gcc296.

I extracted the built-in memory module of my laptop. The T40 - Hardware Maintenance Manual is of great help here, as it provides useful information about the internal organisation of the laptop. Accessing the built-in memory module requires removing the keyboard. This operation is relatively easy. It requires no special screwdriver, just a little care with the keyboard's connection ribbon.

The internal memory module uses Infineon chips; the spec sheet is not (yet?) available on Infineon's web site (P/N HYS64D64020GBDL-7-B, PC2100S-2033-0-Z, 512MB, DDR, 133MHz, CL2). So the noticeable difference between the internal and supplementary modules is their CAS latency (CL).

Running with only the Micron memory chip in the builtin memory slot worked fine with memtest86. No error detected after 4 iterations of the whole tests sequence. It took approx 3 hours to complete these 4 iterations. Then I reinserted the Infineon memory chip back in the supplementary memory slot on the bottom of the machine. I came back to a 1GB configuration and both modules were swapped from their original location. memtest86 found errors in this configuration too.

To sum up, each module works fine separately, and they fail when put together.

A problem with mixing CL values? I'll probably contact IBM very soon to obtain some explanation about this problem. Stay tuned!

Update 2003/06/17: I tested the supplementary memory module alone again. After a longer testing period than the previous one (a whole night), the memtest86 program finally found some errors, after about 9 passes. Not many errors, approximately 40, all in tests 3 and 4. Fewer errors were found than when both modules were in use. This confirmed my impression that a single memory module is faulty: the supplementary one. I contacted IBM France, and they agreed to replace this module after some discussion. For now, I run Linux with the built-in memory only, and I haven't experienced a single failure yet. Quite reassuring.

Update 2003/06/21: Yesterday I received my new 512MB memory module, FRU: 10K0033, with Samsung chips (P/N PC2100S-25330-Z, M470L6423DN0-CB0, CL=2.5, date 2003-04-11), and ran some tests. The memtester userland program ran for several hours without problem, and the memtest86 program ran for a whole night, 7 passes with the default tests, without finding any error. I also recompiled the whole GNOME environment from CVS without problem. From these tests, I conclude that the memory combination is now suitable for daily use of this laptop.

Reducing power consumption

A few combined tricks can help you reduce your power consumption, by preventing the system from writing too often to the hard disk, and so allowing the disk to spin down when it's idle.

You can first check that your policy is efficient by running a session in text mode, without any graphical program. This small script collects statistics on the duration of the standby/active periods. You have to be root to access the drive state information.


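#!/bin/sh
# Log how long /dev/hda stays in each power state (must be run as root).
c0=0            # total seconds spent in standby
c1=0            # total seconds spent in any other (active) state
n=0             # number of completed active periods, i.e. spin-ups
f0=standby      # assumed initial drive state
d0=$(date +%s)  # time of the last state change
while true
do
    # hdparm -C prints "drive state is: <state>"; keep only the state word
    f1=$(hdparm -C /dev/hda | grep 'drive state' | awk '{print $4}')
    # skip a failed read (empty state), react only to a state change
    if test -n "$f1" && test "$f0" != "$f1"
    then
        d1=$(date +%s)
        c=$((d1 - d0))
        if test "$f0" = standby
        then
            c0=$((c0 + c))
        else
            c1=$((c1 + c))
            n=$((n + 1))
        fi
        echo "$c seconds in $f0 mode. ($c0 s standby/$c1 s active/$n spinups)"
        f0=$f1
        d0=$d1
    fi
    sleep 1
done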

You can diagnose some undesirable processes that still do disk accesses with this little script, and a top -i in another window. The culprit process should show up quickly in the top list, when it generates a disk access that wakes the disk up. You can also use this patch from Mukesh Agarwal, to log all read/write accesses with the associated process name.

Then, you can run your usual graphical environment, and check whether some tools can be tweaked to do fewer disk accesses (for example, my version of gkrellm prevented the disk from spinning down, due to frequent disk access).

Update 2003/10/27: The new version of cpufreqd, version 1.1-rc1, works with the 2.4 kernel /proc/cpufreq interface. It is more configurable than cpudyn. You can tweak the daemon's power-saving behaviour by editing the /etc/cpufreqd.conf file. By default, the CPU runs at full speed on AC power, at 66% when a CPU-demanding task runs on battery, and at 37% (the minimum value) when the laptop is idle. An interesting feature of the configuration file is that you can force a specific power management type when a certain process is launched. For example, you can enable full CPU speed when xine is playing a DVD. You can grab an RPM binary package compiled for RedHat 9, cpufreqd-1.1-rc1.1.i386.rpm, and the SRPMS package cpufreqd-1.1-rc1.1.src.rpm, from my RpmFind mirror.

The predesktop area

preliminary comment : playing with the predesktop area should be considered with care, because this could lead to undesirable situations, like:

The T40 is not shipped with a recovery CD. Instead, the hard drive contains a hidden area holding all the data and programs needed to restore your laptop to its factory settings (with a formatted disk and Windows XP installed). This hidden area is often referred to as the HPA, for Hidden Protected Area, and is described in this document. It consists of reserved, unpartitioned space at the end of the disk. Whether this area can be seen under Linux depends on the value of the "Access IBM Predesktop Area" entry in the "Security/IBM Predesktop Area" section of the BIOS. If the value is set to "Disabled", the number of cylinders reported by fdisk reflects the real limit of the disk capacity; if another value is set ("Normal" or "Secure"), the number of cylinders is lowered to hide this protected area.

I disabled the Access Predesktop Area security setting in the BIOS, so I could back up the HPA. With my 80GB drive, the HPA starts at cylinder 9933, and a cylinder contains 7741440 bytes, as reported by fdisk, so I backed up the data with this command :


dd if=/dev/hda of=hidden.img bs=7741440 skip=9932
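The skip value follows from the cylinder numbering: fdisk counts cylinders from 1, so the 9932 cylinders before the HPA are skipped. As a rough sketch (the function name hpa_byte_offset is only an illustration), the corresponding byte offset on the disk can be computed like this:

```shell
#!/bin/sh
# Byte offset of the HPA = (first HPA cylinder - 1) * bytes per cylinder.
hpa_byte_offset() {
    # $1: first cylinder of the HPA (1-based), $2: bytes per cylinder
    echo $(( ($1 - 1) * $2 ))
}

hpa_byte_offset 9933 7741440    # prints 76887982080
```

This needs a shell with 64-bit arithmetic, since the result does not fit in 32 bits.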

With an hexadecimal editor (like hexedit for example), I looked for the beginning of a valid partition, identified by the string "IBM 7.0" or "IBM 7.1". I found several consecutive partitions, that I extracted into six distinct files :


dd if=hidden.img of=diags.img bs=512 skip=26850 count=15063
dd if=hidden.img of=diags2.img bs=512 skip=41913 count=15060
dd if=hidden.img of=recovery.img bs=512 skip=56973 count=5847040
dd if=hidden.img of=unknown.img bs=512 skip=5904013 count=15060
dd if=hidden.img of=unknown2.img bs=512 skip=5919073 count=2883
dd if=hidden.img of=unknown3.img bs=512 skip=5921956 count=2883
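Each skip value above is the previous skip plus the previous count, so the six images are contiguous. A small sketch to double-check such a list of "skip count" pairs is given below; the function name check_contiguous is hypothetical.

```shell
#!/bin/sh
# Read "skip count" pairs (in 512-byte sectors) on stdin and verify that
# each extent starts exactly where the previous one ends.
check_contiguous() {
    expected=""
    while read -r skip count
    do
        if [ -n "$expected" ] && [ "$skip" -ne "$expected" ]
        then
            echo "gap or overlap before sector $skip (expected $expected)"
            return 1
        fi
        expected=$((skip + count))
    done
    echo "contiguous, next free sector: $expected"
}

check_contiguous <<'EOF'
26850 15063
41913 15060
56973 5847040
5904013 15060
5919073 2883
5921956 2883
EOF
```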

My algorithm to find the partition limits is rather simple. Starting from the beginning of the hidden.img area, I mount the file using a loop device, obtain the partition size in 1k blocks with df, and search for the beginning of a new partition around the corresponding offset in the image file. I repeat this with each newly discovered partition, until the end of the hidden.img area is reached.
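The signature search can also be scripted instead of done by hand in a hex editor. The sketch below is not the method I used: it assumes GNU grep (for the -a, -b and -o options), and it assumes the signature sits at byte 3 of a boot sector, where FAT filesystems store their OEM name. The function name find_boot_sectors is hypothetical.

```shell
#!/bin/sh
# List the 512-byte sector numbers where a candidate boot sector starts.
find_boot_sectors() {
    # $1: image file to scan
    # grep -a -b -o prints "byte_offset:match" for every signature found;
    # keep only matches located exactly 3 bytes into a sector.
    LC_ALL=C grep -a -b -o 'IBM 7\.[01]' "$1" |
        awk -F: '($1 - 3) % 512 == 0 { print ($1 - 3) / 512 }'
}

# Usage: find_boot_sectors hidden.img
# Each printed number is a candidate skip value for dd with bs=512.
```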

These areas can be mounted as individual partitions, with the loop device :


mount diags.img /mnt/mountpoint -oloop,ro

The first one is a bootable partition that seems to contain the PC Doctor diagnostic program (8MB), in an expanded form. The second one (8MB too) seems to be the floppy installer for the PC Doctor program. The third one (2.9GB) contains the recovery data. The boot sector is in a file called bootimg.bin, and the recovery subdirectory contains the data to be recovered: a series of .cri files containing file lists with their checksums, and a series of related .imz files containing ZIP archives. The fourth one (1.4MB) contains all the executables needed by the recovery process (in a subdirectory called RECOVERY).

How to create a bootable CD-ROM/DVD-ROM with this content :

I looked for a way to create a bootable DVD-ROM with this content. For an unknown reason, the data and the programs used to setup the recovery process are in two separate partitions. But on my homebrewed DVD-ROM, I merged both in the same iso9660 filesystem :


mount recovery.img /mnt/mountpoint -oloop,rw
mount unknown.img /mnt/mountpoint2 -oloop,ro
cp -a /mnt/mountpoint2/recovery/* /mnt/mountpoint/recovery
mkisofs -b bootimg.bin -c bootcat.bin -o recovery.iso -r -L -J /mnt/mountpoint

I burned the resulting iso image. The resulting customized DVD-ROM booted successfully, launched the "IBM Product Recovery program", and showed me its main menu, with a single entry "reformat your hard disk and install Windows XP, device drivers and preinstalled applications". I stopped here, because I wanted to preserve my existing Linux installation. If someone with a spare disk can test the whole recovery process with this method on a blank hard drive, I'd be very happy to hear about it.

TODO : test the IBM Product Recovery program!

How to backup the predesktop area :

The hpa_aibm.pdf document contains the procedure to back up and restore this predesktop area from one disk to another. This can be achieved with two simple utilities, fwbackup.exe and fwrestor.exe, running in DOS mode. The challenge is to find more than 3GB of free space to receive the backup data from these tools when only Linux is installed on the laptop.

I used a customized MSDOS-6.22 bootdisk, including network support and the DOS driver for my Intel gigabit NIC. The result is quite amazing: the network is configured via DHCP, and a share available on my samba server is mounted as a network drive. All of this on a single floppy.

I downloaded a basic DOS bootable disk from here. I removed all the crap except the minimal DOS stuff. Then I followed Bart's instructions to create his network bootdisk. The underlying program is MODBOOT, which stands for modular boot disk. This is a really nice tool, because each needed feature can be added to the current bootdisk, just by downloading a .cab file in a given location on the floppy. For example, my network card driver followed this rule, the driver is available in the form of a .cab file.

I followed the instructions given on pages 9-11 of the hpa_aibm.pdf document. The network drive of my bootdisk pointed to a samba share on another linux machine. The two utilities (fwbackup and fwrestor) are located in a:\recovery when starting the restore-factory-settings utility from the pre-boot menu (caution : a c:\recovery directory exists too).

I now have 5 backup files (approx 650MB each) of my predesktop area, ready to be burned. I didn't test the restoration process, as it requires a fresh unformatted hard disk. I will probably do it if I acquire another hard drive in the future. For now, I assume that I could safely remove the predesktop area from my current hard drive, because I hopefully have a way to recreate it on another disk.

Update 2004/01/11: I received my new laptop hard disk, so I could test the restoration of the predesktop area on this new drive, using the information provided in the IBM white paper, and the area previously backed up from my other hard disk. Although the new disk is a 60GB one, and the previous one an 80GB one, the restore process works fine, and gives back the same content as was previously available.

Now that the predesktop area is restored, I set the security level of this area in the BIOS back to "Normal", so this disk area is no longer visible to partitioning tools.

Then I run the "recovery program" from the predesktop area, to restore a fresh Windows XP partition. The laptop copies a huge number of files, unzips them, and reboots several times. All works fine, except one disturbing detail : I cannot resize this new FAT32 Windows XP partition with parted, as I did on my previous disk. parted warns me about a geometry problem :


Using /dev/hda
Warning: The partition table on /dev/hda is inconsistent. There are many reasons why this might be the case. Often, the reason is that Linux detected
the BIOS geometry incorrectly. However, this does not appear to be the case
here. It is safe to ignore,but ignoring may cause (fixable) problems with some
boot loaders, and may cause problems with FAT file systems. Using LBA is
recommended.
Ignore/Cancel?

I ignore this warning and proceed anyway: I resize this single partition to 7GB (resize 1 0.031 7000), but on the next reboot the NT loader is no longer found (NTLDR missing). fdisk, in contrast, doesn't find any structural problem with this initial partitioning scheme. As a workaround to resize the WinXP partition anyway, I finally use FIPS instead of parted. This unfortunately requires restarting the "recovery program" from scratch.

The restore process continues without any other difficulties : WinXP finishes its setup, and I can restore my linux partitions, and install grub successfully.

Kernel boot messages

You can grab my kernel boot messages.

/etc/modules.conf file

alias eth0 e1000
alias usb-controller ehci-hcd
alias usb-controller1 usb-uhci
post-install sound-slot-0 /bin/aumix-minimal -f /etc/.aumixrc -L >/dev/null 2>&1 || :
pre-remove sound-slot-0 /bin/aumix-minimal -f /etc/.aumixrc -S >/dev/null 2>&1 || :

## OSS
# options sound dmabuf=1
# alias sound-slot-0 i810_audio
## ALSA
alias char-major-116 snd
alias snd-card-0 snd-intel8x0
alias char-major-14 soundcore
alias sound-slot-0 snd-card-0
alias sound-service-0-0 snd-mixer-oss
alias sound-service-0-1 snd-seq-oss
alias sound-service-0-3 snd-pcm-oss
alias sound-service-0-8 snd-seq-oss
alias sound-service-0-12 snd-pcm-oss
alias sound-slot-1 off
alias sound-service-1-0 off
alias snd-minor-oss-0 snd-cs4236
alias snd-minor-oss-1 snd-opl3
alias snd-minor-oss-3 snd-pcm-oss
alias /dev/dsp snd-pcm-oss
alias /dev/mixer snd-mixer-oss
alias /dev/sequencer snd-seq-oss
options snd major=116 cards_limit=1 device_mode=0666 device_gid=0 device_uid=0
post-install snd-card-0 /usr/sbin/alsactl restore >/dev/null 2>&1 || :
pre-remove snd-card-0 /usr/sbin/alsactl store >/dev/null 2>&1 || :

alias /dev/ppp ppp_generic
alias char-major-108 ppp_generic
alias tty-ldisc-3 ppp_async
alias tty-ldisc-14 ppp_synctty
alias ppp-compress-21 bsd_comp
alias ppp-compress-24 ppp_deflate
alias ppp-compress-26 ppp_deflate

keep
path[thinkpad]=/lib/modules/`uname -r`/thinkpad
options thinkpad enable_smapi=1 enable_superio=1 enable_rtcmosram=1 enable_thinkpadpm=1
alias char-major-10-170 thinkpad
alias /dev/thinkpad thinkpad
below thinkpadpm thinkpad
below superio thinkpad
below smapi thinkpad
below rtcmosram thinkpad

/etc/X11/XF86Config file

# XFree86 4 configuration created by pyxf86config

Section "ServerLayout"
Identifier "Default Layout"
Screen 0 "Screen0" 0 0
InputDevice "mouse0" "CorePointer"
InputDevice "mouse1" "AlwaysCore"
InputDevice "Keyboard0" "CoreKeyboard"
EndSection

Section "Files"
# RgbPath is the location of the RGB database. Note, this is the name of the
# file minus the extension (like ".txt" or ".db"). There is normally
# no need to change the default.

# Multiple FontPath entries are allowed (they are concatenated together)
# By default, Red Hat 6.0 and later now use a font server independent of
# the X server to render fonts.

RgbPath "/usr/X11R6/lib/X11/rgb"
FontPath "unix/:7100"
EndSection

Section "Module"
Load "ddc"
Load "dbe"
Load "extmod"
Load "GLcore"
Load "dri"
Load "glx"
Load "fbdevhw"
Load "record"
Load "freetype"
Load "type1"
SubSection "extmod"
Option "omit xfree86-dga"
EndSubSection
EndSection

Section "InputDevice"
# Specify which keyboard LEDs can be user-controlled (eg, with xset(1))
# Option "Xleds" "1 2 3"

# To disable the XKEYBOARD extension, uncomment XkbDisable.
# Option "XkbDisable"

# To customise the XKB settings to suit your keyboard, modify the
# lines below (which are the defaults). For example, for a non-U.S.
# keyboard, you will probably want to use:
# Option "XkbModel" "pc102"
# If you have a US Microsoft Natural keyboard, you can use:
# Option "XkbModel" "microsoft"
#
# Then to change the language, change the Layout setting.
# For example, a german layout can be obtained with:
# Option "XkbLayout" "de"
# or:
# Option "XkbLayout" "de"
# Option "XkbVariant" "nodeadkeys"
#
# If you'd like to switch the positions of your capslock and
# control keys, use:
# Option "XkbOptions" "ctrl:swapcaps"
# Or if you just want both to be control, use:
# Option "XkbOptions" "ctrl:nocaps"
#
Identifier "Keyboard0"
Driver "keyboard"
Option "XkbRules" "xfree86"
Option "XkbModel" "pc105"
Option "XkbLayout" "us"
EndSection

Section "InputDevice"
Identifier "Mouse0"
Driver "mouse"
Option "Protocol" "MouseSystems"
Option "Device" "/dev/gpmdata"
Option "Emulate3Buttons" "no"
EndSection

Section "InputDevice"
# If the normal CorePointer mouse is not a USB mouse then
# this input device can be used in AlwaysCore mode to let you
# also use USB mice at the same time.
Identifier "Mouse1"
Driver "mouse"
Option "Protocol" "IMPS/2"
Option "Device" "/dev/input/mice"
Option "ZAxisMapping" "4 5"
Option "Emulate3Buttons" "no"
EndSection

Section "Monitor"
Identifier "Monitor0"
VendorName "Monitor Vendor"
ModelName "Generic Laptop Display Panel 1400x1050"
HorizSync 31.5 - 90.0
VertRefresh 59.0 - 75.0
Option "dpms"
EndSection

Section "Device"
Identifier "Videocard0"
Driver "radeon"
VendorName "Videocard vendor"
BoardName "ATI Radeon Mobility M9"
Option "AGPMode" "4"
Option "EnableDepthMoves" "on"
Option "EnablePageFlip" "on"
EndSection

Section "Screen"
Identifier "Screen0"
Device "Videocard0"
Monitor "Monitor0"
DefaultDepth 24
SubSection "Display"
Depth 24
Modes "1400x1050" "1280x1024" "1280x960" "1152x864" "1024x768" "640x480"
Virtual 0 0
EndSubSection
EndSection

Section "DRI"
Group 0
Mode 0666
EndSection

lspci -n output

00:00.0 Class 0600: 8086:3340 (rev 03)
00:01.0 Class 0604: 8086:3341 (rev 03)
00:1d.0 Class 0c03: 8086:24c2 (rev 01)
00:1d.1 Class 0c03: 8086:24c4 (rev 01)
00:1d.2 Class 0c03: 8086:24c7 (rev 01)
00:1d.7 Class 0c03: 8086:24cd (rev 01)
00:1e.0 Class 0604: 8086:2448 (rev 81)
00:1f.0 Class 0601: 8086:24cc (rev 01)
00:1f.1 Class 0101: 8086:24ca (rev 01)
00:1f.3 Class 0c05: 8086:24c3 (rev 01)
00:1f.5 Class 0401: 8086:24c5 (rev 01)
00:1f.6 Class 0703: 8086:24c6 (rev 01)
01:00.0 Class 0300: 1002:4c66 (rev 02)
02:00.0 Class 0607: 104c:ac55 (rev 01)
02:00.1 Class 0607: 104c:ac55 (rev 01)
02:01.0 Class 0200: 8086:101e (rev 03)
02:02.0 Class 0200: 168c:0012 (rev 01)

lspnp -v output

01 PNP0c02 system peripheral: other
io 0x0010-0x001f
io 0x0024-0x0025
io 0x0028-0x0029
io 0x002c-0x002d
io 0x0030-0x0031
io 0x0034-0x0035
io 0x0038-0x0039
io 0x003c-0x003d
io 0x0050-0x0053
io 0x0072-0x0073
io 0x0074-0x0075
io 0x0076-0x0077
io 0x0080-0x0080
io 0x0090-0x0091
io 0x0092-0x0092
io 0x0093-0x009f
io 0x00a4-0x00a5
io 0x00a8-0x00a9
io 0x00ac-0x00ad
io 0x00b0-0x00b1
io 0x00b2-0x00b3
io 0x00b4-0x00b5
io 0x00b8-0x00b9
io 0x00bc-0x00bd
io 0x15e0-0x15ef
io 0x1600-0x167f
io 0x004e-0x004f
mem 0xffc00000-0xffffffff

02 PNP0c01 memory controller: RAM
mem 0x00000000-0x0009ffff
mem 0x000e0000-0x000fffff
mem 0x00100000-0x3fffffff

03 PNP0200 system peripheral: DMA controller
io 0x0000-0x000f
io 0x0081-0x008f
io 0x00c0-0x00df
dma 4

04 PNP0000 system peripheral: programmable interrupt controller
io 0x0020-0x0021
io 0x00a0-0x00a1
irq 2

05 PNP0100 system peripheral: system timer
io 0x0040-0x0043
irq 0

06 PNP0b00 system peripheral: real time clock
io 0x0070-0x0071
irq 8

07 PNP0303 input device: keyboard
io 0x0060-0x0060
io 0x0064-0x0064
irq 1

08 PNP0c04 reserved: other
io 0x00f0-0x00ff
irq 13

09 PNP0800 multimedia controller: audio
io 0x0061-0x0061

0a PNP0a03 bridge controller: PCI
io 0x0cf8-0x0cff

0b PNP0c02 bridge controller: ISA
io 0x04d0-0x04d1
io 0x1000-0x105f
io 0x1060-0x107f
io 0x1180-0x11bf

0c INT0800 memory controller: flash
mem 0xffb80000-0xffbfffff
mem 0xffb00000-0xffb7ffff
mem 0xffa80000-0xffafffff
mem 0xffa00000-0xffa7ffff
mem 0xff980000-0xff9fffff
mem 0xff900000-0xff97ffff
mem 0xff880000-0xff8fffff
mem 0xff800000-0xff87ffff
mem 0xff000000-0xff07ffff

0d PNP0c02 memory controller: RAM
mem 0x000d2000-0x000d3fff

0e PNP0c02 memory controller: RAM
mem 0x000dc000-0x000dffff

0f PNP0680 mass storage device: IDE
io 0x01f0-0x01f7
io 0x03f6-0x03f6
irq 14
io 0x1860-0x1867

10 PNP0680 mass storage device: IDE
io 0x0170-0x0177
io 0x0376-0x0376
irq 15
io 0x1868-0x186f

11 IBM0057 input device: mouse
irq 12

12 PNP0501 communications device: RS-232
io disabled
irq disabled

13 IBM0071 communications device: RS-232
dma 3
io 0x02f8-0x02ff
irq 3

16 PNP0400 communications device: AT parallel port
io 0x03bc-0x03bf
irq 7

19 PNP0e03 bridge controller: PCMCIA
io disabled

lspci output

00:00.0 Host bridge: Intel Corp.: Unknown device 3340 (rev 03)
00:01.0 PCI bridge: Intel Corp.: Unknown device 3341 (rev 03)
00:1d.0 USB Controller: Intel Corp. 82801DB USB (Hub #1) (rev 01)
00:1d.1 USB Controller: Intel Corp. 82801DB USB (Hub #2) (rev 01)
00:1d.2 USB Controller: Intel Corp. 82801DB USB (Hub #3) (rev 01)
00:1d.7 USB Controller: Intel Corp. 82801DB USB EHCI Controller (rev 01)
00:1e.0 PCI bridge: Intel Corp. 82801BAM/CAM PCI Bridge (rev 81)
00:1f.0 ISA bridge: Intel Corp.: Unknown device 24cc (rev 01)
00:1f.1 IDE interface: Intel Corp.: Unknown device 24ca (rev 01)
00:1f.3 SMBus: Intel Corp. 82801DB SMBus (rev 01)
00:1f.5 Multimedia audio controller: Intel Corp. 82801DB AC'97 Audio (rev 01)
00:1f.6 Modem: Intel Corp. 82801DB AC'97 Modem (rev 01)
01:00.0 VGA compatible controller: ATI Technologies Inc Radeon R250 Lf [Radeon Mobility 9000] (rev 02)
02:00.0 CardBus bridge: Texas Instruments PCI1250 PC card Cardbus Controller (rev 01)
02:00.1 CardBus bridge: Texas Instruments PCI1250 PC card Cardbus Controller (rev 01)
02:01.0 Ethernet controller: Intel Corp.: Unknown device 101e (rev 03)
02:02.0 Ethernet controller: Unknown device 168c:0012 (rev 01)

lsmod output

Module Size Used by Not tainted
radeon 110692 1
agpgart 30952 3
prism2_cs 76800 1
autofs4 12308 1 (autoclean)
p80211 23052 1 [prism2_cs]
ds 8680 2 [prism2_cs]
yenta_socket 13632 2
pcmcia_core 60544 0 [prism2_cs ds yenta_socket]
e1000 60128 0
af_packet 15560 1 (autoclean)
sg 31820 0 (autoclean)
sr_mod 16184 0 (autoclean)
ide-scsi 12208 0
scsi_mod 99860 3 [sg sr_mod ide-scsi]
ide-cd 35680 0
cdrom 33696 0 [sr_mod ide-cd]
snd-mixer-oss 16504 0 (autoclean)
snd-intel8x0 22308 0
snd-pcm 85376 0 [snd-intel8x0]
snd-timer 19784 0 [snd-pcm]
snd-ac97-codec 46728 0 [snd-intel8x0]
snd-page-alloc 8516 0 [snd-intel8x0 snd-pcm]
snd-mpu401-uart 5216 0 [snd-intel8x0]
snd-rawmidi 18688 0 [snd-mpu401-uart]
snd-seq-device 6348 0 [snd-rawmidi]
snd 43940 0 [snd-mixer-oss snd-intel8x0 snd-pcm snd-timer snd-ac97-codec snd-mpu401-uart snd-rawmidi snd-seq-device]
soundcore 6404 1 [snd]
keybdev 2976 0 (unused)
mousedev 5556 1
hid 22244 0 (unused)
input 5792 0 [keybdev mousedev hid]
usb-uhci 26380 0 (unused)
ehci-hcd 20104 0 (unused)
usbcore 79136 1 [hid usb-uhci ehci-hcd]
rtc 8380 0 (autoclean)
vga16fb 11680 0 (unused)
fbcon-vga-planes 5192 0 [vga16fb]
ext3 69732 1
jbd 51764 1 [ext3]

Content of /proc/cpuinfo

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 9
model name : Intel(R) Pentium(R) M processor 1600MHz
stepping : 5
cpu MHz : 1598.669
cache size : 1024 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr mce cx8 apic sep mtrr pge mca cmov pat
clflush dts acpi mmx fxsr sse sse2 tm
bogomips : 3191.60

Content of /proc/interrupts

 CPU0 
0: 40463 XT-PIC timer
1: 7 XT-PIC keyboard
2: 0 XT-PIC cascade
3: 3120 XT-PIC prism2_cs
4: 15507 XT-PIC usb-uhci, Texas Instruments PCI1250 PC card Cardbus Controller, radeon@PCI:1:0:0
5: 0 XT-PIC Intel 82801DB-ICH4, Texas Instruments PCI1250 PC card Cardbus Controller (#2)
6: 0 XT-PIC usb-uhci
8: 1 XT-PIC rtc
9: 0 XT-PIC usb-uhci
11: 0 XT-PIC ehci-hcd
12: 32 XT-PIC PS/2 Mouse
14: 8757 XT-PIC ide0
15: 225 XT-PIC ide1
NMI: 0
LOC: 40428
ERR: 0
MIS: 0

$Id: t40.html,v 1.167 2008/03/10 13:32:22 bellet Exp $

Fabrice Bellet
