Linux


Previously I mentioned some issues I had been having on Kubuntu Feisty Fawn with disk utilization seemingly caused by unflushed disk buffers. I alluded to believing that my “laptop-mode.conf” parameters were at fault.

With my recent upgrade of that same laptop to Kubuntu Gutsy Gibbon, I kept the laptop-mode.conf file a bit closer to the maintainer’s version. There are some changes to the “dirty-writeback-centiseconds” and the “dirty-background-ratio” values from what I posted, and my issue seems to have gone away. I’ve been able to go back to running my Windows 2003 SBS server with a Centrify DirectControl lab environment and a RHEL 4 Oracle 10g server attached at the same time.

The configuration files that work MUCH better are attached here:

laptop-mode.conf

cpufreqd.conf

I upgraded my Dell D620 from Feisty to Gutsy this weekend, which included an upgrade to kernel 2.6.22. Every time there’s a kernel upgrade, VMWare Workstation needs to be reconfigured with “vmware-config.pl”. This isn’t an issue normally, but today it was. Thanks to Chris Hope with Electric Toolbox I was able to fix the problem quickly and easily.

For completeness, the error I was getting was the same:
/tmp/vmware-config1/vmnet-only/userif.c:630: error: ‘const struct sk_buff’ has no member named ‘h’
when trying to build the VMNet module – VMMon built and inserted perfectly. I downloaded 6.0.1, installed it, and I’m back in the game.

On my Ubuntu Dell D620 laptop (which dual-boots into Windows XP occasionally), I run some pretty demanding software. Sometimes. As a systems architect, I spend a lot of time in standard “Information Worker” tools – email, office suite, web browser for white papers and product reviews. For me, OpenOffice.org, Evolution, and Firefox work great. I even like Evolution’s Exchange plugin better than Outlook – everything updates on my Windows Mobile phone just like Outlook, it’s faster than Outlook, and it threads my messages. I love message threading for all the reasons that it was invented.

However, because I mostly design Windows networks, I also run VMWare Workstation 6 with the following VMs: Windows XP Pro joined to the primary domain (a full workstation with Office 2003, Visio, Outlook with the EMC EmailExtender plugins, Windows Admin Pack, Resource Kit, SQL Enterprise Manager, and Exchange Admin Pack), Windows Vista Enterprise, Windows 2003 DC, Exchange, SQL server, Windows XP Pro joined to the VM domain, RHEL 4, RHEL 5, a Live-CD system that I often use to test bootable CDs, and a sysprep’d Windows install.

All that, plus Evolution caching my email and Firefox being disk-happy (did you ever “strace -p $(pgrep firefox) -e trace=open,close,read,write”? It’s busier than Evolution during an offline mail check!), means my 7200 RPM 160GB SATA disk gets hammered. My use of laptop-mode-tools, which changes my vm.dirty_writeback_* settings and the read/write cache, isn’t helping either, I’m sure.
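
To see what those writeback settings actually are at any given moment, something like the following could be used. It is a minimal sketch, assuming the usual /proc/sys/vm and /proc/meminfo files are present; the helper name centisecs_to_secs is made up for illustration.

```shell
#!/bin/sh
# Sketch: print the kernel writeback knobs and the amount of dirty data waiting.
# Assumes a Linux /proc; centisecs_to_secs is a hypothetical helper name.

# Convert centiseconds to seconds using integer math only (60003 -> 600.03).
centisecs_to_secs() {
    printf '%d.%02d\n' "$(($1 / 100))" "$(($1 % 100))"
}

# Walk every vm.dirty_* knob and show *_centisecs values in seconds as well.
for knob in /proc/sys/vm/dirty_*; do
    [ -r "$knob" ] || continue
    name=${knob##*/}
    value=$(cat "$knob")
    case $name in
        *_centisecs) echo "$name = $value ($(centisecs_to_secs "$value") s)" ;;
        *)           echo "$name = $value" ;;
    esac
done

# Dirty = pages waiting to be written; Writeback = pages being written now.
if [ -r /proc/meminfo ]; then
    grep -E '^(Dirty|Writeback):' /proc/meminfo || true
fi
```

Running it a few times while the disk is busy should show whether the Dirty figure is climbing well past what the ratios suggest.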

Today and yesterday I ran into an issue where my disk would begin a sync that would last 10-20 minutes, leaving me unable to work the entire time. Hunting down WHAT was causing this, however, was even more frustrating than the problem itself (if I shut down any of the 3 above-mentioned programs, the problem went away; it only happened with all 3 open). In Windows, you can open Task Manager, go to the “Processes” tab, click “View -> Select Columns” and add “I/O Write Bytes” and “I/O Read Bytes” and watch the numbers count up. Or you can use Perfmon and look at IO reads/writes/bytes per second or total, and know immediately what’s causing all your disk IO pain. I still don’t know how to do this in Linux.

First, any hunt for “disk utilization” and “Linux” on Google directs you to hundreds of sites, forums, and blogs evangelizing the wonders of “df” for disk utilization. Yes, it’s really nice to know how much free space I have on my hard drives; that’s why I have SuperKaramba to tell me. But when a problem hits and leaves me unable to work, it’s useless.

“iostat -k 1” is great – you’ll know immediately which disk is being used, and how hard. But on a laptop with a single disk, you already know.

“top” sorted by process-state will show you what’s in “waiting on IO” state, but not what’s CAUSING the IO that’s causing everything else to wait.

“sar” seems to be the only tool that can provide per-process IO stats, but it has to be pre-set up to write to a log. And I can’t begin to guess how well that will work when my disk is at 100% utilization (peaked at 120tps today).

So if anyone knows of a way to see what’s causing disk IO in a “right now” fashion, please comment or email me. And if you’re curious about the details of my problem:

  1. Only happens when VMWare (with a guest), Firefox, and Evolution are all running.
  2. VMWare with multiple guests runs fine, and never has this issue.
  3. rauch@lt00-bofh:~$ free
                 total       used       free     shared    buffers     cached
    Mem:       3348960    1099216    2249744          0      68668     533124
    -/+ buffers/cache:     497424    2851536
    Swap:      6000268          0    6000268
  4. Happens with Laptop_mode disabled or enabled, on AC or on battery
  5. “sync” causes the exact same symptoms, leading me to believe that somehow I’m getting a LOT more dirty pages than my parameters are set at.
  6. dirty_background_ratio: 1
    dirty_expire_centisecs: 60003
    dirty_ratio: 60
    dirty_writeback_centisecs: 60003
  7. For now, I just close Firefox when I have VMWare open, which means I spend a lot more time in IE than I want to.
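
One possible answer to the “right now” question, offered as a sketch rather than a tested fix: kernels from 2.6.20 on can expose per-process counters in /proc/<pid>/io when task I/O accounting is compiled in. Whether a given distribution kernel enables it is an assumption to verify. The function names below (snapshot, diff_snapshots, top_writers) are made up for illustration.

```shell
#!/bin/sh
# Sketch: list the processes that wrote the most bytes during a short interval.
# Assumes /proc/<pid>/io exists (kernel 2.6.20+ with task I/O accounting).
# All function names are hypothetical.

# Print "pid write_bytes" for every process whose io file we are allowed to read.
snapshot() {
    for f in /proc/[0-9]*/io; do
        pid=${f#/proc/}
        pid=${pid%/io}
        bytes=$(awk '$1 == "write_bytes:" { print $2 }' "$f" 2>/dev/null)
        if [ -n "$bytes" ]; then echo "$pid $bytes"; fi
    done
}

# Compare two snapshot files and print "bytes_written pid", biggest first.
diff_snapshots() {
    awk 'NR == FNR { before[$1] = $2; next }
         ($1 in before) && ($2 + 0 > before[$1] + 0) { print $2 - before[$1], $1 }' "$1" "$2" |
        sort -rn
}

# Take two snapshots N seconds apart and report the difference.
top_writers() {
    first=$(mktemp) && second=$(mktemp) || return 1
    snapshot > "$first"
    sleep "$1"
    snapshot > "$second"
    diff_snapshots "$first" "$second"
    rm -f "$first" "$second"
}

# Runs only when an interval is given, e.g. "sudo sh top_writers.sh 5".
if [ "$#" -gt 0 ]; then
    top_writers "$1" | head -n 10
fi
```

Run as root to see other users' processes; each PID can then be looked up with “ps -p”. Swapping write_bytes for read_bytes in snapshot gives the read side.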

As a final note, I’ve updated my Linux EVDO post here with my new built-in card’s info.

I just finished evaluating an excellent piece of software for Windows / Linux hybrid shops: Centrify Corporation’s DirectControl Suite. This is a fantastically well-executed integration suite that allows administrators to bring their GNU/Linux and Unix boxes into the Windows Active Directory domain. This brings centralized control of UID/GID (like NIS), the mutual authentication of Kerberos, and centralized Group Policy control to Linux/Unix.

First off, I’d like to mention that the software installs first on a Windows “console” system. That install has the option of extending the schema, but it is not required (the extensions allow administrators to use the Centrify Profile tab for users and computers without installing the Centrify Console locally). All required pieces work with the standard out-of-the-box Windows 2003 AD schema, although the view extensions are well worth it if you can get them approved by your AD administrative team.

I installed this on a Debian Etch system and a Red Hat Enterprise Linux 4 box.  They ship RPM and DEB installers, so installation is a snap, and shows up in your package manager.  Restarting the systems was not required, but a few systems may not pick up the new PAM settings without at least a reload (OpenSSH did fine).

One of the best parts of this software, however, is their updated version of OpenSSH, which supports Windows Kerberos tickets for user authentication. That means single sign-on to any Linux box from Linux or Windows (with a customized PuTTY for the same reason) without having to copy RSA keys across your network every time you build a box. Now my Oracle admins can log into the 10g databases seamlessly (yes, they support Oracle authenticating through AD as well).

Of course, no solution that integrates into AD would be complete without support for Group Policy. As a huge user of Group Policy (I have 8 GPOs on my home domain), this is key for me. What makes it so spectacular is that they just install new ADM files on your console system. That’s it: no new trees needed, just new ADM files with settings specific to Linux like “SuDoers entries” and “SSH settings”. Just like GPOs on Windows, they’re applied every 90 ± 30 minutes, and when you remove the system from the policy, the settings get pulled. The Sudoers settings are appended to the end of the existing file. Also, many of your security settings for Windows boxes are read directly by the Centrify systems as well, including password expiration notices, lockout policy handling, etc.

There are so many other little features that show how well thought-out the system is.  The client can be configured to cache logons similar to Windows, so you can control your Linux laptops, and still enable the users to log in when they’re on the road. There are several scripts and other tools to help “suck” the users out of /etc/passwd and NIS into AD, to help keep your UIDs in check if you’re installing the client into existing servers.

And that’s just the operating system.  JBoss, WebSphere, Apache and other applications and middleware can be AD-enabled, and anything that uses PAM is automatically AD-enabled, giving you the ability to set up true single sign-on everywhere in your network, if you so choose.

Needless to say, we purchased it, and I’ll be integrating this into all my deployments from this point forward.

I’ve used this configuration, with minor tweaks, on 3 different laptops with 3 different OSes (if not 4) with great success. There’s so little good Linux info on evdoforums.com (at least I have a hard time finding it), and the posts I made link to a site that no longer exists, so I realized I had to repost this info. I’ve had both the Merlin S620 Sprint PCS EVDO card and the Sierra Mobile AirCard 575, and now have a Dell built-in EVDO modem.

For all 3 of them, the only difference was the vendor and product ID on the modprobe line, as noted below.
Merlin S620:
modprobe usbserial vendor=0x1410 product=0x1110
Sierra 575:
modprobe usbserial vendor=0x1199 product=0x0019
Dell Sprint 5720 PCI-Express Modem:
modprobe usbserial vendor=0x413c product=0x8134
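
For a card not listed here, the two IDs can usually be read from lsusb output, assuming the modem enumerates as a USB device (which is what the usbserial driver binds to). The sketch below turns each “ID vvvv:pppp” pair into a candidate modprobe line; modprobe_line is a made-up helper name, and picking the right device from the list is still up to you.

```shell
#!/bin/sh
# Sketch: derive candidate modprobe lines from lsusb output.
# modprobe_line is a hypothetical helper, not part of any package.

# Turn one line of lsusb output ("... ID vvvv:pppp ...") into a modprobe command.
# Prints nothing if the line has no vendor:product pair.
modprobe_line() {
    printf '%s\n' "$1" |
        sed -n 's/.* ID \([0-9a-fA-F]\{4\}\):\([0-9a-fA-F]\{4\}\).*/modprobe usbserial vendor=0x\1 product=0x\2/p'
}

# Print a candidate line for every USB device; choose the one that is your modem.
if command -v lsusb >/dev/null 2>&1; then
    lsusb | while read -r line; do
        modprobe_line "$line"
    done
fi
```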

I saved the appropriate PPP files at /etc/ppp/peers/1xevdo and at /etc/chatscripts/1xevdo_chat. In this post, I’ve blanked out the parts that are particular to my install (phone number), but it should be pretty easy to recreate your settings.

/etc/ppp/peers/1xevdo:

-detach
ttyUSB0
115200
debug
noauth
defaultroute
usepeerdns
user $(full-phone-number)@sprintpcs.com
show-password
crtscts
lock
lcp-echo-failure 4
lcp-echo-interval 65535
connect '/usr/sbin/chat -v -t3 -f /etc/chatscripts/1xevdo_chat'

/etc/chatscripts/1xevdo_chat:

'' 'AT'
'OK' 'ATZ'
'OK' 'ATE0V1&F&D2&C1&C2S0=0'
'OK' 'ATE0V1'
'OK' 'ATS7=60'
'OK' 'ATDT#777'
CONNECT CLIENT

Since I do several bits of work through the console, including accessing my Cisco VPN, and in some cases naim and tmsnc (console AOL and MSN chat) inside a screen session, scripts for these setups work great for me. I wrapped the whole thing up inside a bash script called $HOME/bin/evdo.sh – and I just call that when I want to get online, after inserting the card.
~/bin/evdo.sh:

sudo /sbin/modprobe usbserial vendor=0x1199 product=0x0019
sleep 5
sudo /usr/sbin/pppd call 1xevdo

The sleep statement helps make sure that the modprobe has completed, scanned the device, and settled before calling PPP.
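
A fixed sleep works, but an alternative is to poll for the serial device node and give up after a timeout. This is only a sketch: wait_for_path is a made-up helper name, the 15-second limit is arbitrary, and the node appearing does not guarantee the modem has fully settled, so a short sleep afterwards may still be wanted.

```shell
#!/bin/sh
# Sketch: poll for a device node instead of sleeping a fixed time.
# wait_for_path is a hypothetical helper name.

# Wait up to $2 seconds for $1 to exist.
# Returns 0 as soon as it appears, 1 if the timeout runs out first.
wait_for_path() {
    path=$1
    remaining=$2
    while [ ! -e "$path" ]; do
        [ "$remaining" -gt 0 ] || return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Usage, following the script above (commented out so pasting this is harmless):
# sudo /sbin/modprobe usbserial vendor=0x1199 product=0x0019
# wait_for_path /dev/ttyUSB0 15 && sudo /usr/sbin/pppd call 1xevdo
```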

My Dell D620 arrived, and I was able to quickly determine the changes for an “always there” card, vs. a pluggable PCMCIA card.
First I created /etc/modprobe.d/usbserial with the line:
options usbserial vendor=0x413c product=0x8134
and added “usbserial” to the end of /etc/modules so that the card would always come up at boot (I used “sudo lspci -v | less” to find the exact product ID and vendor – I only have 2 “Dell” devices on my laptop). I set my radio kill switch to affect only my EVDO and Bluetooth radios, letting the software (~/bin/rfkill.sh in Linux and the Dell software in Windows) handle the WiFi – I use WiFi all the time, but only want the battery-draining EVDO in a few specific instances. So I added “echo 0 > /sys/bus/pci/devices/0000\:03\:00.0/rf_kill” to my “evdo.sh” file to kill the wireless when I wanted to use EVDO – no need to ever have them both on.
