OSX, XQuartz, Xterm Paste

Something that has been annoying me for a while is how OSX’s X implementation behaves when you try to paste into an Xterm. The usual hotkey doesn’t paste as you’d expect (though, oddly, copy does work). After a perusal of the man page, it turns out this is rather easy to fix:

First, quit X and run these commands:

defaults write org.x.X11 sync_pasteboard_to_primary -boolean true
defaults write org.x.X11 sync_pasteboard_to_clipboard -boolean true
defaults write org.x.X11 sync_pasteboard -boolean true
defaults write org.x.X11 enable_fake_buttons -boolean true

Then restart X. Holding down the Option key and clicking (one finger) in the Xterm should now paste, as you’d expect on most other *NIX boxes.

Writing Matlab files from C

Normally I’m not a fan of closed tools and even less a fan of closed file formats, but lately I’ve been struggling with some numerical code and the FOSS tools haven’t been working well enough (which is a topic for another day). Simply put, I need something that does matrix arithmetic correctly and makes it easy to manipulate and debug data structures. I could use SciPy, but I’ve never liked Python, and Matlab makes what I need trivial. Since gfortran and ifort both seem to generate incorrect debug symbols for the code I’m working with, and Matlab’s text->double parser seems buggy, I’ve been looking for a way to get my data (correctly!) into Matlab.

Matlab/Mathworks does provide an API for this, but using it would require me to have at least most of a Matlab install on all the machines I’m running on – not feasible. Luckily, an open source solution for writing .mat files does exist: MatIO. I found a wiki that claims to document how to use it, but their examples don’t seem to work with the current version (they compile and run, but the output isn’t right).

Based on the wiki and “test_mat.c” included with the MatIO distribution, I was able to get this minimal example working on OSX Lion:


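#include <stdio.h>
#include <matio.h>

int main(int argc, char** argv)
{
 double a[5] = {1.0,2.0,3.0,4.0,5.0};
 mat_t* mat;
 matvar_t* matvar;
 size_t dims[2] = {1,5}; /* 1 row, 5 columns */

 /* NULL header string: let MatIO write its default header */
 mat = Mat_CreateVer("test.mat", NULL, MAT_FT_DEFAULT);
 if (mat == NULL) {
  fprintf(stderr, "could not create test.mat\n");
  return 1;
 }
 /* rank-2 double array; the final 0 means no option flags */
 matvar = Mat_VarCreate("vector", MAT_C_DOUBLE, MAT_T_DOUBLE, 2, dims, a, 0);
 if (matvar == NULL) {
  fprintf(stderr, "could not create variable\n");
  Mat_Close(mat);
  return 1;
 }
 Mat_VarWrite(mat, matvar, MAT_COMPRESSION_NONE);
 Mat_VarFree(matvar);
 Mat_Close(mat);
 return 0;
}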

I’m compiling this as follows (though you could probably use gcc or icc if you wanted):

clang -g -Wall -L../1.5.0/lib -I../1.5.0/include/ vector.c -lmatio -lm -lz

(where my MatIO install is in ../1.5.0 and my test code is called “vector.c”)

This is all I really need, but the library can write a few different formats and various data types (apparently a few recent versions of Matlab even use HDF5 rather than a closed format!).

“MyTomTom” on OSX 10.7.x

My old (great and reliable, until it got wet) TomTom One series finally died a while back, and the Android phone isn’t cutting it in areas without cell coverage, so I finally upgraded to a new VIA series. Unfortunately, it refused to talk to OSX at first. The status icon would say “Connecting” and the right part number would show up in their web interface, but I was unable to associate the device with my account or add any updates.

Luckily, I stumbled on a solution: instead of USB mass storage like the old models, the VIA seems to show up as a network interface. As a result, OSX wants the new “network interface” to be “configured”. With the “mytomtom” software running, I was able to simply open the network settings panel, choose “TomTom”, and click Apply; the device connected almost instantly. The TomTom device should (it seems) be set to get a DHCP address (hopefully the default).

If it works the same on Windows machines, I’d imagine that the Windows Firewall and any 3rd party firewall software would have the potential to cause similar symptoms. Windows users might want to check into that.

For what it’s worth, I’m using the TomTom web interface in Chrome (currently version 20.0.1132.27). There are some minor JavaScript issues (buttons that don’t always register clicks correctly), but it’s usable. I also tested Firefox 13, which seemed to work a little better.

Building NMatrix on OSX

NMatrix is a Ruby library providing some linear algebra functions and sparse data structures. Unfortunately, it’s surprisingly difficult to get running on OSX. Here’s a hackish method using rvm and MacPorts. I’ll show what I had to install on my system; you may need more, since the documentation is rather sparse at this point.

sudo port install atlas

cd

rvm use ruby-1.9.3

gem install isolate hoe rspec rake-compiler hoe-git json packable
git clone https://github.com/mohawkjohn/nmatrix.git

cd nmatrix

rake compile

sed -i "" 's:clang\+\+:g++ -I/opt/local/include:g' tmp/x86_64-darwin11.3.0/nmatrix/1.9.3/Makefile

sed -i "" 's:clang:gcc -I/opt/local/include:g' tmp/x86_64-darwin11.3.0/nmatrix/1.9.3/Makefile

sed -i "" 's:nmatrix\.so:nmatrix\.bundle:g' lib/nmatrix.rb

rake compile

rake compile

rake newb

cd nmatrix

ln nmatrix/nmatrix.bundle .

cd

irb -I nmatrix/lib/

require 'nmatrix'

Hopefully this process will improve soon (SciRuby has Google Summer of Code support this year), but this procedure should get you started.

Update: for debugging purposes, I’m running an up-to-date OSX 10.7.4 and MacPorts (2.1.0), ruby-1.9.3-p125, clang version 3.1 (tags/Apple/clang-318.0.58).

Disabling Spotlight on OSX Lion

Apple’s search indexer “Spotlight” regularly runs and uses all the disk bandwidth and most of the CPU on my machine. This seems to correlate with large numbers of file changes, and it can keep running for hours. For example, it can’t handle a git checkout that changes a few thousand files. Backup programs that generate a large number of short-lived temporary files (for example, BackBlaze) also seem to incapacitate the indexer. Unfortunately, it’s not that easy to disable globally from the GUI. Luckily, Apple provides a command line tool. To disable indexing for the main volume you can do:


sudo mdutil -i off /

I also discovered that deleting the index frees a shocking amount of space (around 7G on my machine!):


sudo mdutil -Ea

If your Mac is slow and you’re not sure whether this is the cause, check your favorite monitoring tool for two processes: “mds” and “mdworker”.
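
If you’d rather check from a terminal, something like the following should do. It’s only a sketch: “spotlight_procs” is a made-up helper name, and it assumes that ps prints the command path without embedded spaces (which should hold for the usual mds/mdworker locations).

```shell
# Keep only rows whose command basename is "mds" or starts with "mdworker".
# Input: lines of "<%cpu> <command path>", as printed by ps -Ao pcpu,comm.
spotlight_procs() {
    awk '{ n = split($2, parts, "/"); if (parts[n] == "mds" || parts[n] ~ /^mdworker/) print }'
}

# %CPU and command name for every process, filtered down to the indexer.
ps -Ao pcpu,comm | spotlight_procs
```

No output means neither process is currently running.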

Poor Man’s Function Tracing

Tracing tools for executables are surprisingly hard to come by. Profilers are plentiful, but if you’re willing to take the performance hit and want to see a complete, recorded, call tree your options are rather limited. In the HPC world, the “standard” is probably Tau, but it’s a pain to integrate with non-trivial build systems and complicated to use. I also found a tool called “etrace” which uses a lesser known feature of gcc that adds a callback at each function entrance/exit (“-finstrument-functions”). Unfortunately, it relies on some specific behavior in nm to look up function names which doesn’t seem to work anymore.

Instead of a tool linked into your binary, you could also script GDB to record each function entry. Unfortunately, it turns out that this is both surprisingly difficult to do (more on that in a future post) and amazingly slow, particularly if the original source was in Fortran.

Luckily for me, glibc has a set of functions which provide stack unwinding and symbol lookup (creatively named “backtrace()” and “backtrace_symbols()”). Combined with the technique used by etrace, we can easily write a tool which prints the name of the calling function (or a full backtrace) every time the program enters a non-library function (which is generally more desirable than looking at every function call anyway).

My quick implementation of this is available on GitHub. To use it, all you have to do is compile your code with -finstrument-functions (gcc/g++/gfortran and icc/ifort at least) and link with my C file. When you run your new executable, redirect the output to a file.

Once you have this log, you can post-process it to generate a variety of things. A (not well debugged) Ruby script is provided in the GitHub repository which strips out the function names. From the output of that script, you can post-process with any number of standard *NIX tools. For example, if you pipe to “c++filt -n” you can get the C++ functions demangled. It would probably also be fairly easy to feed the output into dot/graphviz or LaTeX and generate a nice call graph.
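
As one example of that kind of post-processing, here is a rough sketch that counts how often each function shows up. It assumes the Ruby script’s output has been reduced to one function name per line; “names.txt” and “call_counts” are made-up names for illustration, not part of the repository.

```shell
# Count identical lines on stdin and print "count name", most frequent first.
call_counts() {
    sort | uniq -c | sort -rn | sed 's/^ *//'
}

# Stand-in data: one function name per line (assumed format).
printf '%s\n' main solve main_loop solve solve > names.txt

call_counts < names.txt
```

For C++ code, running the names through “c++filt -n” before call_counts should key the counts on demangled names instead.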

If you come up with an interesting visualization or other improvement, feel free to send a pull request. I’d also love to hear if anyone knows about any easier/better ways of doing this: blog-contact at kc2vjw dot com

Fun with LSI RAID

If you deal with “enterprise” hardware at all, you’re probably familiar with LSI’s RAID controllers. Their stuff seems to generally work rather well, though the user interface could be improved. In any case, one of the machines I run with 12 disks on an LSI controller recently started exhibiting some unusual performance characteristics. For the most part things were working fine, but maybe once a week the system would slow to a crawl until it was rebooted or left alone for a few hours. I eventually tracked down the problem to a failing drive which the controller never recognized as failed. Hopefully the troubleshooting technique can be useful to others having similar problems (or at least as notes for next time). Before we get started, it’s worth mentioning that it’s very easy to lose data when doing this sort of thing. Standard disclaimers apply: you’re responsible for your actions (not me, not my webhost, etc).

Since we’re not using an OS on which the LSI GUI tools run, I’m stuck using the (rather cryptic) LSI CLI tools. First we ask for the status of each drive and look for anomalies:


MegaCli64 -PDList -aALL | grep "Count" | sort -u

MegaCli64 -PDList -aALL | grep "state" | sort -u
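
The two filters above throw away which drive each line belongs to. Here is a sketch that keeps the counters next to the enclosure and slot. The field labels (“Enclosure Device ID”, “Slot Number”, “Media Error Count”, and so on) and their order are assumptions based on typical -PDList output, so check them against your controller; “pd_summary” and the sample values are made up for illustration.

```shell
# Print one line per physical drive: [enclosure:slot] plus error counters and state.
# Assumes each drive's "Firmware state" line comes after its error counters.
pd_summary() {
    awk -F': *' '
        /^Enclosure Device ID/      { enc = $2 }
        /^Slot Number/              { slot = $2 }
        /^Media Error Count/        { media = $2 }
        /^Other Error Count/        { other = $2 }
        /^Predictive Failure Count/ { pred = $2 }
        /^Firmware state/           { printf "[%s:%s] media=%s other=%s predictive=%s state=%s\n", enc, slot, media, other, pred, $2 }
    '
}

# Real use would be: MegaCli64 -PDList -aALL | pd_summary
# Demo on made-up sample text so the function can be tried without a controller:
printf '%s\n' \
    'Enclosure Device ID: 8' \
    'Slot Number: 11' \
    'Media Error Count: 3' \
    'Other Error Count: 7' \
    'Predictive Failure Count: 0' \
    'Firmware state: Online, Spun Up' | pd_summary
```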

Assuming that returns nothing interesting, you can always look at the output without any filtering


MegaCli64 -PDList -aALL | less

but chances are, you’ll have to look at the event log:


MegaCli64 -AdpEventLog -GetEvents -f log.txt -aALL

less log.txt

This is where problems started to show up on my system, in the form of sense errors.


Code: 0x00000071
Class: 0
Locale: 0x02
Event Description: Unexpected sense: PD 12(e0x08/s11) Path 5003048000779f4f, CDB: 28 00 00 e8 ba 01 00 00 02 00, Sense: 3/11/00

I’m told that it is normal to see a few such communication errors on a loaded system with this series of controller, but something was obviously wrong and the problems were generally occurring on the same disk, so I decided to try replacing it with my cold spare. You can prepare a drive for removal as follows (be very careful to get the various IDs correct, since LSI uses different addresses for different things):

MegaCli64 -CfgDsply -a0 # figure out the enclosure id
MegaCli64 -PDInfo -PhysDrv [8:11] -a0 # make sure this is the drive
MegaCli64 -PDOffline -PhysDrv [8:11] -a0 # mark offline enclosure 8, disk 11 ("PD12" in the event log in my case)
MegaCli64 -PDMarkMissing -PhysDrv [8:11] -a0 # mark missing
MegaCli64 -PdPrpRmv -PhysDrv [8:11] -a0 # prepare for removal
MegaCli64 -PdLocate -start -PhysDrv [8:11] -a0 # turn on the error LED
MegaCli64 -PDInfo -PhysDrv [8:11] -a0 # verify and swap the disk
MegaCli64 -PDOnline -PhysDrv [8:11] -a0 # mark the new disk online if it's not already

You may also want to monitor the progress of the rebuild:

MegaCli64 -PDRbld -ShowProg -PhysDrv [8:11] -a0

Note that my example uses “PD12”, which is marked “11” on the Supermicro chassis and which is enclosure 8, disk 11 as detected by the rest of the controller (confused yet?). “-a0” refers to the zeroth adapter. You’ll need to change the numbers for your situation.

After all this, I was able to do some testing and see *much* better performance:

# Write a big file (bs=1024M with count=4096 is 4 TiB; lower count to fit your array)
dd if=/dev/zero of=/test.img bs=1024M count=4096

*NIX Output Logging

I was asked today about how to log the output of a command on Linux. It turns out that there are a number of ways of capturing output in *NIX environments. The best-known method is to let the shell take care of it. For example, in bash you could do:


echo "test" &>log.txt
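
A couple of variations on shell redirection, as a sketch: the POSIX spelling of the same capture (for shells without “&>”), and “tee” for when you also want to watch the output while it is being logged. “noisy” and the log file names are made up for illustration.

```shell
# Hypothetical stand-in for a real command: writes one line to each stream.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

logdir=$(mktemp -d)

# POSIX spelling of bash's "&>": redirect stdout to the file, then point stderr at stdout.
noisy > "$logdir/log.txt" 2>&1

# Same capture, but tee also copies everything to the terminal as it arrives.
noisy 2>&1 | tee "$logdir/log_tee.txt"
```

The order of the redirections matters: “2>&1” has to come after “> file”, otherwise stderr still goes to the terminal.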

Unfortunately, this does have some limitations. It will capture standard output and standard error, but it won’t (directly) capture any input from the user and it won’t capture anything directly written to the pty by the program. One way to work around this would be to manipulate the pty manually, but that’s a pain. Most *NIX flavors ship with a tool called “script” which will launch a shell and keep track of the output for you. For example you might do the following:


$ script

Script started, output file is typescript

$ echo "foo" > /dev/tty

^D

Script done, output file is typescript

After the shell exits, you should get a file called “typescript” which contains a plain text log of everything sent to the terminal. You can also have script record timing data so that the log can be played back (check out the manpage).

Most scripting languages (and, of course, “expect”) can do something similar if you want to filter the output on the fly.

One thing to be aware of with this method is that it will probably capture some control characters. If you look at the output in an editor, these will likely show up as raw escape codes rather than being interpreted. If you just “cat” the file from a terminal, they may well be interpreted as they were originally. You can always filter out any unwanted characters with sed (or similar).
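
As a rough sketch of that filtering, the function below removes ANSI CSI escape sequences (colors, cursor movement) and carriage returns. “strip_control” is a made-up name, and a real typescript can contain other sequences (terminal title changes, for example) that this doesn’t handle.

```shell
# Remove ANSI CSI escape sequences and carriage returns from stdin.
strip_control() {
    esc=$(printf '\033')    # literal ESC character, so this works without GNU sed's \x1b
    sed "s/${esc}\[[0-9;?]*[A-Za-z]//g" | tr -d '\r'
}

# Typical use would be: strip_control < typescript > typescript.txt
# Demo on a colored line as a terminal might record it:
printf '\033[1;32mfoo\033[0m\r\n' | strip_control
```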