This README is for OpenSM and the InfiniBand diagnostic utilities in this
directory (management).

The master source repositories are:

libibumad: https://github.com/linux-rdma/rdma-core
opensm: git://git.openfabrics.org/~halr/opensm.git
infiniband-diags: https://github.com/linux-rdma/infiniband-diags

Packages
--------
libibumad - interface to ib_umad module (user_mad) library
libibmad - (OBSOLETE) generic MAD handling library
opensm - OpenSM
infiniband-diags - various diagnostic tools

Building
--------
libibumad is now built as part of the rdma-core project. Build that library
and either install it or configure the following packages to build against
it in place. "make install" does not work with rdma-core.

To build, unpack the tarballs and, in the opensm and infiniband-diags
directories (in that order), run:

./configure && make && make install

Typically the autogen and configure steps only need to be done the first
time, unless configure.in or Makefile.am changes in those directories.

Libraries are installed by default in /usr/local/lib and binaries in
/usr/local/sbin.

Running
-------
After compiling and installing, you can run opensm by invoking

/usr/local/sbin/opensm

opensm must be run as root. Run 'opensm --help' to see the options.

Note also that you must have udev mount /dev/infiniband or do it manually.
See .../src/linux-kernel/docs/user_mad.txt. Also, the ib_umad module must be
loaded.

opensm will run on the first existing port on the first IB device (HCA).
You can override that with the "-g" option, which takes a port GUID. Verify
that the first port is active. This assumes the port is plugged into another
IB device.

In case of problems, run opensm with -V and send the log file
(/var/log/opensm.log).

IMPORTANT: Don't forget to modprobe ib_umad and make sure udev is configured
before using any of the userspace programs.

OpenSM Limitations:
1. The retry mechanism in the SM is primitive and needs enhancing to deal
   with ports which are active but don't respond to SM MADs.
2. Async events are not yet supported (by OpenSM).
   The only one supported is local LID change (and this is handled in the
   mthca driver). Future versions of OpenSM may need to act on more local
   events.

Tuning OpenSM for Large Clusters
--------------------------------
Currently OpenSM is compiled with debug and no optimization. This should be
changed to at least -O2 (and perhaps -O4), but I would start with -O2. This
results in a 2x speedup for some code paths.

OpenSM supports a pipelining mode for SMPs. The default is 4 outstanding
SMPs. -maxsmps <#> sets the number of outstanding SMPs allowed; raising it
should speed up initialization. Useful values are 16 and 32. Beyond this,
there may be an issue with a link that is causing timeouts and retries to
kick in. The OpenSM log should contain messages indicating this.

Other utilities (infiniband diagnostics)
----------------------------------------
See man infiniband-diags for a list of utilities and a general overview of
the diagnostics.
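
Example: pre-flight check
-------------------------
The prerequisites named under "Running" (ib_umad loaded, /dev/infiniband
present, running as root) can be checked before starting opensm or any of
the diagnostic tools. The following is only a minimal sketch, not part of
OpenSM: the function name opensm_preflight and its three optional arguments
are hypothetical, and it assumes a loaded module appears as a directory
under /sys/module.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with OpenSM): checks the prerequisites
# listed in the "Running" section of this README.
# Arguments (all optional, illustrative only):
#   $1  directory listing loaded modules (default: /sys/module)
#   $2  umad device directory            (default: /dev/infiniband)
#   $3  numeric user id to test          (default: current user id)
opensm_preflight() {
    module_dir="${1:-/sys/module}"
    dev_dir="${2:-/dev/infiniband}"
    uid="${3:-$(id -u)}"
    rc=0

    # Assumption: a loaded kernel module has a directory under /sys/module.
    if [ ! -d "$module_dir/ib_umad" ]; then
        echo "ib_umad is not loaded; try: modprobe ib_umad"
        rc=1
    fi

    # udev (or a manual setup) must provide the device directory.
    if [ ! -d "$dev_dir" ]; then
        echo "$dev_dir is missing; check the udev configuration"
        rc=1
    fi

    # opensm must be run as root.
    if [ "$uid" -ne 0 ]; then
        echo "not running as root"
        rc=1
    fi

    return "$rc"
}
```

With the function defined, something like
"opensm_preflight && /usr/local/sbin/opensm" would start opensm only when
all three checks pass. The checks say nothing about whether the first port
is active; that still has to be verified separately.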