====================
Userspace MAD access
====================

Device files
============

Each port of each InfiniBand device has a "umad" device and an
"issm" device attached.  For example, a two-port HCA will have two
umad devices and two issm devices, while a switch will have one
device of each type (for switch port 0).

Creating MAD agents
===================

A MAD agent can be created by filling in a struct ib_user_mad_reg_req
and then calling the IB_USER_MAD_REGISTER_AGENT ioctl on a file
descriptor for the appropriate device file.  If the registration
request succeeds, a 32-bit id will be returned in the structure.
For example::

        struct ib_user_mad_reg_req req = { /* ... */ };
        ret = ioctl(fd, IB_USER_MAD_REGISTER_AGENT, (char *) &req);
        if (!ret)
                my_agent = req.id;
        else
                perror("agent register");

Agents can be unregistered with the IB_USER_MAD_UNREGISTER_AGENT
ioctl.  Also, all agents registered through a file descriptor will
be unregistered when the descriptor is closed.

2014
       A new registration ioctl (IB_USER_MAD_REGISTER_AGENT2) is
       provided which allows additional fields to be set during
       registration.  Users of this registration call implicitly
       enable the use of pkey_index (see below).

Receiving MADs
==============

MADs are received using read().  The receive side supports RMPP.
The buffer passed to read() must be at least one
struct ib_user_mad + 256 bytes.

If the buffer passed is not large enough to hold the received
MAD (RMPP), read() fails with errno set to ENOSPC and the length
of the buffer needed is returned in mad.length.

Example for normal MAD (non RMPP) reads::

        struct ib_user_mad *mad;
        mad = malloc(sizeof *mad + 256);
        ret = read(fd, mad, sizeof *mad + 256);
        if (ret != sizeof *mad + 256) {
                perror("read");
                free(mad);
        }

Example for RMPP reads::

        struct ib_user_mad *mad;
        mad = malloc(sizeof *mad + 256);
        ret = read(fd, mad, sizeof *mad + 256);
        if (ret < 0 && errno == ENOSPC) {
                /* the header was filled in with the required length */
                length = mad->hdr.length;
                free(mad);
                mad = malloc(sizeof *mad + length);
                ret = read(fd, mad, sizeof *mad + length);
        }
        if (ret < 0) {
                perror("read");
                free(mad);
        }

In addition to the actual MAD contents, the other struct ib_user_mad
fields will be filled in with information on the received MAD.  For
example, the remote LID will be in mad.lid.

If a send times out, a receive will be generated with mad.status set
to ETIMEDOUT.  Otherwise when a MAD has been successfully received,
mad.status will be 0.

poll()/select() may be used to wait until a MAD can be read.
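
The waiting step can be sketched as follows. This is a minimal sketch, not part of the umad interface: the helper name ``read_with_timeout`` and the choice of EAGAIN to signal a timeout are assumptions made for illustration, and the function works on any pollable file descriptor.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms for fd to become readable,
 * then call read() once.  Returns the read() result, or -1 with errno
 * set; errno is EAGAIN when nothing arrived before the timeout.
 */
static ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ret;

    do {
        ret = poll(&pfd, 1, timeout_ms);
    } while (ret < 0 && errno == EINTR);    /* interrupted by a signal: retry */

    if (ret < 0)
        return -1;                          /* errno was set by poll() */
    if (ret == 0) {
        errno = EAGAIN;                     /* timed out, nothing to read */
        return -1;
    }
    return read(fd, buf, len);
}
```

With a umad descriptor, ``buf`` would be the struct ib_user_mad buffer from the read examples, so an ENOSPC failure still has to be handled by the caller as described for RMPP reads.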

Sending MADs
============

MADs are sent using write().  The agent ID for sending should be
filled into the id field of the MAD, the destination LID should be
filled into the lid field, and so on.  The send side supports
RMPP, so MADs of arbitrary length can be sent.  For example::

        struct ib_user_mad *mad;

        mad = malloc(sizeof *mad + mad_length);

        /* fill in mad->data */

        mad->hdr.id  = my_agent;        /* req.id from agent registration */
        mad->hdr.lid = my_dest;         /* in network byte order... */
        /* etc. */

        /* pass the buffer itself, not the address of the pointer */
        ret = write(fd, mad, sizeof *mad + mad_length);
        if (ret != sizeof *mad + mad_length)
                perror("write");

Transaction IDs
===============

Users of the umad devices can use the lower 32 bits of the
transaction ID field (that is, the least significant half of the
field in network byte order) in MADs being sent to match
request/response pairs.  The upper 32 bits are reserved for use by
the kernel and will be overwritten before a MAD is sent.

P_Key Index Handling
====================

The old ib_umad interface did not allow setting the P_Key index for
MADs that are sent and did not provide a way for obtaining the P_Key
index of received MADs.  A new layout for struct ib_user_mad_hdr
with a pkey_index member has been defined; however, to preserve binary
compatibility with older applications, this new layout will not be used
unless one of the IB_USER_MAD_ENABLE_PKEY or IB_USER_MAD_REGISTER_AGENT2
ioctls is called before a file descriptor is used for anything else.

In September 2008, the IB_USER_MAD_ABI_VERSION will be incremented
to 6, the new layout of struct ib_user_mad_hdr will be used by
default, and the IB_USER_MAD_ENABLE_PKEY ioctl will be removed.

Setting IsSM Capability Bit
===========================

To set the IsSM capability bit for a port, simply open the
corresponding issm device file.  If the IsSM bit is already set,
then the open call will block until the bit is cleared (or return
immediately with errno set to EAGAIN if the O_NONBLOCK flag is
passed to open()).  The IsSM bit will be cleared when the issm file
is closed.  No read, write or other operations can be performed on
the issm file.
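
A non-blocking attempt to set the bit might look like the sketch below. The helper name ``issm_try_claim`` is hypothetical, and opening the file read-only is an assumption; the only behaviour taken from the text is that open() with O_NONBLOCK fails with EAGAIN when the bit is already set.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>

/*
 * Hypothetical helper: try to set IsSM without blocking.
 * Returns a file descriptor that must stay open for as long as the
 * IsSM bit should remain set, or -1 with errno set by open().
 */
static int issm_try_claim(const char *path)
{
    int fd = open(path, O_RDONLY | O_NONBLOCK);

    if (fd < 0 && errno == EAGAIN) {
        int saved = errno;  /* fprintf() may change errno */

        fprintf(stderr, "%s: IsSM bit is already set\n", path);
        errno = saved;
    }
    return fd;
}
```

Closing the returned descriptor clears the bit again, so the caller would normally hold it for the lifetime of the subnet manager process.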

/dev files
==========

To create the appropriate character device files automatically with
udev, a rule like::

        KERNEL=="umad*", NAME="infiniband/%k"
        KERNEL=="issm*", NAME="infiniband/%k"

can be used.  This will create device nodes named::

        /dev/infiniband/umad0
        /dev/infiniband/issm0

for the first port, and so on.  The InfiniBand device and port
associated with these devices can be determined from the files::

        /sys/class/infiniband_mad/umad0/ibdev
        /sys/class/infiniband_mad/umad0/port

and::

        /sys/class/infiniband_mad/issm0/ibdev
        /sys/class/infiniband_mad/issm0/port
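
Reading one of these sysfs attributes can be sketched as follows. The helper name ``read_sysfs_line`` is hypothetical, and the sketch assumes each attribute file holds a single short line of text.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: read the first line of a sysfs attribute file
 * into buf (at most len - 1 characters) and strip the trailing newline.
 * Returns 0 on success, -1 if the file cannot be opened or is empty.
 */
static int read_sysfs_line(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';     /* drop the newline, if any */
    return 0;
}
```

For example, calling it on ``/sys/class/infiniband_mad/umad0/ibdev`` and ``/sys/class/infiniband_mad/umad0/port`` would yield the device name and port number that belong to ``/dev/infiniband/umad0``.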