.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>

=====
DLMFS
=====

A minimal DLM userspace interface implemented via a virtual file
system.

dlmfs is built with OCFS2 as it requires most of its infrastructure.

:Project web page: http://ocfs2.wiki.kernel.org
:Tools web page: https://github.com/markfasheh/ocfs2-tools
:OCFS2 mailing lists: https://oss.oracle.com/projects/ocfs2/mailman/

All code copyright 2005 Oracle except when otherwise noted.

Credits
=======

Some code taken from ramfs which is Copyright |copy| 2000 Linus Torvalds
and Transmeta Corp.

Mark Fasheh <mark.fasheh@oracle.com>

Caveats
=======
- dlmfs currently works only with the OCFS2 DLM, though supporting other
  DLM implementations should not be a major issue.

Mount options
=============
None

Usage
=====

If you're just interested in OCFS2, then please see ocfs2.txt. The
rest of this document is geared towards those who want to use
dlmfs for easy-to-set-up, easy-to-use clustered locking in
userspace.

Setup
=====

dlmfs requires that the OCFS2 cluster infrastructure be in
place. Please download ocfs2-tools from the tools web page listed above
and configure a cluster.

You'll want to start heartbeating on a volume which all the nodes in
your lockspace can access. The easiest way to do this is via
ocfs2_hb_ctl (distributed with ocfs2-tools). Right now it requires
that an OCFS2 file system be in place so that it can automatically
find its heartbeat area, though it will eventually support heartbeat
against raw disks.

Please see the ocfs2_hb_ctl and mkfs.ocfs2 manual pages distributed
with ocfs2-tools.

Once you're heartbeating, DLM lock 'domains' can be easily created and
destroyed, and the locks within them accessed.

Locking
=======

Users may access dlmfs via standard file system calls, or they can use
'libo2dlm' (distributed with ocfs2-tools) which abstracts the file
system calls and presents a more traditional locking API.

dlmfs handles lock caching automatically for the user, so a lock
request for an already acquired lock will not generate another DLM
call. Userspace programs are assumed to handle their own local
locking.

Two levels of locks are supported: Shared Read and Exclusive.
A Trylock operation is also supported.

For information on the libo2dlm interface, please see o2dlm.h,
distributed with ocfs2-tools.

Lock value blocks (LVBs) can be read from and written to a resource via
read(2) and write(2) against the fd obtained via your open(2) call. The
maximum currently supported LVB length is 64 bytes (though that is an
OCFS2 DLM limitation). Through this mechanism, users of dlmfs can share
small amounts of data amongst their nodes.
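
As a rough illustration of the LVB access described above, the following
Python sketch wraps read(2) and write(2) through the os module. The names
read_lvb, write_lvb and LVB_MAX are hypothetical and are not part of dlmfs
or libo2dlm. The sketch assumes the fd comes from a fresh open(2) of a lock
file, so its file offset is still 0, and that it was opened with O_RDWR if
it is to be written.

```python
import os

LVB_MAX = 64  # maximum LVB length in bytes, as stated in this document


def write_lvb(fd, data):
    """Write `data` (bytes) as the lock value block of an open lock fd.

    Assumes the fd was just opened, so the write starts at offset 0.
    """
    if len(data) > LVB_MAX:
        raise ValueError("LVB is limited to %d bytes" % LVB_MAX)
    return os.write(fd, data)  # write(2) against the lock fd


def read_lvb(fd):
    """Read up to LVB_MAX bytes of the lock value block from a lock fd.

    Assumes the fd was just opened, so the read starts at offset 0.
    """
    return os.read(fd, LVB_MAX)  # read(2) against the lock fd
```

Rejecting oversized data before calling write(2) keeps the 64-byte limit
visible to the caller instead of relying on how the file system treats a
longer write.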

mkdir(2) signals dlmfs to join a domain (which will have the same name
as the resulting directory).

rmdir(2) signals dlmfs to leave the domain.
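
The two calls above can be sketched in Python as follows. This is only an
illustration: the mount point /dlm and the helper names join_domain and
leave_domain are assumptions, not something this document or ocfs2-tools
defines, so adjust the path to wherever dlmfs is mounted on your system.

```python
import os

DLMFS_ROOT = "/dlm"  # assumed dlmfs mount point (hypothetical)


def join_domain(name, root=DLMFS_ROOT, mode=0o755):
    """Join the lock domain `name` by creating its directory."""
    path = os.path.join(root, name)
    os.mkdir(path, mode)  # mkdir(2): asks dlmfs to join the domain
    return path


def leave_domain(name, root=DLMFS_ROOT):
    """Leave the lock domain `name` by removing its directory."""
    os.rmdir(os.path.join(root, name))  # rmdir(2): asks dlmfs to leave
```

Both helpers let OSError propagate, so the caller sees the errno reported
by the underlying system call.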

Locks for a given domain are represented by regular inodes inside the
domain directory. Locking against them is done via the open(2) system
call.

The open(2) call will not return until your lock has been granted or
an error has occurred, unless it has been instructed to do a trylock
operation. If the lock succeeds, you'll get an fd.

Use open(2) with O_CREAT to ensure the resource inode is created; dlmfs
does not automatically create inodes for existing lock resources.

============  ===========================
Open Flag     Lock Request Type
============  ===========================
O_RDONLY      Shared Read
O_RDWR        Exclusive
============  ===========================


============  ===========================
Open Flag     Resulting Locking Behavior
============  ===========================
O_NONBLOCK    Trylock operation
============  ===========================

You must provide exactly one of O_RDONLY or O_RDWR.

If O_NONBLOCK is also provided and the trylock operation was valid but
could not lock the resource, then open(2) will fail with ETXTBSY.

close(2) drops the lock associated with your fd.
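
The open flags and the trylock error handling above can be combined into a
small helper. The following Python sketch is one possible arrangement, not
the libo2dlm API: the names dlmfs_lock and dlmfs_unlock are hypothetical,
and the trylock failure is matched against errno.ETXTBSY, which is the
errno name exported by Python and <errno.h>.

```python
import errno
import os


def dlmfs_lock(path, exclusive=False, trylock=False):
    """Take a lock by opening the lock file at `path`.

    Returns the fd holding the lock, or None if a trylock could not be
    granted. Any other failure is raised as OSError.
    """
    # Exactly one of O_RDONLY (Shared Read) or O_RDWR (Exclusive).
    flags = os.O_RDWR if exclusive else os.O_RDONLY
    flags |= os.O_CREAT  # make sure the resource inode exists
    if trylock:
        flags |= os.O_NONBLOCK  # request a trylock operation
    try:
        return os.open(path, flags, 0o644)
    except OSError as e:
        if trylock and e.errno == errno.ETXTBSY:
            return None  # valid request, but the lock is held elsewhere
        raise


def dlmfs_unlock(fd):
    """Drop the lock associated with `fd`."""
    os.close(fd)
```

A caller might use dlmfs_lock("/dlm/mydomain/myresource", exclusive=True,
trylock=True), where the path is only an example, and treat a None result
as "busy, try again later". Without trylock the call blocks inside open(2)
until the lock is granted or an error occurs.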

Modes passed to mkdir(2) or open(2) are adhered to locally. chown(2) is
supported locally as well. This means you can use them to restrict
access to the resources via dlmfs on your local node only.
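
As a minimal sketch of such a local restriction, the Python helper below
creates a domain directory that only its owner can use and then hands it
to a given user and group. The name create_private_domain is hypothetical,
and the effect applies to the local node only, as noted above.

```python
import os


def create_private_domain(path, uid, gid):
    """Join a domain whose directory is accessible to its owner only."""
    os.mkdir(path, 0o700)     # mode is honored on the local node
    os.chown(path, uid, gid)  # ownership change is local as well
    return path
```

Changing ownership to another user normally requires privileges, so an
unprivileged caller would pass its own uid and gid.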

The resource LVB may be read from the fd in either Shared Read or
Exclusive modes via the read(2) system call. It can be written via
write(2) only when open in Exclusive mode.

Once written, an LVB will be visible to other nodes that obtain Shared
Read or higher-level locks on the resource.

See Also
========
For more information on the VMS distributed locking API, see
http://opendlm.sourceforge.net/cvsmirror/opendlm/docs/dlmbook_final.pdf