[![Build Status](https://travis-ci.org/xaionaro/clsync.png?branch=master)](https://travis-ci.org/xaionaro/clsync)
[![Gitter](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/xaionaro/clsync?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)

clsync
======
Contents
--------

1.  Name
2.  Motivation
3.  inotify vs fanotify
4.  Installing
5.  How to use
6.  Example of usage
7.  Other uses
8.  Clustering
9.  Known building issues
10. FreeBSD support
11. Support
12. Developing
13. Articles
14. See also


1. Name
-------

Why "clsync"? The utility's first name was "insync" (after inotify), but
then I proposed using "fanotify" instead of "inotify", and the utility was
renamed to "fasync". Once I started writing the program in earnest, I ran
into some problems with "fanotify", so I had to temporarily fall back
to "inotify". I then decided that the best name would be "Runtime Sync" or
"Live Sync", but "rtsync" is the name of a corporation and "lsync" is already
taken by "[lsyncd](https://github.com/axkibe/lsyncd)". So I called it
"clsync", which should be read as "lsync, but in C", in contrast to "lsyncd",
which is written in Lua and can be used for the same purposes.

UPD: I also had to add some kind of clustering support. It's a multicast
notification subsystem that prevents loops during bidirectional syncing. So
"clsync" can also be interpreted as "cluster live sync". ;)

2. Motivation
-------------

This utility was written for two purposes:
- for building high-availability clusters
- for making backups of them

To build an HA cluster I tried a lot of different solutions, such as "simple
rsync by cron", "glusterfs", "ocfs2 over drbd", "common mirrorable external
storage", "incron + perl + rsync", "inosync", "lsyncd" and so on. When I
started to write the utility we were using "lsyncd", "ceph" and
"ocfs2 over drbd". However, none of these solutions suited me, so I
had to write my own utility for this purpose.

To make backups we also tried a lot of different solutions, and again I had
to write my own utility for this purpose.

The best-known (to me) alternative to this utility is "lsyncd", however:
- More than half (`>½`) of its code is in Lua. This causes a number of problems,
for example:
    - It's harder for an ordinary sysadmin to maintain the code.
    - It really does eat 100% CPU sometimes.
    - It requires Lua libraries, which cannot be easily installed on a few
of our systems.
- It's a little buggy. That could be easily fixed for our cases,
but Lua. :(
- It doesn't support pthreads or anything similar. That's necessary
to properly serve huge directories with a lot of containers.
- It cannot run rsync for a batch of files. It runs rsync for every
event. :(
- Sometimes its configuration is too complex for our situation.
- It can't set a different event-collecting delay for big files. We don't
want to sync big files (`>1GiB`) as often as ordinary files.
- A shared object (.so file) cannot be used as an rsync wrapper.
- It doesn't support kqueue/bsm.

Sorry if I'm wrong. Please let me know if I am :). "lsyncd" is a really
interesting and useful utility; it's just not suitable for us.

UPD: clsync has also been used to replace incron/csync2/etc. in HPC clusters for
syncing the /etc/{passwd,shadow,group,shells} files.

3. inotify vs fanotify:
-----------------------

It's said that fanotify is much better than inotify, so I started writing
this program using fanotify. However, I ran into a problem: at the time of
writing, fanotify was unable to catch some important events, such as
"directory creation" or "file deletion". So I switched to
"inotify", keeping the "fanotify" code around just in case... So, don't use
"fanotify" in this utility ;).


4. Installing
-------------

Debian/Ubuntu users can try installing it directly with apt-get:

    apt-get install clsync

If you need to install clsync from source, first install the
dependencies required to compile it. On Debian-like systems, run
something like:

    apt-get install libglib2.0-dev autoconf gcc

The next step is generating the Makefile. Usually it's enough to run:

    autoreconf -i && ./configure

The next step is compiling. Usually it's enough to run:

    make

The final step is installing. Usually it's enough to run:

    su -c 'make install'


5. How to use
-------------

Usage is described in the man page ;). If something isn't covered there, you
can ask me personally (see "Support").


6. Example of usage
-------------------

A usage example that works on my PC is in the "examples" directory. Just run
"clsync-start-rsyncdirect.sh" and try to create/modify/delete files/dirs in
"examples/testdir/from". All modifications should appear (with some delay) in
the directory "examples/testdir/to" ;)

For dummies:

    pushd /tmp
    git clone https://github.com/xaionaro/clsync
    cd clsync
    autoreconf -fi
    ./configure
    make
    export PATH_OLD="$PATH"
    export PATH="$(pwd):$PATH"
    cd examples
    ./clsync-start-rsyncdirect.sh
    export PATH="$PATH_OLD"

Now you can try making changes in the directory
"/tmp/clsync/examples/testdir/from" (in another terminal).
Wait about 7 seconds after making the changes and check the directory
"/tmp/clsync/examples/testdir/to". To finish the experiment, press ^C
(control+c) in clsync's terminal.
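
Instead of waiting and checking by hand, the check in the second terminal can be scripted. The following is only a minimal sketch: the helper name `wait_for_file` and its 15-second default timeout are assumptions made for this example and are not part of clsync.

```shell
#!/bin/sh
# wait_for_file PATH [TIMEOUT]: poll once per second until PATH exists.
# Returns 0 as soon as the file shows up, 1 if TIMEOUT seconds pass first.
# (Hypothetical helper for this README, not shipped with clsync.)
wait_for_file() {
    path=$1
    timeout=${2:-15}   # assumed default, comfortably above the ~7 second delay
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        [ -e "$path" ] && return 0
        sleep 1
        waited=$((waited + 1))
    done
    [ -e "$path" ]
}

# Usage (with clsync running in the other terminal):
#   echo hello > /tmp/clsync/examples/testdir/from/hello.txt
#   wait_for_file /tmp/clsync/examples/testdir/to/hello.txt && echo synced
```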

    cd ../..
    rm -rf clsync
    popd

Note: There's no need to change PATH's value if clsync is installed
system-wide, e.g. with

    make install

For dummies, again (with "make install"):

    pushd /tmp
    git clone https://github.com/xaionaro/clsync
    cd clsync
    autoreconf -fi
    ./configure
    make
    sudo make install
    cd examples
    ./clsync-start-rsyncdirect.sh

The directory "/tmp/clsync/examples/testdir/from" is now synced to
"/tmp/clsync/examples/testdir/to" with a 7-second delay. To terminate
clsync, press ^C (control+c) in clsync's terminal.

    cd ..
    sudo make uninstall
    cd ..
    rm -rf clsync
    popd

For real dummies and/or lazy users, there's a video demonstration:
[http://ut.mephi.ru/oss/clsync](http://ut.mephi.ru/oss/clsync)


7. Other uses
-------------

For example, the command

    ionice -c 3 clsync -L /dev/shm/clsync --exit-on-no-events -x 23 -x 24 -M rsyncdirect -S $(which rsync) -W /path/from -D /path/to -d1

can be used to copy "/path/from" into "/path/to" while also syncing any changes made in "/path/from" during the copy. It keeps copying new changes until there are none left, and then clsync exits. This can serve as an atomic-like recursive copy.
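
The "copy over and over until nothing changes" idea can be approximated without clsync by a plain polling loop around rsync. The sketch below only illustrates that loop and is not how clsync works internally (clsync reacts to FS events instead of rescanning the whole tree). The function name `copy_until_quiet` and the `max_passes` limit are made up for this example.

```shell
#!/bin/sh
# copy_until_quiet SRC DST [MAX_PASSES]: rerun rsync until a pass transfers
# nothing. Returns 0 once a pass is quiet, 1 if rsync fails, 2 if the source
# is still changing after MAX_PASSES passes.
# (Hypothetical helper; a polling approximation, not clsync's mechanism.)
copy_until_quiet() {
    src=$1
    dst=$2
    max_passes=${3:-10}   # assumed safety limit
    pass=0
    while [ "$pass" -lt "$max_passes" ]; do
        pass=$((pass + 1))
        # --itemize-changes prints one line per item that rsync had to update,
        # so empty output means this pass found nothing to do.
        changes=$(rsync -a --delete --itemize-changes "$src/" "$dst/") || return 1
        if [ -z "$changes" ]; then
            return 0
        fi
    done
    return 2
}

# Usage:
#   copy_until_quiet /path/from /path/to
```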



Or the command

    clsync -w5 -t5 -T5 -x1 -W /var/www/site.example.org/root -Mdirect -Schown --uid 0 --gid 0 -Ysyslog -b1 -- --from=root www-data:www-data %INCLUDE-LIST%

can be used to fix file ownership at runtime. This can serve as a temporary workaround for the file permissions of misconfigured web servers (a well-known problem for Apache users).

8. Clustering
-------------

I started to implement support for bidirectional syncing using
multicast notification of other nodes. However, it turned out to be a long
task, so it was postponed to later releases.

Nevertheless, let's solve the following hypothetical problem. Say you're using
LXC and trying to replicate containers between two servers (for failover
and load balancing).

In this case you have to sync containers in both directions. However, if you
simply run clsync on both nodes to sync containers to the neighboring node,
you'll get a sync loop [file update on A causes file update on B causes
file update on A causes ...].

In this case my colleagues and I used a separate directory for
every node of the cluster (e.g. "`/srv/nodes/<NODE NAME>/containers/<CONTAINERS>`")
and synced every directory in one direction only. That gave us failover with
load balancing, but it was very inconvenient. So I started to write code for
bidirectional syncing; however, there has been no time to complete it :(. Then
Andrew Savchenko proposed running one clsync instance per container, and this
is a really good solution. You just need to start a clsync process when a
container starts and stop the process when the container stops. The only
problem is split-brain, which can be solved in two ways:
- by a human every time;
- by a script that chooses which variant of the container to keep.

An example of such a script is simply one that calls "find" on both sides to
determine which side has the latest changes :)
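
A minimal sketch of such a "find"-based check is shown below. It makes several assumptions: GNU find is available (`-printf` is not POSIX), both variants are reachable as local paths (in a real setup "find" would be run on each node, e.g. over ssh), and only file modification times matter, so deletions are not noticed. The function names `newest_mtime` and `pick_newest_side` are invented for this example.

```shell
#!/bin/sh
# newest_mtime DIR: print the newest modification time (seconds since the
# epoch) among regular files under DIR, or 0 if there are none.
# Assumes GNU find, since -printf is not part of POSIX find.
newest_mtime() {
    t=$(find "$1" -type f -printf '%T@\n' | LC_ALL=C sort -n | tail -n 1)
    echo "${t:-0}"
}

# pick_newest_side A B: print whichever directory holds the most recently
# modified file; A wins a tie.
pick_newest_side() {
    a=$(newest_mtime "$1")
    b=$(newest_mtime "$2")
    # sort -n copes with the fractional timestamps printed by %T@
    newest=$(printf '%s\n%s\n' "$a" "$b" | LC_ALL=C sort -n | tail -n 1)
    if [ "$newest" = "$a" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}

# Usage (placeholder paths):
#   keep=$(pick_newest_side /srv/containers/foo.nodeA /srv/containers/foo.nodeB)
```

Picking a side purely by the newest timestamp is a crude policy. It is meant as a starting point for the script-based approach, not as a complete split-brain resolver.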

UPD: I've added the option "--modification-signature", which helps to avoid syncing files that have not changed. You can easily use it to prevent sync loops in bidirectional syncing.

9. Known building issues
------------------------

There may be problems with "configuring" or compilation. In that case, just try
the following command:

    echo '#define REVISION "-custom"' > revision.h; gcc -std=gnu99 -D_FORTIFY_SOURCE=2 -DPARANOID -pipe -Wall -ggdb3 --param ssp-buffer-size=4 -fstack-check -fstack-protector-all -Xlinker -zrelro -pthread $(pkg-config --cflags glib-2.0) $(pkg-config --libs glib-2.0) -ldl *.c -o /tmp/clsync


10. FreeBSD support
-------------------

clsync has been ported to FreeBSD.

FreeBSD doesn't support inotify, so there are 3.5 ways to use clsync on it:
* using [libinotify](https://github.com/dmatveev/libinotify-kqueue);
* using BSM API (with or without a prefetcher thread);
* using kqueue/kevent directly.

Here's an excerpt from the manpage:

     Possible values:
            inotify
                   inotify(7) [Linux, (FreeBSD via libinotify)]
    
                   Native, fast, reliable and well tested Linux FS monitor subsystem.
    
                   There's no essential performance profit to use "inotify"  instead  of
                   "kevent"  on FreeBSD using "libinotify". It backends to "kevent" any‐
                   way.
    
                   FreeBSD users: The libinotify on FreeBSD is still not ready and unus‐
                   able for clsync to sync a lot of files and directories.
    
            kqueue
                   kqueue(2) [FreeBSD, (Linux via libkqueue)]
    
                   A  *BSD  kernel  event  notification  mechanism (inc. timer, sockets,
                   files etc).
    
                   This monitor subsystem cannot determine file creation event,  but  it
                   can determine a directory where something happened. So clsync is have
                   to rescan whole dir every  time  on  any  content  change.  Moreover,
                   kqueue  requires  an  open()  on  every watched file/dir. But FreeBSD
                   doesn't allow to open() symlink itself (without following)  and  it's
                   highly  invasively  to open() pipes and devices. So clsync just won't
                   call open() on everything except regular files and directories.  Con‐
                   sequently,  clsync  cannot  determine  if  something  changed in sym‐
                   link/pipe/socket and so on.  However it still  can  determine  if  it
                   will  be created or deleted by watching the parent directory and res‐
                   caning it on every appropriate event.
    
                   Also this API requires to open every monitored file and directory. So
                   it  may  produce  a  huge  amount  of  file descriptors. Be sure that
                   kern.maxfiles is big enough (in FreeBSD).
    
                   CPU/HDD expensive way.
    
                   Not well tested. Use with caution!
    
                   Linux users: The libkqueue on Linux is not working. He-he :)
    
            bsm
                   bsm(3) [FreeBSD]
    
                   Basic Security Module (BSM) Audit API.
    
                   This is not a FS monitor subsystem, actually. It's  just  an  API  to
                   access  to  audit information (inc. logs).  clsync can setup audit to
                   watch FS events and report it into log. After that clsync  will  just
                   parse the log via auditpipe(4) [FreeBSD].
    
                   Reliable,  but  hacky  way.  It requires global audit reconfiguration
                   that may hopple audit analysis.
    
                   Warning!  FreeBSD has a limit for queued events. In  default  FreeBSD
                   kernel it's only 1024 events. So choose one of:
                          - To patch the kernel to increase the limit.
                          - Don't use clsync on systems with too many file events.
                          - Use bsm_prefetch mode (but there's no guarantee in this case
                          anyway).
                   See also option --exit-on-sync-skip.
    
                   Not  well  tested.  Use   with   caution!    Also   file   /etc/secu‐
                   rity/audit_control will be overwritten with:
                          #clsync
    
                          dir:/var/audit
                          flags:fc,fd,fw,fm,cl
                          minfree:0
                          naflags:fc,fd,fw,fm,cl
                          policy:cnt
                          filesz:1M
                   unless it's already starts with "#clsync\n" ("\n" is a new line char‐
                   acter).
    
            bsm_prefetch
                   The same as bsm but all BSM events will be  prefetched  by  an  addi‐
                   tional  thread  to prevent BSM queue overflow. This may utilize a lot
                   of memory on systems with a high FS events frequency.
    
                   However the thread may be not fast enough to unload  the  kernel  BSM
                   queue. So it may overflow anyway.
    
     The default value on Linux is "inotify". The default value on FreeBSD is "kqueue".
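
The "kern.maxfiles" advice from the kqueue part of the excerpt can be checked roughly before starting clsync. The sketch below counts regular files and directories (the objects that, according to the excerpt, get open()ed) and compares the count with a limit. The helper names `count_watch_targets` and `fits_fd_limit` are hypothetical, and the count is only an estimate: other processes consume descriptors too, and file names containing newlines are counted more than once.

```shell
#!/bin/sh
# count_watch_targets DIR: count regular files and directories under DIR
# (DIR itself included).
# (Hypothetical helper for this README, not shipped with clsync.)
count_watch_targets() {
    find "$1" \( -type f -o -type d \) | wc -l | tr -d ' '
}

# fits_fd_limit DIR LIMIT: succeed if the count is strictly below LIMIT.
fits_fd_limit() {
    [ "$(count_watch_targets "$1")" -lt "$2" ]
}

# Usage on FreeBSD (the path is a placeholder):
#   fits_fd_limit /path/from "$(sysctl -n kern.maxfiles)" \
#       || echo "kern.maxfiles looks too small for this tree"
```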

I hope you will send me bug reports so that I can improve the FreeBSD support :)


11. Support
-----------

To get support, you can contact me in the following ways:
- Official IRC channel of "clsync": irc.freenode.net#clsync
- Where else can you find me: IRC:SSL+UTF-8 irc.campus.mephi.ru:6695#mephi,xaionaro,xai
- And e-mail: <dyokunev@ut.mephi.ru>, <xaionaro@gmail.com>; PGP pubkey: 0x8E30679C

12. Developing
--------------

I've started writing the "DEVELOPING" and "PROTOCOL" files.
You can take a look at them if you wish. ;)

I'll be glad to receive code contributions :)

13. Articles
------------

Russian:
- [HA clustering](https://gitlab.ut.mephi.ru/ut/articles/blob/master/clsync/ha)
- [syncing to many nodes](https://gitlab.ut.mephi.ru/ut/articles/blob/master/clsync/inotify-to-many-nodes)
- [atomic sync](https://gitlab.ut.mephi.ru/ut/articles/blob/master/clsync/atomicsync)

LVEE (Russian):
- [clsync - live sync utility (abstract)](http://lvee.org/en/abstracts/118) [presentation](http://lvee.org/uploads/image_upload/file/337/winter_2014_15_clsync.pdf)
- [clsync progress: security and porting to freebsd](http://lvee.org/en/abstracts/138)

14. See also
------------

- [lrsync](https://github.com/xaionaro/lrsync)


                                               -- Dmitry Yu Okunev <dyokunev@ut.mephi.ru> 0x8E30679C