SYSTEM1:root# freedup -cf /home/freedup/holidays/2006/family /home/freedup/holidays/2006/friends
Run through both trees .../family and .../friends, compare the files (selections of my holiday snapshots) and link those files that have identical names and contents. The option -c counts how much space is saved by linking.
MP3 files usually contain a tag (in fact there are two different tags that may coexist, one at the beginning and one at the end of the file) that holds more information than the pathname. MP3 files are also copied frequently (for legal reasons), for example onto an MP3 CD for the car, onto the stick used for jogging, or simply when rearranging files from one computer to another. With each further copy it gets more complicated to know whether a file is an old version or a new one in higher quality (hard disks are getting cheaper). To make the situation worse, I tend to correct wrong titles or misspelled artist names. (Do you know how many 'r', 's', 't' and 'n's are in the name of the artist who sings "Ironic"?)
I think the motivation should be clear by now. What does the solution look like? My solution is to use FreeDup and its extra style enhancements. Here is the syntax to find out whether linking the files is a good idea; the result should be checked manually:
MP3SYSTEM:root# freedup -inq -x mp3 /music /burn
/music/Alanis_Morissette/Ironic.mp3
/music/HiFi/Alanis_Morissette/Ironic.mp3
/burn/CD/Females/Alanis_Morissette/Ironic.mp3
/burn/CD/CarMix1/Alanis_Morissette/Ironic.mp3
/burn/Stick/Blue/Alanis_Morissette/Ironic.mp3
[...]
Because of the number of files to link I tend to link them automatically, by omitting -in or replacing it with -c. If I want to decide the linking direction case by case, I only omit -n.
MP3SYSTEM:root# freedup -i -x mp3 /music /burn
Extra Style comparisons are not thouroughly tested yet. You may loose header information, if you press . Use CTRL-C to stop.
I used to press CTRL-D to bypass this message. But you should update to a new version.
4052 files to investigate
Del:Lnk [filesize:devc:i-node:perm:L:tag] <filename>
 0 : A [ 81920:0303:00a8c0:0644:1:  ] /music/Alanis_Morissette/Ironic.mp3
 1 : B [ 86016:0303:00a8e7:0644:1: 2] /music/HiFi/Alanis_Morissette/Ironic.mp3
 2 : C [ 86018:0303:00a8e0:0644:1: 2] /burn/CD/CarMix1/Alanis_Morissette/Ironic.mp3
 3 : D [ 86144:0303:00a8bb:0644:1:12] /burn/CD/Females/Alanis_Morissette/Ironic.mp3
 4 : E [ 86144:0303:00a8c1:0644:1:12] /burn/Stick/Blue/Alanis_Morissette/Ironic.mp3
Delete on number, Link on letter, Symlink on Capital (first is source)
<return> to confirm. <esc> to clear. All links will point to <a>.
$ ln <e> <b>; ln <e> <a>; rm <3>; rm <2>;
[...]
When I get the first selection I type 'e', 'b', 'a', '3', '2'. When I press return, the commands are executed. The command list is easily erased by pressing the escape key. If your decision rules are simple you may use '@', '<', '>' to link all other files to the first one (from command line or standard input), to the oldest, or to the newest file. But I use CTRL-C to try different options.
MP3SYSTEM:root# freedup -i /music /burn
If I now start FreeDup without the extra style option, not all identically encoded MP3s are found, since most files differ (compare the sizes). The resulting list is much shorter. Since the listed files are completely identical, the size and the tag info (FreeDup does not look for tags then) are not shown.
4052 files to investigate
Del:Lnk [dev:i-node:perm:L]
 0 : A [0303:00a8bb:0644:1] /burn/CD/Females/Alanis_Morissette/Ironic.mp3
 1 : B [0303:00a8c1:0644:1] /burn/Stick/Blue/Alanis_Morissette/Ironic.mp3
Delete on number, Link on letter, Symlink on Capital (first is source)
<return> to confirm. <esc> to clear. All links will point to <a>.
$ ln <b> <a>;
[...]
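The size column in the extra-style listing is consistent with two common tag layouts: an ID3v1 tag occupies the last 128 bytes of a file and starts with TAG, while an ID3v2 tag sits at the front and starts with ID3 (86144 - 86016 = 128 in the listing). The sketch below is not FreeDup's code. It only illustrates, on synthetic files with made-up names and a hypothetical helper payload, how ignoring an ID3v1 trailer makes two otherwise different files compare equal.

```shell
#!/bin/sh
# Synthetic demo files in a throwaway directory (all names are hypothetical).
tmp=$(mktemp -d)

# Fake "audio" payload; the tagged copy gets a 128-byte ID3v1-style trailer.
printf 'AUDIO-FRAMES-0123456789' > "$tmp/plain.mp3"
{ cat "$tmp/plain.mp3"; printf 'TAG%-125s' 'Ironic'; } > "$tmp/tagged.mp3"

# payload FILE: print the file without a trailing ID3v1 tag, if one is present.
payload() {
    size=$(( $(wc -c < "$1") ))
    if [ "$size" -ge 128 ] && [ "$(tail -c 128 "$1" | head -c 3)" = "TAG" ]; then
        head -c $((size - 128)) "$1"
    else
        cat "$1"
    fi
}

payload "$tmp/plain.mp3"  > "$tmp/plain.raw"
payload "$tmp/tagged.mp3" > "$tmp/tagged.raw"

# Compare once with tags and once without them.
if cmp -s "$tmp/plain.mp3" "$tmp/tagged.mp3"; then whole=same; else whole=differ; fi
if cmp -s "$tmp/plain.raw" "$tmp/tagged.raw"; then audio=same; else audio=differ; fi
echo "whole files: $whole, audio payload: $audio"
```

A plain byte-for-byte comparison sees two different files here, which matches the much shorter list obtained without the extra style option.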
The intention is to see what FreeDup would do with all registered JPEGs on SYSTEM1. We run the command as root in order to see all possible links.
SYSTEM1:root# locate '*.jpg' | freedup -nv
lstat() failed while reading file statistics: No such file or directory
lstat() failed while reading file statistics: No such file or directory
...
lstat() failed while reading file statistics: No such file or directory
lstat() failed while reading file statistics: No such file or directory
1085 files to investigate
ln "/opt/kde3/share/apps/pixie/doc/en/pixielogo.jpg" "/opt/kde3/share/apps/pixie/pixielogo.jpg"
ln "/opt/kde3/share/apps/quanta/templates/binaries/images/jpg/demo.jpg" "/opt/kde3/share/apps/quanta/templates/images/jpg/demo.jpg"
ln "/usr/lib/webmin/mscstyle3/images/cats/net.jpg" "/usr/lib/webmin/mscstyle3/images/cats_over/net.jpg"
ln "/usr/lib/webmin/mscstyle3/images/cats/webmin.jpg" "/usr/lib/webmin/mscstyle3/images/cats_over/webmin.jpg"
ln "/usr/share/games/freedroid/graphics/transfer.jpg" "/usr/src/packages/BUILD/freedroid-0.8.4/graphics/transfer.jpg"
ln "/usr/share/doc/packages/id3lib-devel/attilas_id3logo.jpg" "/usr/src/packages/BUILD/audacity-src-1.0.0/id3lib/doc/attilas_id3logo.jpg"
ln "/usr/share/doc/packages/mgp/sample/mgp3.jpg" "/usr/X11R6/lib/X11/mgp/mgp3.jpg"
ln "/usr/share/doc/packages/mgp/sample/mgp2.jpg" "/usr/X11R6/lib/X11/mgp/mgp2.jpg"
ln "/usr/share/doc/packages/mgp/sample/mgp1.jpg" "/usr/X11R6/lib/X11/mgp/mgp1.jpg"
ln "/opt/kde3/share/apps/kworldclock/maps/caida_bw/1280.jpg" "/usr/X11R6/lib/X11/xglobe/caida_bw_1280.jpg"
ln "/opt/kde3/share/apps/kworldclock/maps/caida/1280.jpg" "/usr/X11R6/lib/X11/xglobe/caida_1280.jpg"
ln "/opt/kde3/share/apps/kworldclock/maps/alt/1200.jpg" "/usr/X11R6/lib/X11/xglobe/alt_1200.jpg"
ln "/opt/kde3/share/apps/kworldclock/maps/bio/1600.jpg" "/usr/X11R6/lib/X11/xglobe/bio_1600.jpg"
ln "/opt/kde3/share/apps/kworldclock/maps/depths/1440.jpg" "/usr/X11R6/lib/X11/xglobe/depths_1440.jpg"
ln "/opt/kde3/share/apps/kworldclock/maps/mggd/1440.jpg" "/usr/X11R6/lib/X11/xglobe/mggd_1440.jpg"
15 files of 1085 will be replaced by links.
The total size of replacable files is 1506387 bytes.
md5 hash algorithm had to read 86 files to avoid 31 file comparisons.
The initially failed lstat() calls show that the locate database had not been updated since the last JPEG removals. After that, the stats of 1085 files were read and compared. 15 files turned out to match 15 others. Most of them seem to have been transferred by install commands. The amount of saved space is about 1.5MB. Using md5 hashing was not a good idea in this case: instead of reading 86 files to compute hash sums, it would have been cheaper to read 62 files (two per avoided comparison) for direct comparison.
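To illustrate what direct comparison means for a handful of candidates, here is a minimal sketch using only standard tools (wc, cmp). It is not FreeDup's algorithm; the temporary files and the helper same_content are made up for this example.

```shell
#!/bin/sh
# Hypothetical demo files; nothing here touches real data.
tmp=$(mktemp -d)
printf 'logo' > "$tmp/a.jpg"
printf 'logo' > "$tmp/b.jpg"   # identical to a.jpg
printf 'map!' > "$tmp/c.jpg"   # same size, different content

# same_content FILE1 FILE2: succeed only if both files hold identical bytes.
# The cheap size check comes first, cmp reads the files only when sizes match.
same_content() {
    [ "$(wc -c < "$1")" -eq "$(wc -c < "$2")" ] && cmp -s "$1" "$2"
}

matches=0
set -- "$tmp"/*.jpg
while [ $# -gt 1 ]; do
    first=$1
    shift
    # Compare the current file with every file that follows it.
    for other in "$@"; do
        if same_content "$first" "$other"; then
            echo "identical: $first $other"
            matches=$((matches + 1))
        fi
    done
done
echo "$matches matching pair(s)"
```

Each comparison reads at most two files, which is where the figure of 62 files for 31 comparisons comes from.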
Please be aware that the ln commands displayed above cannot simply be piped into a file and executed later. You need to remove the target first, or use ln -f, before you link it. Otherwise you will receive "file exists". Later versions of FreeDup show the option -f by default.
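The difference between ln and ln -f can be tried safely on throwaway files. The following is a small sketch with made-up file names in a temporary directory:

```shell
#!/bin/sh
# Hypothetical demo files; nothing here touches real data.
tmp=$(mktemp -d)
printf 'same content\n' > "$tmp/source.jpg"
printf 'same content\n' > "$tmp/target.jpg"

# A plain ln refuses to overwrite an existing target ("File exists").
if ln "$tmp/source.jpg" "$tmp/target.jpg" 2>/dev/null; then
    plain_ln=ok
else
    plain_ln=failed
fi

# ln -f removes the existing target and then creates the hard link.
ln -f "$tmp/source.jpg" "$tmp/target.jpg"

# Both names now refer to one inode, so the link count (2nd field of ls -l) is 2.
links=$(ls -l "$tmp/source.jpg" | awk '{print $2}')
echo "plain ln: $plain_ln, link count after ln -f: $links"
```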
SYSTEM1:root# find /usr/src/linux -type f -xdev -atime +12 | freedup -nv
SYSTEM1:/home/freedup # find /usr/src/linux -type f -xdev -atime +12 | freedup -c
Taking file names from stdin
0 files to investigate
0 files of 0 replaced by links.
The total size of replaced files was 0 bytes.
md5 hash algorithm had to read 0 files to avoid 0 file comparisons.
The starting tree is not a tree but a symbolic link. You need to append a slash to descend into the referenced directory. This trick only works for the starting tree.
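The effect of the trailing slash can be reproduced with find alone. This is a minimal sketch in a temporary directory; the directory and link names are invented for the example:

```shell
#!/bin/sh
# Hypothetical layout: "linux" is a symlink to the real source directory.
tmp=$(mktemp -d)
mkdir "$tmp/linux-src"
touch "$tmp/linux-src/Makefile" "$tmp/linux-src/README"
ln -s linux-src "$tmp/linux"

# Without the slash find looks at the symlink itself, which is not a regular file.
without_slash=$(( $(find "$tmp/linux" -type f | wc -l) ))

# With the trailing slash the link is resolved and find descends into the directory.
with_slash=$(( $(find "$tmp/linux/" -type f | wc -l) ))

echo "without slash: $without_slash file(s), with slash: $with_slash file(s)"
```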
SYSTEM1:/home/freedup # find /usr/src/linux/ -type f -xdev -atime +12 | time freedup -c
Taking file names from stdin
1045 files to investigate
0 files of 1045 replaced by links.
The total size of replaced files was 0 bytes.
md5 hash algorithm had to read 0 files to avoid 0 file comparisons.
0.00user 0.01system 0:00.29elapsed 6%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+1310minor)pagefaults 0swaps
You see that I had already tried FreeDup before. No file that had not been accessed for 12 days matched another. -xdev was used to confine the find command to the local file system. The prefix command time was used to show some not very interesting performance statistics. Another way to write it is:
SYSTEM1:/home/freedup # freedup -c /usr/src/linux/ -o "-xdev -atime +12"
Option passing really pays off if you need to scan a number of trees. Just compare for yourself:
SYSTEM1:/home/freedup # freedup -c /usr/src/linux-22.214.171.124 /usr/src/linux-126.96.36.199 -o "-xdev -atime +12"
versus
SYSTEM1:/home/freedup # ( find /usr/src/linux-188.8.131.52 -type f -xdev -atime +12 ; find /usr/src/linux-184.108.40.206 -type f -xdev -atime +12 ) | time freedup -c
Please note that I (incorrectly) omitted the find default action -print.
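For the find side of this comparison, a small sketch may help: find also accepts several starting points in one call, and -print can be written out explicitly. The two toy trees below are invented for illustration and stand in for the kernel trees:

```shell
#!/bin/sh
# Two hypothetical trees in a throwaway directory.
tmp=$(mktemp -d)
mkdir "$tmp/tree-a" "$tmp/tree-b"
touch "$tmp/tree-a/x.c" "$tmp/tree-b/x.c" "$tmp/tree-b/y.c"

# Two separate find runs grouped in a subshell, with -print written out.
two_runs=$( ( find "$tmp/tree-a" -type f -print ; find "$tmp/tree-b" -type f -print ) | sort )

# One find run with both starting points produces the same file list.
one_run=$(find "$tmp/tree-a" "$tmp/tree-b" -type f -print | sort)

count=$(( $(printf '%s\n' "$one_run" | wc -l) ))
echo "$count file names would be passed on stdin"
```

Either list can then be piped into a tool that takes file names from standard input.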
SYSTEM1:/home/freedup/fdupes-1.40 # freedup -in testdir/
14 files to investigate
testdir/two
testdir/twice_one
testdir/recursed_a/two
testdir/recursed_a/one
testdir/recursed_b/one
testdir/recursed_b/three
testdir/recursed_b/two_plus_one
testdir/zero_a
testdir/zero_b
testdir/with spaces a
testdir/with spaces b
|freedup /usr/src/linux-2.6.10 /usr/src/linux-2.6.11|20/68329 files|(9kB)|
|freedup /usr/man /usr/share/man|14/10772 files|(19kB)|
|freedup /usr/share/locale /etc/locale|36/1436 files|(29kB)|
|Warwick Poole's Personal Files|26% space reduction|