Finding a file by filename, the fast way.
April 9th, 2007
Searching the whole disk for a file by name takes far too long with the GNU find(1) command. Here's one solution.
I've added this to my crontab to update a list of local files every morning.
# Get list of files for grepping at my leisure
33 03 * * * root find / -mount > /root/files
Now look at the time I'm saving. Here I count the C files on the local disk twice: once with a live find, and once against the cached list of files.
$ time sudo find / -mount -name '*.c' | wc
265 265 15190
real 1m17.288s
user 0m1.168s
sys 0m2.772s
$ time sudo grep "\.c$" /root/files | wc
265 265 15190
real 0m0.309s
user 0m0.280s
sys 0m0.020s
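For anyone who wants to try the approach without touching / or /root/files, here is a minimal sketch on a throwaway directory. The demo tree and the cache path are made up for illustration; find and grep do the work, same as above.

```shell
#!/bin/sh
# Sketch only: the demo directory and cache file are hypothetical
# stand-ins for / and /root/files.
demo=$(mktemp -d)
mkdir -p "$demo/src/lib"
touch "$demo/src/main.c" "$demo/src/lib/util.c" "$demo/src/notes.txt"

# Build the cached list once (this is the slow step on a real disk).
# The cache lives next to the demo tree, not inside it.
cache="$demo.list"
find "$demo" -mount > "$cache"

# Query the cache as often as you like; grep -c counts matching lines.
c_count=$(grep -c '\.c$' "$cache")
echo "C files: $c_count"
```

Single quotes around the pattern keep the shell from touching the backslash or the dollar sign.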
Of course you need to set sensible permissions on the /root/files file, since it lists every path on the disk. And note that on my system this is a 35MB file with almost 500,000 lines, so be careful not to open it in MS Word :)
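One way to handle the permissions, sketched on a temporary path rather than the real /root/files: set a restrictive umask before the list is written, so the file is never readable by other users, even briefly. The paths are stand-ins, and owner-only (600) is an assumption about what suits a root-only list.

```shell
#!/bin/sh
# Sketch only: demo tree and cache path are hypothetical.
demo=$(mktemp -d)
touch "$demo/a.c" "$demo/b.txt"
cache="$demo.list"

# Run in a subshell so the tighter umask does not leak into the rest
# of the script. With umask 077 the redirect creates the file as 600.
( umask 077; find "$demo" -mount > "$cache" )

# Belt and braces: also fix the mode in case the file already existed.
chmod 600 "$cache"

# First ten characters of ls -l are the file type and mode bits.
perms=$(ls -l "$cache" | cut -c1-10)
echo "$perms"
```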
*** Update 05-08-2007
slocate is better!
on April 12th, 2007 at 12:50 PM
I think I know why you prefer this method to using slocate, tracker, or beagle. Is it because you have the full power of the command line, and hence things like grep, to search through the files?
I’ve been frustrated with slocate at times, wishing I could just use a regex. I’m not sure the disk thrashing every night is worthwhile, and non-real-time results seem a bit old-fashioned now with inotify and such.
on April 12th, 2007 at 09:23 PM
I like the find command because it can filter results based on access time, user or group, and size. It’s slow, so I started caching the results to a file for grep, since I don’t always write a good regex on my first try. The cron job was just an extension of that idea.
Just loaded the man page for slocate. It has the -r switch for regex and it looks like the only way to go for multiple users of the file database. I’ll work more with it in the future. Thanks for the tip.
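For reference, the kind of find filtering mentioned above (size, owner, access time) looks roughly like this. It is a small sketch against a throwaway directory; the file names are invented for the example.

```shell
#!/bin/sh
# Sketch only: demo directory and file names are hypothetical.
demo=$(mktemp -d)
dd if=/dev/zero of="$demo/big.log" bs=1024 count=4 2>/dev/null
touch "$demo/empty.log"

# Regular files larger than 1 KiB, owned by the current user,
# and accessed within the last 24 hours.
result=$(find "$demo" -type f -size +1k -user "$(id -un)" -atime -1)
echo "$result"
```

A plain list of paths cannot answer questions like these, which is the trade-off of the cached approach.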