Tuesday, 5 August 2014

sendfile system call > cat and cp

A friend just asked me whether I thought it would be quicker to copy a file with cat or with a bespoke program written for the job. I originally thought the difference would be negligible, but then I remembered the 'sendfile' system call and thought I would have a poke.

The 'sendfile' system call first appeared in the Linux 2.2 kernel (not exactly yesterday: that was 1999) and allows fast transfers between two file descriptors.

man sendfile:
"Because this copying is done within the kernel, sendfile() is more efficient than the combination of read(2) and write(2), which would require transferring data to and from user space"

A quick look through the coreutils source suggests that sendfile is not in use:

[adam@localhost coreutils-8.23]$ grep sendfile * -RA2
TODO:Integrate use of sendfile, suggested here:
TODO-  http://mail.gnu.org/archive/html/bug-fileutils/2003-03/msg00030.html
TODO-I don't plan to do that, since a few tests demonstrate no significant benefit.
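
For reference, the read(2) and write(2) combination that the man page compares sendfile against looks roughly like the sketch below. It only illustrates the technique and is not the actual cat or cp implementation; the names copy_read_write and write_all and the 64 KiB buffer are assumptions made for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Write all n bytes of buf to fd; write() may accept fewer than asked. */
static int write_all(int fd, const char *buf, size_t n)
{
    while (n > 0) {
        ssize_t w = write(fd, buf, n);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += w;
        n -= (size_t)w;
    }
    return 0;
}

/* Hypothetical helper: copy in_path to out_path via a user-space buffer.
 * Returns the number of bytes copied, or -1 on error. */
long long copy_read_write(const char *in_path, const char *out_path)
{
    static char buf[65536];
    long long total = 0;
    ssize_t n;

    int in_fd = open(in_path, O_RDONLY);
    if (in_fd < 0)
        return -1;
    int out_fd = open(out_path, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (out_fd < 0) {
        close(in_fd);
        return -1;
    }

    /* Each pass copies kernel -> buf in read(), then buf -> kernel in write(). */
    while ((n = read(in_fd, buf, sizeof buf)) != 0) {
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted, try the read again */
            break;          /* real error, n stays negative */
        }
        if (write_all(out_fd, buf, (size_t)n) < 0) {
            n = -1;
            break;
        }
        total += n;
    }

    close(in_fd);
    close(out_fd);
    return n < 0 ? -1 : total;
}
```

Every chunk crosses into user space in read() and back into the kernel in write(), which is the round trip that sendfile is meant to avoid.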

So for the hell of it I made a crude cp using sendfile to compare with:
 #include <sys/sendfile.h>
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <fcntl.h>
 #include <stdio.h>
 #include <unistd.h>
 // ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count);
 int main(int argc, char **argv){
     ssize_t sf_stat;
     if (argc != 3){
         printf("usage: %s in_file out_file\n", argv[0]);
         return -1;
     }
     int in_fd = open(argv[1], O_RDONLY);
     int out_fd = creat(argv[2], S_IRUSR|S_IWUSR);
     if (in_fd < 0 || out_fd < 0){
         perror("error opening file");
         return -1;
     }
     // Copy in 64 KiB chunks until sendfile() returns 0 (end of file) or -1 (error).
     do {
         sf_stat = sendfile(out_fd, in_fd, NULL, 65536);
     } while (sf_stat > 0);
     if (sf_stat < 0){
         perror("error copying file");
         return -1;
     }
     close(in_fd);
     close(out_fd);
     return 0;
 }

And the results were...

...that the coreutils code should probably be updated. I mean, come on, it's been 15 years already:

[adam@localhost c]$ time ./sendfile large_file /dev/null
real 0m0.044s
user 0m0.001s
sys 0m0.042s

[adam@localhost c]$ time cat large_file > /dev/null
real 0m0.283s
user 0m0.005s
sys 0m0.274s

[adam@localhost c]$ time cp large_file /dev/null
real 0m0.251s
user 0m0.003s
sys 0m0.246s

UPDATE! I did further tests. Sending files to /dev/null behaves differently when using sendfile: whatever you copy, it takes the same(ish) amount of time. For normal copies, sendfile is only slightly quicker. I am still looking into it. :)
