SourceXR

C/C++ Cross-Reference Tool

Avoid Data Copy with splice()

The splice system call (and the related vmsplice and tee syscalls) moves data between file descriptors without copying it between kernel and user space, which avoids the performance penalty of shuttling a buffer back and forth across that boundary.

This syscall is Linux-specific.

The splice declaration is the following:

ssize_t splice (int fd_in, loff_t *off_in, int fd_out,
                loff_t *off_out, size_t len,
                unsigned int flags);

As described in the man page, one (or both) of the file descriptors must refer to a pipe.

splice is similar to the sendfile system call, which also transfers data between two file descriptors inside the kernel but does not require either of them to be a pipe. With splice, the descriptor that is not a pipe can refer to a file (as a target, it must not be open in append mode), a socket, etc.
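To make the pipe requirement concrete, here is a minimal sketch of a file-to-file copy that goes through an intermediate pipe, since splice cannot connect two regular files directly. The helper name copy_via_pipe is hypothetical and not part of the article's sample; error handling is kept to the bare minimum.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     // exposes splice() in <fcntl.h> (g++ usually predefines it)
#endif
#include <fcntl.h>      // splice
#include <stdio.h>      // perror
#include <unistd.h>     // pipe, close, ssize_t

// Hypothetical helper: copy up to len bytes from in_fd to out_fd
// (e.g. two regular files) through a temporary pipe.
// Returns the number of bytes copied, or -1 on error.
ssize_t copy_via_pipe (int in_fd, int out_fd, size_t len)
{
    int p[2];
    if (pipe (p) == -1) {
        perror ("pipe");
        return -1;
    }

    ssize_t total = 0;
    bool ok = true;
    while (ok && len > 0) {
        // source -> pipe: the kernel queues at most what fits in the pipe
        ssize_t n = splice (in_fd, NULL, p[1], NULL, len, 0);
        if (n == -1) {
            perror ("splice (in)");
            ok = false;
            break;
        }
        if (n == 0) {
            break;      // end of input
        }
        len -= static_cast<size_t> (n);

        // pipe -> target: drain exactly what was just queued
        while (n > 0) {
            const ssize_t m = splice (p[0], NULL, out_fd, NULL, n, 0);
            if (m <= 0) {
                perror ("splice (out)");
                ok = false;
                break;
            }
            n -= m;
            total += m;
        }
    }

    close (p[0]);
    close (p[1]);
    return ok ? total : -1;
}
```

Passing NULL for both offsets makes splice use, and advance, the current file offsets of in_fd and out_fd, just as read and write would.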

In this article we'll use splice to redirect the stdout and stderr of a child process to the corresponding streams of its parent: the parent never copies the data through a user-space buffer.

Without the splice syscall, we would have to read() the pipe to fill a buffer and then call write() to display its contents to the standard output/error.
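For comparison, the conventional approach could look like the sketch below. The function name forward_copy and the 4096-byte buffer size are illustrative choices, not taken from the article's source; the point is that every byte crosses the kernel/user boundary twice.

```cpp
#include <unistd.h>     // read, write, ssize_t

// Hypothetical read()/write() counterpart of a splice-based forwarder.
// Returns the number of bytes forwarded, 0 on end of file, -1 on error.
ssize_t forward_copy (int src, int dst)
{
    char buf[4096];

    // first copy: kernel pipe buffer -> user-space buffer
    const ssize_t n = read (src, buf, sizeof buf);
    if (n <= 0) {
        return n;
    }

    // second copy: user-space buffer -> kernel, repeated on short writes
    ssize_t done = 0;
    while (done < n) {
        const ssize_t w = write (dst, buf + done, n - done);
        if (w == -1) {
            return -1;
        }
        done += w;
    }
    return done;
}
```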

The sample consists of several steps:

  • first: setup of the two pipes (stdout and stderr)
  • second: creation of the child process with fork
  • third: put parent end of the stdout and stderr pipes in a select() loop to watch the two descriptors at the same time
  • fourth: whenever data is available on one of the pipes, use splice to forward it to the corresponding channel.

Pipe Setup

We simply call the pipe syscall for the two redirected channels:

    // create stdout/stderr pipe
    int out[2];
    int r = pipe (out);
    if (r == -1) {
        perror ("pipe");
        return -1;
    }

    int err[2];
    r = pipe (err);
    if (r == -1) {
        perror ("pipe");
        return -1;
    }

Forking a Child Process

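We call fork. The child closes the read end of both pipes and redirects its stdout and stderr to the write ends with dup2 (stdoutfd and stderrfd are presumably the standard descriptor numbers, i.e. STDOUT_FILENO and STDERR_FILENO):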

    pid_t childpid = fork ();
    if (childpid == -1) {
        perror ("fork");
        return -1;
    }
    if (childpid == 0) {
        close (out[0]);
        close (err[0]);

        // close out, err -> redirect to pipe
        r = dup2 (out[1], stdoutfd);
        if (r == -1) {
            perror ("dup2");
            return -1;
        }
        r = dup2 (err[1], stderrfd);
        if (r == -1) {
            perror ("dup2");
            return -1;
        }
        close (out[1]);
        close (err[1]);

Then the child process enters a simple print loop.

        // print to stdout
        while (true) {
            fprintf (stdout, "child alive\n");
            fflush (stdout);
            sleep (1);
        }
        exit (0);

Monitoring Child Pipes

On the other end of the pipes, the parent process closes the write ends it does not use and sets up a select loop to watch both read ends:

    else {
        // add out, err handlers
        close (out[1]);
        close (err[1]);

        const int m = std::max (out[0], err[0]) + 1;

        while (true) {
            // select() overwrites the set on return, so it has to be
            // rebuilt before every call
            fd_set rfds;
            FD_ZERO (&rfds);
            FD_SET (out[0], &rfds);
            FD_SET (err[0], &rfds);

            // wait for events (no write/exception sets, no timeout)
            int r = select (m, &rfds, NULL, NULL, NULL);
            if (r == -1) {
                perror ("select");
                return 1;
            }
            else {
                if (FD_ISSET (out[0], &rfds)) {
                    if (forward (out[0], stdoutfd) == -1) {
                        return -1;
                    }
                }
                if (FD_ISSET (err[0], &rfds)) {
                    if (forward (err[0], stderrfd) == -1) {
                        return -1;
                    }
                }
            }
        }
        close (out[0]);
        close (err[0]);
    }

Whenever select returns, we check which file descriptors are readable and forward their data to the parent's stdout or stderr with splice:

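// Move the data pending on the pipe src to dst without a user-space buffer.
// Returns the number of bytes moved, or -1 on error.
ssize_t forward (int src, int dst)
{
    // ask the kernel how many bytes are currently queued in the pipe
    int size = 0;
    if (ioctl (src, FIONREAD, &size) == -1) {
        perror ("ioctl");
        return -1;
    }
    // NULL offsets: src is a pipe, and dst is written at its current position
    const ssize_t r = splice (src, NULL, dst, NULL, size, 0);
    if (r == -1) {
        perror ("splice");
        return -1;
    }
    return r;
}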

Notice that the parent never allocates or fills a buffer of its own: the data stays in the kernel on its way from the pipe to the target descriptor.

We first query the amount of pending data with an ioctl (FIONREAD) call; splice then moves that many bytes inside the kernel from the source descriptor to the target.
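As a possible variant, the size query can be dropped: when the source is a pipe, splice returns once it has moved whatever is currently available, so asking for a generous upper bound should be enough. The sketch below assumes a 65536-byte cap (an arbitrary choice matching the usual default pipe capacity) and uses the hypothetical name forward_pending; a return value of 0 means every write end of the pipe has been closed.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     // exposes splice() in <fcntl.h> (g++ usually predefines it)
#endif
#include <fcntl.h>      // splice
#include <stdio.h>      // perror
#include <unistd.h>     // ssize_t

// Hypothetical variant of forward() without the FIONREAD ioctl.
// Returns the number of bytes moved, 0 on end of file, -1 on error.
ssize_t forward_pending (int src, int dst)
{
    // Assumed upper bound per call; splice moves at most this much and
    // returns earlier if the pipe holds less.
    const size_t max_chunk = 65536;

    const ssize_t r = splice (src, NULL, dst, NULL, max_chunk, 0);
    if (r == -1) {
        perror ("splice");
    }
    return r;
}
```

Treating the 0 return value as end of file also gives the select loop a natural exit condition once the child has terminated.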

Source Code

The source file of the implementation can be found here.
