Recently, a work colleague needed to transfer many large files to me for processing. We both had user accounts on the server on which the files resided, but based on the file system configuration, direct transfer via the usual strategy of move-the-files-to-a-shared-folder wasn’t feasible.
While there was plenty of space on the partition mounted to /home, neither of us had privileges to create a mutually
accessible directory in that space. We could both read from and write to /tmp in the root partition; however, the total
free disk space in that partition was smaller than most of the individual files.
My modus operandi in these cases is to find a server on the network that has sufficient space, and use it to facilitate the transfer. Unfortunately, the server was in a network zone that kept it pretty well isolated, and there were no suitable servers within reach.
Somehow, my brain managed to dredge up the term “named pipe”, a seed planted there years ago, no doubt, by someone who had extolled their virtues after having found them to be useful.
What is a Named Pipe?
A named pipe is simply a permanent version of the traditional Unix pipe that you may already be familiar with. A pipe is a means of inter-process communication, or IPC. Simply put, it’s a way of transferring data from one process to another.
With a traditional, or unnamed pipe, output data from one process is passed (or piped) as the input data of another process, e.g.:
cat foo.txt | grep 'bar'
Here, cat outputs the contents of file foo.txt, which is used as input to the grep command. These pipes are transient;
they exist until the processes exit, then they disappear.
Following the Unix mantra of “everything is a file”, named pipes are represented as a file; specifically, a zero-byte file. Named pipes have a few attributes that differentiate them from traditional pipes:
- They are persistent; once created, they stick around until explicitly deleted.
- Since they are represented as a file, read/write permissions can be assigned.
- Processes writing to / reading from a named pipe need not be running in the same terminal session, nor under the same user.
- Opening the pipe for writing blocks until another process opens it for reading, and vice versa. Data in transit sits in a small kernel buffer and is never written to disk.
# Create a named pipe
mkfifo my_pipe
# Examine its attributes. The 'p' indicates a file of type 'pipe'
ls -l my_pipe
prw-r--r-- 1 nobrien nobrien 0 Aug 10 10:31 my_pipe
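The permissions point from the list above can be tried out directly. The following is a minimal sketch, assuming a scratch directory from mktemp and a hypothetical pipe name demo_pipe; it creates a pipe with explicit permissions using mkfifo -m, then checks the file type with the shell's -p test.

```shell
#!/bin/bash
# Work in a scratch directory so the sketch leaves nothing behind.
workdir=$(mktemp -d)
pipe="$workdir/demo_pipe"   # hypothetical name, for illustration only

# Create the pipe with explicit permissions: owner and group may read and write.
mkfifo -m 660 "$pipe"

# [ -p FILE ] is true only when FILE exists and is a named pipe.
if [ -p "$pipe" ]; then
    is_pipe=yes
else
    is_pipe=no
fi
echo "named pipe: $is_pipe"

# ls marks a named pipe with a leading 'p' in the mode string.
mode=$(ls -ld "$pipe" | cut -c1-10)
echo "mode: $mode"

# A named pipe is removed like any other file.
rm "$pipe"
rmdir "$workdir"
```

Setting the mode at creation time avoids a window in which the pipe exists with permissions derived from the current umask.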
The example from above can be implemented using a named pipe, as:
# Named pipes are blocking; run this in bg
cat foo.txt > my_pipe &
[1] 39791
# Read the 'contents' of the pipe. Note that the process writing to the pipe finishes when all the data is read
cat my_pipe
Bar Bar Qux
[1]+ Done cat foo.txt > my_pipe
That's not nearly as succinct as the traditional pipe, but named pipes have their advantages.
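For a closer match to the original cat | grep example, the reader can be any command that accepts a file argument. Here is a small sketch, assuming a scratch directory and made-up sample data in place of foo.txt:

```shell
#!/bin/bash
workdir=$(mktemp -d)
pipe="$workdir/demo_pipe"   # hypothetical name, for illustration only
mkfifo "$pipe"

# Writer: runs in the background and blocks until a reader opens the pipe.
# The three sample lines stand in for the contents of foo.txt.
printf 'foo\nbar\nqux\n' > "$pipe" &

# Reader: grep treats the pipe like any other input file.
matches=$(grep 'bar' "$pipe")

# Let the background writer finish before cleaning up.
wait
echo "$matches"

rm "$pipe"
rmdir "$workdir"
```

The writer and the reader here share one script only for convenience; they could just as well be started from two different terminals.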
Transferring Large Files With Named Pipes
Knowing that my colleague and I each had ample space under /home, and mutual read/write access in /tmp, we could
easily solve the transfer problem using named pipes, as follows:
- For each file to be transferred, set up a named pipe in /tmp.
- My colleague would cat the contents of each file into its respective pipe.
- I would cat the “contents” of the pipe into files in my home directory.
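The three steps above can be tried end to end for a single file. This is only a sketch under stated assumptions: both sides run in one script, a scratch directory stands in for /tmp and the two home directories, and the file names are hypothetical. The checksum comparison at the end is an addition for illustration, not part of the steps themselves.

```shell
#!/bin/bash
workdir=$(mktemp -d)
src="$workdir/source.log"        # stands in for my colleague's file
dst="$workdir/copy.log"          # stands in for the copy in my home directory
pipe="$workdir/source.log-pipe"  # stands in for the pipe in /tmp

# Generate some sample data to transfer.
seq 1 100000 > "$src"

# Step 1: set up the named pipe.
mkfifo "$pipe"

# Step 2: the sender writes the file into the pipe (in the background,
# because it blocks until a reader opens the pipe).
cat "$src" > "$pipe" &

# Step 3: the receiver reads the pipe into a regular file.
cat "$pipe" > "$dst"
wait

# Compare checksums to confirm the copy matches the original.
src_sum=$(cksum < "$src")
dst_sum=$(cksum < "$dst")
if [ "$src_sum" = "$dst_sum" ]; then
    result=match
else
    result=mismatch
fi
echo "transfer result: $result"

rm -r "$workdir"
```

Because the data streams through the pipe, the partition holding the pipe never needs room for the file itself.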
We worked up some quick ‘n’ dirty shell scripts to do all the work. First my colleague would create the pipes and write to them:
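#!/bin/bash
# Create one pipe per log file and start a background writer for each.
for i in *.log
do
    p="/tmp/$i-pipe"
    echo "Creating pipe $p"
    mkfifo "$p"
    # Each cat blocks until a reader opens its pipe, so run it in the
    # background; otherwise the loop would stall on the first file.
    nohup cat "$i" > "$p" &
done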
Once the pipes were set up, I ran this script to read all the data to files in my home directory with ./readpipe.sh /tmp/*-pipe:
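#!/bin/bash
# Read each pipe given on the command line into a file in the current directory.
for p in "$@"
do
    echo "Reading pipe $p"
    # Extract log filename: /tmp/app.log-pipe -> app.log
    f=${p##*/}      # drop the directory part
    f=${f%-pipe}    # drop the -pipe suffix added by the writer script
    echo "Writing to $f"
    nohup cat "$p" > "$f" &
done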
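The filename extraction is the fiddly part of the reader script, so it is worth checking on its own. Below is a minimal sketch using shell parameter expansion, assuming the pipes are named with a -pipe suffix as in the writer script; the helper name log_name_for_pipe and the sample paths are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: map a pipe path such as /tmp/app.log-pipe back to app.log.
log_name_for_pipe() {
    local name=${1##*/}              # strip everything up to the last '/'
    printf '%s\n' "${name%-pipe}"    # strip the trailing -pipe suffix
}

# Sample paths, including a log name that itself contains a hyphen.
name_a=$(log_name_for_pipe /tmp/app.log-pipe)
name_b=$(log_name_for_pipe /tmp/web-server.log-pipe)
echo "$name_a"
echo "$name_b"
```

Stripping a fixed suffix rather than splitting on a delimiter keeps log names that contain hyphens or underscores intact.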
Conclusion
Named pipes aren’t going to replace traditional pipes for simple, day-to-day commands and processing, but they can be very useful in certain situations. It pays to keep them in the back of your mind.