# CS 61 2017
Source: [https://cs61.seas.harvard.edu/wiki/2017/Shell3/](https://cs61.seas.harvard.edu/wiki/2017/Shell3/)
## Pipes, Forks, and Zombies
## Pipe History
### The idea of pipes
Doug McIlroy described the concept of pipes long before they were implemented\.
```
Summary--what's most important.
To put my strongest concerns into a nutshell:
1. We should have some ways of coupling programs like
garden hose--screw in another segment when it becomes when
it becomes necessary to massage data in another way.
This is the way of IO also.
2. Our loader should be able to do link-loading and
controlled establishment.
3. Our library filing scheme should allow for rather
general indexing, responsibility, generations, data path
switching.
4. It should be possible to get private system components
(all routines are system components) for buggering around with.
M. D. McIlroy
October 11, 1964
```
### Literate Programming
Don Knuth is one of the pioneers of computer science\. He wrote `The Art of Computer Programming` and created the `TeX` typesetting system\. Knuth enjoyed writing and programming so much that he developed `literate programming`, a style of programming in which you write prose about a program as you write its code\. The idea is to write prose and programs simultaneously\. However, this idea did not take off because there was a large amount of overhead associated with completing simple tasks like text parsing\. McIlroy responded to Knuth's work from a systems programming perspective\. He completed the same text parsing task \(printing the most frequent words in a text\) in 6 lines of shell code using pipes\. While Knuth approached the problem from an algorithms perspective, McIlroy approached it from a systems perspective, chaining intermediate outputs together to arrive at the answer\.
Pipes matter\!
[McIlroy](http://doc.cat-v.org/unix/pipes/) [Knuth](http://www.literateprogramming.com/knuthweb.pdf)
## The Less Program
The `seq` program takes a number as an argument and prints consecutive numbers starting at that number\. If a second argument is provided, the numbers stop once the second number has been reached\. Otherwise, the numbers continue forever\.
```
$ seq 2 5
2
3
4
5
```
If we pipe the output of the `seq` program to `less`, only part of the output appears, because `less` displays just enough output to fill your screen\. The `seq` program appears to be paused\. However, is it still running?
```
$ seq 2 100000000 | less
2
3
4
5
6
:█
```
If we pipe `seq` to `less` and look at the list of processes using `ps aux`, we can see that the `seq` program is not running: it is blocked in `write` because the pipe's buffer is full and `less` has stopped reading\. Using `strace` to further examine what is happening reveals that after a series of `write` system calls, `seq` receives a `SIGPIPE` signal once `less` quits\. A `SIGPIPE` occurs when you write to a pipe with no readers\. The default action for `SIGPIPE` is to kill the program\. Pipes therefore automatically kill programs when their output is no longer needed\. This explains why the `seq` program is killed, rather than running forever, when it is piped to `less`\.
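The `SIGPIPE` behavior can be reproduced without `seq` or `less`\. The following is a minimal sketch, not the course's code: the helper name `signal_from_write_without_readers` is made up for illustration\. It closes the only read end of a pipe, forks a child that writes to the pipe, and reports which signal killed the child\.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not from the notes): fork a child that writes to a
// pipe with no readers. Returns the number of the signal that killed the
// child, 0 if the child exited normally, or -1 on error.
int signal_from_write_without_readers(void) {
    int pipefd[2];
    if (pipe(pipefd) < 0) {
        return -1;
    }
    close(pipefd[0]);  // close the only read end: the pipe now has no readers

    pid_t p = fork();
    if (p < 0) {
        close(pipefd[1]);
        return -1;
    }
    if (p == 0) {
        signal(SIGPIPE, SIG_DFL);  // make sure the default action applies
        ssize_t n = write(pipefd[1], "x", 1);  // expected to raise SIGPIPE
        (void) n;
        _exit(0);  // reached only if SIGPIPE did not kill the child
    }

    close(pipefd[1]);
    int status;
    if (waitpid(p, &status, 0) != p) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On a typical POSIX system the return value should equal `SIGPIPE`, which matches what `strace` shows for `seq`\.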
## Using Pipe to Implement Waitpid
```
waitpid(p, &status, 0); // block until p exits or there is a signal
```
Given that we don't care about `status`, how can we use pipes to create a blocking call that unblocks when the process dies? When the child exits, we want the call to `read` to return 0, because the child will have exited and all write ends of the pipe will be closed\.
```
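#include <sys/types.h>
#include <unistd.h>

int main() {
    int pipefd[2];
    pipe(pipefd);
    pid_t p = fork();
    if (p == 0) {
        close(pipefd[0]); // child keeps only the write end, which stays open across exec
        // placeholder command: the original notes elide the exec arguments
        execlp("sleep", "sleep", "1", (char*) NULL);
        _exit(127); // reached only if exec fails
    }
    close(pipefd[1]); // all write ends in the parent must be closed
    char buf;
    read(pipefd[0], &buf, 1); // the read syscall will block until the child exits
                              // when the child exits, this will return 0
    close(pipefd[0]); // pipe hygiene
}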
```
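The trick above relies on one property of pipes: `read` returns 0 \(end of file\) only after every write end has been closed and the buffered data has been drained\. A small sketch that checks this property in a single process follows; the helper name `read_after_writer_closes` is an assumption made for illustration\.

```c
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (not from the notes): write two bytes into a fresh
// pipe, close the write end, drain the data, then read once more.
// Returns the result of that final read, or -1 if an earlier step fails.
ssize_t read_after_writer_closes(void) {
    int pipefd[2];
    if (pipe(pipefd) < 0) {
        return -1;
    }
    if (write(pipefd[1], "hi", 2) != 2) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }
    close(pipefd[1]);  // no write ends remain open

    char buf[8];
    ssize_t first = read(pipefd[0], buf, sizeof buf);  // drains the 2 bytes
    if (first != 2) {
        close(pipefd[0]);
        return -1;
    }
    ssize_t second = read(pipefd[0], buf, sizeof buf);  // end of file
    close(pipefd[0]);
    return second;
}
```

If the write end were left open, the second `read` would block instead of returning, which is why the parent must close its own write end before reading\.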
## Parents, Children, and Zombies
### Process Hierarchy
Every process has a single parent\. The root of the process hierarchy \(or process tree\) is a process called `init`, which has pid 1\. This is the only process that cannot be killed\. The `waitpid` system call retrieves a process's exit status\. The exit status of a process is stored in the process structure until the parent process asks for it\. `waitpid` collects the status and recycles the process structure\. This means that the process structure can be reused for another process\.
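As a minimal sketch of how a parent retrieves an exit status, the hypothetical helper `collect_exit_status` below forks a child that exits with a given code and returns what `waitpid` reports\. The helper name is an assumption; the calls and macros are standard POSIX\.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not from the notes): run a child that exits with
// `code` and return the exit status the parent collects, or -1 on error.
int collect_exit_status(int code) {
    pid_t p = fork();
    if (p < 0) {
        return -1;
    }
    if (p == 0) {
        _exit(code);  // child: terminate immediately with the given status
    }

    int status;
    if (waitpid(p, &status, 0) != p) {  // blocks until the child exits
        return -1;
    }
    if (!WIFEXITED(status)) {  // child was killed by a signal instead
        return -1;
    }
    return WEXITSTATUS(status);  // low 8 bits of the child's exit code
}
```

Once `waitpid` returns, the kernel no longer needs to keep the child's status, so the process structure can be recycled\.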
### ManyFork
The `manyfork` program tries to call `fork` 10000 times\. However, if we run `./manyfork`, only ~3400 processes are created\. Running `sudo ./manyfork`, which gives the program more privileges, results in ~6890 processes created\. The operating system is protecting the user from runaway programs by limiting how many processes can exist\. If we look at the processes created by the `manyfork` program, we see that most of them are defunct\.
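The source of `manyfork` is not shown in these notes, so the following is only a sketch of what such a loop might look like\. The function name `fork_many` and its `limit` parameter are assumptions\. Unlike `manyfork` as described, this version reaps its children before returning so that it leaves no defunct processes behind\.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical sketch (not the course's manyfork): try to fork up to
// `limit` children that exit immediately. Returns how many forks succeeded.
int fork_many(int limit) {
    int created = 0;
    for (int i = 0; i < limit; ++i) {
        pid_t p = fork();
        if (p < 0) {
            break;  // fork failed, typically because a process limit was hit
        }
        if (p == 0) {
            _exit(0);  // child: exit right away
        }
        ++created;
    }

    // Reap one child per successful fork. Until this point, every exited
    // child still occupies a process ID. Note that waitpid(-1, ...) reaps
    // any child of the caller, not only the ones created above.
    for (int i = 0; i < created; ++i) {
        waitpid(-1, NULL, 0);
    }
    return created;
}
```

With a small `limit` every fork should succeed; with a large one, the return value shows where the operating system's process limit cut the loop short\.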
### Zombies
The `manyfork` program does not wait for its children using `waitpid`\. This creates what are called **zombie processes**\. A zombie process is a process that has terminated but has not been waited upon by its parent\. The `ps` command allows us to identify these zombie processes\. Below is a sample output of `ps` after running the `manyfork` program\.
`user 78623 0.0 0.0 0 0 pts/0 Z+ 15:44 0:00 [manyfork]`
`user 78624 0.0 0.0 0 0 pts/0 Z+ 15:44 0:00 [manyfork]`
`user 78625 0.0 0.0 0 0 pts/0 Z+ 15:44 0:00 [manyfork]`
The `Z` in the state column \(`Z+`\) tells us that these processes are zombie processes; the `+` marks a process in the foreground process group\. These zombie processes still hold resources, namely process IDs\. When a child outlives its parent, the child is reparented to the `init` process with pid 1\. The `init` process collects orphaned children in this way\. The job of `init` is to call `waitpid` on orphaned children to collect their resources\.