select vs. poll vs. epoll
All three are used for I/O multiplexing: they monitor multiple file descriptors to see whether I/O is possible on any of them.
epoll was meant to replace the older POSIX select and poll system calls.
Complexity and Scalability
- The older system calls operate in O(n) time: every time you call select or poll, the kernel needs to check from scratch whether each of your file descriptors is ready for reading or writing. The kernel doesn't remember the list of file descriptors it's supposed to be monitoring.
- epoll is usually described as operating in O(1) time. epoll_wait doesn't do a linear scan over all the watched descriptors, so its cost depends on how many descriptors are ready rather than on how many are being watched.
- epoll uses a red-black tree (RB-tree) data structure to keep track of all file descriptors that are currently being monitored, so adding or removing a descriptor with epoll_ctl costs O(log n).
- epoll "scales well to large numbers of watched file descriptors."
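To make the difference in call shape concrete, here is a minimal sketch using Python's select module, which wraps these system calls. It assumes Linux (select.epoll does not exist elsewhere), and the pipe and variable names are only for illustration.

```python
import os
import select

# Two pipes; only the second one gets data written to it.
r1, w1 = os.pipe()
r2, w2 = os.pipe()
os.write(w2, b"x")

# select: the whole interest set is handed to the kernel on every call.
readable, _, _ = select.select([r1, r2], [], [], 0)
select_ready = sorted(readable)

# epoll: descriptors are registered once; each wait only returns ready ones.
ep = select.epoll()
ep.register(r1, select.EPOLLIN)
ep.register(r2, select.EPOLLIN)
epoll_ready = [fd for fd, _event in ep.poll(0)]

ep.close()
for fd in (r1, w1, r2, w2):
    os.close(fd)
```

Both calls report the same ready descriptor here. The difference is in what has to be passed in and scanned each time, which only starts to matter with many descriptors.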
Availability and Portability
select and poll are available on any Unix system. epoll is Linux-specific (available since kernel version 2.5.44). poll is a POSIX standard interface, so use that when portability is required.
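If you are in Python rather than C, one way to stay portable is to not pick a system call yourself. The sketch below swaps in the standard library's selectors module (not one of the three calls above): selectors.DefaultSelector picks the most efficient mechanism the platform offers, which is epoll on Linux. The socket pair and the "ping" payload are only for illustration.

```python
import selectors
import socket

sel = selectors.DefaultSelector()   # epoll on Linux, kqueue on BSD/macOS, ...
backend = type(sel).__name__        # which implementation was picked

a, b = socket.socketpair()
sel.register(a, selectors.EVENT_READ)
b.send(b"ping")

# select() returns (key, events) pairs for the ready file objects.
ready = [key.fileobj for key, _events in sel.select(timeout=1)]

sel.unregister(a)
sel.close()
```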
poll vs select
Given a list of file descriptors, both tell you which ones are ready for reading or writing. In the Linux kernel, select and poll fundamentally use the same code. poll returns a larger set of possible results for file descriptors, such as POLLRDNORM, POLLRDBAND, POLLIN, POLLHUP, and POLLERR, while select just tells you there's input/output/error.
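A small sketch of that difference, using Python's select module on a pipe whose writer has gone away. It assumes a Unix system (select.poll is not available on Windows). The point is that select can only say "readable", while poll says why.

```python
import os
import select

r, w = os.pipe()
os.close(w)                      # the writer hangs up; no data was written

# select: r simply shows up in the "readable" list.
readable, _, _ = select.select([r], [], [], 0)

# poll: the returned event mask says what actually happened.
p = select.poll()
p.register(r, select.POLLIN)
events = p.poll(0)               # list of (fd, event_mask) pairs
hung_up = bool(events[0][1] & select.POLLHUP)

os.close(r)
```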
poll can perform better than select if you have a sparse set of file descriptors. poll takes an array of pollfd structures that lists exactly which file descriptors to monitor; select takes bitsets and loops over the whole range up to the highest-numbered descriptor.
level-triggered vs edge-triggered
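- level-triggered: each call gives you the list of every file descriptor you're interested in that is currently readable, for as long as it stays readable.
- edge-triggered: you get a notification each time a file descriptor becomes readable; it is not reported again until new data arrives.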
poll is only level-triggered, but epoll can be used as either an edge- or a level-triggered interface.
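The two modes are easiest to see side by side. This sketch (Linux only, using Python's select.epoll) writes to a pipe once and then waits twice without reading anything in between. The helper name count_wakeups is made up for this example.

```python
import os
import select

def count_wakeups(flags):
    """Write once, wait twice without reading; return events seen per wait."""
    r, w = os.pipe()
    ep = select.epoll()
    ep.register(r, flags)
    os.write(w, b"data")
    first = len(ep.poll(0))
    second = len(ep.poll(0))     # nothing was read between the two waits
    ep.close()
    os.close(r)
    os.close(w)
    return first, second

level = count_wakeups(select.EPOLLIN)                  # level-triggered (default)
edge = count_wakeups(select.EPOLLIN | select.EPOLLET)  # edge-triggered
```

In level-triggered mode the pipe is reported on both waits because it is still readable. In edge-triggered mode it is reported only once, which is why edge-triggered code normally reads until the descriptor is drained.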
epoll
- epoll_create: create an epoll instance (start epolling).
- epoll_ctl: tell the kernel which file descriptors you're interested in updates about.
- epoll_wait: wait for updates about the list of files you are interested in.
epoll_wait() itself is a blocking interface. Whenever you call epoll_wait(), it blocks your thread/process until any of the monitored events happens on the registered descriptors, or until the timeout you passed expires. Whether you ask epoll to monitor a non-blocking or a blocking FD, the epoll interface itself is still blocking-based.
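The three calls map fairly directly onto Python's select.epoll object, so a rough sketch of the lifecycle looks like this. It assumes Linux; the 50 ms timeout and the pipe are arbitrary choices for the example.

```python
import os
import select
import time

r, w = os.pipe()

ep = select.epoll()                  # epoll_create
ep.register(r, select.EPOLLIN)       # epoll_ctl(..., EPOLL_CTL_ADD, ...)

# Nothing to read yet: epoll_wait blocks until the timeout expires.
start = time.monotonic()
idle = ep.poll(0.05)                 # epoll_wait with a 50 ms timeout
waited = time.monotonic() - start

# Once data is available, the wait returns right away.
os.write(w, b"hi")
events = ep.poll(1)
data = os.read(r, 16)

ep.unregister(r)                     # epoll_ctl(..., EPOLL_CTL_DEL, ...)
ep.close()
os.close(r)
os.close(w)
```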
Where epoll is used
- node.js uses libuv (which was written for the node.js project).
- The gevent networking library in Python uses libev/libevent.
- golang uses some custom code. This looks like it might be the implementation of network polling with epoll in the golang runtime; it's only about 100 lines, which is interesting. You can see the general netpoll interface; it's implemented on BSDs with kqueue instead.
- Web servers also use epoll: every time the web server accepts a connection with the accept system call, it gets a new file descriptor representing that connection. There may be thousands of connections open at the same time. You need to know when people send you new data on those connections, so you can process and respond to them.
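The web server case can be sketched as a tiny echo server: the listening socket and every accepted connection are registered with epoll, and one loop reacts to whichever descriptor is ready. This is a toy under several assumptions (Linux, loopback only, small payloads, no error handling), not how any particular server does it. serve_once and the conns dictionary are names invented here.

```python
import select
import socket

def serve_once(listener, ep, conns, timeout=1.0):
    """Handle one batch of epoll events: accept new clients or echo data."""
    for fd, _event in ep.poll(timeout):
        if fd == listener.fileno():
            conn, _addr = listener.accept()      # new fd for this connection
            conn.setblocking(False)
            conns[conn.fileno()] = conn
            ep.register(conn.fileno(), select.EPOLLIN)
        else:
            conn = conns[fd]
            data = conn.recv(4096)
            if data:
                conn.sendall(data)               # echo back (small payloads only)
            else:                                # empty read: peer closed
                ep.unregister(fd)
                conns.pop(fd).close()

listener = socket.socket()
listener.bind(("127.0.0.1", 0))                  # port 0: let the OS pick one
listener.listen()
listener.setblocking(False)

ep = select.epoll()
ep.register(listener.fileno(), select.EPOLLIN)
conns = {}

# Drive the loop by hand with a single client.
client = socket.create_connection(listener.getsockname())
client.settimeout(1)
client.sendall(b"hello")
serve_once(listener, ep, conns)                  # accepts the connection
serve_once(listener, ep, conns)                  # echoes the data
reply = client.recv(4096)

client.close()
serve_once(listener, ep, conns)                  # notices the close, cleans up
ep.close()
listener.close()
```

A real server would run serve_once in a loop forever and would also have to deal with partial writes and errors. The shape stays the same: one wait call covers all connections.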