Friday, February 22, 2013

Reactor Pattern Explained - Part 1

Handling the concurrent events a server receives is often treated as a use case for creating a separate thread per connection. Most programmers are tempted to use the familiar accept loop, which creates a Socket for every incoming connection and hands it to a new thread.

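import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

class Server implements Runnable {
    static final int PORT = 8080; // example value
    public void run() {
        try (ServerSocket ss = new ServerSocket(PORT)) {
            while (!Thread.interrupted())
                // accept() blocks until a client connects
                new Thread(new Handler(ss.accept())).start();
            // or, single-threaded, or a thread pool
        } catch (IOException ex) {
            ex.printStackTrace(); // report the failure instead of hiding it
        }
    }
}

class Handler implements Runnable {
    static final int MAX_INPUT = 1024; // example value
    final Socket socket;
    Handler(Socket s) { socket = s; }
    public void run() {
        try {
            byte[] input = new byte[MAX_INPUT];
            socket.getInputStream().read(input); // blocks until data arrives
            byte[] output = process(input);
            socket.getOutputStream().write(output);
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
    // Placeholder for application logic; here it returns the input unchanged.
    private byte[] process(byte[] cmd) { return cmd; }
}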

The disadvantage of using a separate thread for each connection is overhead: every thread consumes memory for its stack, and the scheduler spends time context switching among them. In the worst case, threads serving connections that rarely read or write data spend almost their whole lifetime blocked, waiting for an IO event, so the resources they hold do no useful work. Note that ss.accept() is itself a blocking call: it blocks the server thread until a client connects, so the server thread cannot call start() on the new Handler thread until ss.accept() returns. Non-blocking IO was introduced to reduce this waste.

The Reactor Pattern is an event-handling design pattern used to address this issue. Here, one Reactor keeps looking for events and, once an event is triggered, informs the corresponding event handler to handle it. To explain this I am using some Java code borrowed from the "Scalable IO in Java" lecture slides by Professor Doug Lea. To see his explanation please go through this set of slides.

Java provides a standard API (java.nio) that can be used to build non-blocking IO systems. I will explain the Reactor Pattern with a simple client-server model in which each client shouts out its name to the server and the server responds to that client with a Hello message.

There are two important participants in the architecture of the Reactor Pattern.

1. Reactor  


A Reactor runs in a separate thread, and its job is to react to IO events by dispatching the work to the appropriate handler. It's like a telephone operator in a company who answers calls from clients and transfers the line to the appropriate receiver. Don't go too far with the analogy though :).

2. Handlers


A Handler performs the actual work to be done for an IO event, much like the officer in the company whom the calling client actually wants to speak to.

Since we are using the java.nio package, it's important to understand some of the classes used to implement the system. I will simply repeat some of the explanations by Doug Lea in his lecture slides to make the readers' lives easy :).

Channels


These are connections to files, sockets etc. that support non-blocking reads. Just as many TV channels can be watched through one physical connection to the antenna, many java.nio.channels.SocketChannels, one per client, can be created from a single java.nio.channels.ServerSocketChannel which is bound to a single port.

Buffers


Array-like objects that can be directly read or written to by Channels.
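To make the idea of a Buffer concrete, here is a minimal sketch of writing into a java.nio.ByteBuffer and reading the same bytes back. The class name BufferDemo, the method roundTrip and the 64-byte capacity are my own choices for illustration; only ByteBuffer and its allocate, put, flip, remaining and get methods come from java.nio.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of java.nio.
class BufferDemo {
    // Writes the text into a buffer, flips it, and reads the bytes back out.
    // Text longer than 64 bytes would overflow the buffer assumed here.
    static String roundTrip(String text) {
        ByteBuffer buffer = ByteBuffer.allocate(64);        // capacity chosen for the demo
        buffer.put(text.getBytes(StandardCharsets.UTF_8));  // writing advances the position
        buffer.flip();                                      // limit = position, position = 0
        byte[] out = new byte[buffer.remaining()];          // bytes between position and limit
        buffer.get(out);
        return new String(out, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("Alice")); // prints "Alice"
    }
}
```

The call to flip() is the step that is easiest to forget: a Channel fills the buffer in write mode, and the buffer has to be flipped before the bytes can be read back from the start.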

Selectors


Selectors tell which of a set of Channels has IO events.

Selection Keys


Selection Keys maintain IO event status and bindings. A Selection Key is a representation of the relationship between a Selector and a Channel. By looking at the Selection Key given by the Selector, the Reactor can decide what to do with the IO event which occurs on the Channel.
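The relationship between a Channel, a Selector and a Selection Key can be checked with a few lines of code. This is a sketch under my own assumptions: the class RegistrationDemo, the method keyLinksEverything and the placeholder attachment are hypothetical, and the server is bound to a free loopback port picked by the OS. The java.nio calls themselves (open, bind, configureBlocking, register, channel, selector, attachment, interestOps) are standard.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

// Hypothetical demo class; not part of java.nio.
class RegistrationDemo {
    // Registers a server channel with a selector and checks what the returned key knows.
    static boolean keyLinksEverything() {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            // Port 0 lets the OS choose any free port on the loopback interface.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            server.configureBlocking(false); // a channel must be non-blocking to be registered
            Object handler = "acceptor placeholder"; // stands in for a real handler object
            SelectionKey key = server.register(selector, SelectionKey.OP_ACCEPT, handler);
            return key.channel() == server
                && key.selector() == selector
                && key.attachment() == handler
                && key.interestOps() == SelectionKey.OP_ACCEPT;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(keyLinksEverything());
    }
}
```

The attachment slot is what lets a key carry a reference to the object that should handle its events.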

Now let's try to understand what the Reactor Pattern is. Take a look at this diagram.

 
Here, there is a single ServerSocketChannel which is registered with a Selector. SelectionKey 0 for this registration holds the information on what to do with the ServerSocketChannel when it gets an event. The events the ServerSocketChannel receives are incoming connection requests from clients: when a client requests a connection and wants a dedicated SocketChannel, the ServerSocketChannel is triggered with an IO event. What does the Reactor have to do with this event? It simply has to accept it to make a SocketChannel. Therefore SelectionKey 0 is bound to an Acceptor, a special handler made to accept connections, so that the Reactor can figure out from SelectionKey 0 that the event should be dispatched to the Acceptor. Notice that the ServerSocketChannel, SelectionKey 0 and the Acceptor are all in the same colour (gray, I suppose :)).

The Selector is made to keep looking for IO events. When the Reactor calls the Selector.select() method, the call blocks until at least one registered channel has a pending event; the Selector then provides the set of SelectionKeys for those channels (available through Selector.selectedKeys()). When SelectionKey 0 is selected, it means that an event has occurred on the ServerSocketChannel, so the Reactor dispatches the event to the Acceptor.

When the Acceptor accepts the connection from Client 1, it creates a dedicated SocketChannel 1 for the client. This SocketChannel is registered with the same Selector, with SelectionKey 1. What would the client do with this SocketChannel? It will simply read from and write to the server. The server does not need to accept connections from Client 1 any more, since it has already accepted the connection. What the server needs now is to read data from and write data to the channel. So SelectionKey 1 is bound to a Handler 1 object which handles reading and writing. Notice that SocketChannel 1, SelectionKey 1 and Handler 1 are all in green.

The next time the Reactor calls Selector.select(), if the returned SelectionKey set has SelectionKey 1 in it, it means that SocketChannel 1 has been triggered with an event. By looking at SelectionKey 1, the Reactor knows that it has to dispatch the event to Handler 1, since Handler 1 is bound to SelectionKey 1. If the returned SelectionKey set has SelectionKey 0 in it, it means that the ServerSocketChannel has received an event from another client, and by looking at SelectionKey 0 the Reactor knows that it has to dispatch the event to the Acceptor again. When the event is dispatched to the Acceptor, it makes SocketChannel 2 for Client 2 and registers that socket channel with the Selector, with SelectionKey 2.
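The walkthrough above can be sketched as a small single-threaded program. This is not the code from Doug Lea's slides or from the later parts of this series; MiniReactor, accept, greet and demo are names I made up, the server listens on a free loopback port chosen by the OS, and the handler assumes that a name arrives in a single read and that the reply fits in a single write. Only the java.nio and java.net calls are standard.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical single-threaded reactor; class and method names are illustrative.
class MiniReactor implements Runnable {
    final Selector selector;
    final ServerSocketChannel serverChannel;
    MiniReactor() throws IOException {
        selector = Selector.open();
        serverChannel = ServerSocketChannel.open();
        // Port 0 asks the OS for any free port on the loopback interface.
        serverChannel.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
        serverChannel.configureBlocking(false);
        // "SelectionKey 0": the Acceptor is bound to the key as its attachment.
        serverChannel.register(selector, SelectionKey.OP_ACCEPT, (Runnable) this::accept);
    }

    public void run() {
        try {
            while (!Thread.interrupted()) {
                selector.select(); // blocks until at least one channel has an event
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove(); // the selected set is not cleared automatically
                    ((Runnable) key.attachment()).run(); // dispatch to Acceptor or Handler
                }
            }
            serverChannel.close();
            selector.close();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    // Acceptor: creates the client's SocketChannel and binds a Handler to its new key.
    void accept() {
        try {
            SocketChannel client = serverChannel.accept();
            if (client == null) return; // nothing to accept after all
            client.configureBlocking(false);
            client.register(selector, SelectionKey.OP_READ, (Runnable) () -> greet(client));
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    // Handler: reads a name and writes "Hello <name>" back.
    // Assumes the name arrives in one read and the reply fits in one write.
    void greet(SocketChannel client) {
        try {
            ByteBuffer in = ByteBuffer.allocate(256);
            int n = client.read(in);
            if (n < 0) client.close(); // client hung up; closing also cancels the key
            if (n <= 0) return;
            in.flip();
            String name = StandardCharsets.UTF_8.decode(in).toString().trim();
            client.write(ByteBuffer.wrap(("Hello " + name + "\n").getBytes(StandardCharsets.UTF_8)));
        } catch (IOException ex) {
            try { client.close(); } catch (IOException ignored) { }
        }
    }

    // Demo: starts the reactor, connects one blocking client, and returns the reply.
    static String demo(String name) {
        try {
            MiniReactor reactor = new MiniReactor();
            Thread thread = new Thread(reactor);
            thread.setDaemon(true);
            thread.start();
            try (SocketChannel ch = SocketChannel.open(reactor.serverChannel.getLocalAddress())) {
                ch.write(ByteBuffer.wrap(name.getBytes(StandardCharsets.UTF_8)));
                ByteBuffer reply = ByteBuffer.allocate(256);
                ch.read(reply);
                reply.flip();
                return StandardCharsets.UTF_8.decode(reply).toString().trim();
            } finally {
                thread.interrupt(); // wakes select() and ends the reactor loop
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("Alice"));
    }
}
```

Note that the Reactor loop never inspects the channel itself. It only runs whatever is attached to the selected key, which is the binding between SelectionKeys and Handlers described above.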

So in this scenario we are interested in 3 types of events.
  1. Connection request events, which get triggered on the ServerSocketChannel and which we need to Accept.
  2. Read events, which get triggered on SocketChannels when they have data to be read, and from which we need to Read.
  3. Write events, which get triggered on SocketChannels when they are ready to be written with data, and to which we need to Write.
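In java.nio these three event types correspond to the constants SelectionKey.OP_ACCEPT, SelectionKey.OP_READ and SelectionKey.OP_WRITE, which are bit flags that can be combined into one interest set. The following sketch decodes such a set; the class InterestOps and the method describe are hypothetical helpers of mine.

```java
import java.nio.channels.SelectionKey;

// Hypothetical helper; only the SelectionKey constants come from java.nio.
class InterestOps {
    // Lists which of the three event types are present in an interest set.
    static String describe(int ops) {
        StringBuilder sb = new StringBuilder();
        if ((ops & SelectionKey.OP_ACCEPT) != 0) sb.append("accept ");
        if ((ops & SelectionKey.OP_READ) != 0) sb.append("read ");
        if ((ops & SelectionKey.OP_WRITE) != 0) sb.append("write ");
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // A SocketChannel interested in both reading and writing.
        System.out.println(describe(SelectionKey.OP_READ | SelectionKey.OP_WRITE)); // prints "read write"
    }
}
```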

A SelectionKey holds all the information about the relationship between its Channel and the Selector. It holds information about the corresponding Handler too. The Selector just selects the SelectionKeys which have pending IO events. This way the Reactor can decide how to deal with each IO event. The relationships among the Channels, Selection Keys and Handlers can be put in a table as follows.

Selection Key     Channel               Handler      Interested Operation
SelectionKey 0    ServerSocketChannel   Acceptor     Accept
SelectionKey 1    SocketChannel 1       Handler 1    Read and Write
SelectionKey 2    SocketChannel 2       Handler 2    Read and Write
SelectionKey 3    SocketChannel 3       Handler 3    Read and Write

Now what does a Thread pool have to do with this? Let me explain. The beauty of the non-blocking architecture is that we can write the server to run in a single Thread while still catering to all the requests from clients. Just forget about the Thread pool for a while. Naturally, a server designed without concurrency will be less responsive to events. When the system runs in a single Thread, the Reactor cannot respond to other events until the Handler to which the current event was dispatched is done with it. Why? Because we are using one Thread to handle all the events, so we have to go one by one.

We can add concurrency to our design to make the system more responsive and faster. When the Reactor dispatches an event to a Handler, it can start the Handler in a new Thread so that the Reactor can happily continue to deal with other events. This is usually a better design where performance is concerned, especially when Handlers do a significant amount of work per event. To limit the number of Threads in the system and to make things more organized, a Thread pool can be used.
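Handing work to a Thread pool can be sketched with java.util.concurrent. The class PooledDispatch, the method dispatchAll and the pool size of 4 are assumptions made for this illustration; in a real Reactor the Runnables would be the Handlers bound to the selected keys, and the pool would live as long as the server instead of being shut down after one batch.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of pooled dispatch; not tied to any particular Reactor class.
class PooledDispatch {
    static final int POOL_SIZE = 4; // assumed value; tune for the workload

    // Submits every handler to a fixed pool and returns how many of them finished.
    static int dispatchAll(Runnable... handlers) {
        ExecutorService pool = Executors.newFixedThreadPool(POOL_SIZE);
        AtomicInteger finished = new AtomicInteger();
        for (Runnable handler : handlers) {
            // execute() returns immediately, so the dispatching thread is free to continue.
            pool.execute(() -> {
                handler.run();
                finished.incrementAndGet();
            });
        }
        pool.shutdown(); // accept no new tasks; already submitted ones still run
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return finished.get();
    }

    public static void main(String[] args) {
        int done = dispatchAll(
            () -> System.out.println("handler 1 on " + Thread.currentThread().getName()),
            () -> System.out.println("handler 2 on " + Thread.currentThread().getName()));
        System.out.println(done + " handlers finished");
    }
}
```

One thing to keep in mind with this design is that a channel can be selected again while its Handler is still running on a pool thread, so a pooled Reactor usually has to coordinate the key's interest set with the Handler's progress.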

I believe this explanation is adequate for us to get our hands dirty with some coding.

Please read Reactor Pattern Explained - Part 2 and Reactor Pattern Explained - Part 3.
