


how does it work?

The limit order book data provided by LOBSTER is constructed on demand from a database of NASDAQ's Historical TotalView-ITCH files. We do not offer a one-size-fits-all solution. Users get access to exactly the data they need: select the ticker, the period and the level of detail, and the LOBSTER platform reconstructs the limit order book from the exchange message data.

The process by which the limit order books are derived is illustrated below. The first section introduces the idea and structure of the underlying raw exchange message data. The second section presents the limit order book reconstruction mechanism.

underlying message data.

The limit order books reconstructed by LOBSTER are based on NASDAQ's Historical TotalView-ITCH data - the historic record of what NASDAQ calls

"[...] the standard NASDAQ data feed for serious traders - [which] displays the full order book depth [...]" and contains "[...] the order and trade transaction data from the TotalView-ITCH data feed."*

NASDAQ's TotalView-ITCH data feed contains so-called event message data, which allows for a very efficient transfer of information. Instead of streaming the state of the entire limit order book after each update, only the information on the limit order event that changes the order book is sent to market participants.
Each limit order submission, cancellation and execution results in an individual event message being streamed to the market participants. The transfer of information in this fashion implies that the state of the limit order book at any time depends on the (daily) history of limit order events or messages that have previously occurred.

Below you find the stylized structure of the message data. More information on NASDAQ's Historical TotalView-ITCH data can be found on NASDAQ's website.

general structure.

Considering the enormous number of messages generated by the market's activity every day, NASDAQ's TotalView-ITCH message system is designed to present information in a parsimonious way to reduce redundancies in the records.

Each message begins with an indicator of the length of the message in bytes, followed by a message type indicator and further event-specific information. Time stamps are recorded in two parts. Time messages give the number of full seconds after midnight, and each individual market event message, such as a limit order submission or cancellation, carries the number of nanoseconds elapsed since the last full second.
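The two-part time stamp can be combined into a clock time with a few lines of Python. This is a minimal sketch; the function name `clock_time` is hypothetical, and the input values are taken from the example messages discussed further down.

```python
def clock_time(seconds_after_midnight: int, nanoseconds: int) -> str:
    """Combine a time message (full seconds) with an event's nanosecond offset."""
    hours, remainder = divmod(seconds_after_midnight, 3600)
    minutes, seconds = divmod(remainder, 60)
    # Nanoseconds are zero-padded to nine digits so that 761000000 ns reads as .761000000
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{nanoseconds:09d}"


print(clock_time(34268, 761000000))  # 09:31:08.761000000
print(clock_time(35403, 782000000))  # 09:50:03.782000000
```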

example messages.

The figure below illustrates the structure of the TotalView-ITCH messages from which LOBSTER reconstructs the limit order book. Please note that the original data is stored in binary files; a 'readable' version is depicted here.

The messages of type 'T' are time messages giving the number of seconds after midnight. The first message in the figure below indicates that the following activities occur '34268' seconds after midnight, i.e. at or after 09:31:08.

The second message is a limit order submission, indicated by the type 'A' (for 'add'). The third element of this message gives the number of nanoseconds since the last full second as recorded in the first depicted message. The exact time stamp of this order submission is hence 09:31:08.761000000. The nanosecond time stamp is followed by a unique order ID, '4172917', which is used to reference the order in case of an execution or cancellation.
The rest of the message gives details regarding the submitted limit order: the market side indicator, 'B', indicating a buy order, the order volume, '1400', the stock ticker, 'MSFT' for Microsoft, and the limit price of '237700', i.e. $23.77, since prices are stored as integers with four implied decimal places.

The message of type 'E', together with the preceding time message, indicates a partial execution of the order with ID '4172917' at 09:31:29.357000000. The transaction of '900' shares is uniquely identifiable by the transaction ID, '13683137'. Note that in the execution message, only the unique order ID, '4172917', is referenced. The details of the execution, such as the price, the ticker and the market side, have to be inferred from the previous submission message.

At '35403' seconds and '782000000' nanoseconds after midnight (or 09:50:03.782000000) the message of type 'D' indicates that the previously submitted and partially executed order with ID '4172917' is deleted and hence removed from the limit order book. Notice that the message is again kept as brief as possible to avoid any redundancies.
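The example messages above can be replayed in a few lines of Python to show how the details missing from execution and deletion messages are recovered via the order ID. This is only a sketch: the tuple layout and the function name `replay` are assumptions made for illustration, the price scale of 10000 is inferred from '237700' corresponding to $23.77, and the time message `34289` is derived from the stated execution time 09:31:29 rather than copied from the figure.

```python
from decimal import Decimal

# Readable stand-ins for the example messages (hypothetical tuple layout).
messages = [
    ("T", 34268),
    ("A", 761000000, 4172917, "B", 1400, "MSFT", 237700),
    ("T", 34289),  # derived from the execution time 09:31:29
    ("E", 357000000, 4172917, 900, 13683137),
    ("T", 35403),
    ("D", 782000000, 4172917),
]


def replay(messages):
    """Return one enriched record per order event, resolving details by order ID."""
    seconds = 0
    orders = {}  # order ID -> details taken from the submission message
    records = []
    for msg in messages:
        kind = msg[0]
        if kind == "T":
            seconds = msg[1]  # all following events fall within this second
            continue
        nanos, order_id = msg[1], msg[2]
        if kind == "A":
            side, size, ticker, price = msg[3:7]
            orders[order_id] = {"side": side, "ticker": ticker,
                                "price": Decimal(price) / 10000, "remaining": size}
            shares = size
        elif kind == "E":
            shares = msg[3]
            orders[order_id]["remaining"] -= shares
        elif kind == "D":
            shares = orders[order_id]["remaining"]  # whatever is left is removed
            orders[order_id]["remaining"] = 0
        else:
            raise ValueError(f"unexpected message type {kind!r}")
        order = orders[order_id]
        records.append((kind, seconds, nanos, order["ticker"], order["side"],
                        order["price"], shares, order["remaining"]))
    return records


for record in replay(messages):
    print(record)
```

The execution record carries the ticker, side and price of the original submission, and the deletion removes the 500 shares that remain after 900 of the 1400 shares were executed.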

reconstruction algorithm.

outline.

The figure below shows a stylized version of the limit order book reconstruction process. Depicted at the top is the stream of messages from NASDAQ's Historical TotalView-ITCH files that have been introduced above. The letters indicate the different types of events: submissions ('A'), executions ('E') and deletions ('D').
The messages are read sequentially and the information is used to update the order pool given on the right of the figure. The order pool contains the collection of currently available limit orders.

The limit order book, shown at the bottom, is updated with each incoming message by aggregating the available limit orders in the order pool at common price levels.

Throughout the day the entire process is repeated in a loop. The three steps depicted in the figure are executed in a clockwise rotation from the first to the last available message to get the evolution of the limit order book.
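The loop described above (read a message, update the order pool, aggregate the pool into the book) might be sketched as follows. This is an illustration under stated assumptions, not LOBSTER's implementation: the event tuples, the function names and the sell-side indicator 'S' are hypothetical, time stamps are omitted, and prices are kept as the raw integers from the messages.

```python
from collections import defaultdict


def apply_event(pool, event):
    """Update the order pool (order ID -> (side, price, size)) with one event."""
    kind, order_id = event[0], event[1]
    if kind == "A":  # submission: add the new order to the pool
        _, _, side, size, price = event
        pool[order_id] = (side, price, size)
    elif kind == "E":  # execution: resize the order, or drop it if fully filled
        side, price, size = pool[order_id]
        remaining = size - event[2]
        if remaining > 0:
            pool[order_id] = (side, price, remaining)
        else:
            del pool[order_id]
    elif kind == "D":  # deletion: remove the order from the pool
        del pool[order_id]


def aggregate(pool):
    """Sum order sizes at common prices; return (bids, asks), best price first."""
    bids, asks = defaultdict(int), defaultdict(int)
    for side, price, size in pool.values():
        (bids if side == "B" else asks)[price] += size
    return sorted(bids.items(), reverse=True), sorted(asks.items())


def reconstruct(events):
    """Yield the full order book after each event, from first to last message."""
    pool = {}
    for event in events:
        apply_event(pool, event)
        yield aggregate(pool)


events = [
    ("A", 1, "B", 100, 237700),
    ("A", 2, "B", 300, 237700),
    ("A", 3, "S", 200, 237900),
    ("E", 1, 60),
    ("D", 2),
]
for book in reconstruct(events):
    print(book)
```

Keeping the pool keyed by order ID makes execution and deletion messages cheap to apply, since they reference nothing but that ID.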

Below are several examples to illustrate the process in more detail. For a more technical description of the LOBSTER platform and its reconstruction algorithm please refer to the technical report.

example messages.

The following three examples of order events highlight the reconstruction mechanism in more detail. Unlike the mechanism depicted above, they additionally consider a user-specific level of detail in the order book output: the output is only updated if a change occurs within the requested levels of the limit order book.

  1. submission of a new sell limit order
    The incoming limit order submission message is read, the relevant information, i.e. time stamp, price, size and order ID, is recorded, and the limit order is placed in the order pool.

    If the order submission causes a change of the order book in the price range corresponding to the requested levels, the change of the order book is recorded in the output file.

  2. limit order deletion
    The event message is read and identified as a deletion. Using the order ID, the order pool is searched and the previously submitted order is removed from the order pool. Assuming a change in the requested range of the book, the updated limit order book is printed to the output file.

  3. limit order execution
    The event message contains the ID of the limit order against which a marketable limit order is executed and the size of the transaction. Using the unique order ID, the order pool is searched and the executed limit order is adjusted accordingly, i.e. it is deleted or resized. The updated limit order book is printed to the output file.


order book reconstruction for different levels.

The order book reconstruction process of reading event messages, pooling orders, etc. is independent of the number of levels requested by the user. In each case the full order book is reconstructed throughout the whole day. The number of levels requested only specifies the range of price levels for which changes are saved as output.

Compare, for example, a level 1 and a level 25 request. In the case of a level 1 request, only 'trades and quotes', i.e. changes to the best bid and ask prices and their respective volumes, are recorded. In the case of a level 25 request, the limit order book saved to the output file is updated every time a price or volume changes in the range from the 25th best bid to the 25th best ask price.

Please note that every request with a number of levels other than 'all levels' results in a series of (irregularly spaced) snapshots of the limit order book. The duration between the snapshots is determined by the time between changes to prices or volumes in the requested price range.
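The effect of the requested number of levels on the output can be illustrated with a small sketch. It assumes that the full book has already been reconstructed after each event and is given as a pair of (price, volume) lists, best price first; the function names and the toy prices are hypothetical.

```python
def visible(book, levels):
    """Truncate a full (bids, asks) book to the requested number of levels."""
    bids, asks = book
    return bids[:levels], asks[:levels]


def saved_snapshots(books, levels):
    """Keep only the states whose requested levels differ from the last saved one."""
    output, last = [], None
    for index, book in enumerate(books):
        view = visible(book, levels)
        if view != last:
            output.append((index, view))
            last = view
    return output


# Full book after each of four events (toy prices, best price first).
books = [
    ([(100, 5), (99, 7)], [(101, 4), (102, 6)]),            # initial state
    ([(100, 5), (99, 7)], [(101, 4), (102, 6), (103, 2)]),  # new sell order at 103
    ([(100, 5)], [(101, 4), (102, 6), (103, 2)]),           # bid at 99 deleted
    ([(100, 5)], [(101, 1), (102, 6), (103, 2)]),           # execution at best ask
]

for levels in (1, 2, 3):
    indices = [index for index, _ in saved_snapshots(books, levels)]
    print(levels, indices)
```

With one level requested, only the initial state and the execution at the best ask are saved; with three levels, every event produces a snapshot. The same full book underlies all three requests, and only the filter applied to the output differs.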