academic
data.


faq.

Below you find a list of frequently asked questions organized by topic. Let us know if you have a question that is not on the list or if the answers given are not satisfactory.

If you are looking for a certain keyword, just use the good old 'ctrl + f'.

topics.

joining LOBSTER.

  1. who can join LOBSTER?

    Currently LOBSTER offers limit order book data derived from NASDAQ's Historical TotalView-ITCH files for academic research only. All academic institutions that are identified by local law as institutions of higher learning are eligible to join LOBSTER under NASDAQ's academic waiver program.

    You are at an eligible institution? This way to the data ...

  2. why the waiver?

    Operating under NASDAQ OMX's academic waiver allows LOBSTER to offer the limit order book data to academic researchers without charging license fees that would otherwise be required.

    Obtaining the waiver is very easy. The academic research entity is just required to submit a formal application to NASDAQ OMX in which it outlines the research project and commits to handling the data with care. We, of course, provide you with a 'how to guide' outlining the details of the application process.

  3. is my NASDAQ waiver bound to LOBSTER?

    No. The academic waiver is not limited to LOBSTER.

  4. I am at Humboldt Universität zu Berlin. How do I get access?

    If you are an employee or student of Humboldt Universität zu Berlin and want access to LOBSTER, please contact us at lobster.wiwi@hu-berlin.de. We will provide you with details on how to join.

data and data processing.

  1. how can I open .7z files?

    The output files provided by LOBSTER are compressed using the open source tool 7-zip. The .7z format typically achieves better compression than standard .zip files and thus reduces download times. Versions for Windows, Linux and Mac are available.

  2. how can I import LOBSTER's output into my computation software?

    LOBSTER's output is provided in the CSV (comma-separated values) format. CSV files can be opened by most editors, e.g. Wordpad in Windows or Microsoft Excel, and can easily be read by numerous statistics and econometrics software packages. Please refer to the manual of your favored software for details.

    In the code help you can find simple scripts to help you get started.

  3. how to load and process huge files?

    The 10 level order book of an actively traded ticker can easily be larger than 5 GB. For these large files or files that are simply too large for your local memory, we suggest a sequential processing of the data. Read in a small block of the data, process it and then discard it before loading the next.

    Let us know if you have very efficient code for sequential processing of the data and would like to share it with the other users.
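
    The block-wise approach described above could look roughly like the following Python sketch. It assumes a CSV file without a header row; the function names, the default block size and the row-counting example are illustrative assumptions, not part of LOBSTER.

```python
import csv
from itertools import islice


def iter_blocks(path, block_size=100_000):
    """Yield lists of at most block_size CSV rows, one block at a time."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        while True:
            # Read the next block; only this block is held in memory.
            block = list(islice(reader, block_size))
            if not block:
                break
            yield block


def count_rows(path, block_size=100_000):
    """Example of per-block processing: count rows without loading the whole file."""
    total = 0
    for block in iter_blocks(path, block_size):
        total += len(block)  # process the block; it is discarded on the next iteration
    return total
```

    Replace the body of the loop in count_rows with your own per-block computation; the block size should be chosen to fit comfortably into your local memory.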

  4. for which period is data available?

    The period for which limit order books are available at the start of LOBSTER covers all trading days on NASDAQ from June 27th, 2007 to the day before yesterday.

    We are, of course, committed to increasing the available period further into the past.

  5. are (reverse) stock splits accounted for in the data?

    No. Information on (reverse) stock splits is not included in the NASDAQ's TotalView-ITCH data. The limit order books provided are based on 'historical recordings' of the exchange's activity. There is no retroactive altering of the data to account for future developments such as stock splits.

    Should you find that the stock price in your data jumps substantially overnight, you are encouraged to check for a potential (reverse) stock split or similar event and adjust the data accordingly.
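
    A simple screen for such overnight jumps might look like the Python sketch below. It assumes you have already extracted the last price of each day and the first price of the following day; the function name and the 30% threshold are arbitrary, illustrative choices. A flagged day is only a candidate and still has to be checked against actual corporate action records.

```python
def flag_overnight_jumps(closes, opens, threshold=0.3):
    """Return indices i where opens[i] deviates from closes[i] by more than threshold.

    closes[i] is the last price of day i, opens[i] the first price of day i + 1.
    The deviation is measured relative to closes[i].
    """
    flagged = []
    for i, (close, open_) in enumerate(zip(closes, opens)):
        if close <= 0:
            continue  # skip invalid prices instead of dividing by zero
        if abs(open_ / close - 1.0) > threshold:
            flagged.append(i)
    return flagged
```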

  6. what is the difference between trades and limit order executions?

    The limit order books reconstructed by LOBSTER contain records of individual limit order executions. It is important to make the distinction between these limit order executions and what would be considered 'trades' from an economic perspective. A limit order execution is uniquely identified in the message data and refers to one limit order being (partially) executed. A 'trade', caused, for example, by the submission of a marketable limit order, can refer to several limit order executions.

    In the literature, researchers interested in 'economic trades' typically rely on (time-based) rules to aggregate limit order executions into 'trades'.
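
    One possible time-based rule is sketched below in Python: consecutive executions with the same direction whose time stamps differ by at most max_gap seconds are merged into one 'trade'. This rule, the tuple layout (time, size, price, direction) and the function name are assumptions made for illustration; they are neither LOBSTER's output format nor a recommended rule.

```python
def aggregate_executions(executions, max_gap=0.0):
    """Merge consecutive limit order executions into trades.

    executions: list of (time, size, price, direction) tuples sorted by time,
    with positive sizes. Two consecutive executions belong to the same trade
    if they share the direction and their time stamps differ by at most
    max_gap seconds.
    """
    trades = []
    last_time = None
    for time, size, price, direction in executions:
        current = trades[-1] if trades else None
        if (current is not None
                and current["direction"] == direction
                and time - last_time <= max_gap):
            # Same trade: accumulate size and traded value.
            current["size"] += size
            current["value"] += size * price
            current["n_executions"] += 1
        else:
            # Start a new trade.
            trades.append({"time": time, "size": size, "value": size * price,
                           "direction": direction, "n_executions": 1})
        last_time = time
    for trade in trades:
        # Replace the traded value by the volume-weighted average price.
        trade["vwap"] = trade.pop("value") / trade["size"]
    return trades
```

    With max_gap=0.0 only executions carrying identical time stamps are merged; larger values merge executions that arrive in quick succession.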

  7. how does LOBSTER handle the intra-day trading halt?

    LOBSTER includes the trading halt information that indicates the exact time of the trading halt as well as the resumption of quoting and trading. With this information researchers can exactly identify non-trading periods and do not have to rely on arbitrary rules, such as '5 min of no activity defines a trading halt', to clean their data.

  8. are there observations outside the regular trading hours?

    Yes. Between 07:00 and 09:30 traders on NASDAQ are allowed to submit limit orders, which are collected and form the basis of the order book at opening. Further, on some days the exchange message data contains executions that are time stamped after 16:00. (All times in EST.)

    Given LOBSTER's current focus on providing an efficient environment for a smooth analysis of the limit order book dynamics during regular trading hours, the limit order book output does not include observations before 09:30 and after 16:00.

    In the future LOBSTER may offer the option to include information on events occurring before and/or after regular trading hours in the limit order book output.

  9. are there irregularities in the data?

    The limit order books reconstructed by LOBSTER are true to the raw data provided in NASDAQ's Historical TotalView-ITCH files. We do not clean the data, add any information or remove erroneous order book states. This implies that all irregularities that occur on NASDAQ's trading system and are recorded in the Historical TotalView-ITCH files are also in the order books provided by LOBSTER. Further, changes made at a later point in time, such as cancellations of trades, are not considered and not retroactively accounted for in the provided order books.

    The provided data is the data that was presented to traders in real time, i.e. LOBSTER presents the order book on which traders based their decisions.

    A well documented example of an irregularity in NASDAQ's TotalView limit order book is the IPO of Facebook (FB) on May 18th, 2012. Technical problems caused NASDAQ's order book (TotalView) to be crossed, i.e. there were bid limit orders with prices higher than the lowest ask price. This is an obvious error in the data, as such orders should immediately trigger executions.

    It is the user's responsibility to check the data for obvious irregularities before processing the data any further.
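
    A basic check for crossed order book states, as in the Facebook example, might be sketched in Python as follows. It assumes that you have already extracted the best bid and best ask price of each order book state into two sequences of equal length; the function name is hypothetical.

```python
def find_crossed_states(best_bids, best_asks):
    """Return the indices of states in which the best bid exceeds the best ask."""
    return [
        i
        for i, (bid, ask) in enumerate(zip(best_bids, best_asks))
        if bid > ask
    ]
```

    States with equal best bid and best ask prices are not flagged by this sketch; change the comparison to >= if you want to treat them as irregular as well.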

  10. what information on hidden order executions is provided?

    LOBSTER includes hidden order executions in the message file, with a duplication of the previous limit order book state recorded in the order book file. The execution message includes the exact time stamp, the size, direction and price of the executed order. Further, the message includes an order ID field. According to NASDAQ's data specification, before December 6th, 2010 NASDAQ's Historical TotalView-ITCH contained hidden order execution messages which included a unique order ID. In all files supplied after December 6th, 2010 the field has been set to '0'.

    Should NASDAQ's Historical TotalView-ITCH files contain a unique order ID in the hidden order execution message, the order ID is included in LOBSTER's output, otherwise the field is set to '0'.

  11. what is NASDAQ's Historical TotalView-ITCH?

    The limit order books reconstructed by LOBSTER are based on NASDAQ's Historical TotalView-ITCH data - the historic record of what NASDAQ calls

    "[...] the standard NASDAQ data feed for serious traders - [which] displays the full order book depth [...]" and contains "[...] the order and trade transaction data from the TotalView-ITCH data feed."*

    To allow a very efficient transfer of information the data feed is in binary format and avoids any redundancies in the records.

    You can find more information on NASDAQ's Historical TotalView-ITCH on nasdaqtrader.com.
    An outline of the message data and how LOBSTER reconstructs limit order books from the data is given here.

data requesting and management.

  1. how to request data?

    Requests are entered into an intuitive interface, which is depicted below. Entering the ticker, start and end date as well as the number of levels is all there is. Setting up a request takes less than 5 seconds.

    As illustrated in the image, several requests can be entered simultaneously by adding an additional row to the request.

  2. can I submit individual requests for more than one year of limit order book?

    No. We have set certain limits on the size of individual requests to keep the download sizes manageable and guarantee that you get access to (parts of) your requested data as soon as possible.

    By splitting huge requests into several small ones, you benefit from LOBSTER's multiprocess system. Instead of one thread working on the request, several threads work on different parts of the request in parallel which reduces the reconstruction time significantly.

    Assume, for example, someone wants to request 10 level limit order book data for Apple (AAPL) from June 2011 to December 2012. Requesting the data in one job would result in a huge output file, and you would only have access to the data after the last day in the sample has been reconstructed.

    The user is hence asked to split the request into several smaller requests, for example into one-month blocks. These smaller requests can all be entered into the request interface at the same time, and as soon as the individual parts are finished you will be able to download the data.

    The exact limits depend on the number of levels requested and can be found in the request interface. Again, it is not about restricting access to the data, but about keeping the output files manageable and giving you access to the data as soon as possible.
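
    The start and end dates of such one-month blocks can be generated with a few lines of Python. This is only a sketch for preparing the rows to enter into the request interface; the function name is hypothetical and the code does not interact with LOBSTER in any way.

```python
from datetime import date, timedelta


def monthly_blocks(start, end):
    """Split the inclusive date range [start, end] into calendar-month blocks."""
    blocks = []
    current = start
    while current <= end:
        # First day of the month following the current block.
        if current.month == 12:
            next_month = date(current.year + 1, 1, 1)
        else:
            next_month = date(current.year, current.month + 1, 1)
        # A block ends on the last day of its month or on the overall end date.
        block_end = min(next_month - timedelta(days=1), end)
        blocks.append((current, block_end))
        current = next_month
    return blocks
```

    For the AAPL example above, monthly_blocks(date(2011, 6, 1), date(2012, 12, 31)) yields 19 blocks, one for each month from June 2011 to December 2012.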

  3. where do I find the ticker for company XYZ?

    You want to know the ticker for Apple Inc. or Google Inc.? Please refer to nasdaq.com for a symbol lookup tool.

    Note that tickers may change over time and NASDAQ does not record this information in historical TotalView-ITCH data. It is hence the user's responsibility to check for such changes and adjust their requests accordingly.

    A history of US symbol changes can be found on nasdaq.com.

  4. which options are included in the data management?

    The LOBSTER data management section includes the sections 'my data' and 'archive'.

    The 'my data' section, depicted in the figure below, gives an overview of the user's latest request activity. At the top, the current progress of the latest requests is displayed. The middle section shows the 10 latest requests available for download, and an overview of the storage space occupied by the user's data is provided at the bottom.

    The 'archive', shown in the image below, lists the user's files available for download under his or her account and allows the removal of data from LOBSTER's storage. Should you, for example, have reached your storage limit the 'archive' allows the deletion of data with just a few clicks to allow new requests to be submitted under your account.

reconstruction details.

  1. details on the reconstruction mechanism?

    Should you have more detailed questions on the limit order book reconstruction mechanism, please refer to the reconstruction details.

other.

  1. how to reference LOBSTER?

    Please use the following phrase to reference LOBSTER in your research paper:

    LOBSTER: Limit Order Book System - The Efficient Reconstructor at Humboldt Universität zu Berlin, Germany. http://LOBSTER.wiwi.hu-berlin.de

  2. can I get my paper posted on the website?

    Yes. To strengthen LOBSTER's community we have posted a small list of papers which have utilized LOBSTER's data. Should you be interested in getting on the list, send us an email that includes a short abstract and a link to your SSRN (or equivalent) page, with the paper (PDF) attached.

  3. technical details?

    Should you be interested in more technical details of the LOBSTER data platform, please refer to the technical report.

further questions.

Let us know if you have a question that is not on the list.