The Growth of Big Data

Much has been said about “big data” in recent years, but conversations often turn murky and unproductive because people assume it is all about math and analysis. It isn’t. It is about politics, the economy, people and technology (and yes, a little math and analysis). This is particularly important to understand where the rapidly growing quantities and varieties of “social data” are involved, since it helps put the relevance of the data in context – context that can be especially valuable in the investment arena.

Data related to retail trading – whether it takes the form of order flows, personal finance blogs, Facebook “likes” or tweets – has grown exponentially over the past two decades. Regulatory revisions and industry innovations lowered the cost of trading. Workplace changes and demographic shifts increased demand. Many formerly private topics, such as personal finance, have become comfortably public. And technological advances massively increased the ability to capture, store and track all of this opinion and activity.

While some data can be validated (for example, E*Trade’s reporting of which stocks customers are viewing most), much of it cannot (such as the fake AP tweet that caused a mini “flash crash”). Knowing how to acquire, clean and check the data is essential. Once that process is in place, sources that have the potential to be unreliable can turn out to be helpful; for example, tweet volume for each stock ticker correlates strongly with actual dollar trading volume.

When data is cleaned and filed, it becomes information.  That can, in turn, generate ideas and lead to hypotheses for testing, which can lead to knowledge. But how is knowledge then best collected into wisdom?  Many models have emerged over time: consultation, collaboration, cooperation, conglomeration.  Given the strengths of each approach and looking at the current environment, we believe the most effective collective knowledge will arise from a new model: coordination.

There are hundreds of potential data sources with varying degrees of relevance to financial markets, and more are born (or buried) every day. While black-box, quant-heavy firms may seem better equipped to make use of it all, equally large gains are available to firms that take a less quantitative approach, as long as they are familiar with the data, understand their own theories and know what is required to constantly validate those theories.

This material does not constitute an offer or solicitation to purchase an interest in securities. Such an offer will only be made by means of a confidential offering memorandum and only in those jurisdictions where permitted by law. An investment with Subset Capital (the “Adviser”) is speculative and is subject to a risk of loss, including a risk of loss of principal. No assurance can be given that the Adviser will achieve its objective or that an investor will receive a return of all or part of its investment.

This presentation is confidential and may not be distributed or reproduced in whole or in part without the express written consent of the Adviser.