Walter's DConf 2014 Talks - Topics in Finance
Laeeth Isharc via Digitalmars-d
digitalmars-d at puremagic.com
Mon Dec 22 19:07:09 PST 2014
Hi.
Sorry if this is a bit long, but perhaps it may be interesting to
one or two.
On Monday, 22 December 2014 at 22:00:36 UTC, Daniel Davidson
wrote:
> On Monday, 22 December 2014 at 19:25:51 UTC, aldanor wrote:
>> On Monday, 22 December 2014 at 17:28:39 UTC, Daniel Davidson
>> wrote:
>> I don't see D attempting to tackle that at this point.
>>> If the bulk of the work for the "data sciences" piece is the
>>> maths, which I believe it is, then the attraction of D as a
>>> "data sciences" platform is muted. If the bulk of the work is
>>> preprocessing data to get to an all numbers world, then in
>>> that space D might shine.
>> That is one of my points exactly -- the "bulk of the work", as
>> you put it, is quite often the data processing/preprocessing
>> pipeline (all the way from raw data parsing, aggregation,
>> validation and storage to data retrieval, feature extraction,
>> and then serialization, various persistency models, etc).
>
> I don't know about low frequency which is why I asked about
> Winton. Some of this is true in HFT but it is tough to break
> that pipeline that exists in C++. Take live trading vs
> backtesting: you require all that data processing before
> getting to the math of it to be as low latency as possible for
> live trading which is why you use C++ in the first place. To
> break into that pipeline with another language like D to add
> value, say for backtesting, is risky not just because the
> duplication of development cost but also the risk of live not
> matching backtesting.
>
> Maybe you have some ideas in mind where D would help that data
> processing pipeline, so some specifics might help?
I have been working as a PM for quantish buy side places since
98, after starting in a quant trading role on sell side in 96,
with my first research summer job in 93. Over time I have become
less quant and more discretionary, so I am less in touch with the
techniques the cool kids are using when it doesn't relate to what
I do. But more generally there is a kind of silo mentality where
in a big firm people in different groups don't know much about
what the guy sitting at the next bank of desks might be doing,
and even within groups the free flow of ideas might be a lot less
than you might think.
Against that, firms with a pure research orientation may be a
touch different, which just goes to show again that from the
outside it may be difficult to make useful generalisations.
A friend of mine who wrote certain parts of the networking stack
in Linux is interviewing with HFT firms now, so I may soon have a
better idea about whether D might be of interest. He has heard
of D but suggests Java instead (as a general option, not for
HFT). Even smart people can fail to appreciate beauty ;)
I think it's public that GS use a Python-like language internally,
JPM do use Python for what you would expect, and so do AHL (one
of the largest lower-frequency quant firms). More generally, in
every field, but especially in finance, it seems like the data
processing aspect is going to be key - not just a necessary evil.
Yes, once you have it up and running you can tick it off, but it
is going to be some years before you start to tick off items
faster than they appear. Look at what Bridgewater are doing with
gauging real-time economic activity (and look at Google Flu
prediction if one starts to get too giddy - it worked, and then
it didn't).
There is a spectrum of different qualities of data. What is
most objective is not necessarily what is most interesting. Yet
work on affect, media, and sentiment analysis is in its very
early stages. One can do much better than just "affect bad; buy
stocks once they stop going down"... Someone who asked me to
help with something is close to Twitter, and I have heard the
number of firms, and the rough breakdown by sector, taking their
full feed. It is shockingly small in the financial services
field, and that's probably in part just that it takes people
time to figure out something new.
Ravenpack do interesting work from a practitioner's point of
view. I heard a talk by their former technical architect, and he
really seemed to know his stuff. Not sure what they use as a
platform.
I can't see why the choice of language will affect your
backtesting results (except that it is painful to write good
algorithms in a clunky language, and the risk of bugs is higher -
but that isn't what you meant).
Anyway, back to D and finance. I think this mental image people
have of backtesting as being the originating driver of research
may be mistaken. It's funny, but sometimes it seems the moment
you take a scientist out of his lab and put him on a trading
floor he wants to know if such and such beats transaction costs.
But what you are trying to do is understand certain dynamics,
and one needs to understand that markets are non-linear and have
highly unstable parameters. So one must be careful about jumping
straight to a backtest. (And then, of course, questions of risk
management and transaction costs really matter also.)
To a certain extent one must recognise that the asset management
business has a funny nature. (This does not apply to many HFT
firms that manage partners' money.) It doesn't take an army to
make a lot of money with good people, because of the intrinsic
intellectual leverage of the business. But to do that one needs
capital, and investors expect to see something tangible for the
fees if you are managing size. Warren Buffett gets away with
having a tiny organisation because he is Buffett, but that may be
harder for a quant firm. So since intelligent enough people are
cheap, and investors want you to hire people, it can be tempting
to hire that army after all and set them to work on projects that
certainly cover their costs but really may not be big
determinants of variations in investment outcomes. I.e. one
shouldn't mistake the number of projects for what is truly
important.
I agree that it is setting up and keeping everything in
production running smoothly that creates a challenge. So it's
not just a question of doing a few studies in R. And the more
ways of looking at the world, the harder you have to think about
how to combine them. Spreadsheets don't cut the mustard anymore -
they haven't for years - yet it emerged even recently with the
JPM whale that a lack of integrity in the spreadsheets worsened
communication problems between departments (risk especially).
Maybe PyPy and numpy will pick up all of the slack, but I am not
so sure.
In spreadsheet world (where one is a user, not a pro), one never
finishes and says "finally, I am done building sheets". One
question leads to another in the face of an unfolding and
generative reality. It's the same with quant tools for trading.
Perhaps that means there is value in tooling suited to rapid
iteration and the building of robust code that won't need to be
totally rewritten from scratch later.
At one very big US hedge fund I worked with, the tools were
initially written in Perl (some years back). They weren't
pretty, but they worked, and were fast and robust enough. I had
many new features I needed for my trading strategy. But the
owner - who liked to read about ideas on the internet - came to
the conclusion that Perl was not institutional quality and that
we should therefore cease new development and rewrite everything
in C++. Two years later a new guy took over the larger group,
and one way or the other everyone left. I never got my new
tools, and that certainly didn't help on the investment front. A
year after he left, they scrapped the entire code base and
bought Murex, as nobody could understand what they had.
If we had had D then, its possible the outcome might have been
different.
So in any case, it is hard to generalise, and better to pick a
few sympathetic people that see in D a possible solution to
their pain; use patterns will emerge organically out of that. I
am happy to help where I can, and that is somewhat my own
perspective - maybe D can help me solve my pain of tools not
being up to scratch, because good investment tool design
requires investment and technology skills to be combined in one
person, whereas each of these two is rarely found on its own.
(D makes a vast project closer to brave than foolhardy.)
It would certainly be nice to have matrices, but I also don't
think it would be right to say D is dead in the water here
because it is so far behind. It also seems like the cost of
writing such a library is very small versus the possible benefit.
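To illustrate how small the core of such a thing really is: a
naive dense matrix multiply is only a few lines. This is a
sketch in Python for brevity (a D version would look much the
same, with templates and ranges); a real library would of course
add storage layouts, slicing, and BLAS bindings on top.

```python
def matmul(a, b):
    """Naive dense matrix multiply.

    a is an m x n matrix, b is an n x p matrix, both given as
    lists of rows; the result is the m x p product.
    """
    n = len(b)          # inner dimension
    p = len(b[0])       # columns of the result
    assert all(len(row) == n for row in a), "inner dimensions must match"
    # result[i][j] is the dot product of row i of a with column j of b
    return [[sum(a[i][k] * b[k][j] for k in range(n))
             for j in range(p)]
            for i in range(len(a))]

print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# -> [[19, 22], [43, 50]]
```

The hard part of a matrix library is never this kernel; it is
the API design and the performance work, which is where D's
templates could plausibly earn their keep.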
One final thought: it's very hard to hire good young people. We
had 1500 CVs for one job, with very impressive backgrounds -
French grandes écoles, and the like. But ask a chap how he would
sort a list of books without a library, and the results were
shocking. It seems like looking amongst D programmers is a nice
heuristic, although perhaps the pool is too small for now. Not
hiring now, but I was thinking about it for the future.
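For what it's worth, the kind of answer one would hope for is
nothing exotic - just an elementary sort written by hand. A
minimal sketch (in Python here purely for illustration; any
language would do, and the book titles are made up):

```python
def sort_books(titles):
    """Insertion sort by title - no library sort calls."""
    result = list(titles)  # don't mutate the caller's list
    for i in range(1, len(result)):
        key = result[i]
        j = i - 1
        # shift larger titles one slot right until key fits
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = key
    return result

print(sort_books(["Ulysses", "Dune", "Emma"]))
# -> ['Dune', 'Emma', 'Ulysses']
```

Nothing clever; the shock was that candidates with impressive
CVs could not produce even this much at a whiteboard.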