Jane Street and the OCaml Compiler (2018) [video] (www.janestreet.com)
172 points by chmaynard 15 days ago | 114 comments





Seems to me that firms like Two Sigma and Jane Street cultivate a sense of "mystique" about how elite their developers are. But once you get inside, most people are writing run-of-the-mill software. Google has the same tactic, even paying below market rates to people willing to work there for the brand. Jane Street does pay very well, but still, most of their hiring is based on prestigious colleges, less on people who have written brilliant software. Not that they aren't smart. I've just seen enough marketing, and met too many people working at these firms, to give them much credit from an "elite software" perspective.

With so much money on the line, hedge funds and other trading firms focus a lot of effort on marketing and building up a perception that they truly are something special (at least those that take outside client money). Internally, things could be rather mundane, and even downright awful (from a technology perspective). Not commenting on Jane Street in particular, just the industry in general.

Source: I've worked at a hedge fund and I've seen how the sausage is truly made. Many friends have told me similar stories about other firms.


Also having worked for a hedge fund: they're generally going to do their best to get you at the lowest salary possible. The overall benefits might be good, but they're going to do their best to lowball on the salary. "You'll make it up at bonus time!" Bullshit. You have to negotiate with hedge funds and be willing to walk away from a shit offer. I started at one a year out of college and made 3x what I did as an intern. I was happy, at first. You can budget around a salary for rent, utilities, etc. You can't budget hoping for a bonus to make ends meet.

Took me about 3 years to realize I was being paid half of what my coworkers were making for doing the same or lesser jobs than mine. Took me resigning, with an offer in hand for twice what I was making, before they upped their offer. I stayed another 3 years before I got so fed up with the bullshit that I actually left for a lower-paying job (also less stress).


There is another, more sinister reason for a low base: typically, if you have a non-compete and are forced into a one-year garden leave, they can get away with paying you just the low base.

Yeah, I worked in the industry as well. There's a story an older buddy used to tell from when he worked at PIMCO in the 90s. A bit different from a quant fund, but it still applies. They hired a few Harvard economics PhDs who, when looking at the firm's macroeconomics material for institutional investors, claimed it was all wrong / didn't make sense. Bill Gross, the fund manager, told them it was to "get the money in the door from the fucking clients, we make gut bets on the market". Then he fired the guy.

>Harvard Economics PhDs

tbf, academia is its own racket.


Yeah absolutely, the value of the Harvard PhD in this story is the name and nothing else.

FWIW, Jane Street is not a hedge fund, nor does it take outside client money.

(Edit: I work at Jane Street.)


How would you best describe them?

It's a prop trading firm. It's pretty important to distinguish between hedge funds that take public money (which are generally probably low quality, as the OP described) and prop trading firms that only or primarily trade their own money (Two Sigma, Renaissance, Jane Street, Jump Trading, etc.). The latter deserve their mystique, because you only trade your own money if you are actually beating the market.

Two Sigma and Renaissance (albeit not Medallion) primarily invest outside capital.

Can you comment on their relevance to society? Do these companies generate a net positive, or do they only extract value?

I.e. should I look for a job there only because of the money, or also because I can make a difference somehow in the bigger societal picture?


They trade their own money instead of OPM (other people's money).

>Internally, things could be rather mundane

Elite developers often make things mundane. No fires, no explosions, etc.


Is the sausage made in VBA?

In some shops, yes.

The people who work at Jane Street are generally super talented, but they also mostly work on generic, not-especially-interesting software problems.

Source: Former Jane Street employee; I also worked at two FAANGs before that, so I think I have some basis for comparison.


Not JS, but I've worked for a couple of competitors and can confirm. There were a few truly brilliant individuals, but most of the staff was unexceptional, and the general level of the tech varied between standard corporate and outright appalling. It turns out that if you have enough money, you can simply keep throwing programmers into the firefight until something sort of works. They're truly too rich to fail.

But, not an enjoyable ride if you actually love quality tech.

(But, money! So much money!!)


5 years ago, I had an interview with Jane Street. At that time I was interested in functional programming, but not in OCaml. I think I had a fair chance, but I bombed a technical interview. I don't have a "prestigious college" background, and at the time I only had about a year of PHP experience. The interview format was also quite good.

I am not sure the software they are writing is "elite", but OCaml raises the bar a bit.


> "Google has the same tactic, even paying below market rates to people willing to work their for the brand."

I've never heard anyone say that Google pays below market. Perhaps there are those who pay more, but it's definitely above "the market".


Below market rate compared to other Big Ns. But Google is willing to play ball if you have other offers or a good recruiter.

This kind of elite mystique was also invoked in many posts in the "Who's hiring" thread a few days ago. It's important not to be too discouraged by all of this when looking for a job. Yes, there are really good programmers, but how many can there be, realistically?

I've worked for Google and been rejected by Jane St, and came away with a much better impression of Jane St. Google has some bright stars, but Jane St has a higher proportion of them.

I asked this question at my Jane Street interview but never got a real answer: how does a market-making latency-arb firm handle the GC pauses that are inherent to any GC language? OCaml is better in this regard than, say, the JVM, but it still seems problematic. I know they wrote some FPGA compiler stuff with OCaml, but I can't imagine that all of their execution is running through FPGAs. Even if it's possible, it seems like an uphill battle; something like Rust would probably be easier to deal with (at least for their realtime code).

From what I've heard, Jane Street's bread and butter isn't latency-sensitive market making on public equities, but creation/redemption arbitrage on ETFs, especially ones that hold a lot of relatively illiquid stuff, like various fixed-income ETFs. Creating and redeeming those ETFs might involve pricing an emerging-market sovereign or corporate bond that might have traded only a few times in the last year. So their expertise is more around smartly trading really weird stuff, as opposed to the pure speed players like Jump, Citadel, or Virtu Financial.

That said, JS is probably in a lot of different products, some more latency-sensitive than others, but speed isn't what they are known for.


Things to consider:

The company is >20 years old.

OCaml is a general-purpose language that strikes a balance between avoiding the bugs introduced by mutable state (as with all functional languages), speed, and polymorphic type inference [1][2]. At the time of adoption, the other choices were Haskell (too academic, not practical), Erlang (no type inference, not suited to large code bases with complex business logic), and Lisp (too slow, with a loose/optional type system). The last time I checked, OCaml was third only to C and C++ in terms of speed. It is also important to consider how intellectually stimulating it is to write OCaml. If you can achieve the three things mentioned at the top of this paragraph while also creating a brand of gravitas and intellect that attracts top-tier talent, of course you would choose OCaml.
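As a toy illustration of the inference point (the types and names are made up, not Jane Street code): every type below is inferred by the compiler, and the match is checked for exhaustiveness, which is a large part of how OCaml heads off whole classes of state-related bugs.

    (* Hypothetical order type; the compiler infers
       notional : order -> float with no annotations. *)
    type side = Buy | Sell
    type order = { side : side; qty : int; price : float }

    let notional o =
      let signed =
        match o.side with          (* exhaustiveness checked *)
        | Buy -> float_of_int o.qty
        | Sell -> -. float_of_int o.qty
      in
      signed *. o.price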

Would a new, uninitiated market maker write something in OCaml? Unlikely; they would probably use C++, Rust, or Scala with a 1 TB heap and GC disabled. Ignoring the learning curve and the time/dollar constraints of starting a hedge fund, I would choose OCaml over the three mentioned.

[1] https://en.m.wikipedia.org/wiki/Hindley%E2%80%93Milner_type_...

[2] https://courses.cs.washington.edu/courses/cse341/04wi/lectur...


Yaron Minsky (speaker in the talk, and guy who introduced OCaml to Jane Street) wrote in detail about how the firm decided on this in ACM Queue: https://queue.acm.org/detail.cfm?id=2038036

At the time of the OCaml choice at JS (I'd guess circa 2008-2010; I also strongly suspect it was not at the start of JS), Haskell already had a multithreaded RTS with very lightweight green threads, software transactional memory, and multicore support.

[1] https://downloads.haskell.org/~ghc/6.6.1/docs/html/libraries... (Control.Concurrent contains forkIO - thus, green threads!)

Guessing from the information available to me, it was a matter of personal preference, not a technical decision.

PS: From [2], also linked in another comment, it really was a matter of personal choice.

[2] https://queue.acm.org/detail.cfm?id=2038036

From [3], you can see that GHC already had green threads in 2002.

[3] https://downloads.haskell.org/~ghc/5.04/docs/html/base/index...


> At the time of OCaml choice at JS (I guess it was circa 2008-2010; I also strongly suspect that it was not at the start of JS)

You're right that it wasn't OCaml from the beginning, but I believe it was quite a bit earlier than that. The firm dates to 1999-2000, and OCaml came into the picture sometime around 2004.

The reason they avoided Haskell, supposedly, is the lack of predictability in its performance, largely due to its laziness. OCaml, meanwhile, has a fairly straightforward compilation model that lets moderately experienced developers have a very good idea of what the corresponding machine code will be.
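A toy example of that predictability (my own sketch, not from the talk): allocation in OCaml is essentially syntactic, so you can read it off the source.

    (* No heap allocation: a tail-recursive loop over immediate ints. *)
    let rec sum_to acc n = if n = 0 then acc else sum_to (acc + n) (n - 1)

    (* Allocates: each tuple, each list cell, and the closure are
       heap blocks, and you can see exactly where they appear. *)
    let pair_with_index xs = List.mapi (fun i x -> (i, x)) xs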


Why C++ now but not then, 20+ years ago? And would F# be suitable for the task? I'm genuinely interested.

Presumably Jane Street didn’t choose C++ because they wanted to reduce bugs introduced by mutable state, the killer of prop shops. F# was developed six years after their founding, hence too young, and, more importantly, it’s a Microsoft-owned clone of OCaml; I don’t think it even ran on Linux before 2015. Today, F# might just suit the job, assuming you are open to being locked into the .NET family. An interesting idea, to say the least.

Why C++ now? It’s still the fastest, and tons of quants and highly skilled programmers know it. When you consider the correlation between C++ developers’ technical acumen and quantitative skills, coupled with the maturity and increasing convenience of the ecosystem, it makes sense.


> assuming you are open to being locked into the .NET family

Which is no different from being locked into, e.g., the JVM family, or even being locked into OCaml itself.


.NET (redone as Core) was a much worse choice 20 years ago than it is now, of course. If they fully opened up the debugger, I would prefer it over the JVM. OCaml is more open than either (for .NET Core there is still no good open debugger, and the JVM suffers from Oracle keeping performance enhancements closed which, in my experience, do make a big difference), but not many people like programming in it, so it is hard to find people.

Everything in the C++ ecosystem is better now than it was 20 years ago: the language, the tools, the libraries, the build systems. It's actually fairly pleasant to work in these days.

Not really; everything related to app development is just gone, dead, with Qt and wxWidgets the remaining survivors.

VCL is only available to corporate shops and those that aren't into FOSS religion.

MFC is in maintenance mode, and so far Windows developers are more keen on moving to one of the .NET UI stacks, while keeping some C++ code as COM/DLLs or even C++/CLI, than on jumping into UWP/WinUI. It remains to be seen whether WinUI 3.0 will change the migration trend.

Then on mobile OSes it isn't even an option, unless you want to write your own GUI from scratch using OpenGL/Metal/Vulkan.


I doubt HFT firms are writing apps in C++.

Everything in almost every ecosystem is better than it was 20 years ago. Even Common Lisp is evolving... even C++ almost, aaaaalmost has a package manager :-)

G-Research in London is using F#, I believe.

I think you are ascribing too much thought to Jane Street's decision to use OCaml. I think it was a good choice, on balance, but from what I heard/read when I worked there it was mostly just circumstantial.

There are many prop shops using plain old Java with zero-GC code.

Having worked in HFT and at funds: there's a bit more to the taxonomy.

- Trading based purely on current prices. If you're doing pure arbitrage, e.g. looking for the ask to go under the bid (elsewhere), you are going to have to be really, really fast, because that kind of thing is so obvious that everyone has thought of it.

- One step from that is passively leaving an order in and hoping for the same thing. E.g. you leave a bid in a less liquid market, below the bid in the main market, and hope someone hits you. You then immediately throw that onto the main market. You've got to be super fast to do that, because of course everyone else can see someone traded.

- Market making based on some form of ML on the order book, basically imbalance. Here you need to do more than just compare two prices. You also aren't entirely leaning on the current price to offload your position; you might hold onto it a bit. So now there's risk involved, and you might need to decide how big a position you want. And not everyone will have the same position, so not everyone is after the same trade.

- Market making based on multiple order books. Say you have an ETF basket, and you want to be able to make bids and offers around it, as well as around the underlying shares. So then you have a different position from everyone else, there's a fair bit more information to digest, and there's a fair bit more modelling, which means different participants reach different decisions. This means your decisions will not be contested in the same way that obvious arbitrage would be.


I've worked on a couple of Java-based HFT platforms at major firms. You write code that never GCs after startup. That means not using certain language constructs, rewriting almost every library, and using off-heap memory.

I have personally worked on systems that would run an entire week between deployments and never GCed.


Could you elaborate on how to achieve this with Java, i.e., running an entire week without ever GCing? I would imagine that using immutable types is one of the tricks used in such a scenario.

You would (among other techniques) use something like Stormpot to maintain object pools to acquire instances from and release them back to.

Object pools typically can't match the GC on latency and throughput, but they do provide much more predictable and stable performance.
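For what it's worth, the general shape of the technique is tiny. Here's a sketch in OCaml (the thread's lingua franca) rather than Stormpot's actual Java API, with a list-backed stack standing in for what a production pool would implement as a preallocated ring buffer so that acquire/release themselves never allocate:

    (* Preallocate all objects up front; the hot path only moves
       them between "free" and "in use". *)
    type 'a pool = { free : 'a Stack.t }

    let create ~size ~make =
      let free = Stack.create () in
      for _ = 1 to size do Stack.push (make ()) free done;
      { free }

    let acquire p = Stack.pop p.free   (* raises Stack.Empty if exhausted;
                                          real pools block or grow instead *)
    let release p x = Stack.push x p.free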


You basically write Java as if it were C++: you don't use any builtins, and you write everything yourself in pure Java.

I know you said "almost", but you don't have to rewrite huge amounts to achieve zero GC and very low latency. There are open-source tools, e.g. Agrona and Aeron, that get you a lot of the way there.

And I'd imagine sun.misc.Unsafe is used quite often too.

Why would you do that? Why not use C++? Especially when, as you say, memory management is ignored...

You can use a language like OCaml without allocating and still capture huge benefits over C++: a decent type system, ADTs, and better abstractions than templates (modules, in OCaml's case).
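A minimal example of the modules point, in case it's unfamiliar (the classic interval functor, nothing JS-specific): unlike C++ templates, a functor's requirements are a checked, named interface, and errors show up at the definition rather than at instantiation.

    module type Ordered = sig
      type t
      val compare : t -> t -> int
    end

    (* A functor: a module parameterized by any Ordered module. *)
    module Make_interval (O : Ordered) = struct
      type t = { lo : O.t; hi : O.t }
      let contains i x = O.compare i.lo x <= 0 && O.compare x i.hi <= 0
    end

    module Int_interval = Make_interval (struct
      type t = int
      let compare = compare
    end)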

What about multicore? :)

Nowadays Rust seems to cover all those features with zero-cost abstractions.


You still need to write non-consing code to avoid memory allocation/deallocation latency, just as if you were working in Java, Lisp, or OCaml.

(In fact, GCs can often give you better latency than malloc/free unless you go into custom allocators with object pools... which is part of writing non-consing code in GCed languages.)


Ah, you're right, I forgot about OCaml's ass-tier multicore support. Rust is definitely better there.

As others have mentioned, they have ways of writing OCaml specifically so as not to trigger the GC or perform any excessive allocations. It helps when you have your own version of the compiler and a branch of the language itself.

This is discussed, with an example, in this talk around 18m30s: https://www.youtube.com/watch?v=BysBMdx9w6k

Also, as someone else mentioned, Jane Street doesn't regularly compete (to my knowledge, as someone in the industry) at trading horizons that demand the lowest possible latency.


I feel like the Jane Street tech blog is exceptionally good, with a really nice mix of academic and practical ideas thrown in. Do you happen to know of any other video series with this kind of exposition?

Unfortunately, I agree with you: Jane Street's tech blog and published lecture series are great, and probably the best I know of. They make for great advertising to boot. Lecture series like Jane Street's are common in larger trading firms (more than 100-150 headcount), but I haven't seen one of JS's caliber yet.

If you welcome general corporate blogs, I think Google AI's technical blog is quite good. Their frequent publications on distributed computing strike a good balance between academic and practical ideas:

https://ai.googleblog.com

Cloudflare also has one of my favorite, more engineering-focused blogs.


I don't know what Jane Street does, and can't speak for them. That being said, I've heard of some firms using the JVM that just tune things so that the garbage collector will never trigger, and then restart the VM every night.

Unfortunately I don't have a citation for that. I think I've seen it talked about here on HN before.


I'm not entirely clear on the use case, but Disruptor by LMAX Exchange is for the JVM and seems very concerned with latency. (https://github.com/LMAX-Exchange/disruptor/wiki/Performance-...)

Here's the backstory, worth a good read in general. https://martinfowler.com/articles/lmax.html

I suspect at least one use case has gone away since Fowler wrote that: wanting to saturate the write channel into a database in general. Consistent-hashing document DBs like DynamoDB, Cassandra, etc. are basically infinitely scalable for writes. It's not clear to me whether LMAX still makes sense if you want to assemble the DB writes into a strongly ordered stream.


"wanting to saturate the write channel into a database in general. Consistent hashing document DBs like DynamoDB, Cassandra, etc. are basically infinitely scalable for writes."

I don't really understand what you are saying there.

But we use disruptors for processing millions of messages/events as quickly as possible, with a variety of different consumers.


You can write zero-alloc OCaml code, since the standard compiler is very predictable about when it will allocate closures, etc. The Flambda compiler makes things even nicer by removing allocations for some common idioms.

[1]: https://twitter.com/yminsky/status/947064713237684224
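A small example of the kind of rewrite this enables (my own illustration, not from the linked thread): on the stock compiler, a closure that captures a variable is a visible heap allocation, and you can avoid it by restructuring.

    (* Allocates: the closure [fun x -> x + k] captures [k], and
       Array.map builds a fresh array. *)
    let add_all_copy k arr = Array.map (fun x -> x + k) arr

    (* Zero-alloc: mutate in place, no closure, no new array. *)
    let add_all_inplace k arr =
      for i = 0 to Array.length arr - 1 do
        arr.(i) <- arr.(i) + k
      done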


If you don't allocate, you won't garbage collect. GC pauses would be really bad, but you can avoid them, or at least avoid them happening at bad times, if you're careful. It does require knowing in some detail how OCaml manages memory.

They are market makers, not those battling the swings at the microsecond scale. Yaron Minsky said as much in some video I watched one random night, though I cannot recall which.

So their edge isn’t based on speed. They gain more from correctness and the ability to express these ideas.


Market makers are incredibly exposed to ms-scale swings, since they're warehousing a lot more risk than an HFT firm.

I don’t doubt it. But that’s what he said. There are very few people alive in this specific field with this much domain knowledge. Are you his peer in this respect?

You assume he’s telling the truth. Why would he make the truth public knowledge? It can do nothing but hurt.

In Elixir/Erlang there is no global pausing for garbage collection; instead, each process is individually garbage collected. And I remember hearing about a strategy that lets you avoid it entirely by ensuring your process's lifetime is short enough (or its heap big enough) that collection basically never happens, despite the BEAM (the Erlang VM) running with garbage collection.

Edit: my point is that it is likely avoidable through certain coding practices in garbage collected environments.


This is only possible because of the way the BEAM VM and its languages are architected; the GC itself is "not the smartest", but it doesn't have to be, because there is no shared memory and process boundaries are failure domains.

This is a really fascinating question. When I chatted with some of the Jane Street crew at NeurIPS, they were adamant about the benefits of OCaml's Hindley-Milner type system.

I didn't ask, but I am curious how that compares to the type guarantees of Rust. Would moving to Rust cause them to lose that advantage of compile-time error catching? I've never written a line of Rust (hopefully that changes soon), so I don't know, but I am certainly interested.


Rust's type system has a similar power level to OCaml's, but without modules (which are used heavily at JS) and with linear types/lifetimes. Rust would probably be a good fit for JS, but of course it takes a lot to overcome the momentum of 20 years of using primarily one language.

OCaml doesn't have an HM type system. I mean, I guess it does, but OCaml supports a lot of stuff HM doesn't.

Technically the same is true of Haskell [0], but most people (myself included) will refer to this kind of type system as Hindley-Milner for simplicity's sake. (Both are more powerful, though.)

[0] https://cstheory.stackexchange.com/a/30528/20014
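A concrete example of "stuff HM doesn't" support, for the curious: GADTs, where the result type varies with the constructor. Plain HM inference cannot handle this, which is why the annotation on eval below is mandatory.

    type _ expr =
      | Int : int -> int expr
      | Bool : bool -> bool expr
      | If : bool expr * 'a expr * 'a expr -> 'a expr

    (* The locally abstract type annotation is required; HM alone
       cannot infer a type that depends on the matched constructor. *)
    let rec eval : type a. a expr -> a = function
      | Int n -> n
      | Bool b -> b
      | If (c, t, e) -> if eval c then eval t else eval e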


Did they say that latency isn't a big factor in their PnL, but they let their tech guys waste their time on it to keep a big happy family?

They fill boxes with a few TB of RAM and then turn off garbage collection. I'm always surprised people don't know about this.

> how does a market making latency arb firm handle GC pauses that are inherent with any GC language?

Don't garbage collect, or do it at the end of the trading day. There are plenty of VMs designed for this, at least in the JVM space (which is where my personal experience has been).

RAM is cheap. Latency is expensive.


Preallocate all of the memory on the box, and control GC runs.

There's an entire library of tricks for ensuring that GC pauses don't affect trading, but the biggest one is to use userspace allocated memory pools.
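In OCaml specifically, the standard library's Gc module exposes those knobs directly. A sketch of the usual pattern (the parameter values and the run_trading_day function are illustrative, not a recommendation):

    (* Hypothetical hot path; in real code this would be written to
       allocate as little as possible. *)
    let run_trading_day () = ()

    let () =
      (* Big minor heap => rare minor collections; high space_overhead
         => lazier major GC. minor_heap_size is in words, not bytes. *)
      Gc.set { (Gc.get ()) with
               Gc.minor_heap_size = 64 * 1024 * 1024;
               Gc.space_overhead = 400 };
      run_trading_day ();
      Gc.compact ()  (* pay the big pause at a quiet moment *)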


Anywhere that's latency-sensitive, they write code carefully so that it doesn't trigger the GC.

GC pauses aren't inherent.

I've written a Java compiler that does GC and watched that part work: no pauses, even for data structures with loops, and generally excellent performance. It's not released yet, because I've also watched other parts not work.

But GC without pauses: that worked. And I was relieved when I saw that my ivory tower worked, after many people had told me it couldn't possibly ;)


This is the third time I've watched this video (admittedly with a bit of distraction today). After the first time, I went off and learned about modular implicits, which appear to be a huge ergonomic improvement, especially given the way the Jane Street library APIs are designed. The second and third times, I got increasingly sad that so much brainpower is being sucked up by algebraic effects, to the detriment of other advancements. Oh well, that's their decision to make; Jane Street continues to do a huge service to OCaml in myriad ways, and I really value these tech-talk videos as part of that.


