Interviewed with HRT a while back. While I didn't get past the final round, their Python internals interview (which I did pass) was an absolute blast to prepare for, and required a really deep dive into implementation-specific details of CPython: exactly how collisions are handled in dict, details about memory management, etc. I pretty much had to spend a few weeks in the CPython source to prep, and for me it was worth the interview just to really learn what's going on.
For most teams I would be pretty skeptical of an internal Python fork, but the Python devs at HRT really know their stuff.
I interviewed with them as well. Something like 6-8 interviews, only to be told afterwards that they were circulating my CV amongst teams and didn't have a fit for me...
I also interviewed with them a couple of years ago, for their Database SRE (AKA DBRE) role. It was going quite well until I discovered that they required you to live within commuting distance of an office, and unfortunately I had just moved away from one. They didn’t require in-person, supposedly, but needed the option, I guess?
I was quite impressed by the interviews, mostly for their pragmatism and skill-fitting. The programming interview wasn’t LC, it was “can you use a language (preferably Python) to parse a CSV and get useful information out of it,” because that’s the skill level the team needs. On the other hand, the Linux and DB interviews were quite in-depth, because again, the team needs those skills.
10/10 would interview again if I’m ever near an office again.
Most trading firms are past the whole "beat the other guys to buy". Established large investment firms already have all that on lockdown in terms of infrastructure and influence, to the extent that they basically just run the stock market at this point (e.g. Tesla posts horrible quarter numbers, but the stock goes up).
Most of the smaller firms basically try to figure out the patterns of the larger firms and capitalize on that. The timescales have shifted quite a bit.
No, there are absolutely electronic trading markets where a difference of milliseconds of latency to certain events is worth more than 1M in PnL. That’s a long time.
The top trading firms are firing off orders in double digit nanoseconds, not milliseconds.
In some cases the order leaving the card starts to emerge before the packet containing the market data event that they're responding to has even finished arriving.
Waiting a full microsecond for the packet to arrive before responding means you're already too slow.
Doesn't the fact that a modern FPGA-centric (probably ASICs in the mix too at this point) hybrid NIC/order-parser/state-machine thing is rumored to be able to hit glass-to-glass of ~20-40ns mean that the speed game is hotter than ever?
Do you mean that because it involves a lot of hardware design now? The days of being able to offer around the inside in C++ on a regulated securities exchange are over, but there's still C++ driving the thing. That 20ns "tick to trade", or however it's being measured in a given instance, is still pretty basic response stuff; light speed is still a thing. There's a C++ program upstairs running the show, and it's trying to do its job in under a mike for sure.
But there are more recent talks (Optiver is especially transparent about it but other people talk about it too): https://www.youtube.com/watch?v=sX2nF1fW7kI, that's David Gross at CppCon last year, it can't have changed that much since last year.
That’s irrelevant to the fact that the expected PnL on a millisecond of latency improvement is a lot more than 1M in some markets. Obviously if you are getting whatever trade you are concerned with off in less than one millisecond, the question isn’t well posed.
There are many more games to play than delta one takeout and the solutions certainly don’t fit on one or a handful of FPGAs.
I took the parent post to mean that a few large firms have emerged as clear winners of the speed game, and most other companies compete on (relatively) longer time scales now.
It's not uncommon to have a fast core and then an API that alpha / research teams feed signals into
If you are someone like HRT, I presume the bulk of the money comes at very short holding periods, so you have e.g. fast signals that work short term and then mid-frequency alpha signals that spit out a forecast over a few timeframes. I.e. it might not be that they buy (aggressively) really quickly, but rather that someone sells to them and then they hold onto the position for longer than they would if they had no opinion.
Similarly this shapes where you post your orders e.g. if you really want it then you want to be top of the book
Well there's gonna be people writing code who can't do it in say a high performance C/C++ setup. Not professional programmers, but professional <some finance discipline>.
Sometimes it will be worth the tradeoff to put that person and a programmer together to code up a solution in another language. Sometimes it will be worth it to have the non-programmer write it in Python and then do Herculean things in the background to make it fast enough.
The gulf between high performance C/C++ and Python is vast and includes most other programming languages, many of which are friendly to write or can be made friendly to write for a limited domain, with significantly less rocket science needed than making python faster.
Even at the speediest trading firms, the large majority of code is not latency sensitive. Systems and algos are structured such that the fast acting stuff is simple and contained.
Honestly if you're a millisecond too slow you might as well not trade at all. From my own experience with trying to get Python to go fast for crypto trading, you can get it pretty fast using Cython - single digit microseconds on an average AWS instance for a simple linear regression was my proudest moment. They're probably pushing it even faster because nanoseconds are where the money's at. Many HFT firms are down in the double digit nanoseconds, I believe. Maybe lower.
In crypto, you have centralised (e.g. Coinbase) and decentralised exchanges (e.g. Uniswap). Decentralised exchanges operate onchain via smart contracts.
It's a finance firm, i.e. a scam firm. "We have a fancy trading algorithm that statistically is never going to outperform just buying VOO and holding it, but the thing is if you get lucky, it could".
Scammers are not tech people. And it's pretty clear from their post.
> In Python, imports occur at runtime. For each imported name, the interpreter must find, load, and evaluate the contents of a corresponding module. This process gets dramatically slower for large modules, modules on distributed file systems, modules with slow side-effects (code that runs during evaluation), modules with many transitive imports, and C/C++ extension modules with many library dependencies.
As they should.
The idea that you type something in the code and the interpreter just doesn't execute it is how you end up with Java-like services, where the dependency injection chains are so massive that the first time everything has to get lazily injected, the code takes a massive amount of time to run. Then you have to go figure out where the initialization code that slows everything down lives, and start modifying your code to make that load first, which leads to a mess.
If your python module takes a long time to load, this is a module problem. There is a reason why you can import submodules of modules directly, and overall the __init__.py in the module shouldn't import all the submodules by default. Structure your modules so they don't do massive initialization routines and problem solved.
Furthermore, because of Python's dynamic nature, you can do runtime imports, including imports in functions. In use, whether you import something at the top and it gets lazily loaded, or you import something right when you have to use it, makes absolutely no difference other than code syntax, and the latter is actually better because you can see what is going on rather than having the lazy loading hidden away in the interpreter.
Or if you really care, you can implement the lazy work inside the modules themselves, so when you import them and use them the first time it works exactly like lazy imports.
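For anyone wondering what "lazy work inside the module" could look like, here is a minimal sketch using a module-level __getattr__ (PEP 562). The package name bigmodule_demo and the stats attribute are made up for illustration, and the package is built in memory only so the snippet runs without files on disk; in a real package the hook would simply live in __init__.py.

```python
import importlib
import sys
import types

# Hypothetical package, built in memory so the sketch is self-contained.
pkg = types.ModuleType("bigmodule_demo")

# Attribute name on the package -> real module to import on first use.
_LAZY = {"stats": "statistics"}


def _lazy_getattr(name):
    """Module-level __getattr__ (PEP 562): called only when normal lookup fails."""
    if name in _LAZY:
        module = importlib.import_module(_LAZY[name])
        setattr(pkg, name, module)  # cache so later lookups bypass this hook
        return module
    raise AttributeError(f"module {pkg.__name__!r} has no attribute {name!r}")


pkg.__getattr__ = _lazy_getattr
sys.modules[pkg.__name__] = pkg

import bigmodule_demo

print("stats" in vars(bigmodule_demo))       # nothing has been loaded yet
print(bigmodule_demo.stats.mean([1, 2, 3]))  # first access triggers the import
print("stats" in vars(bigmodule_demo))       # now cached on the package
```

The import statement stays at the top of the caller's file, and the cost is paid on first attribute access, which is roughly the behaviour people want from lazy imports.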
To basically spend time building a new interpreter with lazy loading just to be able to have all your import statements up at the top just screams that those devs prefer ideology over practicality.
> Its a finance firm - i.e scam firm. "We have a fancy trading algorithm that statistically is never going to outperform just buying VOO and holding it, but the thing is if you get lucky, it could".
HRT trades their own money so if it didn't beat VOO then they'd just buy VOO. There are no external investors to scam.
> "We have a fancy trading algorithm that statistically is never going to outperform just buying VOO and holding it, but the thing is if you get lucky, it could".
You wish lol. How do you think they pay for all the developers?
Firms like HRT don't even take outsider money, they don't really need to.
And besides, we don't get paid for beating stocks, a lot of funds will do worse than equities in a good year for the latter, the whole point is that you're benchmarked to the risk free rate because your skill is in making money while being overall market neutral. So you rarely take a drawdown anywhere near as badly as equities.
As a service this is often a portfolio diversification tool for large allocators rather than something they put all the money into.
It is true however that some firms are basically just rubbish beta vehicles that probably should in an ideal world shut down.
I don't know what you define as outsider money, but the fact is that they are a market maker, and you are never going to make large amounts of money on arbitrage by itself.
> Its a finance firm - i.e scam firm. "We have a fancy trading algorithm that statistically is never going to outperform just buying VOO and holding it, but the thing is if you get lucky, it could".
> Scammers are not tech people. And its pretty from their post.
It would be great if you included any sort of evidence or argument.
Reading on to the other comments, it looks like you're throwing out a lot of accusations and claims. I don't know what you think you know, but from the looks of it, you don't really know HRT's business. I don't really these days, but I knew it years ago, and it's not from taking client money or arbitrage or some weird scam. It's not magic but the world of algo trading isn't a ponzi scheme.
> "We have a fancy trading algorithm that statistically is never going to outperform just buying VOO and holding it, but the thing is if you get lucky, it could".
Man from all these responses, so many people are unaware of the finance world, lol. Or just bots posting for HRT
In case you are unaware - every single trading firm makes money on fees, not by outperforming the market. This goes for firms like Vanguard too.
Just think about it for a little bit - if you could reliably outperform the market by any % with an algorithm why even start a company? Just take out loans, invest, make money, repeat and become rich. No expenses to manage a company.
They're a market maker that provides liquidity to clients. They (as far as I know) don't charge fees, but earn their profits by exploiting the bid/ask spread.
A customer might want to offload TSLA and be willing to sell at the market rate of $329. This trade might work in HRT's favor if they can sell it on for $329.01. It's just 1 cent, but over millions of transactions these small amounts of profit add up.
The value captured by HRT is meaningful in the aggregate, but tiny and irrelevant to the institutional clients, and therefore can't be thought of as a fee. What HRT provides in return for taking clients' trades is liquidity.
I think you're misunderstanding what that page is: it's not an advertisement to invest with the company, it's an advertisement to trade via/with the company in the same way you might otherwise go manually trade from a Bloomberg terminal (or any other method).
There is no way to invest in the company, and the only way of becoming a "customer" is to engage in trading.
I'm not sure how you intend performing the "take out loans" and "invest" stages without a company, once you have more than a few million dollars. Companies permit scaling past one person's worth of uptime and ability.
HRT is a prop firm, which means they don't have outside investors. They invest their own money like you said, no income from fees. You just proved that you have no idea what you are talking about.
No, they are a liquidity provider. Liquidity providers only make money when there are imbalances in the market. Notionally because markets tend to drive fast towards efficiency, you can't realistically make money just being a market maker.
So then, you have to offer additional services. HRT has the SDP that they provide, and of course charge fees for. But then the question is why would anyone do this, versus just going through any other financial institution.
The answer is basically all up on their website
"As a liquidity provider, HRT develops automated trading algorithms designed to provide the best prices to our clients".
The question that should be asked is as a user, why would I want to sign up with HRT or any similar financial company? The answer for HRT is because you want to have access to more complicated financial derivatives - you don't need to sign up and pay fees to buy basic stock.
So they promise that their algorithms give the user the best price, which is a legal way of saying that you will pay less for a certain asset and make money on it, without actually saying so, because that can't be guaranteed.
And it's well known that nobody ever gets rich off an algorithm in finance unless they are a well-established large firm, intricately tied with the government, that can move so much money as to influence trading.
Being a market maker vs "We have a fancy trading algorithm that statistically is never going to outperform just buying VOO and holding it, but the thing is if you get lucky, it could" are two very different things lol
Moreover, I would be very surprised if the majority of their $8 billion annual profit came from client market making.
Right, exactly what I said. They don't make money by market making. They make money by charging transaction fees, and by charging for access to their "algorithms", which are designed around analyzing the complex futures that they are the market maker for.
The incentive for users to sign up with them is to get access to "better" pricing for whatever commodity they pair the buy/sell orders for - but remember these are futures, so it's all betting, and so the algorithms don't really mean anything.
Runtime imports are a maintenance nightmare and can quickly fragment a codebase. Static analysis of imports is so desirable that it is almost always worth the initialization performance hit. Tradeoffs.
Static analysis of imports should be solved by mypy (or your favorite static analyzer).
I guess you meant "run all imports at startup is desirable to check if they work", but I have a hard time agreeing with that (personally I think having a good test suite is needed, whereas running more code at startup is not wanted).
You can have one module with all the capability, but with the capability separated into submodules, and the __init__.py of the module doesn't auto-import the submodules.
So when you do this

    import bigmodule

it doesn't do anything functional, and you may only have some small top-level things available to you, like bigmodule.config or bigmodule.logging.
Then, you have your big initializer code in bigmodule.financedata. But the stuff you need for running scripts is in bigmodule.scripts.
So when you write

    from bigmodule import financedata

this code will take a while.
But if you write

    from bigmodule import scripts

this will load fast.
You don't need to have a gazillion modules, just good organization. Also, in general, it's good practice to gate intensive compute/network operations behind an explicit function you need to call.
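A rough sketch of the "gate it behind an explicit function" idea, in case it helps. The function names and the price table are invented for the example, and the sleep is only a stand-in for slow I/O; nothing here comes from a real library.

```python
import functools
import time


@functools.lru_cache(maxsize=None)
def load_finance_data():
    """Hypothetical expensive initializer: runs on the first call only."""
    time.sleep(0.05)  # stand-in for slow disk or network access
    return {"TSLA": 329.00}


def price(symbol):
    # Importing this module is cheap; the cost is paid here, on demand.
    return load_finance_data()[symbol]


print(price("TSLA"))
print(price("TSLA"))  # served from the cache, no second load
print(load_finance_data.cache_info())
```

Importing a module written this way costs almost nothing, and whoever calls price() first pays for the load in a place that is easy to find in a profiler.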
Also thank you for focusing the convo on the tech stuff instead of repeating finance bro myths
That sounds good in theory, but I haven’t come across a codebase that’s really optimized and free of unnecessary imports, especially in large companies.
And when you have to depend on external libraries beyond your control, how do you typically handle those situations?
As codebases get larger, inefficiencies are bound to happen. Having explicit imports that I can go look up and see where they are is better for resolving this, because you can trace how every module gets imported and time them. Having interpreter code that runs with lazy loading that is all hidden is not the way to solve this.
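For the "trace and time them" part, CPython already ships `python -X importtime`, which prints a per-module breakdown to stderr. If you want something scriptable, a minimal sketch looks like this (the module names are just examples from the standard library, and an already-cached module will naturally show near-zero time):

```python
import importlib
import sys
import time


def time_import(name):
    """Return the seconds spent importing `name` (near zero if already cached)."""
    start = time.perf_counter()
    importlib.import_module(name)
    return time.perf_counter() - start


for mod in ("json", "decimal", "email.parser"):
    was_cached = mod in sys.modules
    elapsed_ms = time_import(mod) * 1e3
    print(f"{mod:15s} {elapsed_ms:8.3f} ms  (already imported: {was_cached})")
```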
As for external libraries, you import them only in the places where you need to use them, to avoid the same pitfalls. It's also pretty easy to analyze the import process within those libraries, and then again import only the specific submodules, which limits what actually gets loaded.
While I see the usefulness of lazy imports, it always seemed a bit backward to me for the importer to ask for lazy import, especially if you make it an import keyword rather than a Python flag. Instead I'd expect the modules to declare (and maybe enforce) that they don't have side effects, that way you know they can be lazily imported, and it opens the door for more optimizations, like declaring the module immutable. That links to the performance barrier of Python due to its dynamic nature as discussed in https://news.ycombinator.com/item?id=44809387
Of course that doesn't solve the overhead of finding the modules, but that could be optimized without lazy import, for example by having a way to pre-compute the module locations at install time.
> it always seemed a bit backward to me for the importer to ask for lazy import, especially if you make it an import keyword rather than a Python flag
Exactly this. There must be zero side effects at module import time, not just for load times, but because the order of such effects is 1) undefined, 2) heavily dependent on an import protocol implementation, and 3) a source of safety and security nightmares that Python devs don't seem to care much about until bad things happen at the most inconvenient time possible.
> Of course that doesn't solve the overhead of finding the modules, but that could be optimized without lazy import, for example by having a way to pre-compute the module locations at install time.
2) pre-compute everything in CI by using a solution from (1) and doing universal toplevel import of the entire Python monorepo (safe, given no side effects).
> This process gets dramatically slower for … modules on distributed file systems, modules with slow side-effects
Oh no. Look I'm not saying you're holding it wrong, it's perfectly valid to host your modules on what is presumably NFS as well as having modules with side effects but what if you didn't.
I've been down this road with NFS (and SMB if it matters) and pain is the only thing that awaits you. It seems like they're feeling it. Storing what is spiritually executable code on shared storage was a never ending source of bugs and mysterious performance issues.
Gonna call this an antipattern. Do you need all those modules imported in every script? Well then you save nothing on load time; the time will be spent regardless. Does every script not need those imports? Well then they shouldn't be importing those things, and this small set of top-level imports should be curated into a better, more fine-grained list (and if you want to write tools, you can certainly identify these patterns using tooling similar to that which you wrote for LazyImports).
I don't disagree, but I mean it was either that or C + shell back in the early 2000s, and C + shell is notorious for its non-portability across Unix and Windows—partly why Git on Windows requires an entire MSYS installation.
Today, it would be a mistake to use anything other than Rust (hence Jujutsu carrying the flame forward).
For personal one file utility scripts, I'll sometimes only import a module on a code path that needs it. And make it global if the scope gets in the way.
It's dirty, but speeds things up vs putting all imports at the top.
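Something like this, as a tiny sketch (json is only a placeholder for whatever heavy module the rare code path needs):

```python
import sys


def to_json(obj):
    # Deferred import: json is loaded the first time this code path runs.
    global json  # bind the name at module scope so other functions can use it too
    import json
    return json.dumps(obj, sort_keys=True)


print(to_json({"b": 1, "a": 2}))
```

The `global` line is the "dirty" part: without it, the imported name would be local to the function and invisible to the rest of the script.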
I wonder how much can be saved by using a local file system for imports though. In my testing, the mere presence of a home directory on NFS already dramatically slows down imports (by ~10x), because Python searches for modules in the home directory too by default.
To prevent this, set PYTHONNOUSERSITE=1, which stops Python from searching for modules in ~/.local/. (For convenience, try calling python through a wrapper in your project, say bin/run-python, where you can set all the Python-specific environment variables you need at the time of execution and not have to worry about setting them in the user's shell, etc.)
Thanks, yeah I know that it works, what I meant is that it may be quite easy to compare the module import times with and without that env variable to see how much impact an NFS home directory has (and it's a lot), and possibly draw similar conclusions about the distributed file system behaviour in general too
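A quick way to run that comparison, as a sketch: start a child interpreter with and without the variable and time it. The helper name is mine, the timing includes interpreter startup, and on a machine with a local home directory the difference will probably be small.

```python
import os
import subprocess
import sys
import time


def run_child(disable_user_site):
    """Start a fresh interpreter; return (sys.flags.no_user_site, wall seconds)."""
    env = dict(os.environ)
    env.pop("PYTHONNOUSERSITE", None)
    if disable_user_site:
        env["PYTHONNOUSERSITE"] = "1"
    start = time.perf_counter()
    result = subprocess.run(
        [sys.executable, "-c", "import sys, json; print(sys.flags.no_user_site)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip()), time.perf_counter() - start


for disable in (False, True):
    flag, seconds = run_child(disable)
    print(f"PYTHONNOUSERSITE set: {disable}  no_user_site={flag}  "
          f"startup+import: {seconds * 1e3:.1f} ms")
```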
It'd have been really nice to have that PEP in as it'd have helped me not have to write local imports everywhere.
As it is, top-level imports IMHO are only meant to be used for modules required to be used in the startup, everything else should be a local import -- getting everyone convinced of that is the main issue though as it really goes against the regular coding of most Python modules (but the time saved to start up apps I work on does definitely make it worth it).
Yeah, imo that's the way that python should've worked in the first place.
Import-time side effects are definitely nasty though and I wonder what the implications on all downstream code would be. Perhaps a lazy import keyword is a better way forward.
We need something like the ancient unexec from Emacs to dump out Python images. More generally, we need something like that for generic checkpointing, maybe based on CRIU.
> we support the Steering Council in their rejection of PEP 690—the implicit lazy imports are not a good fit for upstream due to the same, subtle bugs we encountered during our migration. However, as time permits, we hope to propose a revised lazy imports PEP that introduces an explicit lazy keyword, e.g. lazy import foo or lazy from foo import bar. This approach will satisfy migration and compatibility concerns, allow users to opt-in gradually, and enable all Python users to reap the speed benefits of lazy imports in a safe way.
Hmm, it strikes me that if they really wanted to go this lazy route, they could've implemented an import hook, instead of creating and maintaining an entire fork.
Yeah I wonder what led them down the drastic fork trajectory rather than considering this approach… kind of interesting this wasn’t even acknowledged in the article
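For context, the standard library already has a building block for this: importlib.util.LazyLoader. The sketch below follows the recipe from the importlib documentation; it is an opt-in helper rather than a full import hook on sys.meta_path, and I'm not claiming it would have covered HRT's use case. colorsys is just a small stdlib module picked for the demo.

```python
import importlib.util
import sys


def lazy_import(name):
    """Register `name` in sys.modules but defer executing it until first attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # with LazyLoader this only sets up the deferral
    return module


colorsys = lazy_import("colorsys")  # the module body has not run yet
print(colorsys.rgb_to_hsv(1.0, 0.0, 0.0))  # first attribute access triggers the real load
```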
every quant shop has QR and QT people that can barely write passable python let alone cpp - then the QD people have to integrate that stuff with prod cpp pipelines.
In my experience it tends to be the opposite — I am not a quant (QD) but having worked with a few teams there’s a negative selection for technical expertise. QRs who are good at programming are usually pushed into maintaining infrastructure, datasets, or just tooling for less technical members of their team, who then get to use those tools to further their own alpha generation. Orgs incentivize the final step in making alpha — spend too much time helping others or building reusable research, and your coworkers steal the thunder.
That, or stop helping your coworkers/accommodating them… risky, as a career move. Only seen that work once.
Libraries for this have always existed, triggering import on first access. The problem was, they would break linters. But that's not an issue anymore with typing.TYPE_CHECKING.
A PEP is very much welcome, but using lazy import libraries is a fairly common, very old, method of speeding things up. My pre PEP 690 code looks like this:
    import typing
    from lazy import LazyImport

    member = LazyImport('my_module.subpackage', 'member')
    member1, member2, = LazyImport('my_module', 'member1', 'member2')

    if typing.TYPE_CHECKING:
        # normal import, for linter/IDE/navigation.
        from my_module.subpackage import member
        from my_module import member1, member2
Well, if you use argparse or one of the many argparse wrappers for a moderately complex CLI, you end up lazifying the CLI parser itself, because just fully populating the argparse data structures can easily take half a second or more. With other startup costs you easily end up with "program --help" taking >1s, and any CLI parsing error also taking >1s.
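One way to lazify the parser, sketched under the assumption that each subcommand's arguments can be built independently: register every subcommand name, but only populate the arguments of the one that was actually requested. The subcommands and builder functions are invented for the example.

```python
import argparse


def _build_stats(parser):
    parser.add_argument("values", type=float, nargs="+")


def _build_echo(parser):
    parser.add_argument("text")


# Hypothetical subcommands: name -> function that fills in its arguments.
BUILDERS = {"stats": _build_stats, "echo": _build_echo}


def parse(argv):
    parser = argparse.ArgumentParser(prog="program")
    sub = parser.add_subparsers(dest="command", required=True)
    wanted = argv[0] if argv else None
    for name, build in BUILDERS.items():
        child = sub.add_parser(name)
        if name == wanted:
            build(child)  # populate arguments only for the requested subcommand
    return parser.parse_args(argv)


args = parse(["stats", "1", "2.5"])
print(args.command, args.values)
```

The top-level --help still lists every subcommand, while the expensive per-subcommand setup is skipped for everything except the one being run.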
Not sure why you're throwing shade on people making a living. Go lobby to your representative if you think the financial market should be changed instead of belittling folks doing a job.
But yes, like you I had a great experience
when milliseconds mean millions
Those days are all over btw.
The speed game is essentially over
The OG talk on this is Carl Cook's: https://www.youtube.com/watch?v=NH1Tta7purM
> i.e Tesla posts horrible quarter numbers, but stock goes up
This is a non sequitur from who’s winning the HFT game
If that was the case, why use Python in the first place?
They’re not doing the fast stuff in python; nevertheless it’s worth their engineering to make the parts they do in python fast
> Sometimes it will be worth it to have the non-programmer write it in Python and then do Herculean things in the background to make it fast enough.
Nim exists, Crystal exists
It would not surprise me if some shops end up using less common languages to fill a niche (or hell, invent their own DSL).
But it also wouldn't surprise me if a lot of shops land on python because that's what their hiring pool knows.
With insignificant ecosystems, compared to Python.
There’s tons of latency sensitive code outside of the FPGA systems and it is not simple.
For reference, a byte at 10 Gbps is almost 1 ns long.
Though the encoder runs 64b/66b, one 66-bit block at a time, so you really get around 8 B every 7 ns or so.
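A quick back-of-the-envelope check of those numbers. This assumes a nominal 10 Gb/s data rate and the 10GBASE-R line rate of 10.3125 Gbaud for the 64b/66b case; which 10G flavor was meant is my assumption, not something stated above.

```python
# Back-of-the-envelope serialization times on 10G Ethernet.
DATA_RATE_BPS = 10e9          # nominal data rate: 10 Gb/s
LINE_RATE_BAUD = 10.3125e9    # 10GBASE-R line rate (assumed PHY)

# Time on the wire for one byte at the nominal data rate.
byte_time_ns = 8 / DATA_RATE_BPS * 1e9

# 64b/66b carries 8 payload bytes in one 66-bit block.
block_time_ns = 66 / LINE_RATE_BAUD * 1e9

print(f"1 byte: {byte_time_ns:.2f} ns")
print(f"one 66-bit block (8 payload bytes): {block_time_ns:.2f} ns")
```

That works out to roughly 0.8 ns per byte and 6.4 ns per block, in line with "almost 1 ns" and "7 ns or so".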
For crypto you pay the miners to put your transaction first. You don't need millisecond-precision reaction time.
Crypto trading takes place on exchanges, not the blockchain though?
In crypto, you have centralised (e.g. Coinbase) and decentralised exchanges (e.g. Uniswap). Decentralised exchanges operate onchain via smart contracts.
This article from a16z explains the mechanics of reordering transactions for profit (MEV): https://a16zcrypto.com/posts/article/mev-explained/
>Python devs at HRT really know their stuff.
It's a finance firm, i.e. a scam firm. "We have a fancy trading algorithm that statistically is never going to outperform just buying VOO and holding it, but the thing is if you get lucky, it could".
Scammers are not tech people. And it's pretty clear from their post.
> In Python, imports occur at runtime. For each imported name, the interpreter must find, load, and evaluate the contents of a corresponding module. This process gets dramatically slower for large modules, modules on distributed file systems, modules with slow side-effects (code that runs during evaluation), modules with many transitive imports, and C/C++ extension modules with many library dependencies.
As they should.
The idea that you type something in the code and the interpreter just doesn't execute it is how you end up with Java-like services, where dependency injection chains are so massive that the first time everything has to get lazily injected, the code takes a huge amount of time to run. Then you have to go figure out which initialization code slows everything down, and start figuring out how to modify your code to make that load first, which leads to a mess.
If your python module takes a long time to load, this is a module problem. There is a reason why you can import submodules of modules directly, and overall the __init__.py in the module shouldn't import all the submodules by default. Structure your modules so they don't do massive initialization routines and problem solved.
Furthermore, because of Python's dynamic nature, you can do runtime imports, including imports inside functions. In use, whether you import something at the top and it gets lazily loaded, or you import it right when you need it, makes absolutely no difference other than code syntax, and the latter is actually better because you can see what is going on rather than the lazy loading being hidden away in the interpreter.
Or if you really care, you can implement lazy initialization inside the modules, so that when you import them and use them for the first time it works exactly like lazy imports.
To basically spend time building a new interpreter with lazy loading just to be able to have all your import statements up at the top just screams that those devs prefer ideology over practicality.
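For illustration, the "lazy work inside the module" option could look something like this minimal sketch. The names `load_reference_data` and `lookup` are hypothetical, and the tiny JSON string stands in for whatever slow setup a real module would do.

```python
import functools


@functools.lru_cache(maxsize=None)
def load_reference_data():
    """Hypothetical expensive setup, run on first use instead of at import."""
    # The dependency is imported here, so importing this module stays cheap.
    import json
    return json.loads('{"AAPL": 1, "TSLA": 2}')


def lookup(symbol):
    # The first call pays the setup cost; later calls hit the cache.
    return load_reference_data()[symbol]
```

Importing a module written this way does no heavy work; the cost is visible at the call site of `lookup` rather than hidden in the interpreter.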
> It's a finance firm, i.e. a scam firm. "We have a fancy trading algorithm that statistically is never going to outperform just buying VOO and holding it, but the thing is if you get lucky, it could".
HRT trades their own money so if it didn't beat VOO then they'd just buy VOO. There are no external investors to scam.
> "We have a fancy trading algorithm that statistically is never going to outperform just buying VOO and holding it, but the thing is if you get lucky, it could".
You wish lol. How do you think they pay for all the developers?
Firms like HRT don't even take outsider money, they don't really need to.
And besides, we don't get paid for beating stocks, a lot of funds will do worse than equities in a good year for the latter, the whole point is that you're benchmarked to the risk free rate because your skill is in making money while being overall market neutral. So you rarely take a drawdown anywhere near as badly as equities.
As a service this is often a portfolio diversification tool for large allocators rather than something they put all the money into.
It is true however that some firms are basically just rubbish beta vehicles that probably should in an ideal world shut down.
I don't know what you define as outsider money, but the fact is that they are a market maker, and you are never going to make large amounts of money on arbitrage by itself.
Two business models:
Good returns - take other people's money, trade it, take 20% of profits
Excellent returns - trade your own money, make a bit less overall but keep 100% of profits
In the first case, what's the incentive for users to trade with you?
In the second case, why start a company?
Not sure what you mean
Prop shops usually come about as the partners buy out other investors in a fund
> It's a finance firm, i.e. a scam firm. "We have a fancy trading algorithm that statistically is never going to outperform just buying VOO and holding it, but the thing is if you get lucky, it could".
> Scammers are not tech people. And it's pretty clear from their post.
It would be great if you included any sort of evidence or argument.
Reading on to the other comments, it looks like you're throwing out a lot of accusations and claims. I don't know what you think you know, but from the looks of it, you don't really know HRT's business. I don't really know it these days, but I knew it years ago, and it's not about taking client money or arbitrage or some weird scam. It's not magic, but the world of algo trading isn't a Ponzi scheme.
> "We have a fancy trading algorithm that statistically is never going to outperform just buying VOO and holding it, but the thing is if you get lucky, it could".
How do they make $8B/yr while underperforming VOO?
Reference: https://www.businessinsider.com/hudson-river-trading-hrt-8-b...
Man from all these responses, so many people are unaware of the finance world, lol. Or just bots posting for HRT
In case you are unaware - every single trading firm makes money on fees, not by outperforming the market. This goes for firms like Vanguard too.
Just think about it for a little bit - if you could reliably outperform the market by any % with an algorithm why even start a company? Just take out loans, invest, make money, repeat and become rich. No expenses to manage a company.
You have no concept of the infrastructure and organization necessary to operate these enterprises.
All your posts here are low-information anti-finance rants.
> Because every single trading firm makes money on fees
I really want to know what you think this company does, precisely
I can't tell if people don't understand how financial firms work or if you're just being sarcastic.
If a company has customers, and those customers buy a product, the company charges a price for that product.
They're a market maker that provides liquidity to clients. They (as far as I know) don't charge fees, but earn their profits by exploiting the bid/ask spread.
A customer might want to offload TSLA and be willing to accept the market rate of $329. This trade works in HRT's favor if they can then sell it on for $329.01. It's just 1 cent, but over millions of transactions these small amounts of profit add up.
The value captured by HRT is meaningful in the aggregate, but tiny and irrelevant to the institutional clients, and therefore can't be thought of as a fee. What HRT provides in return for taking clients' trades is liquidity.
> If a company has customers
They don't...
a) What do you call this at the bottom of the page https://www.hudsonrivertrading.com/liquidity/
b) If you don't have customers, why have a company?
I think you're misunderstanding what that page is: it's not an advertisement to invest with the company, it's an advertisement to trade via/with the company in the same way you might otherwise go manually trade from a Bloomberg terminal (or any other method).
There is no way to invest in the company, and the only way of becoming a "customer" is to engage in trading.
I don’t really understand question b), how else would you organize a venture involving more than a couple people?
I'm not sure how you intend to perform the "take out loans" and "invest" stages without a company, once you have more than a few million dollars. Companies permit scaling past one person's worth of uptime and ability.
HRT is a prop firm, which means they don't have outside investors. They invest their own money like you said, no income from fees. You just proved that you have no idea what you are talking about.
No, they are a liquidity provider. Liquidity providers only make money when there are imbalances in the market. Notionally because markets tend to drive fast towards efficiency, you can't realistically make money just being a market maker.
So then, you have to offer additional services. HRT has the SDP that they provide, and of course charge fees for. But then the question is why would anyone do this, versus just going through any other financial institution.
The answer is basically all up on their website
"As a liquidity provider, HRT develops automated trading algorithms designed to provide the best prices to our clients".
The question that should be asked is as a user, why would I want to sign up with HRT or any similar financial company? The answer for HRT is because you want to have access to more complicated financial derivatives - you don't need to sign up and pay fees to buy basic stock.
So they promise that their algorithms give the user the best price, which is a legal way of saying that you will pay less for a certain asset and make money on it, and you can't say that because you can't guarantee this.
And it's well known that nobody ever gets rich off an algorithm in finance unless they are a well-established large firm, intricately tied to the government, that can move so much money as to influence trading.
Being a market maker vs "We have a fancy trading algorithm that statistically is never going to outperform just buying VOO and holding it, but the thing is if you get lucky, it could" are two very different things lol
Moreover, I would be very surprised if the majority of their $8 billion annual profit came from client market making.
Right, exactly what I said. They don't make money by market making. They make money by charging transaction fees, and on an access basis to their "algorithms" which are designed against analyzing the complex futures that they are the market maker for.
The incentive for users to sign up with them is to get access to "better" pricing for whatever commodity they pair the buy/sell orders for - but remember these are futures so it's all betting, and so the algorithms don't really mean anything.
They don’t charge fees, because they’re not a brokerage or exchange.
They pay fees to exchanges.
As a market maker, some rebates are given back conditional on their activity.
They have no users.
You’re just constantly obliviously asserting falsehoods that betray an almost comical lack of understanding of the reality of these businesses.
My understanding of these firms is limited too, but I’ve never heard of market makers charging transaction fees.
Isn’t it actually the opposite? They pay for order flow instead? They should be making money from the bid-ask spread, not fees.
> Right, exactly what I said. They don't make money by market making
“client market making”. That’s very different from “market making”, you two are not in agreement at all here
Prop trading firms usually have returns higher than 25%, which is way higher than holding the S&P.
You're confusing prop shops and hedge funds.
Runtime imports are a maintenance nightmare and can quickly fragment a codebase. Static analysis of imports is so desirable that it is almost always worth the initialization performance hit. Tradeoffs.
Static analysis of imports should be solved by mypy (or your favorite static analyzer).
I guess you meant "run all imports at startup is desirable to check if they work", but I have a hard time agreeing with that (personally I think having a good test suite is needed, whereas running more code at startup is not wanted).
>Runtime imports are a maintenance nightmare and can quickly fragment a codebase
I agree, which is why you should design your modules correctly and import only the stuff you need.
I was pointing out that lazy imports vs runtime imports in functions are basically the same and lead to the same issues.
> I agree, which is why you should design your modules correctly and import only the stuff you need.
How do you achieve this without making gazillions of modules, where each module has just a few things in it?
Are you saying just use local imports everywhere?
You can have one module with all the capability, but the capability is separated into submodules. And the __init__.py of the module doesn't auto import the submodules
So when you do a plain import bigmodule, it doesn't do anything heavy, and you may only have some small top-level things available, like bigmodule.config or bigmodule.logging. Then you have your big initializer code in bigmodule.financedata, but the stuff you need for running scripts is in bigmodule.scripts.
So if you write something like import bigmodule.financedata, that code will take a while. But if you write import bigmodule.scripts, it will load fast.
You don't need a gazillion modules, just good organization. Also, in general, it's good practice to gate intensive compute/network operations behind an explicit function you need to call.
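A minimal, self-contained sketch of that layout. It builds a throwaway bigmodule package in a temp directory; the file contents (the 0.2 s sleep, the `greet` helper, the `prices` dict) are made up for illustration and only the submodule names come from the description above.

```python
import os
import sys
import tempfile
import textwrap

# Build a throwaway package on disk so the example is self-contained.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "bigmodule")
os.makedirs(pkg)

files = {
    # __init__.py deliberately imports no submodules.
    "__init__.py": "config = {'env': 'dev'}\n",
    # Stand-in for a submodule with expensive import-time work.
    "financedata.py": textwrap.dedent("""
        import time
        time.sleep(0.2)  # pretend to load a large dataset
        prices = {'TSLA': 329.0}
    """),
    # Lightweight helpers that scripts need.
    "scripts.py": "def greet():\n    return 'hello'\n",
}
for name, source in files.items():
    with open(os.path.join(pkg, name), "w") as f:
        f.write(source)

sys.path.insert(0, root)

import bigmodule.scripts  # fast: runs __init__.py and scripts.py only

# The heavy submodule has not been touched yet.
loaded_before = "bigmodule.financedata" in sys.modules
print(bigmodule.scripts.greet(), loaded_before)

import bigmodule.financedata  # slow: the 0.2 s cost is paid only when asked for
```

The point is that the slow part is only paid by code that explicitly names bigmodule.financedata, and the import statement that triggers it is easy to find and time.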
Also thank you for focusing the convo on the tech stuff instead of repeating finance bro myths
That sounds good in theory, but I haven’t come across a codebase that’s really optimized and free of unnecessary imports, especially in large companies.
And when you have to depend on external libraries beyond your control, how do you typically handle those situations?
As codebases get larger, inefficiencies are bound to happen. Having explicit imports that I can go look up and see where they are is better for resolving this, because you can trace how every module gets imported and time them. Having interpreter code that runs with lazy loading that is all hidden is not the way to solve this.
As for external libraries, you import them only in the places where you need to use them, to avoid the same pitfalls. It's also pretty easy to analyze the import process within those libraries, and then again import only specific submodules to limit what actually gets loaded.
[flagged]
When disagreeing, please reply to the argument instead of calling names. "That is idiotic; 1 + 1 is 2, not 3" can be shortened to "1 + 1 is 2, not 3."
Please don't post shallow dismissals...
https://news.ycombinator.com/newsguidelines.html
While I see the usefulness of lazy imports, it always seemed a bit backward to me for the importer to ask for lazy import, especially if you make it an import keyword rather than a Python flag. Instead I'd expect the modules to declare (and maybe enforce) that they don't have side effects, that way you know they can be lazily imported, and it opens the door for more optimizations, like declaring the module immutable. That links to the performance barrier of Python due to its dynamic nature as discussed in https://news.ycombinator.com/item?id=44809387
Of course that doesn't solve the overhead of finding the modules, but that could be optimized without lazy import, for example by having a way to pre-compute the module locations at install time.
> it always seemed a bit backward to me for the importer to ask for lazy import, especially if you make it an import keyword rather than a Python flag
Exactly this. There must be zero side effects at module import time, not just for load times, but because the order of such effects is 1) undefined, 2) heavily dependent on an import protocol implementation, and 3) poses safety and security nightmares that Python devs don't seem to care much about until bad things happen at the most inconvenient time possible.
> Of course that doesn't solve the overhead of finding the modules, but that could be optimized without lazy import, for example by having a way to pre-compute the module locations at install time.
1) opt for https://docs.python.org/3/reference/import.html#replacing-th...
2) pre-compute everything in CI by using a solution from (1) and doing a universal toplevel import of the entire Python monorepo (safe, given no side effects).
3) This step can be used to scan all toplevel definitions too, to gather extra code metadata useful for various dynamic dispatch at runtime without complex lookups. See for example: https://docs.pylonsproject.org/projects/venusian/en/latest/i...
4) put the result of (2) and (3) into a machine-readable dump, read by (1) as the alternative optimised loading branch.
5) deploy (4) together with your program.
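Steps (1) and (2) could be sketched roughly like this: a meta path finder that resolves modules from a precomputed name-to-path map instead of scanning sys.path. `ManifestFinder`, the `manifest` dict, and the `precomputed_demo` module are hypothetical names for illustration; in a real setup the map would presumably be a dump produced in CI rather than built inline.

```python
import importlib.abc
import importlib.util
import os
import sys
import tempfile


class ManifestFinder(importlib.abc.MetaPathFinder):
    """Resolve modules from a precomputed {name: path} map."""

    def __init__(self, manifest):
        self.manifest = manifest

    def find_spec(self, fullname, path=None, target=None):
        location = self.manifest.get(fullname)
        if location is None:
            return None  # not in the manifest: fall through to regular finders
        return importlib.util.spec_from_file_location(fullname, location)


# Stand-in for the CI step: write one module and record where it lives.
root = tempfile.mkdtemp()
module_path = os.path.join(root, "precomputed_demo.py")
with open(module_path, "w") as f:
    f.write("ANSWER = 42\n")
manifest = {"precomputed_demo": module_path}

# Put the finder first so manifest hits never reach the path-based search.
sys.meta_path.insert(0, ManifestFinder(manifest))

import precomputed_demo  # found via the manifest; `root` is not on sys.path

print(precomputed_demo.ANSWER)
```

Anything missing from the manifest still goes through the normal import machinery, so the optimised branch can be rolled out incrementally.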
For optimizing the module finding, using a custom import hook was indeed what I had in mind!
> This process gets dramatically slower for … modules on distributed file systems, modules with slow side-effects
Oh no. Look I'm not saying you're holding it wrong, it's perfectly valid to host your modules on what is presumably NFS as well as having modules with side effects but what if you didn't.
I've been down this road with NFS (and SMB if it matters) and pain is the only thing that awaits you. It seems like they're feeling it. Storing what is spiritually executable code on shared storage was a never ending source of bugs and mysterious performance issues.
Gonna call this an antipattern. Do you need all those modules imported in every script? Well then you save nothing on load time; the time will be spent regardless. Does every script not need those imports? Well, then they shouldn't be importing those things, and this small set of top-level imports should be curated into a better, more fine-grained list (and if you want to write tools, you can certainly identify these patterns using tooling similar to what you wrote for LazyImports).
There are often large programs where not every invocation imports every module.
The lazy import approach was pioneered in Mercurial I believe, where it cut down startup times by 3x.
Or, here's an idea: don't write a CLI on the hot path of a developer's flow in a scripting language. No wonder it lost out
I don't disagree, but I mean it was either that or C + shell back in the early 2000s, and C + shell is notorious for its non-portability across Unix and Windows—partly why Git on Windows requires an entire MSYS installation.
Today, it would be a mistake to use anything other than Rust (hence Jujutsu carrying the flame forward).
For personal one file utility scripts, I'll sometimes only import a module on a code path that needs it. And make it global if the scope gets in the way.
It's dirty, but speeds things up vs putting all imports at the top.
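Something like this, as a minimal sketch (the function names are made up for the example):

```python
def export_rows(rows):
    # Import only on the code path that needs it. `global` rebinds the
    # name at module scope so other functions can use `csv` afterwards.
    global csv
    import csv
    import io

    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue()


def csv_is_bound():
    # True once export_rows() has run and published `csv` at module scope.
    return "csv" in globals()
```

Scripts that never call `export_rows` never pay for the csv import at all.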
I wonder how much can be saved by using a local file system for imports though. In my testing, the mere presence of a home directory on NFS already dramatically slows down imports (by ~10x), because Python searches for modules in the home directory too by default.
To prevent this, setting PYTHONNOUSERSITE=1 will stop Python from searching for modules in ~/.local/. (For convenience, try calling python through a wrapper in your project, say bin/run-python, where you can set all the Python-specific environment variables you need at execution time without having to worry about setting them in the user's shell, etc.)
Thanks, yeah I know that it works, what I meant is that it may be quite easy to compare the module import times with and without that env variable to see how much impact an NFS home directory has (and it's a lot), and possibly draw similar conclusions about the distributed file system behaviour in general too
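A rough way to run that comparison, as a sketch: time a fresh interpreter with and without PYTHONNOUSERSITE set. The `time_startup` helper and the choice of `import json` as the workload are assumptions for illustration; on a machine with a local home directory the two numbers will likely be close.

```python
import os
import subprocess
import sys
import time


def time_startup(extra_env, statement="import json", repeats=3):
    """Best-of-N wall time for a fresh interpreter running `statement`."""
    env = dict(os.environ)
    env.pop("PYTHONNOUSERSITE", None)  # start from a known state
    env.update(extra_env)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", statement], env=env, check=True)
        best = min(best, time.perf_counter() - start)
    return best


with_user_site = time_startup({})
without_user_site = time_startup({"PYTHONNOUSERSITE": "1"})
print(f"user site enabled:  {with_user_site * 1000:.1f} ms")
print(f"user site disabled: {without_user_site * 1000:.1f} ms")
```

For a per-module breakdown rather than a single total, `python -X importtime` prints the time spent in each import.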
It'd have been really nice to have that PEP in as it'd have helped me not have to write local imports everywhere.
As it is, top-level imports IMHO are only meant to be used for modules required to be used in the startup, everything else should be a local import -- getting everyone convinced of that is the main issue though as it really goes against the regular coding of most Python modules (but the time saved to start up apps I work on does definitely make it worth it).
Yeah, imo that's the way that python should've worked in the first place.
Import-time side effects are definitely nasty though and I wonder what the implications on all downstream code would be. Perhaps a lazy import keyword is a better way forward.
The author interviewed me and talked about this project, so it was cool seeing a blog post posted about it
We need something like the ancient unexec from Emacs to dump out Python images. More generally, we need something like that for generic checkpointing, maybe based on CRIU.
> we support the Steering Council in their rejection of PEP 690—the implicit lazy imports are not a good fit for upstream due to the same, subtle bugs we encountered during our migration. However, as time permits, we hope to propose a revised lazy imports PEP that introduces an explicit lazy keyword, e.g. lazy import foo or lazy from foo import bar. This approach will satisfy migration and compatibility concerns, allow users to opt-in gradually, and enable all Python users to reap the speed benefits of lazy imports in a safe way.
> python
> monorepo
> vast proliferation of imports
> large modules
> distributed file system
> side-effects
> many transitive imports
This sounds like a very optional problem to have.
Trading companies are really disciplined about their tech stack.
If only compiled languages with dead code elimination existed...
I know a few modules that can take seconds to import, but it would have been nice to hear how much they actually gained.
Also, maybe this approach could yield stats on whether some import was needed or not?
> There’s also no way to make imports of the form from module import * lazy
I'd say if you see
It's probably 99% safe to pull that from a quick run over the AST (and cache that for the later import if you want to be fancy). Of course, should one be doing a star import in a proper codebase?
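A sketch of what a "quick run over the AST" might look like, using only the stdlib `ast` module. `star_import_names` is a hypothetical helper, and it only handles the easy cases: a literal `__all__`, or else top-level public names. Anything computed at runtime is reported as unknown.

```python
import ast


def star_import_names(source):
    """Guess the names `from module import *` would bind, without running code.

    Returns a list of names, or None if __all__ is computed dynamically.
    """
    tree = ast.parse(source)
    public = []
    for node in tree.body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if not isinstance(target, ast.Name):
                    continue
                if target.id == "__all__":
                    # A literal __all__ wins outright.
                    try:
                        return list(ast.literal_eval(node.value))
                    except (ValueError, TypeError, SyntaxError):
                        return None  # computed __all__: unknown statically
                if not target.id.startswith("_"):
                    public.append(target.id)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if not node.name.startswith("_"):
                public.append(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            # Imported names are module attributes too, so `*` re-exports them.
            for alias in node.names:
                name = (alias.asname or alias.name).split(".")[0]
                if name != "*" and not name.startswith("_"):
                    public.append(name)
    return public
```

The remaining 1% is the usual suspects: names created in conditionals, via `globals()`, or by a module-level `__getattr__`.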
Hmm, it strikes me that if they really wanted to go this lazy route, they could've implemented an import hook, instead of creating and maintaining an entire fork.
Yeah I wonder what led them down the drastic fork trajectory rather than considering this approach… kind of interesting this wasn’t even acknowledged in the article
I thought HRT was a Cpp shop? Is Python used in their main business applications, or more for quants / data scientists?
every quant shop has QR and QT people that can barely write passable python let alone cpp - then the QD people have to integrate that stuff with prod cpp pipelines.
alpha in being good at both (if nothing else you can keep more of the desks pnl...)
In my experience it tends to be the opposite — I am not a quant (QD) but having worked with a few teams there’s a negative selection for technical expertise. QRs who are good at programming are usually pushed into maintaining infrastructure, datasets, or just tooling for less technical members of their team, who then get to use those tools to further their own alpha generation. Orgs incentivize the final step in making alpha — spend too much time helping others or building reusable research, and your coworkers steal the thunder.
That, or stop helping your coworkers/accommodating them… risky, as a career move. Only seen that work once.
I really really want lazy imports in Python, it would be a godsend for CLIs
Libraries for this have always existed, triggering import on first access. The problem was, they would break linters. But that's not an issue anymore with typing.TYPE_CHECKING.
A PEP is very much welcome, but using lazy import libraries is a fairly common, very old, method of speeding things up. My pre PEP 690 code looks like this:
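As a sketch of the general technique (not necessarily the exact code this comment referred to), here is a version built on the stdlib's `importlib.util.LazyLoader`, following the recipe in the importlib documentation, combined with `typing.TYPE_CHECKING` so linters still see a normal import. `lazy_import` is a hypothetical helper name and `colorsys` is just a small stdlib module picked for the demo.

```python
import importlib.util
import sys
from typing import TYPE_CHECKING


def lazy_import(name):
    """Return a module whose body runs on first attribute access."""
    if name in sys.modules:
        return sys.modules[name]
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # sets up deferral; does not run the module yet
    return module


if TYPE_CHECKING:
    # Type checkers and linters see an ordinary import.
    import colorsys
else:
    # At runtime the module body is deferred until first use.
    colorsys = lazy_import("colorsys")

print(colorsys.rgb_to_hsv(1.0, 0.0, 0.0))  # first attribute access triggers the load
```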
Well, if you use argparse or one of the many argparse wrappers for a moderately complex CLI, you end up lazifying the CLI parser itself, because just fully populating the argparse data structures can easily take half a second or more, so with other startup costs you easily end up with "program --help" taking >1s and any CLI parsing error also taking >1s.
Imagine if these guys put their intelligence towards improving the world.
Not sure why you're throwing shade on people making a living. Go lobby your representative if you think the financial market should be changed instead of belittling folks doing a job.
Oh, it's a trading firm. That's why they can fund an internal fork of Python... That sounds nice...