So here's the thing I struggle with. I do a lot of work in jupyter notebooks. I come up with a new model or approach to some problem, and I want to fork out and test a hypothesis in the background (which might be some set of hyperparameters, and might take several minutes, or hours; call it Run A) while continuing to work down some other path in the same notebook, and maybe kick off a Run B that explores some other change (like a restructure of the code that's not "compatible" with the hyperparameter search of Run A).

Then at some point when Run A finishes, I want to incorporate the changes I made in Run B and kick off Run C, and so on.

The hard/important things are:

1) Being able to do this while staying in a Jupyter notebook context the whole time. Even something as simple as multiprocessing sucks because I've found it's too hard to manage in a Jupyter context (e.g. how do you handle where stdout and stderr go?). It's easier if you move to scripts where you have full support for this sort of thing and you are expecting to look at multiple log files on disk and whatnot.
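
For concreteness, this is roughly the plumbing I end up writing when I do try it (train_model is hypothetical), and it's exactly the look-at-log-files-on-disk workflow I'd rather avoid:

    import multiprocessing as mp
    from contextlib import redirect_stdout, redirect_stderr

    def _run_with_log(fn, params, log_path):
        # The child's stdout/stderr never reach the notebook's output area,
        # so they have to be captured to a file by hand.
        with open(log_path, "w", buffering=1) as log, \
             redirect_stdout(log), redirect_stderr(log):
            fn(params)

    def run_in_background(fn, params, log_path):
        # Relies on the 'fork' start method (the Linux default); under 'spawn'
        # a function defined in the notebook itself generally can't be pickled.
        proc = mp.Process(target=_run_with_log, args=(fn, params, log_path))
        proc.start()
        return proc  # now you're tailing a log file and polling proc.is_alive()

    # Hypothetical usage from a cell:
    # run_a = run_in_background(train_model, {"lr": 1e-3}, "run_a.log")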

Also the sequential nature of notebooks doesn't help when you want to occasionally fork out or conditionally run stuff.

2) Keeping track of all these changes and hypotheses and merging the results/code together as you learn. It's like you need a VCS for your hypotheses. Maybe hydra & wandb help with that, I haven't used them. But this idea of keeping track of hypotheses seems like the more fundamental thing.

3) The main reason I prefer to stay in a notebook context is because I have all my objects easily accessible. My models, all my dataframes, functions to do some ad-hoc charting etc, all super easy to access in a REPL-like form. That is invaluable for doing ad-hoc sanity checks or digging/drilling down. So a big part of the workflow is you basically have this in-memory database of a bunch of relevant objects and you're querying it and constructing new objects & visualisations using Python as your tool, without having to load things from disk or build up the context from scratch. It's all "just there".

4) And then sometimes you want to take the results X1 of that notebook and plot them against some entirely different set of data X2 that requires a whole bunch of other code that you've defined in some other notebook somewhere, or maybe even as a real Python module. Like maybe that data lives in a database and you transform it or something. So OK, you call some functions to load X2 within your original notebook, but BOOM you get an OOM and you're like ok now I have to write some code to serialise X1 to disk, and make YET ANOTHER notebook so I can go analyze X1 and X2. It all just seems so... unnecessary, if only the right tooling existed.
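
To be clear, the serialisation step itself is trivial, something like this if X1 is a dataframe (assuming pyarrow is installed for parquet); it's the context switch and the extra notebook that hurt:

    import pandas as pd

    # In the original notebook: dump the in-memory results before loading X2.
    X1.to_parquet("x1_results.parquet")

    # In the follow-up notebook: reload X1 alongside the freshly loaded X2.
    X1 = pd.read_parquet("x1_results.parquet")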

My current best approach is to use semantic versioning on the filename: copy the whole notebook each time I make a fundamental change, keep track of my hypotheses, preconditions, learnings etc. in comments, and have a few of those on the go at once. But it's often hard to engage in critical thinking when everything you know is sprawled across multiple notebooks.

Maybe a simple global journal is the only thing for this sort of use case. And that doesn't even address (4) which is often a huge pain point. Can anyone think of something better?

For me, those side-investigations are often physical experiments, which run on their own time scale. Plus they often run on another computer, to reduce the risk of crashing and losing data, or just physical proximity to the experiment.

How I tie those threads together is by the data that they generate. I use ASDF because it works for the kind of stuff I'm doing, but choose your poison. Once the data are in the bag, the cells that analyze or report the results can stay in the same notebook, or be copied into your main notebook. My data aren't so huge that there's much of a penalty in re-loading them.
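
If it helps, the handoff is only a few lines with ASDF (the tree keys below are made up and the arrays are stand-ins for real measurements):

    import asdf
    import numpy as np

    # Experiment side: bag the raw data plus a little metadata.
    tree = {
        "run_id": "run_042",
        "voltage": np.linspace(0.0, 5.0, 500),
        "current": np.random.default_rng(0).normal(size=500),
    }
    asdf.AsdfFile(tree).write_to("run_042.asdf")

    # Analysis side: reload and copy out what you need while the file is open.
    with asdf.open("run_042.asdf") as af:
        v = np.asarray(af.tree["voltage"])
        i = np.asarray(af.tree["current"])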

For me, reproducibility is more important than organization, because I'm not all that organized anyway. So, a single master notebook at the end of a study isn't my top goal.

  > I do a lot of work in jupyter notebooks.
I don't think I'll have good advice for you if you want to use jupyter notebooks; hopefully someone else will. I *hate* notebooks. I think they are great for reports or for demos (especially when teaching), but I honestly do not get how people use them in research or general programming.

I will use vim and ipython though. If I'm on a remote machine I'll use tmux (to preserve sessions), but locally I use Ghostty[0]. I can iterate through code this way without the notebook and am far less likely to get caught out by out-of-order execution. I get all the ad-hoc, persistent-memory, auto-updating-function benefits with ipython (autoreload). But vim and ipython don't require me to lose my modularity, organization, automated record keeping, and the rest.
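
For anyone who hasn't used it, the autoreload bit is just two magics at the top of the ipython session (the module and function names below are made up):

    %load_ext autoreload
    %autoreload 2            # re-import edited modules before each statement runs

    import experiments                    # edit experiments.py in vim...
    results = experiments.sweep(lr=1e-3)  # ...and the new code is picked up, no restart

    # %run -i drops a script's objects straight into the session namespace too:
    %run -i analyze.py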

If it works for you, keep it! I'm just saying what works for me (any switch will cause some disruption). But I do want to stress that there are tons of other options for keeping things in memory without using jupyter notebooks (pdb is also a fantastic tool!). But be careful, because persistent memory can easily bite you in the ass too: it's easy to forget what's still in memory. I'll also add that, having been the person who maintains our lab's compute systems, I'm wildly annoyed with notebook and VSCode users leaving their workloads in memory. This is a user thing more than a tool thing, but there's a tendency here and it eats up resources that other people need. Just make sure to disconnect when you leave the desk.

  > I want to fork out and test a hypothesis in the background 
But my process does greatly help with this! IMO you should be trying to run experiments in parallel. Operating in this style, there's no forking; you're just launching another job. I like using a job launcher like slurm when I have multiple machines, but a simple bash script that launches the runs is often more than good enough.
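
For what "just launching another job" can look like without slurm, here's a minimal sketch in Python (the script name and flags are made up; a few lines of bash does the same thing):

    import subprocess
    from pathlib import Path

    Path("logs").mkdir(exist_ok=True)

    # Hypothetical sweep: each run is its own OS process with its own log file,
    # so nothing depends on a notebook kernel (or this shell) staying alive.
    procs = []
    for lr in (1e-2, 1e-3, 1e-4):
        log = open(f"logs/lr_{lr}.log", "w")
        proc = subprocess.Popen(
            ["python", "train.py", "--lr", str(lr), "--out", f"results/lr_{lr}"],
            stdout=log, stderr=subprocess.STDOUT,
        )
        procs.append((proc, log))

    for proc, log in procs:
        proc.wait()   # or skip this and just check the logs whenever
        log.close()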

The point is to not fork. You should clone. With changes, I suggest using git branches. But if your code is written to be modular and flexible it is often really quick and easy to add new functions to handle different tasks, add new types of measurements, or whatever.

The two big reasons to write the way I do are that

  1) It is (partially) self-documenting. You don't have to think about writing down and remembering all your hyperparameters. I'm going to forget, so I automate that to prevent it (see the sketch further down).
  2) I'm running experiments! I may be dumb, but I'm not so dumb that I think I won't change details of my experiments as the project matures.
That's why I say it is about being lazy. I write the way I do because I know that, whether it is tomorrow, next week, or six months from now, I'm going to need to make changes that leave things very different from where they are today. I don't think of it as having foresight about the future so much as being frustrated at constantly having to dig myself out of a hole; this way that's a lot easier and I get back to the fun exploration stuff faster. It is 100% about having version control over my hypotheses and experiments.
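
Here's a rough sketch of what I mean by "(partially) self-documenting" (every name in it is hypothetical): the script takes its hyperparameters on the command line and writes the exact config it ran with next to its results, so the record keeping happens whether I remember or not.

    import argparse
    import json
    import time
    from pathlib import Path

    def parse_args():
        p = argparse.ArgumentParser(description="One experiment = one invocation.")
        p.add_argument("--lr", type=float, default=1e-3)
        p.add_argument("--batch-size", type=int, default=64)
        p.add_argument("--out", type=Path, default=Path("results/run"))
        return p.parse_args()

    def main():
        args = parse_args()
        args.out.mkdir(parents=True, exist_ok=True)
        # Every hyperparameter the run actually used gets saved with its results,
        # so nothing depends on my memory six months from now.
        config = {**vars(args), "out": str(args.out), "started": time.ctime()}
        (args.out / "config.json").write_text(json.dumps(config, indent=2))
        # ... run the actual experiment and write metrics into args.out ...

    if __name__ == "__main__":
        main()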

So I'd argue you should move away from notebooks and toward tools better suited to the job. It'll definitely cause disruption and you're definitely going to be slower at first, but find what works for you.

The reason people love tools like vim, or love working in the CLI, is that they are modifiable. There's no one tool that works for everyone; I'm not sure there's even a tool that works out of the box for any one person (maybe the original dev?)! But there's a ton of power in having tools I can adapt to me and the way I work. I can make them help me catch my common mistakes and highlight the things I care about. You don't need to spend hours on this stuff; it develops over time. Go into any workshop and you'll see that everyone has modified their tools to fit them. We're programmers: we have way more flexibility over customization than people working with physical stuff. Use that to your advantage. And truthfully, you can see how that idea becomes circular here: I'm just designing my code and experiments to be like my tools, environments to be shaped.

[0] https://ghostty.org/