> whole gestalt of what you do on a computer is now one big programmable surface
The Emacs effect. Use it long enough and everything becomes "one big programmable surface". I've been working this way for years. Emacs is my "control room": I don't necessarily do everything in Emacs, but everything converges on it and flows through it. I control my WM directly from a REPL inside Emacs. I can grab content from a browser tab - I have access to my browser history and all open tabs, and I can switch to any tab, close tabs, and reorder them. I can grab a text selection on a page, or extract the entire readable text of an article while ignoring all the irrelevant fluff - banners, ads, buttons, etc. It works even for JS-rendered content (React et al.). I control all video playback directly from Emacs, even though the video itself plays outside, in mpv. I can still pause, change the volume, fast-forward, speed up, extract the transcript, etc., all without leaving Emacs. That's pretty useful when taking notes. I can grab any text I see on the screen, even if it's in Slack.app. Why not? If I can read it, there's no reason why Emacs shouldn't be able to. I can grab any region of my screen with Flameshot; it goes through Emacs, which runs tesseract and OCRs the text out of it. Useful when someone's screen sharing in Zoom. This was all possible before LLMs. Now, LLMs running in Emacs can do some crazy, wild stuff.
Wild!! Would love to look at your .emacs.d configs
mpvi [1] is the video control part. I have only used it a little bit but it is incredibly good. Control the playback completely from Emacs and quickly make timestamped org notes.
I don't know what the other parts are. Curious to learn!
[1]: https://github.com/lorniu/mpvi
> what the other parts are
Of course, I can't explain everything I listed in detail in a single comment. Besides, that's just a subset of the things I do through Emacs.
Sometimes I want to kick off a process in an external terminal - long-lived processes are better handled that way. Kitty has a remote control protocol. I needed it to work bidirectionally - piping from an arbitrary Emacs buffer into the terminal and from the terminal back into a buffer - so I wrote Piper¹. This kind of stuff should be built into Emacs. Maybe someone gets bored and sends patches, or perhaps I'll get to it myself one day.
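For anyone curious what the Kitty side of this looks like, here is a minimal Python sketch of the two directions. It is not how Piper works (I haven't described its internals here); it only wraps the `kitty @ send-text` and `kitty @ get-text` remote control commands. It assumes remote control is enabled in kitty's config, and the function names and the `match` values (e.g. `"id:1"`) are made up for illustration.

```python
import subprocess


def kitty_send_text_cmd(match, socket_addr=None):
    """Build the argv for sending text (read from stdin) to a kitty window."""
    cmd = ["kitty", "@"]
    if socket_addr:
        # Only needed when talking to kitty from outside of it, e.g. from Emacs.
        cmd += ["--to", socket_addr]
    return cmd + ["send-text", "--match", match, "--stdin"]


def kitty_get_text_cmd(match, extent="screen", socket_addr=None):
    """Build the argv for reading text back out of a kitty window."""
    cmd = ["kitty", "@"]
    if socket_addr:
        cmd += ["--to", socket_addr]
    return cmd + ["get-text", "--match", match, "--extent", extent]


def pipe_to_terminal(text, match, socket_addr=None):
    """Send `text` (e.g. the contents of a buffer) into the matched window."""
    subprocess.run(kitty_send_text_cmd(match, socket_addr),
                   input=text, text=True, check=True)


def pipe_from_terminal(match, extent="screen", socket_addr=None):
    """Return the text currently shown in the matched window."""
    result = subprocess.run(kitty_get_text_cmd(match, extent, socket_addr),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Something like `pipe_to_terminal("make test\n", "id:1")` would type the command into window 1, and `pipe_from_terminal("id:1")` would read the screen back.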
I gravitate towards CLIs and tools with built-in IPC layers; Emacs is terrific at inter-process communication. That's how I manage my WM on Mac, where I use Hammerspoon. It's a wondrous piece of software on its own: it is Lua-enabled, which means I can use Fennel, which means I can have a Lispy REPL, which means I can connect to it directly from Emacs and manipulate all my windows, among a bunch of other things.
On Linux, I have built a similar modal toolkit² (experimental) written in Babashka (Clojure), which means I can expose an nREPL port and use it with a Lispy REPL. If you don't know what the big deal is about it being a Lisp REPL, here's my comment from the other day³.
mpv is amazingly hackable and has an IPC interface, so you can control it almost fully without touching it. I have built some customizations on top of mpv.el.
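To give a feel for how little is involved, here is a rough Python sketch of talking to mpv's JSON IPC directly. This is not what mpv.el does internally, just the same protocol by hand: mpv is started with `--input-ipc-server=<path>`, and each message is one JSON object per line. The socket path and function names are my own placeholders.

```python
import json
import socket


def encode_mpv_command(*args, request_id=None):
    """Encode one mpv JSON IPC message: a single JSON object ending in a newline."""
    message = {"command": list(args)}
    if request_id is not None:
        message["request_id"] = request_id
    return (json.dumps(message) + "\n").encode("utf-8")


def send_mpv_command(socket_path, *args):
    """Send one command to a running mpv and return the matching reply as a dict."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(encode_mpv_command(*args, request_id=1))
        buffer = b""
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                return None  # mpv closed the connection before replying
            buffer += chunk
            while b"\n" in buffer:
                line, buffer = buffer.split(b"\n", 1)
                if not line.strip():
                    continue
                reply = json.loads(line)
                # mpv also pushes event lines; skip anything that is not our reply.
                if reply.get("request_id") == 1:
                    return reply
```

With mpv running as `mpv --input-ipc-server=/tmp/mpvsocket video.mkv`, a call like `send_mpv_command("/tmp/mpvsocket", "set_property", "pause", True)` pauses playback, and `send_mpv_command("/tmp/mpvsocket", "get_property", "time-pos")` returns the current position, which is the piece you need for timestamped notes.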
To access the browser, I use OSA (Open Scripting Architecture) with some JXA snippets. Unfortunately, there's nothing similar for Linux - the only thing one can do is run the browser with the remote debugging port exposed. You can still access the browser history, though - every major browser keeps it in a SQLite db⁴.
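The history part is easy to try yourself. Below is a small Python sketch, not the code from browser-hist.el, that reads recent entries from a Chromium-style `History` file. The assumptions: the `urls` table with `url`, `title` and `last_visit_time` columns is the Chromium layout (Firefox keeps its history in `places.sqlite` with a different schema), and the file is copied first because the browser usually holds a lock on it while running. The function names are mine.

```python
import shutil
import sqlite3
import tempfile
from datetime import datetime, timedelta, timezone
from pathlib import Path

# Chromium stores timestamps as microseconds since 1601-01-01 UTC.
CHROMIUM_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def chromium_time(microseconds):
    """Convert a Chromium timestamp to an aware UTC datetime."""
    return CHROMIUM_EPOCH + timedelta(microseconds=microseconds)


def recent_history(db_path, limit=20):
    """Return the most recently visited (url, title, visited_at) rows."""
    with tempfile.TemporaryDirectory() as tmp:
        # Work on a snapshot so a running browser's lock does not get in the way.
        snapshot = Path(tmp) / "History"
        shutil.copy2(db_path, snapshot)
        conn = sqlite3.connect(str(snapshot))
        try:
            rows = conn.execute(
                "SELECT url, title, last_visit_time FROM urls "
                "ORDER BY last_visit_time DESC LIMIT ?",
                (limit,),
            ).fetchall()
        finally:
            conn.close()
    return [(url, title, chromium_time(ts)) for url, title, ts in rows]
```

Where the `History` file lives depends on the browser and OS, so pass the path for your own profile.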
OCRing any text is a straightforward piece of Elisp⁵ - it just checks whether there's an image in the clipboard and, if there is, saves it to a temp file and feeds it to the tesseract CLI. It's not as accurate as most modern OCR tools, but it beats everything else in speed. For my purpose (typically grabbing a piece of a Zoom screen share), it works.
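The same idea outside of Emacs, as a hedged Python sketch rather than a port of my Elisp: take image bytes, write them to a temp file, and run `tesseract <file> stdout`. Reading the clipboard is platform-specific, so it is left out and the image bytes are simply passed in; the function names and the default `eng` language are assumptions.

```python
import os
import subprocess
import tempfile


def tesseract_command(image_path, lang="eng"):
    """Build the argv that makes tesseract print recognized text to stdout."""
    return ["tesseract", image_path, "stdout", "-l", lang]


def ocr_image_bytes(image_bytes, lang="eng", suffix=".png"):
    """Write image bytes to a temp file, OCR it with tesseract, return the text."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(image_bytes)
        result = subprocess.run(
            tesseract_command(path, lang),
            capture_output=True,
            text=True,
            check=True,  # raise if tesseract exits with an error
        )
        return result.stdout.strip()
    finally:
        os.remove(path)  # always clean up the temp file
```

Feed it whatever your screenshot tool produces, e.g. `ocr_image_bytes(open("shot.png", "rb").read())`.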
Hope this helps.
___
¹ http://github.com/agzam/mxp
² https://github.com/agzam/tecla
³ https://news.ycombinator.com/item?id=48113368
⁴ https://github.com/agzam/browser-hist.el
⁵ https://github.com/agzam/.doom.d/blob/main/modules/custom/wr...