One of the biggest boosts to my SWE career was studying the theory of computation and programming language theory. My major was electronic engineering, so I didn't touch those at university. But I used some books to at least grasp the introductory knowledge.

The boost was how easy it became to see the mechanism behind certain abstractions. Instead of being these amazing, complex tools you have to use carefully, they became easier to use and reason about. I'd compare it to the study of form, value, and perspective for an artist, or music theory for a musician. It just makes everything easier. Especially understanding automata (regex), context-free grammars (programming languages), and the Turing machine (CPU).
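
To make the first of those concrete: a regex like a b* c is literally just a small state machine. A toy sketch in Python (the state names and table layout are made up for illustration):

    # The regex  a b* c  written directly as a deterministic finite automaton.
    # State names and the transition-table format are invented for this sketch.
    TRANSITIONS = {
        ("start", "a"): "saw_a",
        ("saw_a", "b"): "saw_a",   # the b* loop
        ("saw_a", "c"): "accept",
    }

    def matches(text):
        """True iff text is in the language of the regex a b* c."""
        state = "start"
        for ch in text:
            state = TRANSITIONS.get((state, ch))
            if state is None:        # no transition defined: reject
                return False
        return state == "accept"

    assert matches("abbbc") and matches("ac")
    assert not matches("abb") and not matches("abcc")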

I have to vote the opposite direction. I don't think it's ever been relevant.

As an example, a practical walkthrough of what a simple CPU is doing was quite useful for getting the idea of what's going on. The concept of a Turing machine, which was originally intended for mathematical proofs, was not.

The CPU is an implementation of theory. It just happens that several of the Turing machine's axioms can be realized physically, and the rest still hold. The nice thing about notation is that it's more flexible than building/creating/executing the thing it describes.

If you look back at the history, CPUs were often designed for specific tasks, and some of the early ones lacked the conditional jumps that you'd normally consider necessary for a Turing machine.

We've made them more general-purpose because that sells better, not because people cared about making Turing machines.

A jump isn't required for a Turing machine, only the ability to reach a specific cell by sliding the head one step at a time. The conditional jump is an abstraction, just like looping.

[deleted]

I suspect you have forgotten this was a conversation about relevance in the day-to-day.

The TM is the basic theory. It's a more advanced finite automaton that is able to modify its own instructions. Those instructions can be generated from a context-free grammar. Both CFGs and FAs have a recursive nature that allows building more advanced mechanisms from those simple elements. And that's how you get CPUs (and more specialised ones like GPUs, NPUs, etc.) and programming languages.

There are other computation mechanisms like the lambda calculus and Hoare logic. They can all be mapped to each other. So what we usually do is invent a set of abstractions (an abstract machine) that can be mapped down to something TM-equivalent, denote those abstractions as primitives, and then build a CFG that we can use to instruct the machine.

In the case of CPUs, the abstractions are the control unit, the registers, the combinational logic, the memory, ... and the CFG is the instruction set architecture. On top of that we can find the abstract machine for the C programming language. On top of the latter you can find the abstract machines for languages like Common Lisp (SBCL) and Python.

It's turtles all the way down. You just choose where you want to stop: at the Python level, the C level, the assembly level, the ISA level, or the theoretical TM level.

A Turing machine isn't the only way to improve a finite automaton to make it universal. It doesn't have to be the bottom turtle.

I think it's pretty clear you just want to talk about the theory. You're simply ignoring what I'm saying.

A Turing machine works by going left or right, reading and writing on a tape of 1s and 0s. Your CPU works on RAM, reading and writing 1s and 0s.

A CPU in principle isn't that different; it's largely just using a more sophisticated instruction set. "Move 5" vs. right, right, right, right, right; "swap" vs. read, write, right, write; etc.
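
A toy sketch of that comparison in Python (the rule format and the run helper are invented here, not any standard encoding): "move right 5" spelled out as five single-cell moves.

    # Minimal single-tape Turing machine sketch. Each rule maps
    # (state, symbol_read) -> (symbol_to_write, head_move, next_state).
    def run(rules, tape, state="q0", head=0, blank="0", max_steps=1000):
        tape = list(tape)
        for _ in range(max_steps):
            if state == "halt":
                return "".join(tape)
            if head == len(tape):          # grow the tape on demand (right side only)
                tape.append(blank)
            write, move, state = rules[(state, tape[head])]
            tape[head] = write
            head += 1 if move == "R" else -1
        raise RuntimeError("step limit reached")

    # "Move 5" becomes five explicit right-moves, one state per step.
    move5 = {}
    for i in range(5):
        for sym in "01":
            nxt = f"q{i + 1}" if i < 4 else "halt"
            move5[(f"q{i}", sym)] = (sym, "R", nxt)   # rewrite the same symbol, step right

    print(run(move5, "1011010"))   # prints 1011010: tape unchanged, head moved 5 cells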

Definitely not necessary for programming but it's not some completely theoretical mumbo jumbo.

Moreover, what's really interesting to me is that Turing came up with the idea by following what he himself was doing as he did computation on graph paper.

> Definitely not necessary for programming but it's not some completely theoretical mumbo jumbo.

While I'm a big fan of teaching theory, I regret to inform you that the Turing machine is kind of completely theoretical mumbo jumbo. The theoretical equivalent of the modern processor is the Von Neumann machine. Certainly there is a direct connection to be made to the Turing machine, as all computation can be framed as a program run on such a theoretical device, but there is not really much practical use in teaching students that modern computers are Turing machines. There's just too much difference, I think.

The utility of the Turing machine as a pedagogical device stems more from, like, being able to formalize a decidable problem in a particular way, following a rigorous algorithmic approach to computation, thinking about abstraction, etc. I think the lambda calculus is also useful to teach for similar (though also different) pedagogical reasons, but I would never tell students that the lambda calculus "in principle isn't that different" from how their favorite language works. Practically all programming languages can be framed as lambda calculi, but some of them are sufficiently far removed from the theory for that comparison to be mostly useless to the majority of students.

I'm not sure what we're arguing about, to be honest. You definitely don't need to understand Turing machines to understand how computers work, and certainly not to do programming.

But as far as understanding computer science, computational theory, etc certainly you'd want to study Turing machines and lambda calculus. If you were say, writing a programming language, it would be nice to understand the fundamentals.

I mean, I don't think Turing machines or the lambda calculus are so far removed as to call them completely theoretical. You can easily implement a few functions in the lambda calculus that already resemble modern programming interfaces.

Suppose I were to teach theoretical CS using only RAM models of computation, with no reference to Turing Machine tapes. Would there be any downside to doing this, pedagogically? (Other than, of course, the backward compatibility concern of students being able to engage with existing literature, which is the main reason this isn't done I think)

If you were teaching theoretical CS, I think you'd certainly want to use a lower-level abstraction than RAM. (And RAM really is not unlike a Turing machine tape, except it's chunked into bytes and addressable; in principle it plays the same role and supports the same operations: move left, move right, read at the current location, write to the current location. The modern CPU instruction set isn't really all that different in principle either; it's mostly higher-level instructions for accomplishing those same operations, e.g. "move x" versus five left or right operations. And you can certainly write a Turing machine that implements such an instruction set and get a more easily programmable Turing machine.)

Now, TMs are certainly not the only more fundamental model of computing, but they are interesting nonetheless, and they were an influence on the von Neumann architecture and modern computers.

If I were studying theoretical CS, I'd want to learn about TMs, the lambda calculus, FSMs, queue automata, all of it. If you just told me how modern computers work, I'd be left wondering how anyone even conceived the idea.

And as I said in my earlier comment, when you get down to the core of it, what's really interesting to me (and this is readily apparent if you read Turing's 1936 paper) is that Turing very much came up with the idea by thinking about how HE does computation on graph paper. That to me is such an essential fact that I would not have wanted to miss it, and I wouldn't have known it had I not actually read the paper myself.

You could do this with the C abstract machine, because it's Turing complete. But we go with TMs because they're the most basic. Anything else is an abstraction, so you can stop at any level you like.

> You could do this with the C abstract machine.

I don't think there is a "C abstract machine". I think there are some, like, minimal versions of C (there's probably a Featherweight C out there or something), but the C specification is not given in terms of an abstract machine, and any abstraction of the machinery of C will necessarily lose important implementation details (undefined or unspecified behaviors, for instance).

In what way is a TM the most basic? If you mean in terms of simplicity, I think a queue automaton is simpler to explain.

The usual theory of computation textbooks go on to explain finite automata first, then context-free grammars, and finally the Turing machine. Why? Because the execution part of a TM is a finite automaton, while the instructions can use the generative process of a CFG. And then you've got the whole computing world, just from these principles (the other chapters are mostly about what's possible and what's not, and exploration of other properties).

The nice thing about finite automata is that they can be composed together. That leads to a recursive pattern, which leads to the creation of higher abstractions, leading to general-purpose CPUs and special processors (GPUs, NPUs, cryptographic chipsets, hardware encoding, ...). The same thing applied to CFGs leads to the creation of programming languages.
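
For example, the standard product construction composes two DFAs into one that accepts the intersection of their languages. A quick sketch (the two example automata and the tuple encoding are made up for illustration):

    # Compose two DFAs with the product construction. Each DFA is a tuple
    # (start_state, accepting_states, transitions), invented for this sketch.
    def product(dfa1, dfa2):
        s1, acc1, t1 = dfa1
        s2, acc2, t2 = dfa2
        trans = {}
        for (p, ch), p2 in t1.items():
            for (q, ch2), q2 in t2.items():
                if ch == ch2:
                    trans[((p, q), ch)] = (p2, q2)     # run both machines in lockstep
        accepting = {(p, q) for p in acc1 for q in acc2}
        return (s1, s2), accepting, trans

    def accepts(dfa, text):
        state, accepting, trans = dfa
        for ch in text:
            state = trans.get((state, ch))
            if state is None:
                return False
        return state in accepting

    # DFA 1: even number of 'a's.   DFA 2: string ends in 'b'.
    even_a = ("e", {"e"}, {("e", "a"): "o", ("o", "a"): "e",
                           ("e", "b"): "e", ("o", "b"): "o"})
    ends_b = ("n", {"y"}, {("n", "a"): "n", ("y", "a"): "n",
                           ("n", "b"): "y", ("y", "b"): "y"})

    both = product(even_a, ends_b)
    assert accepts(both, "aab") and not accepts(both, "ab")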

> If you were say, writing a programming language, it would be nice to understand the fundamentals.

I work in programming languages research. Nobody uses Turing machines for anything. We do talk about decidability when it comes to certain things (type-checking, for instance), but that doesn't use Turing machines directly.

The lambda calculus comes up frequently, on the other hand, but it's used as a base from which we build minimal languages to demonstrate a novel concept. It's a nice theoretical framework around which to model a language abstractly, but essentially no major languages are actually implemented as extensions to the lambda calculus. (The notable exception is, of course, Haskell, but Haskell was explicitly designed in this way because it was intended as a sort of playground for programming languages research — see "A History of Haskell: Being Lazy with Class" from HOPL-III, 2007.) For example, Rust's region-based memory management descends from Cyclone, and iirc (it's been a few years) Cyclone was formalized as an extended lambda calculus, but nobody would suggest that becoming a contributor to the Rust language would require understanding the lambda calculus.

> I don't think Turing machines or Lambda calculus are even that far removed to call them completely theoretical. You can easily implement a few functions in lambda calculus that already resemble modern programming interfaces.

1. Whether your second statement is true is irrelevant to the first. The lambda calculus and Turing machines are theoretical devices. That's a statement of fact, not an indictment of utility. They're abstract machines used to reason about the theory of computation.

2. Your second statement is just false, actually. The pure lambda calculus doesn't have any kind of value other than anonymous functions; you need to get into, e.g., Church encoding to take them anywhere "useful", and that's pretty far removed from how actual language implementations work. If you go to the simply typed lambda calculus with some base types (integers and Booleans, for instance), okay, great, but you still are so abstract as to be sufficiently removed from actual implementations that the connections are not direct. Even Haskell, the most lambda calculus-y language out there, is actually based on System-F — which is like STLC but adds universal quantification over types, allowing for parametric polymorphism. And that's such a significant addition that most theorists are pretty quick to argue that Haskell is really a System F derivative rather than a direct lambda calculus derivative (though, of course, all computation can be derived in terms of the lambda calculus).
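
For reference, here's roughly what Church encoding looks like when transliterated into Python lambdas; just an illustration of the point, not how any real implementation works:

    # Church encodings written as Python lambdas. Everything is a
    # one-argument function; the names exist only for readability.
    TRUE  = lambda t: lambda f: t          # corresponds to the term t. f. t
    FALSE = lambda t: lambda f: f
    IF    = lambda b: lambda t: lambda f: b(t)(f)

    ZERO = lambda f: lambda x: x
    SUCC = lambda n: lambda f: lambda x: f(n(f)(x))
    PLUS = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

    def to_int(n):                         # decode a Church numeral for display
        return n(lambda k: k + 1)(0)

    TWO   = SUCC(SUCC(ZERO))
    THREE = SUCC(TWO)
    print(to_int(PLUS(TWO)(THREE)))        # 5
    print(IF(TRUE)("then")("else"))        # then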

Considering we can implement and simulate a Turing machine and get actual calculations, how do you consider them to be completely abstract/theoretical?

If the only principle you look at is "does this compute?" then they're not that different. Otherwise they're about as far apart as you can get.

A Turing machine (at least one that isn't designed in some super wacky way to prove a point) has a few bits of internal state and no random access memory. If you want RAM you have to build a virtual machine on top of the Turing machine. If you try to program it directly it's going to be a byzantine nightmare.

Being restricted to tape, and in particular a single tape, makes Turing machines absolutely awful for teaching programming or how a CPU implements an algorithm. Instructions, data, and status all have to fit in the same place at the same time.

It's so bad that Brainfuck is an order of magnitude better, because it at least has separate instruction and data tapes.

From your other comment:

> But as far as understanding computer science, computational theory, etc certainly you'd want to study Turing machines and lambda calculus. If you were say, writing a programming language, it would be nice to understand the fundamentals.

Turing machines are not fundamental. They're just one way to achieve computation that is close to minimal. But they're not completely minimal, and I strongly doubt studying them is going to help you make a programming language. At best it'll help you figure out how to turn something that was never meant to compute into a very very bad computer.

A Turing machine in essence is a finite state machine + memory (the tape) + some basic instructions for reading from and writing to the memory.

It's a very simple, rudimentary computer, not some completely abstract mathematical object, which is what I was responding to.

With universal Turing machines, it's not difficult to start writing composable functions, like an assembly instruction set, adders, multipliers, etc.
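
For instance, a unary adder fits in a handful of rules. A sketch (the rule-table format and helper are invented for this example): "111+11" means 3 + 2, and the machine leaves five 1s on the tape.

    # Unary addition as a Turing-machine rule table:
    # (state, symbol_read) -> (symbol_to_write, head_move, next_state).
    BLANK = "_"
    RULES = {
        # q0: walk right, turn the '+' into a '1'
        ("q0", "1"): ("1", "R", "q0"),
        ("q0", "+"): ("1", "R", "q1"),
        # q1: keep walking right to the end of the ones
        ("q1", "1"): ("1", "R", "q1"),
        ("q1", BLANK): (BLANK, "L", "q2"),
        # q2: erase the last '1' to fix the count, then stop
        ("q2", "1"): (BLANK, "R", "halt"),
    }

    def run(tape):
        tape, state, head = list(tape) + [BLANK], "q0", 0
        while state != "halt":
            write, move, state = RULES[(state, tape[head])]
            tape[head] = write
            head += 1 if move == "R" else -1
        return "".join(tape).strip(BLANK)

    print(run("111+11"))   # 11111  (3 + 2 = 5)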

TMs certainly aren't fundamental, but when you look at TMs and the lambda calculus and understand why they are equivalent, wouldn't you say you gain an understanding of what is fundamental? Certainly constructions like for loops, the stack, etc. are not fundamental, so you'd want to go deeper in your study of languages.

And a barebones traditional CPU is a finite state machine plus random access memory. It teaches you mostly the same things about how you put together simple components into universal computation, while having programs that are far easier to comprehend.

And then for another perspective on computation, lambda calculus is very different and can broaden your thoughts. Then you could look at Turing machines and get some value, but niche value at that point. I wouldn't call it important if you already understand the very low level, and you should not use it as the model for teaching the very low level.

>while having programs that are far easier to comprehend.

If you want to learn the fundamentals of something, should you not wish to, you know, think about the fundamentals?

My argument is that FSM+tape and FSM+RAM are at the same level of "fundamental", but one is easier to understand so it should be the thing you teach with. Being more obtuse is not better.

One realization with TMs is that programs and data are essentially the same thing, and the separation is usually imposed. When you think about your program as data, it's hard not to notice patterns, and you start to yearn for metaprogramming to express those patterns more directly.
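
A small illustration of that program-as-data point, using Python's standard ast module (the toy function being rewritten is made up):

    # Parse a program, inspect and rewrite it like any other data structure.
    import ast

    source = "def area(w, h):\n    return w * h\n"
    tree = ast.parse(source)

    # Walk the tree and turn every multiplication into an addition,
    # purely to show that the program is just mutable data.
    for node in ast.walk(tree):
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Mult):
            node.op = ast.Add()
    ast.fix_missing_locations(tree)

    print(ast.unparse(tree))            # the body now reads: return w + h
    namespace = {}
    exec(compile(tree, "<rewritten>", "exec"), namespace)
    print(namespace["area"](2, 3))      # 5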

I completely disagree. For example understanding the theory gives me a very powerful bullshit detector because entire categories of problem which might sound merely difficult are in fact Undecidable.

Knowing that actual regular expressions can be recognised by a finite automaton, but PCRE and similar "LOL, just whatever neat string matching" extensions cannot, means you can clearly draw the line and not accidentally ship a product that promises arbitrary computation for $1 per million queries.
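
For instance, a single backreference already takes you out of regular-language territory, and the backtracking needed to support it is where the blowups come from. A quick Python illustration (the patterns are just examples):

    import re

    # The language { w + w : w over {a, b} } is not regular, so no finite
    # automaton recognizes it, yet one backreference "matches" it:
    doubled = re.compile(r"^([ab]+)\1$")
    print(bool(doubled.match("abab")))   # True  (w = "ab")
    print(bool(doubled.match("aba")))    # False

    # The price: backtracking engines can blow up. The classic example below
    # can take exponential time on a short non-matching input, so it's left
    # commented out:
    # re.match(r"^(a+)+$", "a" * 30 + "b")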

I agree that understanding something of how the CPU works is also useful, but it's no substitute for a grounding in theory. The CPU is after all obliged to only do things which are theoretically possible, so it's not even a real difference of kind.

> One of the biggest boost to my SWE career was studying theory of computation and programming languages theory.

I totally agree with you.

> My major was electronic engineering, so I didn't touch those at the university.

Unfortunately, ever more computer science graduates lack the same belief, even if they go to "top" university programs.

Of course, I think part of the problem is the way the material is presented; "learn this to pass the exam" without fundamental motivation is pretty much always a bad set-up to get students interested. But, speaking anecdotally, a large part also seems to be a semi-recent surge of undergraduate students who genuinely believe that their academic career is nothing more than a hurdle to be muddled through in service of getting a piece of paper that lets them get a high-paying, low-labor-intensity job. They just don't engage with theory-focused material beyond what's strictly necessary to graduate, and then they dump the thoughts from their minds the second the semester is over.

This manifests in all sorts of ways, but a common one is undergrads telling one another how little their professors know about the "real world" because they use (not even that) outdated technology in their lectures. Or maybe the profs use some less popular language and the students think of it as a waste of time. There's comparatively little self-motivation among modern CS students, where historically I think there was rather a lot.

I suppose I'm not really going anywhere specific with this other than complaining, so maybe I can ask: What do you think it was that helped you realize that learning fundamental theory would be beneficial? (I don't have any indication that you were the same kind of student as these that I mention, since you were in a different major altogether, but I'm always looking for insights to help motivate students to study theory more, and insights can come from anywhere.)

Would you have book recommendations / resources to share?

Introduction to the Theory of Computation by Michael Sipser. There's also his course based on the book on YouTube[0].

Language Implementation Patterns by Terence Parr. It avoids the theory in other books, going for a more practical approach.

Then it was just trying a lot of programming paradigms like functional programming with Common Lisp and Clojure, logic programming with Prolog, array and stack programming with Uiua. And reading snippets of books and papers. It was chaotic.

But the most enlightening lesson was: formalism is the key. You define axioms, specify rules, and as long as you stay consistent, it's OK. The important thing is to remember the axioms and understand the rules.

[0]: https://www.youtube.com/playlist?list=PLUl4u3cNGP60_JNv2MmK3...

I'd recommend codifying such axioms, rules and grammar into a domain-specific language, and then writing your logic in that DSL.

It will keep things consistent and allow newcomers to quickly understand the domain, enabling them to contribute without need for deep institutional knowledge.
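
Something like this toy sketch is what I mean; the "order/discount" domain here is entirely made up:

    # A tiny embedded DSL: each rule is (readable name, predicate, action),
    # so the rules read close to the domain glossary and live in one place.
    from dataclasses import dataclass

    @dataclass
    class Order:
        total: float
        items: int
        loyal_customer: bool

    RULES = [
        ("bulk discount",    lambda o: o.items >= 10,    lambda o: o.total * 0.90),
        ("loyalty discount", lambda o: o.loyal_customer, lambda o: o.total - 5.0),
    ]

    def apply_rules(order: Order) -> float:
        total = order.total
        for name, applies, action in RULES:
            if applies(order):
                total = action(Order(total, order.items, order.loyal_customer))
        return total

    print(apply_rules(Order(total=100.0, items=12, loyal_customer=True)))  # 85.0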

That's one of the foundations of Domain-Driven Design. First you try to come up with a glossary (aka your axioms). Then you'll notice that some terms have relations with each other, and some may have the same name but refer to two different concepts (or two parts of the same whole). So now you have your boundaries. Then you try to make each subdomain internally consistent. But it still has to communicate with the other subdomains (to enact actions and query data), and those communications are just as important as the subdomains themselves. The hope is to end up with something that reflects the business domain, especially the cost of changes: something that's easy/hard to change in the business should be easy/hard to change in the code.

With OOP, this often results in verbose code, because each view of the data (which has its own rules) gets its own class. With FP, because you often use more primitive data structures, it's easier to move between subsets of the data.

Exactly. Getting DDD to be implemented and followed in past orgs has been difficult but it pays in spades. It's hard enough to push startup engineering teams to use consistent terminology, much less adopt DDD.

I do find it sometimes can lead to less verbose code even in OOP, because it makes logic more focused and DRY. It certainly can greatly reduce the amount of bugs in large codebases.

Might be hard to approach, but my favorite is Pierce's "Types and Programming Languages": https://www.cis.upenn.edu/~bcpierce/tapl/ (known as TAPL). I was grateful to take a graduate-level class using it in college.

I really like TAPL but would recommend Harper’s Practical Foundations for Programming Languages (PFPL) first (though skip the first 2 chapters, I think?).

https://www.cs.cmu.edu/~rwh/pfpl.html

It’s far more directed than TAPL, so I think it’s easier to read from start to finish. TAPL feels better as a reference.

TAPL is nice, and also very dense. I haven't fully read it, but my current understanding is the following: both the Turing machine and the lambda calculus only describe how to act on data. They don't actually describe what the data is. So you can craft an algorithm that works well, but it can be supplied with the wrong type of data and it will go haywire. Your only recourse is to trust the user of your algorithm.

What types do is give you a set of classes of data, plus rules between those classes. You then annotate your code with those classes (combining them where possible), and the type system can verify that the relations between them hold. To the type system, your program is data.
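
A toy sketch of that last idea (the expression language is made up): expressions are plain tuples, and the checker is just a function over that data.

    # Tiny type checker for a made-up expression language.
    def type_of(expr):
        kind = expr[0]
        if kind == "int":                          # ("int", 3)
            return "Int"
        if kind == "bool":                         # ("bool", True)
            return "Bool"
        if kind == "add":                          # ("add", e1, e2)
            if type_of(expr[1]) == type_of(expr[2]) == "Int":
                return "Int"
            raise TypeError("add expects two Ints")
        if kind == "if":                           # ("if", cond, then, else)
            if type_of(expr[1]) != "Bool":
                raise TypeError("condition must be Bool")
            t = type_of(expr[2])
            if t != type_of(expr[3]):
                raise TypeError("branches must have the same type")
            return t
        raise ValueError(f"unknown expression: {kind}")

    print(type_of(("if", ("bool", True), ("add", ("int", 1), ("int", 2)), ("int", 0))))  # Int
    # type_of(("add", ("int", 1), ("bool", True)))  # would raise TypeError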

I think undecidable means more that there are classes of problems for which no algorithm (that runs in finite time) can give the answer. To square N, there is an algorithm that always works. To decide whether Turing machine N halts on input M, there is no such algorithm. The halting problem is undecidable.
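
The standard diagonal argument, sketched as Python (halts here is a hypothetical oracle; the whole point is that it cannot actually be implemented):

    def halts(program, argument) -> bool:
        """Claimed oracle: True iff program(argument) eventually halts."""
        raise NotImplementedError("no such total function can exist")

    def paradox(program):
        # Do the opposite of whatever the oracle predicts about
        # running `program` on itself.
        if halts(program, program):
            while True:          # predicted to halt -> loop forever
                pass
        return "halted"          # predicted to loop -> halt immediately

    # Feeding paradox to itself: if halts(paradox, paradox) is True, then
    # paradox(paradox) loops forever; if it is False, paradox(paradox)
    # halts. Either answer is wrong, so `halts` cannot exist.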

There is a clever way to encode Turing machines into Diophantine equations, so deciding whether a Diophantine equation (a polynomial equation with integer coefficients, solved over the integers) has a solution is also undecidable.

Mertens Theory of Computation

> Mertens Theory of Computation

No such book exists. Do you mean The Nature of Computation by Moore and Mertens?

Sure, M&M's

[deleted]

Really? I studied it in my undergrad degree, and since then I have worked as a SWE and not found it relevant at all, really.

It's the recursive nature of it.

Almost any program can be written in a DSL that solves the class of problems the program solves. So if you consider the user stories of your project as that class of problems, you can come up with a DSL (mostly in the form of pseudocode) that can describe the solution. But first you need to refine the terms down to some primitives (a context-free grammar). Then you need to think about the implementation of the execution machine, which will be a graph of states (an automaton). The transitions between the states will be driven by the execution machine, which need not be as basic as a Turing machine.

But often you don't need to do all of this. You can just use common abstractions like design patterns, data structures, and basic algorithms to get a ready-made solution. But you still have to compose them, and if you understand how everything works, it's easier to do so.

It depends what you're working on. If you do any form of program analysis (security, compiler stuff), you bump into undecidability every day.

[deleted]