Hacker News

Show HN: PyMOL-RS – Rust reimplementation of PyMOL with modern rendering

5 points by zmactep a month ago | 5 comments

Well, it happened. After endless release candidates, we've finally made it to v0.1.0.

What's inside:

GPU-accelerated rendering with WebGPU shaders, shadows, and goodies like silhouette edges and a special soft-light mode

Core operations run up to 1000x faster than the original PyMOL. Surface generation that used to send you on a coffee run now finishes the moment you hit the button

Full PyMOL selection algebra support — 95+ keywords, boolean logic, distance/expansion operators, slash-macros

Distance, angle, and dihedral measurements, atom labels — everything you need for structural analysis

Python API — from pymol_rs import cmd and you're right at home

PDB, mmCIF, BinaryCIF, SDF/MOL, MOL2, XYZ, GRO — read and write, automatic format detection, transparent gzip decompression

Scenes, movies, ray tracing — all on the GPU

Kabsch superposition, CE alignment, RMSD, DSS, symmetry across all 230 space groups

Out of PyMOL's 798 original settings, some number of them actually work. Nobody knows exactly how many, but it's definitely in the hundreds. Plus we've added new settings that the original never had — like per-chain surface generation

13 independent crates — if you're writing Rust, you can use just the selection parser, the file readers, or the full GUI. No monolith

dalke a month ago [ - ]

The README says "PyMOL-RS is a clean-room rewrite" but when I look at ./pymol-mol/src/dss.rs I see things like:

  //! - PyMOL's layer3/Selector.cpp - SelectorAssignSS function
  //! - PyMOL's layer2/ObjectMolecule2.cpp - ObjectMoleculeGetCheckHBond function
  //! - PyMOL's layer1/SettingInfo.h - Default angle thresholds

and "matching PyMOL's cSS* flags from Selector.cpp"

While the Rust code is cleaned up and easier to read, I can see that it preserves similar data flow, uses similar variable names, and of course identical constants.

For example, this is PyMol layer3/Selector.cpp:

          /* look for antiparallel beta sheet ladders (single or double) 
          ...
          */

          if((r + 1)->real && (r + 2)->real) {

            for(b = 0; b < r->n_acc; b++) {     /* iterate through acceptors */
              r2 = (res + r->acc[b]) - 2;       /* go back 2 */
              if(r2->real) {

                for(c = 0; c < r2->n_acc; c++) {

                  if(r2->acc[c] == a + 2) {     /* found a ladder */

                    (r)->flags |= cSSAntiStrandSingleHB;
                    (r + 1)->flags |= cSSAntiStrandSkip;
                    (r + 2)->flags |= cSSAntiStrandSingleHB;

                    (r2)->flags |= cSSAntiStrandSingleHB;
                    (r2 + 1)->flags |= cSSAntiStrandSkip;
                    (r2 + 2)->flags |= cSSAntiStrandSingleHB;

                    /*                  printf("anti ladder %s %s to %s %s\n",
                       r->obj->AtomInfo[I->Table[r->ca].atom].resi,
                       r->obj->AtomInfo[I->Table[(r+2)->ca].atom].resi,
                       r2->obj->AtomInfo[I->Table[r2->ca].atom].resi,
                       r2->obj->AtomInfo[I->Table[(r2+2)->ca].atom].resi); */
                  }
                }
              }
            }

and this is pymol-rs's pymol-mol/src/dss.rs

        // Antiparallel ladder: i accepts j, (j-2) accepts (i+2)
        if a + 2 < n_res && res[a + 1].real && res[a + 2].real {
            for &acc_j in &acc_list {
                if acc_j < 2 || !res[acc_j].real {
                    continue;
                }
                let j_minus_2 = acc_j - 2;
                if !res[j_minus_2].real {
                    continue;
                }
                let acc_jm2_list: Vec<usize> = res[j_minus_2].acc.clone();
                for &acc_k in &acc_jm2_list {
                    if acc_k == a + 2 {
                        res[a].flags |= SsFlags::ANTI_STRAND_SINGLE_HB;
                        res[a + 1].flags |= SsFlags::ANTI_STRAND_SKIP;
                        res[a + 2].flags |= SsFlags::ANTI_STRAND_SINGLE_HB;
                        res[j_minus_2].flags |= SsFlags::ANTI_STRAND_SINGLE_HB;
                        if acc_j >= j_minus_2 + 2 {
                            res[j_minus_2 + 1].flags |= SsFlags::ANTI_STRAND_SKIP;
                        }
                        res[acc_j].flags |= SsFlags::ANTI_STRAND_SINGLE_HB;
                    }
                }
            }
        }

That's close enough that I really think you should include the PyMol license info, before Schrödinger's lawyers notice.

dalke a month ago [ - ]

Over the last few days I have learned that using code generation tools are increasingly used to create a "clean room" version of a product, using a definition which is far from its standard use.

See https://tuananh.net/2026/03/05/relicensing-with-ai-assisted-... with discussion at https://news.ycombinator.com/item?id=47257803

I believe your use of "clean room" is another example of misusing the term.

Could you clarify how it was developed? Who had access to the original source code? Were code generation tools used, and if so, how? Was the PyMol source part of the training set for those tools? How did you ensure no copyright violations?

Warren was a friend of mine, and a passionate believer in open source software. He wanted people to be able to modify PyMol for their own purposes, and asked only for a license acknowledgment. Schrodinger, to their great credit, continues to honor Warren by maintaining the Open-Source PyMOL product.

If this project was not developed under true clean room practices, I ask that you continue to honor his work by including the PyMol license in your Rust rewrite.

If it was true clean room development, why does ./crates/pymol-algos/src/align/ce.rs say "This is a faithful port of PyMOL's `ccealignmodule.cpp`.", with comments like "Equivalent to PyMOL's `calcS`" and references to the original code in comments, like: "PyMOL: for (row = 0; row < wSize - 2; row++)"?

zmactep a month ago [ - ]

Dear Andrew, first of all — thank you so much for your feedback, both the technical and the legal parts. Your earlier comments about SDF and PDB parsing corner cases are incredibly valuable.

PyMOL has been one of my primary tools for 15 years, and I've always held it in deep respect. This project was born entirely out of a desire to contribute something to molecular visualization in the modern world — something fast, modular, and with qualities I've been missing in existing tools. And as a source of inspiration, I took the best one: PyMOL.

Of course I spent a lot of time reading and studying its code, and I openly took concepts and algorithms from it. I don't hide that — it's why the project carries the name it does, and it's why the README has had an Acknowledgments section since the very first commit: "Inspired by PyMOL, created by Warren Lyford DeLano. This is an independent reimplementation, not affiliated with Schrödinger, Inc."

You are absolutely right about the "clean-room" wording — I used it loosely, meaning "rewritten from scratch in a different language with a different architecture," not in the legal sense. That was misleading, and I've already removed it from the README.

You're also right that DSS or CE was a fairly direct port of PyMOL's algorithm, and it should carry proper attribution. At the same time, many other parts — surface generation, cartoon rendering, the shading pipeline — are done quite differently, and the gap keeps growing. But that doesn't excuse insufficient attribution where code was closely followed.

Going forward, I'm focusing on genuinely new functionality — Rust plugin system, web interface, novel shading models (try set shading_mode, skripkin!) — things the original PyMOL never had. But this is not an attempt to distance the project from DeLano's creation. It's a respectful continuation of his ideas in a completely new product.

Thank you again — your comments are making this project better.

dalke a month ago [ - ]

You might mention in other forums, like the RDKit mailing list (though that's almost moribund).

I looked at the SDF reader, since that's what I know best. I see a few things which look like they need revisiting.

Line 75 has 'if name == "$$$$" {return self.parse_molecule();}' This isn't correct. This means the record name is "$$$$" (if you are RDKit), or it means the record is in the wrong format (if you are the CTFile specification, which explicitly prohibits that).

Also, does Rust have tail recursion? If not, the recursive nature of the code makes me think parsing a file containing 1 million lines of the form "$$$$\n" would likely blow the stack.

In principle the version number test for V2000 or V3000 should look at the specific column numbers, and not presence somewhere in the line. Someone like me might place a "V3000" in the obsolete fields, with a "V2000" in the correct vvvvvv field. ;)

The "Skip to end of molecule" code will break on real-world datasets. One classic problem is a company which used "$", "$$", "$$$" and "$$$$" to indicate cost, stored as tag data like:

  > <price>
  $$$$

  $$$$

where the first "$$$$" is part of the data item, and the second "$$$$" is the end of the SD record. This ended up causing a problem when an SDF reader somewhere in their system didn't parse data items correctly. (Another common failure in data item parsing is to ignore the requirement for a newline after the data item.)

I talk about "$$$$" more at http://www.dalkescientific.com/writings/diary/archive/2020/0... .

Then there's the "S SKP" field, which you'll almost certainly never see in real life! I've only seen it used in a published example of a JICST extended MOLfile. See http://www.dalkescientific.com/writings/diary/archive/2020/0...

Please don't let these comments get you down! These details are hard to get, and not obvious. It took me years to learn the rare corner cases.

I also haven't done molviz since the 1990s, or used PyMol (I was VMD person), so can't say anything about the overall project. We started with GL, and had to port to OpenGL. :)

PS. A bit of history for you. PyMol's and VMD's selection syntax look similar because both drew on the syntax in Axel Brunger's X-PLOR. Warren DeLano came out of Brunger's lab, and VMD was from Schulten's group, which were X-PLOR users. (Schulten was Brunger's PhD advisor.)

dalke a month ago [ - ]

I looked at the PDB parser.

Will you be adding support for using duplicate CONECT records to store bond type information? That's a RasMol extension that PyMol supports. You'll also need to support the pdb_conect_nodup option in the writer.

I see you interpret atom/hetatm serial numbers as an integer. Will you be using the base-36 or hybrid-36 variants (see https://cci.lbl.gov/hybrid_36/) which is a common way to handle more than 100,000 atoms?

Again, these are corner cases which come with experience. I've no expectation that a new program would handle them. I want you to know about them since they will be issues if you expect long-term uptake.