Hacker News

Why not ripgrep?

Why not ugrep?

They are more or less equivalent. One has obscure feature X, the other has obscure feature Y; one is a bit faster on A, the other is a bit faster on B; the defaults are a bit different; and one is written in Rust, the other in C++.

Pick the one you like, or both. I have both on my machine, and tend to use the one that does what I want with the fewest options. I also use GNU grep when I don't need the speed or features of either ug or rg.

tredre3 2 years ago [ - ]

One thing I never liked about ripgrep is that it doesn't have a pager. Yes, it can be configured to use the system-wide ones, but it's an extra step (and every time I have to google how to preserve colors) and on Windows you're SOL unless you install gnu utils or something. The author always refused to fix that.

Ugrep not only has a pager built in, but it also allows searching the results which is super nice! And that feature works on all supported platforms!

bornfreddy 2 years ago [ - ]

Interesting - for me a built-in pager is an antifeature. I don't want to figure out how to leave the utility. Worst of all, a pager usually means that sometimes you get more pages and you need to press q to exit, and sometimes not. Annoying. I often type the next command right away and the pager means I get stuck, or worse, the pager starts doing something in response to my keys (looking at you, `git log`).

Then again I'm on Linux and can always pipe to less if I need to. I'm also not the target audience for ugrep because I've never noticed that grep would be slow. :shrug:

amethyst 2 years ago [ - ]

You might appreciate setting `PAGER=cat` in your environment. ;)

Git obeys that value, and I would hope that most other UNIXy terminal apps do too.

bornfreddy 2 years ago [ - ]

Oh, wow, thank you! I must try this.

VTimofeenko 2 years ago [ - ]

Some terminal emulators (kitty for sure) support "open last command output in pager". Works great with a pager that can understand ANSI colors - less fussing around with variables and flags to preserve colors in the pager

burntsushi 2 years ago [ - ]

This is what I do personally:

    $ cat ~/bin/rgp
    #!/bin/sh
    exec rg -p "$@" | less -RFX

Should work just fine. For Windows, you can install `bat` to use a pager if you don't otherwise have one. You don't need GNU utils to have a pager.

anjanb 2 years ago [ - ]

hi @burntsushi,

fan of your tool. like its speed and defaults.

I use windows : didn't understand what you mean by "install `bat`" to use a pager.

I use cygwin and WSL for my unix needs. I have more and less in cygwin for use in windows.

burntsushi 2 years ago [ - ]

I referenced bat because I've found that suggesting cygwin sometimes provokes a negative reaction. The GP also mentioned needing to install GNU tooling as if it were a negative.

bat is a fancy pager written in Rust. It's on GitHub: https://github.com/sharkdp/bat

anjanb 2 years ago [ - ]

I'm sure you know but windows command prompt always came with its inbuilt pager -- more. So, you could always do "dir | more" or "rg -p "%*" | more ". (more is good with colors without flags)

burntsushi 2 years ago [ - ]

I didn't! I'm not a Windows user. Colors are half the battle, so that's good. Will it only appear if paging is actually needed? That's what the flags to `less` do in my wrapper script above. They are rather critical for this use case.

ilyagr 2 years ago [ - ]

I don't believe bat is a pager; it's more of a pretty-printer that tends to call less.

Two pagers that should work on Windows are https://github.com/walles/moar (golang) and https://github.com/markbt/streampager (Rust). There might also be a newer one that uses Rust, I'm unsure.

ttyprintk 2 years ago [ - ]

I'd recommend ov for Windows users.

https://github.com/noborus/ov

bat on Windows does page, but I believe it's only available on Choco and not winget.

ilyagr 2 years ago [ - ]

Good find, thanks! I'll check if I prefer it to moar.

As for bat, according to https://github.com/sharkdp/bat#using-bat-on-windows, the Chocolatey package simply installs `less` alongside `bat`. Seems like a good idea, but I haven't tried it.

ttyprintk 2 years ago [ - ]

Ah, thanks for doing the footwork.

MrDrMcCoy 2 years ago [ - ]

For me, it's a lot easier to compile a static binary of a C++ app than a Rust one. Never got that to work. Also nice to have compatibility with all of grep's arguments.

datadeft 2 years ago [ - ]

> to compile a static binary

Cargo is one of the main reasons to use Rust over C++. I am pretty sure there is more involved with C++ than this:

    rustup target add x86_64-unknown-linux-musl
    cargo build --target=x86_64-unknown-linux-musl

devraza 2 years ago [ - ]

From the ugrep README:

For an up-to-date performance comparison of the latest ugrep, please see the ugrep performance benchmarks [at https://github.com/Genivia/ugrep-benchmarks]. Ugrep is faster than GNU grep, Silver Searcher, ack, sift. Ugrep's speed beats ripgrep in most benchmarks.

codetrotter 2 years ago [ - ]

Do these performance comparisons take into account the things BurntSushi (ripgrep author) pointed out in the ripgrep issue linked elsewhere ITT? https://github.com/BurntSushi/ripgrep/discussions/2597

Either way, ripgrep is awesome and I’m staying with it.

devraza 2 years ago [ - ]

Agreed - ripgrep is great, and I'm not planning to switch either. The performance improvement is tiny, anyways.

Conscat 2 years ago [ - ]

The best practical reason to choose this is its interactive features, like regexp building.

philkrylov 2 years ago [ - ]

Although being faster in some cases, ripgrep lacks archive search support (no, transparent decompression ignoring the archive structure is not enough) which works great in ugrep.

0cf8612b2e1e 2 years ago [ - ]

I assume the grep compatible bit is attractive to some people. Not me, but they exist.

derriz 2 years ago [ - ]

I find myself returning to grep from my default of rg because I'm just too lazy to learn a new regex language. Stuff like word boundaries "\<word\>" or multiple patterns "\(one\|two\)".

masklinn 2 years ago [ - ]

That seems like the weirdest take ever: ripgrep uses pretty standard PCRE patterns, which are a lot more common than posix’s bre monstrosity.

To me the regex language is very much a reason to not use grep.

derriz 2 years ago [ - ]

A bit hyperbolic, no?

If you consider it "the weirdest ever", I'm guessing that I'm probably older than you. I've certainly been using regex long before PCRE became common.

As a vim user I compose 10s if not 100s of regexes a day. It does not use PCRE. Nor does sed, a tool I've been using for decades. Do you also recommend not using these?

comex 2 years ago [ - ]

I use all of those tools but the inconsistency drives me crazy as it's hard to remember which syntax to use where. Here's how to match the end of a word:

ripgrep, Python, JavaScript, and practically every other non-C language: \b

vim: \>

BSD sed: [[:>:]]

GNU sed, GNU grep: \> or \b

BSD grep: \>, \b, or [[:>:]]

less: depends on the OS it's running on
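
A minimal sketch, using Python's `re` (one of the engines in the list above), of why `\b` is only approximately "end of word": it fires at either edge of a word. The `END_OF_WORD` lookaround pair is an assumption of this sketch, a hand-rolled stand-in for `\>`, not a built-in of any of these tools.

```python
import re

# Emulation of an end-only assertion like \> (an assumption of this sketch).
END_OF_WORD = r"(?<=\w)(?!\w)"

text = "a-b"

# `\b` after "-" succeeds: the position between "-" and "b" is a word *start*.
either_edge = re.search(r"-\b", text)

# The end-only emulation fails there, because "-" is not a word character.
end_only = re.search("-" + END_OF_WORD, text)

# After a word character, the two assertions agree.
same = re.search(r"a\b", text).span() == re.search("a" + END_OF_WORD, text).span()

print(either_edge is not None, end_only is None, same)
```

So patterns that end in a word character behave the same with `\b` and `\>`; they only diverge when the character next to the assertion is punctuation.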

burntsushi 2 years ago [ - ]

Did you know that not all of those use the same definition of what a "word" character is? Regex engines differ on the inclusion of things like \p{Join_Control}, \p{Mark} and \p{Connector_Punctuation}. Although in the case of \p{Connector_Punctuation}, regex engines will usually at least include underscore. See: https://github.com/BurntSushi/rebar/blob/f9a4f5c9efda069e798...

And then there's \p{Letter}. It can be spelled in a lot of ways: \pL, \p{L}, \p{Letter}, \p{gc=Letter}, \p{gc:Letter}, \p{LeTtEr}. All equivalent. Very few regex engines support all of them. Several support \p{L} but not \pL. See: https://github.com/BurntSushi/rebar/blob/f9a4f5c9efda069e798...
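
A small sketch of how much the definition of a "word" character matters, using Python's `re` as a single example engine (it is not ripgrep's engine, and the standard `re` module does not accept `\p{...}` at all). For `str` patterns its `\w` is Unicode-aware by default and shrinks to `[a-zA-Z0-9_]` under `re.ASCII`, so `\b` lands in different places for the same input. The sample string is made up for illustration.

```python
import re

# "naïve café_au_lait", written with escapes so the accented letters are single code points.
text = "na\u00efve caf\u00e9_au_lait"

# Default: accented letters count as word characters, and so does the underscore.
unicode_words = re.findall(r"\b\w+\b", text)

# ASCII-only: the accented letters become word boundaries and split the words.
ascii_words = re.findall(r"\b\w+\b", text, flags=re.ASCII)

print(unicode_words)
print(ascii_words)
```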

pbhjpbhj 2 years ago [ - ]

`pcregrep`, or `grep -P`, uses PCRE though, AFAIUI.

burntsushi 2 years ago [ - ]

ripgrep's regex syntax is pretty similar to grep -E. So if you know grep -E, most of that will transfer over.

Also, \< and \> are in ripgrep 14. Although you usually just want to use the -w/--word-regexp flag.

xoranth 2 years ago [ - ]

> Also, \< and \> are in ripgrep 14

Isn't that inconsistent with the way Perl's regex syntax was designed? In Perl's syntax an escaped ASCII punctuation (non-word) character is always a literal [^1], and that is guaranteed not to change.

That's nice for beginners because it saves you from having to memorize all the metacharacters. If you are in doubt about whether something has a special meaning, you just escape it.

[^1]: https://perldoc.perl.org/perlrebackslash#The-backslash

burntsushi 2 years ago [ - ]

Yes, it's inconsistent with Perl. But there are many things in ripgrep's default regex engine that are inconsistent with Perl, including the fact that all patterns are guaranteed to finish a search in linear time with respect to the haystack. (So no look-around or back-references are supported.) It is a non-goal of ripgrep to be consistent with Perl. Thankfully, if you want that, then you can get pretty close by passing the -P/--pcre2 flag.

With that said, I do like Perl's philosophy here. And it was my philosophy too up until recently. I decided to make an exception for \< and \> given their prevalence.

It was also only relatively recently that I made it possible for superfluous escapes to exist. Prior to ripgrep 14, unrecognized escapes were forbidden:

    $ echo '@' | rg-13.0.0 '\@'
    regex parse error:
        \@
        ^^
    error: unrecognized escape sequence
    $ echo '@' | rg '\@'
    @

I had done it this way to make it possible to add new escape sequences in a semver compatible release. But in reality, if I were to ever add new escape sequences, it would use one of the ASCII alphanumeric characters, as Perl does. So I decided it was okay to forever and always give up the ability to make, e.g., `\@` mean something other than just matching a literal `@`.

`\<` and `\>` are forever and always the lone exceptions to this. It is perhaps a trap for beginners, but there are many traps in regexes, and this seemed worth it.
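
The same trade-off can be seen in Python's `re`, shown here purely as an illustration from a different engine (it says nothing about how ripgrep implements this): an escaped ASCII punctuation character is a literal, while an escaped ASCII letter with no assigned meaning is rejected, which keeps letters free for future escape sequences. The choice of `\y` as the unassigned escape is an assumption that holds for current Python 3 releases.

```python
import re

# Escaped punctuation is just the literal character.
literal_at = re.search(r"\@", "user@example.com")

# An escaped ASCII letter with no defined meaning is a compile error,
# so new letter escapes can be introduced without breaking old patterns.
try:
    re.compile(r"\y")
    unknown_letter_rejected = False
except re.error:
    unknown_letter_rejected = True

print(literal_at.group(), unknown_letter_rejected)
```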

Note that `\b{start}` and `\b{end}` also exist and are aliases for `\<` and `\>`. The more niche `\b{start-half}` and `\b{end-half}` also exist, and those are what are used to implement the -w/--word-regexp flag. (Their semantics match GNU grep's -w/--word-regexp.) For example, `\b-2\b` will not match in `foo -2 bar` since `-` is not a word character and `\b` demands `\w` on one side and `\W` on the other. However, `rg -w -e -2` will match `-2` in `foo -2 bar`:

    $ echo 'foo -2 bar' | rg -w -e '\b-2\b'
    $ echo 'foo -2 bar' | rg -w -e -2
    foo -2 bar
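
The difference between `\b` and the half-boundaries can be reproduced outside ripgrep. This sketch uses Python's `re` and assumes that `\b{start-half}` behaves like the lookbehind `(?<!\w)` and `\b{end-half}` like the lookahead `(?!\w)`; that mapping is an approximation for illustration, not ripgrep's implementation.

```python
import re

haystack = "foo -2 bar"

# \b needs a word character on exactly one side. Here " " and "-" are both
# non-word characters, so there is no boundary before "-" and nothing matches.
full_boundary = re.search(r"\b-2\b", haystack)

# Half-boundary emulation: only require that no word character sits
# immediately outside the match on either side.
half_boundary = re.search(r"(?<!\w)-2(?!\w)", haystack)

# The half-boundaries still reject a match embedded inside a word.
embedded = re.search(r"(?<!\w)-2(?!\w)", "foo-2bar")

print(full_boundary, half_boundary.group(), embedded)
```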

xoranth 2 years ago [ - ]

Ok, makes sense. And thanks for the detailed explanation about word boundaries and the hint about the --pcre2 flag (I hadn't realized it existed).

jedisct1 2 years ago [ - ]

Fuzzy matching is the main reason I switched to ugrep. This is insanely useful.

meindnoch 2 years ago [ - ]

Because this is faster?

bsdpufferfish 2 years ago [ - ]

ripgrep stole the name but doesn’t follow the posix standard.