Why not ripgrep?

Why not ugrep?

They are more or less equivalent. One has obscure feature X other has obscure feature Y, one is a bit faster on A, other is a bit faster on B, the defaults are a bit different, and one is written in Rust, the other in C++.

Pick the one you like, or both. I have both on my machine, and tend to use the one that does what I want with the least options. I also use GNU grep when I don't need the speed or features of either ug and rg.

One thing I never liked about ripgrep is that it doesn't have a pager. Yes, it can be configured to use the system-wide ones, but it's an extra step (and every time I have to google how to preserve colors) and on Windows you're SOL unless you install gnu utils or something. The author always refused to fix that.

Ugrep not only has a pager built in, but it also allows searching the results which is super nice! And that feature works on all supported platforms!

Interesting - for me a built-in pager is an antifeature. I don't want to figure out how to leave the utility. Worst of all, pager usually means that sometimes you get more pages and you need to press q to exit, and sometimes not. Annoying. I often type yhe next command right away and the pager means I get stuck, or worse, pager starts doing something in response to my keys (looking at you, `git log`).

Then again I'm on Linux and can always pipe to less if I need to. I'm also not the target audience for ugrep because I've never noticed that grep would be slow. :shrug:

You might appreciate setting `PAGER=cat` in your environment. ;)

Git obeys that value, and I would hope that most other UNIXy terminal apps do too.

Oh, wow, thank you! I must try this.

Some terminal emulators (kitty for sure) support "open last command output in pager". Works great with a pager that can understand ANSI colors - less fussing around with variables and flags to preserve colors in the pager

This is what I do personally:

    $ cat ~/bin/rgp
    #!/bin/sh
    exec rg -p "$@" | less -RFX
Should work just fine. For Windows, you can install `bat` to use a pager if you don't otherwise have one. You don't need GNU utils to have a pager.

hi @burntsushi,

   fan of your tool. like it's speed and defaults.
I use windows : didn't understand what you mean by "install `bat`" to use a pager.

I use cygwin and WSL for my unix needs. I have more and less in cygwin for use in windows.

I referenced bat because I've found that suggesting cygwin sometimes provokes a negative reaction. The GP also mentioned needing to install GNU tooling as if it were a negative.

bat is fancy pager written in Rust. It's on GitHub: https://github.com/sharkdp/bat

I'm sure you know but windows command prompt always came with its inbuilt pager -- more. So, you could always do "dir | more" or "rg -p "%*" | more ". (more is good with colors without flags)

I didn't! I'm not a Windows user. Colors are half the battle, so that's good. Will it only appear if paging is actually needed? That's what the flags to `less` do in my wrapper script above. They are rather critical for this use case.

I don't believe bat is a paper; it's more of a pretty-printer that tends to call less.

Two pallets that should work on Windows are https://github.com/walles/moar (golang) and https://github.com/markbt/streampager (Rust). There might also be a newer one that uses rust, I'm unsure.

I'd recommend ov for Windows users.

https://github.com/noborus/ov

bat on Windows does page, but I believe it's only available on Choco and not winget.

Good find, thanks! I'll check if I prefer it to moar.

As for bat, according to https://github.com/sharkdp/bat#using-bat-on-windows, the Chocolatey package simply installs `less` alongside `bat`. Seems like a good idea, but I haven't tried it.

Ah, thanks for doing the footwork.

For me, it's a lot easier to compile a static binary of a C++ app than a Rust one. Never got that to work. Also nice to have compatibility with all of grep's arguments.

> to compile a static binary

Cargo is one of the main reasons to use Rust of C++. I am pretty sure there is more involved with C++ than this:

   rustup target add x86_64-unknown-linux-musl 
   cargo build --target=x86_64-unknown-linux-musl

From the ugrep README:

For an up-to-date performance comparison of the latest ugrep, please see the ugrep performance benchmarks [at https://github.com/Genivia/ugrep-benchmarks]. Ugrep is faster than GNU grep, Silver Searcher, ack, sift. Ugrep's speed beats ripgrep in most benchmarks.

Does these performance comparison take into account the things BurntSushi (ripgrep author) pointed out in the ripgrep issue link elsewhere ITT? https://github.com/BurntSushi/ripgrep/discussions/2597

Either way, ripgrep is awesome and I’m staying with it.

Agreed - ripgrep is great, and I'm not planning to switch either. The performance improvement is tiny, anyways.

The best practical reason to choose this is its interactive features, like regexp building.

Although being faster in some cases, ripgrep lacks archive search support (no, transparent decompression ignoring the archive structure is not enough) which works great in ugrep.

I assume the grep compatible bit is attractive to some people. Not me, but they exist.

I find myself returning to grep from my default of rg because I'm just too lazy to learn a new regex language. Stuff like word boundaries "\<word\>" or multiple patterns "\(one\|two\)".

That seems like the weirdest take ever: ripgrep uses pretty standard PCRE patterns, which are a lot more common than posix’s bre monstrosity.

To me the regex langage is very much a reason to not use grep.

A bit hyperbolic, no?

If you consider it "the weirdest ever", I'm guessing that I'm probably older than you. I've certainly been using regex long before PCRE became common.

As a vim user I compose 10s if not 100s of regexes a day. It does not use PCRE. Nor does sed, a tool I've been using for decades. Do you also recommend not using these?

I use all of those tools but the inconsistency drives me crazy as it's hard to remember which syntax to use where. Here's how to match the end of a word:

ripgrep, Python, JavaScript, and practically every other non-C language: \b

vim: \>

BSD sed: [[:>:]]

GNU sed, GNU grep: \> or \b

BSD grep: \>, \b, or [[:>:]]

less: depends on the OS it's running on

Did you know that not all of those use the same definition of what a "word" character is? Regex engines differ on the inclusion of things like \p{Join_Control}, \p{Mark} and \p{Connector_Puncuation}. Although in the case of \p{Connector_Punctuation}, regex engines will usually at least include underscore. See: https://github.com/BurntSushi/rebar/blob/f9a4f5c9efda069e798...

And then there's \p{Letter}. It can be spelled in a lot of ways: \pL, \p{L}, \p{Letter}, \p{gc=Letter}, \p{gc:Letter}, \p{LeTtEr}. All equivalent. Very few regex engines support all of them. Several support \p{L} but not \pL. See: https://github.com/BurntSushi/rebar/blob/f9a4f5c9efda069e798...

`pgrep`, or `grep -P`, uses PCRE though, AFAIUI.

[deleted]

ripgrep's regex syntax is pretty similar to grep -E. So if you know grep -E, most of that will transfer over.

Also, \< and \> are in ripgrep 14. Although you usually just want to use the -w/--word-regexp flag.

> Also, \< and \> are in ripgrep 14

Isn't that inconsistent with the way Perl's regex syntax was designed? In Perl's syntax an escaped non-ASCII character is always a literal [^1], and that is guaranteed not to change.

That's nice for beginners because it saves you from having to memorize all the metacharacters. If you are in doubt you on whether something has a special meaning, you just escape it.

[^1]: https://perldoc.perl.org/perlrebackslash#The-backslash

Yes, it's inconsistent with Perl. But there are many things in ripgrep's default regex engine that are inconsistent with Perl, including the fact that all patterns are guaranteed to finish a search in linear time with respect to the haystack. (So no look-around or back-references are supported.) It is a non-goal of ripgrep to be consistent with Perl. Thankfully, if you want that, then you can get pretty close by passing the -P/--pcre2 flag.

With that said, I do like Perl's philosophy here. And it was my philosophy too up until recently. I decided to make an exception for \< and \> given their prevalence.

It was also only relatively recently that I made it possible for superfluous escapes to exist. Prior to ripgrep 14, unrecognized escapes were forbidden:

    $ echo '@' | rg-13.0.0 '\@'
    regex parse error:
        \@
        ^^
    error: unrecognized escape sequence
    $ echo '@' | rg '\@'
    @
I had done it this way to make it possible to add new escape sequences in a semver compatible release. But in reality, if I were to ever add new escape sequences, it use one of the ascii alpha-numeric characters, as Perl does. So I decided it was okay to forever and always give up the ability to make, e.g., `\@` mean something other than just matching a literal `@`.

`\<` and `\>` are forever and always the lone exceptions to this. It is perhaps a trap for beginners, but there are many traps in regexes, and this seemed worth it.

Note that `\b{start}` and `\b{end}` also exist and are aliases for `\<` and `\>`. The more niche `\b{start-half}` and `\b{end-half}` also exist, and those are what are used to implement the -w/--word-regexp flag. (Their semantics match GNU grep's -w/--word-regexp.) For example, `\b-2\b` will not match in `foo -2 bar` since `-` is not a word character and `\b` demands `\w` on one side and `\W` on the other. However, `rg -w -e -2` will match `-2` in `foo -2 bar`:

    $ echo 'foo -2 bar' | rg -w -e '\b-2\b'
    $ echo 'foo -2 bar' | rg -w -e -2
    foo -2 bar

Ok, makes sense. And thanks for the detailed explaination about word boundaries and the hint about the --pcre flag (I hadn't realized it existed).

Fuzzy matching is the main reason I switched to ugrep. This is insanely useful.

Because this is faster?

ripgrep stole the name but doesn’t follow the posix standard.