Depending on the language and logging framework, debug/trace logging can be acceptable in a library. But you have to be extra careful to make sure that it's ultimately a no-op.
A common problem in Java is that someone will drop in a log statement that looks something like this: `log.trace("Doing " + foo + " to " + bar);`
The problem, especially in a hot loop, is that this throwaway string concatenation can become a performance problem, especially if `foo` or `bar` have particularly expensive `toString` methods.
The proper way to do something like this in Java is either

    log.trace("Doing $1 to $2", foo, bar);

or

    if (log.traceEnabled()) {
        log.trace("Doing " + foo + " to " + bar);
    }
Ideally a logging library should at least not make it easy to make that kind of mistake.
This isn't really something the logging library can do. If the language provides a string interpolation mechanism, then that mechanism is what programmers will reach for first. And the library cannot know that interpolation happened, because the language creates the final string before passing it in.
If you want the built-in interpolation to become a no-op in the face of runtime log disabling, then the logging library has to be a built-in too.
I feel like there's a parallel with SQL, where you want to discourage manual interpolation. Taking inspiration from it may help: you may not fully solve the problem, but there are some useful API ideas and patterns.
A logging framework may have the equivalent of prepared statements. You can also nudge usage: give the raw string API the clunkier name `log.traceRaw(String rawMessage)` while the parametrized one gets the nicer `log.trace(Template t, param1, param2)`.
You can have 0 parameters and the template is a string...
The point of my message is that you should avoid the `log(string)` signature. Even if it's appealing, it's an easy perf trap.
There are many ideas if you look at SQL libs. In my example I used a different type, but there are other solutions. Be creative.
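For illustration, a sketch of what that shape could look like in Java (the `Template` class, the method bodies, and the `{}` placeholder convention here are hypothetical, just to show the idea, not any real framework's API):

    // Templates are defined once, as constants; the raw-string entry point
    // exists but is deliberately the clunkier one to reach for.
    final class Template {
        final String pattern; // e.g. "Doing {} to {}"
        Template(String pattern) { this.pattern = pattern; }
    }

    class Log {
        static final Template DOING = new Template("Doing {} to {}");

        // The "nice" name takes a prepared template plus params.
        void trace(Template t, Object... params) {
            if (!traceEnabled()) return;
            emit(t.pattern, params);
        }

        // The raw-string path is named to discourage casual use.
        void traceRaw(String rawMessage) {
            if (!traceEnabled()) return;
            emit(rawMessage, new Object[0]);
        }

        boolean traceEnabled() { return false; }
        void emit(String pattern, Object[] params) { /* format + write */ }
    }

    // Usage:
    //   log.trace(Log.DOING, foo, bar);   // params are formatted only if trace is on
    //   log.traceRaw("plain constant message");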
And none of those solve the issue.
You pass "foo" to Template. The Template will be instantiated before log ever sees it. You conveniently left out where the Foo string is computed from something that actually need computation.
It's the same with both plain string concatenation and more expensive calls: everything is evaluated before the logger ever gets a say.
You just complicated everything to get the same class of error. Heck, even the existing good way of doing it, which is less complicated than your way, still isn't safe from it.
All your examples have the same issue, both with plain string concatenation and with more expensive calls. You can only get around an unknowing or lazy programmer if the compiler can be smart enough to skip these entirely (JIT or not; a JIT would need to see that these calls never amount to anything and decide to skip them after a while, which is not deterministically useful, of course).

Yeah, it's hard to prevent a sufficiently motivated dev from shooting themselves in the foot; but these still help.
> You conveniently left out where the Foo string is computed from something that actually needs computation.
I left it out because the comment I was replying to was pointing out that some logs don't have params.
For the approach using a `Template` class, the expectation would be that the docs would call out why this class exists in the first place: to enable lazy computation. Doing string concatenation inside a template constructor should raise a few eyebrows when writing or reviewing code.
I wrote `logger.log(new Template("foo"))` in my previous comment for brevity as it's merely an internet comment and not a real framework. In real code I would not even use stringy logs but structured data attached to a unique code. But since this thread discusses performance of stringy logs, I would expect log templates to be defined as statics/constants that don't contain any runtime value. You could also integrate them with metadata such as log levels, schemas, translations, codes, etc.
Regarding the args themselves, you're right that they can also be expensive to compute in the first place. You could then design the args to be passed via callbacks, which would allow deferring the param computation.
A possible example would be:
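(A minimal sketch; the `Supplier`-based overload and the class name are illustrative, not from any real framework.)

    import java.util.function.Supplier;

    class DeferredArgsLogger {
        volatile boolean traceEnabled = false;

        // Each arg is a callback, so expensive computations (and toString calls)
        // only run when the trace level is actually enabled.
        @SafeVarargs
        final void trace(String template, Supplier<Object>... args) {
            if (!traceEnabled) return;
            Object[] values = new Object[args.length];
            for (int i = 0; i < args.length; i++) {
                values[i] = args[i].get(); // the deferred work happens here
            }
            emit(template, values);
        }

        void emit(String template, Object[] values) { /* format + write */ }
    }

    // Usage: nothing inside the lambdas runs while tracing is off.
    //   logger.trace("Doing {} to {}", () -> foo.expensiveDescription(), () -> bar);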
This is still not perfect, as you may need to compute some data before the log "just in case" you need it for the log. For example, you may want to record the current time, then do the operation. If the operation times out, you use the time recorded before the op to compute how long it ran. If you did not time out and don't log, then getting the current system time is "wasted".

All I'm saying is that `logger.log(str)` is not the only possible API, and that splitting the definition of the log from the actual "emit" is a good pattern.
Unless log() is a macro of some sort that expands to if(logEnabled){internalLog(string)} - which a good optimizer will see through and not expand the string when logging is disabled.
Ideally, yes, but realistically I have never heard of any major programming language that lets you express "this function only accepts a static constant string literal".
Python has LiteralString for this exact purpose. It's only on the type checker level, but type checking should be part of most modern Python workflows anyway. I've seen DB libraries use this a lot for SQL parameters.
https://typing.python.org/en/latest/spec/literal.html#litera...
Beyond LiteralString there are now also t-strings, introduced in Python 3.14, which ease how one writes templated strings without losing out on security. Java has something similar with the StringTemplate class, a preview feature in Java 21.
We have this in C++ at Google. It's like securitytypes::StringLiteral. I don't know how it works under the hood, but it indeed only allows string literals.
Even PHP has that these days via static analysis https://phpstan.org/writing-php-code/phpdoc-types#other-adva...
C++20 offers `consteval` to make this clear, but you can do some simple macro wizardry in C++11 to do this:
(The re-evaluation of x doesn't matter if it compiles.) You can also use a user-defined literal, which has a different ergonomic problem.

In Rust, this can almost be expressed as `arg: &'static str` to accept a reference to a string whose lifetime never ends. I say “almost” because this allows both string literals and references to static (but dynamically generated) strings.
For Rust’s macros, a literal can be expressed as `$arg:lit`. This does allow other literals as well, such as int or float literals, but typically the generated code would only work for a string literal.
Not the language, but the linter can do it. IntelliJ inspections warn you if you do it: https://www.jetbrains.com/help/inspectopedia/StringConcatena...
It does seem like something a good static analysis tool should be able to catch, though.
This is not true. Any modern Java compiler will generate identical bytecode for both. Try it yourself and see! As a programmer you do not need to worry about such details, this is what the compiler is for. Choose whatever style feels best for you.
> Any modern Java compiler will generate identical bytecode for both. Try it yourself and see!
You may be misunderstanding something here.
If you follow the varargs-style recommendation, then concatenation occurs in the log class.
If you follow the guard-style recommendation, then the interpolated expressions will not be evaluated unless the log level matches.
In the naive approach, concatenation always occurs and all expressions which are part of the interpolation will be evaluated no matter the log level.
Could it be that you were thinking about StringBuffer vs. concatenation, an entirely unrelated problem?
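To make the distinction concrete, here is a small Java sketch with a hypothetical logger and an artificially expensive argument, showing what still gets evaluated at each call site when trace is disabled (the `{}` placeholders are just an illustrative convention):

    class EvaluationDemo {
        static boolean traceEnabled = false;

        static void trace(String template, Object... args) {
            if (!traceEnabled) return; // formatting would happen here, only when enabled
        }

        static void trace(String message) {
            if (!traceEnabled) return; // too late: the caller already built the string
        }

        static String expensiveDescription() {
            try { Thread.sleep(10); } catch (InterruptedException e) { /* ignore in this sketch */ }
            return "big object dump";
        }

        public static void main(String[] unused) {
            // Naive: concatenation AND expensiveDescription() run on every call,
            // even though nothing is ever logged.
            trace("Doing " + expensiveDescription() + " to bar");

            // Varargs style: no concatenation here, but expensiveDescription()
            // still runs to produce the argument value.
            trace("Doing {} to {}", expensiveDescription(), "bar");

            // Guard style: nothing inside the block runs at all.
            if (traceEnabled) {
                trace("Doing " + expensiveDescription() + " to bar");
            }
        }
    }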
I still quite like the Windows log approach, which (if logged) stores just the template id together with the values, saving lots of storage as well, e.g. 123, foo, bar. You can do the concatenation in the reader.
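Roughly, the idea looks like this (a minimal Java sketch; the template registry, ids, and wire format are made up for illustration and are not the actual Windows mechanism):

    import java.util.Map;

    class TemplateIdLogger {
        // Template ids are agreed on by writer and reader; only the id and the
        // raw argument values ever hit the log file.
        static final int OP_APPLIED = 123; // the reader knows this as "Doing %s to %s"

        void log(int templateId, Object... args) {
            StringBuilder line = new StringBuilder().append(templateId);
            for (Object arg : args) {
                line.append(',').append(arg); // naive CSV, fine for a sketch
            }
            write(line.toString()); // e.g. "123,foo,bar"
        }

        void write(String line) { System.out.println(line); }
    }

    class LogReader {
        // The reader owns the templates and does the formatting, so that cost
        // is paid only when a human actually looks at the log.
        static final Map<Integer, String> TEMPLATES = Map.of(123, "Doing %s to %s");

        String render(String line) {
            String[] parts = line.split(",");
            String template = TEMPLATES.get(Integer.parseInt(parts[0]));
            Object[] args = java.util.Arrays.copyOfRange(parts, 1, parts.length);
            return String.format(template, args);
        }
    }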
So, it costs perf every time it's read, instead of when it's written (once). And of course it has a lot of overhead to store the metadata. Bad design. As usual.
Most logs are probably never read, but they nevertheless should be written (fast) for the unexpected situations when you will later need them. And logging has to be fast, with minimal performance overhead.
> The problem is, especially in a hot loop ... The proper way to do something like this in java is either log.trace(..., ...) or if (log.traceEnabled()) log.trace(...)
The former still creates strings for the garbage collector to mop up, even when log.traceEnabled() is false, no?
Also, even if the former or the latter is implemented with an internal `if (traceEnabled)` guard inside trace() itself, most optimising JIT compilers will hoist that if-condition out when log tracing is false anyway.

How about wrapping the log.trace param in a lambda and monkeypatching log.trace to take a function that returns a string, and of course pushing the conditional into the monkeypatched func?
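In Java that would look roughly like this (a sketch; the class name and the `Supplier`-taking method are illustrative, not a specific framework's API):

    import java.util.function.Supplier;

    class LazyMessageLogger {
        volatile boolean traceEnabled = false;

        // The caller hands over a lambda instead of a string, so the
        // concatenation only happens if tracing is actually on.
        void trace(Supplier<String> message) {
            if (traceEnabled) {
                write(message.get());
            }
        }

        void write(String line) { System.out.println(line); }
    }

    // Usage:
    //   log.trace(() -> "Doing " + foo + " to " + bar);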
Then you still have the overhead of the log.trace function call and of the lambda construction (which is not cheap, because it closes over the params being logged and is passed as an argument to a function call, so it probably gets allocated on the heap).
>Then you still have the overhead of the log.trace function call
That's not an overhead at all. Even if it were, it's not comparable to string concatenation.
Regarding the overhead of the lambda and of copying params: it depends on the language, but usually strings are passed by reference and pass-by-value args are just one word long, so we are talking one cycle per variable and 8 bytes of memory, which were already paid anyway.
That said, logging functions that just take a list of vars are even better, like Python's print():
> printtrace("var x and y", x, y)
> def printtrace(*args):
>> print(*args) if trace else None
Python gets a lot of flak for being a slow language, but you get so much expressiveness that you can invest in optimization after paying a flat cycle cost.
That’s what most languages, including Java, do.
The problem the OP is pointing out is that some programmers are incompetent and do string concatenation anyway. A mistake which, if anything, is even easier to make in Python thanks to string interpolation.
That is why the popular `tracing` crate in Rust uses macros for logging instead of functions. If the log level is too low, it doesn't evaluate the body of the macro at all.
Does that mean the log level is a compilation parameter? Ideally, log levels shouldn't even be startup parameters; they should be changeable on the fly, at least for any server-side code. Having to restart is bad enough; having to recompile to get debug logs would be an extraordinary nightmare (not only do you need to get your customers to reproduce the issue with debug logs, you actually have to ship them new binaries, which likely implies export controls and security validations, etc.).
I don't know how Rust does it, but my internal C++ framework has a global static array so that we can look up the current log level quickly and change it at runtime as needed. It is very valuable to be able to turn on specific debug logs when someone has a problem and we want to know what some code is doing.
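A rough Java analogue of that idea (the module ids, level constants, and array layout here are made up for illustration):

    import java.util.concurrent.atomic.AtomicIntegerArray;

    class LogLevels {
        static final int TRACE = 0, DEBUG = 1, INFO = 2, WARN = 3;

        // One slot per module/subsystem. AtomicIntegerArray makes runtime
        // changes (e.g. from an admin endpoint) visible to all threads.
        static final AtomicIntegerArray LEVELS =
                new AtomicIntegerArray(new int[] { INFO, INFO, WARN });

        static boolean traceEnabled(int moduleId) {
            return LEVELS.get(moduleId) <= TRACE; // a single array read, cheap on hot paths
        }

        static void setLevel(int moduleId, int level) {
            LEVELS.set(moduleId, level); // flip specific debug logs on while diagnosing a live issue
        }
    }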
I know this is standard practice, but I personally think it's more professional to attach a gdb-like debugger to the process instead of depending on coded log statements.
A very common thing that will happen in professional environments is that you ship software to your customers, and they will occasionally complain that in certain situations (often ones they don't fully understand) the software misbehaves. You can't attach a debugger to your customer's setup that had a problem over the weekend and got restarted: the only way to debug such issues is to have had log statements programmed in ahead of time.
In my professional life, somewhere over 99% of time, the code suffering the error has either been:
1. Production code running somewhere on a cluster.
2. Released code running somewhere on an end-user's machine.
3. Released production code running somewhere on an end-user's cluster.
And errors happen at weird times, like 3am on a Sunday morning on someone else's cluster. So I'd just as soon not have to wake up, figure out all the paperwork to get access to some other company's cluster, and then figure out how to attach a debugger. Especially when the error is some non-reproducible corner case in a distributed algorithm that happens once every few months, and the failing process is long gone. Just no.
It is so much easier to ask the user to turn up logging and send me the logs. Nine times out of ten, this will fix the problems. The tenth time, I add more logs and ask the user to keep an eye open.
I think I get the idea: gdb is too powerful. For contexts where the operator is distinct from the manufacturer, the debug/logging tool needs to be weaker and not ad hoc, so it can be audited and so it avoids exfiltrating user data.
It's not so much about power, but about the ad-hoc nature of attaching a debugger. If you're not there to catch and treat the error as it happens, a debugger is not useful in the slightest: by the time you can attach it, the error, or the context where it happened, are long gone. Not to mention, even if you can attach a debugger, it's most often not acceptable to pause the execution of the entire process for you to debug the error.
Especially since a lot of the time an exception being raised is not the actual bug: the bug happened many functions before. By logging key aspects of the state of the program, even in non-error cases, when an error happens, you have a much better chance of piecing together how you got to the error state in the first place.
The idea in Java is to let the JIT optimise away the logging code.
This is more flexible as it still allows runtime configuration of the logging level.
The OP is simply pointing out that some programmers are incompetent and call the trace function incorrectly.