8
dmoen
2y

I dont understand the Log4j vulnerability.

Isnt the ability to execute code a feature they added so that you can add dynamic data to the logs?

If it is a feature then isnt it written in the documentation?

Is the problem that a lot of companies forgot to sanitize the input before logging it?

Comments
  • 6
    "it's not a bug it's a feature!"

    Development teams that introduce a feature that is active by default that compromises every security measure should be held accountable.

    This was a negligent disregard for the safety of their users environments and the trust in their products has been damaged beyond repair.

    The only fitting punishment that I can see is for them to live with the shame knowing they caused a global panic and caused trillions of dollars of loss.

    They will forever be the butt of every "Java sucks" joke and will always be the cautionary tale of "think about what this feature is doing".

    May God have mercy on their souls because the industry will surely damn them over the next month.
  • 4
    :)
    consider sql. Isn't it sql's feature that you can build it using arbitrary strings? Like filters, pagination, etc. Is it wrong to use input from users to form a db query? Or is it bad that it's a feature implemented long before, just users forgot to sanitize the inputs and got an sqli?

    Slf4shell is the same problem. The same mechanism. Except log4j has an api looking more like prep-statements rather than raw sqls. So intuituvely everyone assumed to leave parameters unsanitized. Noone's sanitizing PS params and dbms merges them into the qry. Log4j was assumed to behave similarly. Unfortunatelly, all the params had the same power as the base qry.

    Frankly, I didn't find this mentioned in docs [that params are also evaluated], so this might be a docs bug.
  • 3
    @sariel Except that it's the industry who happily freeloaded on the leisure project of some devs as essential package, routinely bypassed distro package managers via piling up fat jars, and also didn't care to invest a dime into any sort of review activities.
  • 6
    @netikras The documentation has explained this in great detail.

    The thought and rationale behind it was to provide a way to inject at runtime values - the lookup system. Lookup exactly as it is meant - take value e.g. from the java system properties ${sys:<var>} or from the env ${env:<var>}.

    The lookups also included JNDI - Java Naming and Directory Interface.

    The idea is to lookup a specific value in an LDAP directory for example and insert it, an DNS query, and so on.

    The core problem at hand is over anticipation in my opinion.

    Instead of looking at the problem in it's simplest form, replacing the values at runtime <if needed>, they decided do go the very muddy way:

    - by default, look for all possible lookups in a string recursively
    - do each lookup till all lookups are resolved
    - support default values for unset runtime variables

    Which is the reason the attack vector is so grave: By using the default values fallback, an recursive implementation and defaulting to do lookups by default on all provided values, you have an endless myriad of combinations to trigger the JNDI lookup.

    E.g. '${env:?"jndi://"} + ... - string' concatenation of env (variable syntax is not correct for obvious reasons like breaking my fingers on a smartphone) to create the JNDI variable lookup by concatenating values from another lookup.

    Over anticipation. Trying to solve all corner cases and provide all features possible without clearly defining what problem should be solved _by default_.

    TLDR: "Magically all values should appear and if not a default value could be provided".

    Over anticipation and it's result, featuritis is a common thing.

    I'm not an fan of minimalism, but I think this case should be explained to every dev and intern - don't try to solve all problems, you're digging your own grave.
  • 1
    @Fast-Nop and the Apache foundation should have seen this as an opportunity to review a mission critical library that was being used across multiple services and sites.

    This is the way the double edge cuts on community driven development.

    So who do we hold accountable?

    The only real answer is ourselves. We did this to ourselves by not vetting plugins and libraries well enough.

    But the devs that let this "feature" through, should be ashamed and virtually flogged for the negligence alone. Maybe next time they'll think twice when they want to develop a feature that allows the user to shoot themselves in the face.
  • 0
    Soo...if it was in the documentation then the whole issue is that the api should have been more self explaining?
  • 0
    @sariel But should you really have to think of every possible security flaw that can happen if devs use your library in a certain way?
  • 5
    @dmoen No.

    The problem is that noone cared. It was always a problem yet no one pointed it out.

    And in this case it is _obviously_ a security flaw.

    And your reasoning is a very dangerous one.

    It's a security flaw based on design, not on some obscure code level. It was obvious for years.
  • 1
    @IntrusionCM Can you, please, point me in the right direction? I'm digging through lookups, config docs - can't find where it says that all parameters of a logging statement (e.g. logger.info("name: {}, age: {}, address: {}", a, b, c)) are a subject to lookups.

    I truly want to find this part.
  • 3
    @dmoen point blank, this was a feature they left on. Not a flaw of the code.

    This was a flaw in judgment.

    I don't even know HOW anyone can even defend such an obviously terrible idea. I'm not even in devsec but Jesus fucking Christ how many people looked at that and thought, "this is a fucking great idea!".

    That feature existed for one of these four reasons:

    - the devs are too green to understand devsec
    - the devs are just fucking lazy
    - the devs are negligent and derelict in their responsibilities as software engineers
    - the devs purposely exposed their community for some devious motive
  • 3
    @netikras I posted somewhere a link to they way back archive.

    The documentation was changed as the defaults changed due to the last patches.

    https://web.archive.org/web/...

    This is a link of 2012.

    Property Substitution for Configuration.

    Note that this is a reference link that is mentioned in several subpages.

    It wasn't till the addition of nolookup that this got explicitly stated in the pattern specifier for msg:

    https://web.archive.org/web/...

    'Use {nolookups} to log messages like "${date:YYYY-MM-dd}" without using any lookups. Normally calling logger.info("Try ${date:YYYY-MM-dd}") would replace the date template ${date:YYYY-MM-dd} with an actual date. Using nolookups disables this feature and logs the message string untouched. '

    Giving an explicit hint that property Substitution with lookups works in all cases, even messages.
  • 1
    Maybe to make it a bit more obvious:

    Many frameworks / libs / ... ship a predefined configuration.

    You have in Log4J appenders, which create a stream of events that's written via the definition of a layout.

    A layout in it's simplest form is the pattern layout. As defined or explained here:

    https://web.archive.org/web/...
  • 2
    @highlight
    <Appenders>
    <Console name="Console" target="SYSTEM_OUT">
    <PatternLayout pattern="%d{HH:mm:ss.SSS} [%t] %-5level %logger{36} - %msg%n"/>
    </Console>
    </Appenders>
  • 0
    @IntrusionCM I guess I have to know log4j's internals to see what you see in docs.

    For instance, what data structure is the "message" in when a layout is applied? A String? Or a POJO, like
    class Message {
    private String pattern;
    private List<String> arguments;

    /* ... */
    public String toString() {
    return fillTemplate(doLookups(pattern), arguments);
    }
    }

    If the latter, then it's safe to say that interpolation only happens in the "template" part of the message - arguments do not participate. If the former - then yes, all the pieces participate in the crime.

    As of now, `Giving an explicit hint` is a speculation. Not something clearly stated in docs. In fact, docs so far confirm my theory: 'logger.info("Try ${date:YYYY-MM-dd}") would replace the date template'. where logger.info() signature is:

    void info(String template, Object... arguments);

    that part of the doc claims that template gets interpolated. Not the args (conveniently, missing from the example)
  • 0
    @IntrusionCM So far I see plenty of room for speculation regarding this aspect.

    And leaving room for speculation is what I call flawed documentation.
  • 5
    @sariel honestly I don't want to give them too much blame. Yes the feature was horribly thought out and a big design flaw, but we had billion dollar companies using this library all over their products and they maintainers are still unpaid?

    That aspect just makes me a spiteful. They got a library free of charge, actively maintained and when the issue arrived the maintainers patched it. They didn't pay a dime for this – and yet they still have the audacity to complain?
  • 2
    There are people who instantly see the problem with such features in general (and especially in a logging framework).
    And there are people who never heared about Langsec.
  • 1
    @LotsOfCaffeine I think you might be confused.

    Those devs did get paid. In experience, in job security, in expertise.

    That's the double edge. You don't get your cake and eat it too.
  • 1
    @sariel joke's on you, I bought a cake today and ate it. Do you like apples?
  • 1
    @netikras

    https://web.archive.org/web/...

    You pass in just a message - everything will be chop sueyed to a string. Even of you pass additional objects as a paremeter, they'll be stringified.

    :)

    Mea culpa. Or as in German: "Asche auf mein Haupt".

    The pattern layout - as message will be stringified - thus mentions only %m or message as a value. Which is why I always assumed it had to be a recursive algorithm: When a log event with several possible injected values only becomes "one" single argument in the configuration, and the hint is given that lookups can be disabled, my brain automatically assumes that lookups must have happened recursively.
  • 0
    @IntrusionCM Yes, it WILL be stringified.. at some point.

    "the hint is given" - that's but a very weak hint, come on. Does it also hint whether the stringification is lazy or eager? It doesn't really.

    Also, I can't find where the docs state that lookups are performed on the whole string.

    Again, we're left to assume things. Even if I had read the docs beforehand, it would not occur to me that ALL the logging params participate in a lookup...

    logger.info("name: {}, age: {}", u.name, u.age);
    logger.info("server: ${env:HOSTNAME}, name: {}, age: {}", u.name, u.age);

    The pattern I see is that the first parameter (i.e. message template) holds all the variable parts: positional variables and lookup variables: {} and ${}.
    And this pattern makes a lot of sense to me.

    Unless you can (I couldn't) dig up in the docs where we can without assumptions see (example, explicit statement, etc.) that all the params participate in lookups, we can agree to disagree about the correctness of the docs.
  • 0
    @IntrusionCM IDK, man... everything tells me that the template must hold all the variable parts, not the parameters...

    come on, look at this..

    logger.printf(Level.INFO, "Logging in user %1$s with birthday %2$tm %2$te,%2$tY", user.getName(), user.getBirthdayCalendar());

    The only reason I would even get an idea telling otherwise is the configuration docs, where it says that you can use lookups in configs too. But then again, the first thought that crosses my mind is: "oh, they must be using a separate mechanism to extract the lookup tokens and evaluate them in the final log entry.. so the same mechanism is reused in two places: base config pattern and the "template" part in `logger.info(template, args...)"`.

    That's what my intuition says.

    And if other devs' intuition says a similar thing, that'd explain why it took THIS long for this bug to be overlooked.

    I mean it's impossible that NOONE read the docs for years. Even I did. Superficially, but I did
  • 1
    @netikras

    The thing you're missing is that you can nest lookups. Which is what they partially removed and largely secured in the last releases.

    I admit the documentation is very muddy and imprecise regarding this point, but still the examples are there ;)

    Look e.g. at the context map lookup, it's the first thing in the lookups page.

    "The PatternLayout supports interpolation with Lookups and will then resolve the variable for each event" is the sentence you're looking for.

    Event meaning Log Event. And yes, it means that you could do a lookup of variables in a deeply nested fashion.

    https://logging.apache.org/log4j/...

    If you take a look here, at the last point, you'll find your missing tidbit.

    "The StrSubstitutor class and StrLookup interface were borrowed from Apache Commons Lang and then modified to support evaluating LogEvents."

    When using JNDI lookup, nested lookups make sense - or context lookups, which is why they mentioned them security page XD
Add Comment