22

C has too much undefined behaviour because the standards committee was being lazy and slapped UB on a lot of issues that ought to have been implementation-defined instead.

The most ridiculous example of UB: an unmatched ' or " character is encountered on a logical source line during tokenization.

Like WTF, that should be a compile time error, and it's easy to detect.

Comments
  • 2
    Although it's true that a lot of these UBs could've been better defined, luckily tools help us out a lot: I use linters, static analyzers, and the functionality in my IDE to fill that void.
    I myself at least haven't encountered many UBs that weren't identified by tools either pre-compile or post-compile.
  • 6
    @NEMESISprj Yeah, but stuff like the one I mentioned shouldn't even have been UB.

    Similarly for signed integer overflow. Sure, there used to be machines with e.g. 1's complement, but today there is only 2's complement, and that should have been implementation-defined instead of making the useful behaviour available only via compiler switches.
  • 1
    @Fast-Nop very true!
  • 8
    @Fast-Nop
    My personal favorite is malloc(0) and realloc(p,0).

    Malloc can return NULL or a pointer that points to a 0 size memory segment.

    Realloc might crash or free or shrink p to a 0 size segment.
  • 5
    Ah yes, language design by accretion or relatively-clueless committee or "we've done it this way since 1969, deal with it", my favourite kind.

    @metamourge I guess it's because of allocator overhead? A malloced memory segment has some metadata overhead so it's totally possible to have a 0 sized malloc that's still a valid chunk of data on the heap depending on your allocator design (alignment restrictions change this though).
  • 1
    C is a low-level, speed-above-everything-else, platform-independent language. The more behaviour is left undefined, the faster the language and its compiler can be, and the easier it is to support additional platforms.

    If you don't strictly need that last ounce of speed, the hardware layer proximity, or support for multiple hardware platforms, don't use a language that is optimized for that.
  • 4
    @Oktokolo How does the example from the rant itself have anything to do with runtime speed?

    And also, when I benchmark with stuff like e.g. signed overflow defined vs. undefined via a compiler switch, I see no difference in speed.
  • 1
    @Oktokolo

    https://en.wikipedia.org/wiki/...

    I wish people would stop saying this crap.

    Is there a college teacher somewhere spewing this misinformation?
  • 0
    @Demolishun my college teachers were adamant that C was a "middle level" language.

    And that came up in an MCQ test, with the fourth option being "levelless/unlevelled" language.
  • 2
    @Demolishun Why are you linking a completely redundant wikipedia article page?

    What @Oktokolo said is true. Most UB exists so that compilers for *all* platforms (not just your favorite) can make certain optimizations.

    The problem that @Fast-Nop is outlining is the fact there's just so much UB that it's really easy to miss all of it and you end up just using a smaller subset of the language out of fear of causing issues.

    They're both right, and clearly everyone involved knows what a low level programming language is.

    So what's your point? Aside from the fetish for calling things "crap".
  • 0
    @junon C isn't a low level language. People on this platform keep saying that. It's a systems language.
  • 5
    @Demolishun @sudo-compile I've literally never heard anyone formalize what a low level language is, it's a relative term. And part of my work is with languages.

    Machine language is (very) high level in computer architecture for example. It's such a high level representation when seen from microarchitecture's point of view that you'll feel like laughing at people calling it "low level" and "detailed".

    C is high level compared to machine language, but low af compared to say Java or Haskell or whatever.

    And languages really aren't low or high level, it's the concepts you try to represent in them. If I'm talking about hardware device pointers in Java that's still low level (relatively of course). Function closures in C? High level (relatively).

    Also how is this even relevant?
  • 2
    @Demolishun I make programming languages and operating systems for fun. C is both low-level (relative to most others) and is a systems language.

    C does not abstract much from the programmer aside from CPU internals.

    Again, what's your point?
  • 2
    @RememberMe exactly. x86 is very high level when you consider the microcode level, but of course very low level compared to javascript.

    Unless you're weaving the individual magnetic bits on your hard drive yourself to write hello world programs, "low level" is a relative term.
  • 4
    @junon even uops are still high level because they don't encode details about the reorder buffer, issue queues, branch predictors, different ALU stages and configurations, speculative execution hardware, register renaming, load forwarding and reordering, store buffers, prefetchers, cache hierarchies, coherence mechanism and the billions of other things that go into actually executing instructions at the microarchitectural level.

    I'm agreeing with you anyway, this is just more info.
  • 3
    @RememberMe @junon

    I can see what you mean by relativity. I think sometimes I just want to be "right" rather than open to different interpretations. I think the phrase "low level" has become a trigger point for me. I will try to be less of a "correcting douche". I hate "correctors", but I do this shit myself. What does that mean?
  • 1
    @Demolishun It means you're insecure with your knowledge and skill level.

    Also, "low level" shouldn't be the trigger point. "Native" is the term that has been completely hijacked and diluted beyond recognition.

    "Cloud native" makes my skin crawl whenever I read it.
  • 0
    @junon I can totally see engineers working on quantum computers laughing at silicon engineers idea that they are working "low level".

    Lol on the native hijack. Didn't realize that is a thing.
  • 3
    @junon The performance could mostly have been gained with implementation defined behaviour instead.

    That wouldn't fuck up one's program behind one's back and wouldn't break existing code upon compiler updates (or at least the implementation change would be documented).

    Slower? Only because the compiler uses UB to optimise whole code parts away that it should keep in the first place. The resulting program is faster but also useless.

    When they made liberal use of UB, they didn't have sadistic compiler writers like the GCC team in mind.
  • 3
    One example: bitshifts.

    There are ISAs such as x86-32 where the shift opcode has only 5 bits for the shift count, yielding 0-31. Shifting a 32-bit integer by 32 bits thus takes the count modulo 32, i.e. a shift by 0, i.e. the value is unaltered. On other ISAs, such as PowerPC, this isn't the case, so you get different behaviour.

    OK, but they could have said, use whatever you want if the shift width exceeds the data type width, but document it and use it consistently. No shift width check required, but also no licence to optimise the whole program away.
  • 1
    @Fast-Nop Yes, agreed on that point.