5
atheist
103d

I'm having the dawning realization, while reading the UTF specification and thinking "parsing the data files isn't too hard...", that this little side project on a side project on a side project isn't going to be finished until, like, Christmas.

Fuck.

Comments
  • 1
    There's a way to determine the width of a UTF-8 character afaik. Maybe using that is easier than reading the full spec. Why would you need to understand UTF-8 to the fullest? You prolly only want to prevent cutting a char in the middle.

    Also, if you read byte for byte and display it that way it's automatically "utf8" right? My C apps support emoticons out of the box
  • 1
    @retoor grapheme clusters: 1 encoded character isn't necessarily 1 displayed character, e.g. the woman astronaut emoji is woman, ZWJ, rocket. There's another library that extracts them but it's not that well maintained...
  • 1
    I'm not reading the full spec, only like, 2 files fortunately 😅
  • 1
    @atheist don't forget the gay astronaut, the trans astronaut and the one in a wheelchair
  • 1
    I prefer parsing media files to parsing UTF-8, thanks to emojis.
  • 0
    @Tounai I did 5 years in video, and variable framerate is its own special hell. I figure this is a new kind of pain; variety is the spice of life after all
  • 1
    @retoor I think flags can basically be of an arbitrary size, there are so many different combining characters/variants, so... Yup 😭
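
For anyone following along, here is a rough C sketch of the two points made in the thread: the byte width of a UTF-8 sequence can be read off its lead byte, and a cut that is safe at the code point level can still split a grapheme cluster. The function names (`utf8_seq_len`, `utf8_count_code_points`, `utf8_safe_cut`) are made up for this illustration and do not come from any library. It assumes the input is already valid UTF-8 and does no real validation.

```c
#include <stddef.h>

/* Byte length of the UTF-8 sequence starting with `lead`.
   Returns 0 for a continuation byte or a byte that can never lead a sequence. */
size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;                  /* 0xxxxxxx: ASCII        */
    if (lead >= 0xC2 && lead <= 0xDF) return 2; /* 110xxxxx               */
    if ((lead & 0xF0) == 0xE0) return 3;        /* 1110xxxx               */
    if (lead >= 0xF0 && lead <= 0xF4) return 4; /* 11110xxx, <= U+10FFFF  */
    return 0;
}

/* Count code points in a NUL-terminated string by counting every byte
   that is not a continuation byte (10xxxxxx). Assumes valid UTF-8. */
size_t utf8_count_code_points(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/* Largest cut position <= max_bytes that does not land inside a code point.
   This protects code points only, not grapheme clusters. */
size_t utf8_safe_cut(const char *s, size_t len, size_t max_bytes)
{
    size_t cut;
    if (max_bytes >= len)
        return len;
    cut = max_bytes;
    /* Back up while the byte at the cut position is a continuation byte. */
    while (cut > 0 && ((unsigned char)s[cut] & 0xC0) == 0x80)
        cut--;
    return cut;
}
```

With the woman astronaut written as bytes, `"\xF0\x9F\x91\xA9\xE2\x80\x8D\xF0\x9F\x9A\x80"` (woman, ZWJ, rocket), the string is 11 bytes and 3 code points, but it displays as one emoji. Asking `utf8_safe_cut` for at most 7 bytes gives back 7, which lands cleanly between the ZWJ and the rocket: no code point is broken, yet the emoji is. Getting that case right needs the grapheme cluster rules, which this sketch does not attempt.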