Join devRant
Do all the things like
++ or -- rants, post your own rants, comment on others' rants and build your customized dev avatar
Sign Up
Pipeless API
From the creators of devRant, Pipeless lets you power real-time personalized recommendations and activity feeds using a simple API
Learn More
Fuck Apache TIKA.
Its supposed to be a "universal file reader" or some shit. Im trying to use it as a PDF/image parser that does OCR when needed and yelds a full-file string. It does so, but the text ends up being IN THE WRONG FUCKING ORDER.
WTF would I want to parse the text out of a PDF in any order that is not the one the text is supposed to be read?!?!
"It is more efficient to work in random ordering", says the docs. No shit, really? Wouldn't it be even more efficient to just spit out random strings? Just as useful and 100% CPU-bound.
"You can add a property to forcefully put the text in the right order". THEN WHY THE FUCK IT IS NOT THE DEFAULT SETTING?
Srsly, what's the use case to a parser that yields scrambled text?!?
rant