Search - "python tesseract"
-
So my dad has a food printer he uses to print edible images on the cakes our customers order. The printer needs to run at least once a week to keep its ink from drying out, because dried ink can damage it. My dad doesn't get regular orders, though...
The printer has a standard function to test all colors.
My dad asked me how this task could be managed regularly, as I'm the IT guy 🙄. His idea was to log all the dates on paper.
Now I'm trying to automate this task via Windows so we don't have to care about papers to manually log when the next test must run. On Windows the printer settings can be accessed to run this color check.
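The paper log can be replaced by a small date file. A minimal sketch, assuming a daily scheduled run (for example via Windows Task Scheduler) and a hypothetical stamp file name; the printer call itself is left out:

```python
from datetime import date, timedelta
from pathlib import Path

# Hypothetical file that stores the date of the last color check.
STAMP = Path("last_nozzle_check.txt")


def is_check_due(last_run, today, max_gap_days=7):
    """True if no check is recorded or the last one is max_gap_days or more old."""
    if last_run is None:
        return True
    return today - last_run >= timedelta(days=max_gap_days)


def read_last_run(path=STAMP):
    """Return the stored date, or None if the file is missing or unreadable."""
    try:
        return date.fromisoformat(path.read_text().strip())
    except (FileNotFoundError, ValueError):
        return None


def record_run(today, path=STAMP):
    """Overwrite the stamp file with today's date in ISO format."""
    path.write_text(today.isoformat())
```

The scheduled script would call `is_check_due(read_last_run(), date.today())`, trigger the color check only when it returns True, and then call `record_run(date.today())`.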
... I've got a feeling this will be another one of those tasks that I overengineer 😅. I've already done my research on automated batch jobs (never done batch before), but the commonly suggested command for a "Düsentestmuster" (nozzle test pattern), i.e. the color check, prints a different overview than I expected, and it doesn't fit the purpose.
Now I'm here and, as I currently see no way of simplifying it, I have to kinda simulate a person who opens these settings and runs the check, using Python, pyautogui and Tesseract OCR so the program doesn't click anything wrong. Although I'm sure there should be an easier way, I haven't found it, so I guess I'll proceed on this path and take the experience I gain as a bonus...
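The "OCR before clicking" idea could look roughly like this. It is only a sketch: the function names are made up, `lang="deu"` assumes the German Tesseract language data is installed, and matching is per single word because `pytesseract.image_to_data` reports word-level boxes.

```python
def find_label_center(ocr, label):
    """Return the (x, y) centre of the first OCR word equal to label, else None.

    `ocr` is a dict of parallel lists as returned by
    pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT).
    """
    wanted = label.strip().lower()
    for i, word in enumerate(ocr["text"]):
        if word.strip().lower() == wanted:
            x = ocr["left"][i] + ocr["width"][i] // 2
            y = ocr["top"][i] + ocr["height"][i] // 2
            return x, y
    return None


def click_label(label, lang="deu"):
    """Click a label only if OCR actually finds it on screen."""
    # Imported lazily so find_label_center works without these packages.
    import pyautogui
    import pytesseract

    shot = pyautogui.screenshot()
    ocr = pytesseract.image_to_data(
        shot, lang=lang, output_type=pytesseract.Output.DICT
    )
    pos = find_label_center(ocr, label)
    if pos is None:
        raise RuntimeError(f"{label!r} not found on screen; refusing to click")
    pyautogui.click(*pos)
```

Refusing to click when the label is missing is the whole point: a blind click at fixed coordinates is what goes wrong when a dialog opens in a different place.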
-
I used to think that I had matured. That I should stop letting my emotions get the better of me. Turns out there's only so much one can bottle up before it snaps.
Allow me to introduce you folks to this wonderful piece of software: PaddleOCR (https://github.com/PaddlePaddle/...). At this time I'll gladly take any free OCR library that isn't Tesseract. I saw the thing, thought: "Heh. 3 lines quick start. Cool.", and the accuracy is decent. I thought it was a treasure trove that I could shill to other people. That was before I found out how shit of a package it is.
First test: I found out that logging is enabled by default. Sure, logging is good. But I was already rocking my own logger, and I wanted it to shut the fuck up about its logs because they were noise next to the stuff I actually wanted to log. I could not intercept its logging events, and somehow just importing it set the global logging level from INFO to DEBUG. Maybe it's a Python quirk, who knows. Checked the source code: ah, the constructor takes a `show_log` arg to control logging. The fuck? Why? Why not let the user opt into your logs? Why is logging on by default?
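A generic way to defend against a noisy import, using only the standard `logging` module. This is a sketch, not PaddleOCR's API: the logger name in the usage comment is an assumption, and the helper names are made up.

```python
import logging
from contextlib import contextmanager


@contextmanager
def keep_root_level():
    """Restore the root logger's level after the block, whatever happens inside."""
    root = logging.getLogger()
    saved = root.level
    try:
        yield
    finally:
        root.setLevel(saved)


def mute_logger(name, level=logging.ERROR):
    """Raise a named logger's threshold and stop it feeding the root handlers."""
    noisy = logging.getLogger(name)
    noisy.setLevel(level)
    noisy.propagate = False
    return noisy


# Usage sketch (package and logger names are assumptions):
# with keep_root_level():
#     import some_noisy_ocr_package
# mute_logger("ppocr")
```

The context manager only helps on the first import of the package; once a module is cached, importing it again runs nothing.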
But sure, it's just logging. Surely, no big deal. SURELY, it's got decent documentation that is easily searchable. Oh, oh sweet summer child, there ain't. Docs are just some loosely bundled Markdown files chucked into /doc. Hey, docs at least. Surely, surely there's something somewhere about all the args to the OCRer constructor. NOPE! Turns out, for all the args, you gotta reference its `--help` switch on the command line. And like all "good" software from academia, unless you're part of academia, it's obtuse as fuck. Fine, fuck it, back to /doc, and it took me 10 minutes of rummaging to find the correct Markdown file that describes the params. And good-fucking-luck to you trying to translate all them command line args into Python constructor params.
"But PTH, you're overreacting!". No, fuck you, I'm not. Guess whose code broke today because of a 4th number version bump. Yes, you are reading correctly: my code broke because of a 4th number version bump, from 2.6.0.1 to 2.6.0.2, introducing a breaking change. Why? Because apparently, upstream decided to nest the OCR result in another layer. Fuck knows why. They did change the doc. Guess what they didn't do. PROVIDING, A DAMN, RELEASE NOTE. Checked their repo, checked their tags: nothing marking any release past the 3rd number. All releases go straight to PyPI, quietly, silently, like a moron. And bless you if you tell me "Well you should have reviewed the docs". If you do that for your project, for all of your dependencies, my condolences.
Could I just fix it? Yes. Without ranting? Yes. But for fuck's sake, if you're writing software for a wide audience you're kinda expected to be even more sane in your software's structure and release conventions. Not this. And note: the people writing this aren't random people without coding expertise. But man, they feel like they are.
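For the "extra layer" breakage, a defensive unwrap is one way to survive both versions. This is a sketch under assumptions: a line is taken to look like `[box, (text, score)]`, and the newer release is taken to wrap those lines in one more list per image. Neither shape is confirmed here against the library's docs.

```python
def _is_line(item):
    """A line is assumed to look like [box, (text, score)]."""
    return (
        isinstance(item, (list, tuple))
        and len(item) == 2
        and isinstance(item[1], (list, tuple))
        and len(item[1]) == 2
        and isinstance(item[1][0], str)
    )


def flatten_result(result):
    """Return a flat list of lines whether or not the result has the extra layer."""
    if not result:
        return []
    if all(_is_line(item) for item in result):
        return list(result)  # older, flat shape
    lines = []
    for page in result:  # newer shape: one list of lines per image
        lines.extend(page or [])  # a page with no text may be None
    return lines
```

Pinning the exact version (for example `paddleocr==2.6.0.1` in the requirements file) is the other half of the defence, since a 4th-number bump clearly can't be trusted here.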
-
TLDR;
How much do you earn for your skill set in your country vs your cost of living?
BONUS;
See how much I & others earn.
Recently I became aware of just how massive the gap in developers' earnings is between countries. I'd love to calculate a fixed score for income vs cost of living.
I know this stuff is sensitive to some so if you prefer just post your score (avg income p/m after tax / cost of living).
I'm not shy so I'll go first:
MY RATES
Normal Rate (Long term): $23
Consulting / Short term: $30-$74
Pen Test: $1500 once off.
Pen Test Fixes: consulting rate.
Simple work/websites: min $400+
Family & Friends: Dev friends are usually free (when mutually beneficial). Family and others can fuck off, even if they can pay (I pass their info to dev friends with fair warning).
GENERAL INFO
Experience: 9 years
Country: South Africa
Developer rareness in country: Very Rare (+-90 job openings per job seeker).
Middle class wage in country: $1550 p/m (can afford a new car, decent apartment & some luxuries like beer/eating out).
Employment type: Permanent though I can and do freelance occasionally.
Client Locality: Mostly local.
Developer Type: Web Developer (True web dev - I do anything web related from custom HTTP servers to sockets, services, advanced browser api's, apps & more).
STACKS / SKILLSETS
I'M PROFICIENT IN:
python, JavaScript, ASP classic, bash, php, html, css, sql, msql, elastic search, REST, SOAP, DOM, IIS, apache
I DABBLE WITH:
ASP.net, C++, ruby, GO, nginx, tesseract
MY SPECIALTIES:
application architecture, automation, integrations, db's, real time data, advanced browser apps/extensions (webRTC, canvas etc).
SUMMARY
Avg income p/m after tax: $2250
Cost of living (car+rent+food): $1200
Score: 1.88
*Note: For integrity when calculating my cost of living I excluded debt repayments and only kept my necessities which are transport, food & shelter.
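The score is just average monthly income after tax divided by monthly cost of living. A tiny sketch using the figures from the summary (the function name is made up); with $2250 and $1200 the ratio is 1.875:

```python
def living_score(income_after_tax, cost_of_living):
    """Ratio of monthly income after tax to monthly cost of living."""
    if cost_of_living <= 0:
        raise ValueError("cost of living must be positive")
    return income_after_tax / cost_of_living


# Figures from the summary: income $2250, cost of living $1200.
print(living_score(2250, 1200))
```

A score of 1.0 means income exactly covers transport, food and shelter; anything above that is what's left for everything else.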
I really hope you guys post your results, it would be great to get an idea of which is really the worst / best country to be a developer in.
-
I have been working on OCRs recently and I just have one thing to say:
FUCK tesseract's documentation
SERIOUSLY HOW IS IT SOO POOR?
-
Story of a first-time hackathon.
So, I took part in the COVID-19 Global Hackathon.
Long story short, I got excited at OCR and just went with the most challenging challenge - digitizing forms with handwritten text and checkboxes, ones which say whether you have been in contact with someone who could have Coronavirus.
And, unsurprisingly, it didn't work within 4 days. I joined up with 2 people, who both left halfway through (one announced it, one left silently), and another guy joined, said he had something working and then disappeared.
We never settled on a stack - we started with a local docker running Tesseract, then Google Cloud Vision, then we found Amazon Textract. None worked easily.
Timezone differences were annoying too. There was a 15-hour difference across our zones. I spent hours in the Slack channel waiting.
We didn't make the deadline, and the people who set the challenge needed the solution within 10 days, a deadline we also missed. We ended up with a basic-bitch Vue app to take pictures with mock Amazon S3 functionality, empty TDD in Python and also some OCR work.
tbh, that stuff would've worked if we had 4 weeks. I understand why everyone left.
I guess the lesson from this is not to be over-ambitious with hackathons. And not to over-estimate computers' detection abilities.
-
TLDR;
Side project update.
Made a simple NLP library in Python and published its first version as open source.
Now I can feed it parsed PDF text.
See rant https://devrant.com/rants/2192388/...
Why?
Because while reading a book about NLTK I couldn't find a simple, extensible way to add support for the Polish language, and I wanted to abstract stemming, word normalization, tokenizing etc. so I can provide, for example, different conditions for separate text files without writing much code, which is an asset when you work solo.
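One way such an abstraction could look: every stage is a swappable callable and "conditions" are just predicates. This is an illustration only, not the published library's API; all names are hypothetical and the stemmer is a toy that strips a few suffixes, not a real Polish stemmer.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def toy_stem(word, suffixes=("ami", "ach", "ów")):
    """Toy stemmer: strip one known suffix if enough of the word remains."""
    for s in suffixes:
        if word.endswith(s) and len(word) > len(s) + 2:
            return word[: -len(s)]
    return word


@dataclass
class TextPipeline:
    tokenize: Callable[[str], List[str]] = str.split
    normalize: Callable[[str], str] = str.lower
    stem: Callable[[str], str] = lambda w: w
    # Every condition must return True for a token to be kept.
    keep: List[Callable[[str], bool]] = field(default_factory=list)

    def run(self, text):
        out = []
        for tok in self.tokenize(text):
            tok = self.stem(self.normalize(tok))
            if all(rule(tok) for rule in self.keep):
                out.append(tok)
        return out


# A pipeline for one kind of file: toy stemming, drop very short tokens.
laws = TextPipeline(stem=toy_stem, keep=[lambda t: len(t) > 2])
print(laws.run("Ustawa o podatkach"))
```

A different set of files gets a different `TextPipeline(...)` instance, so the per-file differences stay in configuration rather than in new code.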
It's about 12GB of publicly accessible law data in PDF that I am trying to handle (at first), which is about 35000 files from the last 90 years.
So far I have automated downloading web pages and the PDF documents linked from them, extracting data from the web pages and saving it to a database, and extracting text from the PDF files. I have about 5-6 projects that do all of the above; maybe at the end I will put them into some workflow manager like Luigi or just run them from a cron job.
The first thing for website version 1.0 is to find correlations between all documents inside the law text, using the NLP library with custom conditions. Then just generate a directory structure and HTML files with links between documents.
Website version 2.0 is already in my mind, but it will be creepy to build and will take at least 1-2 months, and I want to publish fast.
I have some PDFs with only images instead of text, and Tesseract worked quite well on them, so maybe I will try to process them once everything goes live.
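Handling the image-only PDFs could be sketched like this, assuming the `pdf2image` and `pytesseract` packages (plus Poppler and the Polish Tesseract language data) are installed. The function names and the 20-character threshold are made up for illustration.

```python
def needs_ocr(extracted_text, min_chars=20):
    """Treat a page as image-only if normal text extraction returned almost nothing."""
    return len(extracted_text.strip()) < min_chars


def ocr_pdf(path, lang="pol", dpi=300):
    """Render each PDF page to an image and run Tesseract on it."""
    # Imported lazily so needs_ocr works without the OCR stack installed.
    from pdf2image import convert_from_path
    import pytesseract

    pages = convert_from_path(path, dpi=dpi)
    return [pytesseract.image_to_string(page, lang=lang) for page in pages]
```

Checking `needs_ocr` first keeps the slow OCR path for the files that really need it.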
Learned a lot about PDF: I now know that a font in a PDF does not always provide Unicode characters (a stupid form of obfuscation), so when you extract text you need to build a glyph-to-text map for every font.
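In miniature, a per-font glyph map works like this. It is a toy sketch: real PDFs may ship such a mapping as a ToUnicode CMap, and when they don't, the table has to be built by hand per font. The font name and glyph ids below are invented.

```python
def decode_glyphs(glyph_ids, to_unicode, missing="\ufffd"):
    """Translate glyph ids into text using one font's glyph-to-text map."""
    return "".join(to_unicode.get(g, missing) for g in glyph_ids)


# One hand-built map per font (hypothetical font name and ids).
font_maps = {
    "F1": {1: "a", 2: "w", 3: "L"},
}

print(decode_glyphs([3, 1, 2], font_maps["F1"]))
```

Unknown glyphs come out as the Unicode replacement character, which makes gaps in a hand-built map easy to spot in the extracted text.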
PDF is a full vector representation, just like SVG, which is logical if you think about it and know that some printers run on PostScript.
Let's hope the next update will be about the Flutter mobile app which started all of the shit above. It's almost ready (except for getting data from the API I am trying to build, and a logo for the release version). It's the last piece of the puzzle.
-
Developed this project "Audio Book Generator"
Implementing speech synthesis(📖 to 🗣) on eBooks
Bored with writing notes in a lecture? How about we convert the notes dictated by the lecturer into text? Use the speechtotext.py script to get the text format of spoken notes, which saves the text in a .txt file.
Too lazy to read a novel? Get an Ebook version of the novel and run the finalAudioBookGenerator.py script. It will generate an mp3(audio) format of the book. Enjoy book listening :)
You can also convert your single images using the singleImageReader.py script.
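The image-to-audio path could be sketched as follows. This is not the project's actual code: it assumes `pytesseract`, Pillow and `gTTS` are the libraries involved, and the function names are made up.

```python
import re


def clean_ocr_text(text):
    """Re-join words hyphenated across lines and collapse whitespace.

    Note: this also joins genuine hyphens that happen to end a line.
    """
    text = re.sub(r"-\s*\n\s*", "", text)
    return re.sub(r"\s+", " ", text).strip()


def image_to_mp3(image_path, mp3_path, lang="en"):
    """OCR a single image and save the recognised text as spoken audio."""
    # Imported lazily so clean_ocr_text works without these packages.
    from PIL import Image
    import pytesseract
    from gtts import gTTS

    text = clean_ocr_text(pytesseract.image_to_string(Image.open(image_path)))
    if not text:
        raise ValueError("no text recognised in image")
    gTTS(text=text, lang=lang).save(mp3_path)
```

Cleaning the OCR output first matters for listening: raw line breaks and split words tend to produce odd pauses in synthesized speech.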
Demonstration:
https://youtu.be/xhMvGg1dAsg
Project:
https://github.com/globefire/...
Star it if you liked it. :)