Search - "postscript"
-
how to make a feature request
1. dump Db table with 153 columns to Excel
2. print!
3. circle column 47 on page 3, scribble feature description
4. scan! remember to use proprietary file format no one has
5. new e-mail, add "VERY URGENT!!!" to subject line
6. write "will call, discuss details monday"
6.a. attach proprietary-scanned-excel-dump-feature-description (optional)
7. postscript: deadline wednesday!!
8. wait for tuesday
9. send!
...
-
And just when you like Linux a little too much, it bites you in the ass to remind you why the year of the Linux desktop never happened.
Wifi printer is installed, CUPS test page works, even scanning works. But printing anything else results in the printer spitting out raw postscript with a few random lines per page.
Great. Looks like I'll have to print to PDF, then go to a copy shop and print because printing under Linux is still an unsolved issue.
And yes, that would have worked even with Windows 10. Fuck.
-
I always wanted to kill my professors, but now I admire them a lot. I learnt to "fake knowledge" from them. I learnt to talk all kinds of nonsense from them. These qualities help a lot after college.
postscript: 20-25% of the professors were good in my college days.
-
Isn't Perl a beautiful language? Just check the beautiful screenshot of a function I just wrote...
Also, this is so beautiful. Did you know that you can actually print directly from a Perl script?
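use strict;
use warnings;
use PostScript::Simple;
use IO::Socket::INET;

# Build a one-page A4 document with the origin at the top-left corner
my $qp = PostScript::Simple->new(papersize => "A4", direction => "RightDown", coordorigin => "LeftTop", colour => 0, eps => 0, units => "pt");
$qp->newpage;
$qp->setfont("Courier", 20);
$qp->text(20, 20, "Hello Devrant");

# Send the generated PostScript straight to the printer over TCP port 9100
my $psock = IO::Socket::INET->new(PeerAddr => "192.168.1.40", PeerPort => "9100", Proto => "tcp");
if ($psock) {
    $psock->autoflush(1);
    print $psock $qp->get();
    close($psock);
}
-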
Fun times with postscript:
I have two EPS files that are generated by a program.
Each file contains the PostScript describing the file (~6000 lines), followed by the preview image as TIFF. Each PS and each TIFF image renders correctly on its own and looks good.
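For anyone who wants to poke at a file like this: an EPS that carries a TIFF preview is often stored as a "DOS EPS binary" file, where a 30-byte header records the offset and length of the PostScript, WMF and TIFF parts. The sketch below assumes that is the format in play (the rant doesn't say so) and only checks whether the declared PostScript range runs into the declared TIFF range, which is one way a renderer could end up reading preview bytes as PostScript. The function names and the sample file are made up for illustration.

```python
import struct

EPS_MAGIC = b"\xc5\xd0\xd3\xc6"  # first four bytes of a DOS EPS binary file
HEADER_SIZE = 30                 # magic + six 32-bit fields + 16-bit checksum

def parse_dos_eps_header(data: bytes) -> dict:
    """Read the section table from a DOS EPS binary header (little-endian)."""
    if len(data) < HEADER_SIZE or data[:4] != EPS_MAGIC:
        raise ValueError("not a DOS EPS binary file")
    fields = struct.unpack("<6IH", data[4:HEADER_SIZE])
    names = ("ps_offset", "ps_length", "wmf_offset", "wmf_length",
             "tiff_offset", "tiff_length", "checksum")
    return dict(zip(names, fields))

def ps_overlaps_tiff(header: dict) -> bool:
    """True if the declared PostScript byte range intersects the TIFF range."""
    ps_end = header["ps_offset"] + header["ps_length"]
    tiff_end = header["tiff_offset"] + header["tiff_length"]
    return (header["tiff_length"] > 0
            and header["ps_offset"] < tiff_end
            and header["tiff_offset"] < ps_end)

def build_sample(ps_length_error: int = 0) -> bytes:
    """Hypothetical test file: header, a tiny PS body, then stand-in TIFF bytes."""
    ps = b"%!PS-Adobe-3.0 EPSF-3.0\nshowpage\n"
    tiff = b"II*\x00" + b"\x00" * 4  # placeholder bytes, not a valid TIFF
    header = EPS_MAGIC + struct.pack(
        "<6IH",
        HEADER_SIZE, len(ps) + ps_length_error,  # PostScript offset/length
        0, 0,                                    # no WMF preview
        HEADER_SIZE + len(ps), len(tiff),        # TIFF offset/length
        0xFFFF,                                  # checksum field, not verified here
    )
    return header + ps + tiff

if __name__ == "__main__":
    print(parse_dos_eps_header(build_sample()))
```

Running `ps_overlaps_tiff` on the headers of both the working and the broken file would at least show whether the two differ in their declared lengths; it says nothing about what a particular renderer actually does with them.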
Now the fun part: the PS works in combination with one of the images, but not with the other. Somehow the PS renderer tries to interpret the TIFF data, which yields nonsense, and the renderer stops altogether. But only for one file, not for the other. And it's definitely not the PS, because if I switch the preview images, the other file stops working.
-
TLDR;
Side project update.
Made a simple NLP library in Python and published its first version as open source.
Now I can feed it with parsed PDF text.
See rant https://devrant.com/rants/2192388/...
Why?
Because while reading a book about NLTK I couldn't find a simple, extensible way to support the Polish language, and I wanted to abstract stemming, word normalization, tokenization etc., so I can set e.g. different conditions for separate text files without writing much code, which is an asset when you work solo.
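To make the "abstract stemming, normalization, tokenization" idea concrete, here is a minimal sketch of what such a pluggable pipeline could look like. It is not the published library: the `Pipeline` class, the helper names and the toy suffix list are all invented for illustration, and the "stemmer" is a crude suffix stripper, not a real Polish stemmer.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Sequence

@dataclass
class Pipeline:
    """Bundle of swappable text-processing steps (hypothetical design)."""
    tokenize: Callable[[str], List[str]]
    normalize: Callable[[str], str]
    stem: Callable[[str], str]

    def run(self, text: str) -> List[str]:
        # tokenize first, then normalize and stem each token
        return [self.stem(self.normalize(tok)) for tok in self.tokenize(text)]

def simple_tokenize(text: str) -> List[str]:
    # \w is Unicode-aware for str patterns, so Polish letters are kept
    return re.findall(r"\w+", text)

def strip_suffix_stemmer(suffixes: Sequence[str]) -> Callable[[str], str]:
    """Build a toy stemmer that removes the first matching suffix."""
    def stem(word: str) -> str:
        for suffix in suffixes:
            # keep at least three characters of the word
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                return word[: -len(suffix)]
        return word
    return stem

# One configuration per language or per group of text files
polish = Pipeline(
    tokenize=simple_tokenize,
    normalize=str.lower,
    stem=strip_suffix_stemmer(("ami", "ach", "ów")),  # toy suffix list
)

if __name__ == "__main__":
    print(polish.run("Ustawa o podatkach"))
```

The point of the design is that supporting another language, or different rules for a different set of files, means building another `Pipeline` with different callables rather than touching the code that consumes it.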
It's about 12 GB of publicly accessible law data in PDF that I am trying to handle (at first), which is about 35,000 files from the last 90 years.
So far I have automated downloading web pages and the PDF documents linked from them, extracting data from the web pages and saving it to a database, and extracting text from the PDF files. I have about 5-6 projects to do all of the above; maybe in the end I will put them into some workflow manager like Luigi, or just run them from a cron job.
The first thing for website version 1.0 is to find correlations between all documents inside the law text using the NLP library by building custom conditions. Then just generate the directory structure and HTML files with links between documents.
Website version 2.0 is already in my mind, but it will be creepy to make and will take at least 1-2 months, and I want to publish fast.
I have some PDFs with only images instead of text, and Tesseract worked quite well with them, so maybe I will try to process them once everything goes live.
I learned a lot about PDF: I now know that a font in a PDF does not always provide Unicode characters (a stupid form of obfuscation), so when you extract text you need to build a map from glyph vectors to text for every font.
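A small sketch of why the map has to be per font: the same glyph ID can stand for different characters in different fonts, so decoding always needs the font name alongside the glyph IDs. The tables below are invented for illustration; in a real PDF they would come from the font's ToUnicode CMap when it exists, or would have to be reconstructed when it doesn't.

```python
from typing import Dict, Iterable

GlyphMap = Dict[int, str]

def decode_glyphs(glyph_ids: Iterable[int], font_name: str,
                  glyph_maps: Dict[str, GlyphMap], missing: str = "\ufffd") -> str:
    """Translate glyph IDs to text using the table that belongs to one font."""
    table = glyph_maps.get(font_name, {})
    # unknown glyphs become U+FFFD so gaps stay visible in the output
    return "".join(table.get(glyph_id, missing) for glyph_id in glyph_ids)

# Hypothetical per-font tables: glyph 1 means "P" in F1 but "ż" in F2
glyph_maps = {
    "F1": {1: "P", 2: "r", 3: "a", 4: "w", 5: "o"},
    "F2": {1: "ż", 2: "ó"},
}

if __name__ == "__main__":
    print(decode_glyphs([1, 2, 3, 4, 5], "F1", glyph_maps))
    print(decode_glyphs([1, 2], "F2", glyph_maps))
```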
PDF is a full vector representation, just like SVG, which is logical if you think about it a bit and know that some printers run on PostScript.
Let's hope the next update will be about the Flutter mobile app which started all of the shit above. It's almost ready (except for getting data from the API, which I am trying to do, and a logo for the release version). It's the last piece of the puzzle.
-
My work product: Or why I learned to get twitchy around Java...
I maintain a Java-based test system that tests a raster image processor. The client is a Java Swing project that contains CORBA bindings to the internal API of the raster image processor. It also has custom-written UI elements and duplicates functionality that became available in later versions of Java, but because some of the third-party tools we use don't work with later versions of Java for some reason, it's not possible to upgrade Java to gain things as simple as recursive directory deletion. Yes, the version of Java we have to use does not support something as simple as that, and custom code had to be written to support it.
Because of the requirement to build the API bindings along with the client, the whole application must be built with the raster image processor build chain, which is a heavily customised Jam build system. So an Ant task calls out to execute a Jam task, and Jam does about 90% of the heavy lifting.
In addition to the Java code there's code for interpreting PostScript files, as these can be used to alter the behaviour of the raster image processor during testing.
As if that weren't enough, there's a BeanShell interface to allow users to script the test system, but none of the users know Java well enough to feel confident writing interpreted Java scripts (and that's too close to JavaScript for my comfort). I once tried swapping this out for the Rhino JavaScript interpreter and got all the verbal support in the world, but no developer time to design an API that'd work for all the departments.
The server isn't much better though. It's a Tomcat-based application that was written by someone who had never built a Tomcat application before, or any web application for that matter. It uses raw SQL strings instead of an ORM, it doesn't use MVC in any way, and an insane amount of functionality is dumped into the JSP files.
It too interacts with a raster image processor to create difference masks of the output, running PostScript as needed. It spawns off multiple threads and can spend days processing hundreds of gigabytes of image output (depending on the size of the tests).
We're stuck on Tomcat 7 because we can't upgrade beyond Java 6, which brings all manner of security issues, but that eager little Java updater will break the tool chain if it gets its way.
Between these two components we have the Java RMI server (sometimes) working to help generate image data on the client side before all images are pulled across a UNC network path onto the server. The server processes test jobs (in PDF format) by reading into the xref table of said PDF, finding the embedded image data (for our server, consumed test files are just Flate-encoded TIFF files wrapped in just enough PDF to make them valid) and using a tool to create a difference mask of two images.
This tool is very error-prone: it can't difference images of different sizes, colour spaces, orientations or pixel depths, but it's the best we have.
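For readers who haven't met the term, a difference mask is just a per-pixel "did this change?" image, and the size restriction falls out of that directly: without identical dimensions there is no pixel-to-pixel correspondence to compare. A minimal sketch, using nested lists of pixel values as a stand-in for real image data (the actual tool's interface isn't described here, so the function below is purely illustrative):

```python
from typing import List

Image = List[List[int]]  # rows of pixel values; a stand-in for real image data

def difference_mask(a: Image, b: Image) -> Image:
    """Return a mask with 1 where the two images differ and 0 where they match."""
    same_shape = len(a) == len(b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(a, b)
    )
    if not same_shape:
        # no pixel-to-pixel correspondence, so refuse rather than guess
        raise ValueError("images must have identical dimensions")
    return [
        [1 if pixel_a != pixel_b else 0 for pixel_a, pixel_b in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]

if __name__ == "__main__":
    expected = [[0, 1], [2, 3]]
    actual = [[0, 9], [2, 3]]
    print(difference_mask(expected, actual))
```

Colour space, orientation and pixel depth mismatches are the same problem one level down: the values being compared no longer mean the same thing, so one image would have to be converted before any comparison made sense.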
The tool is installed on both the client and the server. If the client can generate images, it'll query the server for which ones it needs to, and if it can't, the server will use the tool itself.
Our shells have custom profiles for linking to all manner of third-party tools and libraries, including a link to Visual Studio 2005 (more indirectly related build dependencies). The whole profile has to ensure that absolutely no operating system pollution gets into the shell; most of our apps are installed in our home directories, and we have to ensure our paths are correct for every single application we add.
And... Fucking and!
Most of the tools are stored as source bundles in a version control system... Not Git or Mercurial, not Perforce or SVN, not even CVS... They use a custom-built version control system that is built on top of RCS. It keeps a central database of locked files (using soft and hard locks, along with write-protecting the files in the file system) to ensure users can't get merge conflicts, by preventing other users from writing to the files at all.
Branching is heavyweight: it can take the best part of a day to create a new branch and populate the history.
Gathering the tools alone to build the Dev environment to build my project takes the best part of a week.
What should be a joy come hardware refresh year becomes a curse ("Well fuck, now I lose a week setting up the Dev environment on ANOTHER machine").
Needless to say, I enjoy NOT working with Java. A lot of this isn't Java's fault, but there are a lot of things that Java (specifically the Java 6 version we're stuck on) does not make easy.
This is why I prefer to build my web apps in Python or Node, hell, I'd even take Lua... Just... Compiling web pages into executable Java classes, why? I mean I understand the implementation of how this happens, but why did my predecessor have to choose this? Why?
-
Just bought a Chromebook Pixel. Love the hardware - Chromebooks in general are a great way to get a Linux laptop with guaranteed driver support.
But why is it still so hard to get decent HiDPI support in Linux (or for that matter Windows) desktop environments?
I realise Apple had an advantage in using vector-based Display PostScript, but massively divergent screen sizes and resolutions have been around for YEARS now, so why is it still such a faff?