Search - "classifier"
-
"Big data" and "machine learning" are such big buzz words. Employers be like "we want this! Can you use this?" but they give you shitty, ancient PC's and messy MESSY data. Oh? You want to know why it's taken me five weeks to clean data and run ML algorithms? Have you seen how bad your data is? Are you aware of the lack of standardisation? DO YOU KNOW HOW MANY PEOPLE HAVE MISSPELLED "information"?!!! I DIDN'T EVEN KNOW THERE WERE MORE THAN 15 WAYS OF MISSPELLING IT!!! I HAD TO MAKE MY OWN GODDAMN DICTIONARY!!! YOU EVER FELT THE PAIN OF TRAINING A CLASSIFIER FOR 4 DAYS STRAIGHT THEN YOUR GODDAMN DEVICE CRASHES LOSING ALL YOUR TRAINED MODELS?!!
*cries*
-
Having to use your home PC because your office PC takes a millennium to train a classifier. Why not just work from home?
-
Random af project idea that will see me burned alive by the internet (because if I do it I intend to put it in dev.to which is full of "that offends me" people):
Generate a classifier that will scan text from different websites and categorize where the person might be from.
Example: "plz send bob and vagene" <--- we all know
"mami que ricas nalgas" <--- Mexican for the most part.
"there, their, they're and similar text" <--- my fellow Americans for the most part....
"cyka blyat" <--- 0.o we know
"pompous statement about the way Americans do shit" <--- European, meaning, from Yurop.
"angry as fuck rant/banter" <-- German
"lol whatever Trump is the best president ever" <--- some moron from the south of the U.S (south much like myself but I am not a Trump supporter nor a republican)
etc etc.
What makes this complex is that I would have to put together my own dataset, since it's highly likely that nothing like that already exists for me to use.
Can you imagine the chaos?
-
!rant
Thanks, Google, for giving me the opportunity to work with neural networks without being an expert in them.
http://automl.github.io/auto-sklear...
To sum it up:
1. Preprocess data
2. Use Automl to train classifier
3. ????
4. Profit
-
That feeling when your first classifier on a real life problem exceeds the 97% majority class classifier accuracy.
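For context, a majority-class baseline just predicts the most common label for every sample, so on a 97/3 class split it already scores 97% without learning anything. A minimal sketch in plain Python; the labels here are made up for illustration and are not the data from this post:

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


# Hypothetical label set: 97 samples of class 0, 3 samples of class 1.
labels = [0] * 97 + [1] * 3
baseline = majority_baseline_accuracy(labels)
print(baseline)
```

Any real classifier has to beat that number before its accuracy means much.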
I'm doing something right!
-
Am I more active because I am feeling good, or is it because it's too hot here? 🤔
Anywho, giant NN classifier, come at me!
PS. I think it's gonna be a major failure, but YOLO. 😜
-
Used pip install to set up TensorFlow with Python 3.5 on a Windows 10 machine, but there were no models included. Had to download them separately and add them to TensorFlow. Then tried using both Inception and classify_image.py, but it gives a name error saying it cannot find core. When tested in Python IDLE, though, there was no error. I don't want to build my own custom classifier, just retrain the model. Any solutions, people?
-
Black box. It does seem to put messages with a URL in a certain category, though that's not always correct either. It's trained on 3000 normal dR messages and 3000 spam dR messages, 6000 dR messages in total. Many epochs, but not good enough for use yet. The idea that the system could classify without discriminating against new users is off the table. That discrimination is needed as a safety margin. The original spam system is a bit simple, but it doesn't produce false positives and works great. Still, I want to make something advanced out of it for the sake of education. Tomorrow I'll have my neural networks book. In about two weeks I'll probably have some good insights into how to improve all this. New hobby :)
(Pretrained 3B models are fine for recognizing spam, btw, but they cost resources: 8 CPUs at 100%. A self-trained model trained purely on spam doesn't, and it's fast. With a pretrained model you can't do mass classification.)
-
It's not a real dev regret but it's related to it: Not being able to fix a price or a value for my skills.
It's a real regret.
Just coming out of college, I tried my hand at freelancing and found it really hard to fix a value for the work I was offered, because I found it weird to put a monetary value on something that I've done for free my entire life (at school and uni, I mean).
To make it worse my first experience was with a grad student who wanted me to complete her project.
Now being from India, I know that we have a stereotype of doing work for a lower price.
But this girl took the cake.
She wanted me to create a custom image classifier using TensorFlow.
It had to train on live images and then detect those images in a live video feed.
It's quite simple, but training the base network (which would be used just to detect features) would still take a decent amount of time and effort.
Not using pre-trained models was also a requirement of hers.
After hearing all her requirements I asked her what price she was willing to pay.
She said a $50 lump sum.
Being really confused as to what to say to that I just stopped replying.
To this day I have no clue what would be a reasonable price to quote a client like that.
After that I just continued dealing with people I knew personally and am currently doing that as an internship. But entering the proper freelancing system again has become a kinda weird thing in my head now, since I have no clue as to what price to put on my skills.
Is there any advice that any of the more experienced people would give?
Also consider the fact that I'm relatively fresh out of college and have no corporate experience.
Even if you've read my rant and have no advice it's okay. I guess this is a path of self realization after all.
-
Didn't really know how to categorize, bit of a question/discussion/curiosity, so I put it here.🤷
Just today I read an article about the Netherlands, where the police will use an "AI surveillance camera" (yey buzzwords incoming 🙄, but it would actually make sense(?) 🤷) to detect and punish drivers holding a smartphone. Pictures without a smartphone are to be deleted. How would this system work without keeping non-smartphone pictures? It needs to train a classifier, doesn't it? (To be clear, the system only reports those images to an officer for further analysis and action.)
I mean, let's assume the images are somehow pre-processed, then some convolution(s) for feature extraction, then maybe some more intermediate steps, and at the end the results are fed to a classifier. How would that classifier work? Would a probability between 0 and 1 suffice? And if so, report those at 0.5 and above? Or are there better techniques?
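To make the thresholding part of the question concrete, here is a minimal sketch in plain Python. It assumes the classifier has already produced one probability per image; the scores, the function name `flag_for_review`, and the 0.5 default are illustrative assumptions, not details of the actual Dutch system:

```python
def flag_for_review(probabilities, threshold=0.5):
    """Return the indexes of images whose score meets the threshold."""
    return [i for i, p in enumerate(probabilities) if p >= threshold]


# Hypothetical "driver is holding a phone" scores for five images.
scores = [0.02, 0.51, 0.97, 0.49, 0.80]

flagged_default = flag_for_review(scores)        # threshold 0.5
flagged_strict = flag_for_review(scores, 0.9)    # fewer, more certain hits
print(flagged_default, flagged_strict)
```

Since a human officer reviews whatever gets flagged, the threshold does not have to be 0.5: lowering it misses fewer phone users at the cost of more images to review, and raising it does the opposite.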
-
Here's the initial upgraded number fingerprinter I talked about in the past, with some results and an explanation below.
Note that these are wide black images on ibb, so they appear as a tall thin strip near the top of the page, as if they're part of the website. They practically blend in. Right click the black strip, hit 'view image', and then zoom in.
https://ibb.co/26JmZXB
https://ibb.co/LpJpggq
https://ibb.co/Jt2Hsgt
https://ibb.co/hcxrFfV
https://ibb.co/BKZNzng
https://ibb.co/L6BtXZ4
https://ibb.co/yVHZNq4
https://ibb.co/tQXS8Hr
https://paste.ofcode.org/an4LcpkaKr...
Hastebin wouldn't save for some reason so paste.ofcode.org it is.
Not much to look at, but I was thinking I'd maybe mark the columns where gaps occur and do some statistical tests like finding the stds of the gaps, density, etc. The type test I wrote categorizes products into 11 different types, based on the values of a subset of variables taken from a vector of a couple hundred variables, but I didn't want to include all that mess of code. And I was thinking of maybe running this fingerprinter on a per-type basis, set to repeat, and looking for matching indexes (pixels) to see what products have in common per type.
Or maybe using them to train a classifier of some sort.
Each fingerprint of a product shares something like 16-20% of indexes with its factors, so I'm thinking that's an avenue to explore.
What the fingerprinter does is better explained by the subfunction findAb.
The code contains a comment explaining this, but basically the function destructures a number into a series of divisions and subtractions, and makes a note of how many divisions occur in each 'run'.
Typically this is for numbers divisible by 2.
So a number like 35 might look like this, when done
p = 35
((((p-1)/2)-1)/2/2/2/2)-1
And we'd represent that as
ab(w, x, y, z)
Where w is the starting value 35 in this case,
x is the number to divide by at each step, y is the adjustment (how much to subtract by when we encounter a number not divisible by x), and z is a string or vector of our results
which looks something like
ab(35, 2, 1, [1, 4])
Why [1, 4]?
Because we were only able to divide by 2 once before having to subtract 1 and repeat the process. And then we had a run of 4 divisions.
And for the fingerprinter, we do this for each prime under our number p, the list returned becoming another row in our fingerprint. And then that gets converted into an image.
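A rough sketch of that procedure, written from the description here rather than copied from the pasted code (the paste link above is truncated). The names `find_ab`, `primes_below`, and `fingerprint` are stand-ins, and the image conversion step is left out:

```python
def find_ab(w, x=2, y=1):
    """Break w down by dividing by x when possible, else subtracting y.

    Returns the lengths of the consecutive runs of divisions.
    """
    if x < 2 or y < 1:
        raise ValueError("x must be >= 2 and y must be >= 1")
    runs, run, n = [], 0, w
    while n > 0:
        if n % x == 0:
            n //= x          # still inside a run of divisions
            run += 1
        else:
            if run:          # a run just ended, record its length
                runs.append(run)
                run = 0
            n -= y           # adjust and try dividing again
    if run:
        runs.append(run)
    return runs


def primes_below(limit):
    """All primes smaller than limit, by simple trial division."""
    return [q for q in range(2, limit)
            if all(q % d for d in range(2, int(q ** 0.5) + 1))]


def fingerprint(p):
    """One row of run lengths per prime below p (rows vary in length)."""
    return [find_ab(p, q) for q in primes_below(p)]


rows = fingerprint(35)
print(find_ab(35))   # the worked example: divide by 2, subtract 1
print(rows[:2])      # rows for the primes 2 and 3
```

With w = 35, x = 2, y = 1 this walks 35, 34, 17, 16, 8, 4, 2, 1, 0, which gives the [1, 4] from the example.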
And again, what I find interesting is that
unknown factors of products appear to share many of these same indexes.
What I might do is for, each individual run of Ab, I might have some sort of indicator for when *another* factor is present in the current factor list for each index. So I might ask, at the given step, is the current result (derived from p), divisible by 2 *and* say, 3? If so, mark it.
And then when I run this through the fingerprinter itself, all those pixels might get marked by a different color, say, make them blue, or vary their intensity based on the number of factors present, I don't know. Whatever helps the untrained eye to pick up on leads, clues, and patterns.
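One possible reading of that marking idea, as a sketch only: at every division step, record whether the current value is also divisible by a second number. The function name `mark_shared_steps` and the choice of 3 as the second factor are assumptions for illustration:

```python
def mark_shared_steps(w, x=2, other=3, y=1):
    """For each division-by-x step, note if the value also divides by other."""
    if x < 2 or y < 1:
        raise ValueError("x must be >= 2 and y must be >= 1")
    marks, n = [], w
    while n > 0:
        if n % x == 0:
            marks.append(n % other == 0)   # True means "mark this pixel"
            n //= x
        else:
            n -= y
    return marks


marks_37 = mark_shared_steps(37)
marks_35 = mark_shared_steps(35)
print(marks_37)
print(marks_35)
```

Starting from 37, the values 36 and 18 are divisible by both 2 and 3, so the first two steps get marked; starting from 35, none of the division steps do.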
If it doesn't make sense, take another look at the example:
((((p-1)/2)-1)/2/2/2/2)-1
This is semi-unique to each product. After the fact, you can remove the variable itself and keep just the structure in question, replacing the first variable with some other number, and you get to see what pops out the other side.
If it helps, you can think of the structure surrounding our variable p as the 'electron shell', the '-1's as bandgaps, and the runs of '2's as orbitals, with the variable at the center acting as the 'nucleus', with the factors of that nucleus acting as the protons and neutrons, or nougaty center lol.
Anyway, I just wanted to share today's flavor of insanity on the off chance someone might enjoy reading it.
-
FUCKING. HAAR.
WHY CAN'T YOU FUNCTION PROPERLY EVEN AFTER I SPENT HOURS TRAINING YOU???!!
DO YOU REALLY WANT ME TO ABANDON YOU, CASCADE CLASSIFIER?
You were like a brother to me. Now look at what you've done.
-
Working a week on an LSTM-based text classifier, getting 89% accuracy, only to then get a better result with Logistic Regression, which was supposed to serve as the baseline, lol. Background: 180+ classes from the Google product categorization taxonomy, 20 million rows of data items (short texts). Had a similar experience once on sentiment classification, where SVMlight outperformed NN models.