Search - "nlp python"
-
I am traveling 550 km (9 hours) just to give my first interview, for the position of Jr. Natural Language Processing Engineer.
Wish me luck...
-
What you are expected to learn in 3 years:
power electronics,
analogue signal,
digital signal processing,
VHDL development,
VLSI development,
antenna design,
optical communication,
networking,
digital storage,
electromagnetic,
ARM ISA,
x86 ISA,
signal and control system,
robotics,
computer vision,
NLP, data algorithm,
Java, C++, Python,
javascript frameworks,
ASP.NET web development,
cloud computing,
computer security,
Information coding,
ethical hacking,
statistics,
machine learning,
data mining,
data analysis,
cloud computing,
Matlab,
Android app development,
iOS app development,
Computer architecture,
Computer network,
discrete structure,
3D game development,
operating system,
introduction to DevOps,
how-to -fix- computer,
system administration,
Project of being entrepreneur,
and 24 random unrelated subjects of your choices
This is a major called "computer engineering".
-
This February, I posted a !rant here ( https://devrant.com/rants/1999689/... ) about getting an NLP internship with the help of the community.
Over the past few months I have made progress, and I now have a job offer from a small organisation (StrataVAR) as their Python dev.
I received the offer letter today. Since I am in the third year of my degree, they want me to work in parallel with my university classes. They pay way above the Indian freshers' average, and they have put me in a team that works on things I like.
It would not have been this way without the help and support of the communities I'm a part of, such as DevRant and StackOverflow (obviously). I just wanted to thank all who cared and helped. It means a lot.
-
1. Study C and Python
2. Learn NLP and ML
3. Participate again in one hackathon and kick their ass after winning it
4. Get one awesome internship
5. Master algorithms and DS
-
! wk95
My project back in university, where I used bash, NLP and Python to create a utility that would execute sentences written in English, such as typing "change my wallpaper to abc.jpg".
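A utility of this kind can be sketched in a few lines. This is only an illustration, not the original project: it swaps the tokenizer and parser for plain regular-expression rules, and the rule table and the `set-wallpaper` command name are hypothetical placeholders. It builds the command but does not run it.

```python
import re
import shlex

# Hypothetical rule table: each pattern maps an English sentence to an argv list.
# "set-wallpaper" is a placeholder name, not a real utility.
RULES = [
    (re.compile(r"^change my wallpaper to (?P<path>\S+)$", re.IGNORECASE),
     lambda m: ["set-wallpaper", m.group("path")]),
    (re.compile(r"^list (?:the )?files in (?P<path>\S+)$", re.IGNORECASE),
     lambda m: ["ls", m.group("path")]),
]


def sentence_to_command(sentence):
    """Return the argv list for the first matching rule, or None if nothing matches."""
    text = sentence.strip().rstrip(".")  # tolerate a trailing full stop
    for pattern, build in RULES:
        match = pattern.match(text)
        if match:
            return build(match)
    return None


def to_shell(argv):
    """Render an argv list as a safely quoted shell command line."""
    return " ".join(shlex.quote(arg) for arg in argv)


command = sentence_to_command("change my wallpaper to abc.jpg")
if command is not None:
    print(to_shell(command))
```

Keeping the result as an argv list rather than a string avoids quoting problems if it is later handed to `subprocess.run`.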
Even though the tokenizer took almost five minutes to tokenize a sentence (longer than five words), and the parser took even longer, I still love it, for it was my first dive into ML!
-
After two years of learning and programming in Python and getting comfortable with the language's ins and outs, the time has come to learn my second language. I selected C++, and I am so glad I waited until I understood my first language before jumping into a new one, because it was worth the time. Before, C++ looked intimidating, but now I see its beauty (a reasonably strongly typed language). It took me a few hours to understand the basics, and I ended the day by making a simple Python 3 adapter in C++.
Side notes:
Maybe it's because I am a noob, but I don't see why Rust is preferred over C++?
I only plan to use C++ to speed up heavy preprocessing tasks within Python projects, but I was surprised to find no NLP libraries for it?
-
TLDR;
Side project update.
Made a simple NLP library in Python and published its first version as open source.
Now I can feed it parsed PDF text.
See rant https://devrant.com/rants/2192388/...
Why?
Because while reading a book about NLTK I couldn't find a simple, extensible way to support the Polish language, and I wanted to abstract stemming, word normalization, tokenization, etc., so that I can, for example, apply different conditions to separate text files without writing much code, which is an asset when you work solo.
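The kind of abstraction described above could look roughly like the sketch below. It is not the published library: the `Pipeline` class and all function names are made up for illustration, and the suffix stripper is a toy, not a real Polish stemmer.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Pipeline:
    """Hypothetical pipeline: every stage is a plain function that can be swapped."""
    tokenize: Callable[[str], List[str]]
    normalize: Callable[[str], str]
    stem: Callable[[str], str]

    def run(self, text: str) -> List[str]:
        result = []
        for token in self.tokenize(text):
            normalized = self.normalize(token)
            if normalized:  # drop tokens that were only punctuation
                result.append(self.stem(normalized))
        return result


def whitespace_tokenize(text: str) -> List[str]:
    return text.split()


def lowercase_strip_punct(token: str) -> str:
    return token.lower().strip(".,;:!?()\"'")


def toy_polish_stem(token: str) -> str:
    # Toy suffix stripping for illustration only; not linguistically correct.
    for suffix in ("ami", "ach", "ów", "om", "y", "a", "e"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


polish = Pipeline(whitespace_tokenize, lowercase_strip_punct, toy_polish_stem)
print(polish.run("Ustawy, ustawami !"))
```

A different set of text files can then get its own `Pipeline` instance with other functions plugged in, without touching the code that calls `run`.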
It's about 12 GB of publicly accessible law data in PDF form that I am trying to handle (at first), which is about 35,000 files from the last 90 years.
So far I have automated downloading web pages and the PDF documents linked from them, extracting data from the web pages and saving it to a database, and extracting text from the PDF files. I have about 5-6 projects that do all of the above; maybe in the end I will put them into a workflow manager like Luigi, or just run them from a cron job.
The first thing for website version 1.0 is to find correlations between all documents inside the law text, using the NLP library with custom conditions. Then just generate the directory structure and HTML files with links between documents.
Website version 2.0 is already in my mind, but it will be creepy to make and will take at least 1-2 months, and I want to publish fast.
I have some PDFs with only images instead of text, and Tesseract worked quite well on them, so maybe I will try to process them once everything goes live.
I learned a lot about PDF: I now know that a font in a PDF does not always provide Unicode characters (a stupid form of obfuscation), so when you extract text you need to build a glyph-to-text map for every font.
PDF is a full vector representation, just like SVG, which is logical if you think about it a bit and know that some printers run on PostScript.
Let's hope the next update will be about the Flutter mobile app that started all of the shit above. It's almost ready (except for getting data from the API I am working on and a logo for the release version). It's the last piece of the puzzle.
-
So I am considering side games to add to my main games. Mini games, I guess they are called. I thought it might be fun to have random chessboards in the game that you can actually play. I wanted to have a decent chess engine behind the game. Off the bat I found a GPL one. I think it is designed to be communicated with externally. So what does that mean for using it in my game? If I communicate with an external process, is this violating the GPL? I have no intention of making my game open source. Well, it seems this use case is very nuanced:
https://opensource.stackexchange.com/...
The consensus in a lot of these discussions is that it comes down to the scope of the use of the program. Are you bundling for convenience or bundling for intrinsic utility? This is fascinating, because using a compiler on a Windows platform could possibly be a violation. That is a proprietary program calling a GPL one. This is actually handled in the GPL as far as I know. So, if I use a GPL engine as a mini game, is that the same as a full-blown chess game? What if I support 10 different engines in a full-blown chess game?
Now to play devil's advocate even further. Are proprietary phone apps that communicate with GPL software that serves data intrinsically linked? The app will not function without the server or the OS the server runs on. A lot of web tech is GPL or includes a large number of GPL programs. Should the web code be under the GPL? Should the phone app be under the GPL? This sounds ridiculous to some degree. But is that the same as bundling a GPL app and communicating with it from the program via the network or the command line? The phone app depends upon this software.
Now, to protect myself I will find a decent chess engine that is either LGPL or something more permissive. I just don't want the hassle. I might make the chess engine a parameter, though, in case someone wants to add a better engine. At that point it is the user adding it. Maybe the fact that it would not be the only game in town is a factor as well.
I am also considering bundling Python as a whole to get access to better AI tools (Python is pretty small compared to game assets). It seems everything is Python when it comes to AI. The licensing there is much better, though. I would love to play with NLP for commanding NPCs.
I am not discussing linking at all, btw.
-
I feel it's a great time to be a developer; we have so many toys to play with:
machine learning, scientific Python, Node.js, frontend JS frameworks, NoSQL, NLP, Elasticsearch, MongoDB, open source .NET, big data with Java, Arduino..., VR, 3D printing
What toys are you playing with?
-
I'm currently a Java developer. I've dabbled in Python too. I have mostly worked on API development and some data processing. I want to learn something new that'll keep me engaged. It can be something within Java (like image processing or NLP) or some other language (Go, Scala, JS). What do you all suggest?
-
Hello guys, started learning NLP a week ago. With the book
Do you have any ideas for mini projects? (Something simpler than a chatbot)
-
It was my first time doing an NLP task and implementing an RNN. I was using the torchtext library to load the IMDB dataset and do sentiment analysis on it. I was able to use collate_fn and batch_sampler to create a DataLoader, but it gets exhausted after a single epoch. I'm not sure if this is the expected behavior. If it is, do I need to initialize a new DataLoader for every epoch? If not, and something is wrong with my implementation, please show me the correct way to implement it.
PS. I was following the official changelog of torchtext on GitHub.
You can find my implementation here
changelog - https://github.com/pytorch/text/...
My implementation - https://colab.research.google.com/d...
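One possible cause, sketched here in plain Python without torch: if the object passed as batch_sampler is a generator (the result of calling a function that uses `yield`), it can only be iterated once, so the second epoch sees nothing. This assumes the notebook builds its sampler that way. The names `batch_sampler_generator` and `BucketBatchSampler` are made up for this illustration. Wrapping the logic in a class whose `__iter__` returns a fresh generator makes it reusable across epochs.

```python
def batch_sampler_generator(lengths, batch_size):
    """Yield batches of indices, grouped by sequence length. One-shot when called directly."""
    indices = sorted(range(len(lengths)), key=lambda i: lengths[i])
    for start in range(0, len(indices), batch_size):
        yield indices[start:start + batch_size]


class BucketBatchSampler:
    """Re-iterable wrapper: every call to __iter__ starts a new generator."""

    def __init__(self, lengths, batch_size):
        self.lengths = lengths
        self.batch_size = batch_size

    def __iter__(self):
        return batch_sampler_generator(self.lengths, self.batch_size)

    def __len__(self):
        # Number of batches, rounding up for the last partial batch.
        return (len(self.lengths) + self.batch_size - 1) // self.batch_size


lengths = [5, 2, 9, 4, 7]  # stand-in for tokenized review lengths

one_shot = batch_sampler_generator(lengths, batch_size=2)
first_epoch = list(one_shot)
second_epoch = list(one_shot)  # the generator is already exhausted here

reusable = BucketBatchSampler(lengths, batch_size=2)
epochs = [list(reusable) for _ in range(2)]

print(first_epoch, second_epoch)
print(epochs)
```

If this is indeed the cause, passing an instance of such a class as batch_sampler should let the same DataLoader be iterated once per epoch, with no need to create a new one each time.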