5
JsonBoa
2d

One of my minions (erm, I mean, "a valued junior member of my team") asked to be assigned to more "data science related" tasks.
Regardless of how very last-decade that request sounds, I tried to explain to the Jr that there is more to "data science" than distilling custom LLMs and downloading PyTorch models. There are several entire fields of study in there. And those are all sciences. In this context, science equals math.
But they said they were not scared of math.
I've seen them using their phones to calculate freaking tips. If you can't do 15% of a lunch bill in your head, hypothesis tests might be a bit more than challenging.
But, ok then. Here we go.

So I had them do some semi-supervised clustering. On a database as raw as dirt, but barely 5 GB in size, with few dimensions, and covering subjects with easily available experts.
Even better, we had hundreds of manually classified training and test cases.
The Jr came back a month later with some convoluted mess of convoluted networks; just the serialized weights of the poor thing were about as large as the database itself.
And when I tried it on some other manually classified test datasets... Freaking 41% error rate, for something that should be a slam dunk. Little better than a coin toss.
One month of their time wasted on an overfitted, unusable mess.

I had to reassign the task to someone else, more experienced, last Friday. By Monday they had come up with an iterative KNN approach reporting error rates for several values of K... some of them with less than 15% error on the test dataset.
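
For anyone wondering what "error rates for several values of K" looks like in practice, here is a minimal sketch. It is not the code from my team. It uses only the Python standard library, and the toy data (`make_blobs`, two made-up 2-D clusters labelled "a" and "b") and the helper names `knn_predict` and `error_rate` are assumptions for illustration only.

```python
import math
import random
from collections import Counter


def knn_predict(train, query, k):
    """Majority vote among the k labelled training points closest to `query`."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def error_rate(train, test, k):
    """Fraction of test points whose predicted label differs from the manual one."""
    wrong = sum(1 for point, label in test if knn_predict(train, point, k) != label)
    return wrong / len(test)


def make_blobs(rng, n_per_class):
    """Hypothetical toy data: two well-separated 2-D Gaussian blobs."""
    centers = {"a": (0.0, 0.0), "b": (6.0, 6.0)}
    return [((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label)
            for label, (cx, cy) in centers.items()
            for _ in range(n_per_class)]


rng = random.Random(0)        # fixed seed so the run is repeatable
train = make_blobs(rng, 50)   # stands in for the manually classified training cases
test = make_blobs(rng, 20)    # stands in for the manually classified test cases

# Sweep a few odd values of K and record the test error for each one.
errors = {k: error_rate(train, test, k) for k in (1, 3, 5, 7, 9)}
best_k = min(errors, key=errors.get)

for k, err in errors.items():
    print(f"k={k}: {err:.1%} test error")
print(f"lowest error at k={best_k}")
```

One caveat with this kind of sweep: picking K by its score on the test set makes that score optimistic, so a separate validation split (or cross-validation) for choosing K is the cleaner setup.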

WTF are schools teaching and calling "data science" nowadays?!?!?
I reeeeally need to watch those juniors more closely. Maybe ask for mid-sprint demos. But those are soooo boring and waste so much time for people who know what they are doing...
Does anyone have a better idea to prevent this type of off-track deviation? Without being a total bore, that is.

And... should I start asking people "gotcha" data analysis questions before giving them free rein on this type of task? Or is it an asshole boss move? I would hate someone giving me a pop quiz before letting me work... But I've got no other ideas.

Comments
  • 2
    Meh, anyway, I still think they've learned a lot. I do know we're working in commercial companies and learning is not the main product. I mean, you're allowed to learn and fail a little bit, but not too hard :P It's called investing in employees :P

    I dunno, letting a junior burn out through assuming they can do it is kinda normal training for them, right? It's very educational, better than explaining to them why something is not easy or not the way to go.

    For me it was always an interesting question: how much should a model trainer know about the content they feed it? I mean, if it is something advanced and medical, should they understand it all? I mean, there are validation sets and stuff, so maybe not? No idea. Please tell me :P

    I'll shut my mouth for now, you're obviously working on a way higher level than I do in general.
  • 2
    I'd just throw them into whatever they want and overload them with information

    if you doubt people it's possible you'll sabotage them because you don't believe in them

    seems kind of mean not to make it their choice. come on. everybody deserves a choice 😝

    but I would keep overloading them with information and increasing my expectations of them. if they can't keep up, either they'll realize it's not for them or they'll actually impress you by going through a trial by fire (which generally means they'll be more creative in the field because they'll think differently from everyone else already in the field). win / win
  • 2
    @retoor about your question, whether they should know a lot about the data... I have only the most cowardly yet the most realistic response: "it depends on the problem, the data itself and the technique used in analysis".
    In this case, the first two solve to "yes, they should know as much as possible about the data".

    That being said, your "employee development" argument might be combined with @jestdotty 's trial-by-fire approach... if the presentation of potentially disgusting results is made in such a way as not to make the juniors feel too embarrassed. But maybe it would indeed be good for them to get some calluses in order to thicken the skin.
    I just need something to avoid them losing too much time on a doomed approach...
    Maybe I should indeed ask for a mid-sprint 1:1 demo and suck up the boredom. But ask in private, and only from the people I cannot trust blindly... yet.