devRant - A fun community for developers to connect over code, tech & life as a programmer

Search - "text to speech"

42

Skepter

539

9y

My friend left their MacBook unlocked, so we pasted the entire story of Moby Dick into the text to speech and left it in the background on full volume. Never seen such a confused face in my life.

undefined wk37
37

xecute

373

9y

A speech-to-text IDE for programming when you are feeling lazy.

undefined wk31

6
31

ctnqhk

1109

4y

I was on vacation when my employer’s new fiscal year started. My manager let me take vacation because it’s not like anything critical was going to happen. Well, joke was on us because we didn’t foresee the stupidity of others…

I had to update a few product codes in the website’s web config and deploy those changes. I was only going to be logged in for 30 minutes to complete that.

I get messaged by one of our database admins. He was doing testing and was unable to complete a payment on the website. That was strange. There was a change pushed by our offsite dev agency, but that was all frontend changes (just updating text) and wouldn’t affect payments.

We don’t want to enlist the dev agency for debugging work, especially when it’s not likely that it’s a code issue. But I was on vacation and I couldn’t stay online past the time I had budgeted for. So my employer enlists the dev agency for help. It’s going to be costly because the agency is in Lithuania, it was past their business hours, and it was emergency support.

Dev agency looks at error logs. There are Apple Pay errors, but that doesn’t explain why non Apple Pay transactions aren’t going through. They roll back my deployment and theirs, but no change. They tell my employer to contact our payment processor.

My manager and the Product Manager contact Payroll, who is the stakeholder for our payment gateways. Payroll contacts our payment gateway and finds out a service called Decision Manager was recently configured for our account. Decision Manager was declining all payments. Payroll was not the person who had Decision Manager installed and our account using this service was news to her.

Payroll works with our payment processor to get payments working again. The damage is pretty severe. Online payments were down for at least 12 hours. Our call center had logged reports from customers the night before.

At our post mortem, we had to find out who ok’d Decision Manager without telling anyone. Luckily, it was quick work. The first stakeholder up was for the Fundraising Dept. She said it wasn’t her or anyone on her team. Our VP of Analytics broke it to her that our payment processor gave us the name of the person who ok’d Decision Manager and it was someone on the Fundraising team. Fundraising then starts backtracking and says that oh yes she knew about it but transactions were still working after the Decision Manager had been configured. WTAF.

Everyone is dumbfounded by this. How could you make a big change to our payment processor and not tell anyone? How did our payment processor allow you to make this change when you’re not the account admin (you’re just a user)?

Our company head had to give an awkward speech about communication and how it’s important. The web team can’t figure out issues if you don’t tell us what you did. The company head was pissed because it was a shitty way to start off the new fiscal year. Our bill for the dev agency must have been over $1000 for debugging work that wasn’t helpful.

Amazingly, no one was fired.

rant amazingly no one was fired for this fuck up wk334

4
24

qbalsdon

2720

9y

The secretaries at my university had to scan documents in the masters students lab. So one day, when the lab was empty save one of our secretaries, I remote into my machine and write a text to speech app and have the computer announce "Hello Selma, you really know how to push my buttons"

It took a while, but we are friends again :D

undefined wk37
23

-ANGRY-STUDENT-

13414

7y

That moment when you work the whole day to write a discord bot from scratch. No discord.py and other wrappers. Pure websockets, oauth2, https, json loads here and there. Understanding how the discord API works was a real challenge, but I did it :).
Most of my time was spent on discord's gateway connection and identification system.

The bot can renew its token, get all the guilds it is part of, all the channels and users of these guilds, send messages, and communicate with the gateway.
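For readers curious what "communicating with the gateway" involves, here is a minimal sketch of the JSON payloads, not the poster's actual code. It assumes the opcodes from Discord's public gateway documentation (1 = Heartbeat, 2 = Identify, 10 = Hello). The `intents` field belongs to newer gateway versions and may not have existed when this was written, and the `scratch-bot` name is made up for illustration. Sending the strings over a websocket is left out.

```python
import json

# Gateway opcodes as listed in Discord's gateway documentation.
OP_HEARTBEAT = 1
OP_IDENTIFY = 2
OP_HELLO = 10


def identify_payload(token, intents=0):
    """Build the Identify message sent once after connecting."""
    return json.dumps({
        "op": OP_IDENTIFY,
        "d": {
            "token": token,
            "intents": intents,  # assumption: required on newer gateway versions only
            "properties": {
                "os": "linux",
                "browser": "scratch-bot",  # hypothetical library name
                "device": "scratch-bot",
            },
        },
    })


def heartbeat_payload(last_sequence):
    """Build a Heartbeat; last_sequence is None until an event was received."""
    return json.dumps({"op": OP_HEARTBEAT, "d": last_sequence})


def heartbeat_interval_seconds(hello_message):
    """Read the heartbeat interval (sent in milliseconds) from a Hello event."""
    event = json.loads(hello_message)
    if event["op"] != OP_HELLO:
        raise ValueError("expected a Hello event")
    return event["d"]["heartbeat_interval"] / 1000
```

The bot would read the Hello event first, start a timer with the returned interval, and then send the Identify payload.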

Tomorrow I will start connecting it to a voice channel and let it "speak". Thinking of combining text-to-speech with it, but I am not sure how well they are going to harmonize together.

random scratch bot discord

5
21

Jappe

2901

9y

!rant

Coding is like having superpowers.
For instance: for school I have to read 8 books, and I have limited time and motivation. What did I do? I wrote a program that extracts the text from a PDF or EPUB and converts it to spoken text with gTTS (Google Text-to-Speech).

Now all I have to do is listen to the story and relax..
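A rough sketch of what the text-preparation half of such a program might look like; this is an illustration, not the poster's code. The function names and the 500-character chunk size are assumptions. The gTTS call is shown only in a comment because it needs the third-party `gtts` package and network access.

```python
import re


def clean_extracted_text(raw):
    """Tidy text pulled out of a PDF/EPUB before it is read aloud."""
    # Re-join words hyphenated across a line break: "exam-\nple" -> "example".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Collapse all remaining whitespace (including newlines) to single spaces.
    return re.sub(r"\s+", " ", text).strip()


def split_into_chunks(text, max_chars=500):
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept as its own chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text)
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


# Hypothetical usage with gTTS (not imported here):
#   from gtts import gTTS
#   for i, chunk in enumerate(split_into_chunks(clean_extracted_text(raw))):
#       gTTS(chunk, lang="en").save(f"part_{i:03}.mp3")
```

Splitting on sentence boundaries keeps each audio file from cutting off mid-sentence.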

undefined google translate python reading literature is kinda boring

5
21

CozyPlanes

25416

8y

If you don't know how to explain about your software, but you want to be featured in Forbes (or other shitty sites) as quickly as possible, copy this:

I am proud that this software used high-tech technology and algorithms such as blockchain, AI (artificial intelligence), ANN (Artificial Neural Network), ML (machine learning), GAN (Generative Adversarial Network), CNN (Convolutional Neural Network), RNN (Recurrent Neural Network), DNN (Deep Neural Network), TA (text analysis), Adversarial Training, Sentiment Analysis, Entity Analysis, Syntactic Analysis, Entity Sentiment Analysis, Factor Analysis, SSML (Speech Synthesis Markup Language), SMT (Statistical Machine Translation), RBMT (Rule Based Machine Translation), Knowledge Discovery System, Decision Support System, Computational Intelligence, Fuzzy Logic, GA (Genetic Algorithm), EA (Evolutionary Algorithm), and CNTK (Computational Network Toolkit).

🤣 🤣 🤣 🤣 🤣

rant gan ta rnn smt ml rofl ai dnn cnn ann high-tech

3
18

EmberQuill

3439

5y

I've actually had mostly good instructors for CS. Or at least mediocre. The worst teacher I had was actually my Algebra II teacher in high school. She taught by reading, word for word, from our textbook. She would copy the example problems from that chapter onto the whiteboard. And then give us the rest of class to work on homework. She was basically a Text-to-Speech program for our textbook.

We all joked that she was drunk and the one locked cabinet in her classroom contained liquor. A year after I had her class she was fired. For drinking on the job. The joke turned out to be 100% true and they actually did find alcohol in the locked cabinet.

rant wk253
16

donuts

23240

5y

Short: Still supported after 10 years

First Chromebook sent to testers.

Long: Lost my voice recently, so I need to use text to speech on my phone, but too many typos, so slow...

If only I had a keyboard... Wait a laptop... Hm... Chromebook...

Looks for cheapest on Amazon

But go.... Wait I already have one.... That I installed Ubuntu on and then somehow corrupted.

And never got around to fixing it because I didn't really need it... Or have an 8GB USB stick around at the time.

Well now I saved $100 assuming the battery still works too

rant

5
13

Jumpshot44

10988

10y

I am sitting at Starbucks trying to focus and finish some posts for my side project. The lady at the next table has been loudly 😁talking about her Disney trip in excruciating detail. Do we really have to know she will store her bagel and tuna fish in the hotel room fridge - really!

30 minutes in and she is just getting to the hotel - yikes 😫

No other seats open. I am trapped. Project deliverable delay - reason loud lady.

I was going to do this as a speech to text post but too nice to do that. HELP!

undefined loud lady starbucks

2
12

Grumpy

2866

9y

Peeve of the week: Youtube videos with robot voices (text-to-speech).

Youtube needs detection and a filter option to let users remove those vids from search results.

undefined
11

Jumpshot44

10988

10y

So yesterday I discussed how I am using speech to text to do approximately 50% of my rants. I am now doing a growing percentage of my outlook emails by voice as the human-computer voice interaction is pleasing and very natural. I have even named my iPhone 'little jumpshot' today.

Today I experimented with text to speech so that my rants are automatically read back to me before I send them. Some decent results.

In settings - general - accessibility you will find voice over (not recommended - be careful). Below that is Speech - speak selection or speak screen options.

Speak selection allows you to highlight text to be spoken. Too much human interaction for my purposes of walking and hopefully not tripping by looking down. Using up my nine lives 😐

Below voiceover is - Speak screen - which allows you to pull down the screen with two fingers to speak what is on the screen. This will read the rant, or if there are multiple rants on the screen it will read those as well.

It works but it will take a bit of getting used to. It also requires a few clicks here and there.

My goal is to interact with devRant fluidly 100% by voice. Just talking to 'little jumpshot' and him creating and posting all of my rants and reading all the other rants developers post.

For a few days experimenting I am satisfied with the progress but there is a long way to go.

Hopefully, in the end, this may help some people. Any ideas are very welcome.

undefined speech to text text to speech

4
11

kiki

36855

4y

“An engineer?!… An open, shining mind, easy and inoffensive humour, this wide reach, they’re switching from one engineering realm to another, and really, from tech problems to society, then — to art. Those manners, that fine taste, good speech, coherent and free of filler words. One engineer is also a musician, another one — an artist, but all of them have those smart eyes…”

INCREASE SALES
this text is not for managers like you.

random

6
9

Chea

351

7y

I wonder if they have speech to text for code.

Var cars equals left bracket quote Saab quote comma quote Volvo quote comma quote BMW quote right bracket semi-colon
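As a toy illustration of what such a tool would have to do with that dictated line, here is a small sketch. It is not a real product, and the tiny vocabulary (`equals`, `left bracket`, `quote`, and so on) is an assumption chosen to cover only this one example. A word like "quote" or "comma" inside a string would confuse it.

```python
TWO_WORD = {("left", "bracket"): "[", ("right", "bracket"): "]"}
ONE_WORD = {"equals": "=", "comma": ",", "semi-colon": ";", "semicolon": ";"}
KEYWORDS = {"var", "let", "const"}  # lower-cased when dictated as "Var"
NO_SPACE_BEFORE = {",", ";", "]"}


def spoken_to_code(utterance):
    """Convert a dictated phrase into a line of JavaScript-like code."""
    words = utterance.split()
    out = ""
    in_string = False
    i = 0
    while i < len(words):
        word = words[i]
        lowered = word.lower()
        pair = (lowered, words[i + 1].lower()) if i + 1 < len(words) else None
        step = 1
        if lowered == "quote":
            token = '"'
        elif in_string:
            token = word  # words inside a string literal are kept verbatim
        elif pair in TWO_WORD:
            token, step = TWO_WORD[pair], 2
        elif lowered in ONE_WORD:
            token = ONE_WORD[lowered]
        else:
            token = lowered if lowered in KEYWORDS else word

        closing_quote = token == '"' and in_string
        first_in_string = in_string and out.endswith('"')
        # Decide whether the token is glued to the previous character.
        attach = (
            not out
            or closing_quote
            or first_in_string
            or (not in_string and (token in NO_SPACE_BEFORE or out[-1] == "["))
        )
        out += token if attach else " " + token
        if lowered == "quote":
            in_string = not in_string
        i += step
    return out
```

Feeding it the dictated line from the rant yields `var cars = ["Saab", "Volvo", "BMW"];`.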

rant

3
8

vane

10439

6y

So I decided to run Mozilla DeepSpeech against a dataset in my local language, using transfer learning from the existing English model.
I adjusted the alphabet and began training.
I have a PC with a GTX 1080 lying around, so I used that, but I recommend at least a newer RTX 3080 so you don't waste time (you can read how long it took below).

I waited 3 days and the error went down to about 30, so I switched the dataset and the error went to about 1 after a week.
Yeah, I waited a whole goddamn week because I don't use this computer daily.

So I picked some audio from YouTube to convert speech to text, and it works a little. It's not a masterpiece, and I didn't test it extensively or fine-tune it, but it works as I expected. It recognizes some words perfectly, others partially, and others not at all.

I stopped testing at this point as I don't have any business use or plans for this, but I'm probably one of only a couple of companies / people right now who have a speech-to-text machine learning model for my native language.

I was doing transfer learning for the first time; it was also my first time training from audio and waiting so long for results. I can say I'm now convinced that ML is something big.

To sum up: with the right amount of money and time, about 1-3 months, you can probably make decent speech-to-text software at home that will work well with your accent and native language.

random experiment machine learning speech to text
8

QueenMorgana

13937

8y

Suggestions for a good speech to text program for someone who mumbles and talks too fast?

I sliced my thumb while washing a knife and typing on my computer without it is getting annoying

question never let me near sharp objects or flames thanks i know i'm an idiot

3
8

IronPhreak

1028

10y

Sweet, my motivation for coding my personal projects has started to come back.

Last night I set up my Personal Assistant project with text to speech and voice recognition.

Now I just have to get it to react to commands.

undefined python assistant javis-wanna-be

6
8

caramelCase

10462

4y

Quite amazingly, yes!

as a matter of fact one of my parents is actually also in information technology or related field so there are very much aware of how in demand the job is and how difficult it is as well and the best part is a lot of my engineering friends are also switching to computer science and just because it is the better choice of because of how over saturated the engineering field is so yeah i think i have a better career choice than most of my peers

(PS: I used Speech to text here so forgive the grammar errors)

rant wk287

1
7

AleCx04

26871

6y

For those of you scared of the ZOMG imminent threat of AI.....

In Spanish, in particular to the way it is spoken in Mexico, we know curly hair to be called "chino" or "chinos" in certain places. This is funny because Chino is actually what we call Chinese people.

So. The other day I mentioned in a friend of mine's post the text "pinches chinos" in regards to the pain of having curly hair(which I also have) during windy days.

FB being the retarded piece of shit that it is took it as hate speech. "Pinches chinos" can be roughly translated to "fucking curly hair" in this regard, but because FB is retarded as all fuck it took it as me spewing some hate speech against their Chinese overlords.

I normally wouldn't give a fuck, if it weren't that one of my friends is celebrating his birthday today and I can't post shit on his wall due to me being in Facebook jail.

I have known this dude since I was 6, currently 29, but no, FB decided that I was some racist prick somehow and because of that I can't go ahead and post something to him. It's fine, I was still capable of calling him and celebrating with my boy, but still.

An AI will not be able to detect the difference between a fucking cat and a lion, it is shitty technology, it is interesting because of the math behind it, but seriously, not something to be scared about, skynet is far from coming into existence.

Fuck FB and fuck people scared about AI and deep learning

random

12
7

xorith

2643

8y

So I'm pretty sure that when I went all in to Apple's Siri shitfest and turned on "Hey, Siri" that I had to accept some kind of privacy notice or some shit. Essentially, being an iPhone user and turning on that service, I've agreed to allow them to listen to me.

But what about the guy who sits next to me? Are his rights to privacy being infringed because my phone can also hear him? If so, who is the one infringing? Me or Apple?

Apple is just an example, but imagine if someone has an Echo in an office or a Google Home. Or what if you're unknowingly standing next to someone with a Google or Apple device that's always listening?

I know my old Android phone had picked up people at the grocery store before. I never turned on "Ok Google" but I used the speech to text of the keyboard a few times. When people showed how you could go see what Google had "heard", I was surprised to find how many OTHER people it picked up.

Anyway, just some thinking.

question rights alexa-is-evil-but-i-like-it assistants privacy
7

vane

10439

5y

Ability to understand all machine learning models to modify code and those models directly and create better ones every time.

I would take an existing ML model, modify it by hand to create a better one, win some multimillion-dollar competitions, and make them open source.

Eventually all recommendation systems, text to speech, speech to text, music generation, movie generation etc. would be open source.

This would either destroy or boost the whole modern economy, but for sure it would do harm to corporations and make them cry.

That would be fun to see.

rant wk262 machine learning

6
7

Jumpshot44

10988

10y

iPhone speech to text has come a long way. Definitely has improved. Real-time dictation rather than batching it.

I am currently doing approximately 50 percent of my rants by voice. In fact the rant you are reading I did by voice.

You can easily do punctuation such as a period, new paragraph, new line, caps and lower case. The speech recognition is excellent even with my New York accent and it learns the more you use it. Rarely does it get a word wrong.

Editing still has to be done manually and is a pain but that may change as Dragon already allows you to do in-line editing. iOS speech to text has already surpassed Dragon in some facets.

I do have to press the add new and post buttons at this time to post my rants. But that may change as the enhanced dictation on the Mac allows you access to specific commands.

I will keep you informed of progress and I will be testing on android over the next few days as well.

undefined speech to text experiment

4
7

aalonzolu

317

348d

Client: “We need a very simple app where waiters speak orders, and it prints automatically. No typing.”

Also client:

Speech-to-text with noise

Identify table, dish, modifiers

Group orders per table

Add more orders later

Dual printer output

CSV reports

Budget? Less than a round of beers at the bar.

joke/meme web-dev ai clients unrealistic-expectations sarcasm speech-to-text devrant programmer-life software-estimates freelance voice-recognition

9
7

NitinSahu

461

7y

Damn happy to see this much traffic in my repo...
Title: Audio book generator
GitHub link:
https://github.com/globefire/...

Demonstration:
https://youtu.be/xhMvGg1dAsg

Star if you like it.. :)

rant speech to text audio books? text to speech innovative github audio books github audio project ebooks github star nailedit
6

myss

4396

7y

Eavesdropping via the phone's microphone and speech recognition to serve targeted ads by Google? Has anyone here had a feeling this happened to them, or does anyone know whether this is already a thing?

It happened to me on my Android phone multiple times over the last year, on different subjects. I was talking live with a person, for example about how someone had eyelid surgery (my phone was locked in my pocket the whole time, and I didn't Google what that is or make any text input into the device whatsoever), and a couple of minutes later an ad came up on my phone for exactly what we had been discussing. Weird coincidence or something more? 🤔

rant speech recognition google eavesdropping conspiracy ads 9/11

9
5

Lensflare

21314

1y

Have you tried chatgpt's text-to-speech feature?
It’s so much better than anything that I tried before. You can even choose different "personalities" or tones or whatever.

I‘d even say that it‘s perfect. I can’t think of anything that could be improved in terms of how well it pronounces words and puts emphasis on specific words. It’s 100% natural sounding.

random text2speech chatgpt

8
5

memeboard

1389

8y

I am building a web application which is multimedia centric (mostly video chat). Text to Speech and vice versa.
I have chosen Node with MongoDB as the backend API, with a React frontend.
What stack would you suggest for such an application?

question

8
5

vane

10439

7y

Next personal fail ...
previous rant
https://devrant.com/rants/2060249/...
Turned out that WaveNet is sequential, so it needs the previous step to predict the next.
Quite obvious when you look at how people speak sentences: they hardly ever stop in the middle of a word.
🤔
Need to think about how to proceed next, how to cut sentences.
Watched DeepVoice3 and some accent models from Baidu.
I can generate 8 sentences at a time and each takes 8 minutes, so if I cut between words and get the last mels between words right I can get down to 1 minute, but I need to store the model somewhere.

I forgot my machine learning and speech synthesis skills from previous life, time to load more skills ...

rant matrix text to speech machine learning developers life
5

tamasane

119

8y

Some professors at my university just come to the class and read out the pdf/slides.

Now I know where the idea for audiobooks and text-to-speech PDF readers came from!!!

rant boring lectures teaching wk89 university pdf reader mumbai

4
4

marthaeclark

4

5y

HTML Writers Guidelines

When designing your web site you want to make the visiting experience as enjoyable as possible and at the same time make it so that if the site needs to be changed in any way, the changes are not too difficult to make. You want the look to be as appealing as possible for all browsers and also make the site accessible to users with disabilities. In order to accomplish all this there are some general guidelines when creating your HTML code.

1. The first thing that will really make your life easier is through the use of Cascading Style Sheets (CSS) - CSS is used to maintain the look of the document such as the fonts, margins and color. HTML directly on the page is not a good choice to handle these aspects because if say, the font color you are using for certain paragraphs needs to be changed from blue to red, you would have to go in and change each color tag manually. By using CSS you can designate the color for each of those paragraphs just once in the CSS file. That way if you have to change the font color from blue to red you make one change instead of the countless number of changes you might have to make, especially if your web site contains hundreds of pages. This is a big time saver and a must for all professionally designed web sites.

2. Don't use the FONT tag directly in your HTML code - This becomes a problem when using some cheap authoring tools that try to mimic what a web page should look like by using excessive FONT tags and nbsp characters. These tools end up creating web pages that are impossible to maintain. There is a program you can use, if you've created one of these disaster pages, called the HTML Tidy Program which you can actually download here. This will clean up your code as well as possible.

3. You want your web pages to be readable by people who have disabilities - Some people who surf the Internet depend on speech synthesizers or Braille readers to interpret the text on the page. If your HTML markup is sloppy or the styling isn't kept in CSS, the software these people use to read pages has a difficult time interpreting them. You should also include descriptions for each image on your page. Also, don't use server side image maps. If you are using tables you should include a summary of the table's structure and also associate table data with the correct headers. This gives non visual browsers a chance to follow the page as they go from one cell to another. And finally, for forms, make sure you include labels for form fields.

By following just these three guidelines you give your visitors, especially disabled visitors the best chance of having an enjoyable visit to your site while at the same time making it so that if you have to make changes to your site, those changes can be made easily and quickly.

rant html

2
4

bazmd

268

2y

Several years ago I spent over two months working out how to integrate Text To Speech and Speech To Text (TTS/STT) into any Windows program I wrote in Delphi, originally for a powerful flat-file search engine. Does anyone know if TTS/STT is useful on Windows 10+ or has any use?

I was thinking about redeveloping the search engine into a stand alone program which can be used as a fast and light query tool with trigger functions. It can be made into a "reply bot" or used with a server like Apache, but without the old IBM mainframe mentality being readopted as "AI" and "social media" everywhere today. Low-level, independent and secure droid-like systems sound more fun to develop.

question droid flatfile tts thoughts stt
4

leosuncin

185

9y

For my final project of first year at middle school (that's before university), I had to run an experiment and measure it using a circuit connected to the computer. In the end I couldn't finish it, but I made a program to explain what the circuit was (expected) to do using one of the Microsoft Office assistants (Merlin the wizard). Merlin moved around the screen talking about the experiment and what the circuit measured, over and over. I almost forgot to mention: I had to show it at a science festival to anybody who came to the school. Nobody asked about the experiment or the circuit; all the questions were about how I made the program and how the program could speak Spanish and explain the experiment.
At the beginning of that day I was so nervous, but at the end I could say fuck yeah.
The program was a macro in Basic with text to speech using a Loquendo-like voice; I only recorded the movements and put in the text.
That's one of the reasons I like programming: it saved my ass.
That was more than ten years ago. I didn't have a computer except at school, and internet wasn't so common.

undefined school basic true story

4
4

kosio-t

2348

7y

So happy!
I made my first project (or at least started) using my iPad (with some help from my laptop).

I am trying to make it possible for web comic artists to upload their comics without any text in the speech bubbles and then load the text using javascript for the specific locale.

It’s in an early stage (a few hours old) and the editor and the viewer share data only with cookies and local storage instead of a server but it's still a concept.
What do you think?

Github: https://github.com/konstantintuev/...

rant webcomics web development concept early stage internationalization

2
4

Ganofins

730

7y

Anybody know about a good open source speech to text engine?

I googled but there are tons of them and I don't have much time right now to try each of them out

What I actually want is just to convert the audio (in English) to text and would also want to note the time those sentences were spoken in the audio, like a subtitle file.
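Whichever engine ends up doing the recognition, the "like a subtitle file" part is easy to sketch. This assumes the engine returns timed segments as `(start_seconds, end_seconds, text)` tuples, which is an assumption about its output rather than any specific library's API. The output follows the common SubRip (.srt) layout.

```python
def format_timestamp(seconds):
    """Render seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02}:{minutes:02}:{secs:02},{millis:03}"


def to_srt(segments):
    """Turn (start_seconds, end_seconds, text) tuples into SRT text."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

Writing the returned string to a `.srt` file gives something most video players can load next to the audio.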

question stt speech to text

7
4

nosoup4u

1945

9y

Inspired by an overheard conversation (partial) among some of my co-workers:

I'm going to make an app that takes a speech sample, either text, or audio file, and accurately gauges the speakers' ages based on the number of times per minute the word "restaurant" is used.
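The measurable half of that idea is just a rate. A minimal sketch, assuming the audio has already been transcribed and its length is known; the function name is made up, and the mapping from rate to age is left as the joke it is.

```python
import re


def mentions_per_minute(transcript, duration_seconds, word="restaurant"):
    """Count how often a word (or its plural) is said per minute of speech."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    pattern = re.compile(rf"\b{re.escape(word)}s?\b", re.IGNORECASE)
    count = len(pattern.findall(transcript))
    return count / (duration_seconds / 60)
```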

undefined jeopardy comes on at seven metamucil blue plate special

1
4

NitinSahu

461

7y

Developed this project "Audio Book Generator"

Implementing speech synthesis(📖 to 🗣) on eBooks
Bored with writing notes in a lecture? How about we convert the notes dictated by the lecturer into text? Use the speechtotext.py script to get the text format of spoken notes, which saves the text in a .txt file.
Too lazy to read a novel? Get an Ebook version of the novel and run the finalAudioBookGenerator.py script. It will generate an mp3(audio) format of the book. Enjoy book listening :)
You can also convert your single images using the singleImageReader.py script.

Demonstration:
https://youtu.be/xhMvGg1dAsg

Project:
https://github.com/globefire/...

Star If you liked it. :)

rant project python github audio books speech synthesis youtube text to speech speech to text tesseract

3
4

vane

10439

2y

Are text suggestions on phones a threat to free speech?

question future

3
4

Drekel

724

6y

Good day all
This is a Text Detector app I created using Google API and firebase MLKit
https://play.google.com/store/apps/...
Text to speech
Translate up to 60 languages
Download and give a review

rant mechine learning android java firebase

13
3

Wisecrack

9195

4y

Anyone tried converting speech waveforms to some type of image and then using those as training data for a stable diffusion model?

Hypothetically it should generate "ultrarealistic" waveforms for phonemes, for any given style of voice. The training labels are naturally the words or phonemes themselves, in text format (well, embedding vectors fwiw)

After that it's a matter of testing text-to-image, which should generate the relevant phonemes as images of waveforms (or your given visual representation, however you choose to pack it)

I would have tried this myself but I only have 3gb vram.

Even rudimentary voice generation that produces recognizable words from text input, would be interesting to see implemented and maybe a first for SD.

In other news:
Implementing SQL for an identity explorer. Basically the system generates sets of values for given known identities, and stores the formulas as strings, along with the values.
For any given value test set we can then cross reference to look up equivalent identities. And then we can test if these same identities hold for other test sets of actual variable values. If not, the identity string can be removed, or gophered elsewhere in the database for further exploration and experimentation.
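One way the cross-referencing could be laid out, as a sketch only: the table name, columns, and the choice to round values to 9 decimals before comparing are all assumptions, not the poster's schema. It uses Python's built-in `sqlite3` with an in-memory database.

```python
import sqlite3


def build_store():
    """Create an in-memory table of (formula, test set, computed value)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results (formula TEXT, test_set TEXT, value REAL, "
        "PRIMARY KEY (formula, test_set))"
    )
    return conn


def record(conn, formula, test_set, value):
    # Rounding is an assumption so that float noise does not hide matches.
    conn.execute(
        "INSERT OR REPLACE INTO results VALUES (?, ?, ?)",
        (formula, test_set, round(value, 9)),
    )


def candidate_identities(conn, test_set):
    """Pairs of formulas that produced the same value on one test set."""
    return conn.execute(
        "SELECT a.formula, b.formula FROM results a JOIN results b "
        "ON a.test_set = b.test_set AND a.value = b.value "
        "AND a.formula < b.formula "
        "WHERE a.test_set = ? ORDER BY a.formula, b.formula",
        (test_set,),
    ).fetchall()


def holds_on(conn, formula_a, formula_b, test_set):
    """True if both formulas have equal recorded values on another test set."""
    rows = conn.execute(
        "SELECT value FROM results WHERE test_set = ? AND formula IN (?, ?)",
        (test_set, formula_a, formula_b),
    ).fetchall()
    return len(rows) == 2 and rows[0][0] == rows[1][0]
```

Candidates that match on one test set but fail `holds_on` for another are the ones to remove or move elsewhere.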

I'm hoping by doing this, I can somewhat automate the process of finding identities, instead of relying on logs and using the OS built-in text search for test value (which I can then look up in the files that show up, and cross reference the logged equations that produced those values), which I use to find new identities.

I was even considering processing the logs of equations and identities as some form of training data perhaps for a ML system that generates plausible new identities but that's a little outside my reach I think.

Finally, now that I know the new modular function converts semiprimes into numbers with larger factor trees, I'm thinking of writing a visual browser that maps the connections from factor tree to factor tree, making them expandable and collapsible, and allowing adjusting the formula and regenerating trees on the fly.

random

6
3

thiagoavadore

703

9y

A medical equipment that you can attach to employees and excruciatingly kill them as soon as they say things like (please note that the list is not limited and we should use a speech to text API to provide NLP states for the meaning - I want to catch all false negatives!! Kill them all!!!!):

- It works on my machine
- I tested it before!
- Haskell is a terrible language
- Big data and actionable insights
- why do you need unit tests here?
- I am a recruiter
- Anything that comes with the following construction as well: "I don't have anything against X, but..."

Any other suggestions of phrases?

undefined wk31

1
3

applejag

1395

7y

Coding a voice controlled IoT project is all fun and games in research until you notice no frameworks support your native language...

rant speech to text iot cognitive services

2
3

cuddlyogre

1518

2y

Do you want to use text to speech huh? Ha ha ha here’s a low battery pop up to completely derail what you were saying and make you repeat the last 10 seconds.

An actually good design would have waited until the text to speech function was complete to pop up the message, or at least wouldn’t stop recording what’s being said. But I guess I don’t understand innovation.

Think different indeed

rant

10
2

fraktalisman

1436

5y

Android Text-to-speech output: "English (Germany) is not supported" (my default language) ... but no alternative language option offered.

rant texttospeech android

1
2

LOLjustCoding

1088

7y

Looking for a speech-to-text library for Python for a home automation project. SpeechRecognition doesn't really work out for me and Google won't give me any other good alternatives. Thanks!

question

3
2

AvatarOfKaine

3680

5y

I feel really lost in neural network theory.
The MNIST sample made sense, but now I'm looking at GANs and CNNs.. and now all of a sudden I'm lost.

The same goes for the examples I'm finding for WaveNet for text to speech, something I know I was able to get working once upon a time when I was more at peace.

I used the Onyx model however which was very easy to implement, but I quickly get lost looking at the tensorflow and pytorch code, even though it is very short I feel intimidated.

The ssd mobilenet documentation also is pretty straightforward, but when I look for wavenet information about providing input in what format and interpreting output I'm having some trouble.

It's frustrating.
I'm tense, I'm poorly rested, I'm sick of having to redo crap and I'm surrounded by people who make me hypervigilant, skin crawly and tense.

How to overcome these things when I'm not at peace at all ?

I don't know. Pushing through it isn't compatible with the mindset I've been forced into.

random

5
2

vane

10439

2y

Part 3

https://devrant.com/rants/9881158/...

I dropped subtitles and started extracting audio from the movie. After that I use whisper to convert speech to text.

I parse the srt from whisper and adjust timestamps so each chunk has >= an arbitrary amount of voice seconds. I put the text into a vector database with timestamps and the movie file name.
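The srt step could look roughly like the sketch below. It is only a guess at the approach, not the actual code from this project: `Segment`, `parse_srt`, and `merge_min_voice` are hypothetical names, and "voice seconds" is assumed to mean the summed duration of the subtitle cues, ignoring the silent gaps between them.

```python
import re
from dataclasses import dataclass

@dataclass
class Segment:
    start: float  # seconds from the start of the movie
    end: float
    text: str

# Matches SRT timestamps such as 00:01:02,500
_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")

def _to_seconds(stamp: str) -> float:
    h, m, s, ms = map(int, _TIME.search(stamp).groups())
    return h * 3600 + m * 60 + s + ms / 1000

def parse_srt(srt: str) -> list[Segment]:
    """Turn raw SRT text into a list of timed segments."""
    segments = []
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            if "-->" in line:
                start, end = line.split("-->")
                text = " ".join(part.strip() for part in lines[i + 1:])
                segments.append(Segment(_to_seconds(start), _to_seconds(end), text))
                break
    return segments

def merge_min_voice(segments: list[Segment], min_seconds: float) -> list[Segment]:
    """Join consecutive cues until each chunk holds at least min_seconds of speech."""
    merged = []
    current = None
    voiced = 0.0
    for seg in segments:
        if current is None:
            current = Segment(seg.start, seg.end, seg.text)
        else:
            current = Segment(current.start, seg.end, f"{current.text} {seg.text}")
        voiced += seg.end - seg.start
        if voiced >= min_seconds:
            merged.append(current)
            current, voiced = None, 0.0
    if current is not None:
        merged.append(current)  # keep a trailing chunk even if it is short
    return merged
```

Each merged `Segment` would then be stored with its start time, end time, and the movie file name, which is what makes it possible to cut the matching clip out later.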

I query the database for e.g. “I don’t know” and extract the first n results. After that I walk through the movies and extract the parts with the found text.

I normalize and merge parts into one movie.

Results are satisfying so now I decided to try to find a common dialogue that I can watch by combining multiple persons speaking from multiple movies.

I might also try to extract a person from one movie and put them into another movie.

rant artificial intelligence movies

2
2

NitinSahu

461

6y

Are you out of your free Medium articles?😢 My Scrapy is here for the rescue.💸
This is a simple application of web scraping: it scrapes Medium articles and lets you read or hear the article. If you use it on a computer, there will be a number of accents in the options.
The audio feature is provided only to premium Medium users, so here comes My Scrapy to save your $5/month. 💸
.

Tech Stack used :
Python, beautiful soup, Django, speech synthesis
.

PS: This application was built for educational purposes and the source code for this application is not open sourced anywhere.
Fun Fact : You can still read any Medium article if they ask you to upgrade. You must be wondering how? Well, copy the link of the article and browse it in incognito mode on any browser.😂🤣

Try the app and lemme know if you liked it:
https://mymediumscraper.herokuapp.com/...

rant medium.com hackerman article django python python ranting after a long time text to speech

4
2

Wisecrack

9195

2y

Chinese remainder theorem

So the idea is that a partial or zero knowledge proof is used for not just encryption but also for a sort of distributed ledger or proof-of-membership, in addition to being used to add new members where additional layers of distributive proofs are at it, so that rollbacks can be performed on a network to remove members or revoke content.

Data is NOT automatically distributed throughout a network, rather sharing is the equivalent of replicating and syncing data to your instance.
Therefore if you don't like something on a network or think it's a liability (hate speech for the left, violent content for the right for example), the degree to which it is not shared is the degree to which it is censored.

By automatically not showing images posted by people you're subscribed to or following, infiltrators or state level actors who post things like calls to terrorism or csam to open platforms in order to justify shutting down platforms they don't control, are cut off at the knees. There may also be a case for tools built on AI that automatically determine if something like a thumbnail should be censored or give the user an NSFW warning before clicking a link that may appear innocuous but is actually malicious.

Server nodes may be virtual in that they are merely a graph of people connected in a group by each person in the group having a piece of a shared key.
Because the Chinese remainder theorem only requires a subset of all the info in the original key, it also acts as a voting mechanism to decide whether a piece of content is allowed to be synced to an entire group or remain permanently.
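For anyone who wants to see the "subset of the key" part concretely, here is a toy sketch of a Mignotte-style threshold scheme, which is one known way to build k-of-n sharing on the Chinese remainder theorem. It is not the design described in this post, just an illustration: the moduli, the secret, and the names `make_shares` and `recover` are all made up, and real schemes use far larger numbers (Mignotte sharing also leaks some information about the secret, so treat this as a demo only).

```python
from itertools import combinations
from math import prod

# Toy parameters: 4 members, any 3 of them can recover the key.
MODULI = [11, 13, 17, 19]   # pairwise coprime, one per member
THRESHOLD = 3
SECRET = 1000               # must satisfy 17*19 < SECRET < 11*13*17

def make_shares(secret: int, moduli: list[int]) -> list[tuple[int, int]]:
    """Each member holds (modulus, secret mod modulus)."""
    return [(m, secret % m) for m in moduli]

def recover(shares: list[tuple[int, int]]) -> int:
    """Combine shares with the Chinese remainder theorem."""
    modulus = prod(m for m, _ in shares)
    total = 0
    for m, r in shares:
        partial = modulus // m
        # pow(x, -1, m) is the modular inverse (Python 3.8+)
        total += r * partial * pow(partial, -1, m)
    return total % modulus

shares = make_shares(SECRET, MODULI)

# Any 3 members "voting" together reconstruct the key...
for group in combinations(shares, THRESHOLD):
    assert recover(list(group)) == SECRET

# ...while any 2 members only get the secret modulo a too-small product.
for group in combinations(shares, THRESHOLD - 1):
    assert recover(list(group)) != SECRET
```

In the voting reading, a member "votes" by contributing their share: content tied to the key is only unlocked for the group once at least `THRESHOLD` members have done so.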

Data that hasn't been verified yet may go into a cache for a given cluster of users who are mutually subscribed or following in a small world graph, but at the same time it doesn't get shared out of that subgraph and may expire if enough users don't hit a like button or a retain button or a share or "verify" button.

The algorithm here then is no algorithm at all but merely the natural association process between people and their likes and dislikes directly affecting the outcome of what they see via that process of association to begin with.

We can even go so far as to dog food content that's already been synced to a graph into evolutions of the existing key such that the retention of new generations of key, dependent on the previous key, also act as a store of the data that's been synced to the members of the node.

Therefore members that continually post content that doesn't get verified slowly fall out of the node, such that eventually their content becomes merely temporary in the caches or index of the node members, driving index and node subgraph membership in an organic and natural process based purely on affiliation and identification.

Here I've sort of butchered the idea of the Chinese remainder theorem and shoehorned it into the idea of zero knowledge proofs, but you can see where I'm going with this if you squint at the idea mentally and look at it at just the right angle.

The big idea was to remove the influence of centralized algorithms to begin with, and implement mechanisms such that third-party organizations that exist to discredit or shut down small platforms are hindered by the design of the platform itself.

I think if you look over the ideas here you'll see that's what the general design thrust achieves or could achieve if implemented into a platform.

The addition of indexes in a node or "server" or "room" (being a set of users mutually subscribed to a particular tag or topic or each other), where the index is an index of text, audio, videos, and other media including user posts that are available on the given node, and the index being titled but blind links (no pictures/media, or media verified as safe through an automatic tool), would also be useful.

random open platforms social media rtd distributed ledgers

9


devRant © 2021 Hexical Labs LLC