Comments
-
retoor81492d That's the big issue: hackers scan my repositories all the time with so much concurrent brute force that I had to add rate limiters. I've since migrated to a dedicated server, so meh. Still 500 GB+ a month of HTML serving tho. It's stupid to brute force all that shit; it only makes people notice, and why can't you just wait an hour for the results? Scanning my repositories takes days, especially with the hard rate limiter at the firewall level. If they didn't brute force so hard, they wouldn't have hit a rate limiter that makes their task impossible. Idiots btw, they're just scanning for auth things in git history. Just clone the repos and scan locally. Saves days, and I wouldn't notice it because cloning is normal behavior.
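For readers unfamiliar with the rate limiters mentioned above, here is a minimal token-bucket sketch. It is not retoor's actual firewall setup (the thread does not describe it); the class name, parameters, and injectable clock are all hypothetical and only illustrate the general idea.

```python
import time


class TokenBucket:
    """Allow roughly `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.clock = clock          # injectable so the behavior can be tested
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        """Return True if a request may pass right now, False if it is throttled."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A client that hammers the server drains the bucket at once and then gets refused until tokens refill, which is roughly why the brute-force scans described above end up taking days.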
-
@retoor yeah except I'm only running 5 to 10 threads on small html fragments I'm parsing.
And the LAN is fucking throttling me to like 5 pages every 4 seconds using wget
-
@retoor nm I remembered the reason I'm using wget is requests was throwing an error
-
retoor81492d @AvatarOfKaine I recently found out how you can crawl using thousands of IP addresses. I prefer aiohttp to do the requests async these days, I never use the requests library. Asyncio will be much faster than requests and wget. It also checks whether it has to use IPv6 or IPv4, which could've been the issue with your requests, or the default requests user agent was blocked. Try copying the headers of your browser. http://molodetz.nl/debug.json lists your headers.
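A rough sketch of the async approach with a cap on concurrency and browser-like headers. To keep it runnable without a network or third-party packages, `fake_fetch` is a hypothetical stand-in for a real HTTP call (with aiohttp, that would be a request made through a client session); the header values are placeholders, not headers copied from anyone's browser.

```python
import asyncio

# Placeholder browser-like headers; copy the real ones from your own browser.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64)",
    "Accept": "text/html",
}


async def fake_fetch(url, headers):
    """Hypothetical stand-in for a real HTTP request."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"<html>{url}</html>"


async def crawl(urls, fetch=fake_fetch, max_concurrent=5):
    """Fetch all urls, never running more than max_concurrent at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def fetch_one(url):
        async with semaphore:
            return await fetch(url, HEADERS)

    # gather returns results in the same order as the input urls.
    return await asyncio.gather(*(fetch_one(u) for u in urls))
```

Calling `asyncio.run(crawl(list_of_urls, max_concurrent=5))` returns the pages in input order. Keeping `max_concurrent` small is the polite option and is less likely to trip a rate limiter.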
-
@retoor the problem with using asyncio and requests is the server or some part of the stack was rejecting it
-
retoor81492d @AvatarOfKaine don't you want to figure out what it is? I still vouch for headers or IPv6 usage or so.
-
@retoor Exception has occurred: SSLError
HTTPSConnectionPool(host='frs-public.epa.gov', port=443): Max retries exceeded with url: /ords/frs_public2/national_kml.registry_html?p_registry_id=110001406194 (Caused by SSLError(SSLError(1, '[SSL: UNSAFE_LEGACY_RENEGOTIATION_DISABLED] unsafe legacy renegotiation disabled (_ssl.c:1028)')))
ssl.SSLError: [SSL: UNSAFE_LEGACY_RENEGOTIATION_DISABLED] unsafe legacy renegotiation disabled (_ssl.c:1028)
During handling of the above exception, another exception occurred:
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='frs-public.epa.gov', port=443): Max retries exceeded with url: /ords/frs_public2/national_kml.registry_html?p_registry_id=110001406194 (Caused by SSLError(SSLError(1, '[SSL: UNSAFE_LEGACY_RENEGOTIATION_DISABLED] unsafe legacy renegotiation disabled (_ssl.c:1028)')))
-
@retoor
During handling of the above exception, another exception occurred:
File "/home/john/Documents/placeflattener_git/Utils/EPAScraping/htmltableextractor.py", line 56, in <module>
html = requests.get('https://ofmpub.epa.gov/frs_public2/...')
requests.exceptions.SSLError: HTTPSConnectionPool(host='frs-public.epa.gov', port=443): Max retries exceeded with url: /ords/frs_public2/national_kml.registry_html?p_registry_id=110001406194 (Caused by SSLError(SSLError(1, '[SSL: UNSAFE_LEGACY_RENEGOTIATION_DISABLED] unsafe legacy renegotiation disabled (_ssl.c:1028)')))
thrown by requests.get of a URL
-
@retoor this is a hefty government server that serves these pieces of data to platforms like arcgis and Google earth and I'm simply getting error 4 from wget
-
retoor81492d @AvatarOfKaine ah, that's very nasty. I would upgrade SSL and recompile Python I guess. What does file_get_contents("https://..") do in PHP? Interesting, wget apparently does not use libcurl. Never thought about it. Wget is always available afaik and curl not, huh. If you do ldd $(which wget), doesn't it show the SSL library?
Or you know what? Fuck it, if you don't wanna rape your whole system, just run it in a docker container. Sure that would be effective. Use a python container.
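A workaround that is often used for the UNSAFE_LEGACY_RENEGOTIATION_DISABLED error, and that nobody in this thread proposed, is to re-enable OpenSSL's legacy server connect option on a custom SSL context. The sketch below assumes the server only supports legacy renegotiation and that you accept the weaker security this implies. The helper names are hypothetical, and the `0x4` fallback is the OpenSSL flag value for Python versions whose `ssl` module does not export `OP_LEGACY_SERVER_CONNECT`.

```python
import ssl
import urllib.request

# Recent Python versions export this constant; 0x4 is the OpenSSL flag value
# used as a fallback (an assumption for older versions).
LEGACY_SERVER_CONNECT = getattr(ssl, "OP_LEGACY_SERVER_CONNECT", 0x4)


def legacy_context():
    """Default TLS context with legacy server connect re-enabled."""
    ctx = ssl.create_default_context()
    # Certificate verification stays on; only the renegotiation check is relaxed.
    ctx.options |= LEGACY_SERVER_CONNECT
    return ctx


def fetch(url, timeout=30):
    """Download url with the relaxed context and return the raw bytes."""
    with urllib.request.urlopen(url, context=legacy_context(), timeout=timeout) as resp:
        return resp.read()
```

This only relaxes the check for code that uses the context explicitly, so it is less invasive than rebuilding Python or changing the system-wide OpenSSL config. Whether it works against this particular EPA server is untested here.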
once.. there was this code that
had major bottle neck
when performing get calls
and when i finally fixed it
the feds
called me and told me to stop it
they said it was cuz that
the client hit their site
soooo hard
mmmmmmmMMMMmmmMMmmMMMMMMMmmmMMM
and then there was this time that
I adjusted the hitting rate
to something a little slower
and when I finally ran it
dem feds
they didn't even notice
I couldn't quite explain it
maybe if I raised the limit
hiiighhhh errrrrrr
MmmmmmmMMMMMMMmmmmmMMMMMMmmMMM
hahaha crash test dummies :P