Diogenes
kiwifarms.net
Hello, I just made an account to ask for some advice, since the program I want to make could be of interest to the website. It archives all the images and text from a thread on 4chan. I'll describe it below and hope someone can either point me to a link where I can learn what I'm missing, or spoonfeed me:
1/ The program checks every thread on any number of user-specified boards once per tick; the tick interval might range from a few seconds to around 10 minutes.
2/ The moment the program finds a keyword the user is looking for, which might be the name of a person or whatever you are interested in collecting, the thread is monitored until it dies, and saved. This means all text and full-size images are stored.
3/ The program can be turned on and off whenever the user needs, and the saved threads are organized by board and kept in order.
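The three steps above can be sketched roughly as follows. This is a minimal outline and not a finished crawler. The class and method names (`ThreadWatcher`, `matchesAny`, `threadDir`, `startPolling`) are hypothetical, and the on-disk layout of one folder per board and thread is an assumption. The URL patterns follow 4chan's public read-only JSON API. Fetching and JSON parsing are left out.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper class; only the URL patterns come from 4chan's JSON API.
final class ThreadWatcher {

    // Step 1: run a polling task repeatedly, waiting tickSeconds between runs.
    static ScheduledExecutorService startPolling(Runnable pollOnce, long tickSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(pollOnce, 0, tickSeconds, TimeUnit.SECONDS);
        return scheduler; // call shutdown() on this to turn the program off
    }

    // One request returns every live thread of a board.
    static String catalogUrl(String board) {
        return "https://a.4cdn.org/" + board + "/catalog.json";
    }

    // All posts of a single thread.
    static String threadUrl(String board, long threadNo) {
        return "https://a.4cdn.org/" + board + "/thread/" + threadNo + ".json";
    }

    // Step 2: case-insensitive substring match of any keyword against post text.
    static boolean matchesAny(String text, List<String> keywords) {
        if (text == null) {
            return false;
        }
        String haystack = text.toLowerCase(Locale.ROOT);
        for (String keyword : keywords) {
            if (!keyword.isEmpty() && haystack.contains(keyword.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    // Step 3: assumed storage layout, <root>/<board>/<thread number>/
    static Path threadDir(Path root, String board, long threadNo) {
        return root.resolve(board).resolve(Long.toString(threadNo));
    }

    public static void main(String[] args) {
        List<String> keywords = List.of("project euler");
        System.out.println(catalogUrl("g"));
        System.out.println(matchesAny("Anyone solving Project Euler?", keywords));
        System.out.println(threadDir(Paths.get("archive"), "g", 123L));
    }
}
```

The comment text in the JSON is HTML, so stripping the tags before matching is probably needed. A thread counts as dead once its URL starts returning 404.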
For now, my only programming skills have to do with mathematics one way or another (see Project Euler). I've never made a web crawler before, and the only programming language I'm half decent at is Java. I'm asking some programming savant to show me the first step on my quest to make this program, which will be released on GitHub for everybody to play with.
I'm also concerned about how slow it might be and what limits there might be on searching for keywords, as 4chan holds a lot of threads and material at any given moment. I know it goes against the perishable philosophy of the website, but I want to archive some parts. I'm sorry for the autism this post shows; I hope someone helps me so I can make something worthwhile for this community.
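The speed concern can be estimated with a back-of-the-envelope sketch. It assumes one catalog request per board per tick plus one request per watched thread, and a budget of about one request per second. That budget is an assumption to check against 4chan's API rules, and `PollBudget` and its methods are hypothetical names.

```java
// Hypothetical estimate of how short a tick can be under a request budget.
final class PollBudget {

    // One catalog request per board, plus one request per watched thread.
    static long requestsPerTick(int boards, int watchedThreads) {
        return (long) boards + watchedThreads;
    }

    // Shortest tick, in whole seconds, that keeps the request rate under the budget.
    static long minTickSeconds(int boards, int watchedThreads, double maxRequestsPerSecond) {
        return (long) Math.ceil(requestsPerTick(boards, watchedThreads) / maxRequestsPerSecond);
    }

    public static void main(String[] args) {
        // 5 boards and 20 watched threads at 1 request per second
        System.out.println(minTickSeconds(5, 20, 1.0)); // prints 25
    }
}
```

Under these assumptions, the tick length depends on the number of boards and watched threads, not on how much material the site holds. Full-size image downloads are extra requests on top of this.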