
I Archived The Entire Subreddit And Coded A Simple Website To Read It (self.TheRedPill)

submitted by dream-hunter

Link: https://theredarchive.xyz/

Preview: https://i.imgur.com/BXXQOke.png & https://i.imgur.com/niDdoEW.png

As a web developer who discovered TRP a year ago and is very grateful for the subreddit, I've always wanted to contribute here, but I never knew how, until now. After TRP was quarantined, I feared it would get banned one day. So I decided to figure out a way to scrape the entire subreddit and make it viewable on a simple website.

I saw TRP's current backup of the subreddit and I wasn't happy with its design, how hard the website is to use, or its lack of posts. So I spent 8 hours figuring out how to scrape the entire subreddit and then coded a website to view the posts as simply as possible.

Features:

  • 160,035 posts from TheRedPill & askTRP & RedPillParenting & RedPillWomen & ThankTRP & becomeaman & altTRP & GEOTRP
  • Comments (+ replies) included
  • TRP's subreddit theme
  • Search through all the posts instantly
  • Lightweight and simple to use website & no ads
  • DDoS protected and secured

Edit: Thank you for the amazing feedback everyone. I just finished scraping RedPillWomen & RedPillParenting & ThankTRP and added them all to the website. Up to 1.5k posts added.

Edit 2: Almost every single post (around 64k) ever posted on TRP, from all the way back to 2012 till now, can now be viewed on the website.

Edit 3: Option to search through entire archive added; search through titles, posts, authors and even comments.

Edit 4: Every single post from askTRP, becomeaman and RedPillWomen (altTRP + GEOTRP) has been archived; that counts up to 60k posts since 2012. We are now at a total of 160k archived posts!


[–]Modredpillschool[M] [score hidden] stickied comment (5 children)

Thanks. So did we. https://www.forums.red/i/theredpill

We will always welcome more backup points, so thank you for helping out.

Our backup includes 3,176 posts between TRP and AskTRP, with over 519,619 individual comments.

Counting RPW and ThankTRP, our total archive is 4,573 posts.

Instant search can be found here

In case of emergency we will release the entire database as torrent. FYI.

Forums opening soon with new mobile-friendly design.

[–]ReeZoX 51 points52 points  (3 children)

Really like it!

The only thing I would like is to be able to sort/search the results by category (flair) and then sort those by the most upvoted ones :)

[–]izzyinjurious 24 points25 points  (12 children)

What language did you use to scrape it? It's awesome btw good work.

[–]SpiderAlpha33 2 points3 points  (0 children)

Yeah, BeautifulSoup is fast and reliable, and for dynamically generated websites I use Selenium with headless Firefox.

[–]ThePantsThief 2 points3 points  (5 children)

Why scrape it when they have an API? Fellow developer here. Curious in case the scraping solution is better somehow

[–]Modredpillschool 3 points4 points  (0 children)

The https://forums.red/i/TheRedPill archive was done via API

[–]needz 0 points1 point  (3 children)

Sometimes if you already have your favorite tool and it works, there's no need to learn another tool (or in this case, API).

[–]SilkTouchm 4 points5 points  (2 children)

Scraping is always harder than just using the api.

[–]needz 0 points1 point  (1 child)

Never say never or always.

So if I've been using BeautifulSoup since it came out, have an entire framework and dev environment dedicated to scraping websites and have little to no experience with APIs, which is gonna be harder?

[–]SilkTouchm 1 point2 points  (0 children)

which is gonna be harder?

Scraping, by far. It doesn't matter how much experience you have; you still have to do the legwork. An API is just a few pre-packaged methods, so you don't need experience with it.
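
For anyone following the scrape-vs-API argument, here is a minimal sketch of the difference, not the method anyone in this thread actually used. It pulls post titles out of the same two posts twice: once from HTML (using Python's standard-library `html.parser` as a dependency-free stand-in for BeautifulSoup) and once from a JSON payload of the kind an API would return. The names `TitleScraper`, `titles_from_html`, `titles_from_json`, and both sample strings are hypothetical, made up for illustration.

```python
import json
from html.parser import HTMLParser


class TitleScraper(HTMLParser):
    """Collects the text of every <a> element whose class list contains "title"."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # The scraper has to know the page's markup: which tag, which class.
        classes = (dict(attrs).get("class") or "").split()
        if tag == "a" and "title" in classes:
            self._in_title = True
            self.titles.append("")

    def handle_data(self, data):
        if self._in_title:
            self.titles[-1] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_title = False


def titles_from_html(page):
    """Scraping route: walk the markup and pick out the title links."""
    scraper = TitleScraper()
    scraper.feed(page)
    scraper.close()
    return [title.strip() for title in scraper.titles]


def titles_from_json(payload):
    """API route: the data is already structured, so just index into it."""
    listing = json.loads(payload)
    return [child["data"]["title"] for child in listing["data"]["children"]]


# Hypothetical sample inputs (assumed shapes, not real responses).
SAMPLE_HTML = (
    '<div class="thing"><a class="title" href="/r/x/1">First post</a></div>'
    '<div class="thing"><a class="title" href="/r/x/2">Second post</a></div>'
)
SAMPLE_JSON = (
    '{"data": {"children": ['
    '{"data": {"title": "First post"}}, '
    '{"data": {"title": "Second post"}}]}}'
)

if __name__ == "__main__":
    print(titles_from_html(SAMPLE_HTML))
    print(titles_from_json(SAMPLE_JSON))
```

The HTML route breaks whenever the site changes its markup, while the JSON route only depends on the field names, which is roughly the "legwork" point being made here.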

[–]the-dan-man 0 points1 point  (1 child)

Out of curiosity, why is PHP your favourite language, and where did you learn it?

[–]Brushyourteethm8 11 points12 points  (2 children)

Solid work, thanks! Will you be doing the same for AskTRP, MRP and AskMRP? Some solid posts and advice in each

[–]fuckboiwithfeelings 26 points27 points  (4 children)

The community that keeps on giving, way to go!

[–]Modredpillschool 12 points13 points  (2 children)

I can invite you to our private dev forum (open soon). Instead of duplicating efforts we have a lot of tasks that need doing.

[–]RedPillHanSolo 0 points1 point  (1 child)

I've sent you numerous e-mails, but haven't heard back. Is it because you didn't get them or you only invite trusted members and whatnot?

[–]Modredpillschool 2 points3 points  (0 children)

I haven't forgotten about you. I simply haven't gotten the private forum running yet. Lots of work going on in the background.

[–]robodylan123 10 points11 points  (2 children)

Did something similar here: http://trpbackup.com I have every single post backed up but only the top 1000 are browsable at the moment. You’ll be able to search all of them soon though.

[–]unn4med 2 points3 points  (0 children)

This is great that so many backups are being created! TRP shall live on.

[–][deleted] 7 points8 points  (3 children)

Great website but is there an option to search by date?

[–]hardlifeman 4 points5 points  (5 children)

Thanks for doing this.

When I click on the "load more comments" links in old posts, it opens up a separate "about:blank" page.

[–]Modredpillschool 1 point2 points  (0 children)

Until he fixes it, forums.red does have exhaustive comment chains.

[–]WarViper1337 4 points5 points  (0 children)

With the way things are going having extra back ups is probably a good thing.

[–]huhub 10 points11 points  (0 children)

I've been working on one of TRP backups, tried to post here, but the post did not make it.

The AskTRP announcement is here: asktrp/comments/9jxdtf/trp_sub_offline_backup/

I have completed 2016 and 2017, 2013-2015 are in progress, and I'm going to periodically update 2018. The entire archive contains 68,000+ posts plus countless comments, with no scripts or other unnecessary information (which reduced the size by half at least).

[–]El_Ejcovero 3 points4 points  (0 children)

You can never have too many backups, especially with the purging of "controversial" content and people online as of late.

[–]Magnus_ORily 3 points4 points  (0 children)

I feel like part of an ancient civilisation whose history has been preserved

[–]A_Bandini 1 point2 points  (0 children)

Beautiful work dude. Thank you for this. The sacred archives cannot be lost

[–]Mgtow_Maester 3 points4 points  (3 children)

Any chance you could do that for mgtow as well? Not to mention some other manosphere subs.

[–]JerryAwesome 0 points1 point  (0 children)

Awesome, thanks!

[–]ImmunosuppressedTau 0 points1 point  (0 children)

Thanks man!

[–]SalesOverEverything 0 points1 point  (0 children)

Just wanted to thank you as well. This makes a real impact on men everywhere, usually for the better.

Thank you.

[–]lemonized 0 points1 point  (0 children)

Thank you! This is just so compact and simply amazing. Contains every type of sorting and a live search. Amazing work. Never hurts to have more backups just in case I guess.

[–]_Neon_Shadow_ 0 points1 point  (0 children)

Amazing! I'm not in love with trp.red but this is an excellent substitute. Thank you.

[–]YesToControversy 0 points1 point  (0 children)

Have this virtual beer, mate. Don't drink it and drive though!

🍺

[–]lonefireinwater 0 points1 point  (0 children)

Thank you! This is amazing.

[–]duehvdke 0 points1 point  (0 children)

You're doing god's work my guy, keep it up!

[–]BlueFreedom420 0 points1 point  (0 children)

Thank you for this. Keep the memory before the big blackout (when the state and the elites finally take the Earth)

[–]MortalSisyphus 0 points1 point  (1 child)

This is amazing. It would be great to have this done on my subreddit, which is obviously at high risk of being banned. Since you already developed the code, would it be particularly difficult to set this up for another sub? Please let me know.

[–]ubisoft-vs-ea 0 points1 point  (0 children)

You are amazing, if I wasn’t poor I’d give you gold

[–]johnpayne10 0 points1 point  (0 children)

Dude, excellent job. The website interface is really smooth, and it loads very quickly. I want to give two suggestions though.

First: Maybe you shouldn't put asktrp posts in there? Most of the posts in asktrp are not worth anyone's time. It is a good thing to provide advice to someone asking for help. But I don't think it is necessary to add asktrp questions to the archive site. Add more TRP posts if you can, because the posts and the comments sections on TRP are invaluable. If you already have added the top posts from all time, add the ones from this current year. Add the hot posts. The new posts.

Second: For the posts that you have uploaded, only a part of the comments section is available. If possible, try to add the entire comments section for all posts, because there is a lot of gold in the comments.

That being said, you have done an excellent job. It will help a lot of people out there. So congratulations and thank you.

[–]wereworm5 0 points1 point  (0 children)

You are the hero we all wanted !

[–]Asktrpthrowaway420 0 points1 point  (0 children)

Really awesome, clean and easy to navigate

[–]standardmissile 0 points1 point  (0 children)

Brilliant. Apart from the practical benefit this is a great example of DOING rather than WHINGING in response to the quarantine.

Some readers here really are internalising TRP and it's fantastic to see. Well done OP.

[–]DulceDeLecheMardel 2 points3 points  (2 children)

Can you make it downloadable so we can mirror it?
