Link: https://theredarchive.xyz/

Preview: https://i.imgur.com/BXXQOke.png & https://i.imgur.com/niDdoEW.png

As a web developer who discovered TRP a year ago and is very grateful for the subreddit, I've always wanted to contribute here but never knew how, until now. After TRP was quarantined, I feared it would get banned one day, so I decided to figure out a way to scrape the entire subreddit and make it viewable on a simple website.

I saw TRP's current backup of the subreddit and wasn't happy with its design, its hard-to-use website, or its lack of posts. So I spent 8 hours figuring out how to scrape the entire subreddit and then coded a website to view the posts as simply as possible.

Features:

  • 160,035 posts from TheRedPill & askTRP & RedPillParenting & RedPillWomen & ThankTRP & becomeaman & altTRP & GEOTRP
  • Comments (+ replies) included
  • TRP's subreddit theme
  • Search through all the posts instantly
  • Lightweight, simple-to-use website with no ads
  • DDoS protected and secured
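
For the "Comments (+ replies) included" feature, one way to picture it is a depth-first walk over nested comments. This is only a minimal sketch, not the site's actual code: it assumes each comment is stored as a dict with a hypothetical `replies` list, and the sample thread is made up.

```python
from typing import Iterator


def walk_comments(comments: list[dict], depth: int = 0) -> Iterator[tuple[int, dict]]:
    """Yield (depth, comment) pairs in display order: each parent before its replies."""
    for comment in comments:
        yield depth, comment
        # "replies" is a hypothetical field holding the child comments.
        yield from walk_comments(comment.get("replies", []), depth + 1)


# Made-up sample thread, for illustration only.
thread = [
    {"author": "alice", "body": "Top-level comment", "replies": [
        {"author": "bob", "body": "First reply", "replies": [
            {"author": "alice", "body": "Nested reply", "replies": []},
        ]},
    ]},
    {"author": "carol", "body": "Another top-level comment", "replies": []},
]

# Indent each comment by its depth, the way a threaded view would.
for depth, comment in walk_comments(thread):
    print("  " * depth + f"{comment['author']}: {comment['body']}")
```

Storing the depth alongside each comment is enough to render the indentation without keeping the nested structure around.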

Edit: Thank you for the amazing feedback everyone. I just finished scraping RedPillWomen & RedPillParenting & ThankTRP and added all three to the website. Up to 1.5k posts added.

Edit 2: Almost every single post (around 64k) ever made on TRP, from 2012 until now, can now be viewed on the website.

Edit 3: Option to search through entire archive added; search through titles, posts, authors and even comments.
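
A search across titles, posts, authors and comments can be sketched as a case-insensitive substring match over each of those fields. This is an illustration under assumptions, not how the site is actually implemented: the field names (`title`, `body`, `author`, `comments`) and the sample posts are hypothetical, and a linear scan like this would likely need a proper index to stay fast on an archive of this size.

```python
def search_posts(posts, query):
    """Return posts whose title, body, author, or any comment contains the query."""
    needle = query.casefold()
    results = []
    for post in posts:
        # Gather every searchable text field for this post (hypothetical schema).
        haystacks = [post["title"], post["body"], post["author"]]
        haystacks.extend(c["body"] for c in post.get("comments", []))
        if any(needle in text.casefold() for text in haystacks):
            results.append(post)
    return results


# Made-up sample data, for illustration only.
posts = [
    {"title": "Weekly thread", "body": "General discussion", "author": "alice",
     "comments": [{"author": "bob", "body": "Great archive"}]},
    {"title": "Archive announcement", "body": "All posts are backed up",
     "author": "carol", "comments": []},
    {"title": "Unrelated", "body": "Nothing here", "author": "dave", "comments": []},
]

for hit in search_posts(posts, "archive"):
    print(hit["title"])
```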

Edit 4: Every single post from askTRP, becomeaman and RedPillWomen (altTRP + GEOTRP) has been archived; that adds up to 60k posts since 2012. We are now at a total of 160k archived posts!


[–]Modredpillschool[M] [score hidden] stickied comment (5 children)

Thanks. So did we. https://www.forums.red/i/theredpill

We will always welcome more backup points, so thank you for helping out.

Our backup includes 3,176 posts between TRP and AskTRP, with over 519,619 individual comments.

Counting RPW and ThankTRP, our total archive is 4,573 posts.

Instant search can be found here

In case of emergency we will release the entire database as a torrent. FYI.

Forums opening soon with new mobile-friendly design.

[–]SpiderAlpha33 2 points3 points  (0 children)

Yeah BeautifulSoup is fast and reliable, and for dynamically generated websites I use Selenium with headless Firefox.
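
To show what this kind of HTML scraping involves, here is a minimal sketch that pulls post links out of a page. To stay dependency-free it uses the standard library's `html.parser` as a stand-in for BeautifulSoup (where the equivalent would be roughly `soup.select("a.title")`). The markup and the `title` class name are assumptions made up for the example, not the real page structure.

```python
from html.parser import HTMLParser


class TitleLinkParser(HTMLParser):
    """Collect (href, text) for every <a> element whose class list includes "title"."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._in_title = False  # True while inside a matching <a> element
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and "title" in classes:
            self._in_title = True
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        if self._in_title:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_title:
            self.links.append((self._href, "".join(self._text).strip()))
            self._in_title = False


# Made-up markup, for illustration only.
SAMPLE_HTML = """
<div class="thing"><a class="title" href="/r/example/comments/abc/first_post/">First post</a></div>
<div class="thing"><a class="title may-blank" href="/r/example/comments/def/second_post/">Second post</a></div>
<a class="author" href="/user/someone">someone</a>
"""

parser = TitleLinkParser()
parser.feed(SAMPLE_HTML)
for href, text in parser.links:
    print(text, "->", href)
```

This only works on HTML that is already in the response; pages that build their content with JavaScript are the case where a browser driver such as Selenium comes in.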

[–]ThePantsThief 2 points3 points  (5 children)

Why scrape it when they have an API? Fellow developer here. Curious in case the scraping solution is better somehow

[–]Modredpillschool 3 points4 points  (0 children)

The https://forums.red/i/TheRedPill archive was done via API

[–]needz 0 points1 point  (3 children)

Sometimes if you already have your favorite tool and it works, there's no need to learn another tool (or in this case, API).

[–]SilkTouchm 4 points5 points  (2 children)

Scraping is always harder than just using the api.

[–]needz 0 points1 point  (1 child)

Never say never or always.

So if I've been using BeautifulSoup since it came out, have an entire framework and dev environment dedicated to scraping websites and have little to no experience with APIs, which is gonna be harder?

[–]SilkTouchm 1 point2 points  (0 children)

which is gonna be harder?

Scraping, by far. It doesn't matter how much experience you have; you still have to do the legwork. An API is just a few pre-packaged methods, so you don't need experience with it.
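
For comparison, here is a rough sketch of the API side of this argument: when the data arrives as JSON, extracting posts is mostly dictionary access. It assumes the response follows a listing shape with a `data.children` array and an `after` cursor for the next page; the payload below is made up, and no network request is made.

```python
import json

# Made-up payload in a listing-style shape, for illustration only.
SAMPLE_LISTING = json.loads("""
{
  "kind": "Listing",
  "data": {
    "after": "t3_def456",
    "children": [
      {"kind": "t3", "data": {"id": "abc123", "title": "First post",
                              "author": "someone", "selftext": "Body text"}},
      {"kind": "t3", "data": {"id": "def456", "title": "Second post",
                              "author": "someone_else", "selftext": ""}}
    ]
  }
}
""")


def parse_listing(listing):
    """Return (posts, after) from a listing-shaped dict."""
    data = listing["data"]
    posts = [
        {key: child["data"].get(key) for key in ("id", "title", "author", "selftext")}
        for child in data["children"]
        if child.get("kind") == "t3"  # assumed kind tag for posts
    ]
    # "after" is the cursor to pass along when requesting the next page.
    return posts, data.get("after")


posts, after = parse_listing(SAMPLE_LISTING)
for post in posts:
    print(post["id"], post["title"], "by", post["author"])
print("next page cursor:", after)
```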

[–]the-dan-man 0 points1 point  (1 child)

Out of curiosity, why is PHP your favourite language, and where did you learn it?