We have three datasets in Jmail now:

1. DOJ (The White House's docs that they were required by law to drop yesterday plus many court documents, videos, and other docs from many news cycles this year)

2. HOUSE_OVERSIGHT (the House Oversight Committee's releases: the giant November drop that led to the original Jmail, then some photo drops this month)

3. Yahoo emails (originally sourced by DDoSecrets, then provided to us, redacted and verified by Drop Site News)

There is so much material in HOUSE_OVERSIGHT that never appears in DOJ, and vice versa. And then the Yahoo drop reveals even more new material. It feels like three odd slices of a giant dataset that keeps getting released.

Re: people's complaints about yesterday's release having way too many redactions, I have no idea how much they over-redacted. I hear that they will release even more quite soon, though.

Why and how is the data from DDoSecrets redacted?

Do you have a page about each dataset you're sourcing, with background on each like you provide here?

The "EFTA00000468" saga has me distrusting the authenticity of most of these datasets.

Re: the DOJ emails prefixed with "EFTA", I have no idea how over-redacted they are. They definitely seem dubious though.

Re: the DDoSecrets emails though (YAHOO dataset), I have more to share.

Drop Site News agreed to give us access to the Yahoo dataset discovered by DDoSecrets, but on the condition that we help redact it. It's a completely unfiltered dataset. It's literally just .eml files for jeeprojects@yahoo.com. It includes many attached documents. There is no illegal imagery, but it has photos of Epstein's extended family (nephews, nieces, etc) and headshots of many models that Epstein's executive assistant would send to him. I was quite shocked that this thing existed.
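For anyone curious what working with raw .eml files looks like, here is a minimal sketch using only Python's standard library. It is not our actual pipeline; the function name `summarize_eml` and the fields it returns are made up for illustration.

```python
from email import policy
from email.parser import BytesParser


def summarize_eml(raw: bytes) -> dict:
    """Parse one raw .eml message and list its headers, body, and attachments.

    Hypothetical helper for illustration only, not Jmail's real tooling.
    """
    msg = BytesParser(policy=policy.default).parsebytes(raw)
    # iter_attachments() yields the non-body parts of the message.
    attachments = [part.get_filename() for part in msg.iter_attachments()]
    # Prefer the plain-text body; some messages may not have one.
    body = msg.get_body(preferencelist=("plain",))
    return {
        "from": str(msg["From"] or ""),
        "subject": str(msg["Subject"] or ""),
        "attachments": attachments,
        "body": body.get_content() if body is not None else "",
    }
```

You would call it with the bytes of a single file, e.g. `summarize_eml(open(path, "rb").read())`, and review the attachment list before anything gets published.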

We built some internal redaction tools that the Drop Site team is now using to comb through all of this. We've released 5 batches of the Yahoo mail now, with the 1k+ Amazon receipts being the most recent.

A few thoughts on how we do redaction are here: https://www.jmail.world/about.

Unlike the DOJ, we've tried to minimize the ambiguity about what was redacted.

For example: all redacted images are replaced with a Gemini-generated description of that photograph.

Another example: we are aggressively redacting email addresses and phone numbers of normal people to avoid spamming them. Perhaps others would leave it all in, but Riley and I don't want to be responsible for these people's lives getting disrupted by this entire saga. For example, we redacted this guy's email but not his name: https://www.jmail.world/thread/4accfb5f3ed84656e9762740081a4...
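To give a rough idea of what that kind of contact redaction can look like, here is a simplified sketch. It is not our internal tool: the regexes, the placeholder strings, and the `keep` parameter are assumptions for illustration, and the phone pattern only covers common US-style numbers. Names are left untouched, matching the approach described above.

```python
import re

# Illustrative patterns only; real-world data needs more cases and human review.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(
    r"(?<!\w)(?:\+?1[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}(?!\w)"
)


def redact_contacts(text: str, keep: frozenset = frozenset()) -> str:
    """Replace email addresses and phone numbers with visible placeholders.

    `keep` is a hypothetical allowlist of lowercase addresses that stay
    visible, e.g. the mailbox's own address.
    """
    def sub_email(match: re.Match) -> str:
        address = match.group(0)
        return address if address.lower() in keep else "[REDACTED EMAIL]"

    text = EMAIL_RE.sub(sub_email, text)
    return PHONE_RE.sub("[REDACTED PHONE]", text)
```

Using an explicit placeholder instead of silently deleting the text is what keeps it unambiguous that something was removed and what kind of thing it was.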

Riley and I were not expecting this type of scope when we first dropped Jmail. Jmail is an interesting side project for us, and this new dataset requires full-time attention. Thankfully we have help, though. We're happy to take on this responsibility given how helpful, thoughtful, and careful both the Drop Site and DDoSecrets teams have been here.

I appreciate the links and transparency. Answers a lot of my questions. Thanks

Ah, I was going to ask about the Yahoo emails. Are those distinct from the cloned Gmail messages, or are they in the same inbox on your site?

Has anyone written a parser for the text messages? A messages-like UI to be able to read through all the texts would be super interesting too. The format DOJ released them in is impossible to follow.

A big motivation for the whole project is to help structure the mess that was released.

Another person made an oddly beautiful ASCII UI for the text messages. All of the texts seem to be from HOUSE_OVERSIGHT. (We have those plus DOJ and YAHOO, but no dedicated text UI of our own.)

https://michelcrypt4d4mus.github.io/epstein_text_messages/

He also shouted us out last month, which was very kind of him.