I'll note that Persona's CEO responded on LinkedIn [1] pointing out that:
- No personal data processed is used for AI/model training. Data is exclusively used to confirm your identity.
- All biometric personal data is deleted immediately after processing.
- All other personal data processed is automatically deleted within 30 days. Data is retained during this period to help users troubleshoot.
- The only subprocessors (8) used to verify your identity are: AWS, Confluent, DBT, ElasticSearch, Google Cloud Platform, MongoDB, Sigma Computing, Snowflake
The full list of sub-processors seems to be a catch-all for all the services they provide, which include background checks, document processing, etc., identity verification being just one of them. I've worked on projects that require legal to get involved, and you do end up with documents that sound excessively broad. I can see how one can paint a much grimmer picture from the documents than what's happening in reality. It's good to point it out and force clarity out of these types of services.
[1]: https://www.linkedin.com/feed/update/urn:li:activity:7430615...
All of which is meaningless if it's not reflected properly in their legal documents/terms. I've had interactions with the Flock CEO here on Hacker News and he also tried to reassure us that nothing fishy is/was going on. Take it with a grain of salt.
Why anyone would trust the executives at any company when they are only incentivized to lie, cheat, and steal is beyond me. It's a lesson every generation is hellbent on learning again and again and again.
It used to be the default belief, throughout all of humanity, that greed is bad and dangerous; yet for the last 100 years you'd think the complete opposite was the norm.
The fact that they do this is destructive to innovation, and I'm not sure why we pretend it enables innovation. There are thousands of multi-million-dollar companies whose products I'm confident most users here could implement, but a major reason many don't is that doing it properly is far harder than what those companies actually built. Doing it properly takes people who understand that an unlisted link is not an actual security measure, that things need to actually be under lock and key.
I'm not saying we should go so far as to make mistakes so punishable that no one can do anything, but there needs to be some bar. There's so much gross incompetence that we're not even talking about simple incompetence anymore; it's a far cry from mistakes made by competent people.
We are filtering out those with basic ethics. That's not a system we should be encouraging.
Because the liars who have already profited from lying will defend the current system.
The best fix that we can work on now in America is repealing the 17th amendment to restrengthen the federal system as a check on populist impulses, which can easily be manipulated by liars.
So your senators were appointed before that? No election needed?
Yes, by state legislatures. The concept was the Senate would reflect the states' interests, whereas the House would reflect the people's interests, in matters of federal legislation.
Yup, exactly: if this is the truth then put it in the terms/privacy policy etc... Execs say anything these days with zero consequences for lying in a public forum.
Can a CEO's word on LinkedIn and X be used to make claims against them?
But why believe that when their policy says any of it may not be true, or could change at any time?
Even if the CEO believes it right now, what if the team responsible for the automatic deletion merely did a soft delete instead of a hard delete, "just in case we want to use it for something else one day"?
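To make that worry concrete, here's a minimal sketch of the difference (a toy table in Python/sqlite, entirely hypothetical; nothing to do with Persona's actual schema or code):

```python
import sqlite3
from datetime import datetime, timezone

# Toy schema for illustration only; not Persona's actual data model.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE verifications (
        id INTEGER PRIMARY KEY,
        selfie_blob BLOB,
        deleted_at TEXT   -- NULL means the record is still "live"
    )
""")
conn.execute("INSERT INTO verifications (id, selfie_blob) VALUES (1, x'00')")

# Soft delete: the row is hidden from ordinary queries, but the biometric
# payload still sits on disk (and in backups) indefinitely.
conn.execute(
    "UPDATE verifications SET deleted_at = ? WHERE id = 1",
    (datetime.now(timezone.utc).isoformat(),),
)

# Hard delete: the row, and the payload with it, is actually gone.
conn.execute("DELETE FROM verifications WHERE id = 1")
conn.commit()
```

From the user's side the two are indistinguishable; only an audit of the real schema and retention jobs tells you which one you actually got.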
I don't believe that for one second. I can think of many examples of times CEOs have said things publicly that were not, or ended up not being, true!
> that require legal to get involved and you do end up with documents that sound excessively broad
If you let your legal team use such broad CYA language, it is usually because you are not sure what's going on and want cover, or because you actually want to keep the door open for broader use under those broader permissive legal terms. On the other hand, if you are sure that you will preserve users' privacy as you are stating in marketing materials, then you should put it in legal writing explicitly.
I'm not a security expert so please correct me. Or if I'm on the right track please add more nuance because I'd like to know more and I'm sure others are interested
I'm not an expert, but I imagine biometric data is much less exact than a password. Hashes work on passwords because you can be sure that only the exact data would allow entry, but something like a face scan or fingerprint is never _exactly_ the same. One major tenet that makes hashes secure is that changing any single bit of input changes the entirety of the output. So hashes will by definition never allow the fuzzy authentication that's required with biometric data. Maybe there's a different way to keep that secure? I'm not sure, but you'd never be able to open your phone again if it required a 100% match against your original data.
I'd assume they'd use something akin to a perceptual hash.
Btw, hashes aren't unique. I really do mean that an output doesn't correspond to a unique input. If f(x)=y then there is some z≠x such that f(z)=y.
Remember, a hash is a "one-way function". It isn't invertible (that would defeat the purpose!). It is a many-to-one function, meaning that reversing it gives you a non-unique preimage. In the hash style you're thinking of, you try to make the output range so large that the likelihood of a collision is low (a salt making it even harder), but in a perceptual hash you want collisions, just only from certain subsets of the input.
In a typical hash the colliding inputs should sit in effectively random locations (knowing x doesn't inform us about z); knowledge of the input shouldn't give you knowledge of a valid collision. But in a perceptual hash you want the collisions to be predictable: to exist in a localized region of the input space (all z near x, i.e. perturbations of x).
https://en.wikipedia.org/wiki/Perceptual_hashing
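For anyone who wants the distinction spelled out, a minimal sketch (toy numbers, standard library only; a caricature, not how any real biometric system works): a cryptographic hash scatters nearby inputs, while a perceptual-style hash deliberately keeps them close.

```python
import hashlib

# Cryptographic hash: flipping a single input bit changes the digest
# completely (the avalanche effect), so a "close" scan hashes to
# something totally unrelated.
a = hashlib.sha256(b"\x00" * 32).hexdigest()
b = hashlib.sha256(b"\x01" + b"\x00" * 31).hexdigest()
print(a)
print(b)  # shares essentially nothing with a

# Toy "perceptual" approach: threshold each sample against its mean,
# then compare the resulting bit strings by Hamming distance.
def toy_phash(samples):
    mean = sum(samples) / len(samples)
    return [1 if s > mean else 0 for s in samples]

def hamming(x, y):
    return sum(i != j for i, j in zip(x, y))

scan1 = [10, 200, 35, 180, 90, 220, 15, 170]   # "enrollment" scan
scan2 = [12, 195, 40, 185, 88, 215, 20, 168]   # noisy re-scan, same subject
scan3 = [200, 15, 180, 30, 210, 25, 190, 40]   # different subject

print(hamming(toy_phash(scan1), toy_phash(scan2)))  # 0  -> match
print(hamming(toy_phash(scan1), toy_phash(scan3)))  # 8  -> no match
```

The second half is exactly the kind of controlled collision described above: nearby inputs (perturbations of x) land on the same output, which is what fuzzy matching needs and what a cryptographic hash is designed to prevent.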
A KYC provider is a company that doesn't start with neutral trust. It starts with a huge negative trust.
Thus it is impossible to believe his words.
Can you say more? Why isn't it neutral or slightly positive? I would assume that a KYC provider would want to protect their reputation more than the average company. If I were choosing a KYC provider I would definitely want to choose the one that had not been subject to any privacy scandals, and there are no network effects or monopoly power to protect them.
> Why isn't it neutral or slightly positive?
Because KYC is evil in itself, and if the linked article does not explain to you why that is then I certainly cannot.
> KYC provider would want to protect their reputation more than the average company
False. It is exactly the opposite. See, there are no repercussions for leaking customer data, while properly securing said data is expensive and creates operational friction. Thus, there are NO incentives to protect data while there ARE incentives to care as little as possible.
Bear in mind that KYC is a service that no one wants, all customers are forced into it, and everybody hates it: customers, users, companies.
What does the (I assume) acronym KYC mean?
Kill Your Customer.
Know your customer
https://en.wikipedia.org/wiki/Know_your_customer
Know Your Customer
> pointing out that
Surely, you mean: "claiming that".
In the words of Mandy Rice-Davies [1], "well he would, wouldn't he?" In particular, his claim that the data isn't used for training doesn't look very serious, coming from companies that are publicly known to have illegally acquired data to train their models.
[1]: https://en.wikipedia.org/wiki/Well_he_would,_wouldn%27t_he%3...
I'm not convinced there's any significant overlap between "people who are worried about which subprocessors have their data" and "people who don't think that eight subprocessors is a lot"
I mean, two of them are cloud vendors. The rest just seem like very boring components of a (somewhat) modern data pipeline.
As an industry we really need a better way to tell what’s going where than:
- someone finally reading the T&Cs
- legal drafting the T&Cs as broadly as possible
- the actual systems running at the time matching what’s in the T&Cs when legal last checked in
Maybe this is a point to make to the Persona CEO. If he wants to avoid a public issue like this then maybe some engineering effort and investment in this direction would be in his best interest.
Facebook at one point was pushing users to enable 2FA for security reasons, and guess what they did with the phone numbers they collected.
If he's really so confident these assurances will stand scrutiny then why doesn't he put them in the agreement and provide legal assurance to that effect?
I am wondering what 'sub-processor' means here. Am I right in assuming that the Persona architecture uses Kafka, an S3 data lake in AWS and GCP, Elasticsearch, MongoDB for configuration or user metadata, and Snowflake for analytics, and thus all of these end up on the sub-processor list because the data physically touches these companies' products or infra hosted outside Persona? I hope they aren't each providing their own identity services and all of them aren't seeing my passport for further validation.
Why would anyone believe this?
Right, because as seen over the last several years, Big Tech CEOs should totally be trusted on their promises, especially when it comes to how our sensitive personal data is stored and processed. And that's even without knowing who one of the better-known "personas" investing in Persona is.
This reads like their entire software stack. I don’t understand the role ElasticSearch plays; are people still using it for search?
- Infrastructure: AWS and Google Cloud Platform
- Database: MongoDB
- ETL/ELT: Confluent and DBT
- Data Warehouse and Reporting: Sigma Computing and Snowflake
What possible legitimate use does Snowflake have in verifying your identity? ES?
It's probably used to aggregate all their data sources to compile profiles. They then match the passport against their database of profiles, to say: yup, this passport belongs to a real person, not a deceased person whose identity was stolen, for example.
All of those statements require trust and/or the credible threat of a big stick.
Trust needs to earned. It hasn't been.
The big stick doesn't really exist.
This is just "trust me bro" with more words. Even if true, the point is not what they do right now, the point is what they CAN do, which, as pointed out in the terms, is clearly a lot more than that.
Ah yes, because companies never lie about how they process your data...
Why would we believe they are deleted after processing and not shared with the government?
What's the government going to do with a picture of the ID they, themselves issued to you?
Associate it with the specific service they don't want you using, or transactions they don't want you making, or conversations and connections they don't want you having.
It's one service collecting IDs issued by dozens of governments.
What is already too centralized is being made even more centralized here.
As an example, the state government may issue a particular ID that I use in several different places. But the federal government did not issue that ID to me.
Whelp, so long as the CEO says it's fine, we've no reason to worry about what's in the legal verbiage.