I'm happy to see it. They should have included Roku in that too!

> Roughly twice per second, a Roku TV captures video “snapshots” in 4K resolution. These snapshots are scanned through a database of content and ads, which allows the exposure to be matched to what is airing. For example, if a streamer is watching an NFL football game and sees an ad for a hard seltzer, Roku’s ACR will know that the ad has appeared on the TV being watched at that time. In this way, the content on screen is automatically recognized, as the technology’s name indicates. The data then is paired with user profile data to link the account watching with the content they’re watching.

https://advertising.roku.com/learn/resources/acr-the-future-...

I wouldn't be surprised if my PS5 was doing the same thing when I'm playing a game or watching a streaming service through it.

The most likely case is that the TV computes a hash locally and sends only the hash. Judging by my dnstap logs, the Roku TV maintains a steady ~0.1/second heartbeat to `scribe.logs.roku.com` with occasional pings to `captive.roku.com`. The rest are stragglers that are blocked by the `*.roku.com` DNS blackhole. Another one is `api.rokutime.com`, but as of writing it's a CNAME to one of the `roku.com` subdomains.

The block rate seems to correlate with watch time, rising to ~1/second, so it's definitely trying to phone home with something. Too bad it can't, since all its traffic going outside the LAN is dropped with prejudice.

If your network lets you see stuff like that, look into what the PS5 is trying to do.

> Most likely ... sending the hash

If you're tracking packets, can't you tell by the data size? A 4K image is a lot more data than a hash.

I do suspect you're right, since they would want to reduce bandwidth, especially since residential upload speeds are slow. But this is pretty close to verifiable, right?
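To put rough numbers on that size difference, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration, not a measurement: uncompressed 3840x2160 frames at 3 bytes per pixel, the two snapshots per second from the Roku quote, and a 64-bit hash per snapshot.

```python
# Back-of-the-envelope upload rates for the two hypotheses.
# All constants are assumptions, not measurements of real Roku traffic.
WIDTH, HEIGHT = 3840, 2160       # "4K" frame size
BYTES_PER_PIXEL = 3              # uncompressed 8-bit RGB
SNAPSHOTS_PER_SECOND = 2         # "roughly twice per second" in the quote
HASH_BYTES = 8                   # a 64-bit perceptual hash per snapshot

raw_frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
raw_bytes_per_second = raw_frame_bytes * SNAPSHOTS_PER_SECOND
hash_bytes_per_second = HASH_BYTES * SNAPSHOTS_PER_SECOND
ratio = raw_bytes_per_second // hash_bytes_per_second

if __name__ == "__main__":
    print(f"raw frame:      {raw_frame_bytes:,} bytes")
    print(f"raw upload:     {raw_bytes_per_second:,} bytes/s")
    print(f"hash upload:    {hash_bytes_per_second:,} bytes/s")
    print(f"raw is roughly {ratio:,}x larger")
```

A real device would compress any image it uploaded, so the true gap would be smaller than this, but it should still be several orders of magnitude and visible in per-flow byte counts even when the payload is encrypted.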

Also just curious, what happens if you block those requests? I can say Samsung TVs really don't like it... but they will be fine if you take them fully offline.

> If you're tracking packets can't you tell by the data size? A 4k image is a lot more data than a hash.

I admit, I've not gotten around to properly dumping that traffic. For anyone wanting to do this, there's also a spike of DNS requests every hour on the hour, even if the TV is off (well, asleep). It would be interesting to see those too. Might be a fun NY holiday project right there. Even without decrypting the (hopefully) encrypted traffic, it should be verifiable.

> Also just curious, what happens if you block those requests?

Due to the `*.roku.com` DNS black hole, the Roku showed no ads, but things like Netflix and YouTube using standard Roku apps ("channels") worked fine. I have since moved on to playing content using an Nvidia Shield and blocking outside traffic completely. The only odd thing is that the TV occasionally keeps blinking and complains about the lack of network if I misclick and start something other than the HDMI input.

[deleted]

Hashing might not work, since the stream itself would be variable bitrate, meaning the individual pixels would differ and therefore so would the computed file hash.

They're using perceptual hashing, not cryptographic hashing of raw pixels. So it's invariant to variable bitrate, compression, etc.

How does perceptual hashing work?

Have you got any recommendations for further reading on this topic?

These are two articles I liked that are referenced in the Python ImageHash library on PyPI; the second article is a follow-up to the first.

Here are the paraphrased steps and result from the first article for hashing an image:

1. Reduce size. The fastest way to remove high frequencies and detail is to shrink the image. In this case, shrink it to 8x8 so that there are 64 total pixels.

2. Reduce color. The tiny 8x8 picture is converted to grayscale. This changes the hash from 64 pixels (64 red, 64 green, and 64 blue) to 64 total colors.

3. Average the colors. Compute the mean value of the 64 colors.

4. Compute the bits. Each bit is simply set based on whether the color value is above or below the mean.

5. Construct the hash. Set the 64 bits into a 64-bit integer. The order does not matter, just as long as you are consistent.

The resulting hash won't change if the image is scaled or the aspect ratio changes. Increasing or decreasing the brightness or contrast, or even altering the colors won't dramatically change the hash value.
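The five steps above can be sketched in plain Python. This is a minimal illustration, not the ImageHash library's code: the function names are made up, the image is a list of rows of `(r, g, b)` tuples at least 8 pixels on each side, shrinking is a simple box average, and grayscale is an unweighted channel mean (real implementations usually resample and weight channels more carefully).

```python
def to_gray(pixel):
    """Unweighted channel average (an assumption; real code often weights R, G, B)."""
    r, g, b = pixel
    return (r + g + b) / 3

def shrink_to_gray(image, size=8):
    """Steps 1 and 2: box-average the image down to size x size grayscale values."""
    height, width = len(image), len(image[0])
    small = []
    for by in range(size):
        y0, y1 = by * height // size, (by + 1) * height // size
        row = []
        for bx in range(size):
            x0, x1 = bx * width // size, (bx + 1) * width // size
            block = [to_gray(image[y][x]) for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small

def average_hash(image, size=8):
    """Steps 3 to 5: threshold each value against the mean and pack the bits."""
    values = [v for row in shrink_to_gray(image, size) for v in row]
    mean = sum(values) / len(values)
    bits = 0
    for v in values:  # row-major order, most significant bit first
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits

def split_image(width, height, dark, bright):
    """Hypothetical test image: left half dark, right half bright."""
    return [[(bright,) * 3 if x >= width // 2 else (dark,) * 3 for x in range(width)]
            for _ in range(height)]

if __name__ == "__main__":
    original = split_image(64, 64, dark=20, bright=200)
    brighter = split_image(64, 64, dark=60, bright=240)
    upscaled = split_image(128, 128, dark=20, bright=200)
    for name, img in [("original", original), ("brighter", brighter), ("upscaled", upscaled)]:
        print(name, hex(average_hash(img)))
```

On this synthetic image the brightened and upscaled variants hash to the same 64-bit value as the original, which is the invariance the paragraph above describes.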

https://www.hackerfactor.com/blog/index.php?/archives/432-Lo...

https://www.hackerfactor.com/blog/index.php?/archives/529-Ki...

In the same way that Shazam can identify songs despite the audio source being terrible over a phone, mixed with background noise. It doesn't capture the audio as a WAV and then scan its database for an exact matching WAV segment.

I'm sure it is way more complex than this, but Shazam does some kind of small windowed FFT and distills it to the dominant few frequencies. It can then find "rhythms" of these frequency patterns, all boiled down to a time stream of signature data. There is some database which can look up these fingerprints. One given fingerprint might match multiple songs, but since they have dozens of fingerprints spread across time, if most of them point to the same musical source, that is what gets ID'd.
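A toy version of that idea, heavily simplified and not Shazam's actual algorithm: a naive DFT stands in for the FFT, only the single strongest bin per window is kept, a fingerprint is a pair of consecutive peaks, and the clip is assumed to start on a window boundary. The window size, the songs, and all function names are made up for illustration.

```python
import cmath
import math
import random
from collections import Counter

WINDOW = 64  # samples per analysis window (hypothetical value)

def dominant_bin(window):
    """Index of the strongest non-DC frequency bin, via a naive DFT."""
    n = len(window)
    best_bin, best_mag = 0, -1.0
    for k in range(1, n // 2):
        acc = sum(s * cmath.exp(-2j * math.pi * k * t / n) for t, s in enumerate(window))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin

def fingerprints(samples):
    """Pairs of consecutive dominant bins, one pair per window transition."""
    peaks = [dominant_bin(samples[i:i + WINDOW])
             for i in range(0, len(samples) - WINDOW + 1, WINDOW)]
    return list(zip(peaks, peaks[1:]))

def build_database(songs):
    """Map each fingerprint to the set of titles that contain it."""
    db = {}
    for title, samples in songs.items():
        for fp in fingerprints(samples):
            db.setdefault(fp, set()).add(title)
    return db

def identify(clip, db):
    """Let every fingerprint in the clip vote; the most-voted title wins."""
    votes = Counter()
    for fp in fingerprints(clip):
        votes.update(db.get(fp, ()))
    return votes.most_common(1)[0][0] if votes else None

def tones(bins):
    """Synthetic 'song': one pure sine per window at the given DFT bins."""
    return [math.sin(2 * math.pi * k * t / WINDOW) for k in bins for t in range(WINDOW)]

def add_noise(samples, amplitude, seed=0):
    """Add uniform noise to mimic a bad recording (seeded for repeatability)."""
    rng = random.Random(seed)
    return [s + rng.uniform(-amplitude, amplitude) for s in samples]

songs = {"song_a": tones([5, 9, 5, 12, 20]), "song_b": tones([7, 3, 7, 15, 9])}
db = build_database(songs)

if __name__ == "__main__":
    noisy_clip = add_noise(songs["song_a"][WINDOW:], amplitude=0.2)
    print(identify(noisy_clip, db))
```

The noisy clip never matches the stored samples exactly, yet its fingerprints still vote for the right title, which is the point of matching on signatures rather than on raw audio.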

Possibly one of the better known (and widely used?) implementations is Microsoft's PhotoDNA, which may be a suitable starting point.

Wouldn't LSH (Locality Sensitive Hashing) make more sense here?

Perceptual hashes are a type of locality sensitive hash.
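A rough sketch of what that means in practice: similar frames give hashes that differ in only a few bits, so lookups can use Hamming distance, and a simple banding index finds near neighbours without scanning everything. The 4 x 16-bit banding scheme and all names here are assumptions for illustration, not any vendor's design. By pigeonhole, two 64-bit hashes that differ in at most 3 bits must agree exactly on at least one of the 4 bands.

```python
def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")

def bands(h, n_bands=4, band_bits=16):
    """Split a 64-bit hash into (band_index, band_value) keys."""
    mask = (1 << band_bits) - 1
    return [(i, (h >> (i * band_bits)) & mask) for i in range(n_bands)]

def build_index(named_hashes):
    """Map every band key to the names whose hash contains that band."""
    index = {}
    for name, h in named_hashes.items():
        for key in bands(h):
            index.setdefault(key, set()).add(name)
    return index

def candidates(index, query):
    """Names sharing at least one identical band with the query hash."""
    found = set()
    for key in bands(query):
        found |= index.get(key, set())
    return found

if __name__ == "__main__":
    frame_a = 0x0F0F0F0F0F0F0F0F
    frame_c = frame_a ^ 0xFFFFFFFFFFFFFFFF  # every bit flipped
    index = build_index({"frame_a": frame_a, "frame_c": frame_c})
    query = frame_a ^ 0b101                  # frame_a with two bits flipped
    print(hamming(query, frame_a), candidates(index, query))
```

Candidates from the index would still be confirmed with a full Hamming distance check against some threshold before being treated as a match.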

What system do you use to get that level of visibility?

Main data comes from unbound[1], I use vector[2] to ship and transform logs. Dnstap[3] log format IME works better than the standard logs, especially when it comes to more complex queries and replies. Undesired queries get 0.0.0.0 as a response which I track.
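For anyone unfamiliar with how a sinkhole like this decides what to answer, here is a small sketch of the matching logic only. It is not unbound's implementation or configuration syntax; the function names and the upstream callback are hypothetical. Note two simplifications: it blocks the zone apex as well as subdomains, and it only looks at the queried name, so a name that merely CNAMEs into a blocked zone would need an extra check on the CNAME target.

```python
SINKHOLE = "0.0.0.0"  # answer returned for undesired queries

def normalize(name):
    """Lowercase and drop the trailing dot of a fully qualified name."""
    return name.rstrip(".").lower()

def is_blocked(qname, blocked_zones):
    """True if qname equals a blocked zone or is any subdomain of one."""
    labels = normalize(qname).split(".")
    # Check the name itself, then each parent domain in turn.
    return any(".".join(labels[i:]) in blocked_zones for i in range(len(labels)))

def answer(qname, blocked_zones, upstream):
    """Return the sinkhole address for blocked names, else ask upstream."""
    if is_blocked(qname, blocked_zones):
        return SINKHOLE
    return upstream(qname)

if __name__ == "__main__":
    zones = {"roku.com"}
    fake_upstream = lambda name: "203.0.113.7"  # documentation-range address
    for name in ["scribe.logs.roku.com.", "captive.roku.com", "notroku.com"]:
        print(name, "->", answer(name, zones, fake_upstream))
```

Matching on whole labels rather than on a raw string suffix is what keeps `notroku.com` from being caught by a `roku.com` rule.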

Firewall is based on hand-rolled nftables rules.

[1]: https://www.nlnetlabs.nl/projects/unbound/about/ [2]: https://vector.dev [3]: https://dnstap.info/Examples/

Besides what others have said, another dead simple option is to use Nextdns: https://nextdns.io

Doesn't require running anything locally and supports various block rules and lists and allows you to enable full log retention if you want. I recommend it to non-techies as the easiest way to get something like pi-hole/dnscrypt-proxy. (but of course not being self-hosted has downsides)

edit: For Roku, DNS blocking like this only works if Roku doesn't use its own resolver. If it's like some Google devices it'll use 8.8.8.8 for DNS resolution ignoring your gateway/DHCP provided DNS server.

Seems like you could have a router or firewall MITM queries to e.g. 8.8.8.8 and potentially redirect/rewrite/respond.

I would not be surprised if Google TV devices start using DoH to 8.8.8.8 sooner rather than later.

My router owns the IP 8.8.8.8 when seen from inside my network; the Roku literally can't ask Google for DNS via DNS, HTTP, or DNS-over-TLS. It also answers DNS requests on port 53, and believes that there is no scribe.logs.roku.com, along with many other domains.

The downside is that Google seems to think I'm in a botnet, and wants us to login to see anything on YouTube.

I've explored that! Couldn't figure it out, but it certainly sounds possible. An even easier solution is just to block all DNS resolvers except your chosen one. When 8.8.8.8 doesn't work, Google devices will fall back to the DHCP-assigned resolver (usually your gateway).

I'm a noob at this, but can you do that when it is DoT or DoH? I thought the point of them is that you can't forge the DNS request. Even harder with ODoH, right? So does that really get around them?

Yeah, when it's DoH or DoT I don't think you can re-route the DNS request in flight (where the device thinks it's talking to 8.8.8.8 but it's not).

You can block access to other resolvers though which usually works.

Eventually devices might just start using hardcoded IPs...

Replace your router's DNS with something like pi-hole or a bog standard dnsmasq, turn up the logging, that's it. Ubiquiti devices I think also offer detailed DNS logging but not sure.

I believe UniFi offers aggregated DNS logs out of the box, but you could always set up more detailed ones on the gateway itself.

My suggestion would be to configure your own router using a Linux distro. It's not as difficult as it sounds, the kernel already does most of the heavy lifting. All you need to really do is enable packet forwarding and configure the firewall using iptables rules (block all in, allow all out is a reasonable default). I use Unbound as my recursive DNS resolver, together with Hagezi's blacklists to provide DNS filtering. I filter ports 53 and 853, and filter by IP known public DNS servers (Hagezi maintains a list). DHCP is provided by the isc-dhcp-server package on Debian.

That's a more or less complete home router, with plenty of spare resources to run internal or external services like a Wireguard tunnel, file server, or the Docker/Podman runtime.

That being said, I still wouldn't connect a "smart" TV to the Internet. There are better options like a Linux HTPC.

Pfsense firewall. There is a week long learning curve and it’s best to put it on dedicated hardware.

I don’t know why you quoted the addresses.

It's polite to give parsers (human or otherwise) hints that they're about to encounter text which is now intended for a different kind of parser.

I recently forgot to surround my code in ``` and Gemini refused to help with it (I think I tripped a safety guardrail, it thought I was targeting it with an injection attack). Amusingly, the two ways to work around it were to fence off my code with backticks or to just respond to:

> I can't help you with that

With

> Why not?

After which it was then willing to help with the unquoted code. Presumably it then perceived it as some kind of philosophical puzzle rather than an attack.

It's disappointing to see people here use language like "perceived" for an LLM.

As a panpsychist I have no special esteem for an LLM's perceptive powers. I also anticipate that the planet perceives us as a nuisance.

Fair question, it does look a bit jarring when not rendered. I write a lot of markdown and it's a very strong force of habit to use backticks to sort of highlight a technical term and turn it into a noun. Similar to writing endash as a double hyphen.

When I read what I write, my eyes glance through backticks and maybe come back if I need to parse the inner term in more detail.

Markdown habit.

Tell me you don't Markdown, without telling me you don't Markdown.

It's a developer thing, using backticks means the enclosed text is emphasised when rendered from Markdown.

Backticks mark fixed width inline code, not emphasis.

I know what they do, it doesn't change the fact that we use them for emphasis.

Chuckled when I saw that the reasonable, correct, informative, and perfectly polite answer… is the one in gray. Cheers.

Backticks long predate markdown.

How dare someone not be a developer!

That sounds so expensive it's hard to see it making money. You'd be processing a 2fps video stream for each customer. That's a huge amount of data.

And all that is for the chance to occasionally detect that someone's seen an ad in the background of a stream? Do any platforms even let a streamer broadcast an NFL game like the example given?

I used to work for an OTT DSP adtech company i.e. a company that bid on TV ad spots in real time. The bidding platform was handling millions of requests per second, and we were one of the smaller fish in the sea. This system is very real. Your tv is watching what you’re watching. I built the attribution pipeline, which is what this is. If you go buy a product from one of these ads, this is how they track (attribute) it. Not to be alarmist butttt you have zero privacy.

The TV thing isn't a new story; this was public. Everyone should have known about it and no one cared. (I could insert a boilerplate rant about Snowden here.)

Those datacenters are not being built so that you can talk to ChatGPT all day, they are being built to generate and optimize ads. People who were not previously very suggestible are going to be. People who are suggestible will have their agency sold off to the highest bidder.

Avoid owning a TV? Your friends will. Maybe you can not have a FB/IG/WhatsApp account, only use cash, not have a mobile phone, but Meta (or Google, or Apple) can still detect your face in the background of photos/videos and know where you shop, travel and when.

This is really interesting. Can you expand on this? What are OTT and DSP in this context?

Do you have a sense for what data is tracked and how it's used? Or if this sort of system is blind in certain cases? (eg: I hook up an N64 to the a/v ports -- will I get retro game ads on the TV?)

OTT = over the top = ads that aren't shown on cable ("linear")

DSP = demand-side platform = real-time bidding on ad space on behalf of advertisers

What data is tracked? Don't think we can see what's plugged into the TV if it's not connected to the internet but besides that... all of it... If we have your TV we know where you live. We know what you're watching (hopefully our customers' ads!). We know all the devices that connect to your home network. We know where those devices go when you leave the house. We know you were driving down this stretch of road when you saw that ad on that billboard or on the side of that truck ("out-of-home" advertising). We know if you saw that ad and then bought something ("conversion" + "attribution"). We know what apps you have downloaded. Did you know Candy Crush is spying on you, too? Did you know Grindr sells your IP address? We likely know your age and your race and how much your home cost and where you went to college and how many kids you have ("segmentation"). Privacy laws have gotten in the way a little bit, but not much - it's less "we can't get this data anymore" and more "here's the hoop(s) we now have to jump through but we still get it".

I don't want to freak anyone out. In my time in adtech I never felt like anyone was using this data for anything besides "Please buy more coca-cola..." but you never know. Privacy _can_ exist it's just insanely hard because there's so much money hell-bent on tracking you down.

So, you helped with this... Why?

> you have zero privacy

Is this data linked to me personally in some way (e.g. though an account) or is it anonymous data?

They can definitely work out who you are from your IP address (or get close enough that the advertisers don't care). Not too many people are putting a VPN on their router and using throwaway accounts for their smart TVs. This might be difficult anyhow if you log into major services such as Amazon, etc., who will know who you are.

I'm not saying this is impossible to avoid, but it ends up being a LOT of work when the alternative is just not connecting the TV to the internet and using a laptop / Apple TV / etc. instead.

Personally identifiable. Most smart TVs force a login to connect to the Internet or even use at all.

>Not to be alarmist butttt you have zero privacy.

Hence why I will never connect my TV to the internet

I understand the perils of a capitalist system but whyyy would you agree to build this

The perils of the capitalist system, man. For what it's worth, I left adtech many moons ago specifically because it is a horrifyingly depressing industry and very very not fun to talk about at parties.

I'm glad you got out, but given your vantage point what would you say to those who feel pressured to do these types of jobs? Would you say more "it isn't worth it" or "if you have to... but get out as fast as possible" or something else?

Money pays the bills. It’s probably not deeply rooted.

Forgive me, but I'd actually like to hear vrosas's response or someone else with a similar background. I appreciate you trying to answer my question and help try to make me informed, but I don't want to hear speculation, especially the rather obvious ones. That's not helping, it just adds more noise to the conversation and discourages a response by them. We all know money pays the bills, no one needs to hear that. But hey, if that's what they say, then you'll be proven right. So let's wait and find out. I really do want to understand their mentality. I hope you do too because how else do we break the cycle?

My man’s not wrong. Adtech has some seriously cool engineering problems and scale. It’s its own form of high frequency trading mixed with everyone you’d imagine from a modern day Mad Men. Plus tons and tons and tons of money.

But that's different from "money pays the bills". And I think it's also important to recognize that there can be some fun in this due to the challenge.

I started on another side of engineering and I get that. Building rockets is exciting and fun. But while you build those things it's easy to forget you're building something much more destructive.

It’s not different. I take jobs to pay bills and in the selection process pick opportunities that offer career growth and interesting problems. Money pays the bills.

I've talked to a lot of engineers building DRM technology, and most of them are just a combination of swept up in the fun of the challenge, and also deeply bought into the idea of protecting intellectual property. I would say probably 90% don't see any philosophical issues with what they're building at all. If you can convince them of that, quite a few of them would probably try to get out, but it's quite an uphill battle. I forget who said the quote and the exact words, but something along the lines of it's very difficult to disabuse somebody of a belief when their livelihood depends on believing it.

As someone who was in an industry that I later discovered was doing things I wasn't personally ethically okay with, I would advise them to do similar to me. Start looking for a new gig and just get out as soon as you can.

Unfortunately as an individual there just isn't much you can do. There will always be someone willing to do the job that you aren't willing to do. Just get out and find something you can sleep at night doing

This is incredibly close-minded of you. It’s important creative work can be compensated for. This debate was tired in 2002. You should at least understand the other side instead of treating it with the moral simplicity that’d apply equally to Nazis.

Heh, I love that you accuse me of being closed-minded on this, while simultaneously claiming that nothing relevant has happened since 2002 because you're "tired" of the debate.

I'd also recommend you actually find out about somebody's background before making such a giant assumption about how much they know about the issue.

And if you don't think DRM has been used to abuse legitimate users, you yourself have a significant amount of understanding the other side to do. A great start might be just reviewing some of the anecdotes in this comment section or any of the other many threads that continually happen on HN. It turns out that DRM technology has actually changed a bit in the last 24 years, and it's not just about "protecting" IP

If I read a full biography of you, does it change anything about my perception that you're close-minded for saying working on DRM is obviously 100% morally wrong and it's beyond you to understand why anyone would?

It makes its creator the money they can spend buying the products they see in TV ads.

If someone is going to get paid to build it anyway, I might as well be the one getting paid for it.

This attitude is the reason “someone is going to get paid”.

If you see an unattended laptop in a coffeeshop, do you steal it because “someone will steal it, so it might as well be me”?

Why stop here? We can also blame the people who implemented such features on the TVs, the people who worked at companies that used data acquired by these devices for advertising, the people who worked on the mentioned ads for such devices, and the people who bought products from companies that spend money on such marketing techniques.

At this point you might as well blame the average guy for global warming...

The average guy is exactly the person responsible for global warming. The evil of the world is just the meta accumulation of the average person following their micro incentives.

Where I'm from, it probably would not be stolen by anyone.

Where do you draw the line?

Ready to do anything for money as long as it seems legal-ish or your ass is covered by hierarchy?

If something should not be done: make it illegal. Trying to have a gentlemen's agreement not to do something seems like a futile position.

Having your own morals and ethics is far from futile. Each individual should be able to question the law and object to taking part in something they don't agree with, as long as it doesn't break the law.

Killing someone is legal in certain countries for different reasons (I'm not talking about war). Not sure I would like to get involved in that business, for instance if I don't agree on how and why people are sentenced to death in my country.

Some people are built with low ethics. Sure, if it's not made illegal, they'll always find someone to do it. Looks like in that case it might be illegal, as TV makers are sued.

Yeah, there are reasons why "someone is going to do it anyway" is a classic example of an ethically unsound argument.

It isn't ethically unsound. It's a commons/coordination problem. What is the optimal strategy in an infinite-round prisoner's dilemma with randomized opponents? The randomization effectively makes it an infinite series of one-round prisoner's dilemmas. So the best strategy is always to defect.
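For readers who haven't seen the game-theory claim spelled out, here is a small sketch of the one-round case only. The payoff numbers are the conventional textbook values, chosen here as an assumption, and the function names are made up. It shows that defecting pays more whatever the other player does, which is the narrow sense in which defection is "best" when every round is against a fresh stranger; it says nothing about reputation, repeated play against the same opponent, or the ethical question being argued in this thread.

```python
# Payoff to "me" for (my_move, opponent_move); C = cooperate, D = defect.
# Conventional textbook values, assumed for illustration.
PAYOFF = {
    ("C", "C"): 3,  # mutual cooperation
    ("C", "D"): 0,  # I cooperate, they defect
    ("D", "C"): 5,  # I defect, they cooperate
    ("D", "D"): 1,  # mutual defection
}

def best_response(opponent_move):
    """My payoff-maximizing move against a known opponent move."""
    return max("CD", key=lambda mine: PAYOFF[(mine, opponent_move)])

def expected_payoff(my_move, p_cooperate):
    """My expected payoff against a random opponent who cooperates with probability p."""
    return (p_cooperate * PAYOFF[(my_move, "C")]
            + (1 - p_cooperate) * PAYOFF[(my_move, "D")])

if __name__ == "__main__":
    for theirs in "CD":
        print(f"opponent plays {theirs}: best response is {best_response(theirs)}")
    for p in (0.0, 0.5, 1.0):
        print(p, expected_payoff("C", p), expected_payoff("D", p))
```

Under these payoffs both players defecting yields 1 each while both cooperating yields 3 each, which is why the replies treat this as a coordination problem rather than a settled answer.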

The only way you can change this is very high social trust, and all of society condemning anyone who ever defects.

If morality never factors into your own decisions, you don't get to be upset when it doesn't factor into other people's. In other words, society just sucks when everyone thinks this way, even if it is true that resolving it is hard.

This is called a “replacement excuse”. It’s a hallmark of nihilists and utilitarians, but I tend to prefer the more prosaic group noun, “jerks”.

This is an intellectually and morally deficient position to take. There is no moral principle in any system anywhere in the history of the universe that requires me to bind myself to a contract that nobody else is bound to.

We can all agree, as a society, "hey, no individual person will graze more than ten cows on the commons," and that's fine. And if we all agree and someone breaks their vow, then that is immoral. "Society just sucks when everyone thinks this way" indeed.

But if nobody ever agreed to it, and you're out there grazing all your cattle, and Ezekiel is out there grazing all his cattle, and Josiah is out there grazing all his cattle, there is no reasonable ethical principle you could propose that would prevent me from grazing all my cattle too.

> There is no moral principle in any system anywhere in the history of the universe that requires me to bind myself to a contract that nobody else is bound to.

Is there not? I don't feel this makes sense to me, as the conclusion seems to be "if everyone (or perhaps a large amount of people) do it, then it's not immoral". My immediate thought goes to moral systems that universalise an action, such that if everyone did it and it makes the world worse, then it's something that you should not do. That would be an example of a system that goes counter to what you say. Since morals are personal, you can still have that conclusion even if other people do not subscribe to the same set of moral beliefs that you have. Something can be immoral to you, and you will refuse to do it even if everyone else does.

> But if nobody ever agreed to it [...] there is no reasonable ethical principle you could propose that would prevent me from grazing all my cattle too.

Why not? I don't quite understand your conclusion. Why could the conclusion not be "I feel what everyone else is doing is wrong, and I will not do it myself"? Is it because it puts you at a disadvantage, and you believe that is unfair? Perhaps this is the "reasonable" aspect?

Your confusion is understandable. The way the terms "moral" and "ethical" are thrown around is sloppy in most vernacular. Generally, ethics refers to system-wide morality. E.g., I may feel that personal morality compels me to offer lower rates to clients, even though a higher rate may be acceptable under legal ethics. I tried to make that distinction clear in my post ("moral principle in any system") but perhaps I didn't do a good enough job.

The original poster was not referring to individual moral feelings, but to formal ethical systems subject to systematized logical thinking: "classic example of an ethically unsound argument."

There is no religious tradition, no system of ethics, no school of thought in moral philosophy, that is consistent with that position. The closest you might come is Aristotelian virtue ethics. But it would be a really strained reading that would result in the position that opting out of commons mismanagement is required. Aristotle specifically said that being a fool is not a virtue. If anything, a virtue ethics lens would compel someone to try to establish formal community rules to prevent the tragedy of the commons.

I think this argument would justify slavery: no one (white people) has decided that holding others as slaves is bad, therefore I can hold slaves.

But let me entertain it for a moment: prior to knowing, e.g., that plastics or CO2 are bad for the environment, how should one know that they are bad for the environment? Fred, the first person to realize this, would run around saying "hey guys, this is bad".

And here is where I think it gets interesting: the folks making all the $ producing the CO2 and plastics are highly motivated to say "sorry Fred, your science is wrong". So when it finally turns out that Fred was right, were the plastics/CO2 companies morally wrong in hindsight?

You are arguing that morality is entirely socially determined. This may be partially true, but IMO, only economically. If I must choose between hurting someone else and dying, I do not think there is a categorically moral choice there. (Though Mengzi/ Mencius would say that you should prefer death -- see fish and the bear's paw in 告子上). So, to the extent that your life or life-preserving business (i.e. source of food/housing) demands hurting others (producing plastics, CO2), then perhaps it is moral to do so. But to the extent that your desire for fancy cars and first class plane tickets demands producing CO2...well (ibid.).

The issue is that the people who benefit economically are highly incentivized to object to any new moral reckoning (i.e. tracking people is bad; privacy is good; selling drugs is bad; building casinos is bad). To the extent that we care about morality (and we seem to), those folks benefitting from these actions can effectively lobby against moral change with propaganda. And this is, in fact, exactly what happens politically. Politics is, after all, an attempt to produce a kind of morality. It may depend on whom you follow, but my view would be that politics should be an approach to utilitarian management of resources, in service of the people. But others might say we need to be concerned for the well-being of animals. And still others, would say that we must be concerned with the well-being of capital, or even AIs! In any case, large corporations effectively lobby against any moral reckoning against their activities and thus avoid regulation.

The problem with your "socially determined morality" (though admittedly, I increasingly struggle to see a practical way around this) is that, though it is in some ways true (since society is economics and therefore impacts one's capacity to live), you end up in a world in which everyone can exploit everyone else maximally. There is no inherent truth in what the crowd believes (though again, crowd beliefs do affect short-term and even intermediate-term economics, especially in a hyper-connected world). The fact that most white people in the 1700s believed that it was not wrong to enslave black people does not make that right. The fact that many people believed tulips were worth millions of dollars does not make it true in the long run.

Are we running up against truth vs practicality? I think so. It may be impractical to enforce morality, but that doesn't make Google moral.

Overall, your arguments are compatible with a kind of nihilism: there is no universal morality; I can adopt whatever morality is most suitable to my ends.

I make one final point: how should slavery and plastics be handled? It takes a truly unfeeling sort of human to enslave another human being. It is hard to imagine that none of these people felt that something was wrong. Though Google is not enslaving people nor are its actions tantamount to Nazism, there is plenty of recent writing about the rise of technofascism. The EAs would certainly sacrifice the "few" of today's people for the nebulous "many" of the future over which they will rule. But they have constructed a narrative in which the future's many need protection. There are moral philosophies (e.g. utilitarianism) that would justify this. And this is partially because we have insufficient knowledge of the future, and also because the technologies of today make highly variable the possible futures of tomorrow.

I propose instead that---especially in this era of extreme individual power (i.e. the capacity to be "loud" -- see below)---a different kind of morality is useful: the wielding of power is bad. As your power grows, so too does the responsibility to consider its impact on others and to more aggressively judge every action one takes under the Veil of Ignorance. Any time we affect the lives of others around us, we are at greater risk of violating this morality. See e.g., Tools for Conviviality or Silence is a Commons (https://news.ycombinator.com/item?id=44609969). Google and the tech companies are being extremely loud, and you'd have to be an idiot not to see that it's harmful. If your mental contortions allow you to say "harm is moral because the majority don't object," well, that looks like nihilism and certainly doesn't get us anywhere "good". But my "good" cannot be measured, and your good is GDP, so I suppose I will lose.

It is definitely ethically unsound and it is definitely a common example even related to Nazis. Similar to "just following orders". Which I'll remind everyone, will not save you in a court of law[0]...

You are abdicating your own moral responsibility on the assumption of a deterministic reality.

The literal textbook version of this ethical issue, one you'll find in literally any intro to ethics class, is:

> If I don't do this job then somebody else will. The only difference is that I will not get paid, and if I get paid I will do good with that money, whereas if somebody else gets paid they might not.

Sometimes a variant will be introduced with a direct acknowledgement of like donating 10% of your earnings to charity to "offset" your misgivings (ᶜᵒᵘᵍʰ ᴱᶠᶠᵉᶜᵗᶦᵛᵉ ᴬˡᵗʳᵘᶦˢᵐ ᶜᵒᵘᵍʰ).

But either way, it is you abdicating your personal responsibility and making the assumption that the job will be done regardless. But think about the logic here. If people do not think like you, then the employer must start offering higher wages in order to entice others, as there is some function describing people's individual moral lines and their desire for money. Even if the employer must pay more, you are then helping deter that behavior because you are making it harder to implement. Alternatively, the other person who does the job might not be as good at the job as you, making the damage done less than had you done the job. It's not hard to see that often this will result in the job not even existing, as truthfully these immoral jobs are scraping the bottom of the barrel. Even if you are making the assumption that the job will be done, it would be more naive to assume the job is done to the same quality. (But kudos to you for the lack of ego and thinking you aren't better than other devs.)

[0] https://en.wikipedia.org/wiki/Superior_orders

Most of those convicted at the Nuremberg trials eventually had their sentences commuted and only served a fraction of their time. Only a few were convicted and executed. Justice rarely prevails.

Objectively incorrect. There is no reasonable argument that it's ethically unsound. The fact that you immediately Godwin'd should have been your first clue.

  > There is no reasonable argument that it's ethically unsound.
I didn't claim the argument was reasonable.

  > The fact that you immediately Godwin'd
Well it is a classic example.

Considering you're a military lawyer I'm absolutely certain you've heard this example before and its connection to Nazi Germany. I'm not stating anyone is a Nazi for making that argument, but it is a classic example when pointing to how Ordinary Men can do atrocities. And no, I didn't make a grammatical mistake there.

> will not save you in a court of law

Not in the USA. LEO or ICE, or even some judges, misuse their power and are never punished. Qualified immunity.

Morality is a different story. Too many people on HN work at Google or Apple. That by itself is immoral.

  > even some
Some is a keyword.

Some doesn't change the law.

You're right to push back in case I intended something different. But I'll state this clearly: those LEO, ICE agents, and judges are committing crimes.

But the fact that not all criminals are punished or prosecuted does not change the laws either.

What I'm concerned about is people becoming disenfranchised and apathetic. Dismissing the laws we have that do punish LEO, ICE agents, and judges for breaking the laws. To take a defeatist attitude. Especially in this more difficult time where that power is being abused more than ever. But a big reason it is able to be abused is because of a growing apathetic attitude among people. By people giving up.

So I don't know about you and your positions. I don't know if you're apathetic or invested. All I know is a random comment from a random person. It isn't much to go on. But I hope you aren't and I hope you don't spread apathy, intentionally or not.

Care to articulate them?

If you want a consequentialist answer:

If, for ethical reasons, fewer people were willing to take these jobs, then either salaries would have to rise or the work would be done less effectively.

If salaries rise, the business becomes more expensive and harder to scale. If effectiveness drops, the systems are less capable of extracting/using people’s data.

Either way, refusing these jobs imposes real friction on the surveillance model.

If you want a deontological answer:

You have a responsibility not to participate in unethical behavior, even if someone else would.

The fact that it can be used to "justify" almost anything. It obviously doesn't work as a defense in the court, and neither does it work as a justification for doing legal but unethical things.

Soooo.... Why did you build it for them? You didn't have to further enable it. Despise people who just drop this kind of thing without any hint of repentance or contrition.

Would love to know what are the best things we can do to prevent this sort of tracking in general. PiHole? Don't re-use emails? On a scale of 1 to fucked are we cooked?

I don't think they mean that kinda streamer - the idea is the roku tv can tell you're watching an ad even if it's on amazon prime, apple tv, youtube, twitch, wherever, and associate the ad watching with your roku account to potentially sell that data somehow?

That way they aren't cut out of the loop by you using a different service to watch something and still have a 'cut'.

It'd make sense if they're using streamer in a different sense than I'm used to. I see that's at the bottom of the definitions Google will produce.

Yeah I think they mean "user of a streaming service" here, which would more conventionally be user or watcher or so on.

The actual screenshot isn’t sent, some hash is generated from the screenshot and compared against a library of known screenshots of ads/shows/etc for similarity.

Not super tough to pull off. I was experimenting with FAISS a while back and indexed screenshots of the entire Seinfeld series. I was able to take an input screenshot (or Seinfeld meme, etc) and pinpoint the specific episode and approx timestamp it was from.

> The actual screenshot isn’t sent, some hash is generated from the screenshot and compared against a library of known screenshots of ads/shows/etc for similarity.

this is most likely the case, although there's nothing stopping them from uploading the original 4K screengrab in cases where there's no match to something in their database which would allow them to manually ID the content and add a hash or just scrape it for whatever info they can add to your dossier.

I thought that similar inputs do not give similar hashes... but apparently that is cryptographic hashing. Locality-Sensitive Hashing methods (e.g. Perceptual hashing[1]) make similar inputs have similar hashes.

[1] https://en.wikipedia.org/wiki/Perceptual_hashing
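
To make the difference concrete, here is a minimal sketch in plain Python. It uses a simplified "average hash" (one bit per pixel, set when the pixel is brighter than the image mean) as a stand-in for a real perceptual hash; the tiny 4x4 frame and the +3 brightness shift are made-up test data, and nothing here is claimed to be what any TV vendor actually runs.

```python
import hashlib

def average_hash(pixels):
    """Return a bit string: '1' where a pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)

def hamming(a, b):
    """Count the positions where two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

# Hypothetical 4x4 grayscale frame and a slightly brightened copy of it.
frame = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [18, 28, 208, 218],
]
brighter = [[min(255, p + 3) for p in row] for row in frame]

# Perceptual-style hash: the two frames land on the same bits.
phash_distance = hamming(average_hash(frame), average_hash(brighter))

# Cryptographic hash: any change in the bytes gives an unrelated digest.
sha_a = hashlib.sha256(bytes(p for row in frame for p in row)).hexdigest()
sha_b = hashlib.sha256(bytes(p for row in brighter for p in row)).hexdigest()

print("average-hash Hamming distance:", phash_distance)
print("SHA-256 digests equal:", sha_a == sha_b)
```

Matching against a library then becomes "find the stored hash with the smallest Hamming distance" rather than "find the identical digest".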

Ah, bingo, yes!

I should have been more specific in my comment. Perceptual hashing allows for higher similarity scores between similar looking images.

Lots of cool techniques to experiment with. Highly recommend playing around if you’re interested.

I immediately did a little exploration for potential utility in neuroimaging analyses...not that anything was immediately obvious to me, but I love learning about things like this.

I assume these systems are calculating an on-device perceptual hash. So not that much data needs to get flown back to the mothership.

That's the thing about scaling; you offload the work to the "client" (the TV in this case) and make it do the work, it need not send back more than a simple identifier or string in an API call (of course they'll send more), so they get to use a little bit of your electricity and your TVs processing power to collect data on you and make money, with relatively little required from them, other than some infra to handle the requests, which they would have had anyway to collect the telemetry that makes them money.

Client side processing like this is legitimate and an excellent way to scale, it just hits a little different when it's being used for something that isn't serving you, the user.

source: backend developer

Confirming how many people actually saw the ad is worth big bucks. No one wants to pay for ads they cannot confirm, and publishers can make up impressions - if you can catch a publisher making up numbers you might get a huge discount or loads of money back.

Not necessarily, it can be done on-device, the screenshot hashed, and the results deduplicated and accumulated over time, then compressed and sent off in a neat package. It'd still be a huge amount of data when you add it all up, but not too different from the volume that e.g. web analytics produces.

Then server-side the hash is matched to a program or ad and the data accumulated and reduced even further before ending up in someone's analytics dashboard.
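
A rough sketch of that pipeline, using only the standard library. The hash strings, the JSON layout and the function names are all hypothetical; the point is just that repeated hashes collapse into counts before anything is compressed and uploaded.

```python
import json
import zlib
from collections import Counter

def build_batch(hashes):
    """Device side: collapse repeated hashes into counts, then compress."""
    counts = Counter(hashes)
    payload = json.dumps(sorted(counts.items())).encode("utf-8")
    return zlib.compress(payload)

def read_batch(blob):
    """Server side: decompress and recover the hash -> count mapping."""
    return {h: n for h, n in json.loads(zlib.decompress(blob))}

# Hypothetical samples: 30 seconds at two per second, mostly repeated frames.
samples = ["a1b2"] * 40 + ["c3d4"] * 15 + ["a1b2"] * 5

blob = build_batch(samples)
print(len(samples), "samples ->", len(blob), "bytes on the wire")
print(read_batch(blob))
```

The server would then map each hash to a known program or ad before aggregating further, which is the step described above.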

Are there video "thumbprints" like those that exist for audio (used by SoundHound etc.) - i.e. a compressed set of features that can reliably be linked to unique content? I would expect that is possible and a lot faster to look up for 2 frames a second. If this is the case, the "your device is taking a snapshot every 30 seconds" sounds a lot worse than it is (not defending it - it's still something I hope can be legislated away - something can be bad and still exaggerated by media)

There are perceptual hashing algorithms for images/video/audio (dsp and ML based) that could work for that.

Given that the TV is trying to match one digital frame against another digital frame, you could probably get decent results even with something super naive like downsampling to a very low resolution, quantizing the color palette, then looking for a pixel for pixel match.

All this could be done long before any sort of TV-specific image processing, so the only source of "noise" I can think of would be from the various encodings offered by the streaming service (e.g. different resolutions and bitrates). With the right choice of downsample resolution and color quantization I have to imagine you could get acceptable results.
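
The naive approach described above can be sketched as follows: block-average a grayscale frame down to a tiny grid, quantize each cell to a handful of levels, and look for an exact match. The grid size, the number of levels, the sample frames and the `ad-123` label are arbitrary assumptions for illustration.

```python
def fingerprint(pixels, grid=2, levels=4):
    """Block-average a grayscale frame to grid x grid cells, then quantize."""
    h, w = len(pixels), len(pixels[0])
    bh, bw = h // grid, w // grid
    step = 256 // levels  # width of one quantization bucket
    cells = []
    for gy in range(grid):
        for gx in range(grid):
            block = [pixels[y][x]
                     for y in range(gy * bh, (gy + 1) * bh)
                     for x in range(gx * bw, (gx + 1) * bw)]
            cells.append((sum(block) // len(block)) // step)
    return tuple(cells)

# Hypothetical reference frame and a copy with small encoding noise (+2).
reference = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [100, 110, 60, 70],
    [105, 115, 65, 75],
]
reencoded = [[p + 2 for p in row] for row in reference]
unrelated = [[200] * 4 for _ in range(4)]

# Index of known content, keyed by fingerprint (label is made up).
known = {fingerprint(reference): "ad-123"}

print(known.get(fingerprint(reencoded)))  # matches the reference
print(known.get(fingerprint(unrelated)))  # no match
```

One weakness of exact matching is that a cell whose average sits right on a bucket boundary can flip with tiny noise, which is why real schemes tend to compare by distance instead of equality.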

That's basically what phash does

I've been led to believe those video thumbprints exist, but I know the hash of the perceived audio is often all that is needed for a match of what is currently being presented (movie, commercial advert, music-as-music-not-background, ...).

This is why a lot of series uploaded to YouTube will be sped up, slowed down, or have their audio’s pitch changed; if the uploader doesn’t do this, it gets recognized by YouTube as infringing content.

You only need to grab a few pixels or regions of the screen to fingerprint it. They know what the stream is and can process it once centrally if needed.

Is this what these sort of companies are doing?

In a word yes. Here is a starting point.

https://arxiv.org/abs/2409.06203

Attribution is very painful and advertisers will pay lots of money to close that loop.

Is it? I don’t think you need particularly high fidelity to fingerprint ads/programs.

it's hashed on the tv then they compare hashes in aggregate

This is especially annoying and just incredibly creepy -- I was watching a clip of Smiling Friends on YouTube (via my Apple TV), and I suddenly got a banner telling me to watch this on HBO Max.

I never felt more motivated to pi-hole the TV.

>I never felt more motivated to pi-hole the TV.

Or just disconnect from the internet entirely? You already have an apple tv. Why does your tv need internet access?

TVs tend to incessantly ask for internet access, especially android ones.

Then don’t buy an Android tv?

The problem with 'well just don't buy it' is that in many product categories, enshittification has become so entrenched that there are no longer options to avoid it. The availability of product features is driven by market forces: if it's no longer profitable to sell a TV that doesn't require online connectivity for the purposes of ads, then such TVs will no longer be sold.

Alternatives like using monitors designed for digital signage come with drawbacks. Expense, they don't have desirable features like VRR, HDR or high refresh rates, since they aren't needed for those use cases. Older TV models will break and supply will dry up.

In the long term, this problem, not just TVs but the commercial exploitation of user data across virtually all electronic devices sold, isn't something that can be solved with a boycott, or by consumers buying more selectively. The practice needs to be killed with legislation.

Good point. I’ll just argue about HDR and high frame rates being desirable features :) I don’t even know what VRR is.

VRR is variable refresh rate, so if there is nothing going on in the content, they can bring the refresh rate down and save processing, thermal issues and energy. If there is a lot going on (say a game), they can ramp the refresh rate back up super high.

There are a few different "standards" around VRR, not every device supports all of them.

Meh, I wonder why I care about saving energy or processing on a tv that’s plugged in anyway but hey. Thanks for explaining!

Their explanation of the reason for VRR is bad. The primary reason people want it is gaming where the game is not locked to a specific frame rate. Without VRR, the timing of a frame being delivered isn't necessarily going to match when the display is expecting a new frame. This leads to one of two effects. Either the display is forced to hold an old frame for longer and pick up the new frame on the next refresh cycle, which creates stutter. Or the display switches which frame it's using partway through the refresh cycle, which creates a visual tear in the image.
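
The stutter case can be shown with a toy timing model. This is a deliberately simplified sketch: it assumes a fixed 60 Hz panel where a finished frame waits for the next refresh tick, a game rendering at a steady 50 fps, and an idealized VRR display that shows each frame the moment it is ready. Tearing is not modeled.

```python
import math

REFRESH_MS = 1000 / 60  # fixed 60 Hz panel: one tick every ~16.7 ms

def fixed_refresh_display_times(ready_ms):
    """Without VRR (vsync on), a frame waits for the next refresh tick."""
    return [math.ceil(t / REFRESH_MS) * REFRESH_MS for t in ready_ms]

def intervals(times):
    """Time each frame stays on screen before the next one replaces it."""
    return [round(b - a, 1) for a, b in zip(times, times[1:])]

# Hypothetical game producing a frame every 20 ms (50 fps).
ready = [0, 20, 40, 60, 80]

fixed = intervals(fixed_refresh_display_times(ready))
vrr = intervals(ready)  # idealized VRR: display as soon as the frame is ready

print("fixed 60 Hz:", fixed)  # uneven on-screen durations -> stutter
print("VRR:        ", vrr)    # even 20 ms pacing
```

Even though the game is perfectly steady, the fixed-refresh display ends up holding some frames for two ticks and others for one, which is the judder VRR is meant to remove.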

Some TVs have a dedicated mobile connection, there is a SIM card and baseband radio inside. Of course only they can use it, not you.

You mean they pay for data charges? Don't be stupid.

Data doesn't cost that much. They are buying in bulk for lower priority access. That is a very different cost from what you pay for your phone data.

Source? This sort of conspiracy started with "smart tvs will connect to open wifi networks", then evolved to "it uses amazon sidewalk", and apparently now morphed into "tvs have 5g modems". Given how poorly supported the prior claims were, that does not bode well for the 5G claim.

Isn't that one of the marketed advantages of 5G. Lot of smart IoT devices including TVs being able to connect independently.

What we are lacking is implementation but the tech and probably the intent was always there. If HDMI ethernet connectivity(HEC) had gained traction, we would have seen a fire stick, apple tv or roku providing internet to your tv without asking for explicit consent.

Sounds obvious for TV manufacturers to do this if they plan to spy on you and sell ads you can't hide. Same with locking down firmware.

Cheaper to just use the wifi access that 99% of TVs will be given.

You said 5G, not me

I agree that I misquoted you, but that's a distinction without a difference in this context. "SIM card and baseband radio inside" means 5G, 4G, 3G, whatever. I still demand that you produce proof that there are TVs with "SIM card and baseband radio inside".

I was curious so I did some research. These devices do seem to be being produced, currently mostly overseas. The inclusion of 5G support does not seem to be hidden or nefarious. They provide a SIM card slot just like your phone would. Some models are incorporating a built-in router to provide connectivity to other devices. It seems like the cellular companies are promoting these TVs too, with built in service.

My opinion is this is just a consolidation of devices. I have many friends who live off their phone data plan giving hotspot to the TV and other devices. Now being moved into a common device format, the TV. I don't think they can spy any more effectively this way. Except via the router integration that gives them way more access, but I'm sure this exists already as a wifi feature on tvs. Just technology trudging along. Perhaps they have a secret sim card or esim embedded, that might be a risk as the hardware is already there for a valid reason.

Every time the topic is TV on HN someone repeats this conspiracy or that "it'll happen soon!"...

This place is like a flat-earther gathering sometimes.

You could try getting a European TV, at least then it will ask and you can say no.

A banner from Apple or your TV trying to navigate you back to its own HBO app?

The latter. In addition to being creepy, it’s such a horrible “feature”. I can’t imagine who thought it was a good idea.

It’s far less important for ad-free content. They mainly want to connect your ad watching behaviour to an email and then have loyalty program data connected to the same email so that they can identify which ads convert vs not.

It’s still a privacy violation a lot of people would be outraged by if they knew it. Tracking what shows you are watching is a valuable data set.

I'm surprised to see how few of my non-technical friends and family actually care about privacy.

It’s right there in your TV’s settings though. Personally, I don’t trust them to obey the setting so my TV has no internet and I use an Apple TV.

In your settings under how many nested menus under which deceptively named option?

And how many options do you need to toggle to actually opt out?

For my LG TV, it is as deep as you can possibly go! And I think it took two options to fully opt out (including turning off the microphone).

So potentially completely noncompliant if used in a business. E.g. it may have HIPAA, top secret etc.

Boardroom presentation TVs in publicly traded companies would yield insider information.

Sending 4k screenshots twice a second to a server would be tremendously bandwidth hungry. My guess is that it's all done locally.
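
A back-of-the-envelope check of that claim. The assumptions are mine: uncompressed 8-bit RGB frames and one 64-bit hash per snapshot. A real upload would be compressed and so much smaller than the raw figure, but still orders of magnitude above a hash.

```python
WIDTH, HEIGHT = 3840, 2160    # 4K UHD
BYTES_PER_PIXEL = 3           # assumption: 8-bit RGB, uncompressed
SNAPSHOTS_PER_SECOND = 2      # "roughly twice per second"
HASH_BYTES = 8                # assumption: one 64-bit hash per snapshot

raw_bytes_per_second = WIDTH * HEIGHT * BYTES_PER_PIXEL * SNAPSHOTS_PER_SECOND
hash_bytes_per_second = HASH_BYTES * SNAPSHOTS_PER_SECOND

raw_mbit = raw_bytes_per_second * 8 / 1_000_000
ratio = raw_bytes_per_second // hash_bytes_per_second

print(f"raw frames: {raw_bytes_per_second:,} bytes/s (~{raw_mbit:.0f} Mbit/s)")
print(f"hashes:     {hash_bytes_per_second} bytes/s")
print(f"ratio:      {ratio:,}x")
```

Under these assumptions raw frames would need roughly 398 Mbit/s of upload, which is more than most residential connections offer, while the hashes fit in 16 bytes per second.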

There are probably compact signatures extracted from the screenshots (color profiles, OCR, etc) which are then uploaded later in bulk. You don't need the full original image to be able to reliably uniquely identify the content if you have an index of it already.

I'm wondering if there is some sort of steganographic watermark that broadcasters are including in media, to enable stuff like this. Probably would need to be robust in the presence of re-encoding, more compression, etc..

This has been long solved by YouTube for detecting CP and other non-compliant videos.

For example, check out https://github.com/akamhy/videohash

It is a violation of the VPPA to collect this for streaming services and prerecorded media. Scheduled broadcast and cable TV aren't covered.

I thought the 2013 amendment to the VPPA largely defanged it by allowing sharing with customer consent (which is probably one of the clauses in the million-word customer agreement nobody reads).

Pretty sure that’s why this lawsuit will have some legs - the deceptive way folks are opted in without really understanding what is happening.

I’m shocked to be agreeing with Ken Paxton but he’s right on this one.

Yeah that’s why Webex is still in business. TVs are a great entry point to LANs.

> HIPAA

Are health providers using PS5s in a context where information may be leaked to other providers? What kind of information would you expect to be displayed that might violate HIPAA?

Patient xray for example, blown up on big tv

As other users mentioned, these screenshots are almost certainly not being transmitted as screenshots as the bandwidth costs would be enormous. The screenshots are converted to a hash on the user’s device before being sent to a server where the hash is compared to a database of known hashes. A user’s x-ray would just appear as a hash. This might still constitute a HIPAA violation, but I doubt it.

One cannot unscramble a hash and tell what it represents

This seems like an extremely unrealistic scenario for a given ps5

Also how would other providers be privy to this view of this xray?

I’m not sure what relevance there is to other providers?

I work with a lot of small medical offices, and they do use consumer Smart TVs in some contexts. I typically limit their network access for other reasons, and displaying X-rays isn’t something I’ve personally facilitated, but it wouldn’t shock me to discover it’s being done in other clinics, and the popularity of cloud-based ePHR software has left a lot of smaller clinics with very limited internal I.T. services.

The destination isn’t relevant, if the image leaves the clinic at all without consent, that’s a HIPAA violation. Fortunately, I think it’s more likely that the images are sampled and/or hashed in a way that means the full image isn’t technically transmitted, but considering the consequences and costs of a data breach, I’d definitely be wary of it.

> I’m not sure what relevance there is to other providers?

The point of HIPAA is to prevent providers from colluding against you.

The PS5 doesn't need to, they get it all in metadata because they control the full stack — TVs do it because they have less control over sources.

The PS5 does actually record video all the time in a ring buffer. That’s how when you press the share button, it includes a video of the recent past.
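
For anyone unfamiliar with the term, a ring buffer keeps only the most recent N items and silently drops the oldest. A minimal sketch of the general idea using `collections.deque` (this is an illustration of the data structure, not Sony's implementation; the class name and capacity are made up):

```python
from collections import deque

class ReplayBuffer:
    """Keep only the most recent `capacity` frames."""

    def __init__(self, capacity):
        # deque with maxlen drops the oldest item when a new one is appended
        self._frames = deque(maxlen=capacity)

    def record(self, frame):
        self._frames.append(frame)

    def share(self):
        """Return the retained frames, oldest first (the 'recent past')."""
        return list(self._frames)

buf = ReplayBuffer(capacity=3)
for frame_number in range(5):  # frames 0..4 arrive over time
    buf.record(frame_number)

print(buf.share())  # only the last three frames survive
```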

right that's the purpose though, they don't need to ship screenshots for monitoring

Is the PS5 not jailbroken?

I'm sure somebody's done it, but mine isn't. I do make sure to pull the microphones out of the controllers at least so while they can watch everything I'm doing on my screen they can't listen to the entire house.

I'd like to weaponize all this scanning into a force for good. Instead of phoning home to Roku, send the fingerprints up to an ADID database registering every ad on the planet. Open up an API so that any video stream can detect an ad and inject Max Headroom replacement clips.

Come on hackers. We could murder the global economy with this shit.

I've been thinking about this as well - make a small device that in real time detects ads and turns off audio and video while it's playing. I'd rather see a blank screen than an ad. That way, the whole ad pyramid scheme stays intact while the conversion rates plummet.

> while the conversion rates plummet.

Isn't the segment who will set this up also likely to have a low conversion rate to begin with?

You'd need to make it so easy that it becomes fully mainstream. I suspect that's what happened to adblockers, it got a bit too "standard" for (Google's) comfort.

Same here. I've done this for podcasts (not in real time) and it works great. TV should be easier in some ways since the video stream and captions can also indicate an ad.

I used to find when listening to a good many podcasts with VLC there would be:

> ... See you after the break.

brief pause

> And we're back ...

Unfortunately, most ads are now burnt in. The 10 second advance will skip through them, but it's usually the host parroting the ad text, and it's easy to overshoot.

The only real question is whether they're doing screen-level analysis or just relying on app telemetry

They're definitely doing screen level analysis.

I work for a company that does some work on Internet advertising and one of the main issues that came up when we discussed supporting smart TV platforms was how we could protect our proprietary advertising audience data while still showing ads on these devices. Knowing what ads we show the user tells them what the user is interested in, which is valuable information for our competitors.

Unfortunately, we were not able to solve that problem, and instead just use lower fidelity user models for advertising on smart TVs. That makes smart TV ads less valuable, but allows us to keep our competitive advantage on desktop and mobile.

If I’m understanding you right, I’m confident it’s screen analysis. I have a Hisense Roku TV I exclusively use with an AppleTV. I get creepy intrusive popups telling me: “you could be watching this on other streaming providers!” all the time. So it “knows” what’s being displayed on the screen regardless of what app (or HDMI input) is being used.

I'm fairly puzzled by my own reaction to this.

I'm indifferent to YouTube having frame-by-frame nanodata about me.

But as a Roku user, this snap shotting makes me very angry.

Maybe because much of what I watch on my TV via my Roku is content I own and stream from my personal server?

For me, I despise having different abstractions get crossed.

I expect my media app, ie. YouTube, to know what I watch from the media app. YouTube knows about YouTube.

My operating system, ie. Roku, should not know about what's happening inside a given app. ie. Roku does not know about YouTube.

When they start crossing layers, that greatly upsets me.

Does this apply for external video inputs, outside of the smart TV OS?

I guess I can always just refuse the TV OS access to the wifi, assuming they're not using 4G modems.

Time for me to get Apple TV.

This is not sufficient because the TV you are showing the video on can (does/will) take the screencaps.

If you have a plugged-in device, then you can just disconnect the TV from the network.

As if it didn’t track your habits as well.

...it doesn't.

Like, Apple knows what you're watching within the Apple TV app obviously.

But it's certainly not taking screenshots every second of what it's displaying when you use other apps -- which shows and ads you're seeing. Nor does Apple sell personal data.

Other video apps do register what shows you're in the middle of, so they can appear on the top row of your home screen. But again, Apple's not selling that info.

Having each app report what is going on vs figuring it out from a screenshot locally is the same from a privacy POV.

But I do trust apple more

A lot of this stuff is actually being used to track which ads are being watched. Apps definitely aren't reporting those.

Like all data collection you can bet that the data our smart TVs and devices take from us is (or one day will be) used for a lot more than just ads.

> > Roughly twice per second, a Roku TV captures video “snapshots” in 4K resolution.

Isn't that too much data to even begin to analyze? The only winner here seems like S3.

It runs a hashing algorithm locally, I believe, rather than transmitting the entire image. pHash or something similar would work.