> the safeguards will limit effectiveness through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT)

Holy crap that is dark. I like learning about ML for fun, and now I have to assume that their model is intentionally misinforming me to sabotage my learning? It is absolutely bananas that somebody decided that was ok behavior.

Time to support open source and local models.

I don’t see how that helps, unless you actually mean open source, rather than open weights like most people do. Without everything that goes into the model, including training data, these things are opaque.

Actual open source is hard without a big war chest that allows you to flagrantly steal the training data.

That may very well be the case. In fact, I'm nearly certain that you're right. But it doesn't change the fact that open weight models are altogether insufficient on a number of important dimensions regarding freedom and transparency. And so often (such as in the comment I replied to, I think), even technical people seem to just ignore the difference. Open weights are just weights. No amount of open-washing changes that.

The raw training data is so large that very few parties could host it for free even if there weren't copyright barriers.

But I think you could have a full open source training software pipeline that's set up to work with Wikipedia, Common Crawl, Books3, Library Genesis, Anna's Archive, and whatever other useful data sets people can name. There would just be a step where you have to provide your own copy of Library Genesis (or whatever subset of it you have managed to obtain).
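To make the "bring your own copy" step concrete, here is a rough sketch of how such a pipeline might check its inputs before training. Everything in it is hypothetical: the names `DatasetSource` and `resolve_sources`, the `user_supplied` flag, and the cache layout are made up for illustration and don't come from any existing project. It only decides which data sets are ready and which ones the user still has to provide; the actual downloading and training are left out.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class DatasetSource:
    """One entry in a hypothetical pipeline's data set manifest."""
    name: str
    # True when the pipeline cannot ship or fetch the data itself and the
    # user must point it at a copy they obtained on their own.
    user_supplied: bool
    local_path: Optional[Path] = None


def resolve_sources(sources, cache_dir):
    """Return (ready, missing): usable paths by name, and names still needed."""
    ready = {}
    missing = []
    for src in sources:
        if src.user_supplied:
            # Only accept a user-supplied data set if the path really exists.
            if src.local_path is not None and src.local_path.exists():
                ready[src.name] = src.local_path
            else:
                missing.append(src.name)
        else:
            # Assumption: a real pipeline would download freely available
            # data into this cache location; here we only compute the path.
            ready[src.name] = Path(cache_dir) / src.name
    return ready, missing
```

A run would build the list of sources from a config file, call `resolve_sources`, and refuse to start (or train on a smaller mix) if `missing` is not empty. That way the pipeline itself stays fully open even though some of the data can't be redistributed with it.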

Someone could write a cyberpunk Three Body Problem with this premise.

They kinda did (though it's more inspired by Trusting Trust than AI)

https://corecursive.com/coding-machines-with-don-and-krystal...

TLDR :-)

This comment is not entirely on point with yours; it circles around and above it looking for lift, though.

If you're not doing work that requires your code to stay in home-nation data centres, Claude for DeepSeek, DeepClaude (https://github.com/aattaran/deepclaude), is a great way to get better at using Claude-like tools for software development. It even does a pretty good job of putting together cover letters for job applications...

Using DeepClaude is much cheaper than using Claude... For hobby projects, I've found it useful. A recipe (for cooking) management app I've made took a couple of hours to put together and cost US$0.50. Claude is far more expensive.

The downsides of DeepClaude for many are:

- DeepSeek is a Chinese corporation so the Chinese Communist Party may ask for data if it wants it.

- DeepClaude isn't as fast as normal Claude, though it's still pretty fast and I think fast enough (YMMV).

- DeepClaude might not be as optimised for various code issues that Claude may be able to solve more quickly or effectively.

- The same safeguards are probably on DeepSeek, but you won't be "wasting" as much money as you might by using Claude.

Inference-focused hardware (https://www.youtube.com/watch?v=nvPqHoVSenE, AI generated speech) may in the medium term bring a large enough cost and energy reduction for LLM tools like Claude that local LLMs become more attractive.

Inference-focused hardware would make running open-weight models like DeepSeek on local machines far cheaper, and control over safeguards would return to the end user.

Hopefully this leads to a localised LLM provision market where local businesses provide varieties of these "local" LLM services. Here, local could mean anything from on-premises through to state or nationally based LLM services. Eventually, government orgs outside of the US may demand this kind of LLM use, in the same way governments legally require data to be stored within national borders for many critical government functions.

A bloke can dream I guess...

...Could affordable inference-focused hardware also cause the bottom to fall out of these stock-market-bending valuations for AI corps and their data centre obsessions?... Not to mention the societal costs caused by the AI super corps building these data centres. At the moment, they're nearly making a profit... They seem almost like speculative companies... Is that a term?