GamingChairModel

joined 3 years ago

Trade secrets necessarily have to be analyzed under the protections of contract law.

Something can only be a trade secret if the purported owner of that proprietary information protects the confidentiality of that information, including through contractual restrictions. That's why I'm talking about contracts when asking whether trade secret protections apply.

I just pulled up the ChatGPT terms of use

Who's talking about ChatGPT or OpenAI?

I just pulled up the Anthropic commercial API terms, since that's the situation covered by the original article (big corporation using Anthropic's paid API):

Use Restrictions. Customer may not and must not attempt to (a) access the Services to build a competing product or service, including to train competing AI models except as expressly approved by Anthropic; (b) reverse engineer or duplicate the Services; or (c) support any third party’s attempt at any of the conduct restricted in this sentence.

Ok, so it's a contract that purports to prohibit pretty much this kind of model weight extraction, and I'm saying that Anthropic probably considers the model weights to be trade secrets.

Are you under the impression that trade secret protection only happens when the contract says the words "trade secret"?

Or, analogously, consider customer lists. Having a contract that says "don't copy my customer lists even if I sometimes disclose a single customer at a time when we partner together on projects" is probably enough to adequately maintain trade secret protection over those customer lists, even if individual customers are sometimes disclosed under a contract.

I'm just stating what I believe the law is, not what it should be, or even claiming that what the law is today is good. I'm just saying everyone should be aware that the law is quite protective of big corporations and their proprietary secrets. I still think this qualifies as a trade secret that they've protected with their own contracts.

[–] GamingChairModel@lemmy.world -1 points 2 days ago (2 children)

Ok, do these countries also make a contract not to distill LLMs void, as well?

[–] GamingChairModel@lemmy.world 0 points 2 days ago (4 children)

Can you name a country where signing up for a paid account to an online service, and using the service and paying the invoice that comes in, doesn't form a legally binding contract between the customer and the vendor?

[–] GamingChairModel@lemmy.world -1 points 2 days ago (2 children)

Trade secrets shared under the terms of a contract that dictates how the information can be used still retain their trade secret protections.

Without a contract: intentional disclosure to the person who receives it generally destroys the trade secret status of the information, because the "owner" of the information didn't do a good job trying to protect it.

With a contract: intentional disclosure to a person under the terms of the contract makes the contract's own protections of the information relevant, and misuse of the information by the recipient can get them sued under the contract. Plus, the information itself probably retains trade secret protection so that even if that person gives the information to a third party who can't be sued under a contract they never agreed to, there are still rights to protect that trade secret as property.

I'd be shocked if any paid API use isn't under a robust, enforceable contract. The only question is whether the contract language itself effectively prohibits distillation.

[–] GamingChairModel@lemmy.world 2 points 3 days ago (11 children)

Yeah, it wouldn't be copyright. It might be trade secrets, though. And trade secrets can be made out of public data, but arranged in a way that gives competitive advantage (for example, customer lists themselves might be trade secrets, even if each entry is a publicly available set of name/contact information/job title/company).

[–] GamingChairModel@lemmy.world 10 points 3 days ago (2 children)

The actual process of creating semiconductors is basically:

  1. Etch a stencil that has the pattern you want.
  2. Place the stencil over a piece of silicon.
  3. Bombard the silicon and stencil with radiation so that the chemical properties of the silicon change exactly under that stencil.
  4. Repeat the process with multiple other stencils, so that the resulting silicon has basically shapes of wires and logic gates that can perform different functions with the electricity running through those shapes.

In recent years, step 3 has gotten extremely complicated, because it requires radiation at exactly a particular wavelength of extreme ultraviolet light, focused exactly on the silicon (and the mask/stencil above it). That wavelength allows for the smallest possible features on the silicon. So they take purified tin, melt it, and eject the molten tin in a liquid jet downward into a vacuum at exactly the right speed so that it forms into droplets of the exact size for the machine (about 50 μm). Then they blast each droplet, mid-fall, with a 1.6kW laser that heats it so much that it vaporizes and ionizes into plasma at the exact position where a system of highly polished and precisely positioned mirrors focuses the UV radiation evenly onto the silicon surface.

Oh, and the machine makes one tin droplet every 1/50,000 of a second, so in any given second it ionizes 50,000 droplets in the stream.
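As a quick sanity check on that rate, here's a back-of-the-envelope sketch. The only input is the 50,000-per-second figure above; the variable names are mine, and the per-hour number assumes the machine runs continuously, which is an assumption for illustration rather than a spec for any real machine.

```python
# Back-of-the-envelope arithmetic using the droplet rate quoted above.
# The rate is the only input; everything else is derived from it.
DROPLETS_PER_SECOND = 50_000

# Time between consecutive droplets, in microseconds.
interval_us = 1_000_000 / DROPLETS_PER_SECOND

# Droplets hit in one hour of continuous operation (assumes no downtime).
droplets_per_hour = DROPLETS_PER_SECOND * 3600

print(f"{interval_us:.0f} microseconds between droplets")
print(f"{droplets_per_hour:,} droplets per hour")
```

So the laser has to find and hit a new droplet every 20 microseconds, which works out to 180 million droplets for every hour of uptime.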

The machine costs something like $300 million, and requires full-time experts to make sure that it's working correctly.

Everything else in the fabrication facility is similarly complicated, which is why a fab represents something like $30 billion in total costs over its lifetime.

[–] GamingChairModel@lemmy.world 6 points 6 days ago (1 children)

I like to use these shortcuts as the perfect example to show that it is perfectly fine for sites to offer different, alternative, functionality based on what the platform and input method can offer:

  • Got touch? Great, you can now swipe and pinch-zoom on things.
  • Got a keyboard? Great, you can focus elements by tabbing into them.
  • Got a pointer device? Great, things can now happen on hover.
  • Using a keyboard? Great, you can use handy shortcuts.

A practical example here is a modal dialog that is getting shown: depending on which platform and input mechanism combo you are using, you can close it by flinging it away, hitting the ESC key, doing a back swipe, tapping the backdrop, or by activating the close button.

This is an interesting point about input methods and devices, but I'm still not entirely convinced that this shows much more than the idea that users should have multiple ways to accomplish the same thing. I'm less comfortable with the idea that some users with some devices simply cannot reach the same functions as some users with some other devices, even if using what they'd consider to be a full-featured, up-to-date browser.

[–] GamingChairModel@lemmy.world 3 points 6 days ago (1 children)

The blog post raises real issues and discussion, and it's fair to see this as an individual's belief (formed and shaped through experiences that predate this person working at Google, and probably predating the launch of Google Chrome to begin with).

[–] GamingChairModel@lemmy.world 6 points 6 days ago (1 children)

Hibernating twice a day with 32 GB of RAM? That seems insane to me.

I pretty much never hibernate, because I'm usually gonna have the laptop plugged in again sometime later that day. Doing it twice a day means that they know they'll be using the computer in a few hours.

ARM and x86 are instruction sets, not microarchitectures. Intel chips and AMD chips can be different from each other, too, just as different ARM processors can be different from each other.

But all modern processors improve performance by engaging in speculative execution, where they run code or calculations before they're necessary, so that the results are on hand in case they're needed, and roll the work back if it turns out they're not needed after all. The specific methods differ from vendor to vendor and chip to chip.

Exploring these things is important because sometimes speculative execution leaks data beyond the process that's entitled to view it, and there have been computer vulnerabilities exploiting this (see Spectre, Meltdown, etc.).

Oauth should become federated, just as email.

Aren't you just describing OpenID at that point? Implementation and adoption have been uneven, but the standard complements OAuth.

 

I've read some of Ed Zitron's long posts on why the AI industry is a bubble that will never be profitable (and will bring down a lot of companies and investors), and one of the recurring themes is that the AI companies are trying to capture growing market share in an industry where their marginal profits are still negative, and that any increase in revenue necessarily increases their costs of providing their services.

But some of the comments in various HackerNews threads are dismissive, saying that each new generation of models makes the cost of inference lower, so that with sufficient customer volume, the companies running the models can make enough profit on inference to make up for the staggering up-front capital expenditures it took to build out the data centers, train their models, etc.

It's all pretty confusing to me. So for those of you who are familiar with the industry, I have several questions:

  1. Is the cost of running any given pretrained model going down, for specific models? Are there hardware and software improvements that make it cheaper to run those models, despite the model itself not changing?
  2. Is the cost of performing a particular task at a particular quality level going down, through releases of newer models of similar performance (i.e., a smaller model of the current generation performing similarly to a bigger model of the previous generation, such that the cost is now cheaper)?
  3. Is the cost of running the largest flagship frontier models going down for any given task? Or does running the cutting edge show-off tasks keep increasing in cost, but where the companies argue that the improvement in performance is worth the cost increase?

I suspect that the reason the discussion around this is so muddled online is that the answers are different depending on which of the 3 questions is meant by "is running an AI model getting cheaper over time?" And the data isn't easy to synthesize because each model has different token prices and a different number of tokens per query.

But I wanted to hear from people who are knowledgeable about these topics.
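To make the token price vs. tokens-per-query point concrete, here's a rough sketch. Every number and model name in it is made up for illustration (none of it is real vendor pricing). The only thing it shows is that cost per query is the token price times the tokens used per query, so a model with a cheaper per-token price can still cost more per query.

```python
from dataclasses import dataclass


@dataclass
class ModelPricing:
    """Hypothetical pricing; none of these figures come from a real vendor."""
    name: str
    usd_per_million_tokens: float
    tokens_per_query: int  # total tokens used to complete one query

    def cost_per_query(self) -> float:
        # Cost per query = price per token * tokens per query.
        return self.usd_per_million_tokens * self.tokens_per_query / 1_000_000


# Made-up example: the newer model is half the price per token,
# but uses four times as many tokens to answer the same query.
old_model = ModelPricing("old-gen (hypothetical)", 10.0, 1_000)
new_model = ModelPricing("new-gen (hypothetical)", 5.0, 4_000)

for m in (old_model, new_model):
    print(f"{m.name}: ${m.cost_per_query():.4f} per query")
```

With these invented numbers, "price per token went down" and "cost per query went up" are both true at once, which might be part of why people talk past each other.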

 

Curious what everyone else is doing with all the files that are generated by photography as a hobby/interest/profession. What's your working setup, how do you share with others, and how are you backing things up?
