
So GitHub Copilot… correct me if I’m wrong here: it was trained on open source (including GPLed & AGPLed code?) by *spit* Elon Musk’s OpenAI and now it regurgitates said code and this doesn’t violate the GPL/AGPL because…?

@aral

Because gigantic corporations would never do such an unethical thing of course!

Trust us, we are Microsoft.

@aral

See? Everything is just fine, it's just a question of attitude! 🤣

@aral

You are very welcome! 😂

(Ahem, M. Nadella? Yes. Hello, Sir. Feel free to send the check to the usual address. And a good day to you, Sir.)

#SarcasmButOnlyHalf

@aral : answer given on the website (well, partly):

GitHub Copilot is a code synthesizer, not a search engine: the vast majority of the code that it suggests is uniquely generated and has never been seen before. We found that about 0.1% of the time, the suggestion may contain some snippets that are verbatim from the training set. Many of these cases happen when you don’t provide sufficient context (in particular, when editing an empty file), or when there is a common, perhaps even universal, solution to the problem. We are building an origin tracker to help detect the rare instances of code that is repeated from the training set, to help you make good real-time decisions about GitHub Copilot’s suggestions.

@talone Yeah, read that. It’s quite a hand wave.

But, hey, it’s par for the course. Of course they’re looking to enclose as much value as they can from the commons. It’s what they do.

@aral Whether it does or doesn't produce derivative works in violation of copyleft is being debated. My expectation is that it does, but there is sufficient murkiness introduced by ML that maybe they'll get away with it.

@aral

Because tech is run by the cult of Ayn Rand. "The question isn't who is going to let me; it's who is going to stop me."

@aral It ate the AGPL code. It has made the AGPL code part of itself. Thereby, it now falls under the AGPL itself. Since it is now available to users, they can sue Microsoft/OpenAI for the source code and build instructions. Right? Riiiight?

@realcaseyrollins I don’t believe we can speak of “ordinarily” at this point :) (There is no case law to speak of.)

@aral what the heck is so different about this instance that we would need a new legal precedent for this?

@realcaseyrollins Because it uses machine learning. So Microsoft will argue it’s fair use (based on properties unique to machine learning) and I’d argue, for example, that it is not different to any other use and should come under the terms of the license as a derivative.

@aral Oh I didn't even know there was a question as to whether or not it was derivative. Has anyone argued otherwise? Other than #Microsoft perhaps

@realcaseyrollins No idea. They just released it yesterday and I only saw it today :)

@realcaseyrollins @aral There is now some case law that specifies machine-generated "inventions" are not patentable with the AI as inventor, and that violations would follow the "operator" of the AI. Whether the output of Copilot is a "derivative" work or a "product" of Microsoft has yet to be clearly determined. However, if it is a derivative work, Microsoft would be liable as the "operator". It sounds like a massive undisclosed risk that should have been in their SEC filings...

@realcaseyrollins @aral this seems fertile ground for class action lawyers to come out like sharks smelling blood in the water. The potential liability under existing and crazy US copyright law may well be in the trillions, though a class action would likely be settled for mere billions, and that means easily a billion for the lawyers involved alone.

@realcaseyrollins @aral of course that requires Copilot to be a derived work, but if that can be established in court, the liability is Microsoft's. For the Tenacious Developer who is a member of this class, that means Microsoft will have to pay their rent ;). But more likely, given how long and stalled a huge class action typically is, it may mean that if Microsoft does lose, they will be paying their social security instead...

@realcaseyrollins @aral surely there must be an ambitious ambulance chaser out there with some spare time who likes playing the long odds, especially given the potential payout. If so, I would be happy to participate as an initial member of such a class to help the scoundrel establish legal standing to sue. Of course German lawyers don't even need the copyright holder to participate under German law. Time to get popcorn...

@aral I argue that #Copilot doesn't violate #Copyleft licences for multiple reasons. Most importantly, #GitHub's terms of use most likely override the repos' chosen licences, allowing #Microsoft to use any code uploaded there. Also, Copilot's outputs are too small to be copyright-protectable.

Though, there are more reasons as well as other things to consider too: forgoodeyesonly.codeberg.page/

@pixelcodeapps I argue Microsoft are cunts and can go fuck themselves.

@pixelcodeapps @aral Ouch... "GitHub's terms of use likely override the repos' chosen license"

you're probably right... but... ouch

(Dear Microsoft, this is very wrong, please stop)

@pixelcodeapps @aral Hmm, if this is true, maybe we should train our own AI on Copilot's outputs :P
I wonder how that would fly in terms of copyright and/or other intellectual property rights.

@nicemicro @aral Good idea! On their website, they say they waive all copyright. 🤷😉

Aral’s Mastodon

This is my personal Mastodon.