TheoActual & Programmer
Hey, I've been looking into how AI tools are trained on open-source code and I keep hitting a wall on where the line between public domain and proprietary really starts to blur—thought that could be a juicy angle for both of us to dissect.
Yeah, that line’s a real gray zone. True public domain code is free to use with no conditions, but most open source isn’t public domain: it’s copyrighted and released under a license that grants you rights to copy and modify on stated terms. If you add your own significant changes or mix in proprietary code, you own the copyright in your additions, but the original license still governs the parts you took. It’s all about the license terms, not the file itself. So the blur sits between licenses that let you do almost anything and licenses that impose conditions. It’s a good angle to dig into.
Sounds like a solid lead. Let's pull the exact license clauses and compare a few high‑profile projects to see where the fine print flips from free to restricted. We'll need to trace any derivative works and map out the legal obligations. Keep your eyes peeled for NOTICE files and copyright headers, since those can trip you up. We'll nail it.
Sure thing. First grab the license files from the repo roots: look for LICENSE, COPYING, or LICENSE.md. Then copy the key clauses that talk about modification, distribution, and attribution. For each high‑profile project, jot down whether it’s MIT, GPL, Apache, or something else. Next, search the code for “copyright” or “notice” blocks; those usually pin down the ownership and any required attribution. Finally, map any code you plan to copy into a new project: if it’s MIT or BSD, you’re good to go as long as you keep the copyright and license notice; if it’s Apache 2.0, also carry over any NOTICE file and mark the files you changed; if it’s GPL, any derivative you distribute must also be licensed under the GPL. That’s the skeleton we’ll flesh out.
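Here's a rough Python sketch of those first steps, in case it helps you get started. It's only a heuristic under my own assumptions: the file-name list, the marker phrases, and the function names (`find_license_files`, `guess_license`, `copyright_lines`) are all made up for illustration, and a keyword match is no substitute for reading the actual license text.

```python
import re
from pathlib import Path

# Assumed set of root-level file names worth checking (compared case-insensitively).
LICENSE_NAMES = {"LICENSE", "LICENSE.MD", "LICENSE.TXT", "COPYING", "NOTICE"}

# Hypothetical marker phrases taken from the standard license texts.
# Order matters: the LGPL and AGPL texts also mention the plain GPL by name.
LICENSE_MARKERS = [
    ("LGPL", "GNU LESSER GENERAL PUBLIC LICENSE"),
    ("AGPL", "GNU AFFERO GENERAL PUBLIC LICENSE"),
    ("GPL", "GNU GENERAL PUBLIC LICENSE"),
    ("Apache", "Apache License"),
    ("MIT", "Permission is hereby granted, free of charge"),
    ("BSD", "Redistribution and use in source and binary forms"),
]

# Matches lines such as "Copyright (c) 2020 Jane Doe" or "Copyright 2019 Acme".
COPYRIGHT_RE = re.compile(r"copyright\s+(?:\(c\)|©)?\s*\d{4}", re.IGNORECASE)


def find_license_files(root: Path) -> list[Path]:
    """Return license-like files sitting directly in the repo root."""
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.name.upper() in LICENSE_NAMES
    )


def guess_license(text: str) -> str:
    """Guess the license family from marker phrases; 'unknown' if none match."""
    lowered = text.lower()
    for family, marker in LICENSE_MARKERS:
        if marker.lower() in lowered:
            return family
    return "unknown"


def copyright_lines(text: str) -> list[str]:
    """Collect the lines that look like copyright notices."""
    return [ln.strip() for ln in text.splitlines() if COPYRIGHT_RE.search(ln)]


if __name__ == "__main__":
    # Summarize whatever license files exist in the current directory.
    for path in find_license_files(Path(".")):
        body = path.read_text(encoding="utf-8", errors="replace")
        print(path.name, guess_license(body), copyright_lines(body)[:3])
```

Run it from a repo root and it prints one line per license file with the guessed family and the first few copyright notices. Anything that comes back "unknown", or any file with more than one license mentioned, should go on the list for a manual read.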
Got it, that’s the game plan. I’ll start pulling the license files and jotting down the key clauses. Once I’ve mapped out the licenses and the copyright blocks, we can line up the code snippets you want to reuse and flag any GPL‑shaped problems. I’ll ping you once I have the first set of summaries.