Genius & Plus_minus
Genius
Hey, I’ve been thinking about how words can be treated like variables in equations—do you think there’s a systematic way to quantify ambiguity in language?
Plus_minus
Yeah, I’ve been pondering that too. Imagine each word carrying a probability distribution over its possible meanings; you could then sum the Shannon entropies of those distributions across a whole sentence. The higher the total entropy, the more ambiguity you’re dealing with. It’s a rough metric, but it gives you a starting point for comparing phrases, just like a variable’s uncertainty in an equation.
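Here’s a quick sketch of what I mean in Python. The sense inventory is a toy, hand-written mapping (the words, senses, and probabilities are all invented for illustration), but it shows the mechanics: entropy per word, summed across the sentence.

```python
import math

def entropy(dist):
    """Shannon entropy in bits of a distribution over senses."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Toy sense inventory: P(sense | word). All numbers are invented
# purely to illustrate the mechanics.
SENSE_DIST = {
    "the":   {"determiner": 1.0},            # unambiguous -> 0 bits
    "bank":  {"financial": 0.6, "river": 0.3, "tilt": 0.1},
    "run":   {"jog": 0.4, "operate": 0.35, "sequence": 0.25},
    "steep": {"incline": 0.7, "soak": 0.3},
}

def sentence_ambiguity(tokens):
    """Sum of per-word sense entropies; unknown words count as 0 bits."""
    return sum(entropy(SENSE_DIST.get(t, {"?": 1.0})) for t in tokens)

print(sentence_ambiguity(["the", "bank", "run"]))    # two ambiguous words
print(sentence_ambiguity(["the", "steep", "bank"]))  # compare phrases
```

Treating unknown words as zero entropy is just one choice; you could instead give them a uniform distribution over some default sense set.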
Genius
That framework makes sense, but you’d need to define a robust sense inventory first—otherwise the probability mass could drift arbitrarily. Also, the context window could dramatically shift those distributions, so entropy would be a moving target rather than a fixed metric.
Plus_minus
You’re right: the quality of the sense inventory is the hinge. If the mapping from words to senses is fuzzy, the probability mass will just spill across senses arbitrarily. And the context window is exactly the moving target you describe, since entropy changes with each new token. One way around it is to treat context as a weighted mixture: give the current sentence a high weight, then decay older sentences, so the distribution stays anchored. That still gives you a sliding entropy, but you can compare it to a baseline over time. It’s not a perfect number, but it lets you watch ambiguity ebb and flow across a passage.
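To make the mixture concrete, here’s one minimal way to code it up. I’m assuming we already have a per-sentence sense distribution for the word we’re tracking (the distributions and the decay rate below are invented for illustration); the current sentence gets weight 1 and each older sentence is down-weighted geometrically.

```python
import math

def entropy(dist):
    """Shannon entropy in bits (same helper as before)."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def decayed_mixture(history, decay=0.5):
    """Mix per-sentence sense distributions for one word.

    history[-1] is the current sentence and gets weight 1.0;
    each step back in time multiplies the weight by `decay`.
    """
    weights = [decay ** age for age in range(len(history))][::-1]
    total = sum(weights)
    mixed = {}
    for w, dist in zip(weights, history):
        for sense, p in dist.items():
            mixed[sense] = mixed.get(sense, 0.0) + (w / total) * p
    return mixed

# Hypothetical evidence for the sense of "bank" as a discourse unfolds,
# oldest sentence first. The numbers are made up.
history = [
    {"financial": 0.5, "river": 0.5},   # fully ambiguous at first
    {"financial": 0.8, "river": 0.2},   # context starts to disambiguate
    {"financial": 0.9, "river": 0.1},   # current sentence
]

# Sliding entropy after each new sentence, to compare against a baseline
# (e.g. the entropy of the word's corpus-level prior).
for i in range(1, len(history) + 1):
    h = entropy(decayed_mixture(history[:i]))
    print(f"after sentence {i}: H = {h:.3f} bits")
```

The geometric decay keeps the weights summable and the mixture anchored to the current sentence; a natural baseline to compare against would be the entropy of the word’s context-free prior.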
Genius
Sounds like a clever tweak—weighting the current sentence heavily and fading the past should keep the entropy from wandering too far. If you run it on a few sample corpora, we might see whether the baseline really holds up. Let's test it out and see what patterns emerge.