Skill can decide a fight, but as you said yourself, matchup and numbers decide too, and that makes it impossible to determine who is actually good, especially if there is a gear differential.
I will take an example from WoW. Back when the level cap was 60, there were a bunch of elite players with great gear who rocked in raids. Guilds were beating Blackwing Lair and getting decked out, while players who failed to get into these elite guilds failed to get good gear. Those players would PVP for lesser gear but would always get rocked in duels by the PVE players.
There was a lot of hot debate on which players were good, all based on these duels or battleground encounters. Top guilds were sitting pretty, bragging and being condescending to all the 'noobs'.
Finally WoW introduced Arena, and raiding guilds began to fall left and right. People formed smaller groups to battle it out on even ground. These former superstars turned out to be, for the most part, horrible PVPers, and many of them were keyboard turners lol. Most people didn't even macro or keybind correctly. It just shows how much better stats can distort the course of a battle in PVP, and how horrible world PVP is as a way to compare e-peen.
You have never PVPed competitively in an MMO until you've tried WoW arena, and you can never prove for a fact that you are a better player than another person until you've beaten them consistently on even ground.
Anybody who thinks they are good at PVP b/c they can win most of the time in world PVP is deluding themselves.
Like I mentioned before, I don't think that your stance applies universally to MMORPGs. In EVE, I spent my early noob time in a group called EVE University. We cut our teeth in PVP by taking a horde of tiny, cheap ships and chasing around better players in bigger ships. We'd get murdered, of course, but if we took down just one of the enemy, it outweighed the total loss of all our cheap ships.
Now, I think that's a totally legitimate strategy, and one of the fun aspects of EVE. I think that world PVP was quite balanced in that regard, so you didn't really have the "guy with elite gear owns everyone else" thing that you're speaking of. There was a counter to every ship, and every strategy. You just had to be prepared.
But I think the real issue is probably that we're coming at this from totally different approaches. For some people, it's about having a quantitative, definitive way of gauging which player or group is better. I'm more interested in "messier" kinds of PVP, perhaps? Somehow, I think it's less important to me to establish who is the consistently better player.