This is really interesting, and I think I’ve gotten to the bottom of it. In this case, the code can perform two transformations: one I’ll call the “triage” (to hide its complexity), and the other is the well-known “bleaching”.
The “triage” is a multi-step process responsible for, among other things, injecting missing section IDs. It parses the HTML, modifies it if necessary, and regenerates it, which accounts for at least some of the HTML attribute re-ordering.
When doing manual edits, I discovered that the editor starts with the content of the current revision, which has been “triaged” but not “bleached”. Interestingly, the CKEditor “source” view is itself a modification of that content (for example, it performs its own re-ordering of the HTML attributes). So when a new revision is created from a manual edit session, its content has not been “bleached”.
However, when using the PUT API, we’re starting with either the raw content of the page (e.g., a GET of https://developer.mozilla.org/en-US/docs/Tools/Tools_Toolbox?raw) or the content returned by the document API (e.g., a GET of https://developer.mozilla.org/en-US/docs/Tools/Tools_Toolbox$api). Both return exactly the same content (the document’s “html” attribute, not the “content” attribute of the document’s current revision), and that content has been both “triaged” and “bleached”. So when we PUT that content back to the document API (e.g., PUT to https://developer.mozilla.org/en-US/docs/Tools/Tools_Toolbox$api), the new revision that gets created contains “bleached” content.
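For anyone who wants to check the “both return the same content” claim themselves, here is a minimal sketch of the comparison. It assumes that appending `?raw` or `$api` to the page URL is all that’s needed, as in the examples above. The helper names (`raw_url`, `api_url`, `sources_agree`) are my own and not part of the site’s code, and the fetcher is injectable so the check can be tried without hitting the live site.

```python
import urllib.request
from typing import Callable

DOC_URL = "https://developer.mozilla.org/en-US/docs/Tools/Tools_Toolbox"


def raw_url(doc_url: str) -> str:
    """URL of the raw page content (hypothetical helper)."""
    return doc_url + "?raw"


def api_url(doc_url: str) -> str:
    """URL of the document API endpoint (hypothetical helper)."""
    return doc_url + "$api"


def fetch(url: str) -> bytes:
    """Plain GET; returns the response body as bytes."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def sources_agree(doc_url: str, fetcher: Callable[[str], bytes] = fetch) -> bool:
    """True if the raw view and the document API return identical bytes."""
    return fetcher(raw_url(doc_url)) == fetcher(api_url(doc_url))
```

Calling `sources_agree(DOC_URL)` performs the two GETs for real; passing a fake `fetcher` keeps the check offline.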
Also, it’s important to know that what the user sees on a document page (e.g., https://developer.mozilla.org/en-US/docs/Tools/Tools_Toolbox) is always derived from the document’s “html” attribute. The document can have one or more revisions, each of which has its own “content” attribute, but when one of those revisions is made the current revision, its “content” attribute is first “bleached” and then stored in the document’s “html” attribute. So the content shown to users is always “bleached”, while the content stored in a revision is usually not “bleached”, since revisions are usually created manually and not through the PUT API.
So when we compare two revisions (e.g., https://developer.mozilla.org/en-US/docs/Tools/Tools_Toolbox$compare?locale=en-US&to=1389936&from=1389656), we’re usually comparing “unbleached” content with “unbleached” content. After using the PUT API, though, we’re suddenly comparing “bleached” content with “unbleached” content, so the diff also picks up whatever the “bleaching” changed.
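To make the revision/document relationship concrete, here is a toy model of the flow as I understand it. Everything in it is an assumption for illustration: `toy_bleach` is a stand-in that just strips `on*` attributes (the real “bleaching” rules are not shown here), the “triage” step is left out entirely, and the `Revision` and `Document` classes are simplified sketches, not the site’s actual models.

```python
import difflib
import re
from dataclasses import dataclass, field
from typing import List


def toy_bleach(html: str) -> str:
    """Stand-in for the real bleaching step (assumption): drop on* attributes."""
    return re.sub(r'\s+on\w+="[^"]*"', "", html)


@dataclass
class Revision:
    content: str  # stored as submitted; not bleached on save


@dataclass
class Document:
    revisions: List[Revision] = field(default_factory=list)
    html: str = ""  # what readers see, and what ?raw and $api return

    def make_current(self, revision: Revision) -> None:
        """Add a revision and derive the document html from its bleached content."""
        self.revisions.append(revision)
        self.html = toy_bleach(revision.content)


doc = Document()

# Manual edit: the revision content is saved unbleached.
manual = Revision('<p onclick="x()">Toolbox</p>')
doc.make_current(manual)

# PUT round trip: the new revision starts from the already-bleached document html.
via_put = Revision(doc.html)
doc.make_current(via_put)

# The revision diff is non-empty even though the rendered html did not change.
diff = list(difflib.unified_diff([manual.content], [via_put.content], lineterm=""))
print("\n".join(diff))
```

In this toy, the two revisions differ while the document’s “html” is identical before and after the PUT, which is the same pattern as in the compare view.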
So I think we can safely and confidently use the PUT API. One recommendation, though: start with the content returned by the document API (e.g., a GET of https://developer.mozilla.org/en-US/docs/Tools/Tools_Toolbox$api) rather than the raw document content (e.g., a GET of https://developer.mozilla.org/en-US/docs/Tools/Tools_Toolbox?raw). The only reason is that the document API response includes the “ETag” header, which can make the PUT safer: sending that value back in an “If-Match” header avoids collisions with other editors.
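Here is a rough sketch of what that ETag/If-Match round trip could look like with Python’s standard library. The function names are hypothetical, the `text/html` content type is an assumption about what the API accepts, and authentication is deliberately left out, so treat this as an outline rather than a working client.

```python
import urllib.request
from typing import Optional, Tuple

API_URL = "https://developer.mozilla.org/en-US/docs/Tools/Tools_Toolbox$api"


def fetch_document(api_url: str) -> Tuple[bytes, Optional[str]]:
    """GET the document API; return the body and the ETag header (if any)."""
    with urllib.request.urlopen(api_url) as resp:
        return resp.read(), resp.headers.get("ETag")


def build_put_request(api_url: str, html: bytes, etag: str) -> urllib.request.Request:
    """Build (but do not send) a PUT that only applies if the ETag still matches."""
    return urllib.request.Request(
        api_url,
        data=html,
        method="PUT",
        headers={
            "Content-Type": "text/html",  # assumption: the API accepts an HTML body
            "If-Match": etag,  # ask the server to refuse the PUT if the page changed
        },
    )
```

The intended use would be `body, etag = fetch_document(API_URL)`, then editing `body`, then passing `build_put_request(API_URL, new_body, etag)` to `urllib.request.urlopen`. Under standard HTTP semantics, a server that honors “If-Match” should answer 412 Precondition Failed if someone else saved in between, instead of overwriting their change.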