How can I sanitize dynamically loaded scripts?

ZeroUnderscoreOu · November 15, 2015, 3:42pm

Hi everybody.
I wrote a userscript for Steam, and I try to get it approved on AMO, but currently it’s rejected because of “potentially unsanitized data”. The funny thing is that the data it loads is actually sanitized, at least if there would be any problems with it the user would get it anyway, I just break it down to smaller filterable parts.
Here are examples of JS that is in the loaded data:

new CEmoticonPopup([":bpheal:",":bphere:",":bpswim:",":tornbanner:",":csgoa:",":csgoanarchist:",":csgob:",":csgocross:",":csgoct:",":csgoglobe:",":csgogun:",":csgohelmet:",":csgoskull:",":csgostar:",":csgox:",":choke:",":fraud:",":orangelily:",":dwayneelf:",":happyelf:",":LIS_flower:",":excite:",":umbrella:",":koi:",":destiny:",":duel:",":might:",":happy_creep:",":P:",":TheD:",":Y:",":galley:",":barreloffun:",":sunspeed:",":rd_crown:",":rd_power:",":kitatus:",":kitatus_b:",":kitatus_g:",":kitatus_p:",":kitatus_r:",":kitatus_y:",":halloweener:",":Safe_House:",":photon:",":spacehelmet:",":swapperorb:",":moon:",":TheShark:",":steambored:",":steamfacepalm:",":steamhappy:",":steammocking:",":steamsad:",":steamsalty:"],$J('#emoticonbtn_564894a5e7052'),$J('#commentthread_UserReceivedNewGame_76561198045959858_1447595929_0_textarea'));

$J(function() {InitializeCommentThread("UserReceivedNewGame","UserReceivedNewGame_76561198045959858_1447595929_0",{"feature":"1447595929","feature2":0,"owner":"76561198045959858","total_count":0,"start":0,"pagesize":3,"has_upvoted":1,"upvotes":6,"votecountid":"vote_count_nowowns_76561198045959858","voteupid":"vote_up_nowowns_76561198045959858","commentcountid":"commentcount_nowowns_76561198045959858","subscribed":false,"newestfirstpagination":true},'http://steamcommunity.com/comment/UserReceivedNewGame/',28);});

// this global will contain the full set of screenshots to show
g_BlotterGalleries['gallery_402305544743065942'] = {};
g_BlotterGalleries['gallery_402305544743065942'].m_screenshotActive = '402305544743065942';
g_BlotterGalleries['gallery_402305544743065942'].shots = {};
// record another entry in the global screenshot array
g_BlotterGalleries['gallery_402305544743065942'].shots['402305544743065942'] = {"m_id":"402305544743065942","m_imageHeight":768,"m_imageWidth":1280,"m_caption":"\u0414\u043e\u043b\u0431\u0438\u043c \u043a\u043d\u043e\u043f\u043a\u0438 \u043d\u0430 \u043a\u043e\u043d\u0442\u0440\u043e\u043b\u043b\u0435\u0440\u0435, \u0447\u0442\u043e \u0435\u0441\u0442\u044c \u0441\u0438\u043b\u044b)","m_commentCount":3,"m_baseurl":"http:\/\/images.akamai.steamusercontent.com\/ugc\/402305544743065942\/8FBFE243D09BAB1F849731623BC3D761CBC22EA6\/"

(the last one is cut).
So how can it be sanitized? If I simply take parameters from loaded JS and use them with local function calls? Or should I parse all JS and check for parameters’ length & contents too? Or it can’t be sanitized to AMO standards at all? I already know another way to do it, but I still want to figure it out, particularly to find out what AMO standards are in this regard.

erosman · November 15, 2015, 4:13pm

Remote script injection is not allowed on AMO since there is no control over what the remote script will do. There is no sanitization for that.

If Addon needs to grab DATA (not script), then that can be allowed as long as the data inserted in such a way that it can NOT be executed.

Particle · November 15, 2015, 4:11pm

If it is code that is going to be executed then it will never be approved. You need to include that code along with your add-on for it to be accepted.

If it is code that only contains data that you want to convert to text or markup then you have to use the method that they linked in the review. And they will only accept if you use that exact method and no other method, they don’t care if you are using your own sanitizing method, if it isn’t the one they linked then it will be rejected.

Their position is they can see that your data is safe, for now. Their problem is if after approving the add-on you decide to change the remote data into malicious code.

EDIT: Erosman beat me to it. By the way, erosman you should correct that last part “as long as the data inserted in such a way that it can (should be can’t) be executed.”

erosman · November 15, 2015, 4:12pm

yes… sorry about the typo (can not) … and thanks @Particle for mentioning it

ZeroUnderscoreOu · November 15, 2015, 4:33pm

Even if this code is somehow parsed and executed only partially, or if I use it only as a source of parameters for local functions?

BTW erosman, as you happend to reply to this topic and at the same time you’re the one who keeps rejecting me, and AMO doesn’t have any particular discussion method for add-ons reviewing, I should mention that you should include line numbers with rejection reasons, so I won’t have to hecking guess which code exactly is ment by a pretty general description.

Particle · November 15, 2015, 4:46pm

Even if one small portion is executed then it won’t be accepted, the rule is absolute on this because, as erosman said better:

there is no control over what the remote script will do

Trust me, as a developer that went through this for quite some time (erosman was with me too helping during these difficult discussions) don’t execute any remote data at all if you want your add-on accepted on AMO.

ZeroUnderscoreOu · November 15, 2015, 4:56pm

I still don’t get what exactly you mean by “executed”.

Here’s a possible example, I take the first quoted script (new CEmoticonPopup()), then parse it, taking emotions (":bpheal:",":bphere:",":bpswim:") and element IDs from it. Then, locally in the addon, I call new CEmoticonPopup() with these parameters. So basically, I don’t execute anything external, I just pass parameters over.
Or, in a slightly another way, I access elements by those got IDs and insert got emoticons there. Again, I don’t execute, but is the data I get from this parsing considered unsafe too? I still try to understand access limitations for sanitization.

erosman · November 15, 2015, 7:04pm

I thought it was obvious, since developer should know when remote content is inserted.
For example:

Version 1.2.2

TempElem = TempElem.appendChild(new Element("Script")); // missing scripts for calendar
TempElem.src = "http://steamcommunity-a.akamaihd.net/public/javascript/calendar.js?v=.SRHlwwlZP-Ie";
TempElem.type = "Text/JavaScript";
TempElem = TempElem.parentElement.appendChild(new Element("Script"));
TempElem.src = "http://steamcommunity-a.akamaihd.net/public/javascript/group_admin_functions.js?v=18ccuSc9Dzhv&l=english";
TempElem.type = "Text/JavaScript";
TempElem = TempElem.parentElement;
ActivityContainer.insertBefore(TempElem,ActivityContainer.firstChild);

Version 1.2.4

new Ajax.Request(CalendarURL+"events?action=calendarFeed",{ // adding day length in seconds
    method: "post",
    parameters: PostData,
    onSuccess: function(Data) {
      var Response = Data.responseXML.documentElement;
      var Results = Response.getElementsByTagName("results")[0].firstChild.nodeValue;
      if (Results=="OK") { // more checks probably should be here, including for Response
        ActivityCalendarFill(Response);
      } else {
        alert("Error\r\n"+Results);
      };
    },
    
function ActivityCalendarFill(Response) {
// ....
  var CalendarID = Response.getElementsByTagName("calendarID")[0].firstChild.nodeValue;
  var HTMLDays = document.getElementById("days_"+CalendarID);
  var XMLDays = Response.getElementsByTagName("day");
  var DayCounter = 0;
  document.getElementById("monthNav_"+CalendarID).innerHTML = Response.getElementsByTagName("monthNav")[0].firstChild.nodeValue.replace(/calChangeMonth/g,"ActivityCalendarLoad");
// ....
}

The use of innerHTML is always cause for concern normally and if used with remote content then it is a security issue.

Executed: Means JavaScript
JavaScript can be passed in many forms in addition to script tags eg with on* attributes, href etc

Any remote content that has the potential to be executed (run as JavaScript) is a security problem.

Particle · November 15, 2015, 7:49pm

Besides the external libraries erosman pointed out, you are doing basically the same as I did in my add-on a couple of months ago. Using the remote data (doesn’t matter what it is, all it matters is if it is remote or not) and parsing it with innerHTML. This method runs any code inside the string, meaning that any functions present in the string will run. You will have to use a different method if all you want is to use static data and nothing else. (Text and elements with no on* attributes).

Lithopsian · November 15, 2015, 8:36pm

You will need to very closely control the fragments that are inserted into your own javascript. You say the fragments are just emoticons, but can you be sure that someone hasn’t managed to sneak a huge string of malware between two colons? Maybe you think you can, but you need to make that very clear to the addon reviewer.

Do you have a known list of allowed emoticons? Could you get one? If you can just match the parsed emoticons against a known list, then you can build your scripts entirely from known pieces bundled within your addon. Any string of javascript constructed and executed out of pieces is always going to be very closely scrutinised, but if it can be shown to be constructed entirely from hard-coded strings then you have a chance. Maybe only a small chance, though, since executing even a single hard-coded string as javascript is potential cause for rejection (eg. setTimeout(“alert(‘hello world’)”, 5000) would be rejected).

Another path to consider is to have your known “safe” javascript study the emoticons as data and decide what to do. For example, if you could simply pass the emoticon string as a parameter (or post message?) into the script and then have that passed into a text field somewhere. I think you can see what I’m getting at, and design something.

ZeroUnderscoreOu · November 15, 2015, 10:02pm

I have several places where it is, and not all of them were written by me, as a part of the script is already existing on the site and was added to satisfy only local scripts rule. Additionally, to me it’s all pretty safe and, as I already said, I don’t know AMO standards, so it’s hard to tell which part exactly is considered unsafe and which isn’t.

That’s the fun thing with this script. The data I parse and insert is the same data the user normally gets, so I’m not adding something from aside. I just split it in parts. Particularly with emoticons it is what the server returns, so if somebody can modify them they can do it for all users despite the add-on.

Lithopsian · November 15, 2015, 11:46pm

You keep saying “the data the user normally gets” without appearing to understand what it is you are doing. Additionally, the code snippets that have been posted don’t do what you describe.

You appear to be downloading a string of data from an external source over which you have no control. It could be anything. Then you are manipulating it under the assumption that it is what you expect, yielding another string over which you have very limited control. Then you slap it wholesale into a web page and somehow seem to think that is safe. You appear to perform this action with elevated privileges from an addon, although you call it a userscript, bypassing any possible sanitisation there might have been if the original string had been part of a web page. You seem to expect that replacing innerHTML on a node cannot possibly be dangerous, whereas it can in fact rewrite unlimited and arbitrary code under that location. The code in the snippets posted here isn’t close to safe. Other code you have written may or may not be any safer, it is difficult to tell. If you don’t know what is in your addon, then how can we be expected to?

In your situation, the shortest response to the question “How can I sanitise dynamically loaded scripts?” would be that you can’t. I’ve suggested writing code that uses downloaded data only as data, but it seems like your addon may be doing a whole lot more than you originally described.

erosman · November 16, 2015, 6:36am

I explained it in my first reply

ZeroUnderscoreOu · November 17, 2015, 6:33am

I do the same thing as [StartLoadingBlotter()][1] does, just modified a bit. That’s why:

If you mean 3 functions I posted, then it’s not part of my script, it’s what I need to get executed from server response. Those calls are dynamic, so I can’t simply include them in my script.

I don’t see how executing a script in page context is elevated. I would understand if I was passing it to browser context.

By calling it so I want to emphasise that it can be just pasted in the browser console and work and that it’s not a proper addon, just a wrapper. Yea, if you mean that it’s not executed in content script context then it’s so, but I differentiate those 2 terms. Though, if you can argue that such usage of terms is confusing or wrong, I’ll change that.

No, it doesn’t. I do some tricks with JS I get, which are not too professional probably, but I don’t add any functionality to it. I do it to make that JS execute how it normally would.

Your explanation wasn’t clear and didn’t answer my specifying question:

For now I found a way to solve this, I’ll be working with already present elements, only passing parameters to the external downloading function. I wouldn’t be downloading anything myself, though it would lead to duplication of work. I think that is sanitized enough.
[1]: http://steamcommunity-a.akamaihd.net/public/javascript/blotter_functions.js?v=SeUC4xU7RtdT&l=english

erosman · November 17, 2015, 7:06am

There is a way to do it … but complicated.

The only way that you can use [":bpheal:",":bphere:", ... is that you use a RegEx to grab that which creates more complications and whatever you grab can not assign them to variables directly(and use of eval() is never allowed)

Anyway, until we see the actual code, it is difficult to advise properly.