/*
 * Copyright 2019 WebAssembly Community Group participants
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

//
// Asyncify: async/await style code transformation, that allows pausing
// and resuming. This lets a language support synchronous operations in
// an async manner, for example, you can do a blocking wait, and that will
// be turned into code that unwinds the stack at the "blocking" operation,
// then is able to rewind it back up when the actual async operation
// completes, so the code appears to have been running synchronously
// all the while. Use cases for this include coroutines, python generators,
// etc.
//
// The approach here is a third-generation design after Emscripten's original
// Asyncify and then Emterpreter-Async approaches:
//
//  * Old Asyncify rewrote control flow in LLVM IR. A problem is that this
//    needs to save all SSA registers as part of the local state, which can be
//    very costly. A further increase can happen because of phis that are
//    added because of control flow transformations. As a result we saw
//    pathological cases where the code size increase was unacceptable.
//  * Emterpreter-Async avoids any risk of code size increase by running it
//    in a little bytecode VM, which also makes pausing/resuming trivial.
//    Speed is the main downside here; however, the approach did prove that by
//    *selectively* emterpretifying only certain code, many projects can run
//    at full speed - often just the "main loop" must be emterpreted, while
//    high-speed code that it calls, and in which there can be no async
//    operation, remains at full speed.
//
// New Asyncify's design learns from both of those:
//
//  * The code rewrite is done in Binaryen, that is, at the wasm level. At
//    this level we will only need to save wasm locals, which is a much smaller
//    number than LLVM's SSA registers.
//  * We aim for low but non-zero overhead. Low overhead is very important
//    for obvious reasons, while Emterpreter-Async proved it is tolerable to
//    have *some* overhead, if the transform can be applied selectively.
//
// The specific transform implemented here is nicknamed "Bysyncify" (as it is
// in BinarYen, and "B" comes after "A"). It is simpler than old Asyncify but
// has low overhead when properly optimized. Old Asyncify worked at the CFG
// level and added branches there; new Asyncify on the other hand works on the
// structured control flow of wasm and simply "skips over" code when rewinding
// the stack, and jumps out when unwinding. The transformed code looks
// conceptually like this:
//
//   void foo(int x) {
//     // new prelude
//     if (rewinding) {
//       loadLocals();
//     }
//     // main body starts here
//     if (!rewinding) {
//       // some code we must skip while rewinding
//       x = x + 1;
//       x = x / 2;
//     }
//     // If rewinding, do this call only if it is the right one.
//     if (!rewinding or nextCall == 0) {
//       bar(x);
//       if (unwinding) {
//         noteUnWound(0);
//         saveLocals();
//         return;
//       }
//     }
//     if (!rewinding) {
//       // more code we must skip while rewinding
//       while (x & 7) {
//         x = x + 1;
//       }
//     }
//     [..]
//
// The general idea is that while rewinding we just "keep going forward",
// skipping code we should not run. That is, instead of jumping directly
// to the right place, we have ifs that skip along instead. The ifs for
// rewinding and unwinding should be well-predicted, so the overhead should
// not be very high. However, we do need to reach the right location via
// such skipping, which means that in a very large function the rewind
// takes some extra time. On the other hand, though, code that cannot
// unwind or rewind (like that loop near the end) can run at full speed.
// Overall, this should allow good performance with small overhead that is
// mostly noticed at rewind time.
//
// After this pass is run a new i32 global "__asyncify_state" is added, which
// has the following values:
//
//   0: normal execution
//   1: unwinding the stack
//   2: rewinding the stack
//
// There is also "__asyncify_data" which when rewinding and unwinding
// contains a pointer to a data structure with the info needed to rewind
// and unwind:
//
//   {                                       // offsets
//     i32 - current asyncify stack location //  0
//     i32 - asyncify stack end              //  4
//   }
//
// Or for wasm64:
//
//   {                                       // offsets
//     i64 - current asyncify stack location //  0
//     i64 - asyncify stack end              //  8
//   }
//
// The asyncify stack is a representation of the call frame, as a list of
// indexes of calls. In the example above, we saw index "0" for calling "bar"
// from "foo". When unwinding, the indexes are added to the stack; when
// rewinding, they are popped off; the current asyncify stack location is
// updated while doing both operations. The asyncify stack is also used to
// save locals. Note that the stack end location is provided, which is for
// error detection.
//
// Note: all pointers are assumed to be 4-byte aligned.
//
// When you start an unwinding operation, you must set the initial fields
// of the data structure, that is, set the current stack location to the
// proper place, and the end to the proper end based on how much memory
// you reserved. Note that asyncify will grow the stack up.
//
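// For example, C code inside the wasm module that reserves some memory and
// initializes those fields might look conceptually like this (the names and
// the 1024-byte stack size here are illustrative, not part of the API):
//
//   struct asyncify_data {
//     void* stack_ptr; // current asyncify stack location, offset 0
//     void* stack_end; // asyncify stack end, offset 4 (wasm32)
//   };
//
//   static char buffer[sizeof(struct asyncify_data) + 1024];
//   struct asyncify_data* data = (struct asyncify_data*)buffer;
//   data->stack_ptr = buffer + sizeof(struct asyncify_data);
//   data->stack_end = buffer + sizeof(buffer);
//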
// The pass will also create five functions that let you control unwinding
// and rewinding:
//
//  * asyncify_start_unwind(data : iPTR): call this to start unwinding the
//    stack from the current location. "data" must point to a data
//    structure as described above (with fields containing valid data).
//
//  * asyncify_stop_unwind(): call this to note that unwinding has
//    concluded. If no other code will run before you start to rewind,
//    this is not strictly necessary, however, if you swap between
//    coroutines, or even just want to run some normal code during a
//    "sleep", then you must call this at the proper time. Otherwise,
//    the code will think it is still unwinding when it should not be,
//    which means it will keep unwinding in a meaningless way.
//
//  * asyncify_start_rewind(data : iPTR): call this to start rewinding the
//    stack back up to the location stored in the provided data. This prepares
//    for the rewind; to start it, you must call the first function in the
//    call stack to be unwound.
//
//  * asyncify_stop_rewind(): call this to note that rewinding has
//    concluded, and normal execution can resume.
//
//  * asyncify_get_state(): call this to get the current value of the
//    internal "__asyncify_state" variable as described above.
//    It can be used to distinguish between unwinding/rewinding and normal
//    calls, so that you know when to start an asynchronous operation and
//    when to propagate results back.
//
// These five functions are exported so that you can call them from the
// outside. If you want to manage things from inside the wasm, then you
// couldn't have called them before they were created by this pass. To work
// around that, you can create imports to asyncify.start_unwind,
// asyncify.stop_unwind, asyncify.start_rewind, and asyncify.stop_rewind;
// if those exist when this pass runs then it will turn those into direct
// calls to the functions that it creates. Note that when doing everything
// in wasm like this, Asyncify must not instrument your "runtime" support
// code, that is, the code that initiates unwinds and rewinds and stops them.
// If it did, the unwind would not stop until you left the wasm module
// entirely, etc. Therefore we do not instrument a function if it has
// a call to those four asyncify_* imports. Note that you may need to disable
// inlining if that would cause code that does need to be instrumented
// to show up in that runtime code.
//
// To use this API, call asyncify_start_unwind when you want to. The call
// stack will then be unwound, and so execution will resume in the JS or
// other host environment on the outside that called into wasm. When you
// return there after unwinding, call asyncify_stop_unwind. Then when
// you want to rewind, call asyncify_start_rewind, and then call the same
// initial function you called before, so that rewinding can begin. The
// rewinding will reach the same function from which you started, since
// we are recreating the call stack. At that point you should call
// asyncify_stop_rewind and then execution can resume normally.
//
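// As a sketch, a "sleep" built with this API could follow this sequence
// (pseudocode; all names other than the asyncify_* ones are made up):
//
//   // In the import that performs the async operation:
//   if (asyncify_get_state() == rewinding) {
//     asyncify_stop_rewind(); // we have rewound back here; resume normally
//     return result;
//   }
//   asyncify_start_unwind(DATA_ADDR);
//   scheduleAsyncWork();
//
//   // In the host, after the initial export (say, main()) returns:
//   asyncify_stop_unwind();
//   // ... later, when the async work completes:
//   asyncify_start_rewind(DATA_ADDR);
//   main(); // call the same export again, so the stack can rewind
//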
// Note that the asyncify_* API calls will verify that the data structure
// is valid, checking if the current location is past the end. If so, they
// will throw a wasm unreachable. No check is done during unwinding or
// rewinding, to keep things fast - in principle, from when unwinding begins
// and until it ends there should be no memory accesses aside from the
// asyncify code itself (and likewise when rewinding), so there should be
// no chance of memory corruption causing odd errors. However, the low
// overhead of this approach does mean that an error is only reported when
// unwinding/rewinding finishes, and not at the specific spot where the
// stack end was exceeded.
//
// By default this pass is very careful: it assumes that any import and
// any indirect call may start an unwind/rewind operation. If you know more
// specific information you can inform asyncify about that, which can reduce
// a great deal of overhead, as it can instrument less code. The relevant
// options to wasm-opt etc. are:
//
//   --pass-arg=asyncify-imports@module1.base1,module2.base2,module3.base3
//
//      Each module.base in that comma-separated list will be considered to
//      be an import that can unwind/rewind, and all others are assumed not to
//      (aside from the asyncify.* imports, which are always assumed to).
//      Each entry can end in a '*' in which case it is matched as a prefix.
//
//      The list of imports can be a response file (which is convenient if it
//      is long, or you don't want to bother escaping it on the commandline
//      etc.), e.g. --pass-arg=asyncify-imports@@file.txt will load the
//      contents of file.txt (note the double "@@" - the first is the
//      separator for --pass-arg, and the second is the usual convention for
//      indicating a response file).
//
//   --pass-arg=asyncify-ignore-imports
//
//      Ignore all imports (except for asyncify.*), that is, assume none of
//      them can start an unwind/rewind. (This is effectively the same as
//      providing asyncify-imports with a list of non-existent imports.)
//
//   --pass-arg=asyncify-ignore-indirect
//
//      Ignore all indirect calls. This implies that you know a call stack
//      will never need to be unwound with an indirect call somewhere in it.
//      If that is true for your codebase, then this can be extremely useful
//      as otherwise it looks like any indirect call can go to a lot of places.
//
//   --pass-arg=asyncify-asserts
//
//      This enables extra asserts in the output, like checking if we put in
//      an unwind/rewind in an invalid place (this can be helpful for manual
//      tweaking of the only-list / remove-list, see later).
//
//   --pass-arg=asyncify-verbose
//
//      Logs out instrumentation decisions to the console. This can help figure
//      out why a certain function was instrumented.
//
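// As an example, a possible wasm-opt invocation combining some of these
// options (the env.* import names here are made up) could be:
//
//   wasm-opt input.wasm --asyncify \
//     --pass-arg=asyncify-imports@env.do_sleep,env.fetch* \
//     --pass-arg=asyncify-verbose \
//     -O2 -o output.wasm
//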
// For manual fine-tuning of the list of instrumented functions, there are lists
// that you can set. These must be used carefully, as misuse can break your
// application - for example, if a function is called that should be
// instrumented but isn't because of these options, then bad things can happen.
// Note that even if you test this locally then users running your code in
// production may reach other code paths. Use these carefully!
//
// Each of the lists can be used with a response file (@filename, which is then
// loaded from the file). You can also use '*' wildcard matching in the lists.
//
//   --pass-arg=asyncify-removelist@name1,name2,name3
//
//      If the "remove-list" is provided, then the functions in it will be
//      *removed* from the list of instrumented functions. That is, they won't
//      be instrumented even if it looks like they need to be. This can be
//      useful if you know things the safe whole-program analysis doesn't, e.g.
//      that certain code paths will not be taken, where certain indirect calls
//      will go, etc.
//
//   --pass-arg=asyncify-addlist@name1,name2,name3
//
//      If the "add-list" is provided, then the functions in the list will be
//      *added* to the list of instrumented functions, that is, they will be
//      instrumented even if otherwise we think they don't need to be. As by
//      default everything will be instrumented in the safest way possible,
//      this is only useful if you use ignore-indirect and use this to fix up
//      some indirect calls that *do* need to be instrumented, or if you will
//      do some later transform of the code that adds more call paths, etc.
//
//   --pass-arg=asyncify-onlylist@name1,name2,name3
//
//      If the "only-list" is provided, then *only* the functions in the list
//      will be instrumented, and nothing else.
//
// Note that there are two types of instrumentation that happen for each
// function: if foo() can be part of a pause/resume operation, then we need to
// instrument code inside it to support pausing and resuming, but also, we need
// callers of the function to instrument calls to it. Normally this is already
// taken care of as the callers need to be instrumented as well anyhow. However,
// the ignore-indirect option makes things more complicated, since we can't tell
// where all the calls to foo() are - all we see are indirect calls that do not
// refer to foo() by name. To make it possible for you to use the add-list or
// only-list with ignore-indirect, those lists affect *both* kinds of
// instrumentation. That is, if parent() calls foo() indirectly, and you add
// parent() to the add-list, then the indirect calls in parent() will be
// instrumented to support pausing/resuming, even if ignore-indirect is set.
// Typically you don't need to think about this, and just add all the functions
// that can be on the stack while pausing - what this means is that when you do
// so, indirect calls will just work. (The cost is that an instrumented function
// will check for pausing at all indirect calls, which may be unnecessary in
// some cases; but this is an instrumented function anyhow, and indirect calls
// are slower anyhow, so this simple model seems good enough - a more complex
// model where you can specify "instrument, but not indirect calls from me"
// would likely have little benefit.)
//
// TODO When wasm has GC, extending the live ranges of locals can keep things
//      alive unnecessarily. We may want to set locals to null at the end
//      of their original range.
//

#include "asmjs/shared-constants.h"
#include "cfg/liveness-traversal.h"
#include "ir/effects.h"
#include "ir/find_all.h"
#include "ir/linear-execution.h"
#include "ir/literal-utils.h"
#include "ir/memory-utils.h"
#include "ir/module-utils.h"
#include "ir/names.h"
#include "ir/utils.h"
#include "pass.h"
#include "support/file.h"
#include "support/string.h"
#include "wasm-builder.h"
#include "wasm.h"

namespace wasm {

namespace {

static const Name ASYNCIFY_STATE = "__asyncify_state";
static const Name ASYNCIFY_GET_STATE = "asyncify_get_state";
static const Name ASYNCIFY_DATA = "__asyncify_data";
static const Name ASYNCIFY_START_UNWIND = "asyncify_start_unwind";
static const Name ASYNCIFY_STOP_UNWIND = "asyncify_stop_unwind";
static const Name ASYNCIFY_START_REWIND = "asyncify_start_rewind";
static const Name ASYNCIFY_STOP_REWIND = "asyncify_stop_rewind";
static const Name ASYNCIFY_UNWIND = "__asyncify_unwind";
static const Name ASYNCIFY = "asyncify";
static const Name START_UNWIND = "start_unwind";
static const Name STOP_UNWIND = "stop_unwind";
static const Name START_REWIND = "start_rewind";
static const Name STOP_REWIND = "stop_rewind";
static const Name ASYNCIFY_GET_CALL_INDEX = "__asyncify_get_call_index";
static const Name ASYNCIFY_CHECK_CALL_INDEX = "__asyncify_check_call_index";

// TODO: having just normal/unwind_or_rewind would decrease code
//       size, but make debugging harder
enum class State { Normal = 0, Unwinding = 1, Rewinding = 2 };

enum class DataOffset { BStackPos = 0, BStackEnd = 4, BStackEnd64 = 8 };

const auto STACK_ALIGN = 4;

// A helper class for managing fake global names. Creates the globals and
// provides mappings for using them.
// Fake globals are used to stash and then use return values from calls. We need
// to store them somewhere that is valid Binaryen IR, but also will be ignored
// by the Asyncify instrumentation, so we don't want to use a local. What we do
// is replace the local used to receive a call's result with a fake global.set
// to stash it, then do a fake global.get to receive it afterwards. (We do it in
// two steps so that if we are async, we only do the first and not the second,
// i.e., we don't store to the target local if not running normally.)
class FakeGlobalHelper {
  Module& module;

public:
  FakeGlobalHelper(Module& module) : module(module) {
    Builder builder(module);
    std::string prefix = "asyncify_fake_call_global_";
    for (auto type : collectTypes()) {
      auto global = prefix + Type(type).toString();
      map[type] = global;
      rev[global] = type;
      module.addGlobal(builder.makeGlobal(
        global, type, LiteralUtils::makeZero(type, module), Builder::Mutable));
    }
  }

  ~FakeGlobalHelper() {
    for (auto& pair : map) {
      auto name = pair.second;
      module.removeGlobal(name);
    }
  }

  Name getName(Type type) { return map.at(type); }

  Type getTypeOrNone(Name name) {
    auto iter = rev.find(name);
    if (iter != rev.end()) {
      return iter->second;
    }
    return Type::none;
  }

private:
  std::unordered_map<Type, Name> map;
  std::unordered_map<Name, Type> rev;

  // Collect the types returned from all calls for which call support globals
  // may need to be generated.
  using Types = std::unordered_set<Type>;
  Types collectTypes() {
    ModuleUtils::ParallelFunctionAnalysis<Types> analysis(
      module, [&](Function* func, Types& types) {
        if (!func->body) {
          return;
        }
        struct TypeCollector : PostWalker<TypeCollector> {
          Types& types;
          TypeCollector(Types& types) : types(types) {}
          void visitCall(Call* curr) {
            if (curr->type.isConcrete()) {
              types.insert(curr->type);
            }
          }
          void visitCallIndirect(CallIndirect* curr) {
            if (curr->type.isConcrete()) {
              types.insert(curr->type);
            }
          }
        };
        TypeCollector(types).walk(func->body);
      });
    Types types;
    for (auto& pair : analysis.map) {
      Types& functionTypes = pair.second;
      for (auto t : functionTypes) {
        types.insert(t);
      }
    }
    return types;
  }
};

class PatternMatcher {
public:
  std::string designation;
  std::set<Name> names;
  std::set<std::string> patterns;
  std::set<std::string> patternsMatched;
  std::map<std::string, std::string> unescaped;

  PatternMatcher(std::string designation,
                 Module& module,
                 const String::Split& list)
    : designation(designation) {
    // The lists contain human-readable strings. Turn them into the
    // internal escaped names for later comparisons.
    for (auto& name : list) {
      auto escaped = WasmBinaryReader::escape(name);
      unescaped[escaped.toString()] = name;
      if (name.find('*') != std::string::npos) {
        patterns.insert(escaped.toString());
      } else {
        auto* func = module.getFunctionOrNull(escaped);
        if (!func) {
          std::cerr << "warning: Asyncify " << designation
                    << "list contained a non-existing function name: " << name
                    << " (" << escaped << ")\n";
        } else if (func->imported()) {
          Fatal() << "Asyncify " << designation
                  << "list contained an imported function name (use the import "
                     "list for imports): "
                  << name << '\n';
        }
        names.insert(escaped.str);
      }
    }
  }

  bool match(Name funcName) {
    if (names.count(funcName) > 0) {
      return true;
    } else {
      for (auto& pattern : patterns) {
        if (String::wildcardMatch(pattern, funcName.toString())) {
          patternsMatched.insert(pattern);
          return true;
        }
      }
    }
    return false;
  }

  void checkPatternsMatches() {
    for (auto& pattern : patterns) {
      if (patternsMatched.count(pattern) == 0) {
        std::cerr << "warning: Asyncify " << designation
                  << "list contained a non-matching pattern: "
                  << unescaped[pattern] << " (" << pattern << ")\n";
      }
    }
  }
};

// Analyze the entire module to see which calls may change the state, that
// is, start an unwind or rewind, either in itself or in something called
// by it.
// Handles global module management, needed from the various parts of this
// transformation.
class ModuleAnalyzer {
  Module& module;
  bool canIndirectChangeState;

  struct Info
    : public ModuleUtils::CallGraphPropertyAnalysis<Info>::FunctionInfo {
    // The function name.
    Name name;
    // If this function can start an unwind/rewind.
    bool canChangeState = false;
    // If this function is part of the runtime that receives an unwinding
    // and starts a rewinding. If so, we do not instrument it, see above.
    // This is only relevant when handling things entirely inside wasm,
    // as opposed to imports.
    bool isBottomMostRuntime = false;
    // If this function is part of the runtime that starts an unwinding
    // and stops a rewinding. If so, we do not instrument it, see above.
    // The difference between the top-most and bottom-most runtime is that
    // the top-most part is still marked as changing the state; so things
    // that call it are instrumented. This is not done for the bottom.
    bool isTopMostRuntime = false;
    bool inRemoveList = false;
    bool addedFromList = false;
  };

  using Map = std::map<Function*, Info>;
  Map map;

public:
  ModuleAnalyzer(Module& module,
                 std::function<bool(Name, Name)> canImportChangeState,
                 bool canIndirectChangeState,
                 const String::Split& removeListInput,
                 const String::Split& addListInput,
                 const String::Split& onlyListInput,
                 bool asserts,
                 bool verbose)
    : module(module), canIndirectChangeState(canIndirectChangeState),
      fakeGlobals(module), asserts(asserts), verbose(verbose) {

    PatternMatcher removeList("remove", module, removeListInput);
    PatternMatcher addList("add", module, addListInput);
    PatternMatcher onlyList("only", module, onlyListInput);

    // Rename the asyncify imports so their internal name matches the
    // convention. This makes replacing them with the implementations
    // later easier.
    std::map<Name, Name> renamings;
    for (auto& func : module.functions) {
      if (func->module == ASYNCIFY) {
        if (func->base == START_UNWIND) {
          renamings[func->name] = ASYNCIFY_START_UNWIND;
        } else if (func->base == STOP_UNWIND) {
          renamings[func->name] = ASYNCIFY_STOP_UNWIND;
        } else if (func->base == START_REWIND) {
          renamings[func->name] = ASYNCIFY_START_REWIND;
        } else if (func->base == STOP_REWIND) {
          renamings[func->name] = ASYNCIFY_STOP_REWIND;
        } else {
          Fatal() << "call to unidentified asyncify import: " << func->base;
        }
      }
    }
    ModuleUtils::renameFunctions(module, renamings);

    // Scan to see which functions can directly change the state.
    // Also handle the asyncify imports, removing them (as we will implement
    // them later), and replace calls to them with calls to the later proper
    // name.
    ModuleUtils::CallGraphPropertyAnalysis<Info> scanner(
      module, [&](Function* func, Info& info) {
        info.name = func->name;
        if (func->imported()) {
          // The relevant asyncify imports can definitely change the state.
          if (func->module == ASYNCIFY &&
              (func->base == START_UNWIND || func->base == STOP_REWIND)) {
            info.canChangeState = true;
          } else {
            info.canChangeState =
              canImportChangeState(func->module, func->base);
            if (verbose && info.canChangeState) {
              std::cout << "[asyncify] " << func->name
                        << " is an import that can change the state\n";
            }
          }
          return;
        }
        struct Walker : PostWalker<Walker> {
          Info& info;
          Module& module;
          bool canIndirectChangeState;

          Walker(Info& info, Module& module, bool canIndirectChangeState)
            : info(info), module(module),
              canIndirectChangeState(canIndirectChangeState) {}

          void visitCall(Call* curr) {
            if (curr->isReturn) {
              Fatal() << "tail calls not yet supported in asyncify";
            }
            auto* target = module.getFunction(curr->target);
            if (target->imported() && target->module == ASYNCIFY) {
              // Redirect the imports to the functions we'll add later.
              if (target->base == START_UNWIND) {
                info.canChangeState = true;
                info.isTopMostRuntime = true;
              } else if (target->base == STOP_UNWIND) {
                info.isBottomMostRuntime = true;
              } else if (target->base == START_REWIND) {
                info.isBottomMostRuntime = true;
              } else if (target->base == STOP_REWIND) {
                info.canChangeState = true;
                info.isTopMostRuntime = true;
              } else {
                WASM_UNREACHABLE("call to unidentified asyncify import");
              }
            }
          }
          void visitCallIndirect(CallIndirect* curr) {
            if (curr->isReturn) {
              Fatal() << "tail calls not yet supported in asyncify";
            }
            if (canIndirectChangeState) {
              info.canChangeState = true;
            }
            // TODO optimize the other case, at least by type
          }
        };
        Walker walker(info, module, canIndirectChangeState);
        walker.walk(func->body);

        if (info.isBottomMostRuntime) {
          info.canChangeState = false;
          // TODO: issue warnings on suspicious things, like a function in
          //       the bottom-most runtime also doing top-most runtime stuff
          //       like starting and unwinding.
        }
        if (verbose && info.canChangeState) {
          std::cout << "[asyncify] " << func->name
                    << " can change the state due to initial scan\n";
        }
      });

    // Functions in the remove-list are assumed to not change the state.
    for (auto& [func, info] : scanner.map) {
      if (removeList.match(func->name)) {
        info.inRemoveList = true;
        if (verbose && info.canChangeState) {
          std::cout << "[asyncify] " << func->name
                    << " is in the remove-list, ignore\n";
        }
        info.canChangeState = false;
      }
    }

    // Remove the asyncify imports, if any, and any calls to them.
    std::vector<Name> funcsToDelete;
    for (auto& [func, info] : scanner.map) {
      auto& callsTo = info.callsTo;
      if (func->imported() && func->module == ASYNCIFY) {
        funcsToDelete.push_back(func->name);
      }
      std::vector<Function*> callersToDelete;
      for (auto* target : callsTo) {
        if (target->imported() && target->module == ASYNCIFY) {
          callersToDelete.push_back(target);
        }
      }
      for (auto* target : callersToDelete) {
        callsTo.erase(target);
      }
    }
    for (auto name : funcsToDelete) {
      module.removeFunction(name);
    }

    scanner.propagateBack([](const Info& info) { return info.canChangeState; },
                          [](const Info& info) {
                            return !info.isBottomMostRuntime &&
                                   !info.inRemoveList;
                          },
                          [verbose](Info& info, Function* reason) {
                            if (verbose && !info.canChangeState) {
                              std::cout << "[asyncify] " << info.name
                                        << " can change the state due to "
                                        << reason->name << "\n";
                            }
                            info.canChangeState = true;
                          },
                          scanner.IgnoreNonDirectCalls);

    map.swap(scanner.map);

    if (!onlyListInput.empty()) {
      // Only the functions in the only-list can change the state.
      for (auto& func : module.functions) {
        if (!func->imported()) {
          auto& info = map[func.get()];
          bool matched = onlyList.match(func->name);
          info.canChangeState = matched;
          if (matched) {
            info.addedFromList = true;
          }
          if (verbose) {
            std::cout << "[asyncify] " << func->name
                      << "'s state is set based on the only-list to " << matched
                      << '\n';
          }
        }
      }
    }

    if (!addListInput.empty()) {
      for (auto& func : module.functions) {
        if (!func->imported() && addList.match(func->name)) {
          auto& info = map[func.get()];
          if (verbose && !info.canChangeState) {
            std::cout << "[asyncify] " << func->name
                      << " is in the add-list, add\n";
          }
          info.canChangeState = true;
          info.addedFromList = true;
        }
      }
    }

    removeList.checkPatternsMatches();
    addList.checkPatternsMatches();
    onlyList.checkPatternsMatches();
  }
728 | |
729 | bool needsInstrumentation(Function* func) { |
730 | auto& info = map[func]; |
731 | return info.canChangeState && !info.isTopMostRuntime; |
732 | } |

  bool canChangeState(Expression* curr, Function* func) {
    // Look inside to see if we call any of the things we know can change the
    // state.
    // TODO: caching, this is O(N^2)
    struct Walker : PostWalker<Walker> {
      void visitCall(Call* curr) {
        // We only implement these at the very end, but we know that they
        // definitely change the state.
        if (curr->target == ASYNCIFY_START_UNWIND ||
            curr->target == ASYNCIFY_STOP_REWIND ||
            curr->target == ASYNCIFY_GET_CALL_INDEX ||
            curr->target == ASYNCIFY_CHECK_CALL_INDEX) {
          canChangeState = true;
          return;
        }
        if (curr->target == ASYNCIFY_STOP_UNWIND ||
            curr->target == ASYNCIFY_START_REWIND) {
          isBottomMostRuntime = true;
          return;
        }
        // The target may not exist if it is one of our temporary intrinsics.
        auto* target = module->getFunctionOrNull(curr->target);
        if (target && (*map)[target].canChangeState) {
          canChangeState = true;
        }
      }
      void visitCallIndirect(CallIndirect* curr) { hasIndirectCall = true; }
      Module* module;
      ModuleAnalyzer* analyzer;
      Map* map;
      bool hasIndirectCall = false;
      bool canChangeState = false;
      bool isBottomMostRuntime = false;
    };
    Walker walker;
    walker.module = &module;
    walker.analyzer = this;
    walker.map = &map;
    walker.walk(curr);
    // An indirect call is normally ignored if we are ignoring indirect calls.
    // However, see the docs at the top: if the function we are inside was
    // specifically added by the user (in the only-list or the add-list) then we
    // instrument indirect calls from it (this allows specifically allowing some
    // indirect calls but not others).
    if (walker.hasIndirectCall &&
        (canIndirectChangeState || map[func].addedFromList)) {
      walker.canChangeState = true;
    }
    // The bottom-most runtime can never change the state.
    return walker.canChangeState && !walker.isBottomMostRuntime;
  }

  FakeGlobalHelper fakeGlobals;
  bool asserts;
  bool verbose;
};

// Checks if something performs a call: either a direct or indirect call,
// and perhaps it is dropped or assigned to a local. This captures all the
// cases of a call in flat IR.
static bool doesCall(Expression* curr) {
  if (auto* set = curr->dynCast<LocalSet>()) {
    curr = set->value;
  } else if (auto* drop = curr->dynCast<Drop>()) {
    curr = drop->value;
  }
  return curr->is<Call>() || curr->is<CallIndirect>();
}

class AsyncifyBuilder : public Builder {
public:
  Module& wasm;
  Type pointerType;
  Name asyncifyMemory;

  AsyncifyBuilder(Module& wasm, Type pointerType, Name asyncifyMemory)
    : Builder(wasm), wasm(wasm), pointerType(pointerType),
      asyncifyMemory(asyncifyMemory) {}

  Expression* makeGetStackPos() {
    return makeLoad(pointerType.getByteSize(),
                    false,
                    int(DataOffset::BStackPos),
                    pointerType.getByteSize(),
                    makeGlobalGet(ASYNCIFY_DATA, pointerType),
                    pointerType,
                    asyncifyMemory);
  }

  Expression* makeIncStackPos(int32_t by) {
    if (by == 0) {
      return makeNop();
    }
    auto literal = Literal::makeFromInt64(by, pointerType);
    return makeStore(pointerType.getByteSize(),
                     int(DataOffset::BStackPos),
                     pointerType.getByteSize(),
                     makeGlobalGet(ASYNCIFY_DATA, pointerType),
                     makeBinary(Abstract::getBinary(pointerType, Abstract::Add),
                                makeGetStackPos(),
                                makeConst(literal)),
                     pointerType,
                     asyncifyMemory);
  }

  Expression* makeStateCheck(State value) {
    return makeBinary(EqInt32,
                      makeGlobalGet(ASYNCIFY_STATE, Type::i32),
                      makeConst(Literal(int32_t(value))));
  }

  Expression* makeNegatedStateCheck(State value) {
    return makeUnary(Abstract::getUnary(pointerType, Abstract::EqZ),
                     makeStateCheck(value));
  }
};

// Proxy that runs the wrapped pass on instrumented functions only.
struct InstrumentedProxy : public Pass {
  std::unique_ptr<Pass> create() override {
    return std::make_unique<InstrumentedProxy>(analyzer, pass->create());
  }

  InstrumentedProxy(ModuleAnalyzer* analyzer, std::unique_ptr<Pass> pass)
    : analyzer(analyzer), pass(std::move(pass)) {}

  bool isFunctionParallel() override { return pass->isFunctionParallel(); }

  void runOnFunction(Module* module, Function* func) override {
    if (!analyzer->needsInstrumentation(func)) {
      return;
    }
    if (pass->getPassRunner() == nullptr) {
      pass->setPassRunner(getPassRunner());
    }
    pass->runOnFunction(module, func);
  }

  bool modifiesBinaryenIR() override { return pass->modifiesBinaryenIR(); }

  bool invalidatesDWARF() override { return pass->invalidatesDWARF(); }

  bool requiresNonNullableLocalFixups() override {
    return pass->requiresNonNullableLocalFixups();
  }

private:
  ModuleAnalyzer* analyzer;
  std::unique_ptr<Pass> pass;
};

struct InstrumentedPassRunner : public PassRunner {
  InstrumentedPassRunner(Module* wasm, ModuleAnalyzer* analyzer)
    : PassRunner(wasm), analyzer(analyzer) {}

protected:
  void doAdd(std::unique_ptr<Pass> pass) override {
    PassRunner::doAdd(
      std::unique_ptr<Pass>(new InstrumentedProxy(analyzer, std::move(pass))));
  }

private:
  ModuleAnalyzer* analyzer;
};

// Instrument control flow, around calls and adding skips for rewinding.
struct AsyncifyFlow : public Pass {
  bool isFunctionParallel() override { return true; }

  ModuleAnalyzer* analyzer;
  Type pointerType;
  Name asyncifyMemory;

  std::unique_ptr<Pass> create() override {
    return std::make_unique<AsyncifyFlow>(
      analyzer, pointerType, asyncifyMemory);
  }

  AsyncifyFlow(ModuleAnalyzer* analyzer, Type pointerType, Name asyncifyMemory)
    : analyzer(analyzer), pointerType(pointerType),
      asyncifyMemory(asyncifyMemory) {}

  void runOnFunction(Module* module_, Function* func_) override {
    module = module_;
    func = func_;
    builder =
      std::make_unique<AsyncifyBuilder>(*module, pointerType, asyncifyMemory);
    // If the function cannot change our state, we have nothing to do -
    // we will never unwind or rewind the stack here.
    if (!analyzer->needsInstrumentation(func)) {
      return;
    }
    // Rewrite the function body.
    // Each function we enter will pop one from the stack, which is the index
    // of the next call to make.
    auto* block = builder->makeBlock(
      {builder->makeIf(builder->makeStateCheck(
                         State::Rewinding), // TODO: such checks can be !normal
                       makeCallIndexPop()),
       process(func->body)});
    if (func->getResults() != Type::none) {
      // Rewriting control flow may alter things; make sure the function ends in
      // something valid (which the optimizer can remove later).
      block->list.push_back(builder->makeUnreachable());
    }
    block->finalize();
    func->body = block;
    // Making things like returns conditional may alter types.
    ReFinalize().walkFunctionInModule(func, module);
  }

private:
  std::unique_ptr<AsyncifyBuilder> builder;
  Module* module;
  Function* func;

  // Each call in the function has an index, noted during unwind and checked
  // during rewind.
  Index callIndex = 0;

  Expression* process(Expression* curr) {
    // The IR is in flat form, which makes this much simpler: there are no
    // unnecessarily nested side effects or control flow, so we can add
    // skips for rewinding in an easy manner, putting a single if around
    // whole chunks of code. Also calls are separated out so that it is easy
    // to add the necessary checks for them for unwinding/rewinding support.
    //
    // Aside from that, we must "linearize" all control flow so that we can
    // reach the right part when rewinding, which is done by always skipping
    // forward. For example, for an if we do this:
    //
    //    if (condition()) {
    //      side1();
    //    } else {
    //      side2();
    //    }
    // =>
    //    if (!rewinding) {
    //      temp = condition();
    //    }
    //    if (rewinding || temp) {
    //      side1();
    //    }
    //    if (rewinding || !temp) {
    //      side2();
    //    }
    //
    // This way we will linearly get through all the code in the function,
    // if we are rewinding. In a similar way we skip over breaks, etc.; just
    // "keep on truckin'".
    //
    // Note that temp locals added in this way are just normal locals, and in
    // particular they are saved and loaded. That way if we resume from the
    // first if arm we will avoid the second.

    // To avoid recursion, we use stacks here. We process the work for each
    // node in two phases as follows:
    //
    // 1. The "Scan" phase finds children we need to process (ones that may
    //    change the state), and adds Scan tasks for them to the work stack.
    // 2. The "Finish" phase runs after all children have been Scanned and
    //    Finished. It pops the children's results from the results stack (if
    //    there were relevant children), and then it pushes its own result.
    //
    struct Work {
      Expression* curr;
      enum { Scan, Finish } phase;
    };

    std::vector<Work> work;
    std::vector<Expression*> results;
    std::unordered_set<Expression*> processed;

    work.push_back(Work{curr, Work::Scan});

    while (!work.empty()) {
      auto item = work.back();
      work.pop_back();
      processed.insert(item.curr);
      auto* curr = item.curr;
      auto phase = item.phase;

      if (phase == Work::Scan && !analyzer->canChangeState(curr, func)) {
        results.push_back(makeMaybeSkip(curr));
        continue;
      }

      if (auto* block = curr->dynCast<Block>()) {
        auto& list = block->list;

        // Find the children we need to process. They will be Scanned and
        // Finished before we reach our own Finish phase.
        if (phase == Work::Scan) {
          work.push_back(Work{curr, Work::Finish});
          // Add Scan tasks in reverse order, so that we process them in
          // execution order.
          for (size_t i = list.size(); i > 0; i--) {
            auto* child = list[i - 1];
            if (analyzer->canChangeState(child, func)) {
              work.push_back(Work{child, Work::Scan});
            }
          }
          continue;
        }
        Index i = list.size() - 1;
        // At least one of our children may change the state. Clump them as
        // necessary.
        while (1) {
          if (processed.count(list[i])) {
            list[i] = results.back();
            results.pop_back();
          } else {
            Index begin = i;
            while (begin > 0 && !processed.count(list[begin - 1])) {
              begin--;
            }
            // We have a range of [begin, i] in which the state cannot change,
            // so all we need to do is skip it if rewinding.
            if (begin == i) {
              list[i] = makeMaybeSkip(list[i]);
            } else {
              auto* block = builder->makeBlock();
              for (auto j = begin; j <= i; j++) {
                block->list.push_back(list[j]);
              }
              block->finalize();
              list[begin] = makeMaybeSkip(block);
              for (auto j = begin + 1; j <= i; j++) {
                list[j] = builder->makeNop();
              }
            }
            i = begin;
          }
          if (i == 0) {
            break;
          } else {
            i--;
          }
        }
        results.push_back(block);
        continue;
      } else if (auto* iff = curr->dynCast<If>()) {
        // The state change cannot be in the condition due to flat form, so it
        // must be in one of the children.
        assert(!analyzer->canChangeState(iff->condition, func));
        if (item.phase == Work::Scan) {
          work.push_back(Work{curr, Work::Finish});
          // Add ifTrue later so that we process it first.
          if (iff->ifFalse) {
            work.push_back(Work{iff->ifFalse, Work::Scan});
          }
          work.push_back(Work{iff->ifTrue, Work::Scan});
          continue;
        }
        // We must linearize this, which means we pass through both arms if we
        // are rewinding.
        if (!iff->ifFalse) {
          iff->condition = builder->makeBinary(
            OrInt32, iff->condition, builder->makeStateCheck(State::Rewinding));
          iff->ifTrue = results.back();
          results.pop_back();
          iff->finalize();
          results.push_back(iff);
          continue;
        }
        auto* newIfFalse = results.back();
        results.pop_back();
        auto* newIfTrue = results.back();
        results.pop_back();
        auto conditionTemp = builder->addVar(func, Type::i32);
        // TODO: can avoid pre if the condition is a get or a const
        auto* pre =
          makeMaybeSkip(builder->makeLocalSet(conditionTemp, iff->condition));
        iff->condition = builder->makeLocalGet(conditionTemp, Type::i32);
        iff->condition = builder->makeBinary(
          OrInt32, iff->condition, builder->makeStateCheck(State::Rewinding));
        iff->ifTrue = newIfTrue;
        iff->ifFalse = nullptr;
        iff->finalize();
        // Add support for the second arm as well.
        auto* otherIf = builder->makeIf(
          builder->makeBinary(
            OrInt32,
            builder->makeUnary(EqZInt32,
                               builder->makeLocalGet(conditionTemp, Type::i32)),
            builder->makeStateCheck(State::Rewinding)),
          newIfFalse);
        otherIf->finalize();
        results.push_back(builder->makeBlock({pre, iff, otherIf}));
        continue;
      } else if (auto* loop = curr->dynCast<Loop>()) {
        if (item.phase == Work::Scan) {
          work.push_back(Work{curr, Work::Finish});
          work.push_back(Work{loop->body, Work::Scan});
          continue;
        }
        loop->body = results.back();
        results.pop_back();
        results.push_back(loop);
        continue;
      } else if (doesCall(curr)) {
        results.push_back(makeCallSupport(curr));
        continue;
      }
      // We must handle all control flow above, and all things that can change
      // the state, so there should be nothing that can reach here - add it
      // earlier as necessary.
      // std::cout << *curr << '\n';
      WASM_UNREACHABLE("unexpected expression type");
    }
    assert(results.size() == 1);
    return results.back();
  }

  // Possibly skip some code, if rewinding.
  Expression* makeMaybeSkip(Expression* curr) {
    return builder->makeIf(builder->makeStateCheck(State::Normal), curr);
  }

  Expression* makeCallSupport(Expression* curr) {
    // TODO: stop doing this after code can no longer reach a call that may
    //       change the state
    assert(doesCall(curr));
    assert(curr->type == Type::none);
    // The case of a set is tricky: we must *not* execute the set when
    // unwinding, since at that point we have a fake value for the return,
    // and if we applied it to the local, it would end up saved and then
    // potentially used later - and with coalescing, that may interfere
    // with other things. Note also that at this point in the process we
    // have not yet written the local saving/loading code, so the optimizer
    // doesn't see that locals will have another use at the beginning and
    // end of the function. We don't do that yet because we don't want it
    // to force the final number of locals to be too high, but we also
    // must be careful to never touch a local unnecessarily to avoid bugs.
    // To implement this, write to a fake global; we'll fix it up after
    // the AsyncifyLocals pass adds local saving/restoring.
    auto* set = curr->dynCast<LocalSet>();
    if (set) {
      auto name = analyzer->fakeGlobals.getName(set->value->type);
      curr = builder->makeGlobalSet(name, set->value);
      set->value = builder->makeGlobalGet(name, set->value->type);
    }
    // Instrument the call itself (or, if a drop, the drop+call).
    auto index = callIndex++;
    // Execute the call, if either normal execution, or if rewinding and this
    // is the right call to go into.
    // TODO: we can read the next call index once in each function (but should
    //       avoid saving/restoring that local later)
    curr = builder->makeIf(
      builder->makeIf(builder->makeStateCheck(State::Normal),
                      builder->makeConst(int32_t(1)),
                      makeCallIndexPeek(index)),
      builder->makeSequence(curr, makePossibleUnwind(index, set)));
    return curr;
  }

  Expression* makePossibleUnwind(Index index, Expression* ifNotUnwinding) {
    // In this pass we emit an "unwind" as a call to a fake intrinsic. We
    // will implement it in the later pass. (We can't create the unwind block
    // target here, as the optimizer would remove it later; we can only add
    // it when we add its contents, later.)
    return builder->makeIf(
      builder->makeStateCheck(State::Unwinding),
      builder->makeCall(
        ASYNCIFY_UNWIND, {builder->makeConst(int32_t(index))}, Type::none),
      ifNotUnwinding);
  }

  Expression* makeCallIndexPeek(Index index) {
    // Emit an intrinsic for this, as we store the index into a local, and
    // don't want it to be seen by asyncify itself.
    return builder->makeCall(ASYNCIFY_CHECK_CALL_INDEX,
                             {builder->makeConst(int32_t(index))},
                             Type::i32);
  }

  Expression* makeCallIndexPop() {
    // Emit an intrinsic for this, as we store the index into a local, and
    // don't want it to be seen by asyncify itself.
    return builder->makeCall(ASYNCIFY_GET_CALL_INDEX, {}, Type::none);
  }
};

// Add asserts in non-instrumented code.
struct AsyncifyAssertInNonInstrumented : public Pass {
  bool isFunctionParallel() override { return true; }

  ModuleAnalyzer* analyzer;
  Type pointerType;
  Name asyncifyMemory;

  std::unique_ptr<Pass> create() override {
    return std::make_unique<AsyncifyAssertInNonInstrumented>(
      analyzer, pointerType, asyncifyMemory);
  }

  AsyncifyAssertInNonInstrumented(ModuleAnalyzer* analyzer,
                                  Type pointerType,
                                  Name asyncifyMemory)
    : analyzer(analyzer), pointerType(pointerType),
      asyncifyMemory(asyncifyMemory) {}

  void runOnFunction(Module* module_, Function* func) override {
    // FIXME: This looks like it was never right, as it should ignore the
    //        top-most runtime, but it will actually instrument it (as it needs
    //        no instrumentation, like random code - but the top-most runtime
    //        is actually a place that needs neither instrumentation *nor*
    //        assertions, as the assertions will error when it changes the
    //        state).
    if (!analyzer->needsInstrumentation(func)) {
      module = module_;
      builder =
        std::make_unique<AsyncifyBuilder>(*module, pointerType, asyncifyMemory);
      addAssertsInNonInstrumented(func);
    }
  }

  // Given a function that is not instrumented - because we proved it doesn't
  // need it, or depending on the only-list / remove-list - add assertions that
  // verify that property at runtime.
  // Note that it is ok to run code while sleeping (if you are careful not
  // to break assumptions in the program!) - so what is actually
  // checked here is if the state *changes* in an uninstrumented function.
  // That is, in an uninstrumented function, a sleep should not begin
  // from any call.
  void addAssertsInNonInstrumented(Function* func) {
    auto oldState = builder->addVar(func, Type::i32);
    // Add a check at the function entry.
    func->body = builder->makeSequence(
      builder->makeLocalSet(oldState,
                            builder->makeGlobalGet(ASYNCIFY_STATE, Type::i32)),
      func->body);
    // Add a check around every call.
    struct Walker : PostWalker<Walker> {
      void visitCall(Call* curr) {
        // Tail calls will need another type of check, as they wouldn't reach
        // this assertion.
        assert(!curr->isReturn);
        handleCall(curr);
      }
      void visitCallIndirect(CallIndirect* curr) {
        // Tail calls will need another type of check, as they wouldn't reach
        // this assertion.
        assert(!curr->isReturn);
        handleCall(curr);
      }
      void handleCall(Expression* call) {
        auto* check = builder->makeIf(
          builder->makeBinary(NeInt32,
                              builder->makeGlobalGet(ASYNCIFY_STATE, Type::i32),
                              builder->makeLocalGet(oldState, Type::i32)),
          builder->makeUnreachable());
        Expression* rep;
        if (call->type.isConcrete()) {
          auto temp = builder->addVar(func, call->type);
          rep = builder->makeBlock({
            builder->makeLocalSet(temp, call),
            check,
            builder->makeLocalGet(temp, call->type),
          });
        } else {
          rep = builder->makeSequence(call, check);
        }
        replaceCurrent(rep);
      }
      Function* func;
      AsyncifyBuilder* builder;
      Index oldState;
    };
    Walker walker;
    walker.func = func;
    walker.builder = builder.get();
    walker.oldState = oldState;
    walker.walk(func->body);
  }

private:
  std::unique_ptr<AsyncifyBuilder> builder;
  Module* module;
};

// Instrument local saving/restoring.
struct AsyncifyLocals : public WalkerPass<PostWalker<AsyncifyLocals>> {
  bool isFunctionParallel() override { return true; }

  ModuleAnalyzer* analyzer;
  Type pointerType;
  Name asyncifyMemory;

  std::unique_ptr<Pass> create() override {
    return std::make_unique<AsyncifyLocals>(
      analyzer, pointerType, asyncifyMemory);
  }

  AsyncifyLocals(ModuleAnalyzer* analyzer,
                 Type pointerType,
                 Name asyncifyMemory)
    : analyzer(analyzer), pointerType(pointerType),
      asyncifyMemory(asyncifyMemory) {}

  void visitCall(Call* curr) {
    // Replace calls to the fake intrinsics.
    if (curr->target == ASYNCIFY_UNWIND) {
      replaceCurrent(builder->makeBreak(ASYNCIFY_UNWIND, curr->operands[0]));
    } else if (curr->target == ASYNCIFY_GET_CALL_INDEX) {
      replaceCurrent(builder->makeSequence(
        builder->makeIncStackPos(-4),
        builder->makeLocalSet(rewindIndex,
                              builder->makeLoad(4,
                                                false,
                                                0,
                                                4,
                                                builder->makeGetStackPos(),
                                                Type::i32,
                                                asyncifyMemory))));
    } else if (curr->target == ASYNCIFY_CHECK_CALL_INDEX) {
      replaceCurrent(builder->makeBinary(
        EqInt32,
        builder->makeLocalGet(rewindIndex, Type::i32),
        builder->makeConst(
          Literal(int32_t(curr->operands[0]->cast<Const>()->value.geti32())))));
    }
  }

  void visitGlobalSet(GlobalSet* curr) {
    auto type = analyzer->fakeGlobals.getTypeOrNone(curr->name);
    if (type != Type::none) {
      replaceCurrent(
        builder->makeLocalSet(getFakeCallLocal(type), curr->value));
    }
  }

  void visitGlobalGet(GlobalGet* curr) {
    auto type = analyzer->fakeGlobals.getTypeOrNone(curr->name);
    if (type != Type::none) {
      replaceCurrent(builder->makeLocalGet(getFakeCallLocal(type), type));
    }
  }

  Index getFakeCallLocal(Type type) {
    auto iter = fakeCallLocals.find(type);
    if (iter != fakeCallLocals.end()) {
      return iter->second;
    }
    return fakeCallLocals[type] = builder->addVar(getFunction(), type);
  }

  void doWalkFunction(Function* func) {
    // If the function cannot change our state, we have nothing to do -
    // we will never unwind or rewind the stack here.
    if (!analyzer->needsInstrumentation(func)) {
      return;
    }
    // Find the locals that we actually need to load and save: any local that is
    // alive at a relevant call site must be handled, but others can be ignored.
    findRelevantLiveLocals(func);
    // The new function body has a prelude to load locals if rewinding,
    // then the actual main body, which does all its unwindings by breaking
    // to the unwind block, which then handles pushing the call index, as
    // well as saving the locals.
    // An index is needed for getting the unwinding and rewinding call indexes
    // around. TODO: can this be the same index?
    auto unwindIndex = builder->addVar(func, Type::i32);
    rewindIndex = builder->addVar(func, Type::i32);
    // Rewrite the function body.
    builder = std::make_unique<AsyncifyBuilder>(
      *getModule(), pointerType, asyncifyMemory);
    walk(func->body);
    // After the normal function body, emit a barrier before the postamble.
    Expression* barrier;
    if (func->getResults() == Type::none) {
      // The function may have ended without a return; ensure one.
      barrier = builder->makeReturn();
    } else {
      // The function must have returned or hit an unreachable, but emit one
      // to make possible bugs easier to figure out (as this should never be
      // reached). The optimizer can remove this anyhow.
      barrier = builder->makeUnreachable();
    }
    auto* newBody = builder->makeBlock(
      {builder->makeIf(builder->makeStateCheck(State::Rewinding),
                       makeLocalLoading()),
       builder->makeLocalSet(
         unwindIndex,
         builder->makeBlock(ASYNCIFY_UNWIND,
                            builder->makeSequence(func->body, barrier))),
       makeCallIndexPush(unwindIndex),
       makeLocalSaving()});
    if (func->getResults() != Type::none) {
      // If we unwind, we must still "return" a value, even if it will be
      // ignored on the outside.
      newBody->list.push_back(
        LiteralUtils::makeZero(func->getResults(), *getModule()));
      newBody->finalize(func->getResults());
    }
    func->body = newBody;
    // Making things like returns conditional may alter types.
    ReFinalize().walkFunctionInModule(func, getModule());
  }
1433 | |
1434 | private: |
1435 | std::unique_ptr<AsyncifyBuilder> builder; |
1436 | |
1437 | Index rewindIndex; |
1438 | std::unordered_map<Type, Index> fakeCallLocals; |
1439 | std::set<Index> relevantLiveLocals; |
1440 | |
1441 | void findRelevantLiveLocals(Function* func) { |
1442 | struct RelevantLiveLocalsWalker |
1443 | : public LivenessWalker<RelevantLiveLocalsWalker, |
1444 | Visitor<RelevantLiveLocalsWalker>> { |
1445 | // Basic blocks that have a possible unwind/rewind in them. |
1446 | std::set<BasicBlock*> relevantBasicBlocks; |
1447 | |
1448 | void visitCall(Call* curr) { |
1449 | if (!currBasicBlock) { |
1450 | return; |
1451 | } |
1452 | // Note blocks where we might unwind/rewind, all of which have a |
1453 | // possible call to ASYNCIFY_CHECK_CALL_INDEX emitted right before the |
1454 | // actual call. |
1455 | // Note that each relevant original call was turned into a sequence of |
1456 | // instructions, one of which is an if and then a call to this special |
1457 | // intrinsic. We rely on the fact that if a local was live at the |
1458 | // original call, it also would be in all that sequence of instructions, |
1459 | // and in particular at the call we look for here (which is right before |
1460 | // the call, and so anything that has its final use at the call is still |
1461 | // live here). |
1462 | if (curr->target == ASYNCIFY_CHECK_CALL_INDEX) { |
1463 | relevantBasicBlocks.insert(currBasicBlock); |
1464 | } |
1465 | } |
1466 | }; |
1467 | |
1468 | RelevantLiveLocalsWalker walker; |
1469 | walker.setFunction(func); |
1470 | walker.walkFunctionInModule(func, getModule()); |
1471 | // The relevant live locals are ones that are alive at an unwind/rewind |
1472 | // location. TODO look more precisely inside basic blocks, as one might stop |
1473 | // being live in the middle |
1474 | for (auto* block : walker.liveBlocks) { |
1475 | if (walker.relevantBasicBlocks.count(block)) { |
1476 | for (auto local : block->contents.start) { |
1477 | relevantLiveLocals.insert(local); |
1478 | } |
1479 | } |
1480 | } |
1481 | } |

  Expression* makeLocalLoading() {
    if (relevantLiveLocals.empty()) {
      return builder->makeNop();
    }
    auto* func = getFunction();
    auto numLocals = func->getNumLocals();
    Index total = 0;
    for (Index i = 0; i < numLocals; i++) {
      if (!relevantLiveLocals.count(i)) {
        continue;
      }
      total += getByteSize(func->getLocalType(i));
    }
    auto* block = builder->makeBlock();
    block->list.push_back(builder->makeIncStackPos(-total));
    auto tempIndex = builder->addVar(func, builder->pointerType);
    block->list.push_back(
      builder->makeLocalSet(tempIndex, builder->makeGetStackPos()));
    Index offset = 0;
    for (Index i = 0; i < numLocals; i++) {
      if (!relevantLiveLocals.count(i)) {
        continue;
      }
      auto localType = func->getLocalType(i);
      SmallVector<Expression*, 1> loads;
      for (const auto& type : localType) {
        auto size = getByteSize(type);
        assert(size % STACK_ALIGN == 0);
        // TODO: higher alignment?
        loads.push_back(builder->makeLoad(
          size,
          true,
          offset,
          STACK_ALIGN,
          builder->makeLocalGet(tempIndex, builder->pointerType),
          type,
          asyncifyMemory));
        offset += size;
      }
      Expression* load;
      if (loads.size() == 1) {
        load = loads[0];
      } else if (localType.size() > 1) {
        load = builder->makeTupleMake(std::move(loads));
      } else {
        WASM_UNREACHABLE("Unexpected empty type");
      }
      block->list.push_back(builder->makeLocalSet(i, load));
    }
    block->finalize();
    return block;
  }

  Expression* makeLocalSaving() {
    if (relevantLiveLocals.empty()) {
      return builder->makeNop();
    }
    auto* func = getFunction();
    auto numLocals = func->getNumLocals();
    auto* block = builder->makeBlock();
    auto tempIndex = builder->addVar(func, builder->pointerType);
    block->list.push_back(
      builder->makeLocalSet(tempIndex, builder->makeGetStackPos()));
    Index offset = 0;
    for (Index i = 0; i < numLocals; i++) {
      if (!relevantLiveLocals.count(i)) {
        continue;
      }
      auto localType = func->getLocalType(i);
      size_t j = 0;
      for (const auto& type : localType) {
        auto size = getByteSize(type);
        Expression* localGet = builder->makeLocalGet(i, localType);
        if (localType.size() > 1) {
          localGet = builder->makeTupleExtract(localGet, j);
        }
        assert(size % STACK_ALIGN == 0);
        // TODO: higher alignment?
        block->list.push_back(builder->makeStore(
          size,
          offset,
          STACK_ALIGN,
          builder->makeLocalGet(tempIndex, builder->pointerType),
          localGet,
          type,
          asyncifyMemory));
        offset += size;
        ++j;
      }
    }
    block->list.push_back(builder->makeIncStackPos(offset));
    block->finalize();
    return block;
  }
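makeLocalSaving and makeLocalLoading pack each live local's pieces at consecutive aligned offsets on the asyncify stack, and bump the stack position by the total; loading runs the same loop after first decrementing the stack position by that total, so the offsets line up. A minimal Python sketch of just that offset arithmetic (hypothetical helper; assumes STACK_ALIGN is 4, as the asserts above imply):

```python
STACK_ALIGN = 4  # assumed value of the pass's STACK_ALIGN constant

def layout_offsets(local_sizes):
    """Assign consecutive offsets for the live locals saved on the
    asyncify stack; returns (per-local offsets, total bytes pushed)."""
    offsets = []
    offset = 0
    for size in local_sizes:
        # mirrors: assert(size % STACK_ALIGN == 0)
        assert size % STACK_ALIGN == 0, "sizes must be STACK_ALIGN multiples"
        offsets.append(offset)
        offset += size
    return offsets, offset

# e.g. an i32, an i64, and an f64 live local:
print(layout_offsets([4, 8, 8]))  # ([0, 4, 12], 20)
```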

  Expression* makeCallIndexPush(Index tempIndex) {
    // TODO: add a check against the stack end here
    return builder->makeSequence(
      builder->makeStore(4,
                         0,
                         4,
                         builder->makeGetStackPos(),
                         builder->makeLocalGet(tempIndex, Type::i32),
                         Type::i32,
                         asyncifyMemory),
      builder->makeIncStackPos(4));
  }

  unsigned getByteSize(Type type) {
    if (!type.hasByteSize()) {
      Fatal() << "Asyncify does not yet support non-number types, like "
                 "references (see "
                 "https://github.com/WebAssembly/binaryen/issues/3739)";
    }
    return type.getByteSize();
  }
};

} // anonymous namespace

static std::string getFullImportName(Name module, Name base) {
  return std::string(module.str) + '.' + base.toString();
}

struct Asyncify : public Pass {
  void run(Module* module) override {
    auto& options = getPassOptions();
    bool optimize = options.optimizeLevel > 0;

    // Find which things can change the state.
    auto stateChangingImports = String::trim(read_possible_response_file(
      options.getArgumentOrDefault("asyncify-imports", "")));
    auto ignoreImports =
      options.getArgumentOrDefault("asyncify-ignore-imports", "");
    bool allImportsCanChangeState =
      stateChangingImports == "" && ignoreImports == "";
    String::Split listedImports(stateChangingImports, ",");
    // canIndirectChangeState is the default. asyncify-ignore-indirect sets it
    // to false.
    auto canIndirectChangeState =
      !options.hasArgument("asyncify-ignore-indirect");
    std::string removeListInput =
      options.getArgumentOrDefault("asyncify-removelist", "");
    if (removeListInput.empty()) {
      // Support the old name for now to avoid immediate breakage. TODO: remove
      removeListInput = options.getArgumentOrDefault("asyncify-blacklist", "");
    }
    String::Split removeList(
      String::trim(read_possible_response_file(removeListInput)), ",");
    String::Split addList(
      String::trim(read_possible_response_file(
        options.getArgumentOrDefault("asyncify-addlist", ""))),
      ",");
    std::string onlyListInput =
      options.getArgumentOrDefault("asyncify-onlylist", "");
    if (onlyListInput.empty()) {
      // Support the old name for now to avoid immediate breakage. TODO: remove
      onlyListInput = options.getArgumentOrDefault("asyncify-whitelist", "");
    }
    String::Split onlyList(
      String::trim(read_possible_response_file(onlyListInput)), ",");
    auto asserts = options.hasArgument("asyncify-asserts");
    auto verbose = options.hasArgument("asyncify-verbose");
    auto relocatable = options.hasArgument("asyncify-relocatable");
    auto secondaryMemory = options.hasArgument("asyncify-in-secondary-memory");

    // Ensure there is a memory, as we need it.
    if (secondaryMemory) {
      auto secondaryMemorySizeString =
        options.getArgumentOrDefault("asyncify-secondary-memory-size", "1");
      Address secondaryMemorySize = std::stoi(secondaryMemorySizeString);
      asyncifyMemory = createSecondaryMemory(module, secondaryMemorySize);
    } else {
      MemoryUtils::ensureExists(module);
      asyncifyMemory = module->memories[0]->name;
    }
    pointerType =
      module->getMemory(asyncifyMemory)->is64() ? Type::i64 : Type::i32;

    removeList = handleBracketingOperators(removeList);
    addList = handleBracketingOperators(addList);
    onlyList = handleBracketingOperators(onlyList);

    if (!onlyList.empty() && (!removeList.empty() || !addList.empty())) {
      Fatal() << "It makes no sense to use an asyncify only-list together "
                 "with another list.";
    }

    auto canImportChangeState = [&](Name module, Name base) {
      if (allImportsCanChangeState) {
        return true;
      }
      auto full = getFullImportName(module, base);
      for (auto& listedImport : listedImports) {
        if (String::wildcardMatch(listedImport, full)) {
          return true;
        }
      }
      return false;
    };

    // Scan the module.
    ModuleAnalyzer analyzer(*module,
                            canImportChangeState,
                            canIndirectChangeState,
                            removeList,
                            addList,
                            onlyList,
                            asserts,
                            verbose);

    // Add the necessary globals before we emit code to use them.
    addGlobals(module, relocatable);

    // Instrument the flow of code, adding code instrumentation and
    // skips for when rewinding. We do this on flat IR so that it is
    // practical to add code around each call, without affecting
    // anything else.
    {
      InstrumentedPassRunner runner(module, &analyzer);
      runner.add("flatten");
      // DCE is useful here, since AsyncifyFlow makes control flow conditional,
      // which may make unreachable code look reachable. It also lets us ignore
      // unreachable code here.
      runner.add("dce");
      if (optimize) {
        // Optimizing before AsyncifyFlow is crucial, especially coalescing,
        // because the flow changes add many branches, break up if-elses, etc.,
        // all of which extend the live ranges of locals. In other words, it is
        // not possible to coalesce well afterwards.
        runner.add("remove-unused-names");
        runner.add("simplify-locals-nonesting");
        runner.add("reorder-locals");
        runner.add("coalesce-locals");
        runner.add("simplify-locals-nonesting");
        runner.add("reorder-locals");
        runner.add("merge-blocks");
      }
      runner.add(
        std::make_unique<AsyncifyFlow>(&analyzer, pointerType, asyncifyMemory));
      runner.setIsNested(true);
      runner.setValidateGlobally(false);
      runner.run();
    }
    if (asserts) {
      // Add asserts in non-instrumented code. Note we do not use an
      // instrumented pass runner here as we do want to run on all functions.
      PassRunner runner(module);
      runner.add(std::make_unique<AsyncifyAssertInNonInstrumented>(
        &analyzer, pointerType, asyncifyMemory));
      runner.setIsNested(true);
      runner.setValidateGlobally(false);
      runner.run();
    }
    // Next, add local saving/restoring logic. We optimize before doing this,
    // to undo the extra code generated by flattening, and to arrive at the
    // minimal number of locals (which is important, as we must save and
    // restore those locals). We also optimize afterwards to simplify the code
    // as much as possible.
    {
      InstrumentedPassRunner runner(module, &analyzer);
      if (optimize) {
        runner.addDefaultFunctionOptimizationPasses();
      }
      runner.add(std::make_unique<AsyncifyLocals>(
        &analyzer, pointerType, asyncifyMemory));
      if (optimize) {
        runner.addDefaultFunctionOptimizationPasses();
      }
      runner.setIsNested(true);
      runner.setValidateGlobally(false);
      runner.run();
    }
    // Finally, add function support (that should not have been seen by
    // the previous passes).
    addFunctions(module);
  }

private:
  void addGlobals(Module* module, bool imported) {
    Builder builder(*module);

    auto asyncifyState = builder.makeGlobal(ASYNCIFY_STATE,
                                            Type::i32,
                                            builder.makeConst(int32_t(0)),
                                            Builder::Mutable);
    if (imported) {
      asyncifyState->module = ENV;
      asyncifyState->base = ASYNCIFY_STATE;
    }
    module->addGlobal(std::move(asyncifyState));

    auto asyncifyData = builder.makeGlobal(ASYNCIFY_DATA,
                                           pointerType,
                                           builder.makeConst(pointerType),
                                           Builder::Mutable);
    if (imported) {
      asyncifyData->module = ENV;
      asyncifyData->base = ASYNCIFY_DATA;
    }
    module->addGlobal(std::move(asyncifyData));
  }

  void addFunctions(Module* module) {
    Builder builder(*module);
    auto makeFunction = [&](Name name, bool setData, State state) {
      std::vector<Type> params;
      if (setData) {
        params.push_back(pointerType);
      }
      auto* body = builder.makeBlock();
      body->list.push_back(builder.makeGlobalSet(
        ASYNCIFY_STATE, builder.makeConst(int32_t(state))));
      if (setData) {
        body->list.push_back(builder.makeGlobalSet(
          ASYNCIFY_DATA, builder.makeLocalGet(0, pointerType)));
      }
      // Verify the data is valid.
      auto* stackPos =
        builder.makeLoad(pointerType.getByteSize(),
                         false,
                         int(DataOffset::BStackPos),
                         pointerType.getByteSize(),
                         builder.makeGlobalGet(ASYNCIFY_DATA, pointerType),
                         pointerType,
                         asyncifyMemory);
      auto* stackEnd =
        builder.makeLoad(pointerType.getByteSize(),
                         false,
                         int(pointerType == Type::i64 ? DataOffset::BStackEnd64
                                                      : DataOffset::BStackEnd),
                         pointerType.getByteSize(),
                         builder.makeGlobalGet(ASYNCIFY_DATA, pointerType),
                         pointerType,
                         asyncifyMemory);
      body->list.push_back(builder.makeIf(
        builder.makeBinary(
          Abstract::getBinary(pointerType, Abstract::GtU), stackPos, stackEnd),
        builder.makeUnreachable()));
      body->finalize();
      auto func = builder.makeFunction(
        name, Signature(Type(params), Type::none), {}, body);
      module->addFunction(std::move(func));
      module->addExport(builder.makeExport(name, name, ExternalKind::Function));
    };

    makeFunction(ASYNCIFY_START_UNWIND, true, State::Unwinding);
    makeFunction(ASYNCIFY_STOP_UNWIND, false, State::Normal);
    makeFunction(ASYNCIFY_START_REWIND, true, State::Rewinding);
    makeFunction(ASYNCIFY_STOP_REWIND, false, State::Normal);

    module->addFunction(
      builder.makeFunction(ASYNCIFY_GET_STATE,
                           Signature(Type::none, Type::i32),
                           {},
                           builder.makeGlobalGet(ASYNCIFY_STATE, Type::i32)));
    module->addExport(builder.makeExport(
      ASYNCIFY_GET_STATE, ASYNCIFY_GET_STATE, ExternalKind::Function));
  }
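The exported helpers built above only flip the asyncify state global (and, for the start variants, set the data pointer and sanity-check the stack bounds); the embedder drives them around calls into the instrumented module. A toy Python model of that state machine (the state values Normal = 0, Unwinding = 1, Rewinding = 2 are an assumption based on this pass's State enum; real code calls the wasm exports instead):

```python
# Toy model of the runtime API exported by addFunctions(). State values
# are assumed from the pass's State enum: Normal=0, Unwinding=1, Rewinding=2.
NORMAL, UNWINDING, REWINDING = 0, 1, 2

class AsyncifyRuntime:
    def __init__(self):
        self.state = NORMAL
        self.data = None  # pointer to the unwind/rewind data structure

    def start_unwind(self, data):
        # asyncify_start_unwind(data): begin unwinding the call stack
        self.state, self.data = UNWINDING, data

    def stop_unwind(self):
        # asyncify_stop_unwind(): the stack has fully unwound
        self.state = NORMAL

    def start_rewind(self, data):
        # asyncify_start_rewind(data): begin rewinding back to the pause point
        self.state, self.data = REWINDING, data

    def stop_rewind(self):
        # asyncify_stop_rewind(): execution reached the original pause point
        self.state = NORMAL

    def get_state(self):
        # asyncify_get_state()
        return self.state
```

A pause then looks like: start_unwind, return out of the wasm call, await the async result, start_rewind, re-enter the same export, and finally stop_rewind once the pause point is reached.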

  Name createSecondaryMemory(Module* module, Address secondaryMemorySize) {
    Name name = Names::getValidMemoryName(*module, "asyncify_memory");
    auto secondaryMemory =
      Builder::makeMemory(name, secondaryMemorySize, secondaryMemorySize);
    module->addMemory(std::move(secondaryMemory));
    return name;
  }

  Type pointerType;
  Name asyncifyMemory;
};

Pass* createAsyncifyPass() { return new Asyncify(); }

// Helper passes that can be run after Asyncify.

template<bool neverRewind, bool neverUnwind, bool importsAlwaysUnwind>
struct ModAsyncify
  : public WalkerPass<LinearExecutionWalker<
      ModAsyncify<neverRewind, neverUnwind, importsAlwaysUnwind>>> {
  bool isFunctionParallel() override { return true; }

  std::unique_ptr<Pass> create() override {
    return std::make_unique<
      ModAsyncify<neverRewind, neverUnwind, importsAlwaysUnwind>>();
  }

  void doWalkFunction(Function* func) {
    // Find the asyncify state name.
    auto* unwind = this->getModule()->getExport(ASYNCIFY_STOP_UNWIND);
    auto* unwindFunc = this->getModule()->getFunction(unwind->value);
    FindAll<GlobalSet> sets(unwindFunc->body);
    assert(sets.list.size() == 1);
    asyncifyStateName = sets.list[0]->name;
    // Walk and optimize.
    this->walk(func->body);
  }

  // Note that we don't just implement GlobalGet, as we may know the value is
  // *not* 0, 1, or 2 without knowing the actual value. What we can say
  // therefore depends on the comparison being done on it, so we implement
  // Binary and Select.

  void visitBinary(Binary* curr) {
    // Check if this is a comparison of the asyncify state to a specific
    // constant, which we may know is impossible.
    bool flip = false;
    if (curr->op == NeInt32) {
      flip = true;
    } else if (curr->op != EqInt32) {
      return;
    }
    auto* c = curr->right->dynCast<Const>();
    if (!c) {
      return;
    }
    auto* get = curr->left->dynCast<GlobalGet>();
    if (!get || get->name != asyncifyStateName) {
      return;
    }
    // This is a comparison of the state to a constant; check if we know the
    // value.
    int32_t value;
    auto checkedValue = c->value.geti32();
    if ((checkedValue == int(State::Unwinding) && neverUnwind) ||
        (checkedValue == int(State::Rewinding) && neverRewind)) {
      // We know the state is checked against an impossible value.
      value = 0;
    } else if (checkedValue == int(State::Unwinding) && this->unwinding) {
      // We know we are in fact unwinding right now.
      value = 1;
      unsetUnwinding();
    } else {
      return;
    }
    if (flip) {
      value = 1 - value;
    }
    Builder builder(*this->getModule());
    this->replaceCurrent(builder.makeConst(int32_t(value)));
  }
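visitBinary folds `state == C` (or `state != C`) to a constant when C is either an impossible state for this configuration or the state we know we are currently in. A standalone sketch of just that decision (hypothetical helper; state values 1 and 2 assumed from the State enum, mirroring the branch structure above):

```python
UNWINDING, REWINDING = 1, 2  # assumed State enum values

def fold_state_check(op, checked, *, never_unwind, never_rewind, unwinding):
    """Return 0 or 1 if `global.get $state <op> checked` folds to a
    constant, else None. op is 'eq' or 'ne' (EqInt32/NeInt32 above)."""
    if op not in ('eq', 'ne'):
        return None  # not a comparison we can reason about
    if (checked == UNWINDING and never_unwind) or \
       (checked == REWINDING and never_rewind):
        value = 0  # comparison against an impossible state
    elif checked == UNWINDING and unwinding:
        value = 1  # we know we are unwinding right now
    else:
        return None  # the state is unknown here
    # 'ne' flips the result, like the NeInt32 case above.
    return 1 - value if op == 'ne' else value
```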

  void visitSelect(Select* curr) {
    auto* get = curr->condition->dynCast<GlobalGet>();
    if (!get || get->name != asyncifyStateName) {
      return;
    }
    // This is a comparison of the state to zero, which means we are checking
    // "if running normally, run this code, but if rewinding, ignore it". If
    // we know we'll never rewind, we can optimize this.
    if (neverRewind) {
      Builder builder(*this->getModule());
      curr->condition = builder.makeConst(int32_t(0));
    }
  }

  void visitCall(Call* curr) {
    unsetUnwinding();
    if (!importsAlwaysUnwind) {
      return;
    }
    auto* target = this->getModule()->getFunction(curr->target);
    if (!target->imported()) {
      return;
    }
    // This is an import that definitely unwinds. Await the next check of
    // the state in this linear execution trace, which we can turn into a
    // constant.
    this->unwinding = true;
  }

  void visitCallIndirect(CallIndirect* curr) { unsetUnwinding(); }

  static void doNoteNonLinear(
    ModAsyncify<neverRewind, neverUnwind, importsAlwaysUnwind>* self,
    Expression**) {
    // When control flow branches, stop tracking an unwinding.
    self->unsetUnwinding();
  }

  void visitGlobalSet(GlobalSet* set) {
    // TODO: this could be more precise
    unsetUnwinding();
  }

private:
  Name asyncifyStateName;

  // Whether we just did a call to an import that indicates we are unwinding.
  bool unwinding = false;

  void unsetUnwinding() { this->unwinding = false; }
};

//
// Assume imports that may unwind will always unwind, and that rewinding never
// happens.
//

Pass* createModAsyncifyAlwaysOnlyUnwindPass() {
  return new ModAsyncify<true, false, true>();
}

//
// Assume that we never unwind, but may still rewind.
//

Pass* createModAsyncifyNeverUnwindPass() {
  return new ModAsyncify<false, true, false>();
}

} // namespace wasm