"The Definitive Guide to WebAssembly" (7) WebAssembly Table


This is the seventh article in the series "The Definitive Guide to WebAssembly".

Translator's Note: This article is Chapter 7 of the book "The Definitive Guide to WebAssembly", which introduces the concept and usage of WebAssembly tables. A table is a data structure that stores function references, allowing modules to dynamically call each other's functions. The article analyzes the characteristics of table types, elements, imports and exports, and provides several code samples that use tables, including interoperability between C/C++ and JavaScript. The article ends with a discussion of the limitations and future development directions of tables.

People often share their thoughts and stories at the dinner table. Eating with others is more fun than eating alone. If you bring a group of people from all walks of life together, there are probably endless topics to talk about. No one can cover everything. Some may share aspects of the same story. Others may have their own versions. Yet there must be a certain amount of decorum, restraint, and a willingness to accept what the other participants have to offer. Guests who misbehave, chatter, or step out of line with each other can ruin everyone's dinner.

Tables are the feature that lets WebAssembly behave like a modern software system, in which a module's functional dependencies can be satisfied by other modules. They provide capabilities equivalent to dynamic shared libraries, as opposed to statically linked libraries. Not every module needs to contain every feature in order to work; that would be horribly inefficient. Instead, a module is written with the promise that some other module will satisfy its requirements at runtime. This is called dynamic linking in the C and C++ world. Obviously, the dining table analogy is just a play on the word table. But just as etiquette is needed when eating, sharing between libraries also needs standards. Let's explore this idea more closely and see how WebAssembly supports it.

Static linking and dynamic linking

Anyone who follows me on Twitter knows what an amazing cook my wife is. She comes from a family of great chefs and had the opportunity to learn from many masters. People often see my posts about her own cooking and ask me for recipes. This is usually not as easy as sending a link, as she often combines ideas from multiple sources and then puts her own spin on it.

In our house, she can rely on her arsenal of recipes. She could say, "Use the sauce from that book to make this. Prepare the beef using the technique described in the other book. After the beef reaches your desired doneness, add these additional ingredients, which I think will make it even better."

In our house, she can refer to steps and ingredient lists from known sources and adjust the process with her own extra steps. But when she wants to give a recipe to others, she can't assume that they have the books. In this case, she has to copy the relevant parts from her sources into one complete recipe file. At that point, all the steps and ingredients are defined in one place, and the recipe can be sent to others.

This is basically the difference between static linking and dynamic linking. A typical program needs to read and write the contents of files, open windows, collect user input, or send messages over a network. These are common tasks, and they are often available as functions in libraries provided by the operating system. When you wish to use one of these functions, you tell the linker to allow runtime linking. Otherwise, it will complain about missing symbol references.

At runtime, the operating system searches its configured paths to find these shared libraries. Before starting the program, it maps the functionality in the library to a memory location that can be dynamically linked to the rest of the code. There are many reasons for doing this. The first is efficiency. Let's say you have a function called a() that is referenced by a dozen other programs. With static linking, each executable has its own copy. The programs take up more disk space, and their memory footprint at runtime is larger too. This wastes both disk and memory.

If the dynamic library is loaded into a shared memory space, then we only need one copy of the file on disk. Depending on the complexity of your operating system, you may only need one copy in memory.

Dynamic link libraries usually have their own release cycle. If you are using a system library from an executable program, you may update the operating system and get a new version of the library with security patches. As long as the numbering mechanism works and is backwards compatible, you can harden the security of your application by using the patched version without doing anything else.

Please look at Example 7-1. This is a standalone function with no main() function; it is intended to be used as a library. We could compile this into a static library, but for now we just create the object code and link our main() program with it. Note that this function also depends on printf(), so it must include the stdio.h header.

Example 7-1. A library with function calls

#include <stdio.h>

void sayHello(char *message) {
  printf("%s\n", message);
}

In Example 7-2, you'll see that the main() function first calls printf() and then calls our function, which also calls printf().

Example 7-2. An example of how main() calls a library function

#include <stdio.h>

extern void sayHello(char *message);

int main() {
  printf("Hello, world.\n");
  sayHello("How are you?");
  return 0;
}

By default, if you compile these two files with clang, it will generate an output file. We use the default name. When we run it, we see the behavior we expect. By default, the compiler will use dynamic linking for system libraries, meeting all the requirements I've listed.

brian@tweezer ~/g/w/s/ch07> clang main.c library.c
brian@tweezer ~/g/w/s/ch07> ls
a.out* library.c main.c
brian@tweezer ~/g/w/s/ch07> ./a.out
Hello, world.
How are you?

You can verify that dynamic linking is used with the nm command. First, we see that our binary provides definitions for main() and sayHello() but not for printf(). That symbol is marked U (undefined) because it is a function reused from the standard library:

brian@tweezer ~/g/w/s/ch07> nm a.out 
0000000100008008 d __dyld_private 
0000000100000000 T __mh_execute_header
0000000100003f10 T _main
                 U _printf
0000000100003f50 T _sayHello
                 U dyld_stub_binder

On Linux, you can see that the same build steps produce a binary with additional symbols. This is natural since it's a different operating system with a different runtime and a different binary format. What stands out is that our functions are defined in the binary, but printf() is not.

brian@bbfcfm:~/src/hello$ nm a.out
0000000000404030 B __bss_start
0000000000404030 b completed.8060
0000000000404020 D __data_start
0000000000404020 W data_start
0000000000401080 t deregister_tm_clones
0000000000401070 T _dl_relocate_static_pie
00000000004010f0 t __do_global_dtors_aux
0000000000403e08 d __do_global_dtors_aux_fini_array_entry
0000000000404028 D __dso_handle
0000000000403e10 d _DYNAMIC
0000000000404030 D _edata
0000000000404038 B _end
0000000000401218 T _fini
0000000000401120 t frame_dummy
0000000000403e00 d __frame_dummy_init_array_entry
000000000040216c r __FRAME_END__
0000000000404000 d _GLOBAL_OFFSET_TABLE_
                 w __gmon_start__
0000000000402024 r __GNU_EH_FRAME_HDR
0000000000401000 T _init
0000000000403e08 d __init_array_end
0000000000403e00 d __init_array_start
0000000000402000 R _IO_stdin_used
0000000000401210 T __libc_csu_fini
00000000004011a0 T __libc_csu_init
                 U __libc_start_main@@GLIBC_2.2.5
0000000000401130 T main
                 U printf@@GLIBC_2.2.5
00000000004010b0 t register_tm_clones
0000000000401170 T sayHello
0000000000401040 T _start
0000000000404030 D __TMC_END__

The otool command is another command available on macOS that shows which dynamic libraries are required to successfully execute your binary. Shown is the macOS version of the system library:

brian@tweezer ~/g/w/s/ch07> otool -L a.out 
a.out:
   /usr/lib/libSystem.B.dylib (compatibility vers 1.0.0, current vers 1292.60.1)

otool does not exist on Linux, but we can see similar results by using objdump. I've removed parts of the output to save space, but the relevant parts are shown in the snippet below. There are similar tools on Windows to check your DLL dependencies. As you can see, we need libc.so.6 to satisfy the needs of our binary.

brian@bbfcfm:~/src/hello$ objdump -x a.out
    a.out:     file format elf64-x86-64
    a.out
    architecture: i386:x86-64, flags 0x00000112:
    EXEC_P, HAS_SYMS, D_PAGED
    start address 0x0000000000401040
...
Dynamic Section:
  NEEDED         libc.so.6
  INIT           0x0000000000401000
  FINI           0x0000000000401218
  INIT_ARRAY     0x0000000000403e00
  INIT_ARRAYSZ   0x0000000000000008
  FINI_ARRAY     0x0000000000403e08
  FINI_ARRAYSZ   0x0000000000000008
  HASH           0x00000000004002e8
  GNU_HASH       0x0000000000400310
  STRTAB         0x0000000000400390
  SYMTAB         0x0000000000400330
  STRSZ          0x000000000000003f
  SYMENT         0x0000000000000018
  DEBUG           0x0000000000000000
  PLTGOT         0x0000000000404000
  PLTRELSZ       0x0000000000000018
  PLTREL         0x0000000000000007
  JMPREL         0x0000000000400428
  RELA           0x00000000004003f8
  RELASZ         0x0000000000000030
  RELAENT         0x0000000000000018
  VERNEED         0x00000000004003d8
  VERNEEDNUM     0x0000000000000001
  VERSYM         0x00000000004003d0
Version References:
  required from libc.so.6:
    0x09691a75 0x00 02 GLIBC_2.2.5
...

WebAssembly is obviously not the same thing as an operating system, but it benefits from similar concepts. Our options are the same: put all function definitions into one module so it can stand alone, or call behavior from another module to suit our needs. Considering that we often download WebAssembly modules over the network, it is desirable to keep them small. Size also affects disk storage, module validation, loading instances into memory, and so on. For this we have Table instances.

Creating tables in a module

Table instances have some characteristics similar to the Memory instances we introduced in Chapter 4. In the WebAssembly MVP there can be only one per module, but it can be defined in the module or passed in via an import object. The one-instance-per-module restriction may be lifted in the future, but for now we must abide by it.

We use this structure in WebAssembly, rather than just using Memory instances, in part because the contents of memory can be freely read and overwritten by modules. At a dinner conversation, we don't want any individual participant to rewrite the code of conduct. The same goes for shared modules. If we've loaded and validated a module that exports functions via a table instance, we don't want another module to cause trouble for the others. So all you can do is make indirect function calls to the function references stored in the table. Currently, function references are the only things that can be stored in table instances, but this is also expected to change in the future.
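To make the restriction concrete, here is a minimal sketch from the host side. It assumes a JavaScript engine with the standard WebAssembly API (Node.js, for example), and the variable names are only illustrative. It creates a table of function references and shows that the host cannot store arbitrary values in it.

```javascript
// Create a table with room for two function references.
// "anyfunc" is the element kind the JavaScript API uses for function references.
const table = new WebAssembly.Table({ initial: 2, element: "anyfunc" });

console.log(table.length);  // 2
console.log(table.get(0));  // null: slots start out empty

// Only function references that come from WebAssembly (or null) may be stored.
// A number or an ordinary JavaScript function is rejected with a TypeError.
let rejected = 0;
for (const value of [42, function plain() {}]) {
  try {
    table.set(0, value);
  } catch (err) {
    if (err instanceof TypeError) rejected += 1;
  }
}
console.log(rejected);  // 2
```

Because the host can only place references to already-validated WebAssembly functions in the table, a module that calls through it never jumps to arbitrary bytes.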

At this point, I don't want to overcomplicate things, so let's go back to a simple function definition in Wat to demonstrate how to create table instances and export them.

In Example 7-3, I created two functions. The $add function takes two parameters, adds them, and returns the result. The $sub function takes two parameters, subtracts the second from the first, and returns the result. So what? This is just a review of the previous chapters. The difference here is what happens next.

Example 7-3. A module that exports its table instances

(module
  (func $add (param $a i32) (param $b i32) (result i32)
      local.get $a
      local.get $b
      i32.add)

  (func $sub (param $a i32) (param $b i32) (result i32)
      local.get $a
      local.get $b
      i32.sub)

  (table (export "tbl") funcref (elem $add $sub))
)

We have introduced a new Wat keyword: table. This defines a collection of function references. Note the inline export. This allows the host environment to call the $add and $sub functions, but not by function name; the host can only call these two functions through the table instance. The funcref type (known as anyfunc in the JavaScript API) is currently the only type allowed for this structure, as we pointed out before. According to the ordering in the elem list, $add will be at position 0 and $sub at position 1.

We can turn our Wat file into a Wasm module and inspect its contents as shown below. Pay attention to the table section, type section, and export section.

brian@tweezer ~/g/w/s/ch07> wat2wasm math.wat 
brian@tweezer ~/g/w/s/ch07> wasm-objdump -x math.wasm
    math.wasm:      file format wasm 0x1
    Section Details:
    Type [1]:
     - type [0] (i32, i32) -> i32
    Function [2]:
     - func [0] sig=0
     - func [1] sig=0
    Table [1]:
     - table [0] type=funcref initial=2 max=2
    Export [1]:
     - table [0] -> "tbl"
    Elem [1]:
     - segment [0] flags=0 table=0 count=2 - init i32=0
      - elem [0] = func [0]
      - elem [1] = func [1]
    Code [2]:
     - func [0] size=7
     - func [1] size=7

The JavaScript in Example 7-4 instantiates our module, just as we did in previous chapters. From there, it extracts the Table instance from the module's exports.

Example 7-4. Using an exported table instance from JavaScript

<!doctype html>

<html>
  <head>
      <title>WASM Table test</title>
      <meta charset="utf-8">
      <script src="utils.js"></script>
      <link rel="icon" href="data:;base64,=">
  </head>

  <body>
    <script>
      var t;

      fetchAndInstantiate('math.wasm').then(function (instance) {
        var tbl = instance.exports.tbl;
        t = tbl;
        console.log("3 + 1 =" + tbl.get(0)(3, 1));
        console.log("3 - 1 =" + tbl.get(1)(3, 1));
      });
    </script>
  </body>
</html>

After we get the reference, we can retrieve the function associated with position 0 and call it. Remember, what comes back from the get() call is a reference to a function. To call it, we supply the arguments in a second set of parentheses and then print the result to the console. Then we do the same for the function at position 1.
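If you don't have a web server handy, the same calls can be tried in Node.js. The sketch below is not the book's utils.js approach. It assumes the module from Example 7-3 and spells out its binary by hand, section by section, to mirror the wasm-objdump listing; the byte array and variable names are my own.

```javascript
// Hand-assembled binary mirroring Example 7-3 (no debug names).
const mathBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // "\0asm", version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,       // Type: (i32, i32) -> i32
  0x03, 0x03, 0x02, 0x00, 0x00,                               // Function: two funcs, sig 0
  0x04, 0x05, 0x01, 0x70, 0x01, 0x02, 0x02,                   // Table: funcref, initial=2 max=2
  0x07, 0x07, 0x01, 0x03, 0x74, 0x62, 0x6c, 0x01, 0x00,       // Export: table 0 as "tbl"
  0x09, 0x08, 0x01, 0x00, 0x41, 0x00, 0x0b, 0x02, 0x00, 0x01, // Elem: func 0, func 1 at offset 0
  0x0a, 0x11, 0x02,                                           // Code: two bodies
  0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,             //   $add: local.get 0/1, i32.add
  0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6b, 0x0b,             //   $sub: local.get 0/1, i32.sub
]);

// Synchronous compilation is fine for a module this small.
const instance = new WebAssembly.Instance(new WebAssembly.Module(mathBytes));
const tbl = instance.exports.tbl;

// get() returns a callable function reference; calling it is a separate step.
const add = tbl.get(0);
const sub = tbl.get(1);
console.log("3 + 1 = " + add(3, 1));  // 3 + 1 = 4
console.log("3 - 1 = " + sub(3, 1));  // 3 - 1 = 2
```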

Serve the HTML over HTTP and open the JavaScript console. When your browser executes this code, the output should look like Figure 7-1.


Figure 7-1. Output of calling a method on a table instance

This table instance can hold only two references. If you try to access a position at or beyond tbl.length, an exception will be raised.
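A small sketch of that failure mode, assuming a standalone table created from JavaScript with the same limits as the one in Example 7-3 (initial 2, maximum 2). The variable names are illustrative.

```javascript
// A table with exactly two slots, like the one exported in Example 7-3.
const table = new WebAssembly.Table({ initial: 2, maximum: 2, element: "anyfunc" });

let outcome;
try {
  // Valid positions are 0 and 1, so table.length (2) is already out of range.
  table.get(table.length);
  outcome = "no error";
} catch (err) {
  outcome = err instanceof RangeError ? "RangeError" : "other error";
}
console.log(outcome);  // RangeError
```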

Dynamic linking in WebAssembly

Our final example uses dynamic linking in WebAssembly. We will define two modules. The first, shown in Example 7-5, contains our familiar $add and $sub functions. The main difference from what we saw before is that this module imports a table from the host. We use elem segments to put the arithmetic functions into this table. The addition function is stored in position 0, and the subtraction function is stored in position 1.

Example 7-5. A dynamically linked module

(module
  (import "js" "table" (table 2 funcref))

  (func $add (param $a i32) (param $b i32) (result i32)
      local.get $a
      local.get $b
      i32.add)

  (func $sub (param $a i32) (param $b i32) (result i32)
      local.get $a
      local.get $b
      i32.sub)

  (elem (i32.const 0) $add)
  (elem (i32.const 1) $sub)
)

Our second module will export two functions, myadd and mysub. It advertises to its clients the ability to add and subtract two numbers. Internally, it will call the function references in the table instance, which it also imports from the host's JavaScript environment.

An implementation of the functionality we are advertising is shown in Example 7-6. Both functions use the call_indirect instruction. In the previous chapters, we saw the call instruction used to call functions defined in the current module. The call_indirect instruction calls a function by specifying which element of the table you want to call.

Example 7-6. A module that depends on a dynamically linked module

(module
  (import "js" "table" (table 2 funcref))

  (type $sig (func (param $a i32) (param $b i32) (result i32)))

  (func (export "myadd") (param $a i32) (param $b i32) (result i32)
      (call_indirect (type $sig) (local.get $a) (local.get $b) (i32.const 0))
  )

  (func (export "mysub") (param $a i32) (param $b i32) (result i32)
      (call_indirect (type $sig) (local.get $a) (local.get $b) (i32.const 1))
  )
)

One of the things that will jump out at you is the use of the type definition. This defines a function signature to provide a degree of type safety in WebAssembly. The idea is that the function retrieved from the imported table must have the signature you expect to call.

In this case, we define a function signature that takes two i32s and returns one i32. When we call these functions through the table, we indicate that this is the type we expect. After the signature, we push the function's arguments onto the stack, followed by the table position. For addition, this is the constant 0, which represents the first position in the table. For subtraction, it is 1, the second position.

We put it all together in Example 7-7. The first thing we do is create a shared table instance. This is passed to both modules through importObject. The difference is that the math2.wat module writes its $add and $sub functions into positions 0 and 1 respectively. The myadd and mysub functions of the mymath.wat module are called from the host JavaScript environment and, in turn, indirectly call the functions at those positions. As part of the call, they also pass along the arguments they were given to the dynamically linked functions.

Because we are dealing with two modules, our instantiation mechanism is slightly different. Instead of waiting on a single Promise, we call the Promise.all() method, which waits until all of the dependent Promises have been fulfilled. In this case, that means both modules are loaded and ready.
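One detail of Promise.all() matters for the example that follows: the results arrive in the order of the input array, not in the order the Promises finish. The sketch below illustrates this with a hypothetical fakeLoad helper that stands in for a real module loader; it is not part of utils.js.

```javascript
// Hypothetical stand-in for a module loader: resolves with `name` after `ms` milliseconds.
function fakeLoad(name, ms) {
  return new Promise((resolve) => setTimeout(() => resolve(name), ms));
}

// The second loader finishes first, but the results keep the input order.
const ready = Promise.all([
  fakeLoad("math2", 20),
  fakeLoad("mymath", 5),
]);

ready.then((results) => {
  console.log(results[0], results[1]);  // math2 mymath
});
```

This is why the second entry of the resolved array can be relied on to be the second module requested.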

Example 7-7. Instantiate two modules and establish a dynamic link between them

<!doctype html>

<html>
  <head>
      <title>WASM Dynamic Linking test</title>
      <meta charset="utf-8">
      <script src="utils.js"></script>
      <link rel="icon" href="data:;base64,=">
  </head>

  <body>
    <script>
      var importObject = {
        js: {
          memory: new WebAssembly.Memory({ initial: 1 }),
          table: new WebAssembly.Table({ initial: 2, element: "anyfunc" })
        }
      };

      Promise.all([
        fetchAndInstantiate('math2.wasm', importObject),
        fetchAndInstantiate('mymath.wasm', importObject)
      ]).then(function (instances) {
        console.log("4 + 3 = " + instances[1].exports.myadd(4, 3));
        console.log("4 - 3 = " + instances[1].exports.mysub(4, 3));
      });

    </script>
  </body>
</html>

Once the modules are available, this code calls the myadd and mysub functions with some arguments. Notice that we select the second module instance, the one that exposes this behavior. What we receive is an array of instances, not a single instance.

After serving over HTTP, the result in the browser is shown in Figure 7-2. One module indirectly calls behavior implemented in another module through a shared Table instance.
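The same linking can be reproduced in Node.js without fetching anything. The sketch below rests on a few assumptions. The two binaries are hand-assembled to mirror Examples 7-5 and 7-6 (the byte arrays and helper names are my own, not the book's), and they are instantiated synchronously. It also instantiates mymath first, to show that a call through an empty table slot traps until the providing module has filled the table.

```javascript
// Sections shared by both modules.
const header = [0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00];
const typeSec = [0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f]; // (i32, i32) -> i32
const importSec = [0x02, 0x0e, 0x01, 0x02, 0x6a, 0x73,                  // import "js"
  0x05, 0x74, 0x61, 0x62, 0x6c, 0x65, 0x01, 0x70, 0x00, 0x02];          // "table" (table 2 funcref)
const funcSec = [0x03, 0x03, 0x02, 0x00, 0x00];                         // two funcs, sig 0

// Example 7-5 (math2): elem segments place $add at 0 and $sub at 1.
const math2Bytes = new Uint8Array([
  ...header, ...typeSec, ...importSec, ...funcSec,
  0x09, 0x0d, 0x02, 0x00, 0x41, 0x00, 0x0b, 0x01, 0x00,
  0x00, 0x41, 0x01, 0x0b, 0x01, 0x01,
  0x0a, 0x11, 0x02,
  0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,  // $add
  0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6b, 0x0b,  // $sub
]);

// Example 7-6 (mymath): exports myadd/mysub, each doing a call_indirect.
const mymathBytes = new Uint8Array([
  ...header, ...typeSec, ...importSec, ...funcSec,
  0x07, 0x11, 0x02,
  0x05, 0x6d, 0x79, 0x61, 0x64, 0x64, 0x00, 0x00,  // "myadd" -> func 0
  0x05, 0x6d, 0x79, 0x73, 0x75, 0x62, 0x00, 0x01,  // "mysub" -> func 1
  0x0a, 0x19, 0x02,
  0x0b, 0x00, 0x20, 0x00, 0x20, 0x01, 0x41, 0x00, 0x11, 0x00, 0x00, 0x0b, // slot 0
  0x0b, 0x00, 0x20, 0x00, 0x20, 0x01, 0x41, 0x01, 0x11, 0x00, 0x00, 0x0b, // slot 1
]);

// One table shared by both modules through the import object.
const importObject = {
  js: { table: new WebAssembly.Table({ initial: 2, element: "anyfunc" }) },
};
const instantiate = (bytes) =>
  new WebAssembly.Instance(new WebAssembly.Module(bytes), importObject);

// Before math2 is linked, the table slots are empty and the indirect call traps.
const mymath = instantiate(mymathBytes);
let trappedBeforeLinking = false;
try {
  mymath.exports.myadd(4, 3);
} catch (err) {
  trappedBeforeLinking = err instanceof WebAssembly.RuntimeError;
}

// Instantiating math2 runs its elem segments, filling slots 0 and 1.
instantiate(math2Bytes);
console.log("4 + 3 = " + mymath.exports.myadd(4, 3));  // 4 + 3 = 7
console.log("4 - 3 = " + mymath.exports.mysub(4, 3));  // 4 - 3 = 1
```

A call_indirect also traps when the function found in the slot does not match the declared signature, which is the type safety the type definition in Example 7-6 provides.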


Figure 7-2. Output from calling our dynamic link function

This concludes our introduction to the main functional elements of WebAssembly as a platform. The rest of the book builds on these basics, showing you several examples of how to use WebAssembly and what its future holds, including some more advanced features that we haven't covered yet.


Origin blog.csdn.net/weixin_38754564/article/details/129512034