System Call Matching Language (SCML)

SCML specifies matching patterns for system‑call invocations. Asterinas developers can easily write SCML rules to describe supported patterns. Likewise, users and developers can intuitively read these rules to understand which system calls and features are available.

SCML is designed to integrate seamlessly with strace, the standard Linux system‑call tracer. Strace emits each invocation in a C‑style syntax; given a set of SCML rules, a tool can automatically determine whether a strace log entry conforms to the supported patterns. This paves the way for an SCML‑based analyzer that reports unsupported calls in any application's trace.

Strace: A Quick Example

To illustrate, run strace on a simple "Hello, World!" program:

$ strace ./hello_world

A typical trace might look like this:

execve("./hello_world", ["./hello_world"], 0xffffffd3f710 /* 4 vars */) = 0
brk(NULL)                               = 0xaaaabdc1b000
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff890f4000
openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\360\206\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1722920, ...}) = 0
…
write(1, "Hello, World!\n", 14)         = 14
exit_group(0)                           = ?

Key points of this output:

  • System calls are rendered as name(arg1, …, argN).
  • Flags appear as FLAG1|FLAG2|…|FLAGN.
  • Structs use {field1=value1, …}.
  • Arrays are shown as [value1, …].

SCML's syntax draws directly from these conventions.
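As an illustration of these conventions, here is a minimal Python sketch that splits one strace line into a call name, a list of top-level arguments, and a result. The helper names (`parse_strace_line`, `split_top_level`) are hypothetical and not part of any Asterinas tool, and the sketch assumes complete, single-line entries like the ones shown above.

```python
import re

# Hypothetical helper, not an Asterinas API: name(args) = result
LINE_RE = re.compile(r'^(?P<name>\w+)\((?P<args>.*)\)\s*=\s*(?P<ret>.+)$')

def split_top_level(args):
    """Split on commas that are not nested in (), [], {} or inside a string."""
    parts, cur, depth, in_str = [], [], 0, False
    i = 0
    while i < len(args):
        ch = args[i]
        if in_str:
            cur.append(ch)
            if ch == '\\' and i + 1 < len(args):
                cur.append(args[i + 1])  # keep the escaped character as-is
                i += 1
            elif ch == '"':
                in_str = False
        elif ch == '"':
            in_str = True
            cur.append(ch)
        elif ch in '([{':
            depth += 1
            cur.append(ch)
        elif ch in ')]}':
            depth -= 1
            cur.append(ch)
        elif ch == ',' and depth == 0:
            parts.append(''.join(cur).strip())
            cur = []
        else:
            cur.append(ch)
        i += 1
    if cur:
        parts.append(''.join(cur).strip())
    return parts

def parse_strace_line(line):
    """Return (name, args, result), or None if the line is not a syscall entry."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return m['name'], split_top_level(m['args']), m['ret'].strip()
```

Flags, structs, and arrays are left as raw text in the argument list (for example `O_RDONLY|O_CLOEXEC` or `{st_mode=S_IFREG|0755, ...}`), which is the form an SCML rule would be matched against.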

SCML by Example

SCML is intentionally simple because most of the variation in Linux system‑call semantics is selected through bitflags. An SCML rule acts as a template: you define it once, and a human or an analyzer checks whether a given syscall invocation matches it.

Imagine you're developing a Linux-compatible OS (like Asterinas) that supports just a restricted subset of syscalls and their options. We will use SCML to describe the restricted functionality.

Matching Rules for System Calls

For example, suppose your OS supports the open system call with one or more of four flags: O_RDONLY, O_WRONLY, O_RDWR, and O_CLOEXEC. This constraint can be expressed with the following system call matching rule:

open(path, flags = O_RDONLY | O_WRONLY | O_RDWR | O_CLOEXEC);

To allow file creation, you add another matching rule that includes the O_CREAT flag and requires a mode argument:

open(path, flags = O_CREAT | O_RDONLY | O_WRONLY | O_RDWR | O_CLOEXEC, mode);

To support the O_PATH flag (only valid with O_CLOEXEC, not with O_RDONLY, O_WRONLY, or O_RDWR), you add a third matching rule:

open(path, flags = O_PATH | O_CLOEXEC);

SCML rules constrain only the arguments that are given a pattern (here, flags); parameters without one (like path and mode) accept any value.
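To make the matching idea concrete, the following Python sketch checks an open invocation against the three rules above. It rests on two assumptions that this section does not spell out: a flags pattern matches when every flag in the invocation appears in the pattern, and the number of arguments must equal the number of parameters in the rule. The names `OPEN_RULES` and `open_is_supported` are hypothetical.

```python
# Each entry mirrors one SCML rule: (number of parameters, allowed flags).
OPEN_RULES = [
    (2, {"O_RDONLY", "O_WRONLY", "O_RDWR", "O_CLOEXEC"}),
    (3, {"O_CREAT", "O_RDONLY", "O_WRONLY", "O_RDWR", "O_CLOEXEC"}),
    (2, {"O_PATH", "O_CLOEXEC"}),
]

def open_is_supported(args):
    """args: textual arguments of one open() call, as printed by strace."""
    flags = set(args[1].split("|"))
    # Supported if at least one rule has the same arity and allows every flag.
    # path and mode are unconstrained, so their values are never inspected.
    return any(
        len(args) == arity and flags <= allowed
        for arity, allowed in OPEN_RULES
    )
```

Under these assumptions, `O_PATH|O_RDONLY` is rejected because no single rule allows both flags, even though each flag appears in some rule.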

C-Style Comments

SCML also supports C‑style comments:

// All matching rules for the open syscall.
// A supported invocation of the open syscall must match at least one of the rules.
open(path, flags = O_RDONLY | O_WRONLY | O_RDWR | O_CLOEXEC);
open(path, flags = O_CREAT | O_RDONLY | O_WRONLY | O_RDWR | O_CLOEXEC, mode);
open(path, flags = O_PATH | O_CLOEXEC);

Matching Rules for Bitflags

Above, we embedded flag combinations directly within individual system‑call rules, which can lead to duplication and make maintenance harder. SCML allows you to define named bitflag rules that can be reused across multiple rules. This reduces repetition and centralizes your flag definitions. For example:

// Define a reusable bitflags rule
access_mode = O_RDONLY | O_WRONLY | O_RDWR;

open(path, flags = <access_mode> | O_CLOEXEC);
open(path, flags = O_CREAT | <access_mode> | O_CLOEXEC, mode);
open(path, flags = O_PATH | O_CLOEXEC);
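One way to think about a named bitflags rule is as a textual shorthand that an analyzer expands before matching. The Python sketch below shows that expansion under the assumption that references are simply replaced by the flags of the named rule; `expand` and `named` are hypothetical names, and the sketch does not guard against rules that refer to themselves.

```python
import re

def expand(expr, named_rules):
    """Return the set of flag names that an SCML flag expression allows."""
    flags = set()
    for term in (t.strip() for t in expr.split("|")):
        ref = re.fullmatch(r"<\s*(\w+)\s*>", term)
        if ref:
            # <name> refers to a named bitflags rule; expand it recursively.
            flags |= expand(named_rules[ref.group(1)], named_rules)
        else:
            flags.add(term)
    return flags

# The reusable rule from the example above.
named = {"access_mode": "O_RDONLY | O_WRONLY | O_RDWR"}
```

With this reading, `<access_mode> | O_CLOEXEC` allows exactly the same flags as the inline rule `O_RDONLY | O_WRONLY | O_RDWR | O_CLOEXEC`.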

Matching Rules for Structs

SCML can match flags inside struct fields. Consider sigaction:

struct sigaction = {
    sa_flags = SA_NOCLDSTOP | SA_NOCLDWAIT,
    ..
};

Here, .. is a wildcard for the remaining fields that we do not care about.

Then, we can write a system call rule that refers to the struct rule using the <struct_rule> syntax.

sigaction(signum, act = <sigaction>, oldact = <sigaction>);
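A struct rule can be pictured as a per-field check plus a switch for the .. wildcard. The Python sketch below assumes a struct value has already been parsed into a dictionary of field name to text, and that a constrained field matches when all of its flags are allowed by the rule. `struct_matches` and its `open_ended` parameter are hypothetical names for illustration.

```python
# Mirrors: struct sigaction = { sa_flags = SA_NOCLDSTOP | SA_NOCLDWAIT, .. };
SIGACTION_RULE = {"sa_flags": {"SA_NOCLDSTOP", "SA_NOCLDWAIT"}}

def struct_matches(value, rule, open_ended=True):
    """value: field name -> text; rule: field name -> set of allowed flags.

    open_ended plays the role of the `..` wildcard: when True, fields that
    the rule does not mention are accepted with any value.
    """
    for field, text in value.items():
        if field in rule:
            if not set(text.split("|")) <= rule[field]:
                return False
        elif not open_ended:
            return False
    return True
```

Because the sigaction rule ends with .., a value carrying additional fields still matches as long as its sa_flags field uses only the two listed flags.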

Matching Rules for Arrays

SCML can describe how to match flags embedded inside the struct values of an array. This is the case for the poll system call. It takes an array of values of struct pollfd, whose events and revents fields are bitflags.

// Support all but the POLLPRI flag
events = POLLIN | POLLOUT | POLLRDHUP | POLLERR | POLLHUP | POLLNVAL;

struct pollfd = {
    events  = <events>,
    revents = <events>,
    ..
};

poll(fds = [ <pollfd> ], nfds, timeout);

Notice how SCML denotes an array with the [ <struct_rule> ] syntax.
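An array pattern can be read as "every element must match the referenced struct rule". The Python sketch below applies that reading to the poll example, assuming each array element has been parsed into a dictionary and that fields missing from an element are not checked. `pollfd_matches` and `poll_fds_match` are hypothetical names.

```python
EVENTS = {"POLLIN", "POLLOUT", "POLLRDHUP", "POLLERR", "POLLHUP", "POLLNVAL"}

# Mirrors: struct pollfd = { events = <events>, revents = <events>, .. };
POLLFD_RULE = {"events": EVENTS, "revents": EVENTS}

def pollfd_matches(entry):
    """entry: field name -> text for one struct pollfd value."""
    return all(
        set(entry[field].split("|")) <= allowed
        for field, allowed in POLLFD_RULE.items()
        if field in entry
    )

def poll_fds_match(fds):
    """fds = [ <pollfd> ]: every element of the array must match the rule."""
    return all(pollfd_matches(entry) for entry in fds)
```

A single element that requests POLLPRI is enough to make the whole fds argument fall outside the supported pattern.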

Advanced Usage

Just as you can write multiple rules for the same system call, you may define multiple rules for the same struct:

// Rules for control message header
struct cmsghdr = {
    cmsg_level = SOL_SOCKET,
    cmsg_type  = SO_TIMESTAMP_OLD | SCM_RIGHTS | SCM_CREDENTIALS,
    ..
};
struct cmsghdr = {
    cmsg_level = SOL_IP,
    cmsg_type  = IP_TTL,
    ..
};

A cmsghdr value matches if it satisfies any one rule.
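The "any one rule" semantics means that fields are checked together within a rule, not independently across rules. The Python sketch below illustrates this for the two cmsghdr rules, assuming each constrained field holds a single symbolic name. `CMSGHDR_RULES`, `matches_rule`, and `cmsghdr_supported` are hypothetical names.

```python
# One dictionary per struct rule: field name -> set of accepted names.
CMSGHDR_RULES = [
    {
        "cmsg_level": {"SOL_SOCKET"},
        "cmsg_type": {"SO_TIMESTAMP_OLD", "SCM_RIGHTS", "SCM_CREDENTIALS"},
    },
    {
        "cmsg_level": {"SOL_IP"},
        "cmsg_type": {"IP_TTL"},
    },
]

def matches_rule(value, rule):
    """All constrained fields must be accepted by this one rule."""
    return all(value.get(field) in allowed for field, allowed in rule.items())

def cmsghdr_supported(value):
    """A value is supported if it satisfies at least one rule in full."""
    return any(matches_rule(value, rule) for rule in CMSGHDR_RULES)
```

For instance, a header with cmsg_level SOL_IP and cmsg_type SCM_RIGHTS is rejected: each field appears in some rule, but no single rule accepts both.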

Struct rules may also be nested:

// Rule for message header, which refers to the rules for control message header
struct msghdr = {
    msg_control = [ <cmsghdr> ],
    ..
};

recvmsg(socket, message = <msghdr>, flags);

Formal Syntax

Below is the formal syntax of SCML, expressed in Extended Backus–Naur Form (EBNF). Non‑terminals are in angle brackets and terminals in quotes; letter, digit, and any-char stand for the usual character classes.

<scml>           ::= { <rule> }
<rule>           ::= <syscall-rule> ';' 
                   | <struct-rule> ';'
                   | <bitflags-rule> ';'

<syscall-rule>   ::= <identifier> '(' [ <param-list> ] ')'
<param-list>     ::= <param> { ',' <param> }
<param>          ::= <identifier> '=' <expr>
                   | <identifier> '=' <array>
                   | <identifier>

<expr>           ::= <expr> '|' <expr>
                   | <term>
<term>           ::= <identifier>
                   | '<' <identifier> '>'

<array>          ::= '[' '<' <identifier> '>' ']'  

<struct-rule>    ::= 'struct' <identifier> '=' '{' <field-list> [ ',' '..' ] '}'
<field-list>     ::= <field> { ',' <field> }
<field>          ::= <identifier>
                   | <identifier> '=' <expr>
                   | <identifier> '=' <array>

<bitflags-rule>  ::= <identifier> '=' <expr>

<identifier>     ::= letter { letter | digit | '_' }

comment          ::= '//' { any-char }
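As a rough illustration of how this grammar could be processed, the Python sketch below tokenizes SCML text and parses an `<expr>`. It is not the Asterinas implementation: `tokenize` and `parse_expr` are hypothetical names, and the left-recursive `<expr>` production is handled with the equivalent iterative form `<term> { '|' <term> }`.

```python
import re

IDENT = r"[A-Za-z][A-Za-z0-9_]*"  # letter { letter | digit | '_' }

# One token per match: a comment, the '..' wildcard, an identifier, or punctuation.
TOKEN_RE = re.compile(
    r"\s*(?:(//[^\n]*)|(\.\.)|(" + IDENT + r")|([()\[\]{}<>|=,;:]))"
)

def tokenize(text):
    """Turn SCML source into a list of tokens, dropping comments."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        pos = m.end()
        if m.group(1) is None:  # keep everything except comments
            tokens.append(m.group(2) or m.group(3) or m.group(4))
    return tokens

def parse_expr(tokens, pos=0):
    """Parse <term> { '|' <term> } starting at pos.

    Returns (terms, next_pos), where each term is ("flag", name) or
    ("ref", name) for a '<' name '>' reference to another rule.
    """
    terms = []
    while True:
        if tokens[pos:pos + 1] == ["<"]:
            if tokens[pos + 2:pos + 3] != [">"]:
                raise SyntaxError("expected '>' after rule reference")
            terms.append(("ref", tokens[pos + 1]))
            pos += 3
        elif pos < len(tokens) and re.fullmatch(IDENT, tokens[pos]):
            terms.append(("flag", tokens[pos]))
            pos += 1
        else:
            raise SyntaxError("expected a flag name or <reference>")
        if pos < len(tokens) and tokens[pos] == "|":
            pos += 1  # consume '|' and parse the next term
        else:
            return terms, pos
```

A full parser would add functions for `<syscall-rule>`, `<struct-rule>`, and `<bitflags-rule>` in the same style, each consuming tokens from the list and returning the next position.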