The computer algebra system package

Symbolic expressions are provided by several classes found in module cas/expressions:

  • Constant cst, which represents immediate (signed or unsigned) value of fixed size (bitvector),

  • Symbol sym, a Constant equipped with a reference string (non-external symbol),

  • Register reg, a fixed size CPU register location,

  • External ext, a reference to an external location (external symbol),

  • Floats cfp, constant (fixed size) floating-point values,

  • Composite comp, a bitvector composed of several elements,

  • Pointer ptr, a memory location in a segment, with possible displacement,

  • Memory mem, a Pointer to represent a value of fixed size in memory,

  • Slice slc, a bitvector slice of any element,

  • Test tst, a conditional expression, (see below.)

  • Operator uop, an unary operator expression,

  • Operator op, a binary operator expression. The list of supported operations is not fixed althrough several predefined operators allow to build expressions directly from Python expressions: say, you don’t need to write op('+',x,y), but can write x+y. Supported operators are:

    • +, -, * (multiply low), ** (multiply extended), /
    • &, |, ^, ~
    • ==, !=, <=, >=, <, >
    • >>, <<, // (arithmetic shift right), >>> and <<< (rotations).

    See cas.expressions._operator for more details.

All elements inherit from the exp class which defines all default methods/properties. Common attributes and methods for all elements are:

  • size, a Python integer representing the size in bits,
  • sf, the True/False sign-flag.
  • length (size/8)
  • mask (1<<size)-1
  • extend methods (signextend(newsize), zeroextend(newsize))
  • bytes(sta,sto,endian=1) method to retreive the expression of extracted bytes from sta to sto indices.

All manipulation of an expression object usually result in a new expression object except for simplify() which performs a few in-place elementary simplifications.

cas/expressions.py

The expressions module implements all above exp classes. All symbolic representation of data in amoco rely on these expressions.

class cas.expressions.exp(size=0, sf=False)[source]

the core class for all expressions. It defines mandatory attributes, shared methods like dumps/loads etc.

size

the bit size of the expression (default is 0.)

Type:int
sf

the sign flag of the expression (default is False: unsigned.)

Type:Bool
length

the byte size of the expression.

Type:int
mask

the bit mask of the expression.

Type:int

Note

len(exp) returns the byte size, assuming that size is a multiple of 8.

signed()[source]

consider expression as signed

unsigned()[source]

consider expression as unsigned

eval(env)[source]

evalute expression in given mapper env

simplify(**kargs)[source]

simplify expression based on predefined heuristics

depth()[source]

depth size of the expression tree

dumps()[source]

pickle expression

loads(s)[source]

unpickle expression

toks(**kargs)[source]

returns list of pretty printing tokens of the expression

pp(**kargs)[source]

pretty-printed string of the expression

bit(i)[source]

extract i-th bit expression of the expression

bytes(sta=0, sto=None, endian=1)[source]

returns the expression slice located at bytes [sta,sto] taking into account given endianess 1 (little) or -1 (big). Defaults to little endian.

extend(sign, size)[source]

extend expression to given size, taking sign into account

signextend(size)[source]

sign extend expression to given size

zeroextend(size)[source]

zero extend expression to given size

to_smtlib(solver=None)[source]

translate expression to its smt form

class cas.expressions.top(size=0, sf=False)[source]

top expression represents symbolic values that have reached a high complexity threshold.

Note: This expression is an absorbing element of the algebra. Any expression that involves a top expression results in a top expression.

depth()[source]

depth size of the expression tree

class cas.expressions.cst(v, size=32)[source]

cst expression represents concrete values (constants).

value

get the integer of the expression, taking into account the sign flag.

Type:int
toks(**kargs)[source]

returns list of pretty printing tokens of the expression

to_sym(ref)[source]

cast into a symbol expression associated to name ref

eval(env)[source]

evalute expression in given mapper env

zeroextend(size)[source]

zero extend expression to given size

signextend(size)[source]

sign extend expression to given size

class cas.expressions.sym(ref, v, size=32)[source]

symbol expression extends cst with a reference name for pretty printing

class cas.expressions.cfp(v, size=32)[source]

floating point concrete value expression

toks(**kargs)[source]

returns list of pretty printing tokens of the expression

eval(env)[source]

evalute expression in given mapper env

class cas.expressions.reg(refname, size=32)[source]

symbolic register expression

etype

int([x]) -> integer int(x, base=10) -> integer

Convert a number or string to an integer, or return 0 if no arguments are given. If x is a number, return x.__int__(). For floating point numbers, this truncates towards zero.

If x is not a number or if base is given, then x must be a string, bytes, or bytearray instance representing an integer literal in the given base. The literal can be preceded by ‘+’ or ‘-‘ and be surrounded by whitespace. The base defaults to 10. Valid bases are 0 and 2-36. Base 0 means to interpret the base from the string as an integer literal. >>> int(‘0b100’, base=0) 4

toks(**kargs)[source]

returns list of pretty printing tokens of the expression

eval(env)[source]

evalute expression in given mapper env

class cas.expressions.regtype(t)[source]

decorator and context manager (with…) for associating a register to a specific category among STD (standard), PC (program counter), FLAGS, STACK, OTHER.

class cas.expressions.ext(refname, **kargs)[source]

external reference to a dynamic (lazy or non-lazy) symbol

toks(**kargs)[source]

returns list of pretty printing tokens of the expression

call(env, **kargs)[source]

explicit call to the ext’s stub

class cas.expressions.lab(refname, **kargs)[source]

label expression used by the assembler

cas.expressions.composer(parts)[source]

composer returns a comp object (see below) constructed with parts from low significant bits parts to most significant bits parts. The last part sf flag propagates to the resulting comp.

class cas.expressions.comp(s)[source]

composite expression, represents an expression made of several parts.

parts

expressions parts dictionary. Each key is a tuple (pos,sz) and value is the exp part. pos is the bit position for this part, and sz is its size.

Type:dict
smask

mapping of bit index to the part’s key that defines this bit.

Type:list

Note

Each part can be accessed by ‘slicing’ the comp to obtain another comp or the part if the given slice indices match the part position.

toks(**kargs)[source]

returns list of pretty printing tokens of the expression

eval(env)[source]

evalute expression in given mapper env

simplify(**kargs)[source]

simplify expression based on predefined heuristics

cut(start, stop)[source]

cut will scan the parts dict to find those spanning over start and/or stop bounds then it will split them and remove their inner parts.

Note

cut is in in-place method (affects self).

restruct()[source]

restruct will aggregate consecutive cst expressions in order to minimize the number of parts.

depth()[source]

depth size of the expression tree

class cas.expressions.mem(a, size=32, seg=None, disp=0, mods=None, endian=1)[source]

memory expression represents a symbolic value of length size, in segment seg, at given address expression.

a

a pointer expression that represents the address.

Type:ptr
endian

1 means little, -1 means big.

Type:int
mods

list of possibly aliasing operations affecting this exp.

Type:list

Note

The mods list allows to handle aliasing issues detected at fetching time and adjust the eval result accordingly.

toks(**kargs)[source]

returns list of pretty printing tokens of the expression

eval(env)[source]

evalute expression in given mapper env

simplify(**kargs)[source]

simplify expression based on predefined heuristics

bytes(sta=0, sto=None, endian=0)[source]

returns the expression slice located at bytes [sta,sto] taking into account given endianess 1 (little) or -1 (big). Defaults to little endian.

class cas.expressions.ptr(base, seg=None, disp=0)[source]

ptr holds memory addresses with segment, base expressions and displacement integer (offset relative to base).

base

symbolic expression for the base of pointer address.

Type:exp
disp

offset relative to base for the pointer address.

Type:int
seg

segment register (or None if unused.)

Type:reg
toks(**kargs)[source]

returns list of pretty printing tokens of the expression

simplify(**kargs)[source]

simplify expression based on predefined heuristics

eval(env)[source]

evalute expression in given mapper env

cas.expressions.slicer(x, pos, size)[source]

wrapper of slc class that returns a simplified version of x[pos:pos+size].

class cas.expressions.slc(x, pos, size, ref=None)[source]

slice expression, represents an expression part.

x

reference to the symbolic expression

Type:exp
pos

start bit for the part.

Type:int
ref

an alternative symbolic name for this part.

Type:str
etype

int([x]) -> integer int(x, base=10) -> integer

Convert a number or string to an integer, or return 0 if no arguments are given. If x is a number, return x.__int__(). For floating point numbers, this truncates towards zero.

If x is not a number or if base is given, then x must be a string, bytes, or bytearray instance representing an integer literal in the given base. The literal can be preceded by ‘+’ or ‘-‘ and be surrounded by whitespace. The base defaults to 10. Valid bases are 0 and 2-36. Base 0 means to interpret the base from the string as an integer literal. >>> int(‘0b100’, base=0) 4

raw()[source]

returns the raw symbolic name (ignore the ref attribute.)

toks(**kargs)[source]

returns list of pretty printing tokens of the expression

depth()[source]

depth size of the expression tree

eval(env)[source]

evalute expression in given mapper env

simplify(**kargs)[source]

simplify expression based on predefined heuristics

class cas.expressions.tst(t, l, r)[source]

Conditional expression.

tst

the boolean expression that represents the condition.

Type:exp
l

the resulting expression if test == bit1.

Type:exp
r

the resulting expression if test == bit0.

Type:exp
toks(**kargs)[source]

returns list of pretty printing tokens of the expression

eval(env)[source]

evalute expression in given mapper env

simplify(**kargs)[source]

simplify expression based on predefined heuristics

depth()[source]

depth size of the expression tree

cas.expressions.oper(opsym, l, r=None)[source]

wrapper of the operator expression that detects unary operations

class cas.expressions.op(op, l, r)[source]

op holds binary integer arithmetic and bitwise logic expressions

op

binary operator

Type:_operator
prop

type of operator (ARITH, LOGIC, CONDT, SHIFT)

Type:int
l

left-hand expression of the operator

Type:exp
r

right-hand expression of the operator

Type:exp
eval(env)[source]

evalute expression in given mapper env

toks(**kargs)[source]

returns list of pretty printing tokens of the expression

simplify(**kargs)[source]

simplify expression based on predefined heuristics

depth()[source]

depth size of the expression tree

class cas.expressions.uop(op, r)[source]

uop holds unary integer arithmetic and bitwise logic expressions

op

unary operator

Type:_operator
prop

type of operator (ARITH, LOGIC, CONDT, SHIFT)

Type:int
l

returns None in case uop is treated as an op instance.

Type:None
r

right-hand expression of the operator

Type:exp
eval(env)[source]

evalute expression in given mapper env

toks(**kargs)[source]

returns list of pretty printing tokens of the expression

simplify(**kargs)[source]

simplify expression based on predefined heuristics

depth()[source]

depth size of the expression tree

cas.expressions.ror(x, n)[source]

high-level rotate right n bits

cas.expressions.rol(x, n)[source]

high-level rotate left n bits

cas.expressions.ltu(x, y)[source]

high-level less-than-unsigned operation

cas.expressions.geu(x, y)[source]

high level greater-or-equal-unsigned operation

cas.expressions.symbols_of(e)[source]

returns all symbols contained in expression e

cas.expressions.locations_of(e)[source]

returns all locations contained in expression e

cas.expressions.complexity(e)[source]

evaluate the complexity of expression e

cas.expressions.eqn1_helpers(e, **kargs)[source]

helpers for simplifying unary expressions

cas.expressions.eqn2_helpers(e, bitslice=False, widening=False)[source]

helpers for simplifying binary expressions

cas.expressions.extract_offset(e)[source]

separate expression e into (e’ + C) with C cst offset.

class cas.expressions.vec(l=None)[source]

vec holds a list of expressions each being a possible representation of the current expression. A vec object is obtained by merging several execution paths using the merge function in the mapper module. The simplify method uses the complexity measure to eventually “reduce” the expression to top with a hard-limit currently set to op.threshold.

toks(**kargs)[source]

returns list of pretty printing tokens of the expression

simplify(**kargs)[source]

simplify expression based on predefined heuristics

eval(env)[source]

evalute expression in given mapper env

depth()[source]

depth size of the expression tree

class cas.expressions.vecw(v)[source]

vecw is a widened vec expression: it allows to limit the list of possible values to a fixed range and acts as a top (absorbing) expression.

toks(**kargs)[source]

returns list of pretty printing tokens of the expression

eval(env)[source]

evalute expression in given mapper env

cas/smt.py

The smt module defines the amoco interface to the SMT solver. Currently, only z3 is supported. This module allows to translate any amoco expression into its z3 equivalent formula, as well as getting the z3 solver results back as cas.mapper.mapper instances.

cas.smt.newvar(pfx, e, slv)[source]

return a new z3 BitVec of size e.size, with name prefixed by slv argument

cas.smt.top_to_z3(e, slv=None)[source]

translate top expression into a new _topN BitVec variable

cas.smt.cst_to_z3(e, slv=None)[source]

translate cst expression into its z3 BitVecVal form

cas.smt.cfp_to_z3(e, slv=None)[source]

translate cfp expression into its z3 RealVal form

cas.smt.reg_to_z3(e, slv=None)[source]

translate reg expression into its z3 BitVec form

cas.smt.comp_to_z3(e, slv=None)[source]

translate comp expression into its z3 Concat form

cas.smt.slc_to_z3(e, slv=None)[source]

translate slc expression into its z3 Extract form

cas.smt.ptr_to_z3(e, slv=None)[source]

translate ptr expression into its z3 form

cas.smt.mem_to_z3(e, slv=None)[source]

translate mem expression into z3 a Concat of BitVec bytes

cas.smt.cast_z3_bool(x, slv=None)[source]

translate boolean expression into its z3 bool form

cas.smt.cast_z3_bv(x, slv=None)[source]

translate expression x to its z3 form, if x.size==1 the returned formula is (If x ? 1 : 0).

cas.smt.tst_to_z3(e, slv=None)[source]

translate tst expression into a z3 If form

cas.smt.op_to_z3(e, slv=None)[source]

translate op expression into its z3 form

cas.smt.uop_to_z3(e, slv=None)[source]

translate uop expression into its z3 form

cas.smt.vec_to_z3(e, slv=None)[source]

translate vec expression into z3 Or form

cas.smt.to_smtlib(e, slv=None)[source]

return the z3 smt form of expression e

cas.smt.model_to_mapper(r, locs)[source]

return an amoco mapper based on given locs for the z3 model r

cas/mapper.py

The mapper module essentially implements the mapper class and the associated merge() function which allows to get a symbolic representation of the union of two mappers.

class cas.mapper.mapper(instrlist=None, csi=None)[source]

A mapper is a symbolic functional representation of the execution of a set of instructions.

Parameters:
  • instrlist (list[instruction]) – a list of instructions that are symbolically executed within the mapper.
  • csi (Optional[object]) – the optional csi attribute that provide a concrete initial state
__map

is an ordered list of mappings of expressions associated with a location (register or memory pointer). The order is relevant only to reflect the order of write-to-memory instructions in case of pointer aliasing.

__Mem

is a memory model where symbolic memory pointers are addressing separated memory zones. See MemoryMap and MemoryZone classes.

conds

is the list of conditions that must be True for the mapper

csi

is the optional interface to a concrete state

conds
csi
view
inputs()[source]

list antecedent locations (used in the mapping)

outputs()[source]

list image locations (modified in the mapping)

has(loc)[source]

check if the given location expression is touched by the mapper

history(loc)[source]
delayed(k, v)[source]
update_delayed()[source]
rw()[source]

get the read sizes and written sizes tuple

clear()[source]

clear the current mapper, reducing it to the identity transform

getmemory()[source]

get the local MemoryMap associated to the mapper

setmemory(mmap)[source]

set the local MemoryMap associated to the mapper

mmap

get the local MemoryMap associated to the mapper

generation()[source]
R(x)[source]

get the expression of register x

M(k)[source]

get the expression of a memory location expression k

aliasing(k)[source]

check if location k is possibly aliased in the mapper: i.e. the mapper writes to some other symbolic location expression after writing to k which might overlap with k.

update(instr)[source]

opportunistic update of the self mapper with instruction

safe_update(instr)[source]

update of the self mapper with instruction only if no exception occurs

restruct()[source]
eval(m)[source]

return a new mapper instance where all input locations have been replaced by there corresponding values in m.

rcompose(m)[source]

composition operator returns a new mapper corresponding to function x -> self(m(x))

interact()[source]
use(*args, **kargs)[source]

return a new mapper corresponding to the evaluation of the current mapper where all key symbols found in kargs are replaced by their values in all expressions. The kargs “size=value” allows for adjusting symbols/values sizes for all arguments. if kargs is empty, a copy of the result is just a copy of current mapper.

usemmap(mmap)[source]

return a new mapper corresponding to the evaluation of the current mapper where all memory locations of the provided mmap are used by the current mapper.

assume(conds)[source]
cas.mapper.merge(m1, m2, **kargs)[source]

union of two mappers