The computer algebra system package¶
Symbolic expressions are provided by several classes found
in module cas/expressions
:
Constant
cst
, which represents immediate (signed or unsigned) value of fixed size (bitvector),Symbol
sym
, a Constant equipped with a reference string (non-external symbol),Register
reg
, a fixed size CPU register location,External
ext
, a reference to an external location (external symbol),Floats
cfp
, constant (fixed size) floating-point values,Composite
comp
, a bitvector composed of several elements,Pointer
ptr
, a memory location in a segment, with possible displacement,Memory
mem
, a Pointer to represent a value of fixed size in memory,Slice
slc
, a bitvector slice of any element,Test
tst
, a conditional expression, (see below.)Operator
uop
, an unary operator expression,Operator
op
, a binary operator expression. The list of supported operations is not fixed althrough several predefined operators allow to build expressions directly from Python expressions: say, you don’t need to writeop('+',x,y)
, but can writex+y
. Supported operators are:+
,-
,*
(multiply low),**
(multiply extended),/
&
,|
,^
,~
==
,!=
,<=
,>=
,<
,>
>>
,<<
,//
(arithmetic shift right),>>>
and<<<
(rotations).
See cas.expressions._operator for more details.
All elements inherit from the exp
class which defines all default methods/properties.
Common attributes and methods for all elements are:
size
, a Python integer representing the size in bits,sf
, the True/False sign-flag.length
(size/8)mask
(1<<size)-1- extend methods (
signextend(newsize)
,zeroextend(newsize)
) bytes(sta,sto,endian=1)
method to retreive the expression of extracted bytes from sta to sto indices.
All manipulation of an expression object usually result in a new expression object except for
simplify()
which performs a few in-place elementary simplifications.
cas/expressions.py¶
The expressions module implements all above exp
classes.
All symbolic representation of data in amoco rely on these expressions.
-
class
cas.expressions.
exp
(size=0, sf=False)[source]¶ the core class for all expressions. It defines mandatory attributes, shared methods like dumps/loads etc.
-
sf
¶ the sign flag of the expression (default is False: unsigned.)
Type: Bool
Note
len(exp) returns the byte size, assuming that size is a multiple of 8.
-
-
class
cas.expressions.
top
(size=0, sf=False)[source]¶ top expression represents symbolic values that have reached a high complexity threshold.
Note: This expression is an absorbing element of the algebra. Any expression that involves a top expression results in a top expression.
-
class
cas.expressions.
cst
(v, size=32)[source]¶ cst expression represents concrete values (constants).
-
class
cas.expressions.
sym
(ref, v, size=32)[source]¶ symbol expression extends cst with a reference name for pretty printing
-
class
cas.expressions.
reg
(refname, size=32)[source]¶ symbolic register expression
-
etype
¶ int([x]) -> integer int(x, base=10) -> integer
Convert a number or string to an integer, or return 0 if no arguments are given. If x is a number, return x.__int__(). For floating point numbers, this truncates towards zero.
If x is not a number or if base is given, then x must be a string, bytes, or bytearray instance representing an integer literal in the given base. The literal can be preceded by ‘+’ or ‘-‘ and be surrounded by whitespace. The base defaults to 10. Valid bases are 0 and 2-36. Base 0 means to interpret the base from the string as an integer literal. >>> int(‘0b100’, base=0) 4
-
-
class
cas.expressions.
regtype
(t)[source]¶ decorator and context manager (with…) for associating a register to a specific category among STD (standard), PC (program counter), FLAGS, STACK, OTHER.
-
class
cas.expressions.
ext
(refname, **kargs)[source]¶ external reference to a dynamic (lazy or non-lazy) symbol
-
cas.expressions.
composer
(parts)[source]¶ composer returns a comp object (see below) constructed with parts from low significant bits parts to most significant bits parts. The last part sf flag propagates to the resulting comp.
-
class
cas.expressions.
comp
(s)[source]¶ composite expression, represents an expression made of several parts.
-
parts
¶ expressions parts dictionary. Each key is a tuple (pos,sz) and value is the exp part. pos is the bit position for this part, and sz is its size.
Type: dict
Note
Each part can be accessed by ‘slicing’ the comp to obtain another comp or the part if the given slice indices match the part position.
-
cut
(start, stop)[source]¶ cut will scan the parts dict to find those spanning over start and/or stop bounds then it will split them and remove their inner parts.
Note
cut is in in-place method (affects self).
-
-
class
cas.expressions.
mem
(a, size=32, seg=None, disp=0, mods=None, endian=1)[source]¶ memory expression represents a symbolic value of length size, in segment seg, at given address expression.
Note
The mods list allows to handle aliasing issues detected at fetching time and adjust the eval result accordingly.
-
class
cas.expressions.
ptr
(base, seg=None, disp=0)[source]¶ ptr holds memory addresses with segment, base expressions and displacement integer (offset relative to base).
-
cas.expressions.
slicer
(x, pos, size)[source]¶ wrapper of slc class that returns a simplified version of x[pos:pos+size].
-
class
cas.expressions.
slc
(x, pos, size, ref=None)[source]¶ slice expression, represents an expression part.
-
etype
¶ int([x]) -> integer int(x, base=10) -> integer
Convert a number or string to an integer, or return 0 if no arguments are given. If x is a number, return x.__int__(). For floating point numbers, this truncates towards zero.
If x is not a number or if base is given, then x must be a string, bytes, or bytearray instance representing an integer literal in the given base. The literal can be preceded by ‘+’ or ‘-‘ and be surrounded by whitespace. The base defaults to 10. Valid bases are 0 and 2-36. Base 0 means to interpret the base from the string as an integer literal. >>> int(‘0b100’, base=0) 4
-
-
cas.expressions.
oper
(opsym, l, r=None)[source]¶ wrapper of the operator expression that detects unary operations
-
class
cas.expressions.
op
(op, l, r)[source]¶ op holds binary integer arithmetic and bitwise logic expressions
-
op
¶ binary operator
Type: _operator
-
-
class
cas.expressions.
uop
(op, r)[source]¶ uop holds unary integer arithmetic and bitwise logic expressions
-
op
¶ unary operator
Type: _operator
-
-
cas.expressions.
eqn2_helpers
(e, bitslice=False, widening=False)[source]¶ helpers for simplifying binary expressions
-
class
cas.expressions.
vec
(l=None)[source]¶ vec holds a list of expressions each being a possible representation of the current expression. A vec object is obtained by merging several execution paths using the merge function in the mapper module. The simplify method uses the complexity measure to eventually “reduce” the expression to top with a hard-limit currently set to op.threshold.
cas/smt.py¶
The smt module defines the amoco interface to the SMT solver.
Currently, only z3 is supported. This module allows to translate
any amoco expression into its z3 equivalent formula, as well as
getting the z3 solver results back as cas.mapper.mapper
instances.
-
cas.smt.
newvar
(pfx, e, slv)[source]¶ return a new z3 BitVec of size e.size, with name prefixed by slv argument
cas/mapper.py¶
The mapper module essentially implements the mapper
class
and the associated merge()
function which allows to get a
symbolic representation of the union of two mappers.
-
class
cas.mapper.
mapper
(instrlist=None, csi=None)[source]¶ A mapper is a symbolic functional representation of the execution of a set of instructions.
Parameters: - instrlist (list[instruction]) – a list of instructions that are symbolically executed within the mapper.
- csi (Optional[object]) – the optional csi attribute that provide a concrete initial state
-
__map
¶ is an ordered list of mappings of expressions associated with a location (register or memory pointer). The order is relevant only to reflect the order of write-to-memory instructions in case of pointer aliasing.
-
__Mem
¶ is a memory model where symbolic memory pointers are addressing separated memory zones. See MemoryMap and MemoryZone classes.
-
conds
¶ is the list of conditions that must be True for the mapper
-
csi
¶ is the optional interface to a concrete state
-
conds
-
csi
-
view
¶
-
mmap
¶ get the local
MemoryMap
associated to the mapper
-
aliasing
(k)[source]¶ check if location k is possibly aliased in the mapper: i.e. the mapper writes to some other symbolic location expression after writing to k which might overlap with k.
-
eval
(m)[source]¶ return a new mapper instance where all input locations have been replaced by there corresponding values in m.
-
rcompose
(m)[source]¶ composition operator returns a new mapper corresponding to function x -> self(m(x))
-
use
(*args, **kargs)[source]¶ return a new mapper corresponding to the evaluation of the current mapper where all key symbols found in kargs are replaced by their values in all expressions. The kargs “size=value” allows for adjusting symbols/values sizes for all arguments. if kargs is empty, a copy of the result is just a copy of current mapper.