This document provides detailed interfaces and semantics of operators of KunQuant.
There are two basic categories for operators of KunQuant:
user ops
that are designed to be directly used by the users of KunQaunt to compose factor expressionsinternal ops
that are defined for internal use of KunQuant that is not recommended used for the end-users
Currently there are several kinds of User ops
:
- Basic ops
- Elementwise ops
- Compositive ops
- Cross Sectional ops
- Miscellaneous ops
We will discuss about them in this section.
This is the base class of all ops. All operators inherit from this class.
Defined at KunQuant.Op
Interfaces
class OpBase:
# the inputs of the op
inputs: List['OpBase']
# the attributes
attrs: OrderedDict[str, object]
def __init__(self, inputs: List['OpBase'], attrs: Union[List[Tuple[str, object]], OrderedDict, None]) -> None:
'''
Construct an Op from a list of inputs and attributes.
'''
pass
def replace_inputs(self, replace_map: Dict['OpBase', 'OpBase']):
'''
Replace ops in self.inputs with the ops in replace_map
'''
pass
def set_parent(self, loop: 'ForeachBackWindow') -> None:
'''
Set parent loop
'''
pass
def get_parent(self) -> 'ForeachBackWindow':
'''
Get parent loop
'''
pass
def to_string(self, indent: int, identity: bool, **kwargs) -> str:
'''
Get the string representation of the op and its dependency ops.
indent: the level of indention
identity: only print related inputs in ForeachBackWindow
'''
pass
def __str__(self) -> str:
'''
Get the string representation of the op and its dependency ops.
'''
pass
def fast_str(self) -> str:
'''
Get the string representation of the op. Don't print the dependency ops.
'''
pass
def hash_hex(self) -> str:
'''
Get the hex hash string for the op and all its dependencies
'''
pass
def verify(self, func: 'KunQuant.Stage.Function') -> None:
'''
Verify the op
'''
pass
OpBase
also overloads + - * / >= > <= < & |
operators, so users can use a + b
or a + 2
to represent Add(a, b)
or Add(a, ConstantOp(2))
ops, if a
and b
are ops.
This op represents a constant value that is known at compile time.
Defined at KunQuant.Op
Interfaces
class ConstantOp(OpBase, GraphSourceTrait):
def __init__(self, v: float) -> None:
pass
This op represents a named input (for example, "open", "high").
Defined at KunQuant.Op
class Input(WindowedDataSourceOp, GraphSourceTrait):
def __init__(self, name: str) -> None:
super().__init__([], [("name", name)])
This op represents a named output (for example, "alpha101"). The input of this op are marked as output of the factor library with the given name.
Defined at KunQuant.Op
class Output(WindowedDataSourceOp, SinkOpTrait):
def __init__(self, inp: OpBase, name: str = "") -> None:
super().__init__([inp], [("name", name)])
Elementwise ops are operations whose an output value only depends on the newest values at time series of the inputs. We further define base classes UnaryElementwiseOp
and BinaryElementwiseOp
for Elementwise ops that have 1 or 2 inputs.
Defined at KunQuant.Op
class UnaryElementwiseOp(OpBase):
def __init__(self, lhs: OpBase, attrs: Union[List[Tuple[str, object]], OrderedDict, None] = None) -> None:
super().__init__([lhs], attrs)
class BinaryElementwiseOp(OpBase):
def __init__(self, lhs: OpBase, rhs: OpBase, attrs: Union[List[Tuple[str, object]], OrderedDict, None] = None) -> None:
super().__init__([lhs, rhs], attrs)
Subclasses Ops of BinaryElementwiseOp
, defined in KunQuant.ops.ElewiseOp
(also can be found in KunQuant.ops
):
- Math ops: Add, Sub, Mul, Div, Or, And, Min, Max
- Comparison ops: GreaterThan, GreaterEqual, LessThan, LessEqual, Equals
Subclasses Ops of UnaryElementwiseOp
, defined in KunQuant.ops.ElewiseOp
(also can be found in KunQuant.ops
):
- Math ops: Sqrt (square root), Log (logarithm to the base of the mathematical constant e), Abs (absolute value), Sign (+1/0/-1 for positive/zero/negative inputs), Not (logical not), Exp (exponential function to the base of the mathematical constant e)
- SetInfOrNanToValue (If input is inf or Nan, set output to a given constant value)
KunQuant also supports Select
op, defined in KunQuant.ops.ElewiseOp
(also can be found in KunQuant.ops
). It accepts a boolean predicate cond
and two values true_v
and false_v
. When a value of cond
is true, the output of the Select
op will be the value of true_v
. Otherwise, the output of the op will be false_v
.
class Select(OpBase):
def __init__(self, cond: OpBase, true_v: OpBase, false_v: OpBase) -> None:
pass
There are some deprecated ops defined in KunQuant.ElewiseOp
, which are based on BinaryConstOp
, to represent a "binary elementwise" op, whose one of the input is constant. Users should avoid using these ops, and use subclasses of BinaryElementwiseOp
and ConstantOp
instead.
Interface of BinaryConstOp
class BinaryConstOp(UnaryElementwiseOp):
def __init__(self, inp: OpBase, v: float, swap: bool = False) -> None:
'''
Deprecated. The base class binary ops whose one of the input is constant.
inp: the input
v: the constant input
swap: if false, the constant `v` is on the right hand side. Otherwise, swap the two inputs
'''
pass
Subclasses of it AddConst
, SubConst
, MulConst
, DivConst
, GreaterThanConst
, LessThanConst
.
This kind of ops are complex ops that can be further decomposed into smaller ops. They are defined in KunQuant.ops.CompOp
and can also be imported in KunQuant.ops
.
Most of these ops are windowed ops, whose output depends on a window of previous data in time series.
Single operand windowed ops:
class WindowedSum(WindowedReduce):
'''
Sum of a rolling look back window, including the current newest data.
For indices < window-1, the output will be NaN
similar to pandas.DataFrame.rolling(n).sum()
'''
class WindowedProduct(WindowedReduce):
'''
Product of a rolling look back window, including the current newest data.
For indices < window-1, the output will be NaN
similar to pandas.DataFrame.rolling(n).product()
'''
class WindowedMin(WindowedReduce):
'''
Min of a rolling look back window, including the current newest data.
For indices < window-1, the output will be NaN
similar to pandas.DataFrame.rolling(n).min()
'''
class WindowedMax(WindowedReduce):
'''
Max of a rolling look back window, including the current newest data.
For indices < window-1, the output will be NaN
similar to pandas.DataFrame.rolling(n).max()
'''
class WindowedAvg(WindowedCompositiveOp):
'''
Average of a rolling look back window, including the current newest data.
For indices < window-1, the output will be NaN
similar to pandas.DataFrame.rolling(n).mean()
'''
class WindowedStddev(WindowedCompositiveOp):
'''
Unbiased standard deviation of a rolling look back window, including the current newest data.
For indices < window-1, the output will be NaN
similar to pandas.DataFrame.rolling(n).std()
'''
class TsArgMax(WindowedCompositiveOp):
'''
ArgMax in a rolling look back window, including the current newest data.
The result should be the index of the max element in the rolling window. The index of the oldest element of the rolling window is 1.
Similar to df.rolling(window).apply(np.argmax) + 1
'''
class TsArgMin(WindowedCompositiveOp):
'''
ArgMin in a rolling look back window, including the current newest data.
The result should be the index of the min element in the rolling window. The index of the oldest element of the rolling window is 1.
Similar to df.rolling(window).apply(np.argmin) + 1
'''
class TsRank(WindowedCompositiveOp):
'''
Time series rank of the newest data in a rolling look back window, including the current newest data.
Let num_values_less = the number of values in rolling window that is less than the current newest data.
Let num_values_eq = the number of values in rolling window that is equal to the current newest data.
rank = num_values_less + (num_values_eq + 1) / 2
Similar to df.rolling(window).rank()
'''
class DecayLinear(WindowedCompositiveOp):
'''
Weighted average in a rolling look back window, including the current newest data.
The weight decreases linearly and the newer value has the higher weight.
step_size = 1.0 / ((1.0 + window) * window / 2)
weight[i] = (i+1) * step_size
DecayLinear = sum([weight[i] for i in 0 to window])
'''
class WindowedLinearRegressionRSqaure(WindowedLinearRegressionBase):
'''
Rsquare value of windowed linear regression. Implementation see
https://github.com/microsoft/qlib/blob/a7d5a9b500de5df053e32abf00f6a679546636eb/qlib/data/_libs/rolling.pyx#L137
'''
class WindowedLinearRegressionSlope(WindowedLinearRegressionBase):
'''
Slope value of windowed linear regression. Implementation see
https://github.com/microsoft/qlib/blob/a7d5a9b500de5df053e32abf00f6a679546636eb/qlib/data/_libs/rolling.pyx#L49
'''
class WindowedLinearRegressionResi(WindowedLinearRegressionBase):
'''
Residuals of windowed linear regression. Implementation see
https://github.com/microsoft/qlib/blob/a7d5a9b500de5df053e32abf00f6a679546636eb/qlib/data/_libs/rolling.pyx#L91
'''
Multiple operands windowed ops:
class WindowedCovariance(WindowedCompositiveOp):
'''
Unbiased estimated covariance of a rolling look back window of two inputs, including the current newest data.
For indices < window-1, the output will be NaN
similar to pandas.DataFrame.rolling(n).cov(y)
'''
class WindowedCorrelation(WindowedCompositiveOp):
'''
Correlation of a rolling look back window of two inputs, including the current newest data.
For indices < window-1, the output will be NaN
similar to pandas.DataFrame.rolling(n).corr(y)
'''
Note that the above to construct above ops, users should write the code like WindowedCorrelation(input1, window, input2)
(note the order of the operands).
Elementwise ops
class Clip(CompositiveOp):
'''
Elementwisely clip the input value `v` with a given positive constant `eps`:
Clip(v) = max(min(v, eps), -eps)
The output will be between [-eps, +eps]
'''
class Pow(CompositiveOp):
'''
elementwise math function power: base ** expo
'''
Most ops in KunQuant are time-series ops. That is, the output of these ops only depends on the current or previous data in time-series of the same stock. However, KunQuant also supports Cross Sectional ops, that output of these ops depends on the data across different stocks at the same time.
Defined at KunQuant.Op
class Rank(CrossSectionalOp):
'''
the cross sectional rank among different stocks. Between [0, 1]
Similar to df.rank(axis=1, pct=True, method="average")
'''
pass
class Scale(CrossSectionalOp):
'''
the cross sectionally scale different stocks, to make sum([abs(stock[i]) for i in stock]) == 1
Similar to df.div(df.abs().sum(axis=1), axis=0)
'''
pass
class GenericCrossSectionalOp(CrossSectionalOp):
'''
Cross sectional op with customized C++ implementation.
generate_body() should return a C++ source code string. The C++ code should iterate on the
stocks at the same point of "time" to compute the output.
'''
def generate_body(self) -> str:
'''
Predefined types and variables:
`T`: the datatype, float or double
`num_stocks`: the number of stocks
`input_{N}`: the array-like accessor for the input data at the current point of time. `input_0` should be the first input
To access the data of a stock, use `input_X[i]` for `i` in 0 to `num_stocks`
`output_0`: the array-like accessor for the output data
'''
raise NotImplementedError("GenericCrossSectionalOp must be specialized")
def generate_head(self) -> str:
'''
Predefined types and variables:
`T`: the datatype, float or double
`num_stocks`: the number of stocks
'''
raise NotImplementedError("GenericCrossSectionalOp must be specialized")
You can define your own cross-sectional operator by injecting C++ code via GenericCrossSectionalOp
. DiffWithWeightedSum
is an example for defining a custom cross-sectional operator by extending GenericCrossSectionalOp
.
Defined at KunQuant.Op.MiscOp
(also found in KunQuant.Op
)
class BackRef(OpBase, WindowedTrait):
'''
Gets the data in `window` rows ago
'''
def __init__(self, v: OpBase, window: int) -> None:
pass
class WindowedQuantile(OpBase, WindowedTrait):
'''
Quantile in `window` rows ago.
Similar to pd.rolling(window).quantile(q, interpolation='linear')
'''
def __init__(self, v: OpBase, window: int, q: float) -> None:
pass
class ExpMovingAvg(OpBase, GloablStatefulOpTrait):
'''
Exponential Moving Average (EMA)
Similar to pd.DataFrame.ewm(span=window, adjust=False, ignore_na=True).mean()
'''
def __init__(self, v: OpBase, window: int) -> None:
pass
class DiffWithWeightedSum(GenericCrossSectionalOp):
'''
Compute cross sectional weighted sum (of all stocks) and compute the difference
of each stock data and the sum. Similar to numpy code:
v2 = np.sum(v * w, axis=1)
result = v - v2.reshape((-1, 1))
'''
def __init__(self, v: OpBase, w: OpBase) -> None:
super().__init__([v, w], None)
This section is for internal development of KunQuant.
A trait is marked on class of ops to indicate some special properties of the ops.
Traits defined in KunQuant.Op
:
class GraphSourceTrait:
'''
The "source" of a graph, like input and constant ops. They have no inputs.
'''
pass
class WindowedDataSourceOp(OpBase):
'''
The ops that can be an input of WindowedTrait. It provides a window of data
'''
pass
class SinkOpTrait:
'''
The "sink" of a graph, like "output" op. Should keep ops extending this class even if no reference to these ops
'''
pass
class WindowedTrait:
'''
The ops that require a window of inputs of previous data.
'''
def verify(self, func: 'KunQuant.Stage.Function') -> None:
pass
def required_input_window(self) -> int:
return self.attrs["window"]
class StatefulOpTrait:
'''
The ops that have an internal state
'''
pass
class CrossSectionalOp(OpBase):
def __init__(self, v: OpBase) -> None:
pass
Ops defined in KunQuant.Op
:
class WindowedTempOutput(WindowedDataSourceOp):
'''
Mark that we need a windowed buffer of previous data of the input
'''
def __init__(self, inp: OpBase, window: int) -> None:
super().__init__([inp], [("window", window)])
class ForeachBackWindow(OpBase, WindowedTrait):
'''
A for-loop to iterate the input ops (must be windowed inputs) and reduce outputs
inp: A windowed input
window: for-loop length in window size
args: optional other windowed inputs to iterate
'''
def __init__(self, inp: WindowedTrait, window: int, *args) -> None:
pass
def print_args(self, indent: int, identity: bool, **kwargs) -> str:
pass
class IterValue(OpBase):
'''
Gets the current iteration value of the ForeachBackWindow
loop: the loop
src: the specific input of the loop to iterate. For example,
itr = ForeachBackWindow(X, window = 10, Y)
xItr = IterValue(itr, X) # the current value of X in the window in this iteration
yItr = IterValue(itr, Y) # the current value of Y in the window in this iteration
'''
def __init__(self, loop: ForeachBackWindow, src: OpBase) -> None:
pass
def print_args(self, indent: int, identity: bool, **kwargs) -> str:
pass
def verify(self, func: 'KunQuant.Stage.Function') -> None:
pass
class ReductionOp(OpBase, StatefulOpTrait):
'''
Base class of all reduction ops. A reduction op takes inputs that is originated from a IterValue. The input must be in a loop (v.get_parent() is a loop). The data produced
by a ReductionOp should be used outside of the loop
'''
def __init__(self, v: OpBase, init_val: OpBase = None, attrs: Union[List[Tuple[str, object]], OrderedDict, None] = None) -> None:
pass
def verify(self, func: 'KunQuant.Stage.Function') -> None:
pass
Ops defined in KunQuant.ops.ReduceOp
:
class ReduceAdd(ReductionOp):
pass
class ReduceMul(ReductionOp):
pass
class ReduceMin(ReductionOp):
pass
class ReduceMax(ReductionOp):
pass
class ReduceArgMax(ReductionOp):
pass
class ReduceArgMin(ReductionOp):
pass
class ReduceRank(ReductionOp):
pass
class ReduceDecayLinear(ReductionOp):
pass
Ops defined in KunQuant.ops.MiscOp
:
class FastWindowedSum(OpBase, WindowedTrait, GloablStatefulOpTrait):
'''
Fast sum for windowed sum without reduction loop
'''
def __init__(self, v: OpBase, window: int) -> None:
pass
def required_input_window(self) -> int:
pass
class WindowedLinearRegression(OpBase, WindowedTrait, GloablStatefulOpTrait):
'''
Compute states of Windowed Linear Regression
'''
def __init__(self, v: OpBase, window: int) -> None:
pass
def required_input_window(self) -> int:
pass
class WindowedLinearRegressionImplBase(OpBase):
def __init__(self, v: OpBase) -> None:
pass
def verify(self, func: 'KunQuant.Stage.Function') -> None:
pass
class WindowedLinearRegressionRSqaureImpl(OpBase):
'''
Compute RSqaure of Windowed Linear Regression
'''
pass
class WindowedLinearRegressionSlopeImpl(OpBase):
'''
Compute RSqaure of Windowed Linear Regression
'''
pass
class WindowedLinearRegressionResiImpl(OpBase):
'''
Compute RSqaure of Windowed Linear Regression
'''
pass