Note: You have to read this using VSCode or some other latex-enabled markdown viewer. GitHub does not render the latex equations correctly.
A positive integer is called a *dimension*. A *shape* is a tuple of dimensions; the set of all shapes is denoted by $\mathbb{N}^+_\text{shape} = \mathbb{N}^+ \times \mathbb{N}^+ \times \cdots \times \mathbb{N}^+$. For a shape $s = (d_1, d_2, \ldots, d_r)$, a tensor $T$ of shape $s$ assigns an element $t_i$ to each index $i = (i_1, \ldots, i_r)$ in its domain $\text{dom}(T) = \{0, \ldots, d_1 - 1\} \times \cdots \times \{0, \ldots, d_r - 1\}$.
Suppose you want to raise the elements of a tensor $T$ to the power $a$ and scale the result by $b$; that is, you want to compute the tensor $U$ with elements $u_i = b \cdot t_i^a$ for each index $i \in \text{dom}(T)$, where `.pow(a, b)` is the method that will implement this operation.
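For example, with $a = 2$, $b = 3$, and $T = (1, 2, 3)$, the result is $U = (3 \cdot 1^2,\ 3 \cdot 2^2,\ 3 \cdot 3^2) = (3, 12, 27)$.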
You have two options to implement this in ChAI: the easy way (less performant) and the preferred way (more performant).
In either case, you will be implementing at least three instance methods, one each for `ndarray`, `staticTensor`, and `dynamicTensor`:
In `lib/NDArray.chpl`:

```chapel
proc ndarray.pow(a: eltType, b: eltType): ndarray(rank,eltType) { ... }
```

In `lib/StaticTensor.chpl`:

```chapel
proc staticTensor.pow(a: eltType, b: eltType): staticTensor(rank,eltType) { ... }
```

In `lib/DynamicTensor.chpl`:

```chapel
proc dynamicTensor.pow(a: eltType, b: eltType): dynamicTensor(eltType) { ... }
```
The preferred way to implement the power operation is to write the numerically efficient element-wise kernel entirely in `lib/NDArray.chpl` as the function:
```chapel
proc ndarray.pow(a: eltType, b: eltType): ndarray(rank,eltType) {
    const dom = this.domain;      // dom(T)
    var u = new ndarray(dom,eltType);
    const ref tData = this.data;  // T
    ref uData = u.data;           // U
    forall i in dom.every() {
        const ref ti = tData[i];  // t_i
        uData[i] = b * (ti ** a); // u_i = b * t_i^a
    }
    return u;
}
```
The `**` operator is the exponentiation operator in Chapel. The `forall` loop is a parallel loop that is executed in parallel on the CPU (or GPU, if available). Each iteration computes a single element $u_i = b \cdot t_i^a$, and since each $u_i$ depends only on $t_i$, the iterations are independent and safe to run in parallel.
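If you are new to Chapel, the following standalone sketch (plain Chapel with ordinary arrays, not ChAI code) illustrates the same `forall`/`**` pattern:

```chapel
// Element-wise u[i] = b * t[i] ** a over a plain Chapel array.
const t = [1.0, 2.0, 3.0, 4.0, 5.0];
var u: [t.domain] real;
const a = 2.0, b = 3.0;

forall i in t.domain do
  u[i] = b * (t[i] ** a);  // each u[i] depends only on t[i], so iterations are independent

writeln(u);  // 3.0 12.0 27.0 48.0 75.0
```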
Next, you will need to implement the `pow` method for `staticTensor` and `dynamicTensor` in `lib/StaticTensor.chpl` and `lib/DynamicTensor.chpl`, respectively. To do this, you need to create a representation for `pow` within the autograd system. This is done by creating a new `powOp` record in `lib/Autograd.chpl`:
```chapel
record powOp : serializable {
    var input: shared BaseTensorResource(?);
    var a: input.eltType;
    var b: input.eltType;

    proc children do return (input,);

    proc forward() do
        return input.array.pow(a, b);

    proc backward(grad: ndarray(?rank,?eltType)): ndarray(rank,eltType)
            where rank == input.rank
               && eltType == input.eltType {
        const ref dLdU = grad.data;      // dL/du_i for each index i
        const ref T = input.array.data;  // the input tensor's elements t_i
        const dom = input.array.domain;
        var dLdT = new ndarray(dom, eltType);
        // const aMinusOne = a - 1;
        // const abProd = a * b;
        forall i in dom.every() {
            const ref ti = T[i];
            const ref dldui = dLdU[i];
            // const duidti = abProd * (ti ** aMinusOne);
            const duidti = (a * b) * (ti ** (a - 1)); // du_i/dt_i = a * b * t_i^(a-1)
            dLdT[i] = dldui * duidti;                 // dL/dt_i = dL/du_i * du_i/dt_i
        }
        return dLdT;
    }

    proc spec : GradOpSpec do
        return new dict(
            ("operation","Pow"),
            ("a",a),
            ("b",b)
        );
}
```
The `powOp` record contains the input tensor and the scalars `a` and `b`. The `forward` method computes the forward pass of the power operation, while the `backward` method computes the backward pass. The `spec` method returns a dictionary that contains the operation name and the values of `a` and `b`.
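For example, with $a = 2.0$ and $b = 3.0$, the spec conceptually describes the node with the following key/value pairs (shown informally here, not as ChAI's actual serialized output):

```
operation: Pow
a: 2.0
b: 3.0
```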
The backward pass is computed via the chain rule, where `grad` is the gradient $\frac{\partial L}{\partial U}$ of the loss $L$ with respect to the output $U$, passed to `powOp.backward`. We are then to compute $\frac{\partial L}{\partial T}$, so the hard part is to find $\frac{\partial L}{\partial t_i}$ for each index $i$. Since $L$ depends on $T$ only through $U$, the chain rule gives

$$\frac{\partial L}{\partial t_i} = \sum_{j} \frac{\partial L}{\partial u_j} \frac{\partial u_j}{\partial t_i}.$$

Then since each element $u_j = b \cdot t_j^a$ depends only on $t_j$, we have $\frac{\partial u_j}{\partial t_i} = 0$ whenever $j \neq i$. Therefore, we can write

$$\frac{\partial L}{\partial t_i} = \frac{\partial L}{\partial u_i} \frac{\partial u_i}{\partial t_i}.$$

The derivative of $u_i = b \cdot t_i^a$ with respect to $t_i$ is

$$\frac{\partial u_i}{\partial t_i} = a \cdot b \cdot t_i^{a-1},$$

so we have

$$\frac{\partial L}{\partial t_i} = \frac{\partial L}{\partial u_i} \cdot a \cdot b \cdot t_i^{a-1},$$

which is exactly what the `forall` loop in `powOp.backward` computes element by element.
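As a quick sanity check of the formula, the following standalone sketch (plain Chapel, not ChAI code) compares the analytic derivative $a \cdot b \cdot t^{a-1}$ against a finite-difference approximation of $u(t) = b \cdot t^a$ at a single point:

```chapel
// Compare the analytic derivative du/dt = a*b*t**(a-1) with a finite difference.
const a = 2.0, b = 3.0, t = 2.0, h = 1e-6;

const analytic = (a * b) * (t ** (a - 1));                 // 12.0
const numeric  = (b * ((t + h) ** a) - b * (t ** a)) / h;  // ~12.000003

writeln("analytic = ", analytic, ", finite difference = ", numeric);
```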
Finally, you need to add the `pow` method to the `staticTensor` and `dynamicTensor` records. In `lib/StaticTensor.chpl`:
```chapel
proc staticTensor.pow(a: eltType, b: eltType): staticTensor(rank,eltType) {
    const ctx = new powOp(meta,a,b);
    return tensorFromCtx(rank,eltType,ctx);
}
```
and in `lib/DynamicTensor.chpl`:
```chapel
proc dynamicTensor.pow(a: eltType, b: eltType): dynamicTensor(eltType) {
    for param rank in 1..maxRank {
        if this.checkRank(rank) then
            return this.forceRank(rank).pow(a,b).eraseRank();
    }
    halt("Could not determine rank in dynamicTensor.pow.");
    return new dynamicTensor(eltType);
}
```
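The `for param rank in 1..maxRank` loop is unrolled at compile time, so each iteration sees `rank` as a compile-time constant. The following standalone sketch (plain Chapel with hypothetical names, not ChAI code) illustrates the same dispatch idiom:

```chapel
// Compile-time rank dispatch: the `for param` loop is unrolled by the compiler,
// so `rank` is a param inside each unrolled branch.
param maxRank = 3;

proc describe(param rank: int) {
  writeln("handling rank ", rank);
}

proc dispatch(runtimeRank: int) {
  for param rank in 1..maxRank {
    if runtimeRank == rank {
      describe(rank);  // `rank` is known at compile time here
      return;
    }
  }
  halt("Could not determine rank in dispatch.");
}

dispatch(2);  // prints: handling rank 2
```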