Could the algorithm be used with DDPG? And if the action space is continuous, how should the loss function of the forward model be calculated? Thanks.