I am using a 16-core CPU on the same document, with the same local model as in the GitHub repo. How can I get the output faster?