As people have mentioned before, LLaMA is very slow due to its huge size. So why not try a different model, for example phi3 from Microsoft? For day-to-day use, LLaMA and phi3 won't differ much, but phi3 is lightweight (3.8B parameters) and much quicker than LLaMA. If we want to make this usable, LLaMA won't be a good choice.

Try `ollama run phi3` and you'll get an idea.
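A minimal sketch of trying phi3 locally with the Ollama CLI (assumes `ollama` is already installed; the prompt is just an illustration):

```shell
# Check for the ollama CLI before using it, so the script degrades gracefully.
if command -v ollama >/dev/null 2>&1; then
  ollama pull phi3   # downloads the quantized 3.8B phi3 model
  # One-shot prompt; drop the argument for an interactive chat session.
  ollama run phi3 "Summarize what a hash map is in two sentences."
else
  echo "ollama not found - install it from https://ollama.com first"
fi
```

The same `pull`/`run` workflow works for any model tag Ollama hosts, so it's easy to benchmark phi3 against a LLaMA variant side by side.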