Test the local HF inference engine on the following:

1) Normal small model on a benchmark
2) Big models on 2 GPUs
3) LoRA or similar adapter-based models
4) Small models with long context (e.g. open RAG) on both a single GPU and 2 GPUs

Run each case both as a ccc interactive job and as a non-interactive job (these sometimes require different configs).
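
To keep track of coverage, the cases above can be enumerated as a small test matrix. The following is only a sketch: the scenario labels, the `build_test_matrix` helper, and the GPU counts for cases 1 and 3 (assumed to be a single GPU) are illustrative assumptions, not part of the inference engine or of any existing config.

```python
from itertools import product

# Hypothetical labels for the four cases listed above, paired with a GPU count.
# The GPU counts for the benchmark and LoRA cases are assumptions (1 GPU each);
# the note only fixes the counts for the big-model and long-context cases.
SCENARIOS = [
    ("small_model_benchmark", 1),     # case 1 (GPU count assumed)
    ("big_model", 2),                 # case 2
    ("lora_model", 1),                # case 3 (GPU count assumed)
    ("small_model_long_context", 1),  # case 4, single GPU
    ("small_model_long_context", 2),  # case 4, 2 GPUs
]

# Each case is run both interactively and non-interactively.
JOB_MODES = ["interactive", "non_interactive"]


def build_test_matrix():
    """Return one dict per (scenario, GPU count, job mode) combination."""
    return [
        {"scenario": name, "num_gpus": gpus, "job_mode": mode}
        for (name, gpus), mode in product(SCENARIOS, JOB_MODES)
    ]


if __name__ == "__main__":
    for case in build_test_matrix():
        print(case)
```

Each entry can then be ticked off as the corresponding run passes, with a note on which config it needed.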