karminski · March 2, 2025 02:56
diff --git a/gistfile1.txt b/gistfile1.txt
 (anemll-env) (base) karminski@kurumi Anemll % python ./tests/chat_full.py --meta /Volumes/WORK_2/works/huggingface.co/anemll/anemll-DeepHermes-3-Llama-3-8B-Preview-ctx1024_0.1.1/meta.yaml --d /Volumes/WORK_2/works/huggingface.co/anemll/anemll-DeepHermes-3-Llama-3-8B-Preview-ctx1024_0.1.1

 Loaded parameters from /Volumes/WORK_2/works/huggingface.co/anemll/anemll-DeepHermes-3-Llama-3-8B-Preview-ctx1024_0.1.1/meta.yaml:
  Context Length: 1024
  Batch Size: 64
  Num Chunks: 8
  Models Directory: /Volumes/WORK_2/works/huggingface.co/anemll/anemll-DeepHermes-3-Llama-3-8B-Preview-ctx1024_0.1.1
  Embeddings: DeepHermes_embeddings
  LM Head: DeepHermes_lm_head_lut6
  FFN: DeepHermes_FFN_PF_lut6_chunk_01of08

 Using model directory: /Volumes/WORK_2/works/huggingface.co/anemll/anemll-DeepHermes-3-Llama-3-8B-Preview-ctx1024_0.1.1
 Context length: 1024
 Using tokenizer path: /Volumes/WORK_2/works/huggingface.co/anemll/anemll-DeepHermes-3-Llama-3-8B-Preview-ctx1024_0.1.1

 Loading models...

 Loading embeddings model...
 Found model at: /Volumes/WORK_2/works/huggingface.co/anemll/anemll-DeepHermes-3-Llama-3-8B-Preview-ctx1024_0.1.1/DeepHermes_embeddings.mlmodelc
 Loading from: /Volumes/WORK_2/works/huggingface.co/anemll/anemll-DeepHermes-3-Llama-3-8B-Preview-ctx1024_0.1.1/DeepHermes_embeddings.mlmodelc
 Embeddings model loaded successfully

 Warning: No metadata found in model

 Using parameters:
  Context Length: 1024
  State Length: 1024
  Prefill Batch Size: 64
  LUT Bits: 4
  Number of Chunks: 8

 Overriding batch size from args: 64

 Overriding num chunks from args: 8

 Loading LM head model...
 Found model at: /Volumes/WORK_2/works/huggingface.co/anemll/anemll-DeepHermes-3-Llama-3-8B-Preview-ctx1024_0.1.1/DeepHermes_lm_head_lut6.mlmodelc
 Loading from: /Volumes/WORK_2/works/huggingface.co/anemll/anemll-DeepHermes-3-Llama-3-8B-Preview-ctx1024_0.1.1/DeepHermes_lm_head_lut6.mlmodelc
 LM head model loaded successfully

 Loading FFN+PREFILL model(s)...
 Found model at: /Volumes/WORK_2/works/huggingface.co/anemll/anemll-DeepHermes-3-Llama-3-8B-Preview-ctx1024_0.1.1/DeepHermes_FFN_PF_lut6_chunk_01of08.mlmodelc

 Detected chunked FFN+PREFILL model (8 chunks)

 Loading FFN+PREFILL chunk: DeepHermes_FFN_PF_lut6_chunk_01of08.mlmodelc
 --------------------------------
 Loading compiled model with function name: infer
 Loading compiled model with path: /Volumes/WORK_2/works/huggingface.co/anemll/anemll-DeepHermes-3-Llama-3-8B-Preview-ctx1024_0.1.1/DeepHermes_FFN_PF_lut6_chunk_01of08.mlmodelc
 Loading compiled model with compute_unit: ComputeUnit.CPU_ONLY
 Error loading model from /Volumes/WORK_2/works/huggingface.co/anemll/anemll-DeepHermes-3-Llama-3-8B-Preview-ctx1024_0.1.1/DeepHermes_FFN_PF_lut6_chunk_01of08.mlmodelc:
 Error details: {
    NSLocalizedDescription = "This MLModel doesn't support the multi-function description sytnax.";
 }
 Error loading chunk /Volumes/WORK_2/works/huggingface.co/anemll/anemll-DeepHermes-3-Llama-3-8B-Preview-ctx1024_0.1.1/DeepHermes_FFN_PF_lut6_chunk_01of08.mlmodelc: {
    NSLocalizedDescription = "This MLModel doesn't support the multi-function description sytnax.";
 }

 Error loading models: {
    NSLocalizedDescription = "This MLModel doesn't support the multi-function description sytnax.";
 }

 Please ensure all model files exist and are accessible.
 Expected files:
  Embeddings: /Volumes/WORK_2/works/huggingface.co/anemll/anemll-DeepHermes-3-Llama-3-8B-Preview-ctx1024_0.1.1/DeepHermes_embeddings
  LM Head: /Volumes/WORK_2/works/huggingface.co/anemll/anemll-DeepHermes-3-Llama-3-8B-Preview-ctx1024_0.1.1/DeepHermes_lm_head_lut6
  FFN: /Volumes/WORK_2/works/huggingface.co/anemll/anemll-DeepHermes-3-Llama-3-8B-Preview-ctx1024_0.1.1/DeepHermes_FFN_PF_lut6_chunk_01of08

 Error: {
    NSLocalizedDescription = "This MLModel doesn't support the multi-function description sytnax.";
 }
 Traceback (most recent call last):
  File "/Volumes/WORK_2/works/apps/Anemll/./tests/chat_full.py", line 999, in main
    embed_model, ffn_models, lmhead_model, metadata = load_models(args, metadata)
  File "/Volumes/WORK_2/works/apps/Anemll/./tests/chat_full.py", line 450, in load_models
    "infer": load_model(chunk_path, function_name="infer"),
  File "/Volumes/WORK_2/works/apps/Anemll/./tests/chat_full.py", line 187, in load_model
    return ct.models.CompiledMLModel(str(path), compute_unit)
  File "/Volumes/WORK_2/works/apps/Anemll/anemll-env/lib/python3.10/site-packages/coremltools/models/_compiled_model.py", line 127, in __init__
    self._proxy = _MLModelProxy(
 RuntimeError: {
    NSLocalizedDescription = "This MLModel doesn't support the multi-function description sytnax.";
 }
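The log's "Please ensure all model files exist and are accessible" hint can be checked before loading anything. The sketch below is a hypothetical pre-flight helper, not part of `chat_full.py`: the names `expected_chunk_dirs` and `missing_chunks` are made up for illustration, and it assumes every chunk follows the `..._chunk_01of08` naming seen in the log, with a `.mlmodelc` suffix. It only rules out missing files. In the run above the first chunk was found, and the error was raised by Core ML during loading, so a check like this would not diagnose it.

```python
import re
from pathlib import Path

# Matches names such as "DeepHermes_FFN_PF_lut6_chunk_01of08" (pattern assumed from the log).
CHUNK_RE = re.compile(r"^(?P<prefix>.+_chunk_)(?P<index>\d+)of(?P<total>\d+)$")


def expected_chunk_dirs(models_dir, first_chunk_name, suffix=".mlmodelc"):
    """Expand the first chunk name into the paths of all chunk bundles."""
    match = CHUNK_RE.match(first_chunk_name)
    if match is None:
        # Not a chunked name: a single compiled bundle is expected.
        return [Path(models_dir) / (first_chunk_name + suffix)]

    total_text = match.group("total")       # e.g. "08", kept verbatim for the file name
    width = len(match.group("index"))       # zero-padding width of the chunk index
    return [
        Path(models_dir) / f"{match.group('prefix')}{i:0{width}d}of{total_text}{suffix}"
        for i in range(1, int(total_text) + 1)
    ]


def missing_chunks(models_dir, first_chunk_name):
    """Return the expected chunk paths that do not exist on disk."""
    return [p for p in expected_chunk_dirs(models_dir, first_chunk_name) if not p.exists()]
```

Calling `missing_chunks(models_dir, "DeepHermes_FFN_PF_lut6_chunk_01of08")` with the models directory from the log would return an empty list if all eight chunk bundles are present.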