Arki05/Grok-1-GGUF · Typo in Q2 metadatas?

Mar 23

I was trying to load this in llama.cpp after patching in your PR, but it failed to load.
So I tried running gguf-split --merge (even though in theory it shouldn't be needed anymore) and noticed it was searching for "grok-1-Q2_K-spli-00002-of-00009.gguf" instead of "grok-1-Q2_K-split-00002-of-00009.gguf" (the "t" is missing in "split"), and so on.

Not a big deal actually; it still works after renaming the files.

Arki05

Owner Mar 23

I've reproduced the issue you encountered. It appears to be a minor regression introduced by a recent llama.cpp PR, "llama_model_loader: support multiple split/shard GGUFs" (#6187), rather than an issue with the metadata in the quant.

Solution:

There's already a new Pull Request (PR #6192) that corrects the file naming discrepancy. Here's a brief overview of the fix:

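Git Diff: The fix is a small adjustment to the size passed to snprintf in llama_split_prefix, so the prefix is no longer cut one character short, plus a corrected count in a llama_model_loader log message.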

diff --git a/llama.cpp b/llama.cpp
index a98425cc..47ae6faa 100644
--- a/llama.cpp
+++ b/llama.cpp
@@ -2923,7 +2923,7 @@ struct llama_model_loader {
             }
 
             if (trace > 0) {
-                LLAMA_LOG_INFO("%s: loading additional %d GGUFs\n", __func__, n_split);
+                LLAMA_LOG_INFO("%s: loading additional %d GGUFs\n", __func__, n_split -1);
             }
 
             char split_path[PATH_MAX] = {0};
@@ -15149,7 +15149,7 @@ int llama_split_prefix(char * dest, size_t maxlen, const char * split_path, int
     // check if dest ends with postfix
     int size_prefix = str_split_path.size() - str_postfix.size();
     if (size_prefix > 0 && str_split_path.find(str_postfix, size_prefix) != std::string::npos) {
-        snprintf(dest, std::min((size_t) size_prefix, maxlen), "%s", split_path);
+        snprintf(dest, std::min((size_t) size_prefix + 1, maxlen), "%s", split_path);
         return size_prefix;
     }
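
To see why the "+ 1" matters: snprintf writes at most size - 1 characters and then a terminating '\0', so passing size_prefix as the size drops the last character of the prefix. Here is a minimal sketch of that behavior. The helper copy_prefix and its patched flag are hypothetical stand-ins for illustration, not the actual llama.cpp function, and the postfix is passed in by hand rather than derived from the split numbers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the patched snprintf call; NOT the llama.cpp function itself.
// `patched` selects between the old size argument (size_prefix) and the new one (size_prefix + 1).
static std::string copy_prefix(const std::string & split_path, const std::string & postfix, bool patched) {
    char dest[512] = {0};
    const size_t maxlen = sizeof(dest);

    // The path must be longer than the postfix for a prefix to exist.
    assert(split_path.size() > postfix.size());
    const size_t size_prefix = split_path.size() - postfix.size();

    // snprintf writes at most (size - 1) characters, then a terminating '\0'.
    const size_t size = std::min(size_prefix + (patched ? 1 : 0), maxlen);
    snprintf(dest, size, "%s", split_path.c_str());

    return std::string(dest);
}
```

Called with "grok-1-Q2_K-split-00002-of-00009.gguf" and the postfix "-00002-of-00009.gguf", the unpatched variant returns "grok-1-Q2_K-spli" (the truncated name reported above), while the patched variant returns "grok-1-Q2_K-split".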

Quick-Fix Branch: For your convenience, I've created a quick-fix branch in my fork. This branch is rebased onto the current main version of llama.cpp and includes the necessary fix. You can find the quick-fix branch here: Quick-Fix Branch.

Merging the split files should no longer be necessary, and llama.cpp should now work correctly with split files. If you encounter any further issues or have additional feedback, please don't hesitate to reach out.

Arki05 changed discussion status to closed Mar 23