Creative Writing Control Vectors Integration for ExLlamaV2

This project provides a wrapper that integrates jukofyork's creative writing control vectors with ExLlamaV2. ExLlamaV2 does not natively support control vectors, so the wrapper loads control vectors in GGUF format and injects them into the model at inference time to steer text generation.

Overview

Wrapper for using control vectors with ExLlamaV2
Supports loading control vectors from GGUF format
Injects vectors directly into ExLlamaV2 inference
Enables dynamic text generation control
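
As a rough illustration of what "injecting" a control vector usually means, the sketch below adds a scaled per-layer direction to a hidden state. It shows the general technique only and is not this wrapper's actual code: the function name apply_control_vector, the dict-of-lists layout, and the use of plain Python lists instead of tensors are all assumptions made for the example.

```python
from typing import Dict, List


def apply_control_vector(
    hidden: List[float],
    layer_idx: int,
    vectors: Dict[int, List[float]],
    strength: float,
) -> List[float]:
    """Return hidden + strength * direction for the given layer.

    `vectors` maps a layer index to that layer's direction (hypothetical layout).
    Layers without an entry are returned unchanged.
    """
    direction = vectors.get(layer_idx)
    if direction is None:
        return list(hidden)
    if len(direction) != len(hidden):
        raise ValueError("control vector and hidden state differ in size")
    # Element-wise addition of the scaled direction to the hidden state.
    return [h + strength * d for h, d in zip(hidden, direction)]


if __name__ == "__main__":
    layer_vectors = {0: [0.5, -1.0]}  # toy 2-dimensional example
    print(apply_control_vector([1.0, 2.0], 0, layer_vectors, 0.5))
    print(apply_control_vector([1.0, 2.0], 1, layer_vectors, 0.5))
```

A strength of 0 leaves the hidden state untouched, and a negative strength pushes in the opposite direction.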

Usage

Download a model in ExLlamaV2 format.
Create a directory next to the model directory, named after it with a "-vectors" suffix (for example, Meta-Llama-3-70B-Instruct-8bpw-vectors).
Download the control vectors from jukofyork's repository and place them in the "-vectors" directory.
Run inference with the --control_vectors (-vc) parameter.

Example command:

python test_inference.py -m Meta-Llama-3-70B-Instruct-8bpw \
  -p "<prompt>" \
  --control_vectors language:simple:0.5,optimism:optimism:0.5
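
Judging from the example above, the --control_vectors value appears to be a comma-separated list of three-part entries separated by colons. The sketch below parses such a string. The field names category, direction and strength are an interpretation of the example, not names taken from the wrapper, and parse_control_vectors is a hypothetical helper.

```python
from typing import List, NamedTuple


class VectorSpec(NamedTuple):
    category: str    # assumed meaning of the first field, e.g. "language"
    direction: str   # assumed meaning of the second field, e.g. "simple"
    strength: float  # assumed meaning of the third field, e.g. 0.5


def parse_control_vectors(arg: str) -> List[VectorSpec]:
    """Parse 'a:b:0.5,c:d:0.25' into a list of VectorSpec entries."""
    specs = []
    for item in arg.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas
        parts = item.split(":")
        if len(parts) != 3:
            raise ValueError(f"expected three colon-separated fields, got {item!r}")
        category, direction, strength = parts
        specs.append(VectorSpec(category, direction, float(strength)))
    return specs


if __name__ == "__main__":
    for spec in parse_control_vectors("language:simple:0.5,optimism:optimism:0.5"):
        print(spec)
```

Rejecting malformed entries early keeps a typo in the command line from silently dropping a vector.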

Directory Structure

Ensure your directory structure follows this format to correctly load the control vectors:

models/
  ├── Meta-Llama-3-70B-Instruct-8bpw/
  │   └── model files...
  └── Meta-Llama-3-70B-Instruct-8bpw-vectors/
      ├── llama-3:70b-language__debias.gguf
      ├── llama-3:70b-language__simple.gguf
      ├── llama-3:70b-language__ornate.gguf
      └── ...
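
To check that a layout matches the structure above, a small script can derive the "-vectors" directory from the model path and list the GGUF files in it. This is a sketch under the naming convention shown here. find_vector_files is a hypothetical helper and not part of the wrapper.

```python
from pathlib import Path
from typing import List


def find_vector_files(model_dir: str) -> List[Path]:
    """Return the .gguf files in the sibling '<model>-vectors' directory."""
    model_path = Path(model_dir)
    # Sibling directory with the same name plus a "-vectors" suffix.
    vectors_dir = model_path.with_name(model_path.name + "-vectors")
    if not vectors_dir.is_dir():
        return []
    return sorted(vectors_dir.glob("*.gguf"))


if __name__ == "__main__":
    files = find_vector_files("models/Meta-Llama-3-70B-Instruct-8bpw")
    if not files:
        print("no control vectors found")
    for path in files:
        print(path.name)
```

An empty result means either the "-vectors" directory is missing or it contains no .gguf files.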

Limitations

Proof of concept implementation
May impact model performance
Limited testing with different vector combinations
No guarantee of exact equivalence to llama.cpp behavior

Acknowledgments

Control vectors from jukofyork's creative-writing-control-vectors-v3.0
ExLlamaV2 by turboderp