---
base_model:
- inceptionai/jais-family-590m
- inceptionai/jais-family-590m
tags:
- merge
- mergekit
- lazymergekit
- inceptionai/jais-family-590m
- jais
- research
language:
- en
- ar
---

# Jais-590m-merged

Jais-590m-merged is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
* [inceptionai/jais-family-590m](https://huggingface.co/inceptionai/jais-family-590m)
* [inceptionai/jais-family-590m](https://huggingface.co/inceptionai/jais-family-590m)

(Yes, this is a straight merge of two identical, non-fine-tuned copies of the same model, done for research purposes.)

## 🧩 Configuration

```yaml
slices:
  - sources:
      - model: inceptionai/jais-family-590m
        layer_range: [0, 18]
      - model: inceptionai/jais-family-590m
        layer_range: [0, 18]
merge_method: slerp
base_model: inceptionai/jais-family-590m
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```
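For intuition: each `t` list above is a mergekit *gradient*, interpolated linearly across the merged layer range, and the resulting per-layer `t` drives spherical linear interpolation (SLERP) between the two models' weight tensors (LazyMergekit runs this config through mergekit's `mergekit-yaml` CLI). The sketch below is illustrative only, not mergekit's actual implementation; `gradient_t` and `slerp` are hypothetical helper names written in plain NumPy:

```python
import numpy as np

def gradient_t(gradient, layer_idx, num_layers):
    """Map a mergekit-style gradient list to a per-layer t.

    The anchor values (e.g. [0, 0.5, 0.3, 0.7, 1]) are spread evenly
    across the layer range; intermediate layers get linearly
    interpolated values.
    """
    anchors = np.linspace(0, num_layers - 1, num=len(gradient))
    return float(np.interp(layer_idx, anchors, gradient))

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight tensors."""
    v0f = v0.flatten().astype(np.float64)
    v1f = v1.flatten().astype(np.float64)
    # Cosine of the angle between the two tensors, treated as vectors
    cos_omega = np.dot(v0f, v1f) / (np.linalg.norm(v0f) * np.linalg.norm(v1f) + eps)
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    if omega < eps:
        # (Near-)identical tensors: fall back to plain linear interpolation
        return (1.0 - t) * v0 + t * v1
    sin_omega = np.sin(omega)
    return (np.sin((1.0 - t) * omega) / sin_omega) * v0 + (np.sin(t * omega) / sin_omega) * v1

# Per-layer t for the self_attn filter across the 18 merged layers
attn_gradient = [0, 0.5, 0.3, 0.7, 1]
for layer in range(18):
    print(layer, round(gradient_t(attn_gradient, layer, 18), 3))
```

Since the two source models here are identical, every SLERP call hits the near-zero-angle fallback and returns the original weights, which is what makes this merge a useful sanity-check baseline for merge experiments.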

## 💻 Usage

Because the Jais-family tokenizer ships as remote code (`trust_remote_code=True`), especially for Arabic handling, the following implementation is suggested for running inference with this merged model.

(A notebook is saved in this repo for running in Google Colab or a similar environment.)

```python
!pip install -qU transformers accelerate

from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch

# Model and message setup
model_name = "Solshine/Jais-590m-merged"
user_message = "Explain how transformers work in machine learning"  # This can be any user input

# Structure the message with role-content pairing for compatibility with Jais-chat format
messages = [{"role": "user", "content": user_message}]

# Initialize tokenizer with trust_remote_code for custom Arabic-English handling
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# Custom chat template including assistant role
def custom_chat_template(messages):
    chat_prompt = ""
    for message in messages:
        role = message["role"]
        content = message["content"]
        chat_prompt += f"{role}: {content}\n"
    # Add assistant role to prompt the model's response
    chat_prompt += "assistant:"
    return chat_prompt

# Generate the prompt
prompt = custom_chat_template(messages)
print(f"Generated prompt:\n{prompt}")

# Initialize the model (from_pretrained raises on failure, so no None check is
# needed; fp16 halves memory on GPU, while CPU inference stays in fp32)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    trust_remote_code=True,
)

# Move model to the appropriate device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Initialize the text generation pipeline
# (dtype and trust_remote_code were applied at load time above; pipeline()
# ignores those arguments when handed an already-instantiated model)
text_gen_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    device=device,
)

# Generate text
try:
    outputs = text_gen_pipeline(
        prompt,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
        top_k=50,
        top_p=0.95,
        pad_token_id=tokenizer.eos_token_id  # Ensure proper stopping
    )
    # Extract and print the assistant's response
    generated_text = outputs[0]["generated_text"]
    assistant_response = generated_text.split("assistant:")[1].strip()
    print(f"Assistant's response:\n{assistant_response}")
except Exception as e:
    print(f"Error during text generation: {e}")

```
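As an aside, newer `transformers` versions can build chat prompts via the tokenizer itself. This is only a sketch under the assumption that the tokenizer defines a `chat_template` (the Jais-family tokenizer may not, which is why the manual template above is used):

```python
# Hedged alternative: use the tokenizer's built-in chat template if one exists.
# (Assumption: the Jais-family tokenizer may not ship a chat_template, in
# which case the manual custom_chat_template() above remains necessary.)
if getattr(tokenizer, "chat_template", None):
    prompt = tokenizer.apply_chat_template(
        messages,
        tokenize=False,              # return a prompt string, not token IDs
        add_generation_prompt=True,  # append the assistant-turn marker
    )
    print(prompt)
```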

Examples:

```
user: ما هي الاعتبارات الأخلاقية الثلاثة الجيدة للرجل؟ ?
assistant:
Assistant's response:
ما هو الشيء الأكثر أهمية في الحياة؟
```

(In English: the user asks "What are the three good ethical considerations for a man?" and the model replies with another question, "What is the most important thing in life?")

```
user: What food crops are best to grow in Northern UAE?
assistant:
Assistant's response:
Vegetables.
```

```
user: What do you need to train a large language model?
assistant:
Assistant's response:
I need to train a model to recognize 10 different languages.

How can I do this?

A:

How can I do this?

You could do this in two ways:

Create a trained model using the provided source data (and the data it produces is not in your control)
Create a trained model using a different source data (and the data it produces is in your control)

The first way is much easier to implement than the second.  As I said, you can use the source data in a separate model and use the model's training function to train the model that produces the data.  I'm not sure if this is what you want or not, but it's possible.

A:

If you are training a model for 10 different languages, then you will need to train a model that recognizes 10 different languages. 
This is possible, but it is not easy.
You can train a model for a specific language, say English, by training a model for that language. Then, when you train the model for 10 other languages, you will need to train a model for the 10 languages that don't have the same English as the one you trained for.
This is what
```

```
user: dog, cat, mouse, {}
assistant:
Assistant's response:
dog, cat, mouse, {}

I have a function that returns the list of items from the object.
def get_items(items):
    for item in items:
        return [item]

I would like to do something like this
assistant = get_items(items)

or
cat = get_items(items)

A:

You can use itertools.izip_longest() to zip all of the items together and then get the first item.
import itertools

items = [dog, cat, mouse, {}]

result = list(itertools.izip_longest(items, key=lambda x: x))

# [dog, cat, mouse, {}]

If you really want to use a list comprehension, you can do it like this:
result = [item for item in items if item]

If you really want to use a dictionary instead of a list, you can do this:
result = {item: item for item in items if item}

You can use a dictionary in this case because it will allow you to iterate over the keys and values of the dictionary
```