Merged
46 commits
91c72e7
Create push-important-models.yml
younesbelkada Feb 23, 2024
580c7f8
merge
younesbelkada Jun 10, 2024
7e7810a
Merge remote-tracking branch 'upstream/main'
younesbelkada May 21, 2025
ed2f8f3
feat: add falcon-h1
younesbelkada May 21, 2025
303a7f8
fixup
younesbelkada May 21, 2025
6f292cf
address comment
younesbelkada May 21, 2025
6688c9e
fix
younesbelkada May 21, 2025
e044445
fix copies
younesbelkada May 21, 2025
b167ede
fix copies
younesbelkada May 21, 2025
df485a3
fix
younesbelkada May 21, 2025
332b143
fix
younesbelkada May 21, 2025
f3c21a8
fix
younesbelkada May 21, 2025
387e4af
fix
younesbelkada May 21, 2025
efe9108
fix copies
younesbelkada May 21, 2025
1c6a4c5
fix
younesbelkada May 21, 2025
250ca80
fix copies
ArthurZucker May 21, 2025
a62e45b
fix test import to at least trigget the cis
ArthurZucker May 21, 2025
2178c00
yups
ArthurZucker May 21, 2025
7c2c331
update
ArthurZucker May 21, 2025
c1162ae
fix make fix copies
ArthurZucker May 21, 2025
817f146
fix inits?
ArthurZucker May 21, 2025
f1257e3
fix style
ArthurZucker May 21, 2025
184491d
skip annoying test
ArthurZucker May 21, 2025
e2493d8
add integration test for Falcon H1
dhiaEddineRhaiem May 21, 2025
a3dbbe4
fix copies
younesbelkada May 21, 2025
e4dcb70
Merge branch 'add-falcon-h1' of https://github.com/younesbelkada/tran…
dhiaEddineRhaiem May 21, 2025
a4d5141
fix
younesbelkada May 21, 2025
0a30bee
fix typo
dhiaEddineRhaiem May 21, 2025
4392b31
Merge branch 'main' into add-falcon-h1
dhiaEddineRhaiem May 21, 2025
e542fc1
make style
dhiaEddineRhaiem May 21, 2025
c3389b0
fix slow path generations
dhiaEddineRhaiem May 23, 2025
aec328f
clean debug traces
dhiaEddineRhaiem May 23, 2025
5c460c5
debug
dhiaEddineRhaiem May 23, 2025
6dcecc2
remove debug traces final confirmation
dhiaEddineRhaiem May 23, 2025
3e62752
clean debug traces final
dhiaEddineRhaiem May 23, 2025
44e9b30
Merge remote-tracking branch 'upstream/main' into fix-slow-path
dhiaEddineRhaiem May 23, 2025
2dfc906
fix format and lineup
dhiaEddineRhaiem May 23, 2025
f61c5bb
make style
dhiaEddineRhaiem May 23, 2025
1764b36
debug
dhiaEddineRhaiem May 24, 2025
5b2829a
Update src/transformers/models/falcon_h1/modular_falcon_h1.py
dhiaEddineRhaiem May 24, 2025
44f0c2d
adress comments
dhiaEddineRhaiem May 24, 2025
0d43778
Merge branch 'fix-slow-path' of https://github.com/younesbelkada/tran…
dhiaEddineRhaiem May 24, 2025
30efac4
fix fix-copies
dhiaEddineRhaiem May 24, 2025
588da11
fix integration test
dhiaEddineRhaiem May 26, 2025
c2b59bd
Merge pull request #7 from ydshieh/fix-slow-path
ydshieh May 26, 2025
35ee36a
another update (#8)
ydshieh May 26, 2025
16 changes: 10 additions & 6 deletions src/transformers/models/falcon_h1/modeling_falcon_h1.py
@@ -610,9 +610,10 @@ def cuda_kernels_forward(
     ):
         # 1. Gated MLP's linear projection
         hidden_states = apply_mask_to_padding_states(hidden_states, attention_mask)
+        # Add Multipliers
         hidden_states = hidden_states * self.ssm_in_multiplier
         projected_states = self.in_proj(hidden_states)
-        projected_states = projected_states * self.mup_vector
+        projected_states = projected_states * self.mup_vector # ADD Mup Multipliers
         d_to_remove = 2 * self.intermediate_size + 2 * self.n_groups * self.ssm_state_size + self.num_heads

         # Set up dimensions for reshapes later
@@ -806,10 +807,13 @@ def torch_forward(

         # 1. Gated MLP's linear projection
         input_states = apply_mask_to_padding_states(input_states, attention_mask)
+        # Add Multipliers
+        input_states = input_states * self.ssm_in_multiplier
         projected_states = self.in_proj(input_states)
-        gate, hidden_states_B_C, dt = projected_states.split(
-            [self.intermediate_size, self.conv_dim, self.num_heads], dim=-1
-        )
+        projected_states = projected_states * self.mup_vector # ADD Mup Multipliers
+        gate, hidden_states_B_C, dt = projected_states.split([
+            self.intermediate_size, self.conv_dim, self.num_heads
+        ], dim=-1)

         use_precomputed_states = (
             cache_params is not None
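The `torch_forward` hunk above adds the two scalings that `cuda_kernels_forward` already applied: the input is multiplied by `ssm_in_multiplier` before `in_proj`, and the projection is multiplied by `mup_vector` before it is split. A minimal sketch of that order of operations follows. The sizes, the multiplier values, and the bias-free `torch.nn.Linear` standing in for `self.in_proj` are assumptions for illustration only, not the model's configuration.

```python
import torch

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
hidden_size, intermediate_size, conv_dim, num_heads = 8, 4, 6, 2
proj_dim = intermediate_size + conv_dim + num_heads

in_proj = torch.nn.Linear(hidden_size, proj_dim, bias=False)  # stand-in for self.in_proj
ssm_in_multiplier = 0.5                                       # assumed scalar value
mup_vector = torch.linspace(0.5, 1.5, proj_dim)               # assumed per-channel values

input_states = torch.randn(1, 3, hidden_size)  # (batch, seq_len, hidden)

# Order used in the diff: scale the input, project, scale the projection, then split.
projected_states = in_proj(input_states * ssm_in_multiplier) * mup_vector
gate, hidden_states_B_C, dt = projected_states.split(
    [intermediate_size, conv_dim, num_heads], dim=-1
)

# The same split without either multiplier, as the torch path did before this change.
unscaled_gate = in_proj(input_states).split(
    [intermediate_size, conv_dim, num_heads], dim=-1
)[0]

print(gate.shape, hidden_states_B_C.shape, dt.shape)
print(torch.allclose(gate, unscaled_gate))  # compares the scaled and unscaled gate
```

Because the stand-in projection has no bias, the scaled gate equals the unscaled one times `ssm_in_multiplier` and the matching slice of `mup_vector`, which makes the effect of the missing multipliers easy to isolate.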
@@ -920,8 +924,8 @@ def torch_forward(
         hidden_states = hidden_states.reshape(batch_size, seq_len, -1, self.head_dim).float()
         B = B.reshape(batch_size, seq_len, -1, self.ssm_state_size).float()
         C = C.reshape(batch_size, seq_len, -1, self.ssm_state_size).float()
-        B = B.repeat(1, 1, self.num_heads // self.n_groups, 1)
-        C = C.repeat(1, 1, self.num_heads // self.n_groups, 1)
+        B = B.repeat_interleave(self.num_heads // self.n_groups, dim=2, output_size=self.num_heads)
+        C = C.repeat_interleave(self.num_heads // self.n_groups, dim=2, output_size=self.num_heads)
         pad_size = (self.chunk_size - seq_len % self.chunk_size) % self.chunk_size

         D_residual = self.D[..., None] * pad_tensor_by_size(hidden_states, pad_size)
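The last hunk replaces `repeat` with `repeat_interleave` when expanding `B` and `C` from `n_groups` to `num_heads`. Both calls produce the same shape, but they assign groups to heads in a different order. The sketch below uses hypothetical sizes and fills each group with its own index so the resulting head order can be read directly.

```python
import torch

n_groups, num_heads, ssm_state_size = 2, 4, 3  # hypothetical sizes

# One constant per group: group 0 holds 0.0, group 1 holds 1.0.
B = (
    torch.arange(n_groups, dtype=torch.float32)
    .view(1, 1, n_groups, 1)
    .expand(1, 1, n_groups, ssm_state_size)
)

# repeat tiles the whole group axis; repeat_interleave repeats each group in place.
tiled = B.repeat(1, 1, num_heads // n_groups, 1)
interleaved = B.repeat_interleave(num_heads // n_groups, dim=2, output_size=num_heads)

print(tiled[0, 0, :, 0].tolist())        # [0.0, 1.0, 0.0, 1.0]
print(interleaved[0, 0, :, 0].tolist())  # [0.0, 0.0, 1.0, 1.0]
```

With `repeat_interleave`, consecutive heads share a group, which is the layout the change implies the rest of the layer expects. The `output_size` argument only tells PyTorch the size of the result along `dim` up front; it does not change the values.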
16 changes: 10 additions & 6 deletions src/transformers/models/falcon_h1/modular_falcon_h1.py
@@ -421,9 +421,10 @@ def cuda_kernels_forward(
     ):
         # 1. Gated MLP's linear projection
         hidden_states = apply_mask_to_padding_states(hidden_states, attention_mask)
+        # Add Multipliers
         hidden_states = hidden_states * self.ssm_in_multiplier
         projected_states = self.in_proj(hidden_states)
-        projected_states = projected_states * self.mup_vector
+        projected_states = projected_states * self.mup_vector # ADD Mup Multipliers
         d_to_remove = 2 * self.intermediate_size + 2 * self.n_groups * self.ssm_state_size + self.num_heads

         # Set up dimensions for reshapes later
@@ -617,10 +618,13 @@ def torch_forward(

         # 1. Gated MLP's linear projection
         input_states = apply_mask_to_padding_states(input_states, attention_mask)
+        # Add Multipliers
+        input_states = input_states * self.ssm_in_multiplier
         projected_states = self.in_proj(input_states)
-        gate, hidden_states_B_C, dt = projected_states.split(
-            [self.intermediate_size, self.conv_dim, self.num_heads], dim=-1
-        )
+        projected_states = projected_states * self.mup_vector # ADD Mup Multipliers
+        gate, hidden_states_B_C, dt = projected_states.split([
+            self.intermediate_size, self.conv_dim, self.num_heads
+        ], dim=-1)

         use_precomputed_states = (
             cache_params is not None
@@ -731,8 +735,8 @@ def torch_forward(
         hidden_states = hidden_states.reshape(batch_size, seq_len, -1, self.head_dim).float()
         B = B.reshape(batch_size, seq_len, -1, self.ssm_state_size).float()
         C = C.reshape(batch_size, seq_len, -1, self.ssm_state_size).float()
-        B = B.repeat(1, 1, self.num_heads // self.n_groups, 1)
-        C = C.repeat(1, 1, self.num_heads // self.n_groups, 1)
+        B = B.repeat_interleave(self.num_heads // self.n_groups, dim=2, output_size=self.num_heads)
+        C = C.repeat_interleave(self.num_heads // self.n_groups, dim=2, output_size=self.num_heads)
         pad_size = (self.chunk_size - seq_len % self.chunk_size) % self.chunk_size

         D_residual = self.D[..., None] * pad_tensor_by_size(hidden_states, pad_size)
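The `pad_size` expression in the context lines computes how many positions must be appended so the sequence length becomes a multiple of `chunk_size`. A small worked example follows; the helper name `pad_size_for` and the chunk size of 256 are hypothetical and used only for illustration.

```python
def pad_size_for(seq_len: int, chunk_size: int) -> int:
    """Positions to append so that seq_len becomes a multiple of chunk_size."""
    # The outer modulo maps the already-aligned case (chunk_size - 0) back to 0.
    return (chunk_size - seq_len % chunk_size) % chunk_size


for seq_len in (1, 255, 256, 257):
    pad = pad_size_for(seq_len, chunk_size=256)
    # Print the length, the padding, and the remainder after padding (always 0).
    print(seq_len, pad, (seq_len + pad) % 256)
```

Without the outer `% chunk_size`, a sequence whose length is already a multiple of the chunk size would be padded by a full extra chunk.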
39 changes: 21 additions & 18 deletions tests/models/falcon_h1/test_modeling_falcon_h1.py
@@ -483,24 +483,27 @@ def test_falcon_h1_hard(self):
         """
         An integration test for Falcon-H1.
         """
-        EXPECTED_TEXT = (
-            "Tell me about the french revolution.\n"
-            "The French Revolution (1789–1799) was a period of radical social and political upheaval in France that "
-            "fundamentally transformed the nation and had profound effects on the rest of Europe and the world. Here are the key aspects of the revolution:\n\n"
-            "### **Causes**\n"
-            "1. **Economic Crisis**: France was in severe financial trouble due to costly wars (particularly the American Revolution), extravagant spending by the monarchy, and inefficient taxation.\n"
-            "2. **Social Inequality**: The rigid class system (the Ancien Régime) divided society into the privileged nobility and clergy (First Estate) and the common people (Third Estate), who bore the brunt of taxation and had few rights.\n"
-            "3. **Enlightenment Ideas**: Philosophers like Rousseau, Voltaire, and Montesquieu inspired ideas of liberty, equality, and popular sovereignty.\n"
-            "4. **Settlement of 1789**: The Estates-General convened to address the financial crisis, leading to the Third Estate's assertion of its rights and the eventual formation of the National Assembly.\n\n"
-            "### **Key Events**\n"
-            "1. **Opening of the Revolution (1789)**:\n"
-            "- **Storming of the Bastille**: Symbolic of the fall of royal tyranny.\n"
-            "- **Declaration of the Rights of Man and of the Citizen**: Proclaimed universal rights to liberty, property, and security.\n"
-            "- **Creation of the National Assembly**: The Third Estate declared itself the representative body of France.\n\n"
-            "2. **Radical Phase (1792–1794)**:\n"
-            "- **Reign of Terror**: Led by Maximilien Robespierre, the Committee of Public Safety enforced radical egalitarianism through the guillotine, executing thousands of perceived enemies of the revolution (monarchists, clergy, aristocrats, and counter-revolutionaries).\n"
-            "- **Execution of Louis XVI**: The king was guillotined in June 1793, symbolizing the end of the monarchy.\n"
-        )
+        EXPECTED_TEXT = """
+            user
+            Tell me about the french revolution.
+            assistant
+            The French Revolution (1789–1799) was a period of radical social and political upheaval in France that fundamentally transformed the nation and had profound effects on the rest of Europe and the world. Here are the key aspects of the revolution:
+
+            ### **Causes**
+            1. **Economic Crisis**: France was in severe financial trouble due to costly wars (particularly the American Revolution), extravagant spending by the monarchy, and inefficient taxation.
+            2. **Social Inequality**: The rigid class system (the Ancien Régime) divided society into the privileged nobility and clergy (First Estate) and the commoners (Third Estate), who bore the brunt of taxation and had few rights.
+            3. **Enlightenment Ideas**: Philosophers like Voltaire, Rousseau, and Montesquieu inspired ideas of liberty, equality, and popular sovereignty.
+            4. **Settlement of 1789**: The Estates-General convened to address the financial crisis, leading to the Third Estate's assertion of its rights and the eventual abolition of the feudal system.
+
+            ### **Key Events**
+            1. **Storming of the Bastille (July 14, 1789)**: A symbol of royal tyranny, the Bastille fortress was stormed by revolutionaries, sparking widespread rebellion.
+            2. **Declaration of the Rights of Man and of the Citizen (August 1789)**: A foundational document proclaiming liberty, equality, and fraternity.
+            3. **National Assembly and King’s Trial (1791–1792)**: King Louis XVI and his ministers were tried and executed (King Louis was guillotined, Marie Antoinette was banished), marking the end of the monarchy.
+            4. **Rise of the Jacobins and Reign of Terror (1793–1794)**: Radical leaders like Maximilien Robespierre sought to purge France of counter-revolutionaries, leading to mass executions and widespread fear.
+            5. **Thermidorian Reaction
+        """
+        # Remove the first char (`\n`) and the consecutive whitespaces caused by the formatting.
+        EXPECTED_TEXT = EXPECTED_TEXT.strip().replace(" " * 12, "")

         model_id = "tiiuae/Falcon-H1-1.5B-Deep-Instruct"
         tokenizer = AutoTokenizer.from_pretrained(model_id)
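The updated test stores the expected generation as an indented triple-quoted string and then normalizes it with `strip().replace(" " * 12, "")`. The sketch below shows what those two steps do on a short, made-up string. The helper `normalize_expected`, the `Demo` class, and the sample text are hypothetical; the 12-space indentation is taken from the `replace` call in the diff.

```python
def normalize_expected(text: str, indent: int = 12) -> str:
    # Same two steps as the test: trim both ends, then delete every run of `indent` spaces.
    return text.strip().replace(" " * indent, "")


class Demo:
    def expected(self) -> str:
        # Lines inside the literal sit at 12 spaces because of the class and method nesting.
        raw = """
            user
            Hello
            assistant
            Hi there
        """
        return normalize_expected(raw)


print(repr(Demo().expected()))  # 'user\nHello\nassistant\nHi there'
```

Note that `replace` removes any run of 12 spaces, not only leading indentation, so the approach assumes the expected text never contains such a run. `textwrap.dedent` from the standard library is a possible alternative that strips only the common leading whitespace.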