cmp-nct commented Jul 13, 2023

The KV cache is now cyclic and split, with V stored in a permuted variant.
The `ggml_tensor_print` function has been completely reworked to print proper 1- to 4-dimensional tensors together with their data.

Example:
```
+======================+======================+======================+======================+
| :0
| V                                [f32 type]
+----------------------+----------------------+----------------------+----------------------+
| Dimensions           | Strides              | Layer id             | Backend              |
| 3                    | 4x16x1024            | 0                    | CPU                  |
+----------------------+----------------------+----------------------+----------------------+
| Elements             | Src0                 | Src1                 | Operation            |
| 4 x 64 x 2           | 4 x 64 x 2           | N/A                  | CONT                 |
+----------------------+----------------------+----------------------+----------------------+
| Transposed:      No  | Permuted:        No  | Contiguous:      Yes | Size:        0.00 MB |
| Src0 name:           | cache_v (view) (permuted)                                          |
+----------------------+----------------------+----------------------+----------------------+

+-------------------------------------------------------------------------------------------+
| Content of src0 "cache_v (view) (permuted)" (3 dim)

+-------------------------------------------------------------------------------------------+
| Content of src0 "cache_v (view) (permuted)" (3 dim)
| Total Elements : [ Row:4   Col:64  Layer:2   ]
+-------------------------------------------------------------------------------------------+
| Row 1: [0.302  , 0.010  ] [-0.238 , 0.680  ] [0.305  , 0.206  ] [-0.013 , 0.436  ] [-0.074 , -0.698 ] [-0.153 , -0.067 ]
| Row 2: [0.091  , 0.199  ] [0.253  , 0.151  ] [-0.557 , 0.089  ] [0.298  , -0.272 ] [-0.149 , 0.232  ] [-0.217 , 0.193  ]
| Row 3: [-0.085 , -0.014 ] [0.225  , 0.089  ] [-0.338 , 0.072  ] [0.416  , -0.186 ] [-0.071 , 0.110  ] [0.467  , 0.497  ]
| Row 4: [-0.336 , 0.471  ] [-0.144 , 0.070  ] [-0.062 , 0.520  ] [0.093  , 0.217  ] [-0.332 , -0.205 ] [0.012  , 0.335  ]
+-------------------------------------------------------------------------------------------+
+-------------------------------------------------------------------------------------------+
| Content of dst "V" (3 dim)

+-------------------------------------------------------------------------------------------+
| Content of dst "V" (3 dim)
| Total Elements : [ Row:4   Col:64  Layer:2   ]
+-------------------------------------------------------------------------------------------+
| Row 1: [0.302  , 0.010  ] [-0.238 , 0.680  ] [0.305  , 0.206  ] [-0.013 , 0.436  ] [-0.074 , -0.698 ] [-0.153 , -0.067 ]
| Row 2: [0.091  , 0.199  ] [0.253  , 0.151  ] [-0.557 , 0.089  ] [0.298  , -0.272 ] [-0.149 , 0.232  ] [-0.217 , 0.193  ]
| Row 3: [-0.085 , -0.014 ] [0.225  , 0.089  ] [-0.338 , 0.072  ] [0.416  , -0.186 ] [-0.071 , 0.110  ] [0.467  , 0.497  ]
| Row 4: [-0.336 , 0.471  ] [-0.144 , 0.070  ] [-0.062 , 0.520  ] [0.093  , 0.217  ] [-0.332 , -0.205 ] [0.012  , 0.335  ]
+-------------------------------------------------------------------------------------------+
+======================+======================+======================+======================+
```
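
The header fields in the example can be cross-checked by hand. The sketch below is a rough illustration, not the code from this PR: `tensor_desc`, `desc_is_contiguous`, `desc_nbytes` and `example_v` are hypothetical names, and the printed strides are assumed to be byte strides. Under that assumption, an f32 tensor with 4 x 64 x 2 elements has contiguous strides 4, 4\*4 = 16 and 16\*64 = 1024, which matches `4x16x1024`, and occupies 2048 bytes, which rounds to the `0.00 MB` shown.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_DIMS 4

/* Hypothetical stand-in for a tensor header; this is NOT ggml's struct. */
typedef struct {
    int    n_dims;
    size_t ne[MAX_DIMS];  /* number of elements per dimension */
    size_t nb[MAX_DIMS];  /* stride in bytes per dimension (assumed) */
    size_t elem_size;     /* bytes per element, 4 for f32 */
} tensor_desc;

/* A layout is contiguous when each stride equals the element size
 * multiplied by the element counts of all faster-moving dimensions. */
static bool desc_is_contiguous(const tensor_desc *t) {
    size_t expected = t->elem_size;
    for (int i = 0; i < t->n_dims; i++) {
        if (t->nb[i] != expected) {
            return false;
        }
        expected *= t->ne[i];
    }
    return true;
}

/* Total payload size in bytes. */
static size_t desc_nbytes(const tensor_desc *t) {
    size_t n = t->elem_size;
    for (int i = 0; i < t->n_dims; i++) {
        n *= t->ne[i];
    }
    return n;
}

/* The "V" tensor from the example: 4 x 64 x 2 elements, f32. */
static tensor_desc example_v(void) {
    tensor_desc t = {
        .n_dims    = 3,
        .ne        = {4, 64, 2, 1},
        .nb        = {4, 16, 1024, 2048},
        .elem_size = 4,
    };
    return t;
}
```

Swapping two strides in the descriptor, as a permuted view would, makes `desc_is_contiguous` return false. That is consistent with the example, where a `CONT` operation copies the permuted `cache_v` view into the contiguous destination `V`.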