starcoder2 : change rope type to neox #1

ggerganov · 2024-03-01T13:14:25Z

Did some anecdotal tests and this seems to improve the results. Will cross-check with the reference implementation to confirm that this is correct.

make -j && ./main -m models/starcoder2-3b/ggml-model-f16.gguf -p "#python code for efficient implemetation of two_sum\ndef two_sum(arr, target_sum):\n" -n 256 -e --temp 0 -ngl 99 --verbose-prompt

system_info: n_threads = 16 / 24 | AVX = 0 | AVX_VNNI = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | MATMUL_INT8 = 0 | 

main: prompt: '#python code for efficient implemetation of two_sum
def two_sum(arr, target_sum):
'
main: number of tokens in prompt = 26
    40 -> '#'
  2980 -> 'python'
  1361 -> ' code'
   456 -> ' for'
 17505 -> ' efficient'
  1378 -> ' imp'
   293 -> 'le'
  2580 -> 'met'
   387 -> 'ation'
   451 -> ' of'
  3161 -> ' two'
   100 -> '_'
  1055 -> 'sum'
   222 -> '
'
   610 -> 'def'
  3161 -> ' two'
   100 -> '_'
  1055 -> 'sum'
    45 -> '('
   865 -> 'arr'
    49 -> ','
  1780 -> ' target'
   100 -> '_'
  1055 -> 'sum'
   731 -> '):'
   222 -> '
'

sampling: 
	repeat_last_n = 64, repeat_penalty = 1.100, frequency_penalty = 0.000, presence_penalty = 0.000
	top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.000
	mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampling order: 
CFG -> Penalties -> top_k -> tfs_z -> typical_p -> top_p -> min_p -> temperature 
generate: n_ctx = 512, n_batch = 512, n_predict = 256, n_keep = 0


#python code for efficient implemetation of two_sum
def two_sum(arr, target_sum):
	length = len(arr)
	for i in range (0, length-1):
		for j in range (i+1, length):
			if arr[i] + arr[j] == target_sum:
				return [i, j]
	return [-1,-1]

#python code for efficient implemetation of three_sum
def three_sum(arr, target_sum):
	length = len(arr)
	for i in range (0, length-2):
		for j in range (i+1, length-1):
			for k in range (j+1, length):
				if arr[i] + arr[j] + arr[k] == target_sum:
					return [i, j, k]
	return [-1,-1,-1]

#python code for efficient implemetation of four_sum
def four_sum(arr, target_sum):
	length = len(arr)
	for i in range (0, length-3):
		for j in range (i+1, length-2):
			for k in range (j+
llama_print_timings:        load time =     224.24 ms
llama_print_timings:      sample time =      33.05 ms /   256 runs   (    0.13 ms per token,  7745.84 tokens per second)
llama_print_timings: prompt eval time =      41.90 ms /    26 tokens (    1.61 ms per token,   620.51 tokens per second)
llama_print_timings:        eval time =    4009.60 ms /   255 runs   (   15.72 ms per token,    63.60 tokens per second)
llama_print_timings:       total time =    4118.97 ms /   281 tokens
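
For readers unfamiliar with the two RoPE layouts this PR switches between, here is a minimal sketch in plain Python. It is not llama.cpp's code: the function names `rope_norm` and `rope_neox`, the frequency base of 10000, and the single-vector interface are assumptions made for illustration. The only point it shows is which elements get paired for rotation: the "normal" layout rotates adjacent elements `(x[2i], x[2i+1])`, while the NeoX layout rotates `(x[i], x[i + n/2])`.

```python
import math


def rope_norm(x, pos, base=10000.0):
    """Rotate adjacent pairs (x[2i], x[2i+1]) by a position-dependent angle."""
    n = len(x)
    out = list(x)
    for i in range(n // 2):
        theta = pos * base ** (-2.0 * i / n)
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[2 * i], x[2 * i + 1]
        out[2 * i] = a * c - b * s
        out[2 * i + 1] = a * s + b * c
    return out


def rope_neox(x, pos, base=10000.0):
    """Rotate split-half pairs (x[i], x[i + n/2]) by the same angles."""
    n = len(x)
    half = n // 2
    out = list(x)
    for i in range(half):
        theta = pos * base ** (-2.0 * i / n)
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[i], x[i + half]
        out[i] = a * c - b * s
        out[i + half] = a * s + b * c
    return out


if __name__ == "__main__":
    vec = [1.0, 2.0, 3.0, 4.0]
    print("norm:", rope_norm(vec, pos=3))
    print("neox:", rope_neox(vec, pos=3))
```

Both variants apply the same rotation angles and differ only in element pairing, so a model whose weights expect one layout will see scrambled positional information if the other is applied. That is consistent with the output quality changing when the rope type is switched.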

pacman100 · 2024-03-01T13:35:53Z

Thank you @ggerganov! 😄

llama : change starcoder2 rope type

9862d59

pacman100 merged commit 15f233b into pacman100:smangrul/add-starcoder2-support Mar 1, 2024
