This repository was archived by the owner on Jun 4, 2025. It is now read-only.

Commit a62f034

bfineran (Benjamin Fineran) authored and committed

Update base to transformers v4.34.1 (#95)
Squashed history (previous commits):

* Add recipe_name to default file names
* Upgrade to transformers release V4.30.2 (#62)
  * Update trainer and model flows to accommodate sparseml
  * Disable FP16 on QAT start (#12): override LRScheduler when using LRModifiers; disable FP16 on QAT start; keep wrapped scaler object for training after disabling
  * Using QATMatMul in DistilBERT model class (#41)
  * Removed double quantization of output of context layer (#45)
  * Fix DataParallel validation forward signatures (#47): generalize forward_fn selection
  * Best model after epoch (#46)
  * Fix scaler check for non-FP16 mode in trainer (#38)
  * MobileBERT QAT (#55): remove duplicate quantization of vocabulary
  * Enable a QATWrapper for non-parameterized matmuls in BERT self-attention (#9)
  * Utils and auxiliary changes
  * Update Zoo stub loading for SparseZoo 1.1 refactor (#54)
  * Add flag to signal NM integration is active (#32)
  * Add recipe_name to file names
  * Fix errors introduced in manual cherry-pick upgrade
* Update build versions for NM fork pypi push (#74)
* Fix nightly package name (#75)
* Add make build command (#76)
* Add GHA workflow files to build nightly and release packages (#77); fix name
* Bump up version to 1.6.0 (#79)
* Minor improvements for build workflow files (#83)
* Fix minor issue (#84)
* OPT with quantizable MatMuls (#85)
* Fix a minor issue for release build (#86)
* Update version in version.py
* Testmo (#91): improve GHA workflow files to build nightly and release, and report status to testmo; clean up; report exit code; assign value to exit_code
* Update trainer.py - fix DistributedSampler import (#93): DistributedSampler is used but not imported in `trainer.py`
* Research/llama/bmm quantization (#94): quantize attention matmuls
* Bump base transformers version

Co-authored-by: Benjamin Fineran <[email protected]>
Co-authored-by: Konstantin <[email protected]>
Co-authored-by: Konstantin Gulin <[email protected]>
Co-authored-by: dhuangnm <[email protected]>
Co-authored-by: dhuang <[email protected]>
1 parent acc394c commit a62f034

File tree

17 files changed: +492 -30 lines changed
Lines changed: 22 additions & 0 deletions (new file: the `build-nightly` workflow)

```yaml
name: build-nightly
run-name: ${{ github.workflow }} is to create nightly wheel file for pypi
on:
  push:
    branches:
      - 'main'
  schedule:
    - cron: '0 0 * * *'
  workflow_dispatch:

jobs:
  BUILD-TRANSFORMERS-NIGHTLY:
    uses: ./.github/workflows/util.yml
    with:
      runs_on: ubuntu-22.04
      run_id: ${{ github.run_id }}
      build_type: nightly
      testmo_project_id: 9
    secrets: inherit
```
Lines changed: 19 additions & 0 deletions (new file: the `build-release` workflow)

```yaml
name: build-release
run-name: ${{ github.workflow }} is to create release wheel file for pypi
on:
  push:
    branches:
      - 'release/[0-9]+.[0-9]+'
  workflow_dispatch:

jobs:
  BUILD-TRANSFORMERS-RELEASE:
    uses: ./.github/workflows/util.yml
    with:
      runs_on: ubuntu-22.04
      run_id: ${{ github.run_id }}
      build_type: release
      testmo_project_id: 9
    secrets: inherit
```

.github/workflows/util.yml

Lines changed: 129 additions & 0 deletions (new file: the reusable build-and-report workflow)

```yaml
name: report-to-testmo
on:
  workflow_call:
    inputs:
      runs_on:
        description: "runner label specifying instance running the job"
        type: string
        required: true
      run_id:
        description: "run id provided by GHA"
        required: true
        type: string
      build_type:
        description: "build type: nightly or release"
        type: string
        required: true
      testmo_project_id:
        description: "testmo project id"
        type: string
        required: true

jobs:
  BUILD:
    runs-on: ${{ inputs.runs_on }}
    outputs:
      status: ${{ steps.build.outputs.status }}
      commitid: ${{ steps.build.outputs.commitid }}
    permissions:
      id-token: write
      contents: read
    steps:
      - name: repo checkout
        uses: actions/checkout@v3

      - name: s3
        uses: aws-actions/configure-aws-credentials@v2
        with:
          role-to-assume: ${{ secrets.AWS_WEBIDENTITY_FOR_GITHUB_ACTIONS }}
          aws-region: us-east-1

      - name: build
        id: build
        run: |
          pwd
          sudo apt-get -y install python3-pip
          pip3 --version
          sudo pip3 install virtualenv
          virtualenv venv
          source venv/bin/activate
          pip install -e .
          if [[ "${{ inputs.build_type }}" = release ]]; then
            sed -i 's/is_release = False/is_release = True/g' src/${{ github.event.repository.name }}/version.py
          fi
          status=$(make -B build || echo 'FAILED')
          deactivate
          echo "=========== Build log ==========="
          echo "${status}"
          echo "commitid=${GITHUB_SHA:0:7}" >> "$GITHUB_OUTPUT"
          echo "=========== Build status ==========="
          if [[ "${status}" = "FAILED" ]]; then
            echo "${{ github.event.repository.name }} build failed"
            echo "status=failed" >> "$GITHUB_OUTPUT"
            exit 1
          else
            echo "${{ github.event.repository.name }} build success"
          fi
          echo "=========== Generated build ==========="
          ls dist/
          echo "=========== Copy build to S3 ==========="
          aws s3 cp dist/*.whl s3://nm-github-actions/${{ github.event.repository.name }}/
          if [ $? -eq 0 ]; then
            echo "ok: copied to s3://nm-github-actions/${{ github.event.repository.name }}/"
            echo "status=success" >> "$GITHUB_OUTPUT"
          else
            echo "failed: copied to s3://nm-github-actions/${{ github.event.repository.name }}/"
            echo "status=failed" >> "$GITHUB_OUTPUT"
            exit 1
          fi
          oldDate=`date --date='-2 month' +%Y%m%d`
          oldWhl=`(aws s3 ls s3://nm-github-actions/${{ github.event.repository.name }}/ | grep nightly | grep "${oldDate}") || echo "notfound"`
          if [[ "${oldWhl}" != 'notfound' ]]; then
            for oldwhl in $(echo "${oldWhl}" | awk '{print $4}')
            do
              echo "Remove old build ${oldwhl} in S3"
              aws s3 rm s3://nm-github-actions/${{ github.event.repository.name }}/${oldwhl}
            done
          fi

  TESTMO:
    if: success() || failure()
    needs: BUILD
    runs-on: ${{ inputs.runs_on }}
    steps:
      - id: report
        run: |
          echo "node: $(node -v)"
          echo "npm: $(npm -v)"
          echo "Installing testmo cli..."
          sudo npm install -g @testmo/testmo-cli
          export TESTMO_TOKEN=${{ secrets.TESTMO_TEST_TOKEN }}
          TESTMO_URL="https://neuralmagic.testmo.net"
          todaytime=`date +%Y%m%d`
          name="${{ github.event.repository.name }} ${{ inputs.build_type }} ${todaytime} ${{ needs.BUILD.outputs.commitid }} RunID:${{ inputs.run_id }}"
          echo "========== Build info ==========="
          echo "name: ${name}"
          echo "build status: ${{ needs.BUILD.outputs.status }}"
          echo "<status>${{ needs.BUILD.outputs.status }}</status>" > result.xml
          exit_code=1
          if [[ "${{ needs.BUILD.outputs.status }}" = "success" ]]; then
            exit_code=0
          fi
          echo "echo \"GHA job ${{ needs.BUILD.outputs.status }}: https://github.com/neuralmagic/${{ github.event.repository.name }}/actions/runs/${{ inputs.run_id }}\"; exit ${exit_code}" > result.sh
          echo "========== Report to testmo ==========="
          echo "testmo automation:run:submit \\"
          echo "  --instance ${TESTMO_URL} \\"
          echo "  --project-id ${{ inputs.testmo_project_id }} \\"
          echo "  --name ${name} \\"
          echo "  --source ${{ github.event.repository.name }} \\"
          echo "  --results result.xml"
          testmo automation:run:submit \
            --instance "${TESTMO_URL}" \
            --project-id ${{ inputs.testmo_project_id }} \
            --name "${name}" \
            --source "${{ github.event.repository.name }}" \
            --results result.xml \
            -- bash result.sh
```

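The nightly-wheel cleanup at the end of the build step is dense shell; a minimal Python sketch of the same rule may help. Here `old_date` and `listing` are hypothetical stand-ins for the live `date --date='-2 month'` value and `aws s3 ls` output:

```python
# Sketch of the expiry rule: keep only bucket-listing lines that mention
# "nightly" and the datestamp from two months ago, then delete the wheel
# named in the fourth whitespace-separated column of each matching line.
old_date = "20230827"  # hypothetical output of `date --date='-2 month' +%Y%m%d`
listing = [
    "2023-08-27 12:00:00 1234 nm_transformers_nightly-1.6.0.20230827-py3-none-any.whl",
    "2023-10-26 12:00:00 1234 nm_transformers_nightly-1.6.0.20231026-py3-none-any.whl",
]
stale = [
    line.split()[3]
    for line in listing
    if "nightly" in line and old_date in line
]
for wheel in stale:
    print(f"Remove old build {wheel} in S3")  # the real job runs `aws s3 rm` here
```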
Makefile

Lines changed: 5 additions & 1 deletion

```diff
@@ -1,4 +1,4 @@
-.PHONY: deps_table_update modified_only_fixup extra_style_checks quality style fixup fix-copies test test-examples
+.PHONY: deps_table_update modified_only_fixup extra_style_checks quality style fixup fix-copies test test-examples build
 
 # make sure to test the local checkout in scripts and not the pre-installed one (don't use quotes!)
 export PYTHONPATH = src
@@ -119,3 +119,7 @@ build-release:
 	python setup.py bdist_wheel
 	python setup.py sdist
 	python utils/check_build.py
+
+# neuralmagic: creates wheel file
+build:
+	python3 setup.py sdist bdist_wheel
```

setup.py

Lines changed: 8 additions & 3 deletions

```diff
@@ -423,17 +423,22 @@ def run(self):
     deps["tqdm"],  # progress bars in model download and training scripts
 ]
 
+# default variable to be overwritten by the version.py file
+version = "unknown"
+# load and overwrite version and release info from version.py
+exec(open(os.path.join("src", "transformers", "version.py")).read())
+
 setup(
-    name="transformers",
-    version="4.34.1",  # expected format is one of x.y.z.dev0, or x.y.z.rc1 or x.y.z (no to dashes, yes to dots)
+    name="nm-transformers" if is_release else "nm-transformers-nightly",
+    version=version,  # major.minor.patch to match NM repos, fourth entry is either transformers base version or nightly date
     author="The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)",
     author_email="[email protected]",
     description="State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow",
     long_description=open("README.md", "r", encoding="utf-8").read(),
     long_description_content_type="text/markdown",
     keywords="NLP vision speech deep learning transformer pytorch tensorflow jax BERT GPT-2 Wav2Vec2 ViT",
     license="Apache 2.0 License",
-    url="https://github.com/huggingface/transformers",
+    url="https://github.com/neuralmagic/transformers",
     package_dir={"": "src"},
     packages=find_packages("src"),
     include_package_data=True,
```

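The `exec`-based version loading in the setup.py diff works because executing version.py's source in the current namespace overwrites the `version` placeholder and defines `is_release`, which the `setup()` call then reads. A minimal sketch of the pattern, with the file contents inlined as a string (the values here are hypothetical):

```python
# default variables, overwritten by executing the version file's source;
# setup.py reads src/transformers/version.py, inlined here as a string
version = "unknown"
is_release = False
version_py_source = 'version = "1.6.0.20231027"\nis_release = False\n'
exec(version_py_source)  # rebinds `version` and `is_release` in this namespace

# same package-name selection logic as the edited setup() call
package_name = "nm-transformers" if is_release else "nm-transformers-nightly"
```

Note that `exec` only rebinds names like this at module scope; inside a function the assignments would land in a throwaway dictionary, which is why setup.py runs it at the top level.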
src/transformers/__init__.py

Lines changed: 1 addition & 1 deletion

```diff
@@ -18,7 +18,7 @@
 # to defer the actual importing for when the objects are requested. This way `import transformers` provides the names
 # in the namespace without actually importing anything (and especially none of the backends).
 
-__version__ = "4.34.1"
+from .version import *
 
 from typing import TYPE_CHECKING
```

src/transformers/hf_argparser.py

Lines changed: 43 additions & 3 deletions

```diff
@@ -23,8 +23,16 @@
 from pathlib import Path
 from typing import Any, Callable, Dict, Iterable, List, Literal, NewType, Optional, Tuple, Union, get_type_hints
 
+import os
 import yaml
 
+from sparsezoo import Model
+
+from .utils.logging import get_logger
+
+
+logger = get_logger(__name__)
+
 
 DataClass = NewType("DataClass", Any)
 DataClassType = NewType("DataClassType", Any)
@@ -341,12 +349,17 @@ def parse_args_into_dataclasses(
             # additional namespace.
             outputs.append(namespace)
         if return_remaining_strings:
-            return (*outputs, remaining_args)
+            return tuple(
+                *[_download_dataclass_zoo_stub_files(output) for output in outputs],
+                remaining_args,
+            )
         else:
             if remaining_args:
                 raise ValueError(f"Some specified arguments are not used by the HfArgumentParser: {remaining_args}")
 
-            return (*outputs,)
+            return tuple(
+                [_download_dataclass_zoo_stub_files(output) for output in outputs]
+            )
 
     def parse_dict(self, args: Dict[str, Any], allow_extra_keys: bool = False) -> Tuple[DataClass, ...]:
         """
@@ -374,7 +387,9 @@ def parse_dict(self, args: Dict[str, Any], allow_extra_keys: bool = False) -> Tu
             outputs.append(obj)
         if not allow_extra_keys and unused_keys:
             raise ValueError(f"Some keys are not used by the HfArgumentParser: {sorted(unused_keys)}")
-        return tuple(outputs)
+        return tuple(
+            [_download_dataclass_zoo_stub_files(output) for output in outputs]
+        )
 
     def parse_json_file(self, json_file: str, allow_extra_keys: bool = False) -> Tuple[DataClass, ...]:
         """
@@ -417,3 +432,28 @@ def parse_yaml_file(self, yaml_file: str, allow_extra_keys: bool = False) -> Tup
         """
         outputs = self.parse_dict(yaml.safe_load(Path(yaml_file).read_text()), allow_extra_keys=allow_extra_keys)
         return tuple(outputs)
+
+
+def _download_dataclass_zoo_stub_files(data_class: DataClass):
+    for name, val in data_class.__dict__.items():
+        if not isinstance(val, str) or "recipe" in name or not val.startswith("zoo:"):
+            continue
+
+        logger.info(f"Downloading framework files for SparseZoo stub: {val}")
+
+        zoo_model = Model(val)
+        framework_file_paths = [file.path for file in zoo_model.training.default.files]
+        assert framework_file_paths, f"Unable to download any framework files for SparseZoo stub {val}"
+        framework_file_names = [os.path.basename(path) for path in framework_file_paths]
+        if "pytorch_model.bin" not in framework_file_names or ("config.json" not in framework_file_names):
+            raise RuntimeError(
+                "Unable to find 'pytorch_model.bin' and 'config.json' in framework "
+                f"files downloaded from {val}. Found {framework_file_names}. Check "
+                "if the given stub is for a transformers repo model"
+            )
+        framework_dir_path = Path(framework_file_paths[0]).parent.absolute()
+
+        logger.info(f"Overwriting argument {name} to downloaded {framework_dir_path}")
+
+        data_class.__dict__[name] = str(framework_dir_path)
+
+    return data_class
```

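The filter at the top of `_download_dataclass_zoo_stub_files` decides which parsed arguments get rewritten to local paths: only string values with the `zoo:` prefix, and never fields whose name contains `recipe` (recipe stubs are left as stubs). A dependency-free sketch of just that predicate, with hypothetical field names and stub values:

```python
def needs_zoo_download(name: str, val: object) -> bool:
    # mirrors the `continue` condition in _download_dataclass_zoo_stub_files,
    # inverted: download only string SparseZoo stubs on non-recipe fields
    return isinstance(val, str) and "recipe" not in name and val.startswith("zoo:")

# hypothetical parsed-argument fields from a dataclass namespace
fields = {
    "model_name_or_path": "zoo:nlp/masked_language_modeling/bert-base",  # rewritten
    "recipe": "zoo:nlp/masked_language_modeling/bert-base/recipe",       # kept as a stub
    "num_train_epochs": 3,                                               # not a string
}
to_rewrite = [n for n, v in fields.items() if needs_zoo_download(n, v)]
```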
src/transformers/models/bert/modeling_bert.py

Lines changed: 23 additions & 2 deletions

```diff
@@ -241,6 +241,22 @@ def forward(
         return embeddings
 
 
+class QATMatMul(nn.Module):
+    def __init__(self):
+        super().__init__()
+
+        # behaves like normal torch.matmul unless a SparseML QuantizationModifier
+        # is initialized
+        self.wrap_qat = True
+        self.qat_wrapper_kwargs = {
+            "num_inputs": 2,
+            "input_qconfigs": ["asymmetric", "symmetric"],
+        }
+
+    def forward(self, a: torch.Tensor, b: torch.Tensor):
+        return torch.matmul(a, b)
+
+
 class BertSelfAttention(nn.Module):
     def __init__(self, config, position_embedding_type=None):
         super().__init__()
@@ -258,6 +274,11 @@ def __init__(self, config, position_embedding_type=None):
         self.key = nn.Linear(config.hidden_size, self.all_head_size)
         self.value = nn.Linear(config.hidden_size, self.all_head_size)
 
+        # non-parameterized matmuls will behave as normal torch.matmul ops unless
+        # Quantization-Aware-Training is invoked
+        self.attention_scores_matmul = QATMatMul()
+        self.context_layer_matmul = QATMatMul()
+
         self.dropout = nn.Dropout(config.attention_probs_dropout_prob)
         self.position_embedding_type = position_embedding_type or getattr(
             config, "position_embedding_type", "absolute"
@@ -322,7 +343,7 @@ def forward(
             past_key_value = (key_layer, value_layer)
 
         # Take the dot product between "query" and "key" to get the raw attention scores.
-        attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
+        attention_scores = self.attention_scores_matmul(query_layer, key_layer.transpose(-1, -2))
 
         if self.position_embedding_type == "relative_key" or self.position_embedding_type == "relative_key_query":
             query_length, key_length = query_layer.shape[2], key_layer.shape[2]
@@ -362,7 +383,7 @@ def forward(
         if head_mask is not None:
             attention_probs = attention_probs * head_mask
 
-        context_layer = torch.matmul(attention_probs, value_layer)
+        context_layer = self.context_layer_matmul(attention_probs, value_layer)
 
         context_layer = context_layer.permute(0, 2, 1, 3).contiguous()
         new_context_layer_shape = context_layer.size()[:-2] + (self.all_head_size,)
```

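Why wrap a functional `torch.matmul` in a module at all? The quantization tooling scans a model's submodules, so a bare functional call is invisible to it; exposing the matmul as a module flagged with `wrap_qat = True` gives the modifier something to find and wrap with fake-quant observers. A dependency-free sketch of the idea (plain-Python stand-ins for tensors and for the modifier's scan; the real mechanism is SparseML's QATWrapper around `torch.matmul`):

```python
class QATMatMulSketch:
    """Stand-in for the QATMatMul module added in modeling_bert.py."""

    def __init__(self):
        # flags a quantization modifier can look for when scanning submodules
        self.wrap_qat = True
        self.qat_wrapper_kwargs = {
            "num_inputs": 2,
            "input_qconfigs": ["asymmetric", "symmetric"],
        }

    def __call__(self, a, b):
        # plain 2-D matmul on nested lists; torch.matmul in the real module
        return [
            [sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a
        ]


def find_qat_wrappable(modules):
    # sketch of the modifier's scan: collect anything flagged with wrap_qat
    return [m for m in modules if getattr(m, "wrap_qat", False)]


matmul = QATMatMulSketch()
scores = matmul([[1, 2]], [[3], [4]])  # 1x2 @ 2x1 -> [[11]]
```

Until a modifier is initialized, the wrapper is a transparent pass-through, so inference and non-QAT training are unchanged.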