
Conversation

@mgoldey (Contributor) commented Mar 23, 2020

This PR presents an example seq2seq use case, together with the bug fixes necessary for it to run with reasonable accuracy.

The utils_seq2seq.py file defines the data format for training data, and the run_seq2seq.py file takes training, development, and test data and produces a model. The README.md explains how to run this toy problem. The specific toy problem used here is reformatting a date string into the American style, which is deliberately trivial. On my local setup with GPUs, this example runs within 5 minutes. Production models would need more data and training than this toy example.
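For illustration, training pairs for this toy task might look something like the following; these example strings are made up, and the exact on-disk format expected by utils_seq2seq.py may differ:

# Hypothetical (source, target) pairs for the date-formatting toy task;
# the strings are illustrative only, not taken from the actual data files.
examples = [
    ("23 March 2020", "03/23/2020"),
    ("2020-03-27", "03/27/2020"),
    ("June 3rd, 2020", "06/03/2020"),
]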

I welcome feedback on how to strengthen performance here and on the best way to expand test coverage.

This relies on a few bug fixes, which have been incorporated in this branch.

@mgoldey (Contributor, Author) commented Mar 23, 2020

As a non-blocking question: I note that a lot of the examples use argparse to parse comparatively long lists of arguments. I've kept that style in this PR to avoid causing noise and confusion. Would it be acceptable to break with this style and store all the arguments for an experiment in a JSON file?
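To make the question concrete, something like the sketch below is what I have in mind; the --config_json flag and the helper are hypothetical, not part of this PR:

import argparse
import json

def parse_args_with_json(parser: argparse.ArgumentParser) -> argparse.Namespace:
    # Hypothetical helper: read experiment arguments from a JSON file and use
    # them as defaults, letting anything passed on the command line override.
    pre = argparse.ArgumentParser(add_help=False)
    pre.add_argument("--config_json", default=None)
    known, remaining = pre.parse_known_args()
    if known.config_json is not None:
        with open(known.config_json) as f:
            parser.set_defaults(**json.load(f))
    return parser.parse_args(remaining)

The experiment would then be described by a single JSON file, e.g. {"model_type": "bert", "num_train_epochs": 3}, instead of a long command line.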

@patrickvonplaten (Contributor) commented

Hi @mgoldey, sorry for only responding now. Thanks a lot for adding a seq2seq example :-) I will take a look early next week, and maybe we can have a quick chat about how to merge this PR and #3383.

@mgoldey (Contributor, Author) commented Mar 27, 2020

That sounds good. I'm still tweaking things on my end for accuracy and improved logic as I get more familiar with the code base. I'll see if I can rebase onto #3383 by then, depending on my other workload. Feel free to reach out via Google Hangouts if you're comfortable with that.

@mgoldey changed the title from "seq2seq example" to "[WIP] seq2seq example" on Mar 27, 2020
@mgoldey changed the base branch from master to clean_encoder_decocer_modeling on April 23, 2020 at 15:55
features = []

cls_token, sep_token, pad_token = cls_token = (
    tokenizer.cls_token,


Hi, I've been using your example to guide my own setup of a seq2seq model (thank you!). Is this line in your pull request a typo? I think it's just supposed to be cls_token, sep_token, pad_token = ( ...
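Spelled out, the corrected assignment would presumably be the following; the sep_token and pad_token attributes are inferred from the unpacking targets, not copied from the PR:

cls_token, sep_token, pad_token = (
    tokenizer.cls_token,
    tokenizer.sep_token,
    tokenizer.pad_token,
)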

formatted_tokens += [pad_token_label_id]
segment_ids += [cls_token_segment_id]
# gpt2 has no cls_token
elif model_type not in ["gpt2"]:


I believe this is supposed to be

elif model_type in ['gpt2']

instead of "not in"

@patrickvonplaten (Contributor) commented

Sorry to answer only now! I will soon add an Encoder-Decoder Google Colab that shows how to use seq2seq.
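Roughly, the notebook would cover something like the sketch below; the checkpoint names and example strings are placeholders, and the exact call signatures differ between the 2.x releases this PR targeted and later transformers versions:

from transformers import BertTokenizer, EncoderDecoderModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"
)
# The decoder needs these set before training or generation.
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

inputs = tokenizer("23 March 2020", return_tensors="pt")
labels = tokenizer("03/23/2020", return_tensors="pt").input_ids

# One forward pass of a training step: the loss is computed against the labels.
outputs = model(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    labels=labels,
)
print(float(outputs.loss))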

@mgoldey (Contributor, Author) commented Jun 3, 2020

Thanks - fine to close. We've moved forward without using seq2seq due to poor overall accuracy given the scale of data we have in place.
