@mart-r
Collaborator

@mart-r mart-r commented Sep 30, 2025

This is the working_with_cogstack replacement.

It will use medcat_den under the hood for centralised model storage.

  • Convert relevant parts from working_with_cogstack
  • Add workflows for scripts and notebooks
  • Depend on PyPI-based medcat-den
  • Test against changes on core lib
  • Distribution + relevant documentation

@tomolopolis
Member

@alhendrickson
Collaborator

Hey can you add some detail on why some of the files were left out?

EG I see that 1_create_model and 2_train_model aren't included.

One guess is that these are both completely covered in the tutorials as well.

Collaborator

@alhendrickson alhendrickson left a comment

Approved - just wondering if this is the complete set of files or if you should add the others. I'd probably prefer to move the unnecessary ones over too, since we can delete them later, rather than accidentally miss some useful ones. That way we don't have to go back through the other repo as much.

@mart-r
Collaborator Author

mart-r commented Oct 28, 2025

> Hey can you add some detail on why some of the files were left out?
>
> EG I see that 1_create_model and 2_train_model aren't included.
>
> One guess is that these are both completely covered in the tutorials as well.

The creation of new models isn't really something most end users do. The idea was to only keep things that are actually relevant to the majority of the people who used working_with_cogstack. Plus, this is mostly in cogstack-ops/medcat-snomed-model-creation now.

I renamed the training to finetune_models and didn't include unsupervised training parts because - again - it's not something most users seem to be using. They just build with supervised training on top of our base models. The self-supervised training also mostly happens in cogstack-ops/medcat-snomed-model-creation now.

With that said, there is some MetaCAT training that I didn't port over, and I can't quite recall why. I'll see if I've got something documented.

@mart-r
Collaborator Author

mart-r commented Oct 28, 2025

@alhendrickson
I also added a MetaCAT example from WWC (working_with_cogstack).

The other one doesn't really fit in the way that it's laid out.

Collaborator

@alhendrickson alhendrickson left a comment

Approved, thanks for explaining

@mart-r mart-r merged commit 45c6fed into main Oct 29, 2025
10 checks passed
@mart-r mart-r deleted the feat/medcat-scripts/CU-869anj8ub-add-medcat-scripts branch October 29, 2025 08:51