Update SAM AMG README with more descriptive install instructions #1337
Changes to the SAM2 AMG server README:

````diff
@@ -41,12 +41,32 @@ The 'ao' mode is a copy of the baseline with modifications to make the code more
 ### 0. Download checkpoints and install requirements

 ```
-pip install -r requirements.txt
+# From the top-level "ao" directory
+
+# If necessary, create and activate a virtual environment
+# Ex:
+python -m venv venv && source venv/bin/activate
+
+# Install requirements for this example
+pip install -r examples/sam2_amg_server/requirements.txt
+
+# If you have an older version of torch in your current environment, uninstall it first
+pip uninstall torch
+
+# Install torch nightly
+pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu124
+
+# Build ao from source for now
+python setup.py develop
+
+# On your mark, get set...
+cd examples/sam2_amg_server/
 ```

 Download `sam2.1_hiera_large.pt` from https://github.com/facebookresearch/sam2?tab=readme-ov-file#download-checkpoints and put it into `~/checkpoints/sam2`
````
Review comment on the download step: The download part I guess doesn't apply anymore, now that the user can get this via the model type argument.
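Before moving on, a quick environment check can catch a stale torch or a torchao build that didn't take. This is a sketch for the reader, not part of the PR:

```python
# Sanity-check the environment produced by the install steps above.
import torch
import torchao  # resolves to the source checkout after `python setup.py develop`

print("torch:", torch.__version__)        # expect a nightly/dev version string
print("CUDA available:", torch.cuda.is_available())
print("torchao from:", torchao.__file__)  # should point into the cloned "ao" repo
```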
The hunk continues into the next step:

````diff
 ### 1. Create a random subset of 1000 images
 Using images with corresponding mask annotations, such as those from the Segment Anything Video (SA-V) [Dataset](https://github.com/facebookresearch/sam2/tree/main/sav_dataset#download-the-dataset), is suggested, so that any drop in accuracy from `--furious` (which uses `torch.float16`) can be compared later.
 ```
 find sav_val -type f > sav_val_image_paths
 shuf -n 1000 sav_val_image_paths > sav_val_image_paths_shuf_1000
 ```
````
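For readers without GNU `shuf` (e.g. stock macOS), here is a Python equivalent of that pipeline. The file names match the README; the fixed seed is an added assumption to make the subset reproducible:

```python
# Python stand-in for `find sav_val -type f` + `shuf -n 1000`: list files
# under sav_val, sample 1000 of them, and write the subset to a manifest.
import random
from pathlib import Path

paths = sorted(str(p) for p in Path("sav_val").rglob("*") if p.is_file())
random.seed(0)  # assumption: fixed seed for reproducibility; shuf is not reproducible
subset = random.sample(paths, k=min(1000, len(paths)))
Path("sav_val_image_paths_shuf_1000").write_text("\n".join(subset) + "\n")
```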
The PR also updates a sam2 model config so that each `_target_` entry points at torchao's vendored copy of the sam2 modules:
````diff
@@ -2,18 +2,18 @@

 # Model
 model:
-  _target_: sam2.modeling.sam2_base.SAM2Base
+  _target_: torchao._models.sam2.modeling.sam2_base.SAM2Base
````
Review comment: Ah, these are the sam2 configs. There's sam2 and then there's sam2.1. I had to switch to sam2.1, but maybe sam2 is still useful? I updated the sam2.1 configs in b2e42ff#diff-12caf984b30e7028df7fb867a984c5ccad3fde0a0783df2af26db9cef860fbe5, but in any case it doesn't hurt to have sam2 support as well.
The hunk continues:

````diff
   image_encoder:
-    _target_: sam2.modeling.backbones.image_encoder.ImageEncoder
+    _target_: torchao._models.sam2.modeling.backbones.image_encoder.ImageEncoder
     scalp: 1
     trunk:
-      _target_: sam2.modeling.backbones.hieradet.Hiera
+      _target_: torchao._models.sam2.modeling.backbones.hieradet.Hiera
       embed_dim: 112
       num_heads: 2
     neck:
-      _target_: sam2.modeling.backbones.image_encoder.FpnNeck
+      _target_: torchao._models.sam2.modeling.backbones.image_encoder.FpnNeck
       position_encoding:
-        _target_: sam2.modeling.position_encoding.PositionEmbeddingSine
+        _target_: torchao._models.sam2.modeling.position_encoding.PositionEmbeddingSine
         num_pos_feats: 256
         normalize: true
         scale: null
````
````diff
@@ -24,17 +24,17 @@ model:
       fpn_interp_model: nearest

   memory_attention:
-    _target_: sam2.modeling.memory_attention.MemoryAttention
+    _target_: torchao._models.sam2.modeling.memory_attention.MemoryAttention
     d_model: 256
     pos_enc_at_input: true
     layer:
-      _target_: sam2.modeling.memory_attention.MemoryAttentionLayer
+      _target_: torchao._models.sam2.modeling.memory_attention.MemoryAttentionLayer
       activation: relu
       dim_feedforward: 2048
       dropout: 0.1
       pos_enc_at_attn: false
       self_attention:
-        _target_: sam2.modeling.sam.transformer.RoPEAttention
+        _target_: torchao._models.sam2.modeling.sam.transformer.RoPEAttention
         rope_theta: 10000.0
         feat_sizes: [32, 32]
         embedding_dim: 256
````
````diff
@@ -45,7 +45,7 @@ model:
       pos_enc_at_cross_attn_keys: true
       pos_enc_at_cross_attn_queries: false
       cross_attention:
-        _target_: sam2.modeling.sam.transformer.RoPEAttention
+        _target_: torchao._models.sam2.modeling.sam.transformer.RoPEAttention
         rope_theta: 10000.0
         feat_sizes: [32, 32]
         rope_k_repeat: True
````
````diff
@@ -57,23 +57,23 @@ model:
     num_layers: 4

   memory_encoder:
-    _target_: sam2.modeling.memory_encoder.MemoryEncoder
+    _target_: torchao._models.sam2.modeling.memory_encoder.MemoryEncoder
     out_dim: 64
     position_encoding:
-      _target_: sam2.modeling.position_encoding.PositionEmbeddingSine
+      _target_: torchao._models.sam2.modeling.position_encoding.PositionEmbeddingSine
       num_pos_feats: 64
       normalize: true
       scale: null
       temperature: 10000
     mask_downsampler:
-      _target_: sam2.modeling.memory_encoder.MaskDownSampler
+      _target_: torchao._models.sam2.modeling.memory_encoder.MaskDownSampler
       kernel_size: 3
       stride: 2
       padding: 1
     fuser:
-      _target_: sam2.modeling.memory_encoder.Fuser
+      _target_: torchao._models.sam2.modeling.memory_encoder.Fuser
       layer:
-        _target_: sam2.modeling.memory_encoder.CXBlock
+        _target_: torchao._models.sam2.modeling.memory_encoder.CXBlock
         dim: 256
         kernel_size: 7
         padding: 3
````
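These dotted paths matter because sam2 builds the model by importing each `_target_` string and instantiating it (via Hydra); with the old paths that only works when the standalone `sam2` package is installed. A minimal sketch of the resolution mechanism, using `importlib` directly rather than sam2's actual loader:

```python
# Illustration of Hydra-style `_target_` resolution; not sam2's real config loader.
import importlib

def resolve_target(target: str):
    """Import "pkg.module.Name" and return the named class or callable."""
    module_path, _, attr = target.rpartition(".")
    return getattr(importlib.import_module(module_path), attr)

# Old path: raises ImportError unless the separate sam2 package is installed.
# New path: resolves against torchao's vendored copy.
cls = resolve_target("torchao._models.sam2.modeling.sam2_base.SAM2Base")
print(cls)
```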