A multimodal material estimation project that utilizes both audio and visual information for material classification.
Pull the Docker image from Docker Hub:
```bash
docker pull timttu/multimodal-material-estimation:latest
```

Alternatively, you can download the image from the Docker Hub repository (see links below).
After pulling the image, run the container with GPU support:
```bash
docker run -it --gpus all timttu/multimodal-material-estimation:latest /bin/bash
```

Once inside the Docker container, you can choose from different checkpoint options:
If you want to use the original checkpoint from the initial training, you can find it at the following path:
/MultiModalMaterialEstimation/ckpt.pth
This checkpoint achieves approximately 90% accuracy.
For better performance, you can use the optimized checkpoint that has been fine-tuned with different weight configurations:
/workspace/checkpoints/model_ckpt_finetune.pth
This checkpoint achieves approximately 92% accuracy through weight optimization and fine-tuning.
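The README does not show how either checkpoint is consumed, so the snippet below is only a sketch of the standard PyTorch checkpoint-loading pattern; the stand-in `nn.Linear` model and the demo file name are assumptions, not the project's real architecture:

```python
import torch
import torch.nn as nn

# Stand-in model: the real architecture lives in the repository code,
# not in this README.
model = nn.Linear(512, 10)

# Save a demo checkpoint, then reload it the same way ckpt.pth or
# model_ckpt_finetune.pth would typically be loaded.
torch.save(model.state_dict(), "ckpt_demo.pth")
state = torch.load("ckpt_demo.pth", map_location="cpu")
model.load_state_dict(state)
model.eval()
print(sorted(state.keys()))  # ['bias', 'weight']
```

`map_location="cpu"` lets the checkpoint load on machines without a GPU; the actual loading code in `test.py` may differ.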
To train the model:
```bash
python train.py --config config.json
```

To test the model with a specific checkpoint:

```bash
python test.py --config config_test.json --ckpt_path [checkpoint_path]
```

- train.py: Model training script
- test.py: Model testing script
- dataset_utils.py: Dataset processing utilities
- config.json: Training configuration file
- config_test.json: Testing configuration file
The project runs in a Docker container with all necessary dependencies pre-installed:
- PyTorch 1.10.0
- CUDA 11.3
- Transformers
- OpenAI Whisper
- CLIP
- Other related dependencies
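The Whisper and CLIP dependencies suggest that audio and visual embeddings are fused before classification. The README does not specify the fusion scheme, so the following late-fusion sketch is purely illustrative; every dimension, layer, and name in it is an assumption:

```python
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    """Illustrative late-fusion head: concatenate audio and visual
    embeddings, then classify. Dimensions are assumed, not taken
    from the project."""

    def __init__(self, audio_dim=512, visual_dim=512, num_materials=10):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(audio_dim + visual_dim, 256),
            nn.ReLU(),
            nn.Linear(256, num_materials),
        )

    def forward(self, audio_emb, visual_emb):
        # Fuse the two modalities by concatenation along the feature axis.
        fused = torch.cat([audio_emb, visual_emb], dim=-1)
        return self.head(fused)

clf = LateFusionClassifier()
logits = clf(torch.randn(4, 512), torch.randn(4, 512))
print(logits.shape)  # torch.Size([4, 10])
```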
- Docker installed on your system
- NVIDIA Docker runtime (for GPU support)
- Use the --gpus all flag when running the container to enable GPU support
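Whether the --gpus all flag actually took effect can be checked from inside the container; assuming the preinstalled PyTorch, a quick sanity check looks like:

```python
import torch

# Confirm PyTorch inside the container can see the GPU passed through
# by --gpus all; on a CPU-only host this prints False.
print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```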
- Docker Image Repository: Docker Hub - MultiModal Material Estimation
- Original Checkpoint Path: /MultiModalMaterialEstimation/ckpt.pth (approximately 90% accuracy)
- Optimized Checkpoint Path: /workspace/checkpoints/model_ckpt_finetune.pth (approximately 92% accuracy)