Video
YouTube Presentation
Full walkthrough of the assignment structure, decisions, and results.
Open Presentation
Update this URL before publishing.
Assignment 1
Assignment 1 compares baseline and advanced methods across image, text, and multimodal settings, with emphasis on dataset understanding, model design, and evaluation quality.
Image: ResNet50 vs ViT on Caltech-256
Text: LSTM vs Transformer encoder
Multimodal: Zero-shot CLIP vs few-shot adaptation
Replace the placeholder URLs below with your final public links.
Video
Recorded demo of the application and experiment outputs in action.
Open Demo
Update this URL before publishing.
App
Unified Assignment 1 app with image, text, and multimodal experiment views.
Open Streamlit App
Update this URL before publishing.
Open a track to review dataset EDA, model backbone, methodology, and evaluation pages.
Track 01
Caltech-256 benchmark comparing ResNet50 and Vision Transformer behavior.
Open Image Pages
Track 02
Sequence modeling study contrasting LSTM baselines with transformer encoders.
Open Text Pages
Track 03
CLIP-based zero-shot and few-shot experiments with report-driven error analysis.
Open Multimodal Pages
Method Frame
Each track compares a conventional baseline against a more expressive model family.
Evaluation Frame
Results are not limited to a single aggregate score: the report also surfaces class-level behavior and confusion structure, meaning which classes are mistaken for which.
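As a rough illustration of what looking beyond one score can mean, the sketch below computes overall accuracy, per-class recall, and the most frequent confusions from two label lists. The helper names and the toy labels are hypothetical and are not taken from the assignment's code or results.

```python
from collections import Counter

def confusion_counts(y_true, y_pred):
    """Count how often each (true_label, predicted_label) pair occurs."""
    return Counter(zip(y_true, y_pred))

def per_class_recall(y_true, y_pred):
    """Recall for every class that appears in y_true."""
    support = Counter(y_true)
    pairs = confusion_counts(y_true, y_pred)
    return {label: pairs[(label, label)] / n for label, n in support.items()}

def top_confusions(y_true, y_pred, k=3):
    """Most frequent off-diagonal (true, predicted) pairs."""
    pairs = confusion_counts(y_true, y_pred)
    errors = [(pair, n) for pair, n in pairs.items() if pair[0] != pair[1]]
    # Sort by count (descending), then by label pair for a stable order.
    return sorted(errors, key=lambda item: (-item[1], item[0]))[:k]

# Hypothetical labels, for illustration only (not assignment results).
y_true = ["cat", "cat", "cat", "dog", "dog", "owl"]
y_pred = ["cat", "dog", "cat", "dog", "dog", "cat"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall = per_class_recall(y_true, y_pred)
confusions = top_confusions(y_true, y_pred)
```

On this toy data the single accuracy figure hides that the minority class `owl` is never recovered, which is the kind of behavior a per-class view makes visible.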
Delivery Frame
The assignment combines static report pages with an interactive Streamlit experience.
| Track | Baseline | Advanced | Main Review Lens |
|---|---|---|---|
| Image | ResNet50 | Vision Transformer | Representation quality, class separation, and confusion stability. |
| Text | LSTM | Transformer Encoder | Sequence understanding, minority-class behavior, and recall trade-offs. |
| Multimodal | Zero-shot CLIP | Few-shot adaptation | Data efficiency, prompt sensitivity, and practical inference quality. |
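To make the zero-shot versus few-shot contrast in the Multimodal row concrete, here is a minimal sketch using toy 2-D vectors in place of real CLIP embeddings. It assumes zero-shot prediction picks the class whose text embedding is most similar to the image embedding, and it uses class-prototype averaging as one simple stand-in for few-shot adaptation. The assignment's actual adaptation method is not specified on this page, and all names and values below are hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def nearest_label(query, references):
    """Return the label whose reference vector is most similar to the query."""
    return max(references, key=lambda label: cosine(query, references[label]))

def class_prototypes(support):
    """Average the few labelled embeddings of each class into one prototype."""
    protos = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        protos[label] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return protos

# Toy vectors standing in for CLIP embeddings (hypothetical values).
text_embeddings = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}
support_images = {
    "cat": [[0.6, 0.8], [0.8, 0.6]],
    "dog": [[0.0, 1.0], [-0.2, 1.0]],
}
query = [0.6, 0.7]

# Zero-shot: compare the image embedding against text (prompt) embeddings.
zero_shot = nearest_label(query, text_embeddings)
# Few-shot (assumed variant): compare against prototypes built from labelled images.
few_shot = nearest_label(query, class_prototypes(support_images))
```

In this toy setup the two strategies disagree on the same query, which is the sort of difference the data-efficiency and prompt-sensitivity review lens is meant to examine.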
Use this page as the entry point to understand the scope and choose a track.
Each track page continues into EDA, backbone, methodology, and results sections.
Use the external links section for presentation, demo, and deployed application access.