Roy Ganz

About Me

I am a Ph.D. student in Electrical Engineering at the Technion, researching deep learning and computer vision under the supervision of Prof. Michael Elad. My interests include adversarial attacks, robustness, and generative models. I am also a computer vision research intern at Amazon.

Prior to my Ph.D. studies, I obtained my B.Sc. in Electrical Engineering from the Technion (cum laude) and worked as a chip design intern at Apple.

Publications

DocVLM: Make Your VLM an Efficient Reader

Mor Shpigel Nacson, Aviad Aberdam, Roy Ganz, Elad Ben Avraham, Alona Golts, Yair Kittenplon, Shai Mazor, Ron Litman

CVPR 2025, in IEEE/CVF Conference on Computer Vision and Pattern Recognition.

Paint by Inpaint: Learning to Add Image Objects by Removing Them First

Navve Wasserman, Noam Rotstein, Roy Ganz, Ron Kimmel

CVPR 2025, in IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[Code]

Class-Conditioned Transformation for Enhanced Robust Image Classification

Tsachi Blau, Roy Ganz, Chaim Baskin, Michael Elad, Alex M Bronstein

WACV 2025, in IEEE/CVF Winter Conference on Applications of Computer Vision.

[Code]

Adversaries With Incentives: A Strategic Alternative to Adversarial Robustness

Maayan Ehrenberg, Roy Ganz, Nir Rosenfeld

ICLR 2025, in International Conference on Learning Representations.

[Code]

Question Aware Vision Transformer for Multimodal Reasoning

Roy Ganz, Yair Kittenplon, Aviad Aberdam, Elad Ben Avraham, Oren Nuriel, Shai Mazor, Ron Litman

CVPR 2024 SPOTLIGHT, in IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[Code]

GRAM: Global Reasoning for Multi-Page VQA

Tsachi Blau, Sharon Fogel, Roi Ronen, Alona Golts, Roy Ganz, Elad Ben Avraham, Aviad Aberdam, Shahar Tsiper, Ron Litman

CVPR 2024, in IEEE/CVF Conference on Computer Vision and Pattern Recognition.

CLIPAG: Towards Generator-Free Text-to-Image Generation

Roy Ganz, Michael Elad

WACV 2024, in IEEE/CVF Winter Conference on Applications of Computer Vision.

[Code]

FuseCap: Leveraging Large Language Models to Fuse Visual Data into Enriched Image Captions

Noam Rotstein, David Bensaïd, Shaked Brody, Roy Ganz, Ron Kimmel

WACV 2024, in IEEE/CVF Winter Conference on Applications of Computer Vision.

[Code]

Do Perceptually Aligned Gradients Imply Adversarial Robustness?

Roy Ganz, Bahjat Kawar, Michael Elad

ICML 2023 ORAL PRESENTATION, in International Conference on Machine Learning.

[Code]

Towards Models that Can See and Read

Roy Ganz, Oren Nuriel, Aviad Aberdam, Yair Kittenplon, Shai Mazor, Ron Litman

ICCV 2023, in International Conference on Computer Vision.

CLIPTER: Looking at the Bigger Picture in Scene Text Recognition

Aviad Aberdam, David Bensaïd, Alona Golts, Roy Ganz, Oren Nuriel, Royee Tichauer, Shai Mazor, Ron Litman

ICCV 2023, in International Conference on Computer Vision.

Classifier Robustness Enhancement Via Test-Time Transformation

Tsachi Blau, Roy Ganz, Chaim Baskin, Michael Elad, Alex Bronstein

Preprint, arXiv:2303.15409.

Enhancing Diffusion-Based Image Synthesis with Robust Classifier Guidance

Bahjat Kawar, Roy Ganz, Michael Elad

TMLR 2023, in Transactions on Machine Learning Research.

[Code]

BIGRoC: Boosting Image Generation via a Robust Classifier

Roy Ganz, Michael Elad

TMLR 2023, in Transactions on Machine Learning Research.

[Code]

Threat Model-Agnostic Adversarial Defense using Diffusion Models

Tsachi Blau, Roy Ganz, Bahjat Kawar, Alex Bronstein, Michael Elad

Preprint, arXiv:2207.08089.

Multimodal Semi-Supervised Learning for Text Recognition

Aviad Aberdam, Roy Ganz, Shai Mazor, Ron Litman

Preprint, arXiv:2205.03873.

Improved Image Generation via Sparse Modeling

Roy Ganz, Michael Elad

ICLR Workshop on Deep Generative Models for Highly Structured Data.

Contact

Feel free to reach out to me through the following platforms: