Hector Garcia Rodriguez
I am an ELLIS PhD student in Marcus and Anna Rohrbach's Multimodal AI Lab, co-advised by Hervé Jégou.
Previously, I was a research engineer working on efficient deep learning at Huawei Zurich. I completed an MSc in Machine Learning at University College London, graduating on the Dean's List (top 5%); my MSc thesis was advised by Timoleon Moraitis and Pontus Stenetorp. Before that, I interned as a Software Development Engineer at Amazon Web Services and obtained a BSc in Theoretical Physics from UCL with first-class honours.
Scholar  / 
LinkedIn  / 
Twitter  / 
Github
Research
I'm interested in multimodal representation learning: improving efficiency and reliability with adaptable networks under adjustable compute budgets, and using more contextualised representations for sequential decision-making tasks.
Chrono: A Simple Blueprint for Representing Time in MLLMs
Hector Garcia Rodriguez*,
Boris Meinardus*,
Anil Batra,
Anna Rohrbach,
Marcus Rohrbach
arXiv preprint (under review)
arXiv /
pdf /
code
We enable MLLMs to understand time in videos via timestamping.
This achieves state-of-the-art results on moment retrieval (Charades-STA, QVHighlights, ActivityNet Captions) and grounded video QA (NExT-GQA), in both zero-shot (GPT-4o) and fine-tuned (BLIP-2) settings.
Hebbian Deep Learning Without Feedback
Adrien Journé,
Hector Garcia Rodriguez,
Qinghai Guo,
Timoleon Moraitis
ICLR notable-top-25% (spotlight), 2023
arXiv /
code /
talk
We train deep ConvNets with an unsupervised Hebbian soft winner-take-all algorithm, multilayer SoftHebb.
It sets state-of-the-art results among biologically plausible networks for image classification on CIFAR-10, STL-10, and ImageNet.
SoftHebb improves the biological compatibility, parallelisability, and performance of state-of-the-art bio-plausible learning.