ML Engineer & Core Maintainer | Hugging Face
Core maintainer of Hugging Face Transformers with 1,370+ merged commits. Added TFViT, TFCLIP, FlaxVisionEncoderDecoder, and Kosmos-2. Owns CI/CD testing infrastructure.
Biography
Yih-Dar Shieh is a Machine Learning Engineer at Hugging Face, where he has been a core maintainer of the Transformers library since February 2022. With over 1,370 merged commits and 1,530+ pull requests in huggingface/transformers, he is one of the most prolific contributors to the project. His work spans three major areas: adding vision and multimodal model implementations (TFViT, TFCLIPModel, FlaxVisionEncoderDecoderModel, Kosmos-2), building and maintaining CI/CD testing infrastructure (tiny model creation, PR comment CI, failure reporting, CircleCI and GitHub Actions workflows), and ensuring cross-framework parity between PyTorch, TensorFlow, and JAX/Flax models. Before joining Hugging Face, he was an AI Engineer at Biggerpan (2018-2021) working on NLP intent/entity classification. He holds a Ph.D. in Mathematics (number theory) from Aix-Marseille University (2015), with a dissertation on 'Arithmetic Aspects of Point Counting and Frobenius Distributions' supervised by David Kohel and Gilles Lachaud, and an engineering degree in Computer Science from Polytech Marseille (2018). He is based in Paris, France.
Implemented Microsoft's Kosmos-2 grounding multimodal LLM in Hugging Face Transformers (v4.35), enabling object-level image-text interaction via bounding boxes. Acknowledged by Microsoft for the Hugging Face implementation and online demo.
Added TFViTModel, TFCLIPModel, TFVisionEncoderDecoderModel, and FlaxVisionEncoderDecoderModel to Transformers, bringing vision and multimodal capabilities to TensorFlow and JAX/Flax frameworks.
Built and maintains the CI testing infrastructure for huggingface/transformers: PR comment CI feedback, new failure reporting, CircleCI and GitHub Actions workflows, tiny model creation scripts, and cross-framework equivalence tests.
Co-authored the influential Hugging Face blog post identifying and fixing a bug where gradient accumulation was not mathematically equivalent to full batch training across popular ML frameworks.
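The accumulation mismatch can be illustrated with a toy example (the numbers below are hypothetical, not taken from the blog post): with a token-averaged loss, averaging the per-micro-batch mean losses is not equivalent to taking one mean over the whole batch whenever the micro-batches contain different numbers of tokens.

```python
# Sketch of why naive gradient accumulation diverges from full-batch
# training under a token-averaged loss (illustrative numbers only).

# Per-token losses for two micro-batches of unequal token count.
micro_batch_1 = [2.0, 4.0]             # 2 tokens
micro_batch_2 = [1.0, 1.0, 1.0, 1.0]   # 4 tokens

# Full-batch training: a single mean over all 6 tokens.
all_tokens = micro_batch_1 + micro_batch_2
full_batch_loss = sum(all_tokens) / len(all_tokens)

# Naive accumulation: average the per-micro-batch mean losses.
# This over-weights tokens from the shorter micro-batch.
naive_accumulated = (
    sum(micro_batch_1) / len(micro_batch_1)
    + sum(micro_batch_2) / len(micro_batch_2)
) / 2

# Corrected accumulation: sum the raw per-token losses across
# micro-batches, then divide once by the total token count.
corrected = sum(all_tokens) / len(all_tokens)

print(full_batch_loss)    # ~1.667
print(naive_accumulated)  # 2.0 -- not equal to the full-batch loss
print(corrected)          # matches the full-batch loss
```

The same weighting error applies to the gradients, which is why the fix propagated across frameworks required scaling each micro-batch's loss by its token count rather than its micro-batch mean.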
Fixed critical upsampling, downsampling, and cross-attention bugs in the Hugging Face Diffusers library's core architecture.
Systematically fixed discrepancies between PyTorch, TensorFlow, and Flax model implementations across dozens of model architectures, including loss calculation, hidden states, and attention outputs.
Created the Flax image captioning example and published the ViT-GPT2 proof-of-concept model for the FlaxVisionEncoderDecoder framework, demonstrating vision-language generation.
Today is my 1st day on the Hugging Face open source team! My main focus is on the reliability of the ecosystem, the testing, and the production readiness: tools that are used & loved by a large community and 10,000+ organizations.
Very proud (and surprised) to see the demo of my work on the (Flax) Vision Encoder Decoder model being featured.
Microsoft's KOSMOS-2 model is now available in Transformers v4.35! It is a grounding multimodal large language model (MLLM) that enables interaction between text and images at the object level (via bounding boxes).
Joining an open source startup has converted me from a silent user to an active contributor.
Research generated March 19, 2026